ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Publication Date: 2022-05-25
    Description: Presented at AGU Fall Meeting, American Geophysical Union, Washington, D.C., 10 – 14 Dec 2018
    Description: Data repositories often transform submissions to improve understanding and reuse of data by researchers other than the original submitter. However, scientific workflows built by the data submitters often depend on the original data format. In some cases, this makes the repository’s final data product less useful to the submitter. As a result, these two workable but different versions of the data provide value to two disparate, non-interoperable research communities around what should be a single dataset. Data repositories could bridge these two communities by exposing provenance explaining the transform from original submission to final product. A subsequent benefit of this provenance would be the transparent value-add of domain repository data curation. To improve its data management process efficiency, the Biological and Chemical Oceanography Data Management Office (BCO-DMO, https://www.bco-dmo.org) has been adopting the data containerization specification defined by the Frictionless Data project (https://frictionlessdata.io). Recently, BCO-DMO has been using the Frictionless Data Package Pipelines Python library (https://github.com/frictionlessdata/datapackage-pipelines) to capture the data curation processing steps that transform original submissions to final data products. Because these processing steps are stored using a declarative language they can be converted to a structured provenance record using the Provenance Ontology (PROV-O, https://www.w3.org/TR/prov-o/). PROV-O abstracts the Frictionless Data elements of BCO-DMO’s workflow for capturing necessary curation provenance and enables interoperability with other external provenance sources and tools. Users who are familiar with PROV-O or the Frictionless Data Pipelines can use either record to reproduce the final data product in a machine-actionable way. 
While there may still be some curation steps that cannot be easily automated, this process is a step towards end-to-end reproducible transforms throughout the data curation process. In this presentation, BCO-DMO will demonstrate how Frictionless Data Package Pipelines can be used to capture data curation provenance from original submission to final data product exposing the concrete value-add of domain-specific repositories.
    Description: NSF #1435578
    Keywords: Provenance ; Frictionless Data ; Data management
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 2
    Woods Hole Oceanographic Institution
    Publication Date: 2022-10-21
    Description: Presented at Ocean Sciences Meeting 2022, Virtual, February 24 - March 3, 2022
    Description: Many of the challenges associated with sharing oceanographic data that currently face researchers, and the repositories through which they share their data, are cultural rather than technical. This talk presents an overview of obstacles and opportunities related to data sharing within the oceanographic community.
    Description: NSF-1924618
    Keywords: Oceanographic data management ; Data sharing ; Repository ; FAIR ; Best practices
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 3
    Publication Date: 2022-10-21
    Description: Presented at Data Science Training Camp, Woods Hole, MA, January 22 - 23, 2020.
    Description: With data and software increasingly recognized as scholarly research products, and aiming towards open science and reproducibility, it is imperative for today's oceanographers to learn foundational practices and skills for data management and research computing, as well as practices specific to the ocean sciences. This educational package was developed as a data science training camp for graduate students and professionals in the ocean sciences and implemented at the Woods Hole Oceanographic Institution (WHOI) in 2019 and 2020. Here we provide materials for the 2020 camp which was delivered in-person during two afternoons (total of 8 hours), with two modules per afternoon. We aimed for ~40 participants per camp, with disciplines spanning Earth and life sciences and engineering. Disciplines at each table were mixed on the first afternoon but similar on the second afternoon. Contents of this package include the syllabus and slide presentations for each of the four modules: 1 "Good enough practices in scientific computing," 2 Data management, 3 Software development and research computing, and 4 Best practices in the ocean sciences. The 3rd module is split into two parts. We also include a poster presented at the 2020 Ocean Science Meeting, which has some results from pre- and post-surveys. Funding: The camp was funded by WHOI Academic Programs Office through a Doherty Chair in Education Award, with additional support from WHOI Ocean Informatics Working Group, WHOI Information Services, MBLWHOI Library, the NSF-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO), and an NSF-funded XSEDE Jetstream Education Allocation TG-OCE190011. We also utilized resources from the NSF-funded Pangeo project.
    Keywords: Data science ; Best practices ; Data management ; Scientific computing ; Training ; Workshop ; Professional development ; Curriculum development
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 4
    Publication Date: 2022-10-21
    Description: Presented as Sharing Data Through the Biological and Chemical Oceanography Data Management Office Webinar, online, January, 2020
    Description: This talk provides an overview of the Biological and Chemical Oceanography Data Management Office and the collaborative data sharing process that occurs between individual investigators and the BCO-DMO repository. The presentation includes background on the repository, what to expect after submitting your data, and helpful data management practices that can streamline data sharing and support open science.
    Description: GBMF #8453
    Keywords: Marine Microbiology Initiative ; Data management ; Data sharing ; Repository overview ; Best practices
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 5
    Publication Date: 2022-10-31
    Description: Dataset: Share Your Thoughts
    Description: Oceanographic data, when well-documented and stewarded toward preservation, have the potential to accelerate new science and facilitate our understanding of complex natural systems. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) is funded by the NSF to document and manage marine biological, chemical, physical, and biogeochemical data, ensuring their discovery and access, and facilitating their reuse. The task of curating and providing access to research data is a collaborative process, with associated actors and critical activities occurring throughout the data’s life cycle. BCO-DMO supports all phases of the data life cycle and works closely with investigators to ensure open access of well-documented project data and information. Supporting this curation process is a flexible cyberinfrastructure that provides the means for data submission, discovery, and access, ultimately enabling reuse. Based upon community feedback, this infrastructure is undergoing evaluation and improvement to better meet oceanographic research needs. This poster will introduce the repository, describe some of the strategic enhancements coming to BCO-DMO, and present an opportunity for you to provide feedback on enhancements yet to come. We invite you to think about your own research workflow of searching and accessing new data for research, and to provide your feedback through the poster’s interactive sections. Your input can help BCO-DMO improve its service to the research community. For a complete list of measurements, refer to the full dataset description in the supplemental file 'Dataset_description.pdf'. The most current version of this dataset is available at: https://www.bco-dmo.org/dataset/825238
    Description: NSF Division of Ocean Sciences (NSF OCE) OCE-1924618
    Keywords: Data management ; Stakeholder needs ; Oceanography ; BCO-DMO ; Repository ; Community building
    Repository Name: Woods Hole Open Access Server
    Type: Dataset
  • 6
    Publication Date: 2022-05-26
    Description: Presented at Data Curation Network, May 15, 2020
    Description: At domain-specific data repositories, curation that strives for FAIR principles often entails transforming data submissions to improve understanding and reuse. The Biological and Chemical Oceanography Data Management Office (BCO-DMO, https://www.bco-dmo.org) has been adopting the data containerization specification of the Frictionless Data project (https://frictionlessdata.io) in an effort to improve its data curation process efficiency. In doing so, BCO-DMO has been using the Frictionless Data Package Pipelines library (https://github.com/frictionlessdata/datapackage-pipelines) to define the processing steps that transform original submissions to final data products. Because these pipelines are defined using a declarative language, they can be serialized into formal provenance data structures using the Provenance Ontology (PROV-O, https://www.w3.org/TR/prov-o/). While there may still be some curation steps that cannot be easily automated, this method is a step towards reproducible transforms that bridge the original data submission to its published state in machine-actionable ways that benefit the research community through transparency in the data curation process. BCO-DMO has built a user interface on top of these modular tools to make it easier for data managers to process submissions, reuse existing workflows, and make transparent the added value of domain-specific data curation.
    Description: NSF #1924618
    Keywords: Data Curation ; Provenance ; Workflows ; Frictionless Data ; Data management ; Data repository
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 7
    Publication Date: 2022-05-26
    Description: Presented at USGS Data Management Working Group, 9 November 2020
    Description: At domain-specific data repositories, curation that strives for FAIR principles often entails transforming data submissions to improve understanding and reuse. The Biological and Chemical Oceanography Data Management Office (BCO-DMO, https://www.bco-dmo.org) has been adopting the data containerization specification of the Frictionless Data project (https://frictionlessdata.io) in an effort to improve its data curation process efficiency. In doing so, BCO-DMO has been using the Frictionless Data Package Pipelines library (https://github.com/frictionlessdata/datapackage-pipelines) to define the processing steps that transform original submissions to final data products. Because these pipelines are defined using a declarative language, they can be serialized into formal provenance data structures using the Provenance Ontology (PROV-O, https://www.w3.org/TR/prov-o/). While there may still be some curation steps that cannot be easily automated, this method is a step towards reproducible transforms that bridge the original data submission to its published state in machine-actionable ways that benefit the research community through transparency in the data curation process. BCO-DMO has built a user interface on top of these modular tools to make it easier for data managers to process submissions, reuse existing workflows, and make transparent the added value of domain-specific data curation.
    Description: NSF #1924618
    Keywords: Data Curation ; Provenance ; Workflows ; Frictionless Data ; Data management ; Data repository
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 8
    Publication Date: 2022-05-26
    Description: Presented at AGU Ocean Sciences, 11 - 16 February 2018, Portland, OR
    Description: At the Biological and Chemical Oceanography Data Management Office (BCO-DMO), Big Data challenges have been steadily increasing. The sizes of data submissions have grown as instrumentation improves. Complex data types can sometimes be stored across different repositories. This signals a paradigm shift where data and information that are meant to be tightly coupled and have traditionally been stored under the same roof are now distributed across repositories and data stores. For domain-specific repositories like BCO-DMO, a new mechanism for assembling data, metadata and supporting documentation is needed. Traditionally, data repositories have relied on a human's involvement throughout discovery and access workflows. This human could assess fitness for purpose by reading loosely coupled, unstructured information from web pages and documentation. Distributed storage was something that could be communicated in text that a human could read and understand. However, as machines play larger roles in the process of discovery and access of data, distributed resources must be described and packaged in ways that fit into machine-automated workflows of discovery and access for assessing fitness for purpose by the end-user. Once machines have recommended a data resource as relevant to an investigator's needs, the data should be easy to integrate into that investigator's toolkits for analysis and visualization. BCO-DMO is exploring the idea of data containerization, or packaging data and related information for easier transport, interpretation, and use. Data containerization reduces friction not only for data repositories trying to describe complex data resources, but also for end-users trying to access data with their own toolkits. In researching the landscape of data containerization, the Frictionlessdata Data Package (http://frictionlessdata.io/) provides a number of valuable advantages over similar solutions. 
This presentation will focus on these advantages and how the Frictionlessdata Data Package addresses a number of real-world use cases faced for data discovery, access, analysis and visualization in the age of Big Data.
    Description: NSF #1435578, NSF #1639714
    Keywords: Frictionless Data ; Data management ; Data exchange ; Data Transport ; Distributed data ; Data tools ; Big data
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 9
    Publication Date: 2022-05-26
    Description: Presented at FORCE2018 Conference, Montreal, Canada, October 10-12, 2018. FORCE: Future of Research Communications and e-Scholarship
    Description: At domain-specific data repositories, curation that strives for FAIR principles often entails transforming data submissions to improve understanding and reuse. The Biological and Chemical Oceanography Data Management Office (BCO-DMO, https://www.bco-dmo.org) has been adopting the data containerization specification of the Frictionless Data project (https://frictionlessdata.io) in an effort to improve its data curation process efficiency. In doing so, BCO-DMO has been using the Frictionless Data Package Pipelines library (https://github.com/frictionlessdata/datapackage-pipelines) to define the processing steps that transform original submissions to final data products. Because these pipelines are defined using a declarative language they can be serialized into formal provenance data structures using the Provenance Ontology (PROV-O, https://www.w3.org/TR/prov-o/). While there may still be some curation steps that cannot be easily automated, this method is a step towards reproducible transforms that bridge the original data submission to its published state in machine-actionable ways that benefit the research community through transparency in the data curation process.
    Description: NSF #1435578
    Keywords: Frictionless Data ; Data management ; Provenance ; Data repository ; Workflow
    Repository Name: Woods Hole Open Access Server
    Type: Presentation
  • 10
    Publication Date: 2022-05-26
    Description: Presented at the Fall AGU Meeting, New Orleans, LA, 11-15 December 2017
    Description: As cross-disciplinary geoscience research increasingly relies on machines to discover and access data, one of the critical questions facing data repositories is how data and supporting materials should be packaged for consumption. Traditionally, data repositories have relied on a human's involvement throughout discovery and access workflows. This human could assess fitness for purpose by reading loosely coupled, unstructured information from web pages and documentation. In attempts to shorten the time to science and access data resources across many disciplines, expectations for machines to mediate the process of discovery and access are challenging data repository infrastructure. This challenge is to find ways to deliver data and information in ways that enable machines to make better decisions by enabling them to understand the data and metadata of many data types. Additionally, once machines have recommended a data resource as relevant to an investigator's needs, the data resource should be easy to integrate into that investigator's toolkits for analysis and visualization. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) supports NSF-funded OCE and PLR investigators with their project's data management needs. These needs involve a number of varying data types, some of which require multiple files with differing formats. Presently, BCO-DMO has described these data types and the important relationships between the type's data files through human-readable documentation on web pages. For machines directly accessing data files from BCO-DMO, this documentation could be overlooked and lead to misinterpreting the data. Instead, BCO-DMO is exploring the idea of data containerization, or packaging data and related information for easier transport, interpretation, and use. 
In researching the landscape of data containerization, the Frictionlessdata Data Package (http://frictionlessdata.io/) provides a number of valuable advantages over similar solutions. This presentation will focus on these advantages and how the Frictionlessdata Data Package addresses a number of real-world use cases faced for data discovery, access, analysis and visualization.
    Description: National Science Foundation Award #1435578, Award #1639714
    Keywords: Frictionless Data ; Data management ; Data workflows ; Data transport
    Repository Name: Woods Hole Open Access Server
    Type: Presentation