ABSTRACT
Journal editors have considerable power to advance open science in their respective fields by incentivising and mandating open policies and practices at their journals. The Data PASS Journal Editors Discussion Interface (JEDI, an online community for social science journal editors: www.dpjedi.org) has collated several resources on embedding open science in journal editing (www.dpjedi.org/resources). However, an editor new to open science practices can find it overwhelming to know where to start. For this reason, we created a guide for journal editors on how to get started with open science. The guide outlines steps that editors can take to implement open policies and practices within their journal, and goes through the what, why, how, and worries of each policy and practice. This manuscript introduces and summarizes the guide (full guide: https://doi.org/10.31219/osf.io/hstcx).
ABSTRACT
In 2010, ICPSR began a long process of recovering data from Gordon Streib's Cornell Study of Occupational Retirement (CSOR). Because these unique data fill a gap in our understanding of US retirement history, we determined that an extensive data recovery project was warranted. This paper describes the scope of the data collection and the steps in ICPSR's recovery process. Though the data recovery was ultimately successful, this paper documents the amount of time invested and costs associated with this kind of recovery work. It also highlights the value of these data for future research in understanding gender and retirement in a historical context. In addition to the publicly available data resulting from this project, extensive paper medical records are housed at ICPSR for on-site analysis or for a future digitization project. These records would provide unique health information on older women and men traced over a period of time in the 1950s, and they represent future work for ICPSR to undertake.
ABSTRACT
The DAta Tag Suite (DATS) is a model supporting dataset description, indexing, and discovery. It is available as an annotated serialization with schema.org, a vocabulary used by major search engines, thus making the datasets discoverable on the web. DATS underlies DataMed, the National Institutes of Health Big Data to Knowledge Data Discovery Index prototype, which aims to provide a "PubMed for datasets." The experience gained while indexing a heterogeneous range of >60 repositories in DataMed helped in evaluating DATS's entities, attributes, and scope. In this work, 3 additional exemplary and diverse data sources were mapped to DATS by their representatives or experts, offering a deep scan of DATS fitness against a new set of existing data. The procedure, including feedback from users and implementers, resulted in DATS implementation guidelines and best practices, and identification of a path for evolving and optimizing the model. Finally, the work exposed additional needs when defining datasets for indexing, especially in the context of clinical and observational information.
Subject(s)
Abstracting and Indexing, Datasets as Topic, Allergy and Immunology, Delivery of Health Care, Humans, Information Storage and Retrieval, Search Engine, Social Sciences, Vocabulary, Controlled
ABSTRACT
Today's science increasingly requires effective ways to find and access existing datasets that are distributed across a range of repositories. For researchers in the life sciences, discoverability of datasets may soon become as essential as identifying the latest publications via PubMed. Through an international collaborative effort funded by the Big Data to Knowledge (BD2K) initiative of the National Institutes of Health (NIH), we have designed and implemented the DAta Tag Suite (DATS) model to support the DataMed data discovery index. DataMed's goal is to be for data what PubMed has been for the scientific literature. Akin to the Journal Article Tag Suite (JATS) used in PubMed, the DATS model enables submission of metadata on datasets to DataMed. DATS has a core set of elements, which are generic and applicable to any type of dataset, and an extended set that can accommodate more specialized data types. DATS is a platform-independent model that is also available as an annotated serialization in schema.org, which in turn is widely used by major search engines such as Google, Microsoft, Yahoo and Yandex.