Search | VHL Regional Portal (Portal Regional da BVS)

CORD-19: The Covid-19 Open Research Dataset.

Wang, Lucy Lu; Lo, Kyle; Chandrasekhar, Yoganand; Reas, Russell; Yang, Jiangjiang; Burdick, Douglas; Eide, Darrin; Funk, Kathryn; Katsis, Yannis; Kinney, Rodney; Li, Yunyao; Liu, Ziyang; Merrill, William; Mooney, Paul; Murdick, Dewey; Rishi, Devvret; Sheehan, Jerry; Shen, Zhihong; Stilson, Brandon; Wade, Alex D; Wang, Kuansan; Wang, Nancy Xin Ru; Wilhelm, Chris; Xie, Boya; Raymond, Douglas; Weld, Daniel S; Etzioni, Oren; Kohlmeier, Sebastian.

ArXiv ; 2020 Apr 22.

Article in English | MEDLINE | ID: mdl-32510522

ABSTRACT

The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on Covid-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers. Since its release, CORD-19 has been downloaded over 200K times and has served as the basis of many Covid-19 text mining and discovery systems. In this article, we describe the mechanics of dataset construction, highlighting challenges and key design decisions, provide an overview of how CORD-19 has been used, and describe several shared tasks built around the dataset. We hope this resource will continue to bring together the computing community, biomedical experts, and policy makers in the search for effective treatments and management policies for Covid-19.

Improving the value of clinical research through the use of Common Data Elements.

Sheehan, Jerry; Hirschfeld, Steven; Foster, Erin; Ghitza, Udi; Goetz, Kerry; Karpinski, Joanna; Lang, Lisa; Moser, Richard P; Odenkirchen, Joanne; Reeves, Dianne; Rubinstein, Yaffa; Werner, Ellen; Huerta, Michael.

Clin Trials ; 13(6): 671-676, 2016 Dec.

Article in English | MEDLINE | ID: mdl-27311638

ABSTRACT

The use of Common Data Elements can facilitate cross-study comparisons, data aggregation, and meta-analyses; simplify training and operations; improve overall efficiency; promote interoperability between different systems; and improve the quality of data collection. A Common Data Element is a combination of a precisely defined question (variable) paired with a specified set of responses to the question that is common to multiple datasets or used across different studies. Common Data Elements, especially when they conform to accepted standards, are identified by research communities from variable sets currently in use or are newly developed to address a designated data need. There are no formal international specifications governing the construction or use of Common Data Elements. Consequently, Common Data Elements tend to be made available by research communities on an empiric basis. Some limitations of Common Data Elements are that there may still be differences across studies in the interpretation and implementation of the Common Data Elements, variable validity in different populations, and inhibition by some existing research practices and the use of legacy data systems. Current National Institutes of Health efforts to support Common Data Element use are linked to the strengthening of National Institutes of Health Data Sharing policies and the investments in data repositories. Initiatives include cross-domain and domain-specific resources, construction of a Common Data Element Portal, and establishment of trans-National Institutes of Health working groups to address technical and implementation topics. The National Institutes of Health is seeking to lower the barriers to Common Data Element use through greater awareness and encourage the culture change necessary for their uptake and use. As National Institutes of Health, other agencies, professional societies, patient registries, and advocacy groups continue efforts to develop and promote the responsible use of Common Data Elements, particularly if linked to accepted data standards and terminologies, continued engagement with and feedback from the research community will remain important.

Subjects

Biomedical Research; Common Data Elements; Information Dissemination; Data Collection; Humans; National Institutes of Health (U.S.); United States

Opportunities and challenges in the use of personal health data for health research.

Bietz, Matthew J; Bloss, Cinnamon S; Calvert, Scout; Godino, Job G; Gregory, Judith; Claffey, Michael P; Sheehan, Jerry; Patrick, Kevin.

J Am Med Inform Assoc ; 23(e1): e42-8, 2016 Apr.

Article in English | MEDLINE | ID: mdl-26335984

ABSTRACT

OBJECTIVE: Understand barriers to the use of personal health data (PHD) in research from the perspective of three stakeholder groups: early adopter individuals who track data about their health, researchers who may use PHD as part of their research, and companies that market self-tracking devices, apps or services, and aggregate and manage the data that are generated. MATERIALS AND METHODS: A targeted convenience sample of 465 individuals and 134 researchers completed an extensive online survey. Thirty-five hour-long semi-structured qualitative interviews were conducted with a subset of 11 individuals and 9 researchers, as well as 15 company/key informants. RESULTS: Challenges to the use of PHD for research were identified in six areas: data ownership; data access for research; privacy; informed consent and ethics; research methods and data quality; and the unpredictable nature of the rapidly evolving ecosystem of devices, apps, and other services that leave "digital footprints." Individuals reported willingness to anonymously share PHD if it would be used to advance research for the good of the public. Researchers were enthusiastic about using PHD for research, but noted barriers related to intellectual property, licensing, and the need for legal agreements with companies. Companies were interested in research but stressed that their first priority was maintaining customer relationships. CONCLUSION: Although challenges exist in leveraging PHD for research, there are many opportunities for stakeholder engagement, and experimentation with these data is already taking place. These early examples foreshadow a much larger set of activities with the potential to positively transform how health research is conducted.

Subjects

Biomedical Research; Health Records, Personal; Information Dissemination; Datasets as Topic; Female; Humans; Interviews as Topic; Male; Surveys and Questionnaires; Telemedicine

Sizing the Problem of Improving Discovery and Access to NIH-Funded Data: A Preliminary Study.

Read, Kevin B; Sheehan, Jerry R; Huerta, Michael F; Knecht, Lou S; Mork, James G; Humphreys, Betsy L.

PLoS One ; 10(7): e0132735, 2015.

Article in English | MEDLINE | ID: mdl-26207759

ABSTRACT

OBJECTIVE: This study informs efforts to improve the discoverability of and access to biomedical datasets by providing a preliminary estimate of the number and type of datasets generated annually by research funded by the U.S. National Institutes of Health (NIH). It focuses on those datasets that are "invisible" or not deposited in a known repository. METHODS: We analyzed NIH-funded journal articles that were published in 2011, cited in PubMed and deposited in PubMed Central (PMC) to identify those that indicate data were submitted to a known repository. After excluding those articles, we analyzed a random sample of the remaining articles to estimate how many and what types of invisible datasets were used in each article. RESULTS: About 12% of the articles explicitly mention deposition of datasets in recognized repositories, leaving 88% that are invisible datasets. Among articles with invisible datasets, we found an average of 2.9 to 3.4 datasets, suggesting there were approximately 200,000 to 235,000 invisible datasets generated from NIH-funded research published in 2011. Approximately 87% of the invisible datasets consist of data newly collected for the research reported; 13% reflect reuse of existing data. More than 50% of the datasets were derived from live human or non-human animal subjects. CONCLUSION: In addition to providing a rough estimate of the total number of datasets produced per year by NIH-funded researchers, this study identifies additional issues that must be addressed to improve the discoverability of and access to biomedical research data: the definition of a "dataset," determination of which (if any) data are valuable for archiving and preservation, and better methods for estimating the number of datasets of interest. Lack of consensus amongst annotators about the number of datasets in a given article reinforces the need for a principled way of thinking about how to identify and characterize biomedical datasets.

Subjects

Biomedical Research/economics; National Institutes of Health (U.S.)/organization & administration; Publishing/organization & administration; Access to Information; Biomedical Research/organization & administration; Databases, Bibliographic; Humans; National Institutes of Health (U.S.)/economics; United States

The proposed rule for U.S. clinical trial registration and results submission.

Zarin, Deborah A; Tse, Tony; Sheehan, Jerry.

N Engl J Med ; 372(2): 174-80, 2015 Jan 08.

Article in English | MEDLINE | ID: mdl-25539444

Subjects

Clinical Trials as Topic/legislation & jurisprudence; Databases, Factual; National Institutes of Health (U.S.); Registries; Clinical Trials as Topic/standards; Device Approval/legislation & jurisprudence; Device Approval/standards; Drug Approval/legislation & jurisprudence; Government Regulation; United States; United States Food and Drug Administration

The National Institutes of Health's Big Data to Knowledge (BD2K) initiative: capitalizing on biomedical big data.

Margolis, Ronald; Derr, Leslie; Dunn, Michelle; Huerta, Michael; Larkin, Jennie; Sheehan, Jerry; Guyer, Mark; Green, Eric D.

J Am Med Inform Assoc ; 21(6): 957-8, 2014.

Article in English | MEDLINE | ID: mdl-25008006

ABSTRACT

Biomedical research has and will continue to generate large amounts of data (termed 'big data') in many formats and at all levels. Consequently, there is an increasing need to better understand and mine the data to further knowledge and foster new discovery. The National Institutes of Health (NIH) has initiated a Big Data to Knowledge (BD2K) initiative to maximize the use of biomedical big data. BD2K seeks to better define how to extract value from the data, both for the individual investigator and the overall research community, create the analytic tools needed to enhance utility of the data, provide the next generation of trained personnel, and develop data science concepts and tools that can be made available to all stakeholders.

Subjects

Biomedical Research; Datasets as Topic; National Institutes of Health (U.S.); Translational Research, Biomedical; United States

Toward more transparent and reproducible omics studies through a common metadata checklist and data publications.

Kolker, Eugene; Özdemir, Vural; Martens, Lennart; Hancock, William; Anderson, Gordon; Anderson, Nathaniel; Aynacioglu, Sukru; Baranova, Ancha; Campagna, Shawn R; Chen, Rui; Choiniere, John; Dearth, Stephen P; Feng, Wu-Chun; Ferguson, Lynnette; Fox, Geoffrey; Frishman, Dmitrij; Grossman, Robert; Heath, Allison; Higdon, Roger; Hutz, Mara H; Janko, Imre; Jiang, Lihua; Joshi, Sanjay; Kel, Alexander; Kemnitz, Joseph W; Kohane, Isaac S; Kolker, Natali; Lancet, Doron; Lee, Elaine; Li, Weizhong; Lisitsa, Andrey; Llerena, Adrian; Macnealy-Koch, Courtney; Marshall, Jean-Claude; Masuzzo, Paola; May, Amanda; Mias, George; Monroe, Matthew; Montague, Elizabeth; Mooney, Sean; Nesvizhskii, Alexey; Noronha, Santosh; Omenn, Gilbert; Rajasimha, Harsha; Ramamoorthy, Preveen; Sheehan, Jerry; Smarr, Larry; Smith, Charles V; Smith, Todd; Snyder, Michael.

OMICS ; 18(1): 10-4, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24456465

ABSTRACT

Biological processes are fundamentally driven by complex interactions between biomolecules. Integrated high-throughput omics studies enable multifaceted views of cells, organisms, or their communities. With the advent of new post-genomics technologies, omics studies are becoming increasingly prevalent; yet the full impact of these studies can only be realized through data harmonization, sharing, meta-analysis, and integrated research. These essential steps require consistent generation, capture, and distribution of metadata. To ensure transparency, facilitate data harmonization, and maximize reproducibility and usability of life sciences studies, we propose a simple common omics metadata checklist. The proposed checklist is built on the rich ontologies and standards already in use by the life sciences community. The checklist will serve as a common denominator to guide experimental design, capture important parameters, and be used as a standard format for stand-alone data publications. The omics metadata checklist and data publications will create efficient linkages between omics data and knowledge-based life sciences innovation and, importantly, allow for appropriate attribution to data generators and infrastructure science builders in the post-genomics era. We ask that the life sciences community test the proposed omics metadata checklist and data publications and provide feedback for their use and improvement.

Subjects

Information Dissemination/ethics; Metagenomics/statistics & numerical data; Research Design/standards; Data Mining; Humans; Metagenomics/economics; Metagenomics/trends; Publishing; Reproducibility of Results

Toward More Transparent and Reproducible Omics Studies Through a Common Metadata Checklist and Data Publications.

Kolker, Eugene; Özdemir, Vural; Martens, Lennart; Hancock, William; Anderson, Gordon; Anderson, Nathaniel; Aynacioglu, Sukru; Baranova, Ancha; Campagna, Shawn R; Chen, Rui; Choiniere, John; Dearth, Stephen P; Feng, Wu-Chun; Ferguson, Lynnette; Fox, Geoffrey; Frishman, Dmitrij; Grossman, Robert; Heath, Allison; Higdon, Roger; Hutz, Mara H; Janko, Imre; Jiang, Lihua; Joshi, Sanjay; Kel, Alexander; Kemnitz, Joseph W; Kohane, Isaac S; Kolker, Natali; Lancet, Doron; Lee, Elaine; Li, Weizhong; Lisitsa, Andrey; Llerena, Adrian; MacNealy-Koch, Courtney; Marshall, Jean-Claude; Masuzzo, Paola; May, Amanda; Mias, George; Monroe, Matthew; Montague, Elizabeth; Mooney, Sean; Nesvizhskii, Alexey; Noronha, Santosh; Omenn, Gilbert; Rajasimha, Harsha; Ramamoorthy, Preveen; Sheehan, Jerry; Smarr, Larry; Smith, Charles V; Smith, Todd; Snyder, Michael.

Big Data ; 1(4): 196-201, 2013 Dec.

Article in English | MEDLINE | ID: mdl-27447251
