Multi-layer framework of identifying placenta related research towards Placenta Curated Research Dataset (PCRD) development for the PAT project.
Zhu, Qian; Frierson, Shanna; Francis, Alicia; Rogers, Lydia; Lyman, Daniel.
Affiliations
  • Zhu Q; Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD 20850, USA. Electronic address: Qian.Zhu@nih.gov.
  • Frierson S; Bloomberg Government, 1101 K St NW #500, Washington, DC 20005, USA. Electronic address: sfrierson@bgov.com.
  • Francis A; Public Health Science, Social & Scientific Systems, Inc., Silver Spring, MD 20910, USA.
  • Rogers L; Public Health Science, Social & Scientific Systems, Inc., Silver Spring, MD 20910, USA.
  • Lyman D; Public Health Science, Social & Scientific Systems, Inc., Silver Spring, MD 20910, USA.
J Biomed Inform; 94: 103191, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31048073
BACKGROUND: The placenta is a maternal-fetal organ that develops during pregnancy, supplying nutrients and oxygen to the growing fetus and removing its waste products. A better understanding of the placenta promises to improve the health of mothers and children, given its lifelong influence on health. However, the placenta remains poorly understood because it varies across species and no longer functions once the baby is delivered. The Placenta Atlas Tool (PAT) project aims to leverage advanced computational approaches to meld numerous and diverse datasets into an integrated resource that encourages a "systems biology" approach to the study of both normal and abnormal placental development and function throughout gestation.

METHODS: In this study, we introduce a multi-layer framework that automatically identifies PAT-relevant research in PubMed, with the aim of developing a Placenta Curated Research Dataset (PCRD) that ultimately supports placenta research. The framework combines several well-known Natural Language Processing (NLP) components: a Medical Subject Headings (MeSH) based Naïve Bayes classifier, abstract-based text similarity comparison, and MeSH-based article prioritization. Together these systematically select PAT-relevant research publications for further data curation. In addition, we developed a user-friendly web application to incorporate human judgement at the final stage of publication identification.

RESULTS: We obtained 22,047 articles from PubMed and programmatically identified 6,086 articles that are highly relevant to PAT via our framework. To assess the framework's performance, we manually reviewed a random set of articles using our web tool. Based on this review, the accuracy of article classification is greater than 90% and the accuracy of prioritization is greater than 80%.

CONCLUSIONS: We developed a multi-layer publication identification framework to systematically identify PAT-relevant publications from PubMed. This framework not only performs well in identifying placenta-related research but can also be easily extended to support research in other scientific fields.
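
The abstract names the framework's components but gives no implementation details. As a rough illustration only, the first two layers (a Naïve Bayes classifier over MeSH terms and an abstract-level text similarity score) might be sketched as below. Everything here is an assumption made for illustration: the function names, the multinomial model with add-one smoothing, the bag-of-words cosine similarity, and the toy training records are hypothetical and are not taken from the authors' system.

```python
import math
from collections import Counter


def train_nb(docs):
    """Fit a multinomial Naive Bayes model on (mesh_term_set, label) pairs."""
    labels = Counter(label for _, label in docs)
    term_counts = {label: Counter() for label in labels}
    vocab = set()
    for terms, label in docs:
        term_counts[label].update(terms)
        vocab.update(terms)
    return {"labels": labels, "term_counts": term_counts,
            "vocab": vocab, "n": len(docs)}


def predict_nb(model, terms):
    """Return the label with the highest log-posterior for a set of MeSH terms."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_label in model["labels"].items():
        counts = model["term_counts"][label]
        total = sum(counts.values())
        score = math.log(n_label / model["n"])  # log prior
        for term in terms:
            if term in model["vocab"]:  # terms unseen in training are ignored
                # add-one (Laplace) smoothing avoids zero probabilities
                score += math.log((counts[term] + 1) / (total + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw word counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


# Toy, invented training records (hypothetical, not from the PCRD).
training = [
    ({"Placenta", "Pregnancy", "Trophoblasts"}, "relevant"),
    ({"Placenta", "Fetal Development"}, "relevant"),
    ({"Neoplasms", "Humans"}, "other"),
    ({"Bayes Theorem", "Humans"}, "other"),
]
model = train_nb(training)
label = predict_nb(model, {"Placenta", "Trophoblasts"})
score = cosine_similarity("placental development in gestation",
                          "normal placental development")
```

In a pipeline of this shape, articles labeled relevant by the classifier could then be ranked by their similarity to a set of reference abstracts before human review; how the published framework orders and weights these steps is not stated in the abstract.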

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Placenta / Bayes Theorem / Biomedical Research / Datasets as Topic Limits: Female / Humans / Pregnancy Language: En Year of publication: 2019 Document type: Article
