
The NIH Open Citation Collection: A public access, broad coverage resource.

Hutchins, B Ian; Baker, Kirk L; Davis, Matthew T; Diwersy, Mario A; Haque, Ehsanul; Harriman, Robert M; Hoppe, Travis A; Leicht, Stephen A; Meyer, Payam; Santangelo, George M.

PLoS Biol ; 17(10): e3000385, 2019 10.

Article in English | MEDLINE | ID: mdl-31600197

ABSTRACT

Citation data have remained hidden behind proprietary, restrictive licensing agreements, which raises barriers to entry for analysts wishing to use the data, increases the expense of performing large-scale analyses, and reduces the robustness and reproducibility of the conclusions. For the past several years, the National Institutes of Health (NIH) Office of Portfolio Analysis (OPA) has been aggregating and enhancing citation data that can be shared publicly. Here, we describe the NIH Open Citation Collection (NIH-OCC), a public access database for biomedical research that is made freely available to the community. This dataset, which has been carefully generated from unrestricted data sources such as MedLine, PubMed Central (PMC), and CrossRef, now underlies the citation statistics delivered in the NIH iCite analytic platform. We have also included data from a machine learning pipeline that identifies, extracts, resolves, and disambiguates references from full-text articles available on the internet. Open citation links are available to the public in a major update of iCite (https://icite.od.nih.gov).

Subject(s)

Information Dissemination/ethics , National Institutes of Health (U.S.)/legislation & jurisprudence , Open Access Publishing/legislation & jurisprudence , Organizational Policy , Bibliometrics , Biomedical Research , Humans , Machine Learning , Manuscripts as Topic , National Institutes of Health (U.S.)/economics , Open Access Publishing/economics , United States

A Machine Learning Approach to Identify NIH-Funded Applied Prevention Research.

Villani, Jennifer; Schully, Sheri D; Meyer, Payam; Myles, Ranell L; Lee, Jocelyn A; Murray, David M; Vargas, Ashley J.

Am J Prev Med ; 55(6): 926-931, 2018 12.

Article in English | MEDLINE | ID: mdl-30458951

ABSTRACT

INTRODUCTION: To fulfill its mission, the NIH Office of Disease Prevention systematically monitors NIH investments in applied prevention research. Specifically, the Office focuses on research in humans involving primary and secondary prevention, and prevention-related methods. Currently, the NIH uses the Research, Condition, and Disease Categorization system to report agency funding in prevention research. However, this system defines prevention research broadly to include primary and secondary prevention, studies on prevention methods, and basic and preclinical studies for prevention. A new methodology was needed to quantify NIH funding in applied prevention research. METHODS: A novel machine learning approach was developed and evaluated for its ability to characterize NIH-funded applied prevention research during fiscal years 2012-2015. The sensitivity, specificity, positive predictive value, accuracy, and F1 score of the machine learning method; the Research, Condition, and Disease Categorization system; and a combined approach were estimated. Analyses were completed during June-August 2017. RESULTS: Because the machine learning method was trained to recognize applied prevention research, it more accurately identified applied prevention grants (F1 = 72.7%) than the Research, Condition, and Disease Categorization system (F1 = 54.4%) and a combined approach (F1 = 63.5%) with p<0.001. CONCLUSIONS: This analysis demonstrated the use of machine learning as an efficient method to classify NIH-funded research grants in disease prevention.

Subject(s)

Financing, Government/classification , Health Services Research/economics , Machine Learning , National Institutes of Health (U.S.) , Humans , Primary Prevention , Secondary Prevention , United States
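
The METHODS section of the abstract above names five evaluation measures: sensitivity, specificity, positive predictive value, accuracy, and F1 score. As an illustration only, the sketch below computes them from a binary confusion matrix using their standard definitions. It is not the authors' code; the names `ConfusionCounts` and `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary classification counts."""
    tp: int  # grants correctly labeled as applied prevention research
    fp: int  # grants incorrectly labeled as applied prevention research
    tn: int  # grants correctly labeled as not applied prevention research
    fn: int  # applied prevention grants the classifier missed


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the five measures from their standard definitions."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # true positive rate (recall)
    specificity = _ratio(c.tn, c.tn + c.fp)  # true negative rate
    ppv = _ratio(c.tp, c.tp + c.fp)          # positive predictive value (precision)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F1 is the harmonic mean of PPV and sensitivity, i.e. 2TP / (2TP + FP + FN).
    f1 = _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts for illustration; these are not results from the study.
    example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives, it is a common choice when the class of interest (here, applied prevention grants) is a minority of the portfolio, which may be why the abstract reports F1 as its headline comparison.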
