Quantitative monitoring of nucleotide sequence data from genetic resources in context of their citation in the scientific literature.
Lange, Matthias; Alako, Blaise T F; Cochrane, Guy; Ghaffar, Mehmood; Mascher, Martin; Habekost, Pia-Katharina; Hillebrand, Upneet; Scholz, Uwe; Schorch, Florian; Freitag, Jens; Scholz, Amber Hartman.
Affiliation
  • Lange M; Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany.
  • Alako BTF; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
  • Cochrane G; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
  • Ghaffar M; Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany.
  • Mascher M; Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany.
  • Habekost PK; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Puschstraße 4, 04103 Leipzig, Germany.
  • Hillebrand U; Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany.
  • Scholz U; The Harz University of Applied Science, Department of Automation and Computer Science, Friedrichstraße 57, 38855 Wernigerode, Germany.
  • Schorch F; Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Department Research - Microbial Ecology and Diversity, Inhoffenstraße 7B, 38124 Braunschweig, Germany.
  • Freitag J; Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany.
  • Scholz AH; Leibniz Institute of Plant Genetics and Crop Plant Research, Department Breeding Research, OT Gatersleben, Corrensstrasse 3, 06466 Seeland, Germany.
Gigascience; 10(12), 2021 Dec 29.
Article in English | MEDLINE | ID: mdl-34966925
ABSTRACT

BACKGROUND:

Linking nucleotide sequence data (NSD) to scientific publication citations can enhance understanding of NSD provenance, scientific use, and reuse in the community. By connecting publications with NSD records, NSD geographical provenance information, and author geographical information, it becomes possible to assess the contribution of NSD to infer trends in scientific knowledge gain at the global level.

FINDINGS:

We extracted and linked records from the European Nucleotide Archive to citations in open-access publications aggregated at Europe PubMed Central. A total of 8,464,292 ENA accessions with geographical provenance information were associated with publications. We conducted a data quality review to uncover potential issues in publication citation information extraction and author affiliation tagging and developed and implemented best-practice recommendations for citation extraction. We constructed flat data tables and a data warehouse with an interactive web application to enable ad hoc exploration of NSD use and summary statistics.
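The linking step described in the findings can be illustrated with a minimal sketch. This is not the authors' pipeline: the accession pattern is a simplified assumption about INSDC-style sequence accessions, the function names are hypothetical, and the publication texts and provenance table are invented toy data. It only shows the general idea of extracting accession-like identifiers from publication text and joining them to geographical provenance.

```python
import re
from collections import Counter

# Simplified, assumed pattern for INSDC-style sequence accessions
# (1 letter + 5 digits, 2 letters + 6 digits, or 2 letters + 8 digits).
# Real accession formats are more varied; this is for illustration only.
ACCESSION_RE = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})\b")


def extract_accessions(text):
    """Return the sorted, de-duplicated accession-like tokens found in text."""
    return sorted(set(ACCESSION_RE.findall(text)))


def link_publications(publications, provenance):
    """Join publications to accession provenance.

    publications: dict mapping a publication ID to its text
    provenance:   dict mapping an accession to a country of origin
    Returns (publication ID, accession, country) tuples for every cited
    accession that has known provenance.
    """
    links = []
    for pub_id, text in sorted(publications.items()):
        for accession in extract_accessions(text):
            if accession in provenance:
                links.append((pub_id, accession, provenance[accession]))
    return links


def count_by_country(links):
    """Count linked citations per provenance country."""
    return Counter(country for _, _, country in links)


# Hypothetical toy data, not taken from the study.
publications = {
    "PUB1": "Sequences were deposited under AB123456 and X12345.",
    "PUB2": "We reused AB123456 together with CD87654321.",
}
provenance = {"AB123456": "Germany", "X12345": "Kenya"}

links = link_publications(publications, provenance)
country_counts = count_by_country(links)
```

In this sketch, accessions without a provenance entry (such as `CD87654321` in the toy data) are dropped from the linked table, which mirrors the restriction in the text to accessions with geographical provenance information.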

CONCLUSIONS:

The extraction and linking of NSD with associated publication citations enables transparency. The quality review contributes to enhanced text mining methods for identifier extraction and use. Furthermore, the global provision and use of NSD enable scientists worldwide to join literature and sequence databases in a multidimensional fashion. As a concrete use case, we visualized statistics of country clusters concerning NSD access in the context of discussions around digital sequence information under the United Nations Convention on Biological Diversity.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Data Mining / Nucleotides Type of study: Guideline / Prognostic_studies Country/Region as subject: Europe Language: En Journal: Gigascience Year: 2021 Document type: Article Affiliation country: Germany
