Results 1 - 20 of 44
1.
Methods Mol Biol ; 2744: 7-32, 2024.
Article in English | MEDLINE | ID: mdl-38683309

ABSTRACT

This chapter on the history of the DNA barcoding enterprise attempts to set the stage for the more scholarly contributions in this volume by addressing the following questions. How did the DNA barcoding enterprise begin? What were its goals, how did it develop, and to what degree are its goals being realized? We have taken a keen interest in the barcoding movement and its relationship to taxonomy, collections, and biodiversity informatics more broadly considered. This chapter integrates our two different perspectives on barcoding. DES was the Executive Secretary of the Consortium for the Barcode of Life from 2004 to 2017, with the mission to support the success of DNA barcoding without being directly involved in generating barcode data. RDMP viewed barcoding as an important entry into the landscape of biodiversity data, with many potential linkages to other components of that landscape. We also saw it as a critical step toward the era of international genomic research that was sure to follow. Like the Mercury Program that paved the way for lunar landings by the Apollo Program, we saw DNA barcoding as the proving grounds for the interdisciplinary and international cooperation that would be needed for success of whole-genome research.


Subject(s)
Biodiversity; DNA Barcoding, Taxonomic; DNA Barcoding, Taxonomic/methods; Entrepreneurship; Humans; Inventions
2.
BMC Ecol ; 13: 16, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23587026

ABSTRACT

Biodiversity informatics plays a central enabling role in the research community's efforts to address scientific conservation and sustainability issues. Great strides have been made in the past decade establishing a framework for sharing data, where taxonomy and systematics has been perceived as the most prominent discipline involved. To some extent this is inevitable, given the use of species names as the pivot around which information is organised. To address the urgent questions around conservation, land-use, environmental change, sustainability, food security and ecosystem services that are facing Governments worldwide, we need to understand how the ecosystem works. So, we need a systems approach to understanding biodiversity that moves significantly beyond taxonomy and species observations. Such an approach needs to look at the whole system to address species interactions, both with their environment and with other species. It is clear that some barriers to progress are sociological, basically persuading people to use the technological solutions that are already available. This is best addressed by developing more effective systems that deliver immediate benefit to the user, hiding the majority of the technology behind simple user interfaces. An infrastructure should be a space in which activities take place and, as such, should be effectively invisible. This community consultation paper positions the role of biodiversity informatics, for the next decade, presenting the actions needed to link the various biodiversity infrastructures invisibly and to facilitate understanding that can support both business and policy-makers. The community considers the goal in biodiversity informatics to be full integration of the biodiversity research community, including citizens' science, through a commonly-shared, sustainable e-infrastructure across all sub-disciplines that reliably serves science and society alike.


Subject(s)
Biodiversity; Computational Biology/instrumentation; Computational Biology/methods; Animals; Ecosystem; Humans; Information Dissemination
3.
Biodivers Data J ; 11: e107914, 2023.
Article in English | MEDLINE | ID: mdl-37745899

ABSTRACT

A major gap in the biodiversity knowledge graph is a connection between taxonomic names and the taxonomic literature. While both names and publications often have persistent identifiers (PIDs), such as Life Science Identifiers (LSIDs) or Digital Object Identifiers (DOIs), LSIDs for names are rarely linked to DOIs for publications. This article describes efforts to make those connections across three large taxonomic databases: Index Fungorum, International Plant Names Index (IPNI) and the Index of Organism Names (ION). Over a million names have been matched to DOIs or other persistent identifiers for taxonomic publications. This represents approximately 36% of names for which publication data are available. The mappings between LSIDs and publication PIDs are made available through ChecklistBank. Applications of this mapping are discussed, including a web app to locate the citation of a taxonomic name and a knowledge graph that uses data on researcher ORCID ids to connect taxonomic names and publications to authors of those names.

4.
PeerJ ; 10: e13712, 2022.
Article in English | MEDLINE | ID: mdl-35821898

ABSTRACT

Biological taxonomy rests on a long tail of publications spanning nearly three centuries. Not only is this literature vital to resolving disputes about taxonomy and nomenclature, for many species it represents a key source (indeed sometimes the only source) of information about that species. Unlike other disciplines such as biomedicine, the taxonomic community lacks a centralised, curated literature database (the "bibliography of life"). This article argues that Wikidata can be that database as it has flexible and sophisticated models of bibliographic information, and an active community of people and programs ("bots") adding, editing, and curating that information.


Subject(s)
Software; Humans; Databases, Factual
5.
Elife ; 11, 2022 May 26.
Article in English | MEDLINE | ID: mdl-35616633

ABSTRACT

Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges for data access, both within the discipline and for integration and interoperability between related fields. The fundamental elements of exchange are referenced structure-organism pairs that establish relationships between distinct molecular structures and the living organisms from which they were identified. Consolidating and sharing such information via an open platform has strong transformative potential for natural products research and beyond. This is the ultimate goal of the newly established LOTUS initiative, which has now completed the first steps toward the harmonization, curation, validation and open dissemination of 750,000+ referenced structure-organism pairs. LOTUS data is hosted on Wikidata and regularly mirrored on https://lotus.naturalproducts.net. Data sharing within the Wikidata framework broadens data access and interoperability, opening new possibilities for community curation and evolving publication models. Furthermore, embedding LOTUS data into the vast Wikidata knowledge graph will facilitate new biological and chemical insights. The LOTUS initiative represents an important advancement in the design and deployment of a comprehensive and collaborative natural products knowledge base.


Subject(s)
Biological Products; Knowledge Management; Computational Biology; Databases, Factual; Knowledge
6.
BMC Bioinformatics ; 12: 187, 2011 May 23.
Article in English | MEDLINE | ID: mdl-21605356

ABSTRACT

BACKGROUND: The Biodiversity Heritage Library (BHL) is a large digital archive of legacy biological literature, comprising over 31 million pages scanned from books, monographs, and journals. During the digitisation process basic metadata about the scanned items is recorded, but not article-level metadata. Given that the article is the standard unit of citation, this makes it difficult to locate cited literature in BHL. Adding the ability to easily find articles in BHL would greatly enhance the value of the archive. DESCRIPTION: A service was developed to locate articles in BHL based on matching article metadata to BHL metadata using approximate string matching, regular expressions, and string alignment. This article locating service is exposed as a standard OpenURL resolver on the BioStor web site http://biostor.org/openurl/. This resolver can be used on the web, or called by bibliographic tools that support OpenURL. CONCLUSIONS: BioStor provides tools for extracting, annotating, and visualising articles from the Biodiversity Heritage Library. BioStor is available from http://biostor.org/.


Subject(s)
Biology; Information Storage and Retrieval; Libraries, Digital; Publications; Archives; Biodiversity; Periodicals as Topic
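
The abstract for this record mentions approximate string matching between article metadata and BHL metadata. A minimal sketch of that general idea follows, using Python's standard `difflib`; the function names, the normalisation step and the 0.8 threshold are assumptions made for illustration, and this is not BioStor's actual matching code.

```python
from difflib import SequenceMatcher


def title_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and repeated whitespace."""
    def normalise(s: str) -> str:
        return " ".join(s.lower().split())

    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_match(query: str, candidates: list[str], threshold: float = 0.8):
    """Return (candidate, score) for the closest title, or None if none reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    scored = [(c, title_similarity(query, c)) for c in candidates]
    best = max(scored, key=lambda pair: pair[1], default=None)
    if best is None or best[1] < threshold:
        return None
    return best
```

A caller would pass the cited title and a list of candidate titles drawn from scanned-item metadata, then inspect the returned score before accepting the match.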
7.
Brief Bioinform ; 9(5): 345-54, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18445641

ABSTRACT

A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that a lot of information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers [such as Digital Object Identifiers (DOIs) and Life Science Identifiers (LSIDs)], and the implementation of services that link those identifiers.


Subject(s)
Computational Biology/methods; Database Management Systems; Databases, Factual; Documentation/methods; Information Storage and Retrieval/methods; Natural Language Processing; Terminology as Topic
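
The abstract for this record notes that links between identifiers can be ranked with the PageRank algorithm. The sketch below is a plain power-iteration PageRank over a small link graph; the damping factor of 0.85, the iteration count and the identifier strings in the usage note are illustrative assumptions, not details taken from the review.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank; `links` maps each node to the nodes it points at."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

For example, with hypothetical records `{"specimen:1": ["doi:A"], "genbank:X": ["doi:A"], "doi:A": []}`, the publication that both other records link to receives the highest score.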
8.
Database (Oxford) ; 2020, 2020 Nov 27.
Article in English | MEDLINE | ID: mdl-33439246

ABSTRACT

People are one of the best known and most stable entities in the biodiversity knowledge graph. The wealth of public information associated with people and the ability to identify them uniquely open up the possibility to make more use of these data in biodiversity science. Person data are almost always associated with entities such as specimens, molecular sequences, taxonomic names, observations, images, traits and publications. For example, the digitization and the aggregation of specimen data from museums and herbaria allow us to view a scientist's specimen collecting in conjunction with the whole corpus of their works. However, the metadata of these entities are also useful in validating data, integrating data across collections and institutional databases and can be the basis of future research into biodiversity and science. In addition, the ability to reliably credit collectors for their work has the potential to change the incentive structure to promote improved curation and maintenance of natural history collections.


Subject(s)
Biodiversity; Natural History; Databases, Factual; Humans; Museums
9.
BMC Bioinformatics ; 10 Suppl 14: S5, 2009 Nov 10.
Article in English | MEDLINE | ID: mdl-19900301

ABSTRACT

BACKGROUND: Linking together the data of interest to biodiversity researchers (including specimen records, images, taxonomic names, and DNA sequences) requires services that can mint, resolve, and discover globally unique identifiers (including, but not limited to, DOIs, HTTP URIs, and LSIDs). RESULTS: bioGUID implements a range of services, the core ones being an OpenURL resolver for bibliographic resources, and a LSID resolver. The LSID resolver supports Linked Data-friendly resolution using HTTP 303 redirects and content negotiation. Additional services include journal ISSN look-up, author name matching, and a tool to monitor the status of biodiversity data providers. CONCLUSION: bioGUID is available at http://bioguid.info/. Source code is available from http://code.google.com/p/bioguid/.


Subject(s)
Biodiversity; Computational Biology; Databases, Factual; Humans; Internet
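
Before an LSID can be resolved it has to be split into its parts. The sketch below parses the standard `urn:lsid:authority:namespace:object[:revision]` layout; it only illustrates the identifier syntax and is not bioGUID's resolver. The class and function names are assumptions, as is the `example.org` identifier in the usage note.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> LSID:
    """Split an LSID into its components, raising ValueError on malformed input."""
    parts = text.strip().split(":")
    # Expect "urn", "lsid", authority, namespace, object and an optional revision.
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)
```

Calling `parse_lsid("urn:lsid:example.org:names:12345")` yields the authority `example.org`, which a resolver would then use to locate the service responsible for that identifier.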
10.
PeerJ ; 7: e6739, 2019.
Article in English | MEDLINE | ID: mdl-30993051

ABSTRACT

Enormous quantities of biodiversity data are being made available online, but much of this data remains isolated in silos. One approach to breaking these silos is to map local, often database-specific identifiers to shared global identifiers. This mapping can then be used to construct a knowledge graph, where entities such as taxa, publications, people, places, specimens, sequences, and institutions are all part of a single, shared knowledge space. Motivated by the 2018 GBIF Ebbe Nielsen Challenge I explore the feasibility of constructing a "biodiversity knowledge graph" for the Australian fauna. The data cleaning and reconciliation steps involved in constructing the knowledge graph are described in detail. Examples are given of its application to understanding changes in patterns of taxonomic publication over time. A web interface to the knowledge graph (called "Ozymandias") is available at https://ozymandias-demo.herokuapp.com.

11.
Biodivers Data J ; (6): e27539, 2018.
Article in English | MEDLINE | ID: mdl-30065607

ABSTRACT

Constructing a biodiversity knowledge graph will require making millions of cross links between diverse entities in different datasets. Researchers trying to bootstrap the growth of the biodiversity knowledge graph by constructing databases of links between these entities lack obvious ways to publish these sets of links. One appealing and lightweight approach is to create a "datasette", a database that is wrapped together with a simple web server that enables users to query the data. Datasettes can be packaged into Docker containers and hosted online with minimal effort. This approach is illustrated using a dataset of links between globally unique identifiers for plant taxonomic names and identifiers for the taxonomic articles that published those names.
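
The database half of such a "datasette" can be as small as a single SQLite table of links. The sketch below builds one with Python's standard `sqlite3` module; the table name, column names and the identifiers in the usage note are hypothetical and are not taken from the published dataset. Wrapping the resulting file in a web server is left out.

```python
import sqlite3


def build_links_db(rows, path=":memory:"):
    """Create (or open) a SQLite database holding name-to-publication links."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links ("
        " name_id TEXT NOT NULL,"
        " publication_id TEXT NOT NULL,"
        " PRIMARY KEY (name_id, publication_id))"
    )
    # INSERT OR IGNORE makes reloading the same links harmless.
    conn.executemany("INSERT OR IGNORE INTO links VALUES (?, ?)", rows)
    conn.commit()
    return conn


def publications_for(conn, name_id):
    """Return the publication identifiers linked to one name identifier."""
    cur = conn.execute(
        "SELECT publication_id FROM links WHERE name_id = ? ORDER BY publication_id",
        (name_id,),
    )
    return [row[0] for row in cur]
```

Loading rows such as `("urn:lsid:example.org:names:1", "doi:10.1234/abc")` and querying by the name identifier returns the linked publication identifiers, which is the kind of lookup a lightweight server would expose.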

12.
BMC Bioinformatics ; 8: 158, 2007 May 18.
Article in English | MEDLINE | ID: mdl-17511869

ABSTRACT

BACKGROUND: TreeBASE is currently the only available large-scale database of published organismal phylogenies. Its utility is hampered by a lack of taxonomic consistency, both within the database, and with names of organisms in external genomic, specimen, and taxonomic databases. The extent to which the phylogenetic knowledge in TreeBASE becomes integrated with these other sources is limited by this lack of consistency. DESCRIPTION: Taxonomic names in TreeBASE were mapped onto names in the external taxonomic databases IPNI, ITIS, NCBI, and uBio, and a graph G of these mappings was constructed. Additional edges representing taxonomic synonymies were added to G, then all components of G were extracted. These components correspond to "name clusters", and group together names in TreeBASE that are inferred to refer to the same taxon. The mapping to NCBI enables hierarchical queries to be performed, which can improve TreeBASE information retrieval by an order of magnitude. CONCLUSION: The TBMap database provides a mapping of the bulk of the names in TreeBASE to names in external taxonomic databases, and a clustering of those mappings into sets of names that can be regarded as equivalent. This mapping enables queries and visualisations that cannot otherwise be constructed. A simple query interface to the mapping and names clusters is available at http://linnaeus.zoology.gla.ac.uk/~rpage/tbmap.


Subject(s)
Classification/methods; Database Management Systems; Databases, Genetic; Documentation/methods; Models, Genetic; Phylogeny; User-Computer Interface; Base Sequence; Chromosome Mapping; Molecular Sequence Data; Sequence Analysis, DNA
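
The abstract for this record describes extracting the components of a graph of name mappings and treating each component as a "name cluster". A minimal sketch of that step is shown below using a union-find structure, which is one common way to compute connected components; the abstract does not say which algorithm TBMap used, and the labels in the usage note are placeholders.

```python
def name_clusters(edges):
    """Group names into clusters: the connected components of an undirected graph.

    `edges` is an iterable of (name_a, name_b) pairs, each meaning the two
    names were mapped to each other or recorded as synonyms.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())
```

Given the placeholder edges `[("A", "B"), ("B", "C"), ("D", "E")]`, the function returns two clusters, `{"A", "B", "C"}` and `{"D", "E"}`.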
13.
BMC Evol Biol ; 7: 227, 2007 Nov 15.
Article in English | MEDLINE | ID: mdl-18005412

ABSTRACT

BACKGROUND: The diversity of parasites attacking a host varies substantially among different host species. Understanding the factors that explain these patterns of parasite diversity is critical to identifying the ecological principles underlying biodiversity. Seabirds (Charadriiformes, Pelecaniformes and Procellariiformes) and their ectoparasitic lice (Insecta: Phthiraptera) are ideal model groups in which to study correlates of parasite species richness. We evaluated the relative importance of morphological (body size, body weight, wingspan, bill length), life-history (longevity, clutch size), ecological (population size, geographical range) and behavioural (diving versus non-diving) variables as predictors of louse diversity on 413 seabird host species. Diversity was measured at the level of louse suborder, genus, and species, and uneven sampling of hosts was controlled for using literature citations as a proxy for sampling effort. RESULTS: The only variable consistently correlated with louse diversity was host population size and to a lesser extent geographic range. Other variables such as clutch size, longevity, morphological and behavioural variables including body mass showed inconsistent patterns dependent on the method of analysis. CONCLUSION: The comparative analysis presented herein is (to our knowledge) the first to test correlates of parasite species richness in seabirds. We believe that the comparative data and phylogeny provide a valuable framework for testing future evolutionary hypotheses relating to the diversity and distribution of parasites on seabirds.


Subject(s)
Biodiversity; Charadriiformes/parasitology; Host-Parasite Interactions; Phthiraptera/classification; Animals; Behavior, Animal; Charadriiformes/anatomy & histology; Charadriiformes/physiology; Phylogeny; Population Density; Species Specificity
14.
BMC Evol Biol ; 6: 66, 2006 Aug 29.
Article in English | MEDLINE | ID: mdl-16939643

ABSTRACT

BACKGROUND: The shape of phylogenetic trees has been used to make inferences about the evolutionary process by comparing the shapes of actual phylogenies with those expected under simple models of the speciation process. Previous studies have focused on speciation events, but gene duplication is another lineage splitting event, analogous to speciation, and gene loss or deletion is analogous to extinction. Measures of the shape of gene family phylogenies can thus be used to investigate the processes of gene duplication and loss. We make the first systematic attempt to use tree shape to study gene duplication using human gene phylogenies. RESULTS: We find that gene duplication has produced gene family trees significantly less balanced than expected from a simple model of the process, and less balanced than species phylogenies: the opposite to what might be expected under the 2R hypothesis. CONCLUSION: While other explanations are plausible, we suggest that the greater imbalance of gene family trees than species trees is due to the prevalence of tandem duplications over regional duplications during the evolution of the human genome.


Subject(s)
Phylogeny; Gene Deletion; Gene Duplication; Humans/genetics; Models, Genetic
15.
Article in English | MEDLINE | ID: mdl-27481786

ABSTRACT

Both classical taxonomy and DNA barcoding are engaged in the task of digitizing the living world. Much of the taxonomic literature remains undigitized. The rise of open access publishing this century and the freeing of older literature from the shackles of copyright have greatly increased the online availability of taxonomic descriptions, but much of the literature of the mid- to late-twentieth century remains offline ('dark texts'). DNA barcoding is generating a wealth of computable data that in many ways are much easier to work with than classical taxonomic descriptions, but many of the sequences are not identified to species level. These 'dark taxa' hamper the classical method of integrating biodiversity data, using shared taxonomic names. Voucher specimens are a potential common currency of both the taxonomic literature and sequence databases, and could be used to help link names, literature and sequences. An obstacle to this approach is the lack of stable, resolvable specimen identifiers. The paper concludes with an appeal for a global 'digital dashboard' to assess the extent to which biodiversity data are available online. This article is part of the themed issue 'From DNA barcodes to biomes'.


Subject(s)
Classification/methods; DNA Barcoding, Taxonomic; Periodicals as Topic; Biodiversity; Specimen Handling
16.
Zookeys ; (550): 247-60, 2016.
Article in English | MEDLINE | ID: mdl-26877663

ABSTRACT

Taxonomic databases are perpetuating approaches to citing literature that may have been appropriate before the Internet, often being little more than digitised 5 × 3 index cards. Typically the original taxonomic literature is either not cited, or is represented in the form of a (typically abbreviated) text string. Hence much of the "deep data" of taxonomy, such as the original descriptions, revisions, and nomenclatural actions are largely hidden from all but the most resourceful users. At the same time there are burgeoning efforts to digitise the scientific literature, and much of this newly available content has been assigned globally unique identifiers such as Digital Object Identifiers (DOIs), which are also the identifier of choice for most modern publications. This represents an opportunity for taxonomic databases to engage with digitisation efforts. Mapping the taxonomic literature on to globally unique identifiers can be time consuming, but need be done only once. Furthermore, if we reuse existing identifiers, rather than mint our own, we can start to build the links between the diverse data that are needed to support the kinds of inference which biodiversity informatics aspires to support. Until this practice becomes widespread, the taxonomic literature will remain balkanized, and much of the knowledge that it contains will linger in obscurity.

17.
BMC Bioinformatics ; 6: 48, 2005 Mar 09.
Article in English | MEDLINE | ID: mdl-15757517

ABSTRACT

BACKGROUND: The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. RESULTS: The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. CONCLUSION: The Taxonomic Search Engine is available at http://darwin.zoology.gla.ac.uk/~rpage/portal/ and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names.


Subject(s)
Computational Biology/methods; Databases, Factual; Classification; Computer Communication Networks; Database Management Systems; Databases as Topic; Databases, Genetic; Databases, Protein; Information Dissemination; Information Services; Information Storage and Retrieval; Information Systems; Internet; Medical Informatics; National Institutes of Health (U.S.); National Library of Medicine (U.S.); Sequence Analysis, Protein; Software; Software Design; Systems Integration; Unified Medical Language System; United States; User-Computer Interface
18.
BMC Bioinformatics ; 6: 208, 2005 Aug 25.
Article in English | MEDLINE | ID: mdl-16122379

ABSTRACT

BACKGROUND: The NCBI taxonomy provides one of the most powerful ways to navigate sequence databases, but currently users are forced to formulate queries according to a single taxonomic classification. Given that there is no universal agreement on the classification of organisms, providing a single classification places constraints on the questions biologists can ask. However, maintaining multiple classifications is burdensome in the face of a constantly growing NCBI classification. RESULTS: In this paper, we present a solution to the problem of generating modifications of the NCBI taxonomy, based on the computation of an edit script that summarises the differences between two classification trees. Our algorithms find the shortest possible edit script based on the identification of all shared subtrees, and only take time quasi-linear in the size of the trees because classification trees have unique node labels. CONCLUSION: These algorithms have been recently implemented, and the software is freely available for download from http://darwin.zoology.gla.ac.uk/~rpage/forest/.


Subject(s)
Algorithms; Classification/methods; Computers, Molecular; Animals; Databases, Genetic; Databases, Nucleic Acid/classification; Humans; Phylogeny; Species Specificity
19.
Proc Biol Sci ; 272(1560): 277-83, 2005 Feb 07.
Article in English | MEDLINE | ID: mdl-15705552

ABSTRACT

Gene duplication has certainly played a major role in structuring vertebrate genomes but the extent and nature of the duplication events involved remains controversial. A recent study identified two major episodes of gene duplication: one episode of putative genome duplication ca. 500 Myr ago and a more recent gene-family expansion attributed to segmental or tandem duplications. We confirm this pattern using methods not reliant on molecular clocks for individual gene families. However, analysis of a simple model of the birth-death process suggests that the apparent recent episode of duplication is an artefact of the birth-death process. We show that a constant-rate birth-death model is appropriate for gene duplication data, allowing us to estimate the rate of gene duplication and loss in the vertebrate genome over the last 200 Myr (0.00115 and 0.00740 Myr⁻¹ lineage⁻¹, respectively). Finally, we show that increasing rates of gene loss reduce the impact of a genome-wide duplication event on the distribution of gene duplications through time.


Subject(s)
Molecular Evolution; Gene Deletion; Gene Duplication; Genome, Human; Models, Genetic; Genomics; Humans
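
For a constant-rate birth-death process, the expected number of lineages after time t, starting from n0 lineages, is n0 · exp((birth − death) · t). The sketch below plugs in the duplication and loss rates quoted in the abstract for this record. It illustrates only this textbook expectation and is not the estimation procedure used in the paper; the function and constant names are assumptions.

```python
import math

# Rates quoted in the abstract, per Myr per lineage.
DUPLICATION_RATE = 0.00115
LOSS_RATE = 0.00740


def expected_lineages(n0: float, t_myr: float,
                      birth: float = DUPLICATION_RATE,
                      death: float = LOSS_RATE) -> float:
    """Expected lineage count after t_myr under a constant-rate birth-death process."""
    return n0 * math.exp((birth - death) * t_myr)
```

Because the quoted loss rate exceeds the duplication rate, the net rate is negative, so the expectation falls over time: `expected_lineages(1, 200)` is below 1.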
20.
PLoS Curr ; 7, 2015 Jun 23.
Article in English | MEDLINE | ID: mdl-26146589

ABSTRACT

This article describes a simple tool to display geophylogenies on web maps including Google Maps and OpenStreetMap. The tool reads a NEXUS format file that includes geographic information, and outputs a GeoJSON format file that can be displayed in a web map application.
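
The output side of such a tool can be sketched as follows: given tip labels with coordinates, build a GeoJSON FeatureCollection of points. Parsing the NEXUS file and drawing the tree branches are omitted. The function name and the shape of the input dictionary are assumptions, and the abstract does not describe the tool's actual output layout in this detail.

```python
import json


def tips_to_geojson(tips: dict[str, tuple[float, float]]) -> dict:
    """Convert {label: (latitude, longitude)} into a GeoJSON FeatureCollection."""
    features = []
    for label, (lat, lon) in tips.items():
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"label": label},
        })
    return {"type": "FeatureCollection", "features": features}


def to_geojson_text(tips: dict[str, tuple[float, float]]) -> str:
    """Serialise the FeatureCollection so it can be saved and loaded by a web map."""
    return json.dumps(tips_to_geojson(tips), indent=2)
```

The longitude-first ordering is the detail most often reversed when moving from latitude and longitude pairs to GeoJSON.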
