1.
Evol Bioinform Online ; 16: 1176934319899384, 2020.
Article in English | MEDLINE | ID: mdl-32372858

ABSTRACT

A comprehensive phylogeny of species, i.e., a tree of life, has potential uses in a variety of contexts, including research, education, and public policy. Yet, accessing the tree of life typically requires special knowledge, complex software, or long periods of training. The Phylotastic project aims to make it as easy to get a phylogeny of species as it is to get driving directions from mapping software. In prior work, we presented a design for an open system to validate and manage taxon names, find phylogeny resources, extract subtrees matching a user's taxon list, scale trees to time, and integrate related resources such as species images. Here, we report the implementation of a set of tools that together represent a robust, accessible system for on-the-fly delivery of phylogenetic knowledge. This set of tools includes a web portal to execute several customizable workflows to obtain species phylogenies (scaled by geologic time and decorated with thumbnail images); more than 30 underlying web services (accessible via a common registry); and code toolkits in R and Python (allowing others to develop custom applications using Phylotastic services). The Phylotastic system, accessible via http://www.phylotastic.org, provides a unique resource to access the current state of phylogenetic knowledge, useful for a variety of cases in which a tree extracted quickly from online resources (as distinct from a tree custom-made from character data) is sufficient, as it is for many casual uses of trees identified here.

2.
BMC Bioinformatics ; 18(1): 279, 2017 May 26.
Article in English | MEDLINE | ID: mdl-28549446

ABSTRACT

BACKGROUND: Scientific names in biology act as universal links. They allow us to cross-reference information about organisms globally. However, variations in the spelling of scientific names greatly diminish their ability to interconnect data. Such variations may include abbreviations, annotations, misspellings, etc. Authorship is a part of a scientific name and may also differ significantly. To match all possible variations of a name we need to divide them into their elements and classify each element according to its role. We refer to this as 'parsing' the name. Parsing categorizes a name's elements into those that are stable and those that are prone to change. Names are matched first by combining them according to their stable elements. Matches are then refined by examining their varying elements. This two-stage process dramatically improves the number and quality of matches. It is especially useful for automatic data exchange within the context of "Big Data" in biology. RESULTS: We introduce Global Names Parser (gnparser). It is a Java tool written in the Scala language (a language for the Java Virtual Machine) to parse scientific names. It is based on a Parsing Expression Grammar. The parser can be applied to scientific names of any complexity. It assigns a semantic meaning (such as genus name, species epithet, rank, year of publication, authorship, annotations, etc.) to all elements of a name. It is able to work with nested structures as in the names of hybrids. gnparser performs with ≈99% accuracy and processes 30 million name-strings/hour per CPU thread. The gnparser library is compatible with Scala, Java, R, Jython, and JRuby. The parser can be used as a command line application, as a socket server, a web-app or as a RESTful HTTP-service. It is released under an open-source MIT license. CONCLUSIONS: Global Names Parser (gnparser) is a fast, high-precision tool for biodiversity informaticians and biologists working with large numbers of scientific names. It can replace expensive and error-prone manual parsing and standardization of scientific names in many situations, and can quickly enhance the interoperability of distributed biological information.
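
The two-stage matching described in this abstract (group names by their stable elements, then refine by the varying ones) can be illustrated with a minimal Python sketch. This is not gnparser and does not use its Parsing Expression Grammar: the regular expression, the function names `parse_name`, `group_by_canonical` and `same_year`, and the simple "Genus epithet authorship" layout are all assumptions made for illustration, and they only cover plain binomial names.

```python
import re
from collections import defaultdict

# Toy pattern for "Genus epithet [authorship ...]".
# Hypothetical stand-in for a real grammar; it ignores hybrids, ranks, etc.
NAME_RE = re.compile(r"^\s*([A-Z][a-z]+)\s+([a-z-]+)\s*(.*)$")
YEAR_RE = re.compile(r"\b(1[5-9]\d\d|20\d\d)\b")


def parse_name(name_string):
    """Split a name-string into a stable part (canonical form)
    and varying parts (authorship text, year)."""
    m = NAME_RE.match(name_string)
    if m is None:
        return None
    genus, epithet, rest = m.groups()
    year = YEAR_RE.search(rest)
    return {
        "canonical": f"{genus} {epithet}",      # stable element
        "authorship": rest.strip(" ,()"),       # varying element
        "year": year.group(1) if year else None,
    }


def group_by_canonical(name_strings):
    """Stage 1: bring together all strings that share a canonical form."""
    groups = defaultdict(list)
    for s in name_strings:
        parsed = parse_name(s)
        if parsed is not None:
            groups[parsed["canonical"]].append((s, parsed))
    return dict(groups)


def same_year(a, b):
    """Stage 2 (one possible refinement): two parsed names stay matched
    unless both carry a year and the years disagree."""
    return a["year"] is None or b["year"] is None or a["year"] == b["year"]
```

Under these assumptions, `group_by_canonical` would place "Homo sapiens Linnaeus, 1758" and "Homo sapiens L." in the same group, and `same_year` could then be used to split groups whose members carry conflicting publication years.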


Subject(s)
User-Computer Interface , Biodiversity , Informatics , Internet , Terminology as Topic
4.
Integr Comp Biol ; 54(2): 250-63, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24907201

ABSTRACT

Marine and aquatic animals are extraordinarily useful as models for identifying mechanisms of development and evolution, regeneration, resistance to cancer, longevity and symbiosis, among many other areas of research. This is due to the great diversity of these organisms and their wide-ranging capabilities. Genomics tools are essential for taking advantage of these "free lessons" of nature. However, genomics and transcriptomics are challenging in emerging model systems. Here, we present SeaBase, a tool for helping to meet these needs. Specifically, SeaBase provides a platform for sharing and searching transcriptome data. More importantly, SeaBase will support a growing number of tools for inferring gene network mechanisms. The first dataset available on SeaBase is a developmental transcriptomic profile of the sea anemone Nematostella vectensis (Anthozoa, Cnidaria). Additional datasets are currently being prepared and we are aiming to expand SeaBase to include user-supplied data for any number of marine and aquatic organisms, thereby supporting many potentially new models for gene network studies. SeaBase can be accessed online at: http://seabase.core.cli.mbl.edu.


Subject(s)
Aquatic Organisms/genetics , Databases as Topic , Gene Regulatory Networks , Transcriptome , Animals , Genomics , Humans , Sea Anemones/genetics
5.
BMC Bioinformatics ; 14: 16, 2013 Jan 16.
Article in English | MEDLINE | ID: mdl-23324024

ABSTRACT

BACKGROUND: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this 'names problem' has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science. RESULTS: The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets. CONCLUSIONS: We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.
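
As a rough illustration of the kind of standardization this abstract describes (correcting misspellings by fuzzy matching against a reference list, then converting synonyms to accepted names), here is a minimal Python sketch. It is not the TNRS algorithm or its API: it swaps in the standard-library `difflib.get_close_matches` for the fuzzy matcher, and the names `REFERENCE_NAMES`, `SYNONYMS` and `resolve`, the cutoff value, and the tiny name lists are assumptions for illustration only, not data drawn from Tropicos.

```python
import difflib

# Illustrative reference list of accepted names (assumed, not from Tropicos).
REFERENCE_NAMES = ["Quercus alba", "Quercus rubra", "Acer saccharum"]

# Illustrative synonym -> accepted-name mapping (assumed entry).
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}


def resolve(name, cutoff=0.8):
    """Return the accepted name closest to `name`, or None if nothing
    in the reference data is similar enough."""
    candidates = REFERENCE_NAMES + list(SYNONYMS)
    # Best single candidate whose similarity ratio is at least `cutoff`.
    hits = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    if not hits:
        return None
    matched = hits[0]
    # Convert a matched synonym to its accepted name; accepted names pass through.
    return SYNONYMS.get(matched, matched)
```

With these assumed lists, a misspelling such as "Quercus albba" would resolve to "Quercus alba", and the synonym "Acer saccharophorum" would resolve to "Acer saccharum". A production service would additionally parse authorities and use family names to separate homonyms, as the abstract notes.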


Subject(s)
Plants/classification , Software , Algorithms , Classification/methods , Databases, Factual , Internet , Names , User-Computer Interface
6.
Adv Bioinformatics ; 2012: 391574, 2012.
Article in English | MEDLINE | ID: mdl-22685456

ABSTRACT

Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science.

7.
Biochemistry ; 43(50): 15915-21, 2004 Dec 21.
Article in English | MEDLINE | ID: mdl-15595847

ABSTRACT

Human proliferating cell nuclear antigen (hPCNA) containing a single amino acid substitution at position 85, that of lysine for glutamate (E85K), was compared to wild-type (wt) hPCNA for its ability to promote DNA synthesis by purified DNA polymerase delta (pol delta) both on unmodified templates and past chemically defined template base lesions (translesion synthesis; TLS). Significant enhancement (up to 4-5-fold or greater) was seen but depended both on the exact PCNA/pol delta ratio tested and on the specific nature of the template (e.g., unmodified versus lesion-containing; chemical nature of the template base lesion). These results suggest that human PCNA, either mutated to contain lysine (K) at position 85 or bearing similar primary mutations, would promote more secondary mutagenesis in cells and/or tissues where PCNA is normally expressed at low levels relative to pol delta. Over an entire lifetime, such secondary mutagenesis could be biomedically significant.


Subject(s)
DNA Damage , DNA Polymerase III/physiology , DNA Replication , Mutagenesis , Proliferating Cell Nuclear Antigen/genetics , Amino Acid Substitution , DNA Polymerase III/metabolism , Glutamic Acid/genetics , Humans , Lysine/genetics , Point Mutation/genetics , Proliferating Cell Nuclear Antigen/metabolism , Templates, Genetic
8.
BMC Biochem ; 5: 13, 2004 Aug 13.
Article in English | MEDLINE | ID: mdl-15310391

ABSTRACT

BACKGROUND: We and others have shown four distinct and presumably related effects of mammalian proliferating cell nuclear antigen (PCNA) on DNA synthesis catalyzed by mammalian DNA polymerase delta (pol delta). In the presence of homologous PCNA, pol delta exhibits 1) increased absolute activity; 2) increased processivity of DNA synthesis; 3) stable binding of synthetic oligonucleotide template-primers (t1/2 of the pol delta·PCNA·template-primer complex ≥2.5 h); and 4) enhanced synthesis of DNA opposite and beyond template base lesions. This last effect is potentially mutagenic in vivo. Biochemical studies performed in parallel with in vivo genetic analyses would represent an extremely powerful approach to further investigate both DNA replication and repair in eukaryotes. RESULTS: Drosophila PCNA, although highly similar in structure to mammalian PCNA (e.g., it is >70% identical to human PCNA in amino acid sequence), can only substitute poorly for either calf thymus or human PCNA (approximately 10% as well) in affecting calf thymus pol delta. However, by mutating one or only a few amino acids in the region of Drosophila PCNA thought to interact with pol delta, all four effects can be enhanced dramatically. CONCLUSIONS: Our results therefore suggest that all four above effects depend at least in part on the PCNA-pol delta interaction. Moreover, unlike mammals, Drosophila offers the potential for immediate in vivo genetic analyses. Although it has proven difficult to obtain sufficient amounts of homologous pol delta for parallel in vitro biochemical studies, by altering Drosophila PCNA using site-directed mutagenesis as suggested by our results, in vitro biochemical studies may now be performed using human and/or calf thymus pol delta preparations.


Subject(s)
DNA Polymerase III/metabolism , Drosophila Proteins/physiology , Proliferating Cell Nuclear Antigen/physiology , Amino Acid Sequence , Amino Acid Substitution , Animals , Cattle , DNA/metabolism , DNA Polymerase III/chemistry , DNA Replication , Drosophila Proteins/chemistry , Drosophila Proteins/genetics , Humans , Models, Molecular , Molecular Sequence Data , Mutagenesis, Site-Directed , Proliferating Cell Nuclear Antigen/chemistry , Proliferating Cell Nuclear Antigen/genetics , Protein Binding , Protein Conformation , Protein Interaction Mapping , Protein Structure, Tertiary , Species Specificity , Thymus Gland/enzymology
...