Results 1 - 20 of 48
1.
Syst Biol ; 71(6): 1290-1306, 2022 10 12.
Article in English | MEDLINE | ID: mdl-35285502

ABSTRACT

Morphology remains a primary source of phylogenetic information for many groups of organisms, and the only one for most fossil taxa. Organismal anatomy is not a collection of randomly assembled and independent "parts", but instead a set of dependent and hierarchically nested entities resulting from ontogeny and phylogeny. How do we make sense of these dependent and at times redundant characters? One promising approach is using ontologies: structured controlled vocabularies that summarize knowledge about different properties of anatomical entities, including developmental and structural dependencies. Here, we assess whether evolutionary patterns can explain the proximity of ontology-annotated characters within an ontology. To do so, we measure phylogenetic information across characters and evaluate if it matches the hierarchical structure given by ontological knowledge, in much the same way as across-species diversity structure is given by phylogeny. We implement an approach to evaluate the Bayesian phylogenetic information (BPI) content and phylogenetic dissonance among ontology-annotated anatomical data subsets. We applied this to data sets representing two disparate animal groups: bees (Hexapoda: Hymenoptera: Apoidea, 209 chars) and characiform fishes (Actinopterygii: Ostariophysi: Characiformes, 463 chars). For bees, we find that BPI is not substantially explained by anatomy since dissonance is often high among morphologically related anatomical entities. For fishes, we find substantial information for two clusters of anatomical entities instantiating concepts from the jaws and branchial arch bones, but among-subset information decreases and dissonance increases substantially moving to higher-level subsets in the ontology. We further applied our approach to address particular evolutionary hypotheses with an example of morphological evolution in miniature fishes.
While we show that phylogenetic information does match ontology structure for some anatomical entities, additional relationships and processes, such as convergence, likely play a substantial role in explaining BPI and dissonance, and merit future investigation. Our work demonstrates how complex morphological data sets can be interrogated with ontologies by allowing one to access how information is spread hierarchically across anatomical concepts, how congruent this information is, and what sorts of processes may play a role in explaining it: phylogeny, development, or convergence. [Apidae; Bayesian phylogenetic information; Ostariophysi; Phenoscape; phylogenetic dissonance; semantic similarity.].


Subjects
Arthropods , Characiformes , Animals , Bayes Theorem , Fossils , Phylogeny
2.
Syst Biol ; 69(2): 345-362, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31596473

ABSTRACT

There is a growing body of research on the evolution of anatomy in a wide variety of organisms. Discoveries in this field could be greatly accelerated by computational methods and resources that enable these findings to be compared across different studies and different organisms and linked with the genes responsible for anatomical modifications. Homology is a key concept in comparative anatomy; two important types are historical homology (the similarity of organisms due to common ancestry) and serial homology (the similarity of repeated structures within an organism). We explored how to most effectively represent historical and serial homology across anatomical structures to facilitate computational reasoning. We assembled a collection of homology assertions from the literature with a set of taxon phenotypes for the skeletal elements of vertebrate fins and limbs from the Phenoscape Knowledgebase. Using seven competency questions, we evaluated the reasoning ramifications of two logical models: the Reciprocal Existential Axioms (REA) homology model and the Ancestral Value Axioms (AVA) homology model. The AVA model returned all user-expected results in addition to the search term and any of its subclasses. The AVA model also returns any superclass of the query term in which a homology relationship has been asserted. The REA model returned the user-expected results for five out of seven queries. We identify some challenges of implementing complete homology queries due to limitations of OWL reasoning. This work lays the foundation for homology reasoning to be incorporated into other ontology-based tools, such as those that enable synthetic supermatrix construction and candidate gene discovery. [Homology; ontology; anatomy; morphology; evolution; knowledgebase; phenoscape.].


Subjects
Classification/methods , Biological Models , Animal Fins/anatomy & histology , Animals , Extremities/anatomy & histology , Vertebrates/anatomy & histology
3.
PLoS Biol ; 13(1): e1002033, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25562316

ABSTRACT

Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility.


Subjects
Genetic Association Studies , Animals , Computational Biology , Data Curation , Factual Databases/standards , Gene-Environment Interaction , Genomics , Humans , Phenotype , Reference Standards , Reproducibility of Results , Terminology as Topic
4.
Mol Biol Evol ; 33(1): 13-24, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26500251

ABSTRACT

Phenotypes resulting from mutations in genetic model organisms can help reveal candidate genes for evolutionarily important phenotypic changes in related taxa. Although testing candidate gene hypotheses experimentally in nonmodel organisms is typically difficult, ontology-driven information systems can help generate testable hypotheses about developmental processes in experimentally tractable organisms. Here, we tested candidate gene hypotheses suggested by expert use of the Phenoscape Knowledgebase, specifically looking for genes that are candidates responsible for evolutionarily interesting phenotypes in the ostariophysan fishes that bear resemblance to mutant phenotypes in zebrafish. For this, we searched ZFIN for genetic perturbations that result in either loss of basihyal element or loss of scales phenotypes, because these are the ancestral phenotypes observed in catfishes (Siluriformes). We tested the identified candidate genes by examining their endogenous expression patterns in the channel catfish, Ictalurus punctatus. The experimental results were consistent with the hypotheses that these features evolved through disruption in developmental pathways at, or upstream of, brpf1 and eda/edar for the ancestral losses of basihyal element and scales, respectively. These results demonstrate that ontological annotations of the phenotypic effects of genetic alterations in model organisms, when aggregated within a knowledgebase, can be used effectively to generate testable, and useful, hypotheses about evolutionary changes in morphology.


Subjects
Catfishes/genetics , Molecular Evolution , Gene Expression , Genetic Models , Phenotype , Animals , Computational Biology , Gene Expression/genetics , Gene Expression/physiology , Software
5.
PLoS Comput Biol ; 12(2): e1004691, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26914653

ABSTRACT

The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.


Subjects
Computational Biology/organization & administration , Congresses as Topic , Humans , Ireland
6.
Syst Biol ; 64(6): 936-52, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26018570

ABSTRACT

The reality of larger and larger molecular databases and the need to integrate data scalably have presented a major challenge for the use of phenotypic data. Morphology is currently primarily described in discrete publications, entrenched in noncomputer readable text, and requires enormous investments of time and resources to integrate across large numbers of taxa and studies. Here we present a new methodology, using ontology-based reasoning systems working with the Phenoscape Knowledgebase (KB; kb.phenoscape.org), to automatically integrate large amounts of evolutionary character state descriptions into a synthetic character matrix of neomorphic (presence/absence) data. Using the KB, which includes more than 55 studies of sarcopterygian taxa, we generated a synthetic supermatrix of 639 variable characters scored for 1051 taxa, resulting in over 145,000 populated cells. Of these characters, over 76% were made variable through the addition of inferred presence/absence states derived by machine reasoning over the formal semantics of the source ontologies. Inferred data reduced the missing data in the variable character-subset from 98.5% to 78.2%. Machine reasoning also enables the isolation of conflicts in the data, that is, cells where both presence and absence are indicated; reports regarding conflicting data provenance can be generated automatically. Further, reasoning enables quantification and new visualizations of the data, here for example, allowing identification of character space that has been undersampled across the fin-to-limb transition. The approach and methods demonstrated here to compute synthetic presence/absence supermatrices are applicable to any taxonomic and phenotypic slice across the tree of life, providing the data are semantically annotated. 
Because such data can also be linked to model organism genetics through computational scoring of phenotypic similarity, they open a rich set of future research questions into phenotype-to-genome relationships.
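
As a rough arithmetic check on the figures quoted in this abstract, the following Python sketch converts the reported missing-data percentages into approximate counts of populated cells. The function name `populated_cells` is illustrative and does not come from the paper; the inputs (639 characters, 1051 taxa, 98.5% and 78.2% missing data) are the values stated in the abstract.

```python
def populated_cells(n_characters: int, n_taxa: int, missing_pct: float) -> int:
    """Approximate number of scored cells in a character-by-taxon matrix.

    missing_pct is the percentage of cells with no data (0-100).
    """
    total_cells = n_characters * n_taxa
    return round(total_cells * (1 - missing_pct / 100))


# Values quoted in the abstract (matrix of variable characters only).
N_CHARACTERS = 639
N_TAXA = 1051

total = N_CHARACTERS * N_TAXA                              # 671589 cells in total
asserted_only = populated_cells(N_CHARACTERS, N_TAXA, 98.5)  # before inference
with_inferred = populated_cells(N_CHARACTERS, N_TAXA, 78.2)  # after inference

print(total, asserted_only, with_inferred)  # 671589 10074 146406
```

Under these assumptions, the second count (about 146,000 cells) is consistent with the abstract's "over 145,000 populated cells", and the first suggests roughly 10,000 cells were populated before inferred states were added.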


Subjects
Biological Ontologies , Computational Biology/methods , Phenotype , Amphibians/anatomy & histology , Amphibians/classification , Animals , Biological Evolution , Classification , Statistical Data Interpretation
7.
PLoS Biol ; 11(1): e1001468, 2013.
Article in English | MEDLINE | ID: mdl-23335860

ABSTRACT

How should funding agencies enable researchers to explore high-risk but potentially high-reward science? One model that appears to work is the NSF-funded synthesis center, an incubator for community-led, innovative science.


Subjects
Biomedical Research/economics , Government Financing/economics , Statistical Data Interpretation , Financial Management , Humans , Research Personnel , United States
8.
Genesis ; 53(8): 561-71, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26220875

ABSTRACT

The abundance of phenotypic diversity among species can enrich our knowledge of development and genetics beyond the limits of variation that can be observed in model organisms. The Phenoscape Knowledgebase (KB) is designed to enable exploration and discovery of phenotypic variation among species. Because phenotypes in the KB are annotated using standard ontologies, evolutionary phenotypes can be compared with phenotypes from genetic perturbations in model organisms. To illustrate the power of this approach, we review the use of the KB to find taxa showing evolutionary variation similar to that of a query gene. Matches are made between the full set of phenotypes described for a gene and an evolutionary profile, the latter of which is defined as the set of phenotypes that are variable among the daughters of any node on the taxonomic tree. Phenoscape's semantic similarity interface allows the user to assess the statistical significance of each match and flags matches that may only result from differences in annotation coverage between genetic and evolutionary studies. Tools such as this will help meet the challenge of relating the growing volume of genetic knowledge in model organisms to the diversity of phenotypes in nature. The Phenoscape KB is available at http://kb.phenoscape.org.


Subjects
Genetic Databases , Genetic Association Studies/methods , Animals , Biological Evolution , Computational Biology/methods , Humans , Knowledge Bases , Phenotype
9.
BMC Bioinformatics ; 14: 158, 2013 May 13.
Article in English | MEDLINE | ID: mdl-23668630

ABSTRACT

BACKGROUND: Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. RESULTS: With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (http://www.phylotastic.org), and a server image. 
CONCLUSIONS: Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.


Subjects
Phylogeny , Software , Internet
10.
Syst Biol ; 61(4): 675-89, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22357728

ABSTRACT

In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.


Subjects
Biological Evolution , Computational Biology/standards , Programming Languages , Biodiversity , Classification , Informatics , Biological Models , Phylogeny , Software
11.
PeerJ ; 10: e12618, 2022.
Article in English | MEDLINE | ID: mdl-35186448

ABSTRACT

To be computationally reproducible and efficient, integration of disparate data depends on shared entities whose matching meaning (semantics) can be computationally assessed. For biodiversity data one of the most prevalent shared entities for linking data records is the associated taxon concept. Unlike Linnaean taxon names, the traditional way in which taxon concepts are provided, phylogenetic definitions are native to phylogenetic trees and offer well-defined semantics that can be transformed into formal, computationally evaluable logic expressions. These attributes make them highly suitable for phylogeny-driven comparative biology by allowing computationally verifiable and reproducible integration of taxon-linked data against Tree of Life-scale phylogenies. To achieve this, the first step is transforming phylogenetic definitions from the natural language text in which they are published to a structured interoperable data format that maintains strong ties to semantics and lends itself well to sharing, reuse, and long-term archival. To this end, we developed the Phyloreference Exchange Format (Phyx), a JSON-LD-based text format encompassing rich metadata for all elements of a phylogenetic definition, and we created a supporting software library, phyx.js, to streamline computational management of such files. Together they form a foundation layer for digitizing and computing with phylogenetic definitions of clades.


Subjects
Semantics , Software , Phylogeny , Biology , Records
13.
Syst Biol ; 59(4): 369-83, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20547776

ABSTRACT

The rich knowledge of morphological variation among organisms reported in the systematic literature has remained in free-text format, impractical for use in large-scale synthetic phylogenetic work. This noncomputable format has also precluded linkage to the large knowledgebase of genomic, genetic, developmental, and phenotype data in model organism databases. We have undertaken an effort to prototype a curated, ontology-based evolutionary morphology database that maps to these genetic databases (http://kb.phenoscape.org) to facilitate investigation into the mechanistic basis and evolution of phenotypic diversity. Among the first requirements in establishing this database was the development of a multispecies anatomy ontology with the goal of capturing anatomical data in a systematic and computable manner. An ontology is a formal representation of a set of concepts with defined relationships between those concepts. Multispecies anatomy ontologies in particular are an efficient way to represent the diversity of morphological structures in a clade of organisms, but they present challenges in their development relative to single-species anatomy ontologies. Here, we describe the Teleost Anatomy Ontology (TAO), a multispecies anatomy ontology for teleost fishes derived from the Zebrafish Anatomical Ontology (ZFA) for the purpose of annotating varying morphological features across species. To facilitate interoperability with other anatomy ontologies, TAO uses the Common Anatomy Reference Ontology as a template for its upper level nodes, and TAO and ZFA are synchronized, with zebrafish terms specified as subtypes of teleost terms. We found that the details of ontology architecture have ramifications for querying, and we present general challenges in developing a multispecies anatomy ontology, including refinement of definitions, taxon-specific relationships among terms, and representation of taxonomically variable developmental pathways.


Subjects
Biological Evolution , Fishes/anatomy & histology , Fishes/genetics , Animals , Classification , Computational Biology , Factual Databases , Genomics
15.
NPJ Digit Med ; 3: 24, 2020.
Article in English | MEDLINE | ID: mdl-32140567

ABSTRACT

Storing very large amounts of data and delivering them to researchers in an efficient, verifiable, and compliant manner is one of the major challenges faced by health care providers and researchers in the life sciences. The electronic health record (EHR) at a hospital or clinic currently functions as a silo, and although EHRs contain rich and abundant information that could be used to understand, improve, and learn from care as part of a learning health system, access to these data is difficult, and the technical, legal, ethical, and social barriers are significant. If we create a microservice ecosystem where data can be accessed through APIs, these challenges become easier to overcome: a service-driven design decouples data from clients. This decoupling provides flexibility: different users can write in their preferred language and use different clients depending on their needs. APIs can be written for iOS apps, web apps, or an R library, and this flexibility highlights the potential ecosystem-building power of APIs. In this article, we use two case studies to illustrate what it means to participate in and contribute to interconnected ecosystems that power APIs in healthcare systems.

16.
Mol Biol Cell ; 16(8): 3847-64, 2005 Aug.
Article in English | MEDLINE | ID: mdl-15944222

ABSTRACT

Rab GTPases and SNARE fusion proteins direct cargo trafficking through the exocytic and endocytic pathways of eukaryotic cells. We have used steady state mRNA expression profiling and computational hierarchical clustering methods to generate a global overview of the distribution of Rabs, SNAREs, and coat machinery components, as well as their respective adaptors, effectors, and regulators in 79 human and 61 mouse nonredundant tissues. We now show that this systems biology approach can be used to define building blocks for membrane trafficking based on Rab-centric protein activity hubs. These Rab-regulated hubs provide a framework for an integrated coding system, the membrome network, which regulates the dynamics of the specialized membrane architecture of differentiated cells. The distribution of Rab-regulated hubs illustrates a number of facets that guide the overall organization of subcellular compartments of cells and tissues through the activity of dynamic protein interaction networks. An interactive website for exploring datasets comprising components of the Rab-regulated hubs that define the membrome of different cell and organ systems in both human and mouse is available at http://www.membrome.org/.


Subjects
Gene Expression Profiling , rab GTP-Binding Proteins/genetics , rab GTP-Binding Proteins/metabolism , Animals , Biological Transport , Humans , Internet , Lipid Metabolism , Mice , Multigene Family , Phosphoric Monoester Hydrolases/genetics , Phosphoric Monoester Hydrolases/metabolism , Phosphotransferases/genetics , Phosphotransferases/metabolism
17.
F1000Res ; 7, 2018.
Article in English | MEDLINE | ID: mdl-30210780

ABSTRACT

In 2018, the annual Bioinformatics Open Source Conference was held for the first time in conjunction with the Galaxy Community Conference, as an experiment to see if we could reach people in the bioinformatics community who aren't part of the audience attracted by ISMB. Held in June 2018 at Reed College in Portland, Oregon, GCCBOSC (Galaxy Community Conference and Bioinformatics Open Source Conference) attracted over 300 participants from around the world. The meeting started with two days of training, followed by two days of talks and poster/demo sessions (with some joint and some parallel sessions). The joint sessions included well-received keynote talks by Tracy Teal, Fernando Pérez and Lucia Peixoto, as well as a panel discussion about documentation and training. After the main meeting, many attendees stayed for up to four additional collaboration days, an extended version of the Codefests that have been held in conjunction with previous BOSCs. GCCBOSC was a successful experiment. The organizers concluded that the best way to serve the broadest community of potential BOSC attendees will be to partner some years with the International Society for Computational Biology (ISMB) and others with GCC.


Subjects
Computational Biology , Intersectoral Collaboration
18.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576485

ABSTRACT

Natural language descriptions of organismal phenotypes, a principal object of study in biology, are abundant in the biological literature. Expressing these phenotypes as logical statements using ontologies would enable large-scale analysis on phenotypic information from diverse systems. However, considerable human effort is required to make these phenotype descriptions amenable to machine reasoning. Natural language processing tools have been developed to facilitate this task, and the training and evaluation of these tools depend on the availability of high quality, manually annotated gold standard data sets. We describe the development of an expert-curated gold standard data set of annotated phenotypes for evolutionary biology. The gold standard was developed for the curation of complex comparative phenotypes for the Phenoscape project. It was created by consensus among three curators and consists of entity-quality expressions of varying complexity. We use the gold standard to evaluate annotations created by human curators and those generated by the Semantic CharaParser tool. Using four annotation accuracy metrics that can account for any level of relationship between terms from two phenotype annotations, we found that machine-human consistency, or similarity, was significantly lower than inter-curator (human-human) consistency. Surprisingly, allowing curators access to external information did not significantly increase the similarity of their annotations to the gold standard or have a significant effect on inter-curator consistency. We found that the similarity of machine annotations to the gold standard increased after new relevant ontology terms had been added. Evaluation by the original authors of the character descriptions indicated that the gold standard annotations came closer to representing their intended meaning than did either the curator or machine annotations.
These findings point toward ways to better design software to augment human curators and the use of the gold standard corpus will allow training and assessment of new tools to improve phenotype annotation accuracy at scale.


Subjects
Data Curation/methods , Data Mining/methods , Gene Ontology , Natural Language Processing , Phenotype , Humans
19.
Cell Syst ; 6(4): 470-483.e8, 2018 Apr 25.
Article in English | MEDLINE | ID: mdl-29605182

ABSTRACT

Paralogous transcription factors (TFs) are oftentimes reported to have identical DNA-binding motifs, despite the fact that they perform distinct regulatory functions. Differential genomic targeting by paralogous TFs is generally assumed to be due to interactions with protein co-factors or the chromatin environment. Using a computational-experimental framework called iMADS (integrative modeling and analysis of differential specificity), we show that, contrary to previous assumptions, paralogous TFs bind differently to genomic target sites even in vitro. We used iMADS to quantify, model, and analyze specificity differences between 11 TFs from 4 protein families. We found that paralogous TFs have diverged mainly at medium- and low-affinity sites, which are poorly captured by current motif models. We identify sequence and shape features differentially preferred by paralogous TFs, and we show that the intrinsic differences in specificity among paralogous TFs contribute to their differential in vivo binding. Thus, our study represents a step forward in deciphering the molecular mechanisms of differential specificity in TF families.


Subjects
Genetic Models , Transcription Factors/physiology , Binding Sites , Gene Expression Regulation/physiology , Molecular Models , Nucleotide Motifs , Protein Sequence Analysis , Transcription Factors/chemistry
20.
Mol Ecol Resour ; 17(1): 120-128, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27297607

ABSTRACT

The R computing and statistical language community has developed a myriad of resources for conducting population genetic analyses. However, resources for learning how to carry out population genetic analyses in R are scattered and often incomplete, which can make acquiring this skill unnecessarily difficult and time consuming. To address this gap, we developed an online community resource with guidance and working demonstrations for conducting population genetic analyses in R. The resource is freely available at http://popgen.nescent.org and includes material for both novices and advanced users of R for population genetics. To facilitate continued maintenance and growth of this resource, we developed a toolchain, process and conventions designed to (i) minimize financial and labour costs of upkeep; (ii) provide a low barrier to contribution; and (iii) ensure strong quality assurance. The toolchain includes automatic integration testing of every change and rebuilding of the website when new vignettes or edits are accepted. The process and conventions largely follow a common, distributed version control-based contribution workflow, which is used to provide and manage open peer review by designated website editors. The online resources include detailed documentation of this process, including video tutorials. We invite the community of population geneticists working in R to contribute to this resource, whether for a new use case of their own, or as one of the vignettes from the 'wish list' we maintain, or by improving existing vignettes.


Subjects
Biostatistics/methods , Population Genetics/education , Population Genetics/methods , Statistics as Topic/education , Access to Information , Internet