1.
IEEE Trans Pattern Anal Mach Intell ; 43(2): 663-678, 2021 02.
Article in English | MEDLINE | ID: mdl-31380747

ABSTRACT

SafePredict is a novel meta-algorithm that works with any base prediction algorithm for online data to guarantee an arbitrarily chosen correctness rate, 1-ϵ, by allowing refusals. Allowing refusals means that the meta-algorithm may refuse to emit a prediction produced by the base algorithm so that the error rate on non-refused predictions does not exceed ϵ. The SafePredict error bound does not rely on any assumptions about the data distribution or the base predictor. When the base predictor happens not to exceed the target error rate ϵ, SafePredict refuses only a finite number of times. When the error rate of the base predictor changes over time, SafePredict uses a weight-shifting heuristic that adapts to these changes without knowing when they occur, yet still maintains the correctness guarantee. Empirical results show that (i) SafePredict compares favorably with state-of-the-art confidence-based refusal mechanisms, which fail to offer robust error guarantees; and (ii) combining SafePredict with such refusal mechanisms can in many cases further reduce the number of refusals. Our software is included in the supplementary material, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPAMI.2019.2932415.
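
The refusal idea in this abstract can be illustrated with a small toy. The sketch below is not the authors' implementation: it assumes an exponential-weights update between the base predictor and an "always refuse" arm that is charged a fixed loss ϵ, and it assumes the base predictor's 0/1 loss is observed after every step, even when the prediction was refused. The function name `run_refusal_wrapper` and the parameters `eta`, `w0`, and `seed` are hypothetical.

```python
import math
import random


def run_refusal_wrapper(base_losses, epsilon, eta=1.0, w0=0.5, seed=0):
    """Toy refusal wrapper (illustrative assumption, not the published algorithm).

    base_losses: sequence of 0/1 losses the base predictor incurs at each step.
    epsilon:     target error rate; the "refuse" arm is charged this loss.
    Returns (emitted, weights): emitted[t] is True when the base prediction
    was passed through at step t; weights[t] is the pass-through probability
    used at step t, with the final weight appended at the end.
    """
    rng = random.Random(seed)
    w = w0  # probability of emitting the base predictor's output
    emitted = []
    weights = [w]
    for loss in base_losses:
        # Emit the base prediction with probability w, otherwise refuse.
        emitted.append(rng.random() < w)
        # Exponential-weights update over the two arms.
        keep = w * math.exp(-eta * loss)
        refuse = (1.0 - w) * math.exp(-eta * epsilon)
        w = keep / (keep + refuse)
        weights.append(w)
    return emitted, weights


if __name__ == "__main__":
    # A base predictor that is always right: the weight drifts toward 1.
    _, good = run_refusal_wrapper([0] * 50, epsilon=0.1)
    # A base predictor that is always wrong: the weight drifts toward 0.
    _, bad = run_refusal_wrapper([1] * 50, epsilon=0.1)
    print(good[-1], bad[-1])
```

In this toy, a base predictor whose loss stays below ϵ gains weight and is refused less and less often, while one that errs more often than ϵ loses weight and is mostly refused.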

2.
Sci Rep ; 10(1): 14141, 2020 Aug 19.
Article in English | MEDLINE | ID: mdl-32811842

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

3.
Sci Rep ; 10(1): 6804, 2020 04 22.
Article in English | MEDLINE | ID: mdl-32321967

ABSTRACT

The ability to accurately predict the causal relationships from transcription factors to genes would greatly enhance our understanding of transcriptional dynamics. This could lead to applications in which one or more transcription factors could be manipulated to effect a change in genes leading to the enhancement of some desired trait. Here we present a method called OutPredict that constructs a model for each gene based on time series (and other) data and that predicts each gene's expression at a previously unseen subsequent time point. The model also infers causal relationships based on the most important transcription factors for each gene model, some of which have been validated by previous physical experiments. The method benefits from known network edges and steady-state data to enhance predictive accuracy. Our results across B. subtilis, Arabidopsis, E. coli, Drosophila and the DREAM4 simulated in silico dataset show improved predictive accuracy ranging from 40% to 60% over other state-of-the-art methods. We find that gene expression models can benefit from the addition of steady-state data to predict expression values of time series. Finally, we validate, based on limited available data, that the influential edges we infer correspond to known relationships significantly more than expected by chance or by state-of-the-art methods.


Subjects
Algorithms ; Computational Biology/methods ; Gene Expression Profiling/methods ; Gene Regulatory Networks ; Models, Genetic ; Transcription Factors/genetics ; Computer Simulation ; Gene Expression Profiling/statistics & numerical data ; Machine Learning ; Reproducibility of Results
4.
PLoS One ; 6(9): e23947, 2011.
Article in English | MEDLINE | ID: mdl-21912654

ABSTRACT

Temperature-sensitive (ts) mutations are mutations that exhibit a mutant phenotype at high or low temperatures and a wild-type phenotype at normal temperature. Temperature-sensitive mutants are valuable tools for geneticists, particularly in the study of essential genes. However, finding ts mutations typically relies on generating and screening many thousands of mutations, which is an expensive and labor-intensive process. Here we describe an in silico method that uses Rosetta and machine learning techniques to predict a highly accurate "top 5" list of ts mutations given the structure of a protein of interest. Rosetta is a protein structure prediction and design code, used here to model and score how proteins accommodate point mutations with side-chain and backbone movements. We show that integrating Rosetta relax-derived features with sequence-based features results in accurate temperature-sensitive mutation predictions.


Subjects
Alleles ; Artificial Intelligence ; Computational Biology/methods ; Proteins/chemistry ; Proteins/genetics ; Temperature ; Models, Molecular ; Mutation ; Protein Conformation
5.
Genome Biol ; 11(12): R123, 2010.
Article in English | MEDLINE | ID: mdl-21182762

ABSTRACT

BACKGROUND: Nitrate, acting as both a nitrogen source and a signaling molecule, controls many aspects of plant development. However, gene networks involved in plant adaptation to fluctuating nitrate environments have not yet been identified. RESULTS: Here we use time-series transcriptome data to decipher gene relationships and consequently to build core regulatory networks involved in Arabidopsis root adaptation to nitrate provision. The experimental approach has been to monitor genome-wide responses to nitrate at 3, 6, 9, 12, 15 and 20 minutes using Affymetrix ATH1 gene chips. This high-resolution time course analysis demonstrated that the previously known primary nitrate response is actually preceded by a very fast gene expression modulation, involving genes and functions needed to prepare plants to use or reduce nitrate. A state-space model inferred from this microarray time-series data successfully predicts gene behavior in unlearnt conditions. CONCLUSIONS: The experiments and methods allow us to propose a temporal working model for nitrate-driven gene networks. This network model is tested both in silico and experimentally. For example, the over-expression of a predicted gene hub encoding a transcription factor induced early in the cascade indeed leads to the modification of the kinetic nitrate response of sentinel genes such as NIR, NIA2, and NRT1.1, and several other transcription factors. The potential nitrate/hormone connections implicated by this time-series data are also evaluated.


Subjects
Arabidopsis/genetics ; Arabidopsis/metabolism ; Gene Expression Profiling ; Nitrates/metabolism ; Adaptation, Physiological ; Cluster Analysis ; Gene Expression Regulation, Plant ; Gene Regulatory Networks ; Genes, Plant ; Models, Genetic ; Nitrogen/metabolism ; Oligonucleotide Array Sequence Analysis ; Plant Roots/genetics ; Plant Roots/metabolism ; RNA, Plant/genetics ; Systems Biology ; Transcription Factors/metabolism
6.
BMC Evol Biol ; 10: 357, 2010 Nov 18.
Article in English | MEDLINE | ID: mdl-21087504

ABSTRACT

BACKGROUND: Gene duplication can lead to genetic redundancy, which masks the function of mutated genes in genetic analyses. Methods to increase sensitivity in identifying genetic redundancy can improve the efficiency of reverse genetics and lend insights into the evolutionary outcomes of gene duplication. Machine learning techniques are well suited to classifying gene family members into redundant and non-redundant gene pairs in model species where sufficient genetic and genomic data is available, such as Arabidopsis thaliana, the test case used here. RESULTS: Machine learning techniques that combine multiple attributes led to a dramatic improvement in predicting genetic redundancy over single trait classifiers alone, such as BLAST E-values or expression correlation. In withholding analysis, one of the methods used here, Support Vector Machines, was two-fold more precise than single attribute classifiers, reaching a level where the majority of redundant calls were correctly labeled. Using this higher confidence in identifying redundancy, machine learning predicts that about half of all genes in Arabidopsis showed the signature of predicted redundancy with at least one but typically less than three other family members. Interestingly, a large proportion of predicted redundant gene pairs were relatively old duplications (e.g., Ks > 1), suggesting that redundancy is stable over long evolutionary periods. CONCLUSIONS: Machine learning predicts that most genes will have a functionally redundant paralog but will exhibit redundancy with relatively few genes within a family. The predictions and gene pair attributes for Arabidopsis provide a new resource for research in genetics and genome evolution. These techniques can now be applied to other organisms.


Subjects
Artificial Intelligence ; Gene Duplication ; Algorithms ; Arabidopsis/genetics ; Bayes Theorem ; Gene Expression Regulation, Plant ; Genes, Plant ; Genome, Plant ; Logistic Models ; Multigene Family ; ROC Curve
7.
Plant Physiol ; 152(2): 500-15, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20007449

ABSTRACT

Data generation is no longer the limiting factor in advancing biological research. In addition, data integration, analysis, and interpretation have become key bottlenecks and challenges that biologists conducting genomic research face daily. To enable biologists to derive testable hypotheses from the increasing amount of genomic data, we have developed the VirtualPlant software platform. VirtualPlant enables scientists to visualize, integrate, and analyze genomic data from a systems biology perspective. VirtualPlant integrates genome-wide data concerning the known and predicted relationships among genes, proteins, and molecules, as well as genome-scale experimental measurements. VirtualPlant also provides visualization techniques that render multivariate information in visual formats that facilitate the extraction of biological concepts. Importantly, VirtualPlant helps biologists who are not trained in computer science to mine lists of genes, microarray experiments, and gene networks to address questions in plant biology, such as: What are the molecular mechanisms by which internal or external perturbations affect processes controlling growth and development? We illustrate the use of VirtualPlant with three case studies, ranging from querying a gene of interest to the identification of gene networks and regulatory hubs that control seed development. Whereas the VirtualPlant software was developed to mine Arabidopsis (Arabidopsis thaliana) genomic data, its data structures, algorithms, and visualization tools are designed in a species-independent way. VirtualPlant is freely available at www.virtualplant.org.


Subjects
Database Management Systems ; Genomics ; Plants/genetics ; Systems Biology ; Computational Biology/methods ; Databases, Genetic ; Gene Regulatory Networks ; Genes, Plant ; Genome, Plant ; Oligonucleotide Array Sequence Analysis ; User-Computer Interface
8.
J Exp Bot ; 58(9): 2359-67, 2007.
Article in English | MEDLINE | ID: mdl-17470441

ABSTRACT

Nitrate is both a nutrient and a potent signal that stimulates plant growth. Initial experiments in the late 1950s showing that nitrate enhances nitrate reductase (NR) activity after several hours of treatment have now progressed to transcriptome studies identifying over 1000 genes that respond to µM levels of nitrate within minutes. The use of an Arabidopsis NR-null mutant allowed the identification of genes that respond to nitrate when the production of downstream metabolites of nitrate is blocked. Further dissection of the nitrate response is now possible using new bioinformatic tools such as Sungear to perform comparative studies of multiple transcriptome responses across different laboratories and environmental conditions. These analyses have identified genes and pathways (e.g. nitrate assimilation, pentose phosphate pathway, and glycolysis) that respond to nitrate under a variety of conditions (context-independent). Most of these genes and pathways are ones that were identified using the NR-null mutant as responding directly to nitrate. By contrast, other processes such as protein synthesis respond only under a subset of conditions (context-dependent). Data from the NR-null mutant suggest these latter processes may be regulated by downstream nitrogen metabolites.


Subjects
Arabidopsis/metabolism ; Nitrates/metabolism ; Software ; Arabidopsis/genetics ; Gene Expression Profiling ; Genome, Plant ; Mutation ; Oligonucleotide Array Sequence Analysis ; Polymerase Chain Reaction
9.
Genome Biol ; 8(1): R7, 2007.
Article in English | MEDLINE | ID: mdl-17217541

ABSTRACT

BACKGROUND: Carbon (C) and nitrogen (N) metabolites can regulate gene expression in Arabidopsis thaliana. Here, we use multi-network analysis of microarray data to identify molecular networks regulated by C and N in the Arabidopsis root system. RESULTS: We used the Arabidopsis whole genome Affymetrix gene chip to explore global gene expression responses in plants exposed transiently to a matrix of C and N treatments. We used ANOVA analysis to define quantitative models of regulation for all detected genes. Our results suggest that about half of the Arabidopsis transcriptome is regulated by C, N or CN interactions. We found ample evidence for interactions between C and N that include genes involved in metabolic pathways, protein degradation and auxin signaling. To provide a global, yet detailed, view of how the cell molecular network is adjusted in response to the CN treatments, we constructed a qualitative multi-network model of the Arabidopsis metabolic and regulatory molecular network, including 6,176 genes, 1,459 metabolites and 230,900 interactions among them. We integrated the quantitative models of CN gene regulation with the wiring diagram in the multi-network, and identified specific interacting genes in biological modules that respond to C, N or CN treatments. CONCLUSION: Our results indicate that CN regulation occurs at multiple levels, including potential post-transcriptional control by microRNAs. The network analysis of our systematic dataset of CN treatments indicates that CN sensing is a mechanism that coordinates the global and coordinated regulation of specific sets of molecular machines in the plant cell.


Subjects
Arabidopsis/genetics ; Carbon/metabolism ; Gene Expression Profiling ; Gene Regulatory Networks ; Genome, Plant/genetics ; Models, Genetic ; Nitrogen/metabolism ; Cluster Analysis ; Gene Expression Regulation, Plant ; Genes, Plant ; Indoleacetic Acids/metabolism ; Plant Roots/genetics ; Time Factors
10.
Bioinformatics ; 23(2): 259-61, 2007 Jan 15.
Article in English | MEDLINE | ID: mdl-17018536

ABSTRACT

Sungear is a software system that supports a rapid, visually interactive and biologist-driven comparison of large datasets. The datasets can come from microarray experiments (e.g. genes induced in each experiment), from comparative genomics (e.g. genes present in each genome) or even from non-biological applications (e.g. demographics or baseball statistics). Sungear represents multiple datasets as vertices in a polygon. Each possible intersection among the sets is represented as a circle inside the polygon. The position of the circle is determined by the position of the vertices represented in the intersection and the area of the circle is determined by the number of elements in the intersection. Sungear shows which Gene Ontology terms are over-represented in a subset of circles or anchors. The intuitive Sungear interface has enabled biologists to determine quickly which dataset or groups of datasets play a role in a biological function of interest. AVAILABILITY: A live online version of Sungear can be found at http://virtualplant-prod.bio.nyu.edu/cgi-bin/sungear/index.cgi
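
The layout described in this abstract (datasets as polygon vertices, intersections as circles whose area tracks the element count) can be sketched in a few lines. This is an illustrative reconstruction, not Sungear's code: it assumes a regular polygon on the unit circle, places each circle at the centroid of the vertices of its member sets, and groups each element by the exact combination of sets that contain it. The function name `sungear_layout` and the `scale` parameter are hypothetical.

```python
import math
from collections import defaultdict


def sungear_layout(datasets, scale=1.0):
    """Compute a Sungear-like layout (illustrative assumptions, see lead-in).

    datasets: dict mapping a dataset name to a set of elements.
    Returns (vertices, circles):
      vertices[name]      -> (x, y) of the polygon vertex for that dataset
      circles[frozenset]  -> (x, y, radius) for each non-empty combination
    """
    names = sorted(datasets)
    n = len(names)
    # Assumption: vertices are evenly spaced on the unit circle.
    vertices = {
        name: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, name in enumerate(names)
    }

    # Record, for every element, the exact set of datasets containing it.
    membership = defaultdict(set)
    for name, items in datasets.items():
        for item in items:
            membership[item].add(name)

    # Count the elements that share each combination of datasets.
    counts = defaultdict(int)
    for members in membership.values():
        counts[frozenset(members)] += 1

    circles = {}
    for members, count in counts.items():
        # Assumption: the circle sits at the centroid of its member vertices.
        cx = sum(vertices[m][0] for m in members) / len(members)
        cy = sum(vertices[m][1] for m in members) / len(members)
        # Area proportional to the element count: pi * r^2 = scale^2 * count.
        radius = scale * math.sqrt(count / math.pi)
        circles[members] = (cx, cy, radius)
    return vertices, circles


if __name__ == "__main__":
    demo = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3}}
    _, circles = sungear_layout(demo)
    for members, (x, y, r) in circles.items():
        print(sorted(members), round(x, 3), round(y, 3), round(r, 3))
```

With this placement, a circle for elements found in only one dataset sits on that dataset's vertex, and a circle for elements shared by all datasets sits at the center of the polygon.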


Subjects
Chromosome Mapping/methods ; Database Management Systems ; Databases, Genetic ; Genetics, Population ; Information Storage and Retrieval/methods ; Software ; User-Computer Interface ; Algorithms ; Computer Graphics
12.
Science ; 302(5652): 1956-60, 2003 Dec 12.
Article in English | MEDLINE | ID: mdl-14671301

ABSTRACT

A global map of gene expression within an organ can identify genes with coordinated expression in localized domains, thereby relating gene activity to cell fate and tissue specialization. Here, we present localization of expression of more than 22,000 genes in the Arabidopsis root. Gene expression was mapped to 15 different zones of the root that correspond to cell types and tissues at progressive developmental stages. Patterns of gene expression traverse traditional anatomical boundaries and show cassettes of hormonal response. Chromosomal clustering defined some coregulated genes. This expression map correlates groups of genes to specific cell fates and should serve to guide reverse genetics.


Subjects
Arabidopsis/genetics ; Gene Expression Profiling ; Gene Expression ; Plant Roots/genetics ; Arabidopsis/cytology ; Arabidopsis/growth & development ; Arabidopsis/metabolism ; Cell Separation ; Chromosome Mapping ; Chromosomes, Plant/genetics ; Gene Expression Regulation, Plant ; Genes, Plant ; Green Fluorescent Proteins ; Luminescent Proteins/analysis ; Meristem/cytology ; Meristem/genetics ; Meristem/growth & development ; Meristem/metabolism ; Oligonucleotide Array Sequence Analysis ; Plant Growth Regulators/physiology ; Plant Root Cap/cytology ; Plant Root Cap/genetics ; Plant Root Cap/growth & development ; Plant Root Cap/metabolism ; Plant Roots/cytology ; Plant Roots/growth & development ; Plant Roots/metabolism ; Protoplasts ; RNA, Messenger/analysis ; RNA, Messenger/genetics ; RNA, Plant/analysis ; RNA, Plant/genetics ; Reverse Transcriptase Polymerase Chain Reaction ; Signal Transduction/genetics ; Transcription Factors/genetics ; Transcription Factors/physiology
14.
Plant Physiol ; 132(2): 440-52, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12805577

ABSTRACT

Here, we report the systematic exploration and modeling of interactions between light and sugar signaling. The data set analyzed explores the interactions of sugar (sucrose) with distinct light qualities (white, blue, red, and far-red) used at different fluence rates (low or high) in etiolated seedlings and mature green plants. Boolean logic was used to model the effect of these carbon/light interactions on three target genes involved in nitrogen assimilation: asparagine synthetase (ASN1 and ASN2) and glutamine synthetase (GLN2). This analysis enabled us to assess the effects of carbon on light-induced genes (GLN2/ASN2) versus light-repressed genes (ASN1) in this pathway. New interactions between carbon and blue-light signaling were discovered, and further connections between red/far-red light and carbon were modeled. Overall, light was able to override carbon as a major regulator of ASN1 and GLN2 in etiolated seedlings. By contrast, carbon overrides light as the major regulator of GLN2 and ASN2 in light-grown plants. Specific examples include the following: Carbon attenuated the blue-light induction of GLN2 in etiolated seedlings and also attenuated the white-, blue-, and red-light induction of GLN2 and ASN2 in light-grown plants. By contrast, carbon potentiated far-red-light induction of GLN2 and ASN2 in light-grown plants. Depending on the fluence rate of far-red light, carbon either attenuated or potentiated light repression of ASN1 in light-grown plants. These studies indicate the interaction of carbon with blue, red, and far-red-light signaling and set the stage for further investigation into modeling this complex web of interacting pathways using systems biology approaches.


Subjects
Arabidopsis/physiology ; Carbon/metabolism ; Light ; Models, Biological ; Arabidopsis/metabolism ; Arabidopsis/radiation effects ; Signal Transduction