1.
PLoS Comput Biol ; 8(5): e1002490, 2012.
Article in English | MEDLINE | ID: mdl-22589706

ABSTRACT

Understanding genotype-phenotype associations is not only important for furthering our knowledge of internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic subsystem refers to a metabolic network subgraph in which nodes are compounds and edge labels are the enzymes that catalyze the reactions. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic subsystems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other subsystems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. Also, the set of phenotype-biased metabolic subsystems output by NIBBS comes very close to the set of phenotype-biased subgraphs output by an exact maximally-biased subgraph enumeration algorithm (MBS-Enum). The code (NIBBS and the module to visualize the identified subsystems) is available at http://freescience.org/cs/NIBBS.


Subjects
Data Mining/methods; Databases, Protein; Metabolome/physiology; Models, Biological; Protein Interaction Mapping/methods; Proteome/metabolism; Signal Transduction/physiology; Algorithms; Animals; Computer Simulation; Humans; Periodicals as Topic; Phenotype
2.
Proteome Sci ; 10 Suppl 1: S2, 2012 Jun 21.
Article in English | MEDLINE | ID: mdl-22759578

ABSTRACT

BACKGROUND: Phenotypes exhibited by microorganisms can be useful for several purposes, e.g., ethanol as an alternate fuel. Sometimes, the target phenotype may be required in combination with other phenotypes in order to be useful; e.g., an industrial process may require that the organism survive in an anaerobic, alcohol-rich environment and be able to feed on both hexose and pentose sugars to produce ethanol. This combination of traits may not be available in any existing organism or, if it does exist, the mechanisms involved in the phenotype expression may not be efficient enough to be useful. Thus, it may be required to genetically modify microorganisms. However, before any genetic modification can take place, it is important to identify the underlying cellular subsystems responsible for the expression of the target phenotype. RESULTS: In this paper, we develop a method to identify statistically significant and phenotypically-biased functional modules. The method can compare the organismal network information from hundreds of phenotype-expressing and phenotype non-expressing organisms to identify cellular subsystems that are more prone to occur in phenotype-expressing organisms than in phenotype non-expressing organisms. We have provided literature evidence that the phenotype-biased modules identified for phenotypes such as hydrogen production (dark and light fermentation), respiration, gram-positive, gram-negative and motility are indeed phenotype-related. CONCLUSION: Thus we have proposed a methodology to identify phenotype-biased cellular subsystems. We have shown the effectiveness of our methodology by applying it to several target phenotypes. The code and all supplemental files can be downloaded from http://freescience.org/cs/phenotype-biased-biclusters/.

3.
BMC Bioinformatics ; 12: 440, 2011 Nov 11.
Article in English | MEDLINE | ID: mdl-22078292

ABSTRACT

BACKGROUND: Microbial communities in their natural environments exhibit phenotypes that can directly cause particular diseases, convert biomass or wastewater to energy, or degrade various environmental contaminants. Understanding how these communities realize specific phenotypic traits (e.g., carbon fixation, hydrogen production) is critical for addressing health, bioremediation, or bioenergy problems. RESULTS: In this paper, we describe a graph-theoretical method for in silico prediction of the cellular subsystems that are related to the expression of a target phenotype. The proposed (α, β)-motif finder approach allows for identification of these phenotype-related subsystems that, in addition to metabolic subsystems, could include their regulators, sensors, transporters, and even uncharacterized proteins. By comparing dozens of genome-scale networks of functionally associated proteins, our method efficiently identifies those statistically significant functional modules that are in at least α networks of phenotype-expressing organisms but appear in no more than β networks of organisms that do not exhibit the target phenotype. It has been shown via various experiments that the enumerated modules are indeed related to phenotype-expression when tested with different target phenotypes like hydrogen production, motility, aerobic respiration, and acid-tolerance. CONCLUSION: Thus, we have proposed a methodology that can identify potential statistically significant phenotype-related functional modules. The functional module is modeled as an (α, β)-clique, where α and β are two criteria introduced in this work. We also propose a novel network model, called the two-typed, divided network. The new network model and the criteria make the problem tractable even while very large networks are being compared. The code can be downloaded from http://www.freescience.org/cs/ABClique/
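
The two-threshold criterion in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each organism's network is a set of undirected edges and that a module appears in a network when all of its edges are present. The function names and the toy protein labels are hypothetical and are not taken from the published ABClique code; the sketch only checks a given candidate module and does not enumerate modules.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def make_edges(pairs: Iterable[Tuple[str, str]]) -> Set[Edge]:
    """Store undirected edges as frozensets so (u, v) equals (v, u)."""
    return {frozenset(p) for p in pairs}


def support(module: Set[Edge], networks: Iterable[Set[Edge]]) -> int:
    """Count the networks that contain every edge of the module."""
    return sum(1 for net in networks if module <= net)


def is_alpha_beta_module(
    module: Set[Edge],
    expressing: Iterable[Set[Edge]],
    non_expressing: Iterable[Set[Edge]],
    alpha: int,
    beta: int,
) -> bool:
    """True if the module occurs in at least `alpha` phenotype-expressing
    networks and in no more than `beta` non-expressing networks."""
    return (
        support(module, expressing) >= alpha
        and support(module, non_expressing) <= beta
    )
```

With α close to the number of expressing organisms and β close to zero, only modules that are strongly biased toward the phenotype pass the check.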


Subjects
Acids/metabolism; Algorithms; Bacteria/genetics; Bacteria/metabolism; Computing Methodologies; Citric Acid Cycle; Hydrogen/metabolism; Phenotype; Proteobacteria
4.
BMC Bioinformatics ; 11: 118, 2010 Mar 05.
Article in English | MEDLINE | ID: mdl-20205730

ABSTRACT

BACKGROUND: High-resolution tandem mass spectra can now be readily acquired with hybrid instruments, such as LTQ-Orbitrap and LTQ-FT, in high-throughput shotgun proteomics workflows. The improved spectral quality enables more accurate de novo sequencing for identification of post-translational modifications and amino acid polymorphisms. RESULTS: In this study, a new de novo sequencing algorithm, called Vonode, has been developed specifically for analysis of such high-resolution tandem mass spectra. To fully exploit the high mass accuracy of these spectra, a unique scoring system is proposed to evaluate sequence tags based primarily on mass accuracy information of fragment ions. Consensus sequence tags were inferred for 11,422 spectra with an average peptide length of 5.5 residues from a total of 40,297 input spectra acquired in a 24-hour proteomics measurement of Rhodopseudomonas palustris. The accuracy of inferred consensus sequence tags was 84%. According to our comparison, the performance of Vonode was shown to be superior to the PepNovo v2.0 algorithm, in terms of the number of de novo sequenced spectra and the sequencing accuracy. CONCLUSIONS: Here, we improved de novo sequencing performance by developing a new algorithm specifically for high-resolution tandem mass spectral data. The Vonode algorithm is freely available for download at http://compbio.ornl.gov/Vonode.


Subjects
Algorithms; Proteomics/methods; Tandem Mass Spectrometry/methods; Amino Acid Sequence; Databases, Protein; Peptides/chemistry; Protein Processing, Post-Translational; Sequence Analysis, Protein
5.
Mol Cell Proteomics ; 7(5): 938-48, 2008 May.
Article in English | MEDLINE | ID: mdl-18156135

ABSTRACT

In this study, the pathway for anaerobic catabolism of p-coumarate by a model bacterium, Rhodopseudomonas palustris, was characterized by comparing the gene expression profiles of cultures grown in the presence of p-coumarate, benzoate, or succinate as the sole carbon sources. Gene expression was quantified at the mRNA level with transcriptomics and at the protein level with quantitative proteomics using 15N metabolic labeling. Protein relative abundances, along with their confidence intervals for statistical significance evaluation, were estimated with the software ProRata. Both -omics measurements were used as the transcriptomics provided near-full genome coverage of gene expression profiles and the quantitative proteomics ascertained abundance changes of over 1600 proteins. The integrated gene expression data are consistent with the hypothesis that p-coumarate is converted to benzoyl-CoA, which is then degraded via a known aromatic ring reduction pathway. For the metabolism of p-coumarate to benzoyl-CoA, two alternative routes, a beta-oxidation route and a non-beta-oxidation route, are possible. The integrated gene expression data provided strong support for the non-beta-oxidation route in R. palustris. A putative gene was proposed for every step in the non-beta-oxidation route.


Subjects
Bacterial Proteins/metabolism; Coumaric Acids/metabolism; Gene Expression Profiling; Proteomics; Rhodopseudomonas/growth & development; Rhodopseudomonas/metabolism; Anaerobiosis/genetics; Bacterial Proteins/analysis; Bacterial Proteins/genetics; Benzoates/metabolism; Protein Biosynthesis/genetics; RNA, Messenger/analysis; RNA, Messenger/metabolism; Rhodopseudomonas/genetics; Succinic Acid/metabolism
6.
Bioinformatics ; 24(7): 979-86, 2008 Apr 01.
Article in English | MEDLINE | ID: mdl-18304937

ABSTRACT

MOTIVATION: Recent improvements in high-throughput Mass Spectrometry (MS) technology have expedited genome-wide discovery of protein-protein interactions by providing a capability of detecting protein complexes in a physiological setting. Computational inference of protein interaction networks and protein complexes from MS data is challenging. Advances are required in developing robust and seamlessly integrated procedures for assessment of protein-protein interaction affinities, mathematical representation of protein interaction networks, discovery of protein complexes and evaluation of their biological relevance. RESULTS: A multi-step but easy-to-follow framework for identifying protein complexes from MS pull-down data is introduced. It assesses interaction affinity between two proteins based on similarity of their co-purification patterns derived from MS data. It constructs a protein interaction network by adopting a knowledge-guided threshold selection method. Based on the network, it identifies protein complexes and infers their core components using a graph-theoretical approach. It deploys a statistical evaluation procedure to assess biological relevance of each found complex. On Saccharomyces cerevisiae pull-down data, the framework outperformed other more complicated schemes by at least 10% in F1-measure and identified 610 protein complexes with high functional homogeneity based on the enrichment in Gene Ontology (GO) annotation. Manual examination of the complexes brought forward hypotheses on the causes of false identifications. Namely, co-purification of different protein complexes as mediated by a common non-protein molecule, such as DNA, might be a source of false positives. Protein identification bias in pull-down technology, such as the hydrophilic bias, could result in false negatives.


Subjects
Databases, Protein; Gene Expression Profiling/methods; Models, Biological; Protein Interaction Mapping/methods; Proteins/chemistry; Proteins/metabolism; Signal Transduction/physiology; Algorithms; Biology/methods; Computer Simulation; Information Storage and Retrieval/methods; Peptide Mapping/methods; Structure-Activity Relationship; Systems Integration
7.
BMC Med Inform Decis Mak ; 9 Suppl 1: S5, 2009 Nov 03.
Article in English | MEDLINE | ID: mdl-19891799

ABSTRACT

BACKGROUND: Publication databases in biomedicine (e.g., PubMed, MEDLINE) are growing rapidly in size every year, as are public databases of experimental biological data and annotations derived from the data. Publications often contain evidence that confirms or disproves annotations, such as putative protein functions; however, it is increasingly difficult for biologists to identify and process published evidence due to the volume of papers and the lack of a systematic approach to associate published evidence with experimental data and annotations. Natural Language Processing (NLP) tools can help address the growing divide by providing automatic high-throughput detection of simple terms in publication text. However, NLP tools are not mature enough to identify complex terms, relationships, or events. RESULTS: In this paper we present and extend BioDEAL, a community evidence annotation system that introduces a feedback loop into the database-publication cycle to allow scientists to connect data-driven biological concepts to publications. CONCLUSION: BioDEAL may change the way biologists relate published evidence with experimental data. Instead of biologists or research groups searching and managing evidence independently, the community can collectively build and share this knowledge.


Subjects
Databases as Topic/organization & administration; Information Storage and Retrieval/methods; Natural Language Processing; Periodicals as Topic; Database Management Systems; Internet; Social Support
8.
Sci Rep ; 8(1): 7490, 2018 05 10.
Article in English | MEDLINE | ID: mdl-29748598

ABSTRACT

Sex differences in Alzheimer's disease (AD) biology and progression are not yet fully characterized. The goal of this study is to examine the effect of sex on cognitive progression in subjects with high likelihood of mild cognitive impairment (MCI) due to Alzheimer's who were followed for up to 10 years in the Alzheimer's Disease Neuroimaging Initiative (ADNI). Cerebrospinal fluid total-tau and amyloid-beta (Aβ42) ratio values were used to sub-classify 559 MCI subjects (216 females, 343 males) as having "high" or "low" likelihood for MCI due to Alzheimer's. Data were analyzed using mixed-effects models incorporating all follow-ups. The worsening from baseline in Alzheimer's Disease Assessment Scale-Cognitive score (mean ± SD) in subjects with high likelihood of MCI due to Alzheimer's (9 ± 12) was markedly greater than that in subjects with low likelihood (1 ± 6, p < 0.0001). Among MCI due to AD subjects, the mean worsening in cognitive score was significantly greater in females (11.58 ± 14) than in males (6.87 ± 11, p = 0.006). Our results highlight the need to further investigate these findings in other populations and to develop sex-specific timelines for Alzheimer's disease progression.


Subjects
Alzheimer Disease/epidemiology; Alzheimer Disease/etiology; Cognition/physiology; Cognitive Dysfunction/epidemiology; Cognitive Dysfunction/etiology; Sex Characteristics; Aged; Aged, 80 and over; Alzheimer Disease/diagnosis; Alzheimer Disease/pathology; Cognitive Dysfunction/diagnosis; Disease Progression; Female; Humans; Longitudinal Studies; Male; Neuroimaging; Neuropsychological Tests; Prevalence; Retrospective Studies; Risk Factors
9.
IEEE Trans Vis Comput Graph ; 13(5): 991-1003, 2007.
Article in English | MEDLINE | ID: mdl-17622682

ABSTRACT

Remote visualization is an enabling technology aiming to resolve the barrier of physical distance. While many researchers have developed innovative algorithms for remote visualization, previous work has focused little on systematically investigating optimal configurations of remote visualization architectures. In this paper, we study caching and prefetching, an important aspect of such architecture design, in order to optimize the fetch time in a remote visualization system. Unlike a processor cache or web cache, caching for remote visualization is unique and complex. Through actual experimentation and numerical simulation, we have discovered ways to systematically evaluate and search for optimal configurations of remote visualization caches under various scenarios, such as different network speeds, sizes of data for user requests, prefetch schemes, cache depletion schemes, etc. We have also designed a practical infrastructure software to adaptively optimize the caching architecture of general remote visualization systems, when a different application is started or the network condition varies. The lower bound of achievable latency discovered with our approach can aid the design of remote visualization algorithms and the selection of suitable network layouts for a remote visualization system.


Subjects
Computer Communication Networks; Computer Graphics; Data Compression/methods; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Models, Theoretical; User-Computer Interface; Algorithms; Computer Simulation; Computer Systems; Numerical Analysis, Computer-Assisted; Signal Processing, Computer-Assisted; Time Factors
10.
J Comput Biol ; 24(12): 1195-1211, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28891687

ABSTRACT

The problem of aligning multiple metabolic pathways is one of the very challenging problems in computational biology. A metabolic pathway consists of three types of entities: reactions, compounds, and enzymes. Based on similarities between enzymes, Tohsato et al. gave an algorithm for aligning multiple metabolic pathways. However, the algorithm given by Tohsato et al. neglects the similarities among reactions, compounds, enzymes, and pathway topology. Designing algorithms for the alignment of multiple metabolic pathways based on the similarity of reactions, compounds, and enzymes is a difficult computational problem. In this article, we propose an algorithm for the problem of aligning multiple metabolic pathways based on the similarities among reactions, compounds, enzymes, and pathway topology. First, we compute a weight between each pair of like entities in different input pathways based on the entities' similarity score and topological structure using Ay et al.'s methods. We then construct a weighted k-partite graph for the reactions, compounds, and enzymes. We extract a mapping between these entities by solving the maximum-weighted k-partite matching problem with a novel heuristic algorithm. By analyzing the alignment results of multiple pathways in different organisms, we show that the alignments found by our algorithm correctly identify common subnetworks among multiple pathways.


Subjects
Algorithms; Computational Biology/methods; Metabolic Networks and Pathways; Humans; Protein Interaction Mapping; Sequence Alignment
11.
J Mol Biol ; 352(5): 1105-17, 2005 Oct 07.
Article in English | MEDLINE | ID: mdl-16140329

ABSTRACT

The binding between an enzyme and its substrate is highly specific, despite the fact that many different enzymes show significant sequence and structure similarity. There must be, then, substrate specificity-determining residues that enable different enzymes to recognize their unique substrates. We reason that a coordinated, not independent, action of both conserved and non-conserved residues determines enzymatic activity and specificity. Here, we present a surface patch ranking (SPR) method for in silico discovery of substrate specificity-determining residue clusters by exploring both sequence conservation and correlated mutations. As case studies we apply SPR to several highly homologous enzymatic protein pairs, such as guanylyl versus adenylyl cyclases, lactate versus malate dehydrogenases, and trypsin versus chymotrypsin. Without using experimental data, we predict several single and multi-residue clusters that are consistent with previous mutagenesis experimental results. Most single-residue clusters are directly involved in enzyme-substrate interactions, whereas multi-residue clusters are vital for domain-domain and regulator-enzyme interactions, indicating their complementary role in specificity determination. These results demonstrate that SPR may help the selection of target residues for mutagenesis experiments and, thus, focus rational drug design, protein engineering, and functional annotation on the relevant regions of a protein.


Subjects
Amino Acids/chemistry; Amino Acids/physiology; Computational Biology; Enzymes/chemistry; Enzymes/physiology; Adenylyl Cyclases/physiology; Amino Acid Sequence; Animals; Binding Sites/physiology; Cattle; Chymotrypsin/physiology; Crystallography, X-Ray; Enzymes/genetics; Guanylate Cyclase/physiology; L-Lactate Dehydrogenase/physiology; Malate Dehydrogenase/physiology; Molecular Sequence Data; Protein Structure, Tertiary; Substrate Specificity/physiology; Trypsin/chemistry; Trypsin/physiology
12.
Comput Biol Chem ; 30(1): 39-49, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16321569

ABSTRACT

A key to advancing the understanding of molecular biology in the post-genomic age is the development of accurate predictive models for genetic regulation, protein interaction, metabolism, and other biochemical processes. To facilitate model development, simulation algorithms must provide an accurate representation of the system, while performing the simulation in a reasonable amount of time. Gillespie's stochastic simulation algorithm (SSA) accurately depicts spatially homogeneous models with small populations of chemical species and properly represents noise, but it is often abandoned when modeling larger systems because of its computational complexity. In this work, we examine the performance of different versions of the SSA when applied to several biochemical models. Through our analysis, we discover that transient changes in reaction execution frequencies, which are typical of biochemical models with gene induction and repression, can dramatically affect simulator performance. To account for these shifts, we propose a new algorithm called the sorting direct method that maintains a loosely sorted order of the reactions as the simulation executes. Our measurements show that the sorting direct method performs favorably when compared to other well-known exact stochastic simulation algorithms.
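
The idea of keeping reactions loosely sorted by firing frequency can be sketched as follows. This is a minimal illustration written from the abstract's description, not the authors' implementation: it assumes the loose ordering is maintained by moving a reaction one place forward in the search list each time it fires, and the function name, the reaction representation (a propensity function paired with a state-change dictionary), and the `seed` parameter are all hypothetical.

```python
import math
import random


def sorting_direct_ssa(state, reactions, t_end, seed=0):
    """Exact stochastic simulation with a loosely sorted reaction search order.

    reactions: list of (propensity_fn, change) pairs, where propensity_fn maps
    the state dict to a non-negative rate and change maps species to deltas.
    """
    rng = random.Random(seed)
    state = dict(state)                   # work on a copy of the input state
    order = list(range(len(reactions)))   # search order, updated as reactions fire
    fired = [0] * len(reactions)
    t = 0.0
    while True:
        props = [fn(state) for fn, _ in reactions]
        a0 = sum(props)
        if a0 <= 0.0:
            break                         # no reaction can fire any more
        tau = -math.log(1.0 - rng.random()) / a0   # exponential waiting time
        if t + tau > t_end:
            break
        t += tau
        # Linear search through the current order for the firing reaction.
        target = rng.random() * a0
        acc = 0.0
        pos = 0
        for k, j in enumerate(order):
            if props[j] > 0.0:
                pos = k                   # remember the last reaction able to fire
            acc += props[j]
            if acc > target:
                break
        j = order[pos]
        for species, delta in reactions[j][1].items():
            state[species] += delta
        fired[j] += 1
        # Assumed update rule: bubble the fired reaction one step forward so
        # frequently firing reactions are found earlier in later searches.
        if pos > 0:
            order[pos - 1], order[pos] = order[pos], order[pos - 1]
    return t, state, fired, order
```

Because only the search order changes, the reaction selected for a given random number has the same probability as in the unsorted direct method; the reordering affects search cost, not the sampled distribution.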


Subjects
Models, Chemical; Stochastic Processes; Systems Biology/methods; Algorithms; Aliivibrio fischeri/chemistry; Escherichia coli/chemistry
13.
Protein Eng Des Sel ; 18(12): 589-96, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16246824

ABSTRACT

Ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCo) catalyzes a rate-limiting step in photosynthetic carbon assimilation (reacting with CO2) and its competitive photo-respiratory carbon oxidation (reacting with O2). RuBisCo enzyme with an enhanced CO2/O2 specificity would boost the ability to make great progress in agricultural production and environmental management. RuBisCos in marine non-green algae, resulting from an earlier endo-symbiotic event, diverge greatly from those in green plants and cyanobacteria and, further, have the highest CO2/O2 specificity whereas RuBisCos in cyanobacteria have the lowest. We assumed that there exist different levels of CO2/O2 specificity-determining factors, corresponding to different evolutionary events and specificity levels. Based on this assumption, we devised a scheme to identify these substrate-determining factors. From this analysis, we are able to discover different categories of the CO2/O2 specificity-determining factors that show which residue substitutions account for (relatively) small specificity changes, as happened in green plants, or a tremendous enhancement, as observed in marine non-green algae. Therefore, the analysis can improve our understanding of molecular mechanisms in the substrate specificity development and prioritize candidate specificity-determining surface residues for site-directed mutagenesis.


Subjects
Carbon Dioxide/metabolism; Oxygen/metabolism; Ribulose-Bisphosphate Carboxylase/genetics; Amino Acid Sequence; Computational Biology; Cyanobacteria/enzymology; Databases, Protein; Eukaryota/enzymology; Evolution, Molecular; Models, Molecular; Molecular Sequence Data; Mutation; Plants/enzymology; Ribulose-Bisphosphate Carboxylase/metabolism; Sequence Homology, Amino Acid; Substrate Specificity
14.
IEEE Trans Pattern Anal Mach Intell ; 27(8): 1340-3, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16119272

ABSTRACT

FastMap is a dimension reduction technique that operates on distances between objects. Although only distances are used, implicitly the technique assumes that the objects are points in a p-dimensional Euclidean space. It selects a sequence of k ≤ p orthogonal axes defined by distant pairs of points (called pivots) and computes the projection of the points onto the orthogonal axes. We show that FastMap uses only the outer envelope of a data set. Pivots are taken from the faces, usually vertices, of the convex hull of the data points in the original implicit Euclidean space. This provides a bridge to results in robust statistics, where the convex hull is used as a tool in multivariate outlier detection and in robust estimation methods. The connection sheds new light on the properties of FastMap, particularly its sensitivity to outliers, and provides an opportunity for a new class of dimension reduction algorithms, RobustMaps, that retain the speed of FastMap and exploit ideas in robust statistics.
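
The pivot-and-project step described in this abstract can be sketched in a few lines. The sketch below follows the commonly described FastMap scheme (a farthest-point heuristic for choosing pivots, a law-of-cosines projection onto the pivot line, and a residual-distance update); the function name, the use of a full distance matrix as input, and the choice of object 0 as the starting point for the pivot search are assumptions made for illustration.

```python
import math


def fastmap(dist, k):
    """Embed n objects in k dimensions from an n x n distance matrix."""
    n = len(dist)
    # Work with squared distances; they are reduced after each axis.
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    coords = [[0.0] * k for _ in range(n)]
    pivots = []
    for axis in range(k):
        # Pivot heuristic: start anywhere, jump to the farthest object twice.
        a = max(range(n), key=lambda i: d2[0][i])
        b = max(range(n), key=lambda i: d2[a][i])
        dab2 = d2[a][b]
        if dab2 <= 1e-12:
            break                      # all residual distances are zero
        pivots.append((a, b))
        dab = math.sqrt(dab2)
        # Law-of-cosines projection of every object onto the line a-b.
        x = [(d2[a][i] + dab2 - d2[b][i]) / (2.0 * dab) for i in range(n)]
        for i in range(n):
            coords[i][axis] = x[i]
        # Residual squared distances in the hyperplane orthogonal to a-b.
        for i in range(n):
            for j in range(n):
                d2[i][j] = max(0.0, d2[i][j] - (x[i] - x[j]) ** 2)
    return coords, pivots
```

Since each pivot is the farthest object from some other object, the chosen pivots tend to sit on the outer envelope of the data, which is consistent with the sensitivity to outliers that the abstract points out.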


Subjects
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Cluster Analysis; Computer Simulation; Models, Statistical; Multivariate Analysis; Signal Processing, Computer-Assisted; Subtraction Technique
15.
J Bioinform Comput Biol ; 13(2): 1550003, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25477149

ABSTRACT

Recently, researchers seeking to understand, modify, and create beneficial traits in organisms have looked for evolutionarily conserved patterns of protein interactions. Their conservation likely means that the proteins of these conserved functional modules are important to the trait's expression. In this paper, we formulate the problem of identifying these conserved patterns as a graph optimization problem, and develop a fast heuristic algorithm for this problem. We compare the performance of our network alignment algorithm to that of the MaWISh algorithm [Koyutürk M, Kim Y, Topkara U, Subramaniam S, Szpankowski W, Grama A, Pairwise alignment of protein interaction networks, J Comput Biol 13(2):182-199, 2006], which bases its search algorithm on a related decision problem formulation. We find that our algorithm discovers conserved modules with a larger number of proteins in an order of magnitude less time. The protein sets found by our algorithm correspond to known conserved functional modules at precision and recall rates comparable to those produced by the MaWISh algorithm.


Subjects
Algorithms; Protein Interaction Maps; Sequence Alignment/statistics & numerical data; Animals; Computational Biology; Conserved Sequence; Gene Ontology/statistics & numerical data; Humans; Protein Interaction Mapping/statistics & numerical data
16.
OMICS ; 6(4): 305-30, 2002.
Article in English | MEDLINE | ID: mdl-12626091

ABSTRACT

The U.S. Department of Energy recently announced the first five grants for the Genomes to Life (GTL) Program. The goal of this program is to "achieve the most far-reaching of all biological goals: a fundamental, comprehensive, and systematic understanding of life." While more information about the program can be found at the GTL website (www.doegenomestolife.org), this paper provides an overview of one of the five GTL projects funded, "Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling." This project is a combined experimental and computational effort emphasizing developing, prototyping, and applying new computational tools and methods to elucidate the biochemical mechanisms of the carbon sequestration of Synechococcus sp., an abundant marine cyanobacterium known to play an important role in the global carbon cycle. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO2 are important terms in the global environmental response to anthropogenic atmospheric inputs of CO2 and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. The project includes five subprojects: an experimental investigation, three computational biology efforts, and a fifth which addresses computational infrastructure challenges of relevance to this project and the Genomes to Life program as a whole.
Our experimental effort is designed to provide biology and data to drive the computational efforts and includes significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, and identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Our computational efforts include coupling molecular simulation methods with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes and developing a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. Furthermore, given that the ultimate goal of this effort is to develop a systems-level understanding of how the Synechococcus genome affects carbon fixation at the global scale, we will develop and apply a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. Finally, because the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats, we have also established a companion computational infrastructure to support this effort as well as the Genomes to Life program as a whole.


Subjects
Carbon/metabolism , Cyanobacteria/physiology , Genome , Algorithms , Carbon/physiology , Cyanobacteria/metabolism , Mass Spectrometry , Biological Models , Statistical Models , Research/trends , Software
17.
Int J Genomics ; 2013: 670623, 2013.
Article in English | MEDLINE | ID: mdl-23710435

ABSTRACT

A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. Most alignment tools focus on finding conserved interaction regions across the PPI networks through either local or global mapping of similar sequences. Researchers are still trying to improve the speed, scalability, and accuracy of network alignment. In view of this, we introduce a fast, connected-components-based algorithm, HopeMap, for network alignment. Observing that the number of true orthologs across species is small compared to the total number of proteins in all species, we take a different approach based on a precompiled list of homologs identified by KO terms. Applying this approach to three species pairs (S. cerevisiae (yeast) and D. melanogaster (fly); E. coli K12 and S. typhimurium; E. coli K12 and C. crescentus), we analyze all clusters identified in the alignment. The results are evaluated using up-to-date gene annotations, Gene Ontology (GO) terms, and KEGG ortholog groups (KO). Compared to existing tools, our approach is fast, with linear computational cost; highly accurate in terms of KO and GO term specificity and sensitivity; and easily extended to multiple alignments.
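
The abstract above does not spell out the HopeMap procedure, so the following is only a minimal sketch of one way a connected-components alignment over a precompiled KO homolog list might look. The function name `align_by_ko`, the input formats, and the rule for which interactions to keep are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def align_by_ko(ppi_a, ppi_b, ko_a, ko_b):
    """Group proteins from two PPI networks into aligned clusters.

    ppi_a, ppi_b: iterables of (protein, protein) interaction pairs.
    ko_a, ko_b:   dicts mapping protein -> KO term (hypothetical format).
    Returns a sorted list of clusters; each cluster is a sorted list of
    (species_tag, protein) tuples, with species_tag "a" or "b".
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Assumption: proteins that share a KO term are treated as homologs.
    by_ko = defaultdict(list)
    for tag, ko in (("a", ko_a), ("b", ko_b)):
        for protein, term in ko.items():
            by_ko[term].append((tag, protein))

    shared = set()
    for members in by_ko.values():
        # Keep only KO terms that occur in both species.
        if {tag for tag, _ in members} == {"a", "b"}:
            shared.update(members)
            for member in members[1:]:
                union(members[0], member)

    # Assumption: keep an interaction only if both endpoints have homologs.
    for tag, ppi in (("a", ppi_a), ("b", ppi_b)):
        for u, v in ppi:
            if (tag, u) in shared and (tag, v) in shared:
                union((tag, u), (tag, v))

    # Each connected component becomes one aligned cluster.
    clusters = defaultdict(set)
    for node in shared:
        clusters[find(node)].add(node)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])
```

Because every homolog pair and every interaction is visited once and merged with union-find, this sketch runs in near-linear time in the input size, which is in the spirit of the linear cost reported in the abstract.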

18.
Neuroimage Clin ; 3: 123-31, 2013 Aug 07.
Article in English | MEDLINE | ID: mdl-24179856

ABSTRACT

Neuropsychiatric disorders such as schizophrenia, bipolar disorder and Alzheimer's disease are major public health problems. However, despite decades of research, we currently have no validated prognostic or diagnostic tests that can be applied at an individual patient level. Many neuropsychiatric diseases are due to a combination of alterations that occur in a human brain rather than the result of localized lesions. While there is hope that newer imaging technologies such as functional and anatomic connectivity MRI or molecular imaging may offer breakthroughs, the single biomarkers that are discovered using these datasets are limited by their inability to capture the heterogeneity and complexity of most multifactorial brain disorders. Recently, complex biomarkers have been explored to address this limitation using neuroimaging data. In this manuscript we consider the nature of complex biomarkers being investigated in the recent literature and present techniques to find such biomarkers that have been developed in related areas of data mining, statistics, machine learning and bioinformatics.

19.
PLoS One ; 7(4): e33744, 2012.
Article in English | MEDLINE | ID: mdl-22496762

ABSTRACT

In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, existing methods often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method assigns a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS and its p-value measure the functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that the functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated.
We have evaluated our method using Saccharomyces cerevisiae data from the KEGG and MIPS databases and several other computationally derived and curated datasets. The code and additional supplemental files can be obtained from http://code.google.com/p/functional-annotation-of-hierarchical-modularity/ (Accessed 2012 March 13).


Subjects
Algorithms , Computational Biology/methods , Metabolic Networks and Pathways , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Cluster Analysis , Factual Databases , Protein Interaction Mapping , Saccharomyces cerevisiae Proteins/genetics