Results 1 - 17 of 17
1.
Bioinformatics ; 27(3): 376-82, 2011 Feb 01.
Article in English | MEDLINE | ID: mdl-21067999

ABSTRACT

MOTIVATION: Theoretical models of biological networks are valuable tools in evolutionary inference. Theoretical models based on gene duplication and divergence provide biologically plausible evolutionary mechanics. Similarities found between empirical networks and their theoretically generated counterparts are considered evidence of the role modeled mechanics play in biological evolution. However, the method by which these models are parameterized can lead to questions about the validity of the inferences. Selecting parameter values in order to produce a particular topological value obfuscates the possibility that the model may produce a similar topology for a large range of parameter values. Alternatively, a model may produce a large range of topologies, allowing (incorrect) parameter values to produce a valid topology from an otherwise flawed model. In order to lend biological credence to the modeled evolutionary mechanics, parameter values should be derived from the empirical data. Furthermore, recent work indicates that the timing and fate of gene duplications are critical to proper derivation of these parameters. RESULTS: We present a methodology for deriving evolutionary rates from empirical data that is used to parameterize duplication and divergence models of protein interaction network evolution. Our method avoids shortcomings of previous methods, which failed to consider the effect of subsequent duplications. From our parameter values, we find that concurrent and existing duplication and divergence models are insufficient for modeling protein interaction network evolution. We introduce a model enhancement based on heritable interaction sites on the surface of a protein and find that it more closely reflects the high clustering found in the empirical network.


Subjects
Molecular Evolution, Biological Models, Saccharomyces cerevisiae Proteins/metabolism, Cluster Analysis, Saccharomyces cerevisiae
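
As a rough illustration of the duplication and divergence mechanics the abstract above refers to, the following Python sketch grows a network by copying a random node together with its interactions and then dropping each inherited interaction with some probability. The function name, the two-node seed network, and the retention probability `p_retain` are assumptions made for illustration only; they are not the empirically derived parameters or the enhanced model described in the paper.

```python
import random

def duplicate_and_diverge(n_final, p_retain, seed=0):
    """Grow an undirected network by node duplication followed by edge loss.

    Hypothetical sketch: starts from two connected nodes; each step copies a
    random node, and the copy keeps each inherited edge with probability
    p_retain (an assumed value, not one derived from empirical data).
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # adjacency sets of the assumed seed network
    while len(adj) < n_final:
        parent = rng.choice(sorted(adj))
        child = len(adj)
        adj[child] = set()
        for neighbor in sorted(adj[parent]):
            if rng.random() < p_retain:  # divergence: an inherited edge may be lost
                adj[child].add(neighbor)
                adj[neighbor].add(child)
    return adj

network = duplicate_and_diverge(n_final=50, p_retain=0.5)
edges = sum(len(nbrs) for nbrs in network.values()) // 2
print(len(network), "nodes,", edges, "edges")
```

Sweeping `p_retain` and comparing the resulting topologies is one way to see the concern raised in the abstract: a wide range of parameter values may yield similar networks, which is why the authors argue for deriving the values from data.
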
2.
PLoS Comput Biol ; 7(10): e1002244, 2011 Oct.
Article in English | MEDLINE | ID: mdl-22046118

ABSTRACT

Computer science has become ubiquitous in many areas of biological research, yet most high school and even college students are unaware of this. As a result, many college biology majors graduate without adequate computational skills for contemporary fields of biology. The absence of a computational element in secondary school biology classrooms is of growing concern to the computational biology community and biology teachers who would like to acquaint their students with updated approaches in the discipline. We present a first attempt to correct this absence by introducing a computational biology element to teach genetic evolution into advanced biology classes in two local high schools. Our primary goal was to show students how computation is used in biology and why a basic understanding of computation is necessary for research in many fields of biology. This curriculum is intended to be taught by a computational biologist who has worked with a high school advanced biology teacher to adapt the unit for his/her classroom, but a motivated high school teacher comfortable with mathematics and computing may be able to teach this alone. In this paper, we present our curriculum, which takes into consideration the constraints of the required curriculum, and discuss our experiences teaching it. We describe the successes and challenges we encountered while bringing this unit to high school students, discuss how we addressed these challenges, and make suggestions for future versions of this curriculum. We believe that our curriculum can be a valuable seed for further development of computational activities aimed at high school biology students. Further, our experiences may be of value to others teaching computational biology at this level. Our curriculum can be obtained at http://ecsite.cs.colorado.edu/?page_id=149#biology or by contacting the authors.


Subjects
Computational Biology/education, Curriculum, Schools, Adolescent, Algorithms, Data Mining, Humans, Phylogeny, DNA Sequence Analysis, Students
3.
Nature ; 436(7052): 861-5, 2005 Aug 11.
Article in English | MEDLINE | ID: mdl-16094371

ABSTRACT

Although numerous fundamental aspects of development have been uncovered through the study of individual genes and proteins, system-level models are still missing for most developmental processes. The first two cell divisions of Caenorhabditis elegans embryogenesis constitute an ideal test bed for a system-level approach. Early embryogenesis, including processes such as cell division and establishment of cellular polarity, is readily amenable to large-scale functional analysis. A first step toward a system-level understanding is to provide 'first-draft' models both of the molecular assemblies involved and of the functional connections between them. Here we show that such models can be derived from an integrated gene/protein network generated from three different types of functional relationship: protein interaction, expression profiling similarity and phenotypic profiling similarity, as estimated from detailed early embryonic RNA interference phenotypes systematically recorded for hundreds of early embryogenesis genes. The topology of the integrated network suggests that C. elegans early embryogenesis is achieved through coordination of a limited set of molecular machines. We assessed the overall predictive value of such molecular machine models by dynamic localization of ten previously uncharacterized proteins within the living embryo.


Subjects
Caenorhabditis elegans/embryology, Caenorhabditis elegans/metabolism, Embryonic Development, Biological Models, Systems Biology/methods, Algorithms, Animals, Caenorhabditis elegans/cytology, Caenorhabditis elegans/genetics, Cell Division, Cell Polarity, Embryonic Development/genetics, Gene Expression Profiling, Developmental Gene Expression Regulation, Phenotype, Protein Binding, RNA Interference, Recombinant Fusion Proteins/genetics, Recombinant Fusion Proteins/metabolism
4.
Nature ; 437(7062): 1173-8, 2005 Oct 20.
Article in English | MEDLINE | ID: mdl-16189514

ABSTRACT

Systematic mapping of protein-protein interactions, or 'interactome' mapping, was initiated in model organisms, starting with defined biological processes and then expanding to the scale of the proteome. Although far from complete, such maps have revealed global topological and dynamic features of interactome networks that relate to known biological properties, suggesting that a human interactome map will provide insight into development and disease mechanisms at a systems level. Here we describe an initial version of a proteome-scale map of human binary protein-protein interactions. Using a stringent, high-throughput yeast two-hybrid system, we tested pairwise interactions among the products of approximately 8,100 currently available Gateway-cloned open reading frames and detected approximately 2,800 interactions. This data set, called CCSB-HI1, has a verification rate of approximately 78% as revealed by an independent co-affinity purification assay, and correlates significantly with other biological attributes. The CCSB-HI1 data set increases by approximately 70% the set of available binary interactions within the tested space and reveals more than 300 new connections to over 100 disease-associated proteins. This work represents an important step towards a systematic and comprehensive human interactome project.


Subjects
Proteome/metabolism, Molecular Cloning, Humans, Open Reading Frames/genetics, Protein Binding, Proteome/genetics, RNA/genetics, RNA/metabolism, Saccharomyces cerevisiae/genetics, Two-Hybrid System Techniques
5.
PLoS Comput Biol ; 5(1): e1000252, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19119408

ABSTRACT

Gene duplication provides much of the raw material from which functional diversity evolves. Two evolutionary mechanisms have been proposed that generate functional diversity: neofunctionalization, the de novo acquisition of function by one duplicate, and subfunctionalization, the partitioning of ancestral functions between gene duplicates. With protein interactions as a surrogate for protein functions, evidence of prodigious neofunctionalization and subfunctionalization has been identified in analyses of empirical protein interactions and evolutionary models of protein interactions. However, we have identified three phenomena that have contributed to neofunctionalization being erroneously identified as a significant factor in protein interaction network evolution. First, self-interacting proteins are underreported in interaction data due to biological artifacts and design limitations in the two most common high-throughput protein interaction assays. Second, evolutionary inferences have been drawn from paralog analysis without consideration for concurrent and subsequent duplication events. Third, the theoretical model of prodigious neofunctionalization is unable to reproduce empirical network clustering and relies on untenable parameter requirements. In light of these findings, we believe that protein interaction evolution is more persuasively characterized by subfunctionalization and self-interactions.


Subjects
Molecular Evolution, Gene Duplication, Genetic Variation/genetics, Genetic Models, Protein Interaction Mapping/methods, Proteins/genetics, Signal Transduction/genetics, Computer Simulation
6.
Nature ; 430(6995): 88-93, 2004 Jul 01.
Article in English | MEDLINE | ID: mdl-15190252

ABSTRACT

In apparently scale-free protein-protein interaction networks, or 'interactome' networks, most proteins interact with few partners, whereas a small but significant proportion of proteins, the 'hubs', interact with many partners. Both biological and non-biological scale-free networks are particularly resistant to random node removal but are extremely sensitive to the targeted removal of hubs. A link between the potential scale-free topology of interactome networks and genetic robustness seems to exist, because knockouts of yeast genes encoding hubs are approximately threefold more likely to confer lethality than those of non-hubs. Here we investigate how hubs might contribute to robustness and other cellular properties for protein-protein interactions dynamically regulated both in time and in space. We uncovered two types of hub: 'party' hubs, which interact with most of their partners simultaneously, and 'date' hubs, which bind their different partners at different times or locations. Both in silico studies of network connectivity and genetic interactions described in vivo support a model of organized modularity in which date hubs organize the proteome, connecting biological processes--or modules--to each other, whereas party hubs function inside modules.


Subjects
Fungal Proteins/metabolism, Biological Models, Yeasts/metabolism, Computer Simulation, Fungal Proteins/genetics, Fungal Genes/genetics, Protein Binding, Yeasts/genetics
7.
BMC Bioinformatics ; 9: 198, 2008 Apr 15.
Article in English | MEDLINE | ID: mdl-18412966

ABSTRACT

BACKGROUND: Determining the function of uncharacterized proteins is a major challenge in the post-genomic era due to the problem's complexity and scale. Identifying a protein's function contributes to an understanding of its role in the involved pathways, its suitability as a drug target, and its potential for protein modifications. Several graph-theoretic approaches predict unidentified functions of proteins by using the functional annotations of better-characterized proteins in protein-protein interaction networks. We systematically consider the use of literature co-occurrence data, introduce a new method for quantifying the reliability of co-occurrence and test how performance differs across species. We also quantify changes in performance as the prediction algorithms annotate with increased specificity. RESULTS: We find that including information on the co-occurrence of proteins within an abstract greatly boosts performance in the Functional Flow graph-theoretic function prediction algorithm in yeast, fly and worm. This increase in performance is not simply due to the presence of additional edges, since supplementing protein-protein interactions with co-occurrence data outperforms supplementing with a comparably sized genetic interaction dataset. Through the combination of protein-protein interactions and co-occurrence data, the neighborhood around unknown proteins is quickly connected to well-characterized nodes that global prediction algorithms can exploit. Our method for quantifying co-occurrence reliability shows superior performance to the other methods, particularly at threshold values around 10%, which yield the best trade-off between coverage and accuracy. In contrast, the traditional way of asserting co-occurrence when at least one abstract mentions both proteins proves to be the worst method for generating co-occurrence data, introducing too many false positives. Annotating the functions with greater specificity is harder, but co-occurrence data still proves beneficial. CONCLUSION: Co-occurrence data is a valuable supplemental source for graph-theoretic function prediction algorithms. A rapidly growing literature corpus ensures that co-occurrence data is a readily available resource for nearly every studied organism, particularly those with small protein interaction databases. Though arguably biased toward known genes, co-occurrence data provides critical additional links to well-studied regions in the interaction network that graph-theoretic function prediction algorithms can exploit.


Subjects
Bibliographic Databases, Information Storage and Retrieval/methods, Natural Language Processing, Periodicals as Topic, Protein Interaction Mapping/methods, Proteins/classification, Proteins/metabolism, Sensitivity and Specificity, Systems Integration
8.
PLoS One ; 12(3): e0174052, 2017.
Article in English | MEDLINE | ID: mdl-28333956

ABSTRACT

BACKGROUND: The RNA binding proteins (RBPs) human antigen R (HuR) and Tristetraprolin (TTP) are known to exhibit competitive binding but have opposing effects on the bound messenger RNA (mRNA). How cells discriminate between the two proteins is an interesting problem. Machine learning approaches, such as support vector machines (SVMs), may be useful in the identification of discriminative features. However, this method has yet to be applied to studies of RNA binding protein motifs. RESULTS: Applying the k-spectrum kernel to a support vector machine (SVM), we first verified the published binding sites of both HuR and TTP. Additional feature engineering highlighted the U-rich binding preference of HuR and the AU-rich binding preference of TTP. Domain adaptation along with multi-task learning was used to predict the common binding sites. CONCLUSION: The distinction between HuR and TTP binding appears to lie in subtle content features. HuR prefers strongly U-rich sequences, whereas TTP prefers AU-rich sequences: with increasing A content, sequences are more likely to be bound only by TTP. Our model is consistent with competitive binding of the two proteins, particularly at intermediate AU-balanced sequences. This suggests that fine changes in the A/U balance within an untranslated region (UTR) can alter the binding and subsequent stability of the message. Both feature engineering and domain adaptation emphasized the extent to which these proteins recognize similar general sequence features. This work suggests that the k-spectrum kernel method could be useful when studying RNA binding proteins, and that domain adaptation techniques such as feature augmentation could be employed, particularly when examining RBPs with similar binding preferences.


Subjects
ELAV-Like Protein 1/metabolism, Support Vector Machine, Tristetraprolin/metabolism, Binding Sites, Catalytic Domain, Humans, Theoretical Models
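
For readers unfamiliar with the k-spectrum kernel named in the abstract above, a minimal Python sketch follows. It scores two sequences by the dot product of their k-mer count vectors. The example sequences and the choice of k = 2 are invented for illustration and are not taken from the paper's data; feeding the kernel into an SVM, as the authors did, is not shown here.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(a, b, k):
    """k-spectrum kernel: dot product of the two k-mer count vectors."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    return sum(count * counts_b[kmer] for kmer, count in counts_a.items())

# Hypothetical toy sequences: one U-rich, one AU-rich.
u_rich = "UUUUUUUU"
au_rich = "AUAUAUAU"
print(spectrum_kernel(u_rich, u_rich, 2))
print(spectrum_kernel(u_rich, au_rich, 2))
```

Because the kernel depends only on k-mer content, two sequences with similar base composition score highly against each other, which matches the abstract's point that the HuR and TTP preferences differ in subtle content features.
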
9.
PLoS One ; 12(4): e0175988, 2017.
Article in English | MEDLINE | ID: mdl-28403197

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0174052.].

10.
F1000Res ; 2: 172, 2013.
Article in English | MEDLINE | ID: mdl-26913183

ABSTRACT

Many protein complexes are densely packed, so proteins within complexes often interact with several other proteins in the complex. Steric constraints prevent most proteins from simultaneously binding more than a handful of other proteins, regardless of the number of proteins in the complex. Because of this, as complex size increases, several measures of the complex decrease within protein-protein interaction networks. However, k-connectivity, the number of vertices or edges that need to be removed in order to disconnect a graph, may be consistently high for protein complexes. The property of k-connectivity has been little used previously in the investigation of protein-protein interactions. To understand the discriminative power of k-connectivity and other topological measures for identifying unknown protein complexes, we characterized these properties in known Saccharomyces cerevisiae protein complexes in networks generated both from highly accurate X-ray crystallography experiments which give an accurate model of each complex, and also as the complexes appear in high-throughput yeast 2-hybrid studies in which new complexes may be discovered. We also computed these properties for appropriate random subgraphs. We found that clustering coefficient, mutual clustering coefficient, and k-connectivity are better indicators of known protein complexes than edge density, degree, or betweenness. This suggests new directions for future protein complex-finding algorithms.
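
The two measures the abstract above singles out can be sketched in a few lines of Python. The code below computes vertex k-connectivity by brute force (practical only for very small graphs) and the local clustering coefficient of a node. The two four-protein toy graphs and all function names are assumptions for illustration; they are not the complexes or the implementation used in the study.

```python
from itertools import combinations

def is_connected(adj, removed=frozenset()):
    """Depth-first search over the graph with the given vertices deleted."""
    nodes = [n for n in adj if n not in removed]
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nbr in adj[stack.pop()]:
            if nbr not in removed and nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(nodes)

def vertex_connectivity(adj):
    """Smallest number of vertices whose removal disconnects the graph."""
    n = len(adj)
    for k in range(n - 1):
        for removed in combinations(adj, k):
            if not is_connected(adj, frozenset(removed)):
                return k
    return n - 1  # usual convention for a complete graph

def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (len(nbrs) * (len(nbrs) - 1))

# Hypothetical toy graphs: a densely packed complex and a ring.
clique = {p: set("ABCD") - {p} for p in "ABCD"}
ring = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
for name, graph in (("clique", clique), ("ring", ring)):
    print(name, vertex_connectivity(graph), clustering_coefficient(graph, "A"))
```

For networks of realistic size, a flow-based connectivity algorithm would be needed instead of the exhaustive search used in this sketch.
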

11.
Pac Symp Biocomput ; : 190-202, 2009.
Article in English | MEDLINE | ID: mdl-19213136

ABSTRACT

Protein interaction network analyses have moved beyond simple topological observations to functional and evolutionary inferences based on the construction of putative ancestral networks. Evolutionary studies of protein interaction networks are generally derived from network comparisons, are limited in scope, or are theoretic dynamic models that aren't contextualized to an organism's specific genes. A biologically faithful network evolution reconstruction which ties evolution of the network itself to the actual genes of an organism would help fill in the evolutionary gaps between the gene network "snapshots" of evolution we have from different species today. Here we present a novel framework for reverse engineering the evolution of protein interaction networks of extant species using phylogenetic gene trees and protein interaction data. We applied the framework to Saccharomyces cerevisiae data and present topological trends in the evolutionary lineage of yeast.


Subjects
Molecular Evolution, Protein Engineering/statistics & numerical data, Protein Interaction Mapping/statistics & numerical data, Biometry, Fungal Proteins/chemistry, Fungal Proteins/genetics, Genetic Models, Saccharomyces cerevisiae Proteins/chemistry, Saccharomyces cerevisiae Proteins/genetics
12.
Pac Symp Biocomput ; : 433-44, 2007.
Article in English | MEDLINE | ID: mdl-17990508

ABSTRACT

Integrating diverse sources of interaction information to create protein networks requires strategies sensitive to differences in accuracy and coverage of each source. Previous integration approaches calculate reliabilities of protein interaction information sources based on congruity to a designated 'gold standard.' In this paper, we provide a comparison of the two most popular existing approaches and propose a novel alternative for assessing reliabilities which does not require a gold standard. We identify a new method for combining the resultant reliabilities and compare it against an existing method. Further, we propose an extrinsic approach to evaluation of reliability estimates, considering their influence on the downstream tasks of inferring protein function and learning regulatory networks from expression data. Results using this evaluation method show 1) our method for reliability estimation is an attractive alternative to those requiring a gold standard and 2) the new method for combining reliabilities is less sensitive to noise in reliability assignments than the similar existing technique.


Subjects
Protein Interaction Mapping/statistics & numerical data, Algorithms, Animals, Artificial Intelligence, Bayes Theorem, Computational Biology, Genetic Databases, Fungal Genes, Likelihood Functions, Mice, Reproducibility of Results
13.
J Biol ; 4(2): 6, 2005.
Article in English | MEDLINE | ID: mdl-15982408

ABSTRACT

BACKGROUND: Large-scale studies have revealed networks of various biological interaction types, such as protein-protein interaction, genetic interaction, transcriptional regulation, sequence homology, and expression correlation. Recurring patterns of interconnection, or 'network motifs', have revealed biological insights for networks containing either one or two types of interaction. RESULTS: To study more complex relationships involving multiple biological interaction types, we assembled an integrated Saccharomyces cerevisiae network in which nodes represent genes (or their protein products) and differently colored links represent the aforementioned five biological interaction types. We examined three- and four-node interconnection patterns containing multiple interaction types and found many enriched multi-color network motifs. Furthermore, we showed that most of the motifs form 'network themes' -- classes of higher-order recurring interconnection patterns that encompass multiple occurrences of network motifs. Network themes can be tied to specific biological phenomena and may represent more fundamental network design principles. Examples of network themes include a pair of protein complexes with many inter-complex genetic interactions -- the 'compensatory complexes' theme. Thematic maps -- networks rendered in terms of such themes -- can simplify an otherwise confusing tangle of biological relationships. We show this by mapping the S. cerevisiae network in terms of two specific network themes. CONCLUSION: Significantly enriched motifs in an integrated S. cerevisiae interaction network are often signatures of network themes, higher-order network structures that correspond to biological phenomena. Representing networks in terms of network themes provides a useful simplification of complex biological relationships.


Subjects
Fungal Gene Expression Regulation, Fungal Genes/genetics, Genome, Saccharomyces cerevisiae/genetics, Amino Acid Motifs/genetics, Fungal Proteins/genetics, Fungal Proteins/metabolism, Biological Models, Saccharomyces cerevisiae/metabolism, Systems Integration
14.
Proc Natl Acad Sci U S A ; 100(8): 4372-6, 2003 Apr 15.
Article in English | MEDLINE | ID: mdl-12676999

ABSTRACT

Experimentally determined networks are susceptible to errors, yet important inferences can still be drawn from them. Many real networks have also been shown to have the small-world network properties of cohesive neighborhoods and short average distances between vertices. Although much analysis has been done on small-world networks, small-world properties have not previously been used to improve our understanding of individual edges in experimentally derived graphs. Here we focus on a small-world network derived from high-throughput (and error-prone) protein-protein interaction experiments. We exploit the neighborhood cohesiveness property of small-world networks to assess confidence for individual protein-protein interactions. By ascertaining how well each protein-protein interaction (edge) fits the pattern of a small-world network, we stratify even those edges with identical experimental evidence. This result promises to improve the quality of inference from protein-protein interaction networks in particular and small-world networks in general.


Subjects
Biological Models, Proteins/metabolism, Cluster Analysis, Mathematics, Saccharomyces cerevisiae Proteins/metabolism
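
The idea in the abstract above of scoring an individual edge by how cohesive its neighborhood is can be sketched as follows. The abstract does not name a specific score, so this Python example uses the Jaccard index of the two endpoints' neighbor sets as one simple, assumed choice. The toy network and the function name are hypothetical.

```python
def jaccard_edge_score(adj, u, v):
    """Shared neighbors of u and v divided by all their neighbors.

    The endpoints themselves are excluded so the edge under test
    does not contribute to its own score.
    """
    nbrs_u = adj[u] - {v}
    nbrs_v = adj[v] - {u}
    union = nbrs_u | nbrs_v
    return len(nbrs_u & nbrs_v) / len(union) if union else 0.0

# Hypothetical toy network: A, B, C, D form a cohesive group; E hangs off D.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"A", "B", "E"},
    "E": {"D"},
}
for u, v in (("A", "B"), ("A", "D"), ("D", "E")):
    print(u, v, round(jaccard_edge_score(toy, u, v), 2))
```

Edges inside the cohesive group score higher than the dangling D-E edge, so two interactions with identical experimental evidence can still be ranked differently, which is the stratification the abstract describes.
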
15.
Proc Natl Acad Sci U S A ; 101(44): 15682-7, 2004 Nov 02.
Article in English | MEDLINE | ID: mdl-15496468

ABSTRACT

Genetic interactions define overlapping functions and compensatory pathways. In particular, synthetic sick or lethal (SSL) genetic interactions are important for understanding how an organism tolerates random mutation, i.e., genetic robustness. Comprehensive identification of SSL relationships remains far from complete in any organism, because mapping these networks is highly labor intensive. The ability to predict SSL interactions, however, could efficiently guide further SSL discovery. Toward this end, we predicted pairs of SSL genes in Saccharomyces cerevisiae by using probabilistic decision trees to integrate multiple types of data, including localization, mRNA expression, physical interaction, protein function, and characteristics of network topology. Experimental evidence demonstrated the reliability of this strategy, which, when extended to human SSL interactions, may prove valuable in discovering drug targets for cancer therapy and in identifying genes responsible for multigenic diseases.


Subjects
Genetic Models, Mutation, Phenotype, Animals, Genetic Databases, Decision Trees, Genotype, Statistical Models
16.
Science ; 303(5657): 540-3, 2004 Jan 23.
Article in English | MEDLINE | ID: mdl-14704431

ABSTRACT

To initiate studies on how protein-protein interaction (or "interactome") networks relate to multicellular functions, we have mapped a large fraction of the Caenorhabditis elegans interactome network. Starting with a subset of metazoan-specific proteins, more than 4000 interactions were identified from high-throughput, yeast two-hybrid (HT-Y2H) screens. Independent coaffinity purification assays experimentally validated the overall quality of this Y2H data set. Together with already described Y2H interactions and interologs predicted in silico, the current version of the Worm Interactome (WI5) map contains approximately 5500 interactions. Topological and biological features of this interactome network, as well as its integration with phenome and transcriptome data sets, lead to numerous biological hypotheses.


Subjects
Caenorhabditis elegans Proteins/metabolism, Caenorhabditis elegans/metabolism, Proteome/metabolism, Animals, Caenorhabditis elegans/genetics, Caenorhabditis elegans Proteins/genetics, Computational Biology, Molecular Evolution, Helminth Genes, Genomics, Open Reading Frames, Phenotype, Protein Binding, Genetic Transcription, Two-Hybrid System Techniques
17.
Science ; 303(5659): 808-13, 2004 Feb 06.
Article in English | MEDLINE | ID: mdl-14764870

ABSTRACT

A genetic interaction network containing approximately 1000 genes and approximately 4000 interactions was mapped by crossing mutations in 132 different query genes into a set of approximately 4700 viable yeast gene deletion mutants and scoring the double mutant progeny for fitness defects. Network connectivity was predictive of function because interactions often occurred among functionally related genes, and similar patterns of interactions tended to identify components of the same pathway. The genetic network exhibited dense local neighborhoods; therefore, the position of a gene on a partially mapped network is predictive of other genetic interactions. Because digenic interactions are common in yeast, similar networks may underlie the complex genetics associated with inherited phenotypes in other organisms.


Subjects
Fungal Genes, Saccharomyces cerevisiae Proteins/metabolism, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Amino Acid Sequence, Computational Biology, Cystic Fibrosis/genetics, Gene Deletion, Essential Genes, Inborn Genetic Diseases/genetics, Genotype, Humans, Molecular Sequence Data, Multifactorial Inheritance, Mutation, Phenotype, Genetic Polymorphism, Retinitis Pigmentosa/genetics, Saccharomyces cerevisiae Proteins/chemistry, Saccharomyces cerevisiae Proteins/genetics