Results 1 - 20 of 26
1.
Sci Rep ; 7: 45863, 2017 04 10.
Article in English | MEDLINE | ID: mdl-28393921

ABSTRACT

Class I hydrophobins are functional amyloids secreted by fungi. They self-assemble into organized films at interfaces producing structures that include cellular adhesion points and hydrophobic coatings. Here, we present the first structure and solution properties of a unique Class I protein sequence of Basidiomycota origin: the Schizophyllum commune hydrophobin SC16 (hyd1). While the core β-barrel structure and disulphide bridging characteristic of the hydrophobin family are conserved, its surface properties and secondary structure elements are reminiscent of both Class I and II hydrophobins. Sequence analyses of hydrophobins from 215 fungal species suggest this structure is largely applicable to a high-identity Basidiomycota Class I subdivision (IB). To validate this prediction, structural analysis of a comparatively distinct Class IB sequence from a different fungal order, namely the Phanerochaete carnosa PcaHyd1, indicates secondary structure properties similar to those of SC16. Together, these results form an experimental basis for a high-identity Class I subdivision and contribute to our understanding of functional amyloid formation.


Subject(s)
Amyloid/chemistry; Fungal Proteins/chemistry; Schizophyllum/chemistry; Amino Acid Sequence/genetics; Amyloid/genetics; Amyloid/ultrastructure; Fungal Proteins/genetics; Fungal Proteins/ultrastructure; Humans; Microscopy, Atomic Force; Protein Structure, Secondary; Schizophyllum/genetics; Surface Properties; Water/chemistry
2.
Biochim Biophys Acta Proteins Proteom ; 1865(1): 43-54, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27718363

ABSTRACT

Therapeutic protein kinase inhibitors are designed on the basis of kinase structures. Here, we define intrinsically disordered regions (IDRs) in structurally hybrid kinases. We reveal that 65% of kinases have an IDR adjacent to their kinase domain (KD). These IDRs are evolutionarily more conserved than IDRs distant to KDs. Strikingly, 36 kinases have adjacent IDRs extending into their KDs, defining a unique structural and functional subset of the kinome. Functional network analysis of this subset of the kinome uncovered FAK1 as topologically the most connected hub kinase. We find that the KD-flanking IDR of FAK1 is more conserved and undergoes more post-translational modifications than other IDRs. It preferentially interacts with proteins regulating scaffolding and kinase activity, which contribute to cytoskeletal remodeling. In summary, spatially and evolutionarily conserved IDRs in kinases may influence their functions, which can be exploited for targeted therapies in diseases including those that involve aberrant cytoskeletal remodeling.


Subject(s)
Cytoskeleton/metabolism; Focal Adhesion Kinase 1/chemistry; Cytoskeleton/enzymology; Focal Adhesion Kinase 1/metabolism; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/metabolism; Protein Conformation; Protein Processing, Post-Translational
3.
Data Brief ; 10: 315-324, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28004021

ABSTRACT

We present data on the evolution of intrinsically disordered regions (IDRs) taking into account the entire human protein kinome. The evolutionary data of the IDRs with respect to the kinase domains (KDs) and kinases as a whole protein (WP) are reported. Further, we report the post-translational modifications of FAK1 IDRs and their contribution to cytoskeletal remodeling. We also report the data to build a protein-protein interaction (PPI) network of primary and secondary FAK1-interacting hybrid proteins. Detailed analysis of the data and its effect on FAK1-related functions has been described in "Structural pliability adjacent to the kinase domain highlights contribution of FAK1 IDRs to cytoskeletal remodeling" (Kathiriya et al., 2016) [1].

4.
Data Brief ; 6: 715-21, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26870755

ABSTRACT

Our analysis examines the conservation of multiprotein complexes among metazoa through use of high resolution biochemical fractionation and precision mass spectrometry applied to soluble cell extracts from 5 representative model organisms: Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Strongylocentrotus purpuratus, and Homo sapiens. The interaction network obtained from the data was validated globally in 4 distant species (Xenopus laevis, Nematostella vectensis, Dictyostelium discoideum, Saccharomyces cerevisiae) and locally by targeted affinity-purification experiments. Here we provide details of our massive set of supporting biochemical fractionation data available via ProteomeXchange (PXD002319-PXD002328), PPIs via BioGRID (185267), and interaction network projections via http://metazoa.med.utoronto.ca, all made fully accessible to allow further exploration. The datasets here are related to the research article on metazoan macromolecular complexes in Nature [1].

5.
Nature ; 525(7569): 339-44, 2015 Sep 17.
Article in English | MEDLINE | ID: mdl-26344197

ABSTRACT

Macromolecular complexes are essential to conserved biological processes, but their prevalence across animals is unclear. By combining extensive biochemical fractionation with quantitative mass spectrometry, here we directly examined the composition of soluble multiprotein complexes among diverse metazoan models. Using an integrative approach, we generated a draft conservation map consisting of more than one million putative high-confidence co-complex interactions for species with fully sequenced genomes that encompasses functional modules present broadly across all extant animals. Clustering reveals a spectrum of conservation, ranging from ancient eukaryotic assemblies that have probably served cellular housekeeping roles for at least one billion years, ancestral complexes that have accrued contemporary components, and rarer metazoan innovations linked to multicellularity. We validated these projections by independent co-fractionation experiments in evolutionarily distant species, affinity purification and functional analyses. The comprehensiveness, centrality and modularity of these reconstructed interactomes reflect their fundamental mechanistic importance and adaptive value to animal cell systems.


Subject(s)
Evolution, Molecular; Multiprotein Complexes/chemistry; Multiprotein Complexes/metabolism; Protein Interaction Maps; Animals; Datasets as Topic; Humans; Protein Interaction Mapping; Reproducibility of Results; Systems Biology; Tandem Mass Spectrometry
6.
BMC Bioinformatics ; 15: 157, 2014 May 22.
Article in English | MEDLINE | ID: mdl-24886131

ABSTRACT

BACKGROUND: Several methods are available for the detection of covarying positions from a multiple sequence alignment (MSA). If the MSA contains a large number of sequences, information about the proximities between residues derived from covariation maps can be sufficient to predict a protein fold. However, in many cases the structure is already known, and information on the covarying positions can be valuable to understand the protein mechanism and dynamic properties. RESULTS: In this study we have sought to determine whether a multivariate (multidimensional) extension of traditional mutual information (MI) can be an additional tool to study covariation. The performance of two multidimensional MI (mdMI) methods, designed to remove the effect of ternary/quaternary interdependencies, was tested with a set of 9 MSAs each containing <400 sequences, and was shown to be comparable to that of the newest methods based on maximum entropy/pseudolikelihood statistical models of protein sequences. However, while all the methods tested detected a similar number of covarying pairs among the residues separated by < 8 Å in the reference X-ray structures, there was on average less than 65% overlap between the top scoring pairs detected by methods that are based on different principles. CONCLUSIONS: Given the large variety of structure and evolutionary history of different proteins it is possible that a single best method to detect covariation in all proteins does not exist, and that for each protein family the best information can be derived by merging/comparing results obtained with different methods. This approach may be particularly valuable in those cases in which the size of the MSA is small or the quality of the alignment is low, leading to significant differences in the pairs detected by different methods.


Subject(s)
Sequence Alignment/methods; Sequence Analysis, Protein; Models, Statistical; Protein Structure, Secondary; Proteins/chemistry; Proteins/classification
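
The traditional pairwise mutual information (MI) that record 6 takes as its starting point can be illustrated with a minimal sketch. This is the textbook two-column MI only, not the multidimensional mdMI methods evaluated in the paper; the function name and the toy alignment are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    freq_a = Counter(col_a)                # residue counts in column a
    freq_b = Counter(col_b)                # residue counts in column b
    freq_ab = Counter(zip(col_a, col_b))   # joint counts of residue pairs
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts to avoid extra divisions
        mi += p_ab * log2(p_ab * n * n / (freq_a[a] * freq_b[b]))
    return mi


# Hypothetical toy alignment: rows are sequences, columns are positions.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
columns = list(zip(*toy_msa))
mi_01 = column_mutual_information(columns[0], columns[1])  # perfectly covarying
mi_02 = column_mutual_information(columns[0], columns[2])  # column 2 is invariant
```

In this toy case the first two columns change together and score 1 bit, while the invariant third column carries no covariation signal and scores 0.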
7.
PLoS Genet ; 9(2): e1003280, 2013.
Article in English | MEDLINE | ID: mdl-23468640

ABSTRACT

Expansions of trinucleotide CAG/CTG repeats in somatic tissues are thought to contribute to ongoing disease progression through an affected individual's life with Huntington's disease or myotonic dystrophy. Broad ranges of repeat instability arise between individuals with expanded repeats, suggesting the existence of modifiers of repeat instability. Mice with expanded CAG/CTG repeats show variable levels of instability depending upon mouse strain. However, to date the genetic modifiers underlying these differences have not been identified. We show that in liver and striatum the R6/1 Huntington's disease (HD) (CAG)∼100 transgene, when present in a congenic C57BL/6J (B6) background, incurred expansion-biased repeat mutations, whereas the repeat was stable in a congenic BALB/cByJ (CBy) background. Reciprocal congenic mice revealed the Msh3 gene as the determinant for the differences in repeat instability. Expansion bias was observed in congenic mice homozygous for the B6 Msh3 gene on a CBy background, while the CAG tract was stabilized in congenics homozygous for the CBy Msh3 gene on a B6 background. The CAG stabilization was as dramatic as genetic deficiency of Msh2. The B6 and CBy Msh3 genes had identical promoters but differed in coding regions and showed strikingly different protein levels. B6 MSH3 variant protein is highly expressed and associated with CAG expansions, while the CBy MSH3 variant protein is expressed at barely detectable levels, associating with CAG stability. The DHFR protein, which is divergently transcribed from a promoter shared by the Msh3 gene, did not show varied levels between mouse strains. Thus, naturally occurring MSH3 protein polymorphisms are modifiers of CAG repeat instability, likely through variable MSH3 protein stability. 
Since evidence supports that somatic CAG instability is a modifier and predictor of disease, our data are consistent with the hypothesis that variable levels of CAG instability associated with polymorphisms of DNA repair genes may have prognostic implications for various repeat-associated diseases.


Subject(s)
Huntington Disease/genetics; Proteins/genetics; Trinucleotide Repeat Expansion/genetics; Trinucleotide Repeats/genetics; Animals; Corpus Striatum/metabolism; Disease Models, Animal; Genomic Instability; Humans; Mice; MutS Homolog 3 Protein; Myotonic Dystrophy/genetics; Myotonic Dystrophy/metabolism; Neostriatum/metabolism; Nerve Tissue Proteins/genetics; Nerve Tissue Proteins/metabolism; Polymorphism, Genetic; Protein Stability
8.
Mol Biol Evol ; 30(2): 332-46, 2013 Feb.
Article in English | MEDLINE | ID: mdl-22977115

ABSTRACT

Protein interaction networks play central roles in biological systems, from simple metabolic pathways through complex programs permitting the development of organisms. Multicellularity could only have arisen from a careful orchestration of cellular and molecular roles and responsibilities, all properly controlled and regulated. Disease reflects a breakdown of this organismal homeostasis. To better understand the evolution of interactions whose dysfunction may be contributing factors to disease, we derived the human protein coevolution network using our MatrixMatchMaker algorithm and using the Orthologous MAtrix project (OMA) database as a source for protein orthologs from 103 eukaryotic genomes. We annotated the coevolution network using protein-protein interaction data, many functional data sources, and we explored the evolutionary rates and dates of emergence of the proteins in our data set. Strikingly, clustering based only on the topology of the coevolution network partitions it into two subnetworks, one generally representing ancient eukaryotic functions and the other functions more recently acquired during animal evolution. That latter subnetwork is enriched for proteins with roles in cell-cell communication, the control of cell division, and related multicellular functions. Further annotation using data from genetic disease databases and cancer genome sequences strongly implicates these proteins in both ciliopathies and cancer. The enrichment for such disease markers in the animal network suggests a functional link between these coevolving proteins. Genetic validation corroborates the recruitment of ancient cilia in the evolution of multicellularity.


Subject(s)
Biological Evolution; Cell Communication/physiology; Proteins/genetics; Proteins/metabolism; Animals; Ciliary Motility Disorders/genetics; Ciliary Motility Disorders/metabolism; Cluster Analysis; Databases, Protein; Female; Gene Expression; Humans; Male; Mutation; Neoplasms/genetics; Neoplasms/metabolism; Protein Binding; Protein Interaction Mapping; Protein Interaction Maps
9.
PLoS One ; 7(10): e47108, 2012.
Article in English | MEDLINE | ID: mdl-23091608

ABSTRACT

BACKGROUND: While the conserved positions of a multiple sequence alignment (MSA) are clearly of interest, non-conserved positions can also be important because, for example, destabilizing effects at one position can be compensated by stabilizing effects at another position. Different methods have been developed to recognize the evolutionary relationship between amino acid sites, and to disentangle functional/structural dependencies from historical/phylogenetic ones. METHODOLOGY/PRINCIPAL FINDINGS: We have used two complementary approaches to test the efficacy of these methods. In the first approach, we have used a new program, MSAvolve, for the in silico evolution of MSAs, which records a detailed history of all covarying positions, and builds a global coevolution matrix as the accumulated sum of individual matrices for the positions forced to co-vary, the recombinant coevolution, and the stochastic coevolution. We have simulated over 1600 MSAs for 8 protein families, which reflect sequences of different sizes and proteins with widely different functions. The calculated coevolution matrices were compared with the coevolution matrices obtained for the same evolved MSAs with different coevolution detection methods. In a second approach we have evaluated the capacity of the different methods to predict close contacts in the representative X-ray structures of an additional 150 protein families using only experimental MSAs. CONCLUSIONS/SIGNIFICANCE: Methods based on the identification of global correlations between pairs were found to be generally superior to methods based only on local correlations in their capacity to identify coevolving residues using either simulated or experimental MSAs. 
However, the significant variability in the performance of different methods with different proteins suggests that the simulation of MSAs that replicate the statistical properties of the experimental MSA can be a valuable tool to identify the coevolution detection method that is most effective in each case.


Subject(s)
Computational Biology/methods; Evolution, Molecular; Proteins/chemistry; Proteins/genetics; Sequence Alignment; Amino Acid Sequence; Computer Simulation; Molecular Sequence Data; Reproducibility of Results
10.
Cell ; 150(5): 1068-81, 2012 Aug 31.
Article in English | MEDLINE | ID: mdl-22939629

ABSTRACT

Cellular processes often depend on stable physical associations between proteins. Despite recent progress, knowledge of the composition of human protein complexes remains limited. To close this gap, we applied an integrative global proteomic profiling approach, based on chromatographic separation of cultured human cell extracts into more than one thousand biochemical fractions that were subsequently analyzed by quantitative tandem mass spectrometry, to systematically identify a network of 13,993 high-confidence physical interactions among 3,006 stably associated soluble human proteins. Most of the 622 putative protein complexes we report are linked to core biological processes and encompass both candidate disease genes and unannotated proteins to inform on mechanism. Strikingly, whereas larger multiprotein assemblies tend to be more extensively annotated and evolutionarily conserved, human protein complexes with five or fewer subunits are far more likely to be functionally unannotated or restricted to vertebrates, suggesting more recent functional innovations.


Subject(s)
Multiprotein Complexes/analysis; Protein Interaction Maps; Proteins/chemistry; Proteomics/methods; Humans; Tandem Mass Spectrometry
11.
Methods Mol Biol ; 781: 237-56, 2011.
Article in English | MEDLINE | ID: mdl-21877284

ABSTRACT

Bioinformatic methods to predict protein-protein interactions (PPI) via coevolutionary analysis have positioned themselves to compete alongside established in vitro methods, despite a lack of understanding of the underlying molecular mechanisms of the coevolutionary process. Investigating the alignment of coevolutionary predictions of PPI with experimental data can focus the effective scope of prediction and lead to better accuracies. A new rate-based coevolutionary method, MMM, preferentially finds obligate interacting proteins that form complexes, conforming to results from studies based on coimmunoprecipitation coupled with mass spectrometry. Using gold-standard databases as a benchmark for accuracy, MMM surpasses methods based on abundance ratios, suggesting that correlated evolutionary rates may yet be better than coexpression at predicting interacting proteins. At the level of protein domains, coevolution is difficult to detect, even with MMM, except when considering small-scale experimental data involving proteins with multiple domains. Overall, these findings confirm that coevolutionary methods can be confidently used in predicting PPI, either independently or as drivers of coimmunoprecipitation experiments.


Subject(s)
Biological Evolution; Computational Biology; Protein Interaction Mapping/methods; Proteins/chemistry; Proteins/metabolism; Algorithms; Immunoprecipitation; Phylogeny; Protein Binding
12.
Algorithms Mol Biol ; 6: 17, 2011 Jun 14.
Article in English | MEDLINE | ID: mdl-21672226

ABSTRACT

BACKGROUND: The MatrixMatchMaker algorithm was recently introduced to detect the similarity between phylogenetic trees and thus the coevolution between proteins. MMM finds the largest common submatrices between pairs of phylogenetic distance matrices, and has numerous advantages over existing methods of coevolution detection. However, these advantages came at the cost of a very long execution time. RESULTS: In this paper, we show that the problem of finding the maximum submatrix reduces to a multiple maximum clique subproblem on a graph of protein pairs. This allowed us to develop a new algorithm and program implementation, MMMvII, which achieved more than 600× speedup with comparable accuracy to the original MMM. CONCLUSIONS: MMMvII will thus allow for more extensive and intricate analyses of coevolution. AVAILABILITY: An implementation of the MMMvII algorithm is available at: http://www.uhnresearch.ca/labs/tillier/MMMWEBvII/MMMWEBvII.php.
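
The reduction described in record 12, from a largest common submatrix to a maximum clique on a graph of pairs, can be illustrated with a small sketch. It is one plausible reading of the abstract, not the MMMvII implementation: the absolute `tolerance` criterion, the function names, and the toy matrices are all assumptions, and the real method also has to handle differences in scale between matrices.

```python
from itertools import combinations


def compatibility_graph(dist_a, dist_b, tolerance):
    """Nodes pair a row of dist_a with a row of dist_b.

    Two nodes are linked when the distances they imply agree within
    `tolerance` (an assumed, simplified matching criterion).
    """
    nodes = [(i, j) for i in range(len(dist_a)) for j in range(len(dist_b))]
    neighbours = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # each taxon may be matched at most once
        if abs(dist_a[i][k] - dist_b[j][l]) <= tolerance:
            neighbours[(i, j)].add((k, l))
            neighbours[(k, l)].add((i, j))
    return neighbours


def maximum_clique(neighbours):
    """Bron-Kerbosch search without pivoting; returns one largest clique."""
    best = []

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            if len(clique) > len(best):
                best = list(clique)
            return
        for node in list(candidates):
            expand(clique + [node],
                   candidates & neighbours[node],
                   excluded & neighbours[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand([], set(neighbours), set())
    return best


# Hypothetical distance matrices: taxa 0, 2, 3 of dist_b mirror dist_a,
# while taxon 1 of dist_b is an unrelated outlier.
dist_a = [[0, 2, 5],
          [2, 0, 4],
          [5, 4, 0]]
dist_b = [[0, 9, 2, 5],
          [9, 0, 9, 9],
          [2, 9, 0, 4],
          [5, 9, 4, 0]]
match = sorted(maximum_clique(compatibility_graph(dist_a, dist_b, 0.1)))
```

Each clique corresponds to a set of row pairings whose pairwise distances all agree, so the largest clique gives the largest common submatrix under this simplified criterion. The exhaustive search is exponential in the worst case and is only suitable for toy inputs.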

13.
Appl Environ Microbiol ; 77(15): 5361-9, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21666017

ABSTRACT

Dehalococcoides spp. are an industrially relevant group of Chloroflexi bacteria capable of reductively dechlorinating contaminants in groundwater environments. Existing Dehalococcoides genomes revealed a high level of sequence identity within this group, including 98 to 100% 16S rRNA sequence identity between strains with diverse substrate specificities. Common molecular techniques for identification of microbial populations are often not applicable for distinguishing Dehalococcoides strains. Here we describe an oligonucleotide microarray probe set designed based on clustered Dehalococcoides genes from five different sources (strain DET195, CBDB1, BAV1, and VS genomes and the KB-1 metagenome). This "pangenome" probe set provides coverage of core Dehalococcoides genes as well as strain-specific genes while optimizing the potential for hybridization to closely related, previously unknown Dehalococcoides strains. The pangenome probe set was compared to probe sets designed independently for each of the five Dehalococcoides strains. The pangenome probe set demonstrated better predictability and higher detection of Dehalococcoides genes than strain-specific probe sets on nontarget strains with <99% average nucleotide identity. An in silico analysis of the expected probe hybridization against the recently released Dehalococcoides strain GT genome and additional KB-1 metagenome sequence data indicated that the pangenome probe set performs more robustly than the combined strain-specific probe sets in the detection of genes not included in the original design. The pangenome probe set represents a highly specific, universal tool for the detection and characterization of Dehalococcoides from contaminated sites. It has the potential to become a common platform for Dehalococcoides-focused research, allowing meaningful comparisons between microarray experiments regardless of the strain examined.


Subject(s)
Bacterial Typing Techniques/methods; Chloroflexi/genetics; Oligonucleotide Array Sequence Analysis/methods; Base Sequence; DNA, Bacterial/analysis; DNA, Bacterial/genetics; Multigene Family; Nucleic Acid Hybridization/genetics; Oligonucleotide Probes/genetics; Proteomics/methods; RNA, Ribosomal, 16S/analysis; RNA, Ribosomal, 16S/genetics; Sequence Alignment; Sequence Analysis, DNA
14.
Biochem Cell Biol ; 88(2): 185-94, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20453921

ABSTRACT

GroEL is a chaperone thought of as essential for bacterial life. However, some species of Mollicutes are missing GroEL. We use phylogenetic analysis to show that the presence of GroEL is polyphyletic among the Mollicutes, and that there is evidence for lateral gene transfer of GroEL to Mycoplasma penetrans from the Proteobacteria. Furthermore, we propose that the presence of GroEL in Mycoplasma may be required for invasion of host tissue, suggesting that GroEL may act as an adhesin-invasin.


Subject(s)
Chaperonin 60/genetics; Chaperonin 60/metabolism; Tenericutes/genetics; Tenericutes/metabolism; Chaperonin 60/chemistry; Phylogeny; Tenericutes/chemistry
15.
Microb Biotechnol ; 3(6): 677-90, 2010 Nov.
Article in English | MEDLINE | ID: mdl-21255363

ABSTRACT

One hundred and seventy-one genes encoding potential esterases from 11 bacterial genomes were cloned and overexpressed in Escherichia coli; 74 of the clones produced soluble proteins. All 74 soluble proteins were purified and screened for esterase activity; 36 proteins showed carboxyl esterase activity on short-chain esters, 17 demonstrated arylesterase activity, while 38 proteins did not exhibit any activity towards the test substrates. Esterases from Rhodopseudomonas palustris (RpEST-1, RpEST-2 and RpEST-3), Pseudomonas putida (PpEST-1, PpEST-2 and PpEST-3), Pseudomonas aeruginosa (PaEST-1) and Streptomyces avermitilis (SavEST-1) were selected for detailed biochemical characterization. All of the enzymes showed optimal activity at neutral or alkaline pH, and the half-life of each enzyme at 50°C ranged from < 5 min to over 5 h. PpEST-3, RpEST-1 and RpEST-2 demonstrated the highest specific activity with pNP-esters; these enzymes were also among the most stable at 50°C and in the presence of detergents, polar and non-polar organic solvents, and imidazolium ionic liquids. Accordingly, these enzymes are particularly interesting targets for subsequent application trials. Finally, biochemical and bioinformatic analyses were compared to reveal sequence features that could be correlated to enzymes with arylesterase activity, facilitating subsequent searches for new esterases in microbial genome sequences.


Subject(s)
Bacteria/enzymology; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Carboxylic Ester Hydrolases/genetics; Carboxylic Ester Hydrolases/metabolism; Genome, Bacterial; Bacterial Proteins/chemistry; Bacterial Proteins/isolation & purification; Carboxylic Ester Hydrolases/chemistry; Carboxylic Ester Hydrolases/isolation & purification; Computational Biology; Enzyme Stability; Hydrogen-Ion Concentration; Substrate Specificity; Temperature
16.
Proteins ; 78(3): 548-58, 2010 Feb 15.
Article in English | MEDLINE | ID: mdl-19768681

ABSTRACT

Correlated mutation analysis (CMA) is an effective approach for predicting functional and structural residue interactions from multiple sequence alignments (MSAs) of proteins. As nearby residues may also play a role in a given functional interaction, we were interested in seeing whether covarying sites were clustered, and whether this could be used to enhance the predictive power of CMA. A large-scale search for coevolving regions within protein domains revealed that if two sites in an MSA covary, then neighboring sites in the alignment also typically covary, resulting in clusters of covarying residues. The program PatchD (http://www.uhnres.utoronto.ca/labs/tillier/) was developed to measure the covariation between disconnected sequence clusters to reveal patch covariation. Patches that exhibit strong covariation identify multiple residues that are generally nearby in the protein structure, suggesting that the detection of covarying patches can be used in conjunction with traditional CMA approaches to reveal functional interaction partners.


Subject(s)
DNA Mutational Analysis/methods; Models, Genetic; Proteins/chemistry; Proteins/genetics; Amino Acid Sequence; Binding Sites; Cluster Analysis; Conserved Sequence; Genetic Variation; Models, Molecular; Phylogeny; Proteins/metabolism; Sequence Alignment
17.
Genome Res ; 19(10): 1861-71, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19696150

ABSTRACT

Coevolution maintains interactions between phenotypic traits through the process of reciprocal natural selection. Detecting molecular coevolution can expose functional interactions between molecules in the cell, generating insights into biological processes, pathways, and the networks of interactions important for cellular function. Prediction of interaction partners from different protein families exploits the property that interacting proteins can follow similar patterns and relative rates of evolution. Current methods for detecting coevolution based on the similarity of phylogenetic trees or evolutionary distance matrices have, however, been limited by requiring coevolution over the entire evolutionary history considered and are inaccurate in the presence of paralogous copies. We present a novel method for determining coevolving protein partners by finding the largest common submatrix in a given pair of distance matrices, with the size of the largest common submatrix measuring the strength of coevolution. This approach permits us to consider matrices of different size and scale, to find lineage-specific coevolution, and to predict multiple interaction partners. We used MatrixMatchMaker to predict protein-protein interactions in the human genome. We show that proteins that are known to interact physically are more strongly coevolving than proteins that simply belong to the same biochemical pathway. The human coevolution network is highly connected, suggesting many more protein-protein interactions than are currently known from high-throughput and other experimental evidence. These most strongly coevolving proteins suggest interactions that have been maintained over long periods of evolutionary time, and that are thus likely to be of fundamental importance to cellular function.


Subject(s)
Evolution, Molecular; Gene Regulatory Networks/genetics; Proteins/genetics; Calibration; Computational Biology/methods; Databases, Protein; Forecasting; Genetic Variation; Humans; Metabolic Networks and Pathways/genetics; Phylogeny; Protein Binding/genetics; Protein Interaction Domains and Motifs/genetics; Proteins/metabolism; Sensitivity and Specificity; Sequence Analysis, Protein/methods; Sequence Analysis, Protein/standards; Software/standards
18.
Biomol Eng ; 24(3): 321-6, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17502167

ABSTRACT

RNA sequences can form structures which are conserved throughout evolution and the question of aligning two RNA secondary structures has been extensively studied. Most of the previous alignment algorithms require the input of gap opening and gap extension penalty parameters. The choice of appropriate parameter values is controversial as there is little biological information to guide their assignment. In this paper, we present an algorithm which circumvents this problem. Instead of finding an optimal alignment with a predefined gap opening penalty, the algorithm finds the optimal alignment with an exact number of aligned blocks.


Subject(s)
Algorithms; RNA/chemistry; RNA/genetics; Sequence Alignment/methods; Sequence Analysis, RNA/methods; Base Sequence; Molecular Sequence Data; Nucleic Acid Conformation; Sequence Homology, Nucleic Acid
19.
Bioinformatics ; 23(10): 1195-202, 2007 May 15.
Article in English | MEDLINE | ID: mdl-17392329

ABSTRACT

MOTIVATION: With hundreds of completely sequenced microbial genomes available, and advancements in DNA microarray technology, the detection of genes in microbial communities consisting of hundreds of thousands of sequences may be possible. The existing strategies developed for DNA probe design, geared toward identifying specific sequences, are not suitable due to the lack of coverage, flexibility and efficiency necessary for applications in metagenomics. METHODS: ProDesign is a tool developed for the selection of oligonucleotide probes to detect members of gene families present in environmental samples. Gene family-specific probe sequences are generated based on specific and shared words, which are found with the spaced seed hashing algorithm. To detect more sequences, those sharing some common words are re-clustered into new families, then probes specific for the new families are generated. RESULTS: The program is very flexible in that it can be used for designing probes for detecting many gene families simultaneously and specifically in one or more genomes. Neither the length nor the melting temperature of the probes needs to be predefined. We have found that ProDesign provides more flexibility, coverage and speed than other software programs used in the selection of probes for genomic and gene family arrays. AVAILABILITY: ProDesign is licensed free of charge to academic users. ProDesign and Supplementary Material can be obtained by contacting the authors. A web server for ProDesign is available at http://www.uhnresearch.ca/labs/tillier/ProDesign/ProDesign.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms; Computational Biology/methods; Multigene Family; Oligonucleotide Probes/genetics; Bacteria/genetics; Genome, Bacterial; Microarray Analysis; Oligonucleotide Array Sequence Analysis; Software
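
The idea of finding shared and specific words with a spaced seed, mentioned in record 19, can be sketched generically as follows. This is not ProDesign's code: the seed pattern `1101`, the function names, and the toy sequences are assumptions chosen only to show how a spaced seed ignores selected positions when words are hashed.

```python
from collections import defaultdict


def spaced_words(sequence, seed):
    """Yield the word read through `seed` at every window of `sequence`.

    `seed` is a string of '1' (position is kept) and '0' (position is ignored).
    """
    kept = [offset for offset, flag in enumerate(seed) if flag == "1"]
    for start in range(len(sequence) - len(seed) + 1):
        yield "".join(sequence[start + offset] for offset in kept)


def index_spaced_words(sequences, seed):
    """Map each spaced word to the set of sequence names that contain it."""
    index = defaultdict(set)
    for name, sequence in sequences.items():
        for word in spaced_words(sequence, seed):
            index[word].add(name)
    return index


# Hypothetical toy sequences; geneA and geneB differ only at one position.
toy_sequences = {"geneA": "ACGTAC", "geneB": "ACCTAC", "geneC": "TTTTTT"}
word_index = index_spaced_words(toy_sequences, "1101")
shared_words = {w for w, names in word_index.items() if len(names) > 1}
specific_words = {w for w, names in word_index.items() if len(names) == 1}
```

Because the third position of the seed is ignored, the first windows of `geneA` and `geneB` hash to the same word even though the sequences differ there, which is the tolerance to mismatches that makes spaced seeds useful for grouping related sequences.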
20.
Evol Bioinform Online ; 2: 77-90, 2007 Jan 14.
Article in English | MEDLINE | ID: mdl-19455203

ABSTRACT

In comparative genomic studies, syntenic groups of homologous sequence in the same order have been used as supplementary information that can be used in helping to determine the orthology of the compared sequences. The assumption is that orthologous gene copies are more likely to share the same genome positions and share the same gene neighbors. In this study we have defined positional homologs as those that also have homologous neighboring genes and we investigated the usefulness of this distinction for bacterial comparative genomics. We considered the identification of positionally homologous gene pairs in bacterial genomes using protein and DNA sequence level alignments and found that the positional homologs had on average relatively lower rates of substitution at the DNA level (synonymous substitutions) than duplicate homologs in different genomic locations, regardless of the level of protein sequence divergence (measured with non-synonymous substitution rate). Since gene order conservation can indicate accuracy of orthology assignments, we also considered the effect of imposing certain alignment quality requirements on the sensitivity and specificity of identification of protein pairs by BLAST and FASTA when neighboring information is not available and in comparisons where gene order is not conserved. We found that the addition of a stringency filter based on the second best hits was an efficient way to remove dubious ortholog identifications in BLAST and FASTA analyses. Gene order conservation and DNA sequence homology are useful to consider in comparative genomic studies as they may indicate different orthology assignments than protein sequence homology alone.
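
A stringency filter based on second-best hits, as mentioned in record 20, might look like the sketch below. The abstract does not state the actual criterion, so the score-ratio rule, the `min_ratio` default, and the toy hit table are assumptions for illustration only; scores are assumed to be positive, as with BLAST bit scores.

```python
def filter_by_second_best(hits_by_query, min_ratio=1.2):
    """Keep a query's top hit only when it clearly outscores the runner-up.

    hits_by_query maps a query ID to a list of (subject_id, score) tuples.
    min_ratio is a hypothetical threshold, not a value taken from the paper.
    """
    accepted = {}
    for query, hits in hits_by_query.items():
        ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
        if not ranked:
            continue  # no hits at all for this query
        best_subject, best_score = ranked[0]
        # A lone hit is kept; otherwise require a margin over the second best.
        if len(ranked) == 1 or best_score >= min_ratio * ranked[1][1]:
            accepted[query] = best_subject
    return accepted


# Hypothetical hit table (subject names and scores are invented).
toy_hits = {
    "geneA": [("orthA", 250.0), ("paraA", 90.0)],     # clear winner
    "geneB": [("candB1", 120.0), ("candB2", 115.0)],  # ambiguous, dropped
    "geneC": [("orthC", 300.0)],                      # single hit, kept
}
putative_orthologs = filter_by_second_best(toy_hits)
```

Queries whose best and second-best hits score almost equally are dropped, since a near tie suggests that sequence similarity alone cannot separate the ortholog from a paralog.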
