Results 1 - 7 of 7
1.
Mol Biol Evol ; 33(10): 2555-64, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27436009

ABSTRACT

Deleterious mutations are expected to evolve under negative selection and are usually purged from the population. However, deleterious alleles segregate in the human population and some disease-associated variants are maintained at considerable frequencies. Here, we test the hypothesis that balancing selection may counteract purifying selection in neighboring regions and thus maintain deleterious variants at higher frequency than expected from their detrimental fitness effect. We first show in realistic simulations that balancing selection reduces the density of polymorphic sites surrounding a locus under balancing selection, but at the same time markedly increases the population frequency of the remaining variants, including even substantially deleterious alleles. To test the predictions of our simulations empirically, we then use whole-exome sequencing data from 6,500 human individuals and focus on the most established example of balancing selection in the human genome, the major histocompatibility complex (MHC). Our analysis shows an elevated frequency of putatively deleterious coding variants in non-human leukocyte antigen (non-HLA) genes localized in the MHC region. The mean frequency of these variants declined with physical distance from the classical HLA genes, indicating dependency on genetic linkage. These results reveal an indirect cost of the genetic diversity maintained by balancing selection, which has hitherto been perceived as mostly advantageous, and have implications both for the evolution of recombination and also for the epidemiology of various MHC-associated diseases.


Subjects
HLA Antigens/genetics, Major Histocompatibility Complex/genetics, Genetic Selection, Sequence Deletion, Alleles, Biological Evolution, Computer Simulation, Genetic Databases, Molecular Evolution, Gene Frequency/genetics, Genetic Variation, Human Genome, Haplotypes/genetics, Humans, Genetic Polymorphism/genetics
2.
Bioinformatics ; 27(8): 1128-34, 2011 Apr 15.
Article in English | MEDLINE | ID: mdl-21349864

ABSTRACT

MOTIVATION: Although many methods and statistical approaches have been developed for protein identification by mass spectrometry, the problem of accurate assessment of statistical significance of protein identifications remains an open question. The main issues are as follows: (i) statistical significance of inferring peptide from experimental mass spectra must be platform independent and spectrum specific and (ii) individual spectrum matches at the peptide level must be combined into a single statistical measure at the protein level. RESULTS: We present a method and software to assign statistical significance to protein identifications from search engines for mass spectrometric data. The approach is based on asymptotic theory of order statistics. The parameters of the asymptotic distributions of identification scores are estimated for each spectrum individually. The method relies on new unbiased estimators for parameters of extreme value distribution. The estimated parameters are used to assign a spectrum-specific P-value to each peptide-spectrum match. The protein-level confidence measure combines P-values of peptide-to-spectrum matches. CONCLUSION: We extensively tested the method using triplicate mouse and yeast high-throughput proteomic experiments. The proposed statistical approach improves the sensitivity of protein identifications without compromising specificity. While the method was primarily designed to work with Mascot, it is platform-independent and is applicable to any search engine which outputs a single score for a peptide-spectrum match. We demonstrate this by testing the method in conjunction with X!Tandem. AVAILABILITY: The software is available for download at ftp://genetics.bwh.harvard.edu/SSPV/. CONTACT: ssunyaev@rics.bwh.harvard.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
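The two steps named in the abstract (a spectrum-specific P-value from an extreme value distribution, then a protein-level combination of peptide P-values) can be illustrated with a minimal sketch. The abstract does not give the authors' unbiased estimators or their combination rule, so this sketch substitutes two standard techniques as assumptions: method-of-moments fitting of a Gumbel distribution and Fisher's method for combining independent P-values. All function names are hypothetical and are not taken from the SSPV software.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Method-of-moments estimates (mu, beta) of a Gumbel distribution.

    Assumption: this stands in for the paper's own estimators,
    which the abstract does not describe.
    """
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    beta = math.sqrt(6.0 * var) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Upper-tail probability P(X >= score) under Gumbel(mu, beta)."""
    z = (score - mu) / beta
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small P-values
    return -math.expm1(-math.exp(-z))


def fisher_combine(p_values):
    """Combine independent P-values (each in (0, 1]) with Fisher's method.

    -2 * sum(ln p) follows a chi-square distribution with 2k degrees of
    freedom under the null; for even degrees of freedom the survival
    function has the closed form used below.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical usage: scores of all candidate peptides for one spectrum
# give the null fit; the top hit is then scored against that fit.
candidate_scores = [12.0, 15.5, 9.8, 14.1, 11.3, 13.7, 10.9, 16.2]
mu, beta = fit_gumbel_moments(candidate_scores)
p_top_hit = gumbel_p_value(35.0, mu, beta)
protein_p = fisher_combine([p_top_hit, 0.02, 0.15])
```

Fisher's method assumes the peptide-level P-values are independent, which may not hold for overlapping peptides from the same protein.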


Subjects
Mass Spectrometry/methods, Proteins/chemistry, Algorithms, Animals, Statistical Data Interpretation, Protein Databases, Mice, Peptides/chemistry, Proteins/analysis, Proteomics, Saccharomyces cerevisiae Proteins/analysis, Saccharomyces cerevisiae Proteins/chemistry, Software
3.
Am J Hum Genet ; 81(6): 1298-303, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17952847

ABSTRACT

The identification of DNA sequence variants underlying human complex phenotypes remains a significant challenge for several reasons: individual variants can have small phenotypic effects or low population frequencies, and multiple allelic variants may act in concert to affect a trait. We evaluated the combined effect of allelic variants in seven genes involved in high-density lipoprotein (HDL) metabolism, using forward stepwise regression. Analysis of all known common single-nucleotide polymorphisms (SNPs) in the seven candidate genes revealed four variants that were associated with incremental changes in HDL cholesterol levels in three independent samples. Conversely, analysis of 660 polymorphisms in eight genes that do not appear to be involved in HDL metabolism did not identify any associations with plasma HDL-cholesterol levels. These data indicate that several common SNPs act in concert to influence plasma levels of HDL cholesterol.
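Forward stepwise regression, the technique named in the abstract, can be sketched as a greedy loop that adds one predictor at a time. The abstract does not state the entry criterion the authors used, so the relative reduction in residual sum of squares (`min_rel_gain`) below is an assumption, as are the function names and the toy genotype vectors. The sketch also assumes the predictors are not collinear.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual_ss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve_linear(xtx, xty)
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_stepwise(predictors, y, min_rel_gain=0.05):
    """Greedily add the predictor that most reduces the residual sum of squares.

    Stops when the best candidate improves the fit by less than
    min_rel_gain (a hypothetical entry criterion).
    """
    selected = []
    current = residual_ss([], y)
    while len(selected) < len(predictors):
        candidates = {
            name: residual_ss([predictors[s] for s in selected] + [col], y)
            for name, col in predictors.items() if name not in selected
        }
        best = min(candidates, key=candidates.get)
        if current <= 0 or (current - candidates[best]) / current < min_rel_gain:
            break
        selected.append(best)
        current = candidates[best]
    return selected


# Toy data: carrier status (0/1) at three hypothetical SNPs; the trait
# depends on snp1 and snp2 only, plus a small perturbation.
snps = {
    "snp1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp2": [0, 0, 1, 1, 0, 0, 1, 1],
    "snp3": [0, 1, 0, 1, 0, 1, 0, 1],
}
noise = [-0.1, 0.1, 0.1, -0.1, 0.1, -0.1, -0.1, 0.1]
trait = [1.0 + 2.0 * a + 1.0 * b + e
         for a, b, e in zip(snps["snp1"], snps["snp2"], noise)]
chosen = forward_stepwise(snps, trait)
```

In practice the entry criterion is usually a significance test (for example an F-test P-value threshold) rather than a fixed relative gain.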


Subjects
HDL Cholesterol/blood, HDL Cholesterol/genetics, Coronary Artery Disease/genetics, Single Nucleotide Polymorphism/genetics, Female, Genetic Association Studies, Genotype, Humans, Linkage Disequilibrium, Male, Genetic Models, Phenotype
4.
J Proteomics ; 71(3): 346-56, 2008 Aug 21.
Article in English | MEDLINE | ID: mdl-18639657

ABSTRACT

Homology-driven proteomics is a major tool to characterize proteomes of organisms with unsequenced genomes. This paper addresses practical aspects of automated homology-driven protein identifications by LC-MS/MS on a hybrid LTQ Orbitrap mass spectrometer. All essential software elements supporting the presented pipeline are either hosted at the publicly accessible web server, or are available for free download.


Subjects
Mass Spectrometry/methods, Proteins/analysis, Proteomics/methods, Amino Acid Sequence, Animals, Liquid Chromatography/methods, Computational Biology/methods, Protein Databases, Genome, Insect Proteins/chemistry, Internet, Molecular Sequence Data, Peptide Mapping, Plant Proteins/chemistry, Proteins/chemistry, Software
5.
J Proteome Res ; 7(8): 3382-95, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18558732

ABSTRACT

Only a small fraction of spectra acquired in LC-MS/MS runs matches peptides from target proteins upon database searches. The remaining, operationally termed background, spectra originate from a variety of poorly controlled sources and affect the throughput and confidence of database searches. Here, we report an algorithm and its software implementation that rapidly removes background spectra, regardless of their precise origin. The method estimates the dissimilarity distance between screened MS/MS spectra and unannotated spectra from a partially redundant background library compiled from several control and blank runs. Filtering MS/MS queries enhanced the protein identification capacity when searches lacked spectrum to sequence matching specificity. In sequence-similarity searches it reduced by, on average, 30-fold the number of orphan hits, which were not explicitly related to background protein contaminants and required manual validation. Removing high quality background MS/MS spectra, while preserving in the data set the genuine spectra from target proteins, decreased the false positive rate of stringent database searches and improved the identification of low-abundance proteins.
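The filtering idea described above (discard a query spectrum when it is close enough to some spectrum in a background library) can be sketched as follows. The abstract does not specify the dissimilarity measure, binning, or cutoff, so the cosine distance, the 1.0 m/z bin width, and the `max_distance` threshold here are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Turn (m/z, intensity) pairs into a unit-length sparse vector keyed by bin."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    norm = math.sqrt(sum(v * v for v in binned.values()))
    if norm == 0.0:
        return {}
    return {b: v / norm for b, v in binned.items()}


def cosine_distance(a, b):
    """1 - cosine similarity of two unit-length sparse vectors."""
    return 1.0 - sum(v * b.get(k, 0.0) for k, v in a.items())


def filter_background(queries, library, max_distance=0.1):
    """Keep only query spectra whose nearest library spectrum is farther than max_distance."""
    lib_vectors = [bin_spectrum(s) for s in library]
    kept = []
    for spectrum in queries:
        q = bin_spectrum(spectrum)
        nearest = min((cosine_distance(q, lv) for lv in lib_vectors), default=1.0)
        if nearest > max_distance:
            kept.append(spectrum)
    return kept


# Hypothetical spectra as lists of (m/z, intensity) peaks.
background_library = [[(100.2, 10.0), (200.5, 5.0)]]
query_spectra = [
    [(100.3, 20.0), (200.4, 10.0)],  # same peak pattern as the background entry
    [(150.1, 7.0), (300.9, 3.0)],    # shares no bins with the library
]
remaining = filter_background(query_spectra, background_library)
```

Comparing every query against every library spectrum is quadratic in the worst case; a production tool would likely pre-filter by precursor mass, which this sketch omits.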


Subjects
Proteins/analysis, Animals, Liquid Chromatography, Factual Databases, HeLa Cells, Humans, Insect Proteins/analysis, Plant Proteins/analysis, Proteomics, Software, Tandem Mass Spectrometry, Tracheophyta, Triatoma
6.
Proc Natl Acad Sci U S A ; 103(23): 8774-9, 2006 Jun 06.
Article in English | MEDLINE | ID: mdl-16731630

ABSTRACT

The enormous complexity of biological networks has led to the suggestion that networks are built of modules that perform particular functions and are "reused" in evolution in a manner similar to reusable domains in protein structures or modules of electronic circuits. Analysis of known biological networks has revealed several modules, many of which have transparent biological functions. However, it remains to be shown that identified structural modules constitute evolutionary building blocks, independent and easily interchangeable units. An alternative possibility is that evolutionary modules do not match structural modules. To investigate the structure of evolutionary modules and their relationship to functional ones, we integrated a metabolic network with evolutionary associations between genes inferred from comparative genomics. The resulting metabolic-genomic network places metabolic pathways into evolutionary and genomic context, thereby revealing previously unknown components and modules. We analyzed the integrated metabolic-genomic network on three levels: macro-, meso-, and microscale. The macroscale level demonstrates strong associations between neighboring enzymes and between enzymes that are distant on the network but belong to the same linear pathway. At the mesoscale level, we identified evolutionary metabolic modules and compared them with traditional metabolic pathways. Although, in some cases, there is almost exact correspondence, some pathways are split into independent modules. On the microscale level, we observed high association of enzyme subunits and weak association of isoenzymes independently catalyzing the same reaction. This study shows that evolutionary modules, rather than pathways, may be thought of as regulatory and functional units in bacterial genomes.


Subjects
Biological Evolution, Metabolism/genetics, Chromosomes/genetics, Bacterial Genes, Bacterial Genome/genetics, Isoenzymes/genetics, Genetic Models, Operon/genetics, Protein Subunits/genetics
7.
Proc Natl Acad Sci U S A ; 100(21): 12123-8, 2003 Oct 14.
Article in English | MEDLINE | ID: mdl-14517352

ABSTRACT

Proteins, nucleic acids, and small molecules form a dense network of molecular interactions in a cell. Molecules are nodes of this network, and the interactions between them are edges. The architecture of molecular networks can reveal important principles of cellular organization and function, similarly to the way that protein structure tells us about the function and organization of a protein. Computational analysis of molecular networks has been primarily concerned with node degree [Wagner, A. & Fell, D. A. (2001) Proc. R. Soc. London Ser. B 268, 1803-1810; Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabasi, A. L. (2000) Nature 407, 651-654] or degree correlation [Maslov, S. & Sneppen, K. (2002) Science 296, 910-913], and hence focused on single/two-body properties of these networks. Here, by analyzing the multibody structure of the network of protein-protein interactions, we discovered molecular modules that are densely connected within themselves but sparsely connected with the rest of the network. Comparison with experimental data and functional annotation of genes showed two types of modules: (i) protein complexes (splicing machinery, transcription factors, etc.) and (ii) dynamic functional units (signaling cascades, cell-cycle regulation, etc.). Discovered modules are highly statistically significant, as is evident from comparison with random graphs, and are robust to noise in the data. Our results provide strong support for the network modularity principle introduced by Hartwell et al. [Hartwell, L. H., Hopfield, J. J., Leibler, S. & Murray, A. W. (1999) Nature 402, C47-C52], suggesting that found modules constitute the "building blocks" of molecular networks.
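The defining property of a module in this abstract, "densely connected within themselves but sparsely connected with the rest of the network", can be made concrete with a small check on a candidate node set. This is only an illustrative measure, not the multibody detection method of the paper, which the abstract does not describe. The function name and the toy protein labels are hypothetical, and the edge list is assumed to contain each undirected interaction once, with no self-loops.

```python
def module_connectivity(edges, module):
    """Return (internal_density, external_edges) for a candidate module.

    internal_density: fraction of possible within-module pairs that are linked.
    external_edges: number of edges with exactly one endpoint in the module.
    """
    module = set(module)
    internal = external = 0
    for u, v in edges:
        inside = (u in module) + (v in module)
        if inside == 2:
            internal += 1
        elif inside == 1:
            external += 1
    n = len(module)
    possible = n * (n - 1) // 2
    density = internal / possible if possible else 0.0
    return density, external


# Toy interaction network: A, B, C form a fully connected triangle
# that touches the rest of the network through a single edge.
interactions = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
density, outgoing = module_connectivity(interactions, {"A", "B", "C"})
```

To judge significance in the spirit of the abstract, the same statistic could be recomputed on randomized graphs with matched node degrees and compared with the observed value.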


Subjects
Proteins/chemistry, Proteins/physiology, Biophysical Phenomena, Biophysics, Macromolecular Substances, Biological Models, Proteomics