Results 1 - 20 of 29
1.
Bioinformatics ; 35(9): 1594-1596, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30252043

ABSTRACT

SUMMARY: Mass spectrometry-based proteomics has had a formidable development in recent years, increasing the amount of data handled and the complexity of the statistical resources needed. Here we present SanXoT, an open-source, standalone software package for the statistical analysis of high-throughput, quantitative proteomics experiments. SanXoT is based on our previously developed weighted spectrum, peptide and protein statistical model and has been specifically designed to be modular, scalable and user-configurable. SanXoT allows limitless workflows that adapt to most experimental setups, including quantitative protein analysis in multiple experiments, systems biology, quantification of post-translational modifications and comparison and merging of experimental data from technical or biological replicates. AVAILABILITY AND IMPLEMENTATION: Download links for the SanXoT Software Package, source code and documentation are available at https://wikis.cnic.es/proteomica/index.php/SSP. CONTACT: jvazquez@cnic.es or ebonzon@cnic.es. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.


Subjects
Proteomics , Software , Mass Spectrometry , Peptides , Proteins
3.
Cell Rep ; 23(12): 3685-3697.e4, 2018 06 19.
Article in English | MEDLINE | ID: mdl-29925008

ABSTRACT

Post-translational modifications hugely increase the functional diversity of proteomes. Recent algorithms based on ultratolerant database searching are forging a path to unbiased analysis of peptide modifications by shotgun mass spectrometry. However, these approaches identify only one-half of the modified forms potentially detectable and do not map the modified residue. Moreover, tools for the quantitative analysis of peptide modifications are currently lacking. Here, we present a suite of algorithms that allows comprehensive identification of detectable modifications, pinpoints the modified residues, and enables their quantitative analysis through an integrated statistical model. These developments were used to characterize the impact of mitochondrial heteroplasmy on the proteome and on the modified peptidome in several tissues from 12-week-old mice. Our results reveal that heteroplasmy mainly affects cardiac tissue, inducing oxidative damage to proteins of the oxidative phosphorylation system, and provide a molecular mechanism explaining the structural and functional alterations produced in heart mitochondria.


Subjects
Heart Mitochondria/pathology , Myocardium/metabolism , Myocardium/pathology , Oxidative Stress , Proteome/metabolism , Proteomics/methods , Animals , HEK293 Cells , Humans , Male , Inbred C57BL Mice , Heart Mitochondria/metabolism , Oxidative Phosphorylation , Peptides/metabolism , Post-Translational Protein Processing
4.
Genome Res ; 2018 Feb 09.
Article in English | MEDLINE | ID: mdl-29440222

ABSTRACT

High-throughput sequencing of full-length transcripts using long reads has paved the way for the discovery of thousands of novel transcripts, even in well-annotated mammalian species. The advances in sequencing technology have created a need for studies and tools that can characterize these novel variants. Here, we present SQANTI, an automated pipeline for the classification of long-read transcripts that can assess the quality of data and the preprocessing pipeline using 47 unique descriptors. We apply SQANTI to a neuronal mouse transcriptome using Pacific Biosciences (PacBio) long reads and illustrate how the tool is effective in characterizing and describing the composition of the full-length transcriptome. We perform extensive evaluation of ToFU PacBio transcripts by PCR to reveal that an important number of the novel transcripts are technical artifacts of the sequencing approach and that SQANTI quality descriptors can be used to engineer a filtering strategy to remove them. Most novel transcripts in this curated transcriptome are novel combinations of existing splice sites, resulting more frequently in novel ORFs than novel UTRs, and are enriched in both general metabolic and neural-specific functions. We show that these new transcripts have a major impact in the correct quantification of transcript levels by state-of-the-art short-read-based quantification algorithms. By comparing our iso-transcriptome with public proteomics databases, we find that alternative isoforms are elusive to proteogenomics detection. SQANTI allows the user to maximize the analytical outcome of long-read technologies by providing the tools to deliver quality-evaluated and curated full-length transcriptomes.

5.
Sci Rep ; 6: 38477, 2016 12 09.
Article in English | MEDLINE | ID: mdl-27934969

ABSTRACT

High-density lipoproteins (HDLs) are complex protein and lipid assemblies whose composition is known to change in diverse pathological situations. Analysis of the HDL proteome can thus provide insight into the main mechanisms underlying abdominal aortic aneurysm (AAA) and potentially detect novel systemic biomarkers. We performed a multiplexed quantitative proteomics analysis of HDLs isolated from plasma of AAA patients (N = 14) and control study participants (N = 7). Validation was performed by western-blot (HDL), immunohistochemistry (tissue), and ELISA (plasma). HDL from AAA patients showed elevated expression of peroxiredoxin-6 (PRDX6), HLA class I histocompatibility antigen (HLA-I), retinol-binding protein 4, and paraoxonase/arylesterase 1 (PON1), whereas α-2 macroglobulin and C4b-binding protein were decreased. The main pathways associated with HDL alterations in AAA were oxidative stress and immune-inflammatory responses. In AAA tissue, PRDX6 colocalized with neutrophils, vascular smooth muscle cells, and lipid oxidation. Moreover, plasma PRDX6 was higher in AAA (N = 47) than in controls (N = 27), reflecting increased systemic oxidative stress. Finally, a positive correlation was recorded between PRDX6 and AAA diameter. The analysis of the HDL proteome demonstrates that redox imbalance is a major mechanism in AAA, identifying the antioxidant PRDX6 as a novel systemic biomarker of AAA.


Subjects
Abdominal Aortic Aneurysm/metabolism , HDL Lipoproteins/metabolism , Peroxiredoxin VI/metabolism , Proteome , Proteomics , Aged , Abdominal Aortic Aneurysm/blood , Abdominal Aortic Aneurysm/diagnosis , Biomarkers , Liquid Chromatography , Comorbidity , Computational Biology/methods , Female , Humans , Male , Middle Aged , Peroxiredoxin VI/blood , Proteomics/methods , Reproducibility of Results , Tandem Mass Spectrometry , Workflow
6.
Nature ; 539(7630): 579-582, 2016 11 24.
Article in English | MEDLINE | ID: mdl-27775717

ABSTRACT

Respiratory chain complexes can super-assemble into quaternary structures called supercomplexes that optimize cellular metabolism. The interaction between complexes III (CIII) and IV (CIV) is modulated by supercomplex assembly factor 1 (SCAF1, also known as COX7A2L). The discovery of SCAF1 represented strong genetic evidence that supercomplexes exist in vivo. SCAF1 is present as a long isoform (113 amino acids) or a short isoform (111 amino acids) in different mouse strains. Only the long isoform can induce the super-assembly of CIII and CIV, but it is not clear whether SCAF1 is required for the formation of the respirasome (a supercomplex of CI, CIII2 and CIV). Here we show, by combining deep proteomics and immunodetection analysis, that SCAF1 is always required for the interaction between CIII and CIV and that the respirasome is absent from most tissues of animals containing the short isoform of SCAF1, with the exception of heart and skeletal muscle. We used directed mutagenesis to characterize SCAF1 regions that interact with CIII and CIV and discovered that this interaction requires the correct orientation of a histidine residue at position 73 that is altered in the short isoform of SCAF1, explaining its inability to interact with CIV. Furthermore, we find that the CIV subunit COX7A2 is replaced by SCAF1 in supercomplexes containing CIII and CIV and by COX7A1 in CIV dimers, and that dimers seem to be more stable when they include COX6A2 rather than the COX6A1 isoform.


Subjects
Mitochondrial Membranes/metabolism , Protein Isoforms/metabolism , Animals , Electron Transport Complex IV/chemistry
7.
J Cell Sci ; 129(8): 1734-49, 2016 Apr 15.
Article in English | MEDLINE | ID: mdl-26940916

ABSTRACT

Rab8 is a small Ras-related GTPase that regulates polarized membrane transport to the plasma membrane. Here, we developed a high-content analysis (HCA) tool to dissect Rab8-mediated actin and focal adhesion reorganization that revealed that Rab8 activation significantly induced Rac1 and Tiam1 to mediate cortical actin polymerization and RhoA-dependent stress fibre disassembly. Rab8 activation increased Rac1 activity, whereas its depletion activated RhoA, which led to reorganization of the actin cytoskeleton. Rab8 was also associated with focal adhesions, promoting their disassembly in a microtubule-dependent manner. This Rab8 effect involved calpain, MT1-MMP (also known as MMP14) and Rho GTPases. Moreover, we demonstrate the role of Rab8 in the cell migration process. Indeed, Rab8 is required for EGF-induced cell polarization and chemotaxis, as well as for the directional persistency of intrinsic cell motility. These data reveal that Rab8 drives cell motility by mechanisms both dependent and independent of Rho GTPases, thereby regulating the establishment of cell polarity, turnover of focal adhesions and actin cytoskeleton rearrangements, thus determining the directionality of cell migration.


Subjects
Calpain/metabolism , Focal Adhesions/metabolism , Guanine Nucleotide Exchange Factors/metabolism , Matrix Metalloproteinase 14/metabolism , rab GTP-Binding Proteins/metabolism , rac1 GTP-Binding Protein/metabolism , rho GTP-Binding Proteins/metabolism , Actin Cytoskeleton/metabolism , Cell Movement , Cell Polarity , HeLa Cells , Humans , Small Interfering RNA/genetics , Stress Fibers/metabolism , T-Lymphoma Invasion and Metastasis-inducing Protein 1 , rab GTP-Binding Proteins/genetics , rhoA GTP-Binding Protein/metabolism
8.
Expert Rev Proteomics ; 12(6): 579-93, 2015.
Article in English | MEDLINE | ID: mdl-26496066

ABSTRACT

The authors have carried out an investigation of the two "draft maps of the human proteome" published in 2014 in Nature. The findings include an abundance of poor spectra, low-scoring peptide-spectrum matches and incorrectly identified proteins in both these studies, highlighting clear issues with the application of false discovery rates. This noise means that the claims made by the two papers - the identification of high numbers of protein coding genes, the detection of novel coding regions and the draft tissue maps themselves - should be treated with considerable caution. The authors recommend that clinicians and researchers do not use the unfiltered data from these studies. Despite this, these studies will inspire further investigation into tissue-based proteomics. As long as this future work has proper quality controls, it could help produce a consensus map of the human proteome and improve our understanding of the processes that underlie health and disease.


Subjects
Protein Databases , Proteome/genetics , Humans , Peptides , Proteomics
9.
PLoS Comput Biol ; 11(6): e1004325, 2015 Jun.
Article in English | MEDLINE | ID: mdl-26061177

ABSTRACT

Alternative splicing of messenger RNA can generate a wide variety of mature RNA transcripts, and these transcripts may produce protein isoforms with diverse cellular functions. While there is much supporting evidence for the expression of alternative transcripts, the same is not true for the alternatively spliced protein products. Large-scale mass spectroscopy experiments have identified evidence of alternative splicing at the protein level, but with conflicting results. Here we carried out a rigorous analysis of the peptide evidence from eight large-scale proteomics experiments to assess the scale of alternative splicing that is detectable by high-resolution mass spectroscopy. We find fewer splice events than would be expected: we identified peptides for almost 64% of human protein coding genes, but detected just 282 splice events. This data suggests that most genes have a single dominant isoform at the protein level. Many of the alternative isoforms that we could identify were only subtly different from the main splice isoform. Very few of the splice events identified at the protein level disrupted functional domains, in stark contrast to the two thirds of splice events annotated in the human genome that would lead to the loss or damage of functional domains. The most striking result was that more than 20% of the splice isoforms we identified were generated by substituting one homologous exon for another. This is significantly more than would be expected from the frequency of these events in the genome. These homologous exon substitution events were remarkably conserved--all the homologous exons we identified evolved over 460 million years ago--and eight of the fourteen tissue-specific splice isoforms we identified were generated from homologous exons. The combination of proteomics evidence, ancient origin and tissue-specific splicing indicates that isoforms generated from homologous exons may have important cellular roles.


Subjects
Alternative Splicing/genetics , Exons/genetics , Protein Isoforms/genetics , Amino Acid Sequence , Animals , Computational Biology , Genetic Databases , Humans , Mice , Molecular Models , Molecular Sequence Data , Organ Specificity/genetics , Peptides/chemistry , Peptides/genetics , Peptides/metabolism , Protein Conformation , Protein Isoforms/chemistry , Protein Isoforms/metabolism , Sequence Alignment , DNA Sequence Analysis
11.
J Proteome Res ; 14(4): 1880-7, 2015 Apr 03.
Article in English | MEDLINE | ID: mdl-25732134

ABSTRACT

Although eukaryotic cells express a wide range of alternatively spliced transcripts, it is not clear whether genes tend to express a range of transcripts simultaneously across cells, or produce dominant isoforms in a manner that is either tissue-specific or regardless of tissue. To date, large-scale investigations into the pattern of transcript expression across distinct tissues have produced contradictory results. Here, we attempt to determine whether genes express a dominant splice variant at the protein level. We interrogate peptides from eight large-scale human proteomics experiments and databases and find that there is a single dominant protein isoform, irrespective of tissue or cell type, for the vast majority of the protein-coding genes in these experiments, in partial agreement with the conclusions from the most recent large-scale RNAseq study. Remarkably, the dominant isoforms from the experimental proteomics analyses coincided overwhelmingly with the reference isoforms selected by two completely orthogonal sources, the consensus coding sequence variants, which are agreed upon by separate manual genome curation teams, and the principal isoforms from the APPRIS database, predicted automatically from the conservation of protein sequence, structure, and function.


Subjects
Open Reading Frames/genetics , Peptides/genetics , Protein Isoforms/genetics , Proteomics/methods , Computational Biology , Protein Databases , Humans
12.
J Proteome Res ; 13(8): 3854-5, 2014 Aug 01.
Article in English | MEDLINE | ID: mdl-25014353

ABSTRACT

This letter analyzes two large-scale proteomics studies published in the same issue of Nature. At the time of the release, both studies were portrayed as draft maps of the human proteome and great advances in the field. As with the initial publication of the human genome, these papers have broad appeal and will no doubt lead to a great deal of further analysis by the scientific community. However, we were intrigued by the number of protein-coding genes detected by the two studies, numbers that far exceeded what has been reported for the multinational Human Proteome Project effort. We carried out a simple quality test on the data using the olfactory receptor family. A high-quality proteomics experiment that does not specifically analyze nasal tissues should not expect to detect many peptides for olfactory receptors. Neither of the studies carried out experiments on nasal tissues, yet we found peptide evidence for more than 100 olfactory receptors in the two studies. These results suggest that the two studies are substantially overestimating the number of protein coding genes they identify. We conclude that the experimental data from these two studies should be used with caution.


Subjects
Protein Databases , Mass Spectrometry , Proteome/analysis , Proteome/chemistry , Proteome/metabolism , Proteomics , Humans
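
The quality test described in the abstract above (counting detections from a gene family that the sampled tissues should not express) can be illustrated with a minimal sketch. This is not the authors' code: the function name, the threshold parameter and the toy gene lists are assumptions made purely for illustration.

```python
def flag_unexpected_family_hits(detected_genes, family_genes, max_expected):
    """Return (hits, suspicious) for a gene family not expected in the samples.

    detected_genes: genes with peptide evidence in an experiment
    family_genes:   members of the family used as a negative control
    max_expected:   hypothetical tolerance for chance detections
    """
    # Genes that are both detected and members of the control family.
    hits = sorted(set(detected_genes) & set(family_genes))
    # Flag the experiment when the family is detected more often than tolerated.
    return hits, len(hits) > max_expected


# Toy input, invented for this example only.
detected = ["ALB", "OR1A1", "TP53", "OR2T4", "ACTB"]
olfactory_family = {"OR1A1", "OR2T4", "OR51E2"}

hits, suspicious = flag_unexpected_family_hits(detected, olfactory_family, max_expected=1)
print(hits, suspicious)
```

A real check would use the full annotated olfactory receptor family and the peptide identifications reported by each study; the tolerance is a judgment call rather than a value taken from the letter.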
13.
Hum Mol Genet ; 23(22): 5866-78, 2014 Nov 15.
Article in English | MEDLINE | ID: mdl-24939910

ABSTRACT

Determining the full complement of protein-coding genes is a key goal of genome annotation. The most powerful approach for confirming protein-coding potential is the detection of cellular protein expression through peptide mass spectrometry (MS) experiments. Here, we mapped peptides detected in seven large-scale proteomics studies to almost 60% of the protein-coding genes in the GENCODE annotation of the human genome. We found a strong relationship between detection in proteomics experiments and both gene family age and cross-species conservation. Most of the genes for which we detected peptides were highly conserved. We found peptides for >96% of genes that evolved before bilateria. At the opposite end of the scale, we identified almost no peptides for genes that have appeared since primates, for genes that did not have any protein-like features or for genes with poor cross-species conservation. These results motivated us to describe a set of 2001 potential non-coding genes based on features such as weak conservation, a lack of protein features, or ambiguous annotations from major databases, all of which correlated with low peptide detection across the seven experiments. We identified peptides for just 3% of these genes. We show that many of these genes behave more like non-coding genes than protein-coding genes and suggest that most are unlikely to code for proteins under normal circumstances. We believe that their inclusion in the human protein-coding gene catalogue should be revised as part of the ongoing human genome annotation effort.


Subjects
Proteins/genetics , Computational Biology , Human Genome , Humans , Open Reading Frames , Peptides/genetics , Proteins/metabolism , Proteomics
14.
Nucleic Acids Res ; 41(Database issue): D110-7, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23161672

ABSTRACT

Here, we present APPRIS (http://appris.bioinfo.cnio.es), a database that houses annotations of human splice isoforms. APPRIS has been designed to provide value to manual annotations of the human genome by adding reliable protein structural and functional data and information from cross-species conservation. The visual representation of the annotations provided by APPRIS for each gene allows annotators and researchers alike to easily identify functional changes brought about by splicing events. In addition to collecting, integrating and analyzing reliable predictions of the effect of splicing events, APPRIS also selects a single reference sequence for each gene, here termed the principal isoform, based on the annotations of structure, function and conservation for each transcript. APPRIS identifies a principal isoform for 85% of the protein-coding genes in the GENCODE 7 release for ENSEMBL. Analysis of the APPRIS data shows that at least 70% of the alternative (non-principal) variants would lose important functional or structural information relative to the principal isoform.


Subjects
Alternative Splicing , Protein Databases , Molecular Sequence Annotation , Protein Isoforms/chemistry , Protein Isoforms/genetics , Humans , Internet , Protein Isoforms/metabolism
15.
Genome Res ; 22(9): 1760-74, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955987

ABSTRACT

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.


Subjects
Genetic Databases , Human Genome , Genomics/methods , Molecular Sequence Annotation , Animals , Computational Biology/methods , Complementary DNA/chemistry , Complementary DNA/genetics , Molecular Evolution , Exons , Genetic Loci , Humans , Internet , Molecular Models , Open Reading Frames , Pseudogenes , Quality Control , RNA Splice Sites , Long Noncoding RNA , Reproducibility of Results , Untranslated Regions
16.
Genome Res ; 22(7): 1231-42, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22588898

ABSTRACT

Chimeric RNAs comprise exons from two or more different genes and have the potential to encode novel proteins that alter cellular phenotypes. To date, numerous putative chimeric transcripts have been identified among the ESTs isolated from several organisms and using high throughput RNA sequencing. The few corresponding protein products that have been characterized mostly result from chromosomal translocations and are associated with cancer. Here, we systematically establish that some of the putative chimeric transcripts are genuinely expressed in human cells. Using high throughput RNA sequencing, mass spectrometry experimental data, and functional annotation, we studied 7424 putative human chimeric RNAs. We confirmed the expression of 175 chimeric RNAs in 16 human tissues, with an abundance varying from 0.06 to 17 RPKM (Reads Per Kilobase per Million mapped reads). We show that these chimeric RNAs are significantly more tissue-specific than non-chimeric transcripts. Moreover, we present evidence that chimeras tend to incorporate highly expressed genes. Despite the low expression level of most chimeric RNAs, we show that 12 novel chimeras are translated into proteins detectable in multiple shotgun mass spectrometry experiments. Furthermore, we confirm the expression of three novel chimeric proteins using targeted mass spectrometry. Finally, based on our functional annotation of exon organization and preserved domains, we discuss the potential features of chimeric proteins with illustrative examples and suggest that chimeras significantly exploit signal peptides and transmembrane domains, which can alter the cellular localization of cognate proteins. Taken together, these findings establish that some chimeric RNAs are translated into potentially functional proteins in humans.


Subjects
Human Genome , Mutant Chimeric Proteins/genetics , Protein Biosynthesis , Amino Acid Sequence , Cell Membrane/genetics , Cell Membrane/metabolism , Nucleic Acid Databases , Exons , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Humans , Mass Spectrometry/methods , Molecular Sequence Annotation , Molecular Sequence Data , Mutant Chimeric Proteins/metabolism , Organ Specificity , Protein Sorting Signals , Secondary Protein Structure , Tertiary Protein Structure , Proteomics/methods , Messenger RNA/genetics , Messenger RNA/metabolism , RNA Sequence Analysis/methods , Structure-Activity Relationship
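
The abstract above reports expression in RPKM (Reads Per Kilobase per Million mapped reads). As a reminder of what that unit means, here is a minimal sketch of the standard definition; the function name and the example numbers are illustrative assumptions, not values from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads * 1e9 / (length in bp * total mapped reads),
    i.e. reads / (length in kb) / (mapped reads in millions).
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 100 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
example_value = rpkm(100, 2000, 10_000_000)
print(example_value)  # 5.0
```

Normalizing by both transcript length and library size is what makes values such as the 0.06 to 17 RPKM range quoted in the abstract comparable across transcripts and samples.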
17.
J Proteomics ; 75(12): 3495-513, 2012 Jun 27.
Article in English | MEDLINE | ID: mdl-22579752

ABSTRACT

Due to the enormous complexity of proteomes which constitute the entirety of protein species expressed by a certain cell or tissue, proteome-wide studies performed in discovery mode are still limited in their ability to reproducibly identify and quantify all proteins present in complex biological samples. Therefore, the targeted analysis of informative subsets of the proteome has been beneficial to generate reproducible data sets across multiple samples. Here we review the repertoire of antibody- and mass spectrometry (MS) -based analytical tools which is currently available for the directed analysis of predefined sets of proteins. The topics of emphasis for this review are Selected Reaction Monitoring (SRM) mass spectrometry, emerging tools to control error rates in targeted proteomic experiments, and some representative examples of applications. The ability to cost- and time-efficiently generate specific and quantitative assays for large numbers of proteins and posttranslational modifications has the potential to greatly expand the range of targeted proteomic coverage in biological studies. This article is part of a Special Section entitled: Understanding genome regulation and genetic diversity by mass spectrometry.


Subjects
Immunoassay/methods , Mass Spectrometry/methods , Peptide Mapping/methods , Protein Interaction Mapping/methods , Proteome/chemistry , Proteome/immunology , Binding Sites , Protein Binding
18.
Mol Biol Evol ; 29(9): 2265-83, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22446687

ABSTRACT

Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have identified peptides that cover 35% of the genes annotated by the GENCODE consortium for the human genome as part of a comprehensive analysis of experimental spectra from two large publicly available mass spectrometry databases. We detected the translation to protein of "novel" and "putative" protein-coding transcripts as well as transcripts annotated as pseudogenes and nonsense-mediated decay targets. We provide a detailed overview of the population of alternatively spliced protein isoforms that are detectable by peptide identification methods. We found that 150 genes expressed multiple alternative protein isoforms. This constitutes the largest set of reliably confirmed alternatively spliced proteins yet discovered. Three groups of genes were highly overrepresented. We detected alternative isoforms for 10 of the 25 possible heterogeneous nuclear ribonucleoproteins, proteins with a key role in the splicing process. Alternative isoforms generated from interchangeable homologous exons and from short indels were also significantly enriched, both in human experiments and in parallel analyses of mouse and Drosophila proteomics experiments. Our results show that a surprisingly high proportion (almost 25%) of the detected alternative isoforms are only subtly different from their constitutive counterparts. Many of the alternative splicing events that give rise to these alternative isoforms are conserved in mouse. It was striking that very few of these conserved splicing events broke Pfam functional domains or would damage globular protein structures. 
This evidence of a strong bias toward subtle differences in CDS and likely conserved cellular function and structure is remarkable and strongly suggests that the translation of alternative transcripts may be subject to selective constraints.


Subjects
Alternative Splicing , Proteins/chemistry , Proteins/genetics , Proteomics , Amino Acid Sequence , Animals , Catalytic Domain , Drosophila , Genome , Humans , Mice , Molecular Models , Molecular Sequence Annotation , Molecular Sequence Data , Nonsense Mediated mRNA Decay , Peptides/chemistry , Peptides/genetics , Proteasome Endopeptidase Complex/chemistry , Protein Biosynthesis , Protein Conformation , Protein Interaction Domains and Motifs , Protein Isoforms , Proteins/metabolism , Sequence Alignment
19.
Curr Protoc Protein Sci ; Chapter 2: 2.14.1-2.14.16, 2011 Nov.
Article in English | MEDLINE | ID: mdl-22045561

ABSTRACT

Recognition and prediction of structural domains in proteins is an important part of structure and function prediction. This unit lists the range of tools available for domain prediction, and describes sequence and structural analysis tools that complement domain prediction methods. Also detailed are the basic domain prediction steps, along with suggested strategies for different protein sequences and potential pitfalls in domain boundary prediction. The difficult problem of domain orientation prediction is also discussed. All the resources necessary for domain boundary prediction are accessible via publicly available Web servers and databases and do not require computational expertise.


Subjects
Tertiary Protein Structure , Proteins/chemistry , Computational Biology , Protein Databases , Biological Models , Molecular Models , Software , Protein Structural Homology
20.
PLoS One ; 5(4): e9969, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-20376314

ABSTRACT

BACKGROUND: Molecular biology is currently facing the challenging task of functionally characterizing the proteome. The large number of possible protein-protein interactions and complexes, the variety of environmental conditions and cellular states in which these interactions can be reorganized, and the multiple ways in which a protein can influence the function of others, requires the development of experimental and computational approaches to analyze and predict functional associations between proteins as part of their activity in the interactome. METHODOLOGY/PRINCIPAL FINDINGS: We have studied the possibility of constructing a classifier in order to combine the output of the several protein interaction prediction methods. The AODE (Averaged One-Dependence Estimators) machine learning algorithm is a suitable choice in this case and it provides better results than the individual prediction methods, and it has better performances than other tested alternative methods in this experimental set up. To illustrate the potential use of this new AODE-based Predictor of Protein InterActions (APPIA), when analyzing high-throughput experimental data, we show how it helps to filter the results of published High-Throughput proteomic studies, ranking in a significant way functionally related pairs. AVAILABILITY: All the predictions of the individual methods and of the combined APPIA predictor, together with the used datasets of functional associations are available at http://ecid.bioinfo.cnio.es/. CONCLUSIONS: We propose a strategy that integrates the main current computational techniques used to predict functional associations into a unified classifier system, specifically focusing on the evaluation of poorly characterized protein pairs. We selected the AODE classifier as the appropriate tool to perform this task. AODE is particularly useful to extract valuable information from large unbalanced and heterogeneous data sets. 
Combining the information provided by five interaction prediction methods with some simple sequence features in APPIA is useful for establishing reliability values and helps to prioritize functional interactions that can be further characterized experimentally.


Subjects
Algorithms , Artificial Intelligence , Computational Biology/methods , Protein Binding , Data Collection , Internet , Molecular Models , Proteomics