Results 1 - 9 of 9
1.
Nature ; 489(7414): 101-8, 2012 Sep 06.
Article in English | MEDLINE | ID: mdl-22955620

ABSTRACT

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.


Subject(s)
DNA/genetics , Encyclopedias as Topic , Genome, Human/genetics , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid/genetics , Transcription, Genetic/genetics , Transcriptome/genetics , Alleles , Cell Line , DNA, Intergenic/genetics , Enhancer Elements, Genetic , Exons/genetics , Gene Expression Profiling , Genes/genetics , Genomics , Humans , Polyadenylation/genetics , Protein Isoforms/genetics , RNA/biosynthesis , RNA/genetics , RNA Editing/genetics , RNA Splicing/genetics , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis, RNA
2.
Genome Res ; 22(9): 1646-57, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955977

ABSTRACT

Data from the Encyclopedia of DNA Elements (ENCODE) project show over 9640 human genome loci classified as long noncoding RNAs (lncRNAs), yet only ~100 have been deeply characterized to determine their role in the cell. To measure the protein-coding output from these RNAs, we jointly analyzed two recent data sets produced in the ENCODE project: tandem mass spectrometry (MS/MS) data mapping expressed peptides to their encoding genomic loci, and RNA-seq data generated by ENCODE in long polyA+ and polyA- fractions in the cell lines K562 and GM12878. We used the machine-learning algorithm RuleFit3 to regress the peptide data against RNA expression data. The most important covariate for predicting translation was, surprisingly, the Cytosol polyA- fraction in both cell lines. LncRNAs are ~13-fold less likely to produce detectable peptides than similar mRNAs, indicating that ~92% of GENCODE v7 lncRNAs are not translated in these two ENCODE cell lines. Intersecting 9640 lncRNA loci with 79,333 peptides yielded 85 unique peptides matching 69 lncRNAs. Most cases were due to a coding transcript misannotated as lncRNA. Two exceptions were an unprocessed pseudogene and a bona fide lncRNA gene, both with open reading frames (ORFs) compromised by upstream stop codons. All potentially translatable lncRNA ORFs had only a single peptide match, indicating low protein abundance and/or false-positive peptide matches. We conclude that with very few exceptions, ribosomes are able to distinguish coding from noncoding transcripts and, hence, that ectopic translation and cryptic mRNAs are rare in the human lncRNAome.


Subject(s)
Protein Biosynthesis , RNA, Long Noncoding/genetics , Amino Acid Sequence , Base Sequence , Cell Line , Gene Expression , Gene Expression Profiling , Gene Expression Regulation , Humans , K562 Cells , Molecular Sequence Annotation , Molecular Sequence Data , Peptides/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sequence Alignment , Tandem Mass Spectrometry/methods
3.
BMC Genomics ; 14: 141, 2013 Feb 28.
Article in English | MEDLINE | ID: mdl-23448259

ABSTRACT

BACKGROUND: Proteogenomic mapping is an approach that uses mass spectrometry data from proteins to directly map protein-coding genes and could aid in locating translational regions in the human genome. In concert with the ENcyclopedia of DNA Elements (ENCODE) project, we applied proteogenomic mapping to produce proteogenomic tracks for the UCSC Genome Browser, to explore which putative translational regions may be missing from the human genome. RESULTS: We generated ~1 million high-resolution tandem mass (MS/MS) spectra for Tier 1 ENCODE cell lines K562 and GM12878 and mapped them against the UCSC hg19 human genome, and the GENCODE V7 annotated protein and transcript sets. We then compared the results from the three searches to identify the best-matching peptide for each MS/MS spectrum, thereby increasing the confidence of the putative new protein-coding regions found via the whole genome search. At a 1% false discovery rate, we identified 26,472, 24,406, and 13,128 peptides from the protein, transcript, and whole genome searches, respectively; of these, 481 were found solely via the whole genome search. The proteogenomic mapping data are available on the UCSC Genome Browser at http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUncBsuProt. CONCLUSIONS: The whole genome search revealed that ~4% of the uniquely mapping identified peptides were located outside GENCODE V7 annotated exons. The comparison of the results from the disparate searches also identified 15% more spectra than would have been found solely from a protein database search. Therefore, whole genome proteogenomic mapping is a complementary method for genome annotation when performed in conjunction with other searches.


Subject(s)
Databases, Genetic , Genome, Human , Molecular Sequence Annotation , Open Reading Frames/genetics , Cell Line , Chromosome Mapping , Computational Biology , Humans , Mass Spectrometry , Sequence Analysis, DNA
4.
Anal Chem ; 84(21): 9008-14, 2012 Nov 06.
Article in English | MEDLINE | ID: mdl-23030679

ABSTRACT

Membrane proteomics, the large-scale analysis of membrane proteins, is often constrained by the difficulties of achieving fully resolvable separation and resistance to proteolysis, both of which could lead to low recovery and low identification rates of membrane proteins. Here, we introduce a novel integrated approach, GELFrEE Optimized FASP Technology (GOFAST), for large-scale, comprehensive analysis of membrane proteins. Using an array of sample preparation techniques including gel-eluted liquid fraction entrapment electrophoresis (GELFrEE), filter-aided sample preparation (FASP), and microwave-assisted on-filter enzymatic digestion, we identified 2090 proteins from the membrane fraction of a leukemia cell line (K562). Of these, 37% are annotated as membrane proteins according to gene ontology analysis, resulting in the largest membrane proteome of leukemia cells reported to date. Our approach combines the advantages of GELFrEE high-loading capacity, gel-free separation, efficient depletion of detergents, and microwave-assisted on-filter digestion, minimizing sample losses and maximizing MS-detectable sequence coverage of individual proteins. In addition, this approach also shows great potential for the identification of alternative splicing products.


Subject(s)
Analytic Sample Preparation Methods/methods , Electrophoresis/methods , Membrane Proteins/analysis , Proteome/analysis , Proteomics/methods , Filtration , Humans , K562 Cells , Membrane Proteins/chemistry , Membrane Proteins/isolation & purification , Protein Isoforms/analysis , Protein Isoforms/chemistry , Proteome/chemistry
5.
Antimicrob Agents Chemother ; 54(11): 4626-35, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20696867

ABSTRACT

Microbes have developed resistance to nearly every antibiotic, yet the steps leading to drug resistance remain unclear. Here we report a multistage process by which Pseudomonas aeruginosa acquires drug resistance following exposure to ciprofloxacin at levels ranging from 0.5× to 8× the initial MIC. In stage I, susceptible cells are killed en masse by the exposure. In stage II, a small, slow-growing to nongrowing population survives antibiotic exposure but does not exhibit significantly increased resistance according to the MIC measure. In stage III, exhibited at 0.5× to 4× the MIC, a growing population emerges to reconstitute the population, and these cells display heritable increases in drug resistance of up to 50 times the original level. We studied the stage III cells by proteomic methods to uncover differences in the regulatory pathways that are involved in this phenotype, revealing upregulation of phosphorylation on two proteins, succinate-semialdehyde dehydrogenase (SSADH) and methylmalonate-semialdehyde dehydrogenase (MMSADH), and also revealing upregulation of a highly conserved protein of unknown function. Transposon disruption in the encoding genes for each of these targets substantially dampened the ability of cells to develop the stage III phenotype. Considering these results in combination with computational models of resistance and genomic sequencing results, we postulate that stage III heritable resistance develops from a combination of both genomic mutations and modulation of one or more preexisting cellular pathways.


Subject(s)
Anti-Infective Agents/pharmacology , Bacterial Proteins/metabolism , Ciprofloxacin/pharmacology , Drug Resistance, Bacterial/physiology , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/metabolism , Bacterial Proteins/genetics , DNA, Bacterial/genetics , Drug Resistance, Bacterial/genetics , Electrophoresis, Gel, Two-Dimensional , Methylmalonate-Semialdehyde Dehydrogenase (Acylating)/genetics , Methylmalonate-Semialdehyde Dehydrogenase (Acylating)/metabolism , Microbial Sensitivity Tests , Pseudomonas aeruginosa/genetics , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Succinate-Semialdehyde Dehydrogenase/genetics , Succinate-Semialdehyde Dehydrogenase/metabolism
6.
Bioinformatics ; 24(5): 674-81, 2008 Mar 01.
Article in English | MEDLINE | ID: mdl-18187442

ABSTRACT

MOTIVATION: The identification of peptides by tandem mass spectrometry (MS/MS) is a central method of proteomics research, but due to the complexity of MS/MS data and the large databases searched, the accuracy of peptide identification algorithms remains limited. To improve the accuracy of identification we applied a machine-learning approach using a hidden Markov model (HMM) to capture the complex and often subtle links between a peptide sequence and its MS/MS spectrum. MODEL: Our model, HMM_Score, represents ion types as HMM states and calculates the maximum joint probability for a peptide/spectrum pair using emission probabilities from three factors: the amino acids adjacent to each fragmentation site, the mass dependence of ion types and the intensity dependence of ion types. The Viterbi algorithm is used to calculate the most probable assignment between ion types in a spectrum and a peptide sequence, then a correction factor is added to account for the propensity of the model to favor longer peptides. An expectation value is calculated based on the model score to assess the significance of each peptide/spectrum match. RESULTS: We trained and tested HMM_Score on three data sets generated by two different mass spectrometer types. For a reference data set recently reported in the literature and validated using seven identification algorithms, HMM_Score produced 43% more positive identification results at a 1% false positive rate than the best of two other commonly used algorithms, Mascot and X!Tandem. HMM_Score is a highly accurate platform for peptide identification that works well for a variety of mass spectrometer and biological sample types. AVAILABILITY: The program is freely available on ProteomeCommons via an OpenSource license. See http://bioinfo.unc.edu/downloads/ for the download link.


Subject(s)
Markov Chains , Peptides/chemistry , Algorithms , Models, Theoretical , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Tandem Mass Spectrometry
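The abstract of record 6 says the Viterbi algorithm finds the most probable assignment between ion types (the HMM states) and the peaks of a spectrum. The following is a minimal, generic sketch of Viterbi decoding in log space, not the HMM_Score implementation: the two states, the three peak-intensity classes, and every probability are hypothetical toy values chosen for illustration, and the three-factor emission model and length correction described in the abstract are not reproduced.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a discrete HMM, in log space."""
    # best[s] holds the log-probability of the best path ending in state s.
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    back = []  # one dict of back-pointers per transition
    for obs in observations[1:]:
        step, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev = max(states, key=lambda p: best[p] + log_trans[p][s])
            step[s] = best[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best = step
        back.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


# Hypothetical toy model: two ion-type states, three peak-intensity classes.
states = ["b", "y"]
log_start = to_log({"b": 0.6, "y": 0.4})
log_trans = to_log({"b": {"b": 0.7, "y": 0.3}, "y": {"b": 0.4, "y": 0.6}})
log_emit = to_log({
    "b": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "y": {"low": 0.1, "mid": 0.3, "high": 0.6},
})

score, path = viterbi(["low", "mid", "high"], states,
                      log_start, log_trans, log_emit)
print(path, math.exp(score))
```

Working in log space avoids numerical underflow when spectra contain many peaks; a real scorer would replace the toy tables with emission probabilities estimated from training spectra.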
7.
J Mol Biol ; 336(5): 1223-38, 2004 Mar 05.
Article in English | MEDLINE | ID: mdl-15037081

ABSTRACT

The simplest approximation of interaction potential between amino acid residues in proteins is the contact potential, which defines the effective free energy of a protein conformation by a set of amino acid contacts formed in this conformation. Finding a contact potential capable of predicting free energies of protein states across a variety of protein families will aid protein folding and engineering in silico on a computationally tractable time-scale. We test the ability of contact potentials to accurately and transferably (across various protein families) predict stability changes of proteins upon mutations. We develop a new methodology to determine the contact potentials in proteins from experimental measurements of changes in proteins' thermodynamic stabilities (ΔΔG) upon mutations. We apply our methodology to derive sets of contact interaction parameters for a hierarchy of interaction models including solvation and multi-body contact parameters. We test how well our models reproduce experimental measurements by statistical tests. We evaluate the maximum accuracy of predictions obtained by using contact potentials and the correlation between parameters derived from different data-sets of experimental ΔΔG values. We argue that it is impossible to reach experimental accuracy and derive fully transferable contact parameters using the contact models of potentials. However, contact parameters may yield reliable predictions of ΔΔG for datasets of mutations confined to the same amino acid positions in the sequence of a single protein.


Subject(s)
Proteins/chemistry , Thermodynamics , Mutation , Protein Denaturation , Protein Folding , Proteins/genetics , Quantum Theory
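The abstract of record 7 describes a contact potential as a free energy defined by the set of amino acid contacts in a conformation. A minimal sketch of that idea follows, under loud assumptions: the pair energies, the three-residue sequence, and the contact list are invented toy values, and the unfolded state is assumed to contribute no contact energy, so the stability change is taken simply as the difference in folded-state contact energy. The paper's fitted parameters, solvation terms, and multi-body terms are not represented.

```python
def contact_energy(sequence, contacts, pair_energy):
    """Sum pairwise contact energies over all residue pairs in contact."""
    total = 0.0
    for i, j in contacts:
        # The pair table is keyed by alphabetically sorted residue letters.
        pair = tuple(sorted((sequence[i], sequence[j])))
        total += pair_energy[pair]
    return total


def ddg_upon_mutation(sequence, contacts, pair_energy, position, new_residue):
    """Estimate the stability change as E(mutant) - E(wild type).

    Assumes the contact map is unchanged by the mutation and that the
    unfolded state has no contacts (both are simplifying assumptions).
    """
    mutant = sequence[:position] + new_residue + sequence[position + 1:]
    return (contact_energy(mutant, contacts, pair_energy)
            - contact_energy(sequence, contacts, pair_energy))


# Hypothetical toy parameters (arbitrary energy units).
pair_energy = {("L", "L"): -1.0, ("A", "L"): -0.4, ("A", "A"): -0.1}
wild_type = "LAL"
contacts = [(0, 2)]  # residues 0 and 2 touch in the folded structure

ddg = ddg_upon_mutation(wild_type, contacts, pair_energy, 0, "A")
print(ddg)  # positive here means the toy mutation loses contact energy
```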
8.
Microb Drug Resist ; 19(6): 428-36, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23808957

ABSTRACT

The alarming rise of ciprofloxacin-resistant Pseudomonas aeruginosa has been reported in several clinical studies. Though the mutation of resistance genes and their role in drug resistance has been researched, the process by which the bacterium acquires high-level resistance is still not well understood. How does the genomic evolution of P. aeruginosa affect resistance development? Could the exposure of the bacteria to antibiotics enrich genomic variants that lead to the development of resistance, and if so, how are these variants distributed through the genome? To answer these questions, we performed 454 pyrosequencing and a whole genome analysis both before and after exposure to ciprofloxacin. The comparative sequence data revealed 93 unique resistance strain variation sites, which included a mutation in the DNA gyrase subunit A gene. We generated variation-distribution maps comparing the wild and resistant types, and isolated 19 candidates from three discrete resistance-associated high variability regions that had available transposon mutants, to perform a ciprofloxacin exposure assay. Of these region candidates with transposon disruptions, 79% (15/19) showed a reduction in the ability to gain high-level resistance, suggesting that genes within these high variability regions might enrich for certain functions associated with resistance development.


Subject(s)
DNA Gyrase/genetics , Drug Resistance, Bacterial/genetics , Genome, Bacterial , Mutation , Pseudomonas aeruginosa/genetics , Anti-Bacterial Agents/pharmacology , Ciprofloxacin/pharmacology , DNA Transposable Elements , Genetic Variation , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Pseudomonas aeruginosa/drug effects
9.
Anal Chem ; 79(8): 3032-40, 2007 Apr 15.
Article in English | MEDLINE | ID: mdl-17367113

ABSTRACT

The identification of proteins by tandem mass spectrometry relies on knowledge of the products produced by collision-induced dissociation of peptide ions. Most previous work has focused on fragmentation statistics for ion trap systems. We analyzed fragmentation in MALDI TOF/TOF mass spectrometry, collecting statistics using a curated set of 2459 MS/MS spectra and applying bootstrap resampling to assess confidence intervals. We calculated the frequency of 18 product ion types, the correlation between both mass and intensity with ion type, the dependence of amide bond breakage on the residues surrounding the cleavage site, and the dependence of product ion detection on residues not adjacent to the cleavage site. The most frequently observed were internal ions, followed by y ions. A strong correlation between ion type and the mass and intensity of its peak was observed, with b and y ions producing the most intense and highest mass peaks. The amino acids P, W, D, and R had a strong effect on amide bond cleavage when situated next to the breakage site, whereas residues including I, K, and H had a strong effect on product ion observation when located in the peptide but not adjacent to the cleavage site, a novel observation.


Subject(s)
Data Interpretation, Statistical , Proteins/chemistry , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Amino Acids/chemistry
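The abstract of record 9 mentions bootstrap resampling to assess confidence intervals on fragmentation statistics. The sketch below shows one common variant, the percentile bootstrap, applied to an ion-type frequency. It is an illustration only: the abstract does not say which bootstrap variant was used, and the data (100 hypothetical spectra, 60 of which contain the ion of interest), the function name, and all parameters are assumptions.

```python
import random


def bootstrap_ci(values, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(values)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = sorted(
        # Resample n items with replacement and recompute the statistic.
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower_index = round((alpha / 2) * n_resamples)
    upper_index = round((1 - alpha / 2) * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical data: 1 if a spectrum contains the ion type, else 0.
ion_observed = [1] * 60 + [0] * 40

low, high = bootstrap_ci(ion_observed, mean)
print(f"observed frequency 0.60, 95% CI roughly [{low:.2f}, {high:.2f}]")
```

The same function works for any statistic computed per spectrum set, for example a mean peak intensity, by passing a different `statistic` callable.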