1.
Cell ; 163(3): 684-97, 2015 Oct 22.
Article in English | MEDLINE | ID: mdl-26496608

ABSTRACT

The central role of translation in modulating gene activity has long been recognized, yet the systematic exploration of quantitative changes in translation at a genome-wide scale in response to a specific stimulus has only recently become technically feasible. Using the well-characterized signaling pathway of the phytohormone ethylene and plant-optimized genome-wide ribosome footprinting, we have uncovered a molecular mechanism linking this hormone's perception to the activation of a gene-specific translational control mechanism. Characterization of one of the targets of this translation regulatory machinery, the ethylene signaling component EBF2, indicates that the signaling molecule EIN2 and the nonsense-mediated decay proteins UPFs play a central role in this ethylene-induced translational response. Furthermore, the 3'UTR of EBF2 is sufficient to confer translational regulation and required for the proper activation of ethylene responses. These findings represent a mechanistic paradigm of gene-specific regulation of translation in response to a key growth regulator.


Subject(s)
Arabidopsis Proteins/metabolism; Arabidopsis/metabolism; Protein Biosynthesis; Receptors, Cell Surface/metabolism; Signal Transduction; 3' Untranslated Regions; Arabidopsis/genetics; Arabidopsis Proteins/genetics; DNA-Binding Proteins; Ethylenes/metabolism; F-Box Proteins/genetics; Gene Expression Regulation, Plant; Nuclear Proteins/metabolism; RNA, Messenger/metabolism; Ribosomes/metabolism; Transcription Factors/metabolism
2.
BMC Genomics ; 20(Suppl 5): 422, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31167636

ABSTRACT

BACKGROUND: Ribo-seq is a popular technique for studying translation and its regulation. A Ribo-seq experiment produces a snapshot of the location and abundance of actively translating ribosomes within a cell's transcriptome. In practice, Ribo-seq data analysis can be sensitive to quality issues such as read length variation, low read periodicity, and contamination with ribosomal and transfer RNA. Various software tools for data preprocessing, quality assessment, analysis, and visualization of Ribo-seq data have been developed. However, many of these tools require considerable practical knowledge of software applications, and often several tools have to be used in combination. RESULTS: We present riboStreamR, a comprehensive Ribo-seq quality control (QC) platform in the form of an R Shiny web application. RiboStreamR provides visualization and analysis tools for various Ribo-seq QC metrics, including read length distribution, read periodicity, and translational efficiency. Our platform is focused on providing a user-friendly experience, and includes various options for graphical customization, report generation, and anomaly detection within Ribo-seq datasets. CONCLUSIONS: RiboStreamR takes advantage of the vast resources provided by the R and Bioconductor environments, and utilizes the Shiny R package to ensure a high level of usability. Our goal is to develop a tool which facilitates in-depth quality assessment of Ribo-seq data by providing reference datasets and automatically highlighting quality issues and anomalies within datasets.
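
Two of the QC metrics named in this abstract, read length distribution and read periodicity, can be illustrated with a minimal Python sketch. This is not riboStreamR code (riboStreamR is an R Shiny application); the function names, the use of 5' read ends, and the toy coordinates are assumptions made for illustration only.

```python
from collections import Counter

def length_distribution(read_lengths):
    """Count how many reads have each footprint length."""
    return dict(sorted(Counter(read_lengths).items()))

def frame_fractions(read_starts, cds_start):
    """Fraction of reads whose 5' end falls in codon frame 0, 1 and 2.

    Assumption: positions are transcript coordinates and cds_start is the
    first nucleotide of the start codon. Strong three-nucleotide periodicity
    shows up as one frame dominating the other two.
    """
    counts = Counter((pos - cds_start) % 3 for pos in read_starts)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts[frame] / total for frame in (0, 1, 2)]

# Toy data (hypothetical): most read starts sit in frame 0.
lengths = length_distribution([28, 29, 28, 30, 28])
fractions = frame_fractions([100, 103, 106, 109, 110, 112, 115, 118], cds_start=100)
print(lengths)
print(fractions)
```

In real data the 5' end is usually shifted by a length-specific offset to the ribosomal P-site before frames are assigned; that step is omitted here.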


Subject(s)
High-Throughput Nucleotide Sequencing/methods; RNA, Messenger/metabolism; Ribosomes/metabolism; Software; Web Browser; Computer Graphics; Genomics/methods; Humans; Protein Biosynthesis; Quality Control; RNA, Messenger/genetics; Sequence Analysis, RNA; Transcriptome
3.
Plant Physiol ; 171(1): 42-61, 2016 05.
Article in English | MEDLINE | ID: mdl-26983993

ABSTRACT

Plant meristems, like animal stem cell niches, maintain a pool of multipotent, undifferentiated cells that divide and differentiate to give rise to organs. In Arabidopsis (Arabidopsis thaliana), the carpel margin meristem is a vital meristematic structure that generates ovules from the medial domain of the gynoecium, the female floral reproductive structure. The molecular mechanisms that specify this meristematic region and regulate its organogenic potential are poorly understood. Here, we present a novel approach to analyze the transcriptional signature of the medial domain of the Arabidopsis gynoecium, highlighting the developmental stages that immediately precede ovule initiation, the earliest stages of seed development. Using a floral synchronization system and a SHATTERPROOF2 (SHP2) domain-specific reporter, paired with FACS and RNA sequencing, we assayed the transcriptome of the gynoecial medial domain with temporal and spatial precision. This analysis reveals a set of genes that are differentially expressed within the SHP2 expression domain, including genes that have been shown previously to function during the development of medial domain-derived structures, including the ovules, thus validating our approach. Global analyses of the transcriptomic data set indicate a similarity of the pSHP2-expressing cell population to previously characterized meristematic domains, further supporting the meristematic nature of this gynoecial tissue. Our method identifies additional genes including novel isoforms, cis-natural antisense transcripts, and a previously unrecognized member of the REPRODUCTIVE MERISTEM family of transcriptional regulators that are potential novel regulators of medial domain development. This data set provides genome-wide transcriptional insight into the development of the carpel margin meristem in Arabidopsis.


Subject(s)
Arabidopsis Proteins/genetics; Arabidopsis/genetics; Gene Expression Regulation, Plant; MADS Domain Proteins/genetics; Meristem/genetics; Transcriptome; Arabidopsis/anatomy & histology; Arabidopsis Proteins/isolation & purification; Base Sequence; Chloral Hydrate; DNA, Antisense; Flowers/genetics; Genome, Plant; In Situ Hybridization; Indoleacetic Acids/pharmacology; MADS Domain Proteins/isolation & purification; Meristem/growth & development; Meristem/metabolism; Microscopy, Confocal; Ovule/cytology; Ovule/growth & development; Ovule/metabolism; Protein Isoforms; Protoplasts; RNA, Plant/chemistry; RNA, Plant/isolation & purification; Seeds/growth & development; Sequence Alignment; Transcription Factors; Transcriptional Activation
4.
BMC Bioinformatics ; 11 Suppl 3: S6, 2010 Apr 29.
Article in English | MEDLINE | ID: mdl-20438653

ABSTRACT

BACKGROUND: In eukaryotes, alternative splicing often generates multiple splice variants from a single gene. Here we explore the use of RNA sequencing (RNA-Seq) datasets to address the isoform quantification problem. Given a set of known splice variants, the goal is to estimate the relative abundance of the individual variants. METHODS: Our method employs a linear models framework to estimate the ratios of known isoforms in a sample. A key feature of our method is that it takes into account the non-uniformity of RNA-Seq read positions along the targeted transcripts. RESULTS: Preliminary tests indicate that the model performs well on both simulated and real data. In two publicly available RNA-Seq datasets, we identified several alternatively-spliced genes with switch-like, on/off expression properties, as well as a number of other genes that varied more subtly in isoform expression. In many cases, genes exhibiting differential expression of alternatively spliced transcripts were not differentially expressed at the gene level. CONCLUSIONS: Given that changes in isoform expression level frequently involve a continuum of isoform ratios, rather than all-or-nothing expression, and that they are often independent of general gene expression changes, we anticipate that our research will contribute to revealing a so far uninvestigated layer of the transcriptome. We believe that, in the future, researchers will prioritize genes for functional analysis based not only on observed changes in gene expression levels, but also on changes in alternative splicing.
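
The linear-model idea in this abstract can be sketched for the simplest case of two known isoforms. This is an illustrative least-squares toy, not the authors' implementation: the function name, the two-isoform restriction, the region-level coverage values, and the clipping of negative estimates to zero are all assumptions. Positional non-uniformity, which the abstract highlights, would enter through the design-matrix entries (an expected-coverage weight per region instead of a plain 1).

```python
def isoform_ratios(design, coverage):
    """Least-squares estimate of the relative abundance of two isoforms.

    design[i] = (a_i, b_i): expected contribution of one unit of isoform A
    and isoform B to region i (0 if the isoform lacks the region; values
    other than 1 can encode a hypothetical positional-bias weight).
    coverage[i]: observed read coverage of region i.
    """
    s_aa = sum(a * a for a, _ in design)
    s_ab = sum(a * b for a, b in design)
    s_bb = sum(b * b for _, b in design)
    t_a = sum(a * y for (a, _), y in zip(design, coverage))
    t_b = sum(b * y for (_, b), y in zip(design, coverage))
    det = s_aa * s_bb - s_ab * s_ab
    if det == 0:
        raise ValueError("isoforms are not distinguishable from these regions")
    # Solve the 2x2 normal equations with Cramer's rule.
    x_a = (t_a * s_bb - t_b * s_ab) / det
    x_b = (s_aa * t_b - s_ab * t_a) / det
    # Abundances cannot be negative; clip before normalising.
    x_a, x_b = max(x_a, 0.0), max(x_b, 0.0)
    total = x_a + x_b
    if total == 0:
        return (0.0, 0.0)
    return (x_a / total, x_b / total)

# Toy gene: exons 1 and 3 are shared, exon 2 occurs only in isoform A.
ratios = isoform_ratios(design=[(1, 1), (1, 0), (1, 1)], coverage=[30, 20, 30])
print(ratios)
```

With more than two isoforms the same normal equations would be solved with a general (ideally non-negative) least-squares routine.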


Subject(s)
Alternative Splicing/genetics; Computational Biology/methods; Databases, Nucleic Acid; Protein Isoforms/analysis; Sequence Analysis, RNA/methods; Algorithms; Arabidopsis/genetics; Base Sequence; Chi-Square Distribution; Computer Simulation; Gene Expression Regulation, Plant; Genes, Plant; Linear Models; Models, Genetic; Molecular Sequence Data; Protein Isoforms/genetics; Stress, Physiological
5.
J Proteome Res ; 9(3): 1209-17, 2010 Mar 05.
Article in English | MEDLINE | ID: mdl-20047314

ABSTRACT

Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism whereby exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequence. However, many protein databases include only a limited number of isoforms to keep redundancy minimal. As a result, the database search might not identify a target protein even with high-quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.


Subject(s)
Aspergillus flavus/genetics; Fungal Proteins/genetics; Protein Isoforms/genetics; Alternative Splicing; Aspergillus flavus/metabolism; Computer Simulation; Cytochrome-B(5) Reductase/chemistry; Cytochrome-B(5) Reductase/genetics; Databases, Protein; Fungal Proteins/biosynthesis; Fungal Proteins/chemistry; Isotope Labeling; Mass Spectrometry; Models, Genetic; Protein Isoforms/biosynthesis; Protein Isoforms/chemistry; Proteomics; Pyruvate Carboxylase/chemistry; Pyruvate Carboxylase/genetics; RNA Splice Sites; Reproducibility of Results
6.
Plant Cell Physiol ; 51(1): 144-63, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19996151

ABSTRACT

As a step toward a comprehensive description of lignin biosynthesis in Populus trichocarpa, we identified from the genome sequence 95 phenylpropanoid gene models in 10 protein families encoding enzymes for monolignol biosynthesis. Transcript abundance was determined for all 95 genes in xylem, leaf, shoot and phloem using quantitative real-time PCR (qRT-PCR). We identified 23 genes that most probably encode monolignol biosynthesis enzymes during wood formation. Transcripts for 18 of the 23 are abundant and specific to differentiating xylem. We found evidence suggesting functional redundancy at the transcript level for phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), 4-coumarate:CoA ligase (4CL), p-hydroxycinnamoyl-CoA:quinate shikimate p-hydroxycinnamoyltransferase (HCT), caffeoyl-CoA O-methyltransferase (CCoAOMT) and coniferyl aldehyde 5-hydroxylase (CAld5H). We carried out an enumeration-based motif identification and discriminant analysis on the promoters of all 95 genes. Five core motifs correctly discriminate the 18 xylem-specific genes from the 77 non-xylem genes. These motifs are similar to promoter elements known to regulate phenylpropanoid gene expression. This work suggests that genes in monolignol biosynthesis are regulated by multiple motifs, often related in sequence.
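
As a rough illustration of enumeration-based motif identification, the sketch below enumerates every k-mer in two sets of promoter sequences and ranks k-mers by how much more often they occur in one set than the other. It is a simplified stand-in, not the procedure used in the paper: the scoring rule (difference in the fraction of promoters containing the k-mer), the function names, and the toy sequences are assumptions.

```python
def kmers_in(sequence, k):
    """Set of all distinct k-mers present in one sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def motif_enrichment(positive, negative, k):
    """Rank k-mers by (fraction of positive promoters containing the k-mer)
    minus (fraction of negative promoters containing it)."""
    pos_sets = [kmers_in(seq, k) for seq in positive]
    neg_sets = [kmers_in(seq, k) for seq in negative]
    scores = {}
    for kmer in set().union(*pos_sets, *neg_sets):
        in_pos = sum(kmer in s for s in pos_sets) / len(pos_sets)
        in_neg = sum(kmer in s for s in neg_sets) / len(neg_sets)
        scores[kmer] = in_pos - in_neg
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Made-up promoter fragments: both "xylem-specific" ones share ACCAAC.
xylem = ["TTACCAACGG", "GGACCAACTT"]
other = ["TTTTGGGGCC", "AAAATTTTGG"]
ranked = motif_enrichment(xylem, other, k=6)
print(ranked[0])
```

A discriminant analysis, as described in the abstract, would then combine several top-ranked motifs into a classifier rather than relying on a single k-mer.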


Subject(s)
Biosynthetic Pathways/genetics; Lignin/biosynthesis; Lignin/genetics; Populus/genetics; Populus/metabolism; RNA, Plant/genetics; Amino Acid Motifs/genetics; Enzymes/biosynthesis; Enzymes/genetics; Gene Expression Regulation, Enzymologic/physiology; Gene Expression Regulation, Plant/physiology; Genome, Plant/genetics; Phloem/enzymology; Phloem/genetics; Plant Shoots/enzymology; Plant Shoots/genetics; Promoter Regions, Genetic/genetics; RNA, Messenger/analysis; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Plant/analysis; RNA, Plant/metabolism; Transcription, Genetic/physiology; Xylem/enzymology; Xylem/genetics
7.
Hum Genomics ; 3(3): 221-35, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19403457

ABSTRACT

Transcription factors are key mediators of human complex disease processes. Identifying the target genes of transcription factors will increase our understanding of the biological network leading to disease risk. The prediction of transcription factor binding sites (TFBSs) is one method to identify these target genes; however, current prediction methods need improvement. We chose the transcription factor upstream stimulatory factor 1 (USF1) to evaluate the performance of our novel TFBS prediction method because of its known genetic association with coronary artery disease (CAD) and the recent availability of USF1 chromatin immunoprecipitation microarray (ChIP-chip) results. The specific goals of our study were to develop a novel and accurate genome-scale method for predicting USF1 binding sites and associated target genes to aid in the study of CAD. Previously published USF1 ChIP-chip data for 1 per cent of the genome were used to develop and evaluate several kernel logistic regression prediction models. A combination of genomic features (phylogenetic conservation, regulatory potential, presence of a CpG island and DNaseI hypersensitivity), as well as position weight matrix (PWM) scores, were used as variables for these models. Our most accurate predictor achieved an area under the receiver operating characteristic curve of 0.827 during cross-validation experiments, significantly outperforming standard PWM-based prediction methods. When applied to the whole human genome, we predicted 24,010 USF1 binding sites within 5 kilobases upstream of the transcription start site of 9,721 genes. These predictions included 16 of 20 genes with strong evidence of USF1 regulation. Finally, in the spirit of genomic convergence, we integrated independent experimental CAD data with these USF1 binding site prediction results to develop a prioritised set of candidate genes for future CAD studies. We have shown that our novel prediction method, which employs genomic features related to the presence of regulatory elements, enables more accurate and efficient prediction of USF1 binding sites. This method can be extended to other transcription factors identified in human disease studies to help further our understanding of the biology of complex disease.
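
The PWM score that serves as the baseline (and as one model variable) in this abstract can be sketched as a log-odds scan. The sketch covers only the PWM part, not the kernel logistic regression model; the function names, the pseudocount, the uniform background, and the toy binding sites are assumptions. The toy sites are built around the E-box CACGTG, the motif class USF1 is known to bind.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from aligned, equal-length sites."""
    pwm = []
    for i in range(len(sites[0])):
        column = {}
        for base in "ACGT":
            count = sum(1 for site in sites if site[i] == base)
            freq = (count + pseudocount) / (len(sites) + 4 * pseudocount)
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm

def best_hit(pwm, sequence):
    """Return (start, score) of the highest-scoring window.

    Assumes the sequence contains only A, C, G and T."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best

# Hypothetical aligned sites and a short made-up upstream sequence.
pwm = build_pwm(["CACGTG", "CACGTG", "CACGTG", "CATGTG"])
hit = best_hit(pwm, "TTGACACGTGAAT")
print(hit)
```

In the approach the abstract describes, a score like this would be one input among several genomic features rather than the sole predictor.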


Subject(s)
Cardiovascular Diseases/metabolism; Genomics/methods; Upstream Stimulatory Factors/metabolism; Animals; Binding Sites; Chromatin Immunoprecipitation; Humans; Regression Analysis; Transcription Factors/metabolism
8.
J Comput Biol ; 27(2): 259-268, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31855064

ABSTRACT

ChIP-Seq blacklists contain genomic regions that frequently produce artifacts and noise in ChIP-Seq experiments. To improve signal-to-noise ratio, ChIP-Seq pipelines often remove data points that map to blacklist regions. Existing blacklists have been compiled in a manual or semiautomated way. In this article we describe PeakPass, an efficient method to generate blacklists, and demonstrate that blacklists can increase ChIP-Seq data quality. PeakPass leverages machine learning and attempts to automate blacklist generation. PeakPass uses a random forest classifier in combination with genomic features such as sequence, annotated repeats, complexity, assembly gaps, and the ratio of multimapping to uniquely mapping reads to identify artifact regions. We have validated PeakPass on a large data set and tested it for the purpose of upgrading a blacklist to a new reference genome version. We trained PeakPass on the ENCODE blacklist for the hg19 human reference genome, and created an updated blacklist for hg38. To assess the performance of this blacklist, we tested 42 ChIP-Seq replicates from 24 experiments using 10 ChIP-Seq quality metrics including relative strand coefficient, standardized standard deviation, and enrichment of reads in promoter regions. Using the blacklist generated by PeakPass resulted in a statistically significant improvement for nine of these metrics.
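
The pipeline step mentioned at the start of this abstract, removing data points that map to blacklist regions, can be sketched as an interval-overlap filter. This shows only how a blacklist is applied, not how PeakPass generates one; the function names, the half-open BED-style coordinates, and the toy regions are assumptions.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end

def filter_peaks(peaks, blacklist):
    """Drop every peak that overlaps a blacklisted region on the same chromosome.

    peaks and blacklist are lists of (chrom, start, end) tuples.
    """
    by_chrom = {}
    for chrom, start, end in blacklist:
        by_chrom.setdefault(chrom, []).append((start, end))
    kept = []
    for chrom, start, end in peaks:
        regions = by_chrom.get(chrom, [])
        if not any(overlaps(start, end, s, e) for s, e in regions):
            kept.append((chrom, start, end))
    return kept

# Toy example: the second peak runs into the blacklisted region on chr1.
peaks = [("chr1", 100, 200), ("chr1", 950, 1050), ("chr2", 100, 200)]
blacklist = [("chr1", 1000, 2000)]
kept = filter_peaks(peaks, blacklist)
print(kept)
```

The linear scan per peak is fine for a toy case; genome-scale data would normally use sorted intervals or an interval tree.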

9.
BMC Bioinformatics ; 10: 191, 2009 Jun 22.
Article in English | MEDLINE | ID: mdl-19545436

ABSTRACT

BACKGROUND: Quality assessment of microarray data is an important and often challenging aspect of gene expression analysis. This task frequently involves the examination of a variety of summary statistics and diagnostic plots. The interpretation of these diagnostics is often subjective, and generally requires careful expert scrutiny. RESULTS: We show how an unsupervised classification technique based on the Expectation-Maximization (EM) algorithm and the naïve Bayes model can be used to automate microarray quality assessment. The method is flexible and can be easily adapted to accommodate alternate quality statistics and platforms. We evaluate our approach using Affymetrix 3' gene expression and exon arrays and compare the performance of this method to a similar supervised approach. CONCLUSION: This research illustrates the efficacy of an unsupervised classification approach for the purpose of automated microarray data quality assessment. Since our approach requires only unannotated training data, it is easy to customize and to keep up-to-date as technology evolves. In contrast to other "black box" classification systems, this method also allows for intuitive explanations.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods; Algorithms; Computational Biology/methods; Gene Expression Profiling/methods; Normal Distribution
10.
Evol Dev ; 11(1): 50-68, 2009.
Article in English | MEDLINE | ID: mdl-19196333

ABSTRACT

microRNAs (miRNAs) are approximately 22-nucleotide noncoding RNA regulatory genes that are key players in cellular differentiation and homeostasis. They might also play important roles in shaping metazoan macroevolution. Previous studies have shown that miRNAs are continuously being added to metazoan genomes through time, and, once integrated into gene regulatory networks, show only rare mutations within the primary sequence of the mature gene product and are only rarely secondarily lost. However, because the conclusions from these studies were largely based on phylogenetic conservation of miRNAs between model systems like Drosophila and the taxon of interest, it was unclear if these trends would describe most miRNAs in most metazoan taxa. Here, we describe the shared complement of miRNAs among 18 animal species using a combination of 454 sequencing of small RNA libraries with genomic searches. We show that the evolutionary trends elucidated from the model systems are generally true for all miRNA families and metazoan taxa explored: the continuous addition of miRNA families with only rare substitutions to the mature sequence, and only rare instances of secondary loss. Despite this conservation, we document evolutionarily stable shifts to the determination of position 1 of the mature sequence, a phenomenon we call seed shifting, as well as the ability to post-transcriptionally edit the 5' end of the mature read, changing the identity of the seed sequence and possibly the repertoire of downstream targets. Finally, we describe a novel type of miRNA in demosponges that, although it shows a different pre-miRNA structure, still shows remarkable conservation of the mature sequence in the two sponge species analyzed. We propose that miRNAs might be excellent phylogenetic markers, and suggest that the advent of morphological complexity might have its roots in miRNA innovation.


Subject(s)
Chordata/genetics; Evolution, Molecular; Invertebrates/genetics; MicroRNAs/genetics; Phylogeny; Animals; Base Sequence; Blotting, Northern; Computational Biology; Conserved Sequence/genetics; DNA Primers/genetics; Gene Library; Molecular Sequence Data; Porifera/genetics; Sequence Analysis, DNA; Species Specificity
11.
Article in English | MEDLINE | ID: mdl-17277413

ABSTRACT

Accurate base-assignment in repeat regions of a whole genome shotgun assembly is an unsolved problem. Since reads in repeat regions cannot be easily attributed to a unique location in the genome, current assemblers may place these reads arbitrarily. As a result, the base-assignment error rate in repeats is likely to be much higher than that in the rest of the genome. We developed an iterative algorithm, EULER-AIR, that is able to correct base-assignment errors in finished genome sequences in public databases. The Wolbachia genome is among the best finished genomes. Using this genome project as an example, we demonstrated that EULER-AIR can 1) discover and correct base-assignment errors, 2) provide accurate read assignments, 3) utilize finishing reads for accurate base-assignment, and 4) provide guidance for designing finishing experiments. In the genome of Wolbachia, EULER-AIR found 16 positions with ambiguous base-assignment and two positions with erroneous bases. Besides Wolbachia, many other genome sequencing projects have significantly fewer finishing reads and, hence, are likely to contain more base-assignment errors in repeats. We demonstrate that EULER-AIR is a software tool that can be used to find and correct base-assignment errors in a genome assembly project.


Subject(s)
Computational Biology/methods; Models, Statistical; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA/methods; Algorithms; Campylobacter jejuni/genetics; Cluster Analysis; Genome, Bacterial; Lactococcus lactis/genetics; Sequence Alignment; Sequence Analysis, DNA/statistics & numerical data; Software; Staphylococcus epidermidis/genetics; Wolbachia/genetics
12.
Nucleic Acids Res ; 33(Web Server issue): W638-43, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980552

ABSTRACT

The Remote Analysis Computation for gene Expression data (RACE) suite is a collection of bioinformatics web tools designed for the analysis of DNA microarray data. RACE performs probe-level data preprocessing, extensive quality checks, data visualization and data normalization for Affymetrix GeneChips. In addition, it offers differential expression analysis on normalized expression levels from any array platform. RACE estimates the false discovery rates of lists of potentially regulated genes and provides a Gene Ontology-term analysis tool for GeneChip data to support the biological interpretation and annotation of results. The analysis is fully automated but can be customized by flexible parameter settings. To offer a convenient starting point for subsequent analyses, and to provide maximum transparency, the R scripts used to generate the results can be downloaded along with the output files. RACE is freely available for use at http://race.unil.ch.


Subject(s)
Computational Biology; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Software; Computer Graphics; Gene Expression Profiling/standards; Internet; Oligonucleotide Array Sequence Analysis/standards; Quality Control; User-Computer Interface
13.
J Leukoc Biol ; 102(6): 1371-1380, 2017 12.
Article in English | MEDLINE | ID: mdl-29021367

ABSTRACT

The vertebrate immune response comprises multiple molecular and cellular components that interface to provide defense against pathogens. Because of the dynamic complexity of the immune system and its interdependent innate and adaptive functionality, an understanding of the whole-organism response to pathogen exposure remains unresolved. Zebrafish larvae provide a unique model for overcoming this obstacle, because larvae are protected against pathogens while lacking a functional adaptive immune system during the first few weeks of life. Zebrafish larvae were exposed to immune agonists for various lengths of time, and a microarray transcriptome analysis was executed. This strategy identified known immune response genes, as well as genes with unknown immune function, including the E3 ubiquitin ligase tripartite motif-9 (Trim9). Although trim9 expression was originally described as "brain specific," its expression has been reported in stimulated human Mϕs. In this study, we found elevated levels of trim9 transcripts in vivo in zebrafish Mϕs after immune stimulation. Trim9 has been implicated in axonal migration, and we therefore investigated the impact of Trim9 disruption on Mϕ motility and found that Mϕ chemotaxis and cellular architecture are subsequently impaired in vivo. These results demonstrate that Trim9 mediates cellular movement and migration in Mϕs as well as neurons.


Subject(s)
Cell Movement; Macrophages/cytology; Macrophages/metabolism; Nerve Tissue Proteins/metabolism; Tripartite Motif Proteins/metabolism; Ubiquitin-Protein Ligases/metabolism; Zebrafish Proteins/metabolism; Animals; Cell Movement/genetics; Cell Shape; Chemotaxis; Humans; Nerve Tissue Proteins/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Tripartite Motif Proteins/genetics; U937 Cells; Ubiquitin-Protein Ligases/genetics; Zebrafish/genetics; Zebrafish/immunology; Zebrafish Proteins/genetics
14.
OMICS ; 10(3): 358-68, 2006.
Article in English | MEDLINE | ID: mdl-17069513

ABSTRACT

Affymetrix GeneChips are one of the best-established microarray platforms. This powerful technique allows users to measure the expression of thousands of genes simultaneously. However, a microarray experiment is a sophisticated and time-consuming endeavor with many potential sources of unwanted variation that could compromise the results if left uncontrolled. Increasing data volume and data complexity have triggered growing concern and awareness of the importance of assessing the quality of generated microarray data. In this review, we give an overview of current methods and software tools for quality assessment of Affymetrix GeneChip data. We focus on quality metrics, diagnostic plots, probe-level methods, pseudo-images, and classification methods to identify corrupted chips. We also describe RNA quality assessment methods, which play an important role in challenging RNA sources like formalin-embedded biopsies, laser-microdissected samples, or single cells. No wet-lab methods are discussed in this paper.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods; Oligonucleotide Array Sequence Analysis/trends; Animals; Gene Expression Profiling; Humans
15.
J Mass Spectrom ; 41(3): 281-8, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16538648

ABSTRACT

The number and wide dynamic range of components found in biological matrixes present several challenges for global proteomics. In this perspective, we will examine the potential of zero-dimensional (0D), one-dimensional (1D), and two-dimensional (2D) separations coupled with Fourier-transform ion cyclotron resonance (FT-ICR) and time-of-flight (TOF) mass spectrometry (MS) for the analysis of complex mixtures. We describe and further develop previous reports on the space occupied by peptides, to calculate the theoretical peak capacity available to each separations-mass spectrometry method examined. Briefly, the peak capacity attainable by each of the mass analyzers was determined from the mass resolving power (RP) and the m/z space occupied by peptides considered from the mass distribution of tryptic peptides from National Center for Biotechnology Information's (NCBI's) nonredundant database. Our results indicate that reverse-phase-nanoHPLC (RP-nHPLC) separation coupled with FT-ICR MS offers an order of magnitude improvement in peak capacity over RP-nHPLC separation coupled with TOF MS. The addition of an orthogonal separation method, strong cation exchange (SCX), for 2D LC-MS demonstrates an additional 10-fold improvement in peak capacity over 1D LC-MS methods. Peak capacity calculations for 0D LC, two different 1D RP-HPLC methods, and 2D LC (with various numbers of SCX fractions) for both RP-HPLC methods coupled to FT-ICR and TOF MS are examined in detail. Peak capacity production rates, which take into account the total analysis time, are also considered for each of the methods. Furthermore, the significance of the space occupied by peptides is discussed.


Subject(s)
Electrophoresis, Gel, Two-Dimensional; Mass Spectrometry; Peptides/analysis; Proteomics/instrumentation; Proteomics/methods; Biotechnology/instrumentation; Biotechnology/methods; Fourier Analysis
16.
Nucleic Acids Res ; 32(13): 3977-83, 2004.
Article in English | MEDLINE | ID: mdl-15292448

ABSTRACT

Alternative splicing essentially increases the diversity of the transcriptome and has important implications for physiology, development and the genesis of diseases. Conventionally, alternative splicing is investigated in a case-by-case fashion, but this becomes cumbersome and error prone if genes show a huge abundance of different splice variants. We use a different approach and integrate all transcripts derived from a gene into a single splicing graph. Each transcript corresponds to a path in the graph, and alternative splicing is displayed by bifurcations. This representation preserves the relationships between different splicing variants and allows us to investigate systematically all possible putative transcripts. We built a database of splicing graphs for human genes, using transcript information from various major sources (Ensembl, RefSeq, STACK, TIGR and UniGene). A Web interface allows users to display the splicing graphs, to interactively assemble transcripts and to access their sequences as well as neighboring genomic regions. We also provide for each gene an exhaustive pre-computed catalog of putative transcripts--in total more than 1.2 million sequences. We found that approximately 65% of the investigated genes show evidence for alternative splicing, and in 5% of the cases, a single gene might produce over 100 transcripts.
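
The splicing-graph representation described in this abstract (each transcript is a path, alternative splicing appears as bifurcations) can be sketched in a few lines. This is a toy model, not the published database code: exons are plain labels, the graph is assumed to be acyclic, and the function names and example transcripts are hypothetical.

```python
def build_splicing_graph(transcripts):
    """Merge transcripts (lists of exon labels) into one adjacency map.

    An edge exon_a -> exon_b means that some transcript joins them directly.
    """
    graph = {}
    for exons in transcripts:
        for upstream, downstream in zip(exons, exons[1:]):
            successors = graph.setdefault(upstream, [])
            if downstream not in successors:
                successors.append(downstream)
    return graph

def enumerate_transcripts(graph, source, sink):
    """List every path from source to sink, i.e. every putative transcript.

    Assumes the graph has no cycles, which holds when edges always point
    from an upstream exon to a downstream one.
    """
    paths = []

    def walk(node, path):
        if node == sink:
            paths.append(path)
            return
        for successor in graph.get(node, []):
            walk(successor, path + [successor])

    walk(source, [source])
    return paths

# Two observed transcripts imply four putative ones.
observed = [["E1", "E2", "E3", "E5"], ["E1", "E3", "E4", "E5"]]
graph = build_splicing_graph(observed)
putative = enumerate_transcripts(graph, "E1", "E5")
print(putative)
```

The toy case shows why a pre-computed catalog can grow large: each independent bifurcation multiplies the number of paths.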


Subject(s)
Alternative Splicing; Software; Computer Graphics; Databases, Nucleic Acid; Genome, Human; Humans; Internet; RNA Splice Sites; Sequence Alignment; Sequence Analysis, DNA; Sequence Analysis, RNA; Transcription, Genetic
17.
Genet Mol Res ; 5(1): 224-32, 2006 Mar 31.
Article in English | MEDLINE | ID: mdl-16755513

ABSTRACT

Analysis of gene deletions is a fundamental approach for investigating gene function. We evaluated an algorithm that uses classification techniques to predict the phenotypic effects of gene deletions in yeast. We used a modified simulated annealing algorithm for feature selection and weighting. The selected features with high weights were phylogenetic conservation scores for bacteria, fungi (excluding Ascomycota), Ascomycota (excluding Saccharomyces cerevisiae), plants, and mammals, degree of paralogy, and number of protein-protein interactions. Classification was performed by weighted k-nearest neighbor and with support vector machine algorithms. To demonstrate how this approach might complement existing experimental procedures, we applied our algorithm to predict essential genes and genes causing morphological alterations in yeast.


Subject(s)
Algorithms , Gene Deletion , Fungal Genes/genetics , Phenotype , Yeasts/genetics , Animals , Mutation
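As a rough illustration of the classification step mentioned in the abstract above, the following Python sketch shows one common reading of "weighted k-nearest neighbor": each feature's contribution to the distance is scaled by a weight, and the label is chosen by majority vote among the k closest training genes. This is an assumption, not the paper's exact method; the function name, the feature values, the weights, and the labels are invented for illustration only.

```python
import math
from collections import Counter

def weighted_knn_predict(train_X, train_y, weights, query, k=3):
    """Majority vote among the k nearest neighbors under a feature-weighted
    Euclidean distance (assumed interpretation of 'weighted k-NN')."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical features: (conservation score, paralog count, interaction count).
train_X = [(0.9, 0, 12), (0.8, 1, 9), (0.2, 3, 1), (0.1, 4, 2), (0.85, 0, 15)]
train_y = ["essential", "essential", "nonessential", "nonessential", "essential"]
weights = (1.0, 0.5, 0.1)  # illustrative weights, e.g. from a feature-selection step

pred_a = weighted_knn_predict(train_X, train_y, weights, (0.88, 0, 11))
pred_b = weighted_knn_predict(train_X, train_y, weights, (0.15, 3, 1))
print(pred_a, pred_b)
```

In the paper the weights come from a modified simulated annealing search; here they are fixed by hand so that the distance computation stays easy to follow.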
18.
IEEE Trans Nanobioscience ; 15(2): 148-57, 2016 03.
Article in English | MEDLINE | ID: mdl-26886998

ABSTRACT

Upstream open reading frames (uORFs) are open reading frames that occur within the 5' UTR of an mRNA. uORFs have been found in many organisms. They play an important role in gene regulation, cell development, and in various metabolic processes. It is believed that translated uORFs reduce the translational efficiency of the main coding region. However, only a few uORFs have been experimentally characterized. In this paper, we use ribosome footprinting together with a semi-supervised approach based on stacking classification models to identify translated uORFs in Arabidopsis thaliana. Our approach identified 5360 potentially translated uORFs in 2051 genes. GO terms enriched in genes with translated uORFs include catalytic activity, binding, transferase activity, phosphotransferase activity, kinase activity, and transcription regulator activity. The reported uORFs occur with a higher frequency in multi-isoform genes, and some uORFs are affected by alternative transcript start sites or alternative splicing events. Association rule mining revealed sequence features associated with the translation status of the uORFs. We hypothesize that uORF translation is a complex process that might be regulated by multiple factors. The identified uORFs are available online at: https://www.dropbox.com/sh/zdutupedxafhly8/AABFsdNR5zDfiozB7B4igFcja?dl=0. This paper is the extended version of our research presented at ISBRA 2015.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Open Reading Frames/genetics , Protein Footprinting/methods , Ribosomes/metabolism , Algorithms , Cluster Analysis , Plant Genome , Plant RNA , Ribosomes/genetics , RNA Sequence Analysis , Supervised Machine Learning
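The definition of a uORF given in the abstract above can be made concrete with a small Python sketch that scans a 5' UTR for candidate uORFs. It only finds sequence-level candidates and says nothing about whether they are translated, which the paper infers from ribosome footprinting and stacked classifiers. The function name `find_uorfs`, the toy sequence, and the simplifications (ATG start codons only, the start codon must lie entirely inside the 5' UTR, the stop may fall anywhere downstream) are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_uorfs(mrna, cds_start):
    """Return (start, end) pairs, 0-based and half-open, for ORFs whose ATG
    lies entirely within the 5' UTR (positions before cds_start)."""
    uorfs = []
    for i in range(0, cds_start - 2):           # ATG must end before the main CDS
        if mrna[i:i + 3] != "ATG":
            continue
        for j in range(i + 3, len(mrna) - 2, 3):  # walk codons in frame with the ATG
            if mrna[j:j + 3] in STOP_CODONS:
                uorfs.append((i, j + 3))          # include the stop codon
                break
    return uorfs

# Hypothetical mRNA: 13-nt 5' UTR followed by a short main CDS (ATG GCT TAA).
mrna = "GGATGAAATAGCC" + "ATGGCTTAA"
candidates = find_uorfs(mrna, cds_start=13)
print(candidates, [mrna[s:e] for s, e in candidates])
```

For the toy sequence the only candidate is `ATGAAATAG` at positions 2 to 11; the main start codon at position 13 is excluded because it does not lie in the 5' UTR.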
19.
Drug Discov Today ; 8(3): 113-4, 2003 Feb 01.
Article in English | MEDLINE | ID: mdl-12568779

ABSTRACT

Highlights from the first European Conference on Computational Biology (ECCB 2002), held in conjunction with the German Conference on Bioinformatics (GCB 2002), 6-9 October 2002, Saarbrücken, Germany.


Subject(s)
Computational Biology/methods , Computational Biology/trends , Europe , Humans
20.
Neuron ; 84(2): 386-98, 2014 Oct 22.
Article in English | MEDLINE | ID: mdl-25284007

ABSTRACT

Molecular diversity of surface receptors has been hypothesized to provide a mechanism for selective synaptic connectivity. Neurexins are highly diversified receptors that drive the morphological and functional differentiation of synapses. Using a single cDNA sequencing approach, we detected 1,364 unique neurexin-α and 37 neurexin-β mRNAs produced by alternative splicing of neurexin pre-mRNAs. This molecular diversity results from near-exhaustive combinatorial use of alternative splice insertions in Nrxn1α and Nrxn2α. By contrast, Nrxn3α exhibits several highly stereotyped exon selections that incorporate novel elements for posttranscriptional regulation of a subset of transcripts. Complexity of Nrxn1α repertoires correlates with the cellular complexity of neuronal tissues, and a specific subset of isoforms is enriched in a purified cell type. Our analysis defines the molecular diversity of a critical synaptic receptor and provides evidence that neurexin diversity is linked to cellular diversity in the nervous system.


Subject(s)
Alternative Splicing , Brain/metabolism , Exons/genetics , Nerve Tissue Proteins/genetics , Messenger RNA/metabolism , Animals , Mice , Nerve Tissue Proteins/metabolism , Neurons/metabolism , Protein Isoforms/genetics , Protein Isoforms/metabolism , Synapses/metabolism