Results 1 - 20 of 23
1.
Cell ; 163(3): 684-97, 2015 Oct 22.
Article in English | MEDLINE | ID: mdl-26496608

ABSTRACT

The central role of translation in modulating gene activity has long been recognized, yet the systematic exploration of quantitative changes in translation at a genome-wide scale in response to a specific stimulus has only recently become technically feasible. Using the well-characterized signaling pathway of the phytohormone ethylene and plant-optimized genome-wide ribosome footprinting, we have uncovered a molecular mechanism linking this hormone's perception to the activation of a gene-specific translational control mechanism. Characterization of one of the targets of this translation regulatory machinery, the ethylene signaling component EBF2, indicates that the signaling molecule EIN2 and the nonsense-mediated decay proteins UPFs play a central role in this ethylene-induced translational response. Furthermore, the 3'UTR of EBF2 is sufficient to confer translational regulation and required for the proper activation of ethylene responses. These findings represent a mechanistic paradigm of gene-specific regulation of translation in response to a key growth regulator.


Subjects
Arabidopsis Proteins/metabolism, Arabidopsis/metabolism, Protein Biosynthesis, Cell Surface Receptors/metabolism, Signal Transduction, 3' Untranslated Regions, Arabidopsis/genetics, Arabidopsis Proteins/genetics, DNA-Binding Proteins, Ethylenes/metabolism, F-Box Proteins/genetics, Plant Gene Expression Regulation, Nuclear Proteins/metabolism, Messenger RNA/metabolism, Ribosomes/metabolism, Transcription Factors/metabolism
2.
BMC Genomics ; 20(Suppl 5): 422, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31167636

ABSTRACT

BACKGROUND: Ribo-seq is a popular technique for studying translation and its regulation. A Ribo-seq experiment produces a snapshot of the location and abundance of actively translating ribosomes within a cell's transcriptome. In practice, Ribo-seq data analysis can be sensitive to quality issues such as read length variation, low read periodicity, and contamination with ribosomal and transfer RNA. Various software tools for data preprocessing, quality assessment, analysis, and visualization of Ribo-seq data have been developed. However, many of these tools require considerable practical knowledge of software applications, and often several different tools have to be used in combination. RESULTS: We present riboStreamR, a comprehensive Ribo-seq quality control (QC) platform in the form of an R Shiny web application. RiboStreamR provides visualization and analysis tools for various Ribo-seq QC metrics, including read length distribution, read periodicity, and translational efficiency. Our platform is focused on providing a user-friendly experience, and includes various options for graphical customization, report generation, and anomaly detection within Ribo-seq datasets. CONCLUSIONS: RiboStreamR takes advantage of the vast resources provided by the R and Bioconductor environments, and utilizes the Shiny R package to ensure a high level of usability. Our goal is to develop a tool which facilitates in-depth quality assessment of Ribo-seq data by providing reference datasets and automatically highlighting quality issues and anomalies within datasets.


Subjects
High-Throughput Nucleotide Sequencing/methods, Messenger RNA/metabolism, Ribosomes/metabolism, Software, Web Browser, Computer Graphics, Genomics/methods, Humans, Protein Biosynthesis, Quality Control, Messenger RNA/genetics, RNA Sequence Analysis, Transcriptome
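Two of the QC metrics named in this abstract, read length distribution and read periodicity, can be illustrated with a minimal Python sketch. This is not riboStreamR code (riboStreamR is an R Shiny application); the function names and the input format (P-site positions relative to the start codon) are assumptions made for illustration only.

```python
from collections import Counter


def frame_periodicity(p_site_offsets):
    """Fraction of reads whose P-site falls in codon frame 0, 1, and 2.

    p_site_offsets: P-site positions relative to the annotated start codon
    (0 = first nucleotide of the CDS). Hypothetical input format.
    """
    counts = Counter(pos % 3 for pos in p_site_offsets)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts.get(frame, 0) / total for frame in range(3)]


def read_length_distribution(read_lengths):
    """Map each observed read length to its share of all reads."""
    counts = Counter(read_lengths)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


# Toy data: most P-sites sit in frame 0, as expected for good periodicity.
offsets = [0, 3, 6, 9, 12, 1, 4, 2]
print(frame_periodicity(offsets))
print(read_length_distribution([28, 28, 29, 30]))
```

A strong enrichment of one frame over the other two is the usual sign of good three-nucleotide periodicity; a near-uniform split suggests a quality problem of the kind the abstract describes.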
3.
Plant Physiol ; 171(1): 42-61, 2016 05.
Article in English | MEDLINE | ID: mdl-26983993

ABSTRACT

Plant meristems, like animal stem cell niches, maintain a pool of multipotent, undifferentiated cells that divide and differentiate to give rise to organs. In Arabidopsis (Arabidopsis thaliana), the carpel margin meristem is a vital meristematic structure that generates ovules from the medial domain of the gynoecium, the female floral reproductive structure. The molecular mechanisms that specify this meristematic region and regulate its organogenic potential are poorly understood. Here, we present a novel approach to analyze the transcriptional signature of the medial domain of the Arabidopsis gynoecium, highlighting the developmental stages that immediately precede ovule initiation, the earliest stages of seed development. Using a floral synchronization system and a SHATTERPROOF2 (SHP2) domain-specific reporter, paired with FACS and RNA sequencing, we assayed the transcriptome of the gynoecial medial domain with temporal and spatial precision. This analysis reveals a set of genes that are differentially expressed within the SHP2 expression domain, including genes that have been shown previously to function during the development of medial domain-derived structures, including the ovules, thus validating our approach. Global analyses of the transcriptomic data set indicate a similarity of the pSHP2-expressing cell population to previously characterized meristematic domains, further supporting the meristematic nature of this gynoecial tissue. Our method identifies additional genes including novel isoforms, cis-natural antisense transcripts, and a previously unrecognized member of the REPRODUCTIVE MERISTEM family of transcriptional regulators that are potential novel regulators of medial domain development. This data set provides genome-wide transcriptional insight into the development of the carpel margin meristem in Arabidopsis.


Subjects
Arabidopsis Proteins/genetics, Arabidopsis/genetics, Plant Gene Expression Regulation, MADS Domain Proteins/genetics, Meristem/genetics, Transcriptome, Arabidopsis/anatomy & histology, Arabidopsis Proteins/isolation & purification, Base Sequence, Chloral Hydrate, Antisense DNA, Flowers/genetics, Plant Genome, In Situ Hybridization, Indoleacetic Acids/pharmacology, MADS Domain Proteins/isolation & purification, Meristem/growth & development, Meristem/metabolism, Confocal Microscopy, Ovule/cytology, Ovule/growth & development, Ovule/metabolism, Protein Isoforms, Protoplasts, Plant RNA/chemistry, Plant RNA/isolation & purification, Seeds/growth & development, Sequence Alignment, Transcription Factors, Transcriptional Activation
4.
BMC Bioinformatics ; 11 Suppl 3: S6, 2010 Apr 29.
Article in English | MEDLINE | ID: mdl-20438653

ABSTRACT

BACKGROUND: In eukaryotes, alternative splicing often generates multiple splice variants from a single gene. Here we explore the use of RNA sequencing (RNA-Seq) datasets to address the isoform quantification problem. Given a set of known splice variants, the goal is to estimate the relative abundance of the individual variants. METHODS: Our method employs a linear models framework to estimate the ratios of known isoforms in a sample. A key feature of our method is that it takes into account the non-uniformity of RNA-Seq read positions along the targeted transcripts. RESULTS: Preliminary tests indicate that the model performs well on both simulated and real data. In two publicly available RNA-Seq datasets, we identified several alternatively-spliced genes with switch-like, on/off expression properties, as well as a number of other genes that varied more subtly in isoform expression. In many cases, genes exhibiting differential expression of alternatively spliced transcripts were not differentially expressed at the gene level. CONCLUSIONS: Given that changes in isoform expression level frequently involve a continuum of isoform ratios, rather than all-or-nothing expression, and that they are often independent of general gene expression changes, we anticipate that our research will contribute to revealing a so far uninvestigated layer of the transcriptome. We believe that, in the future, researchers will prioritize genes for functional analysis based not only on observed changes in gene expression levels, but also on changes in alternative splicing.


Subjects
Alternative Splicing/genetics, Computational Biology/methods, Nucleic Acid Databases, Protein Isoforms/analysis, RNA Sequence Analysis/methods, Algorithms, Arabidopsis/genetics, Base Sequence, Chi-Square Distribution, Computer Simulation, Plant Gene Expression Regulation, Plant Genes, Linear Models, Genetic Models, Molecular Sequence Data, Protein Isoforms/genetics, Physiological Stress
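The idea of estimating isoform ratios with a linear model can be sketched for the simplest case of two known isoforms. The abstract does not give the authors' exact formulation, so everything here is an assumption: the function name, the 0/1 design matrix (which regions each isoform contains), the optional per-region `bias` factors standing in for read-position non-uniformity, and the crude clamp-to-zero step in place of a proper non-negative solver.

```python
def estimate_two_isoform_ratios(design, coverage, bias=None):
    """Least-squares estimate of the relative abundance of two isoforms.

    design:   one row per region, (in_isoform_1, in_isoform_2) as 0/1 flags
    coverage: observed read coverage per region
    bias:     hypothetical per-region correction factors for non-uniform
              read positions (defaults to 1.0, i.e. uniform coverage)
    """
    if bias is None:
        bias = [1.0] * len(coverage)
    y = [c / b for c, b in zip(coverage, bias)]

    # Normal equations (A^T A) a = A^T y for a two-column design matrix.
    s11 = sum(r[0] * r[0] for r in design)
    s12 = sum(r[0] * r[1] for r in design)
    s22 = sum(r[1] * r[1] for r in design)
    t1 = sum(r[0] * v for r, v in zip(design, y))
    t2 = sum(r[1] * v for r, v in zip(design, y))

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("isoforms cannot be distinguished from these regions")

    # Solve the 2x2 system, then clamp negative abundances to zero.
    a1 = max((t1 * s22 - t2 * s12) / det, 0.0)
    a2 = max((s11 * t2 - s12 * t1) / det, 0.0)
    total = a1 + a2
    if total == 0:
        return (0.0, 0.0)
    return (a1 / total, a2 / total)


# Toy gene with three exons: isoform 1 uses all of them, isoform 2 skips exon 2.
design = [(1, 1), (1, 0), (1, 1)]
coverage = [30, 10, 30]
print(estimate_two_isoform_ratios(design, coverage))
```

In the toy data, exon 2 is covered only by isoform 1, so its coverage of 10 fixes that isoform's abundance and the remaining 20 units on the shared exons go to isoform 2, giving ratios of roughly one third and two thirds.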
5.
J Proteome Res ; 9(3): 1209-17, 2010 Mar 05.
Article in English | MEDLINE | ID: mdl-20047314

ABSTRACT

Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism by which the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequences. However, many protein databases only include a limited number of isoforms to keep redundancy minimal. As a result, the database search might not identify a target protein even with high quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.


Subjects
Aspergillus flavus/genetics, Fungal Proteins/genetics, Protein Isoforms/genetics, Alternative Splicing, Aspergillus flavus/metabolism, Computer Simulation, Cytochrome-B(5) Reductase/chemistry, Cytochrome-B(5) Reductase/genetics, Protein Databases, Fungal Proteins/biosynthesis, Fungal Proteins/chemistry, Isotope Labeling, Mass Spectrometry, Genetic Models, Protein Isoforms/biosynthesis, Protein Isoforms/chemistry, Proteomics, Pyruvate Carboxylase/chemistry, Pyruvate Carboxylase/genetics, RNA Splice Sites, Reproducibility of Results
6.
Plant Cell Physiol ; 51(1): 144-63, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19996151

ABSTRACT

As a step toward a comprehensive description of lignin biosynthesis in Populus trichocarpa, we identified from the genome sequence 95 phenylpropanoid gene models in 10 protein families encoding enzymes for monolignol biosynthesis. Transcript abundance was determined for all 95 genes in xylem, leaf, shoot and phloem using quantitative real-time PCR (qRT-PCR). We identified 23 genes that most probably encode monolignol biosynthesis enzymes during wood formation. Transcripts for 18 of the 23 are abundant and specific to differentiating xylem. We found evidence suggesting functional redundancy at the transcript level for phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), 4-coumarate:CoA ligase (4CL), p-hydroxycinnamoyl-CoA:quinate shikimate p-hydroxycinnamoyltransferase (HCT), caffeoyl-CoA O-methyltransferase (CCoAOMT) and coniferyl aldehyde 5-hydroxylase (CAld5H). We carried out an enumeration-based motif identification and discriminant analysis on the promoters of all 95 genes. Five core motifs correctly discriminate the 18 xylem-specific genes from the 77 non-xylem genes. These motifs are similar to promoter elements known to regulate phenylpropanoid gene expression. This work suggests that genes in monolignol biosynthesis are regulated by multiple motifs, often related in sequence.


Subjects
Biosynthetic Pathways/genetics, Lignin/biosynthesis, Lignin/genetics, Populus/genetics, Populus/metabolism, Plant RNA/genetics, Amino Acid Motifs/genetics, Enzymes/biosynthesis, Enzymes/genetics, Enzymologic Gene Expression Regulation/physiology, Plant Gene Expression Regulation/physiology, Plant Genome/genetics, Phloem/enzymology, Phloem/genetics, Plant Shoots/enzymology, Plant Shoots/genetics, Genetic Promoter Regions/genetics, Messenger RNA/analysis, Messenger RNA/genetics, Messenger RNA/metabolism, Plant RNA/analysis, Plant RNA/metabolism, Genetic Transcription/physiology, Xylem/enzymology, Xylem/genetics
7.
Hum Genomics ; 3(3): 221-35, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19403457

ABSTRACT

Transcription factors are key mediators of human complex disease processes. Identifying the target genes of transcription factors will increase our understanding of the biological network leading to disease risk. The prediction of transcription factor binding sites (TFBSs) is one method to identify these target genes; however, current prediction methods need improvement. We chose the transcription factor upstream stimulatory factor 1 (USF1) to evaluate the performance of our novel TFBS prediction method because of its known genetic association with coronary artery disease (CAD) and the recent availability of USF1 chromatin immunoprecipitation microarray (ChIP-chip) results. The specific goals of our study were to develop a novel and accurate genome-scale method for predicting USF1 binding sites and associated target genes to aid in the study of CAD. Previously published USF1 ChIP-chip data for 1 per cent of the genome were used to develop and evaluate several kernel logistic regression prediction models. A combination of genomic features (phylogenetic conservation, regulatory potential, presence of a CpG island and DNaseI hypersensitivity), as well as position weight matrix (PWM) scores, were used as variables for these models. Our most accurate predictor achieved an area under the receiver operating characteristic curve of 0.827 during cross-validation experiments, significantly outperforming standard PWM-based prediction methods. When applied to the whole human genome, we predicted 24,010 USF1 binding sites within 5 kilobases upstream of the transcription start site of 9,721 genes. These predictions included 16 of 20 genes with strong evidence of USF1 regulation. Finally, in the spirit of genomic convergence, we integrated independent experimental CAD data with these USF1 binding site prediction results to develop a prioritised set of candidate genes for future CAD studies. We have shown that our novel prediction method, which employs genomic features related to the presence of regulatory elements, enables more accurate and efficient prediction of USF1 binding sites. This method can be extended to other transcription factors identified in human disease studies to help further our understanding of the biology of complex disease.


Subjects
Cardiovascular Diseases/metabolism, Genomics/methods, Upstream Stimulatory Factors/metabolism, Animals, Binding Sites, Chromatin Immunoprecipitation, Humans, Regression Analysis, Transcription Factors/metabolism
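The position weight matrix (PWM) score that this abstract uses as one model variable, and as the baseline it compares against, can be sketched as a log-odds scan over a sequence. This is only the standard PWM component, not the authors' kernel logistic regression model. The count matrix below is invented toy data built around the E-box core CACGTG, and the function names, pseudocount, and uniform background are assumptions.

```python
import math

BASES = "ACGT"


def build_log_odds_pwm(count_matrix, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into log2 odds against a flat background."""
    pwm = []
    for column in count_matrix:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_pwm_hit(pwm, sequence):
    """Return (start, score) of the highest-scoring window, or None if no window fits."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Toy counts: ten aligned sites that all read CACGTG (illustrative only).
toy_counts = [{base: 10} for base in "CACGTG"]
pwm = build_log_odds_pwm(toy_counts)
print(best_pwm_hit(pwm, "TTCACGTGAA"))
```

In a feature-based model such as the one described, a score like this would be one input among several (conservation, CpG islands, DNaseI hypersensitivity) rather than the final prediction.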
8.
J Comput Biol ; 27(2): 259-268, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31855064

ABSTRACT

ChIP-Seq blacklists contain genomic regions that frequently produce artifacts and noise in ChIP-Seq experiments. To improve signal-to-noise ratio, ChIP-Seq pipelines often remove data points that map to blacklist regions. Existing blacklists have been compiled in a manual or semiautomated way. In this article we describe PeakPass, an efficient method to generate blacklists, and demonstrate that blacklists can increase ChIP-Seq data quality. PeakPass leverages machine learning and attempts to automate blacklist generation. PeakPass uses a random forest classifier in combination with genomic features such as sequence, annotated repeats, complexity, assembly gaps, and the ratio of multimapping to uniquely mapping reads to identify artifact regions. We have validated PeakPass on a large data set and tested it for the purpose of upgrading a blacklist to a new reference genome version. We trained PeakPass on the ENCODE blacklist for the hg19 human reference genome, and created an updated blacklist for hg38. To assess the performance of this blacklist, we tested 42 ChIP-Seq replicates from 24 experiments using 10 ChIP-Seq quality metrics including relative strand coefficient, standardized standard deviation, and enrichment of reads in promoter regions. Using the blacklist generated by PeakPass resulted in a statistically significant improvement for nine of these metrics.
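The filtering step that a blacklist feeds into, dropping data points that map to blacklisted regions, can be sketched as a simple interval-overlap test. This does not reproduce PeakPass or its random forest classifier; the function names and the tuple layout (chromosome, start, end, with half-open coordinates as in BED files) are assumptions.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def remove_blacklisted(peaks, blacklist):
    """Keep only peaks that do not overlap any blacklist region.

    peaks, blacklist: lists of (chrom, start, end) tuples.
    """
    by_chrom = {}
    for chrom, start, end in blacklist:
        by_chrom.setdefault(chrom, []).append((start, end))

    kept = []
    for chrom, start, end in peaks:
        regions = by_chrom.get(chrom, [])
        if not any(overlaps(start, end, b_start, b_end) for b_start, b_end in regions):
            kept.append((chrom, start, end))
    return kept


# Toy coordinates, not taken from any real blacklist.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 300)]
print(remove_blacklisted(peaks, blacklist))
```

The linear scan per peak is fine for a sketch; for genome-scale peak sets a sorted or interval-tree lookup would be the more usual choice.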

9.
BMC Bioinformatics ; 10: 191, 2009 Jun 22.
Article in English | MEDLINE | ID: mdl-19545436

ABSTRACT

BACKGROUND: Quality assessment of microarray data is an important and often challenging aspect of gene expression analysis. This task frequently involves the examination of a variety of summary statistics and diagnostic plots. The interpretation of these diagnostics is often subjective, and generally requires careful expert scrutiny. RESULTS: We show how an unsupervised classification technique based on the Expectation-Maximization (EM) algorithm and the naïve Bayes model can be used to automate microarray quality assessment. The method is flexible and can be easily adapted to accommodate alternate quality statistics and platforms. We evaluate our approach using Affymetrix 3' gene expression and exon arrays and compare the performance of this method to a similar supervised approach. CONCLUSION: This research illustrates the efficacy of an unsupervised classification approach for the purpose of automated microarray data quality assessment. Since our approach requires only unannotated training data, it is easy to customize and to keep up-to-date as technology evolves. In contrast to other "black box" classification systems, this method also allows for intuitive explanations.


Subjects
Oligonucleotide Array Sequence Analysis/methods, Algorithms, Computational Biology/methods, Gene Expression Profiling/methods, Normal Distribution
10.
Evol Dev ; 11(1): 50-68, 2009.
Article in English | MEDLINE | ID: mdl-19196333

ABSTRACT

microRNAs (miRNAs) are approximately 22-nucleotide noncoding RNA regulatory genes that are key players in cellular differentiation and homeostasis. They might also play important roles in shaping metazoan macroevolution. Previous studies have shown that miRNAs are continuously being added to metazoan genomes through time, and, once integrated into gene regulatory networks, show only rare mutations within the primary sequence of the mature gene product and are only rarely secondarily lost. However, because the conclusions from these studies were largely based on phylogenetic conservation of miRNAs between model systems like Drosophila and the taxon of interest, it was unclear if these trends would describe most miRNAs in most metazoan taxa. Here, we describe the shared complement of miRNAs among 18 animal species using a combination of 454 sequencing of small RNA libraries with genomic searches. We show that the evolutionary trends elucidated from the model systems are generally true for all miRNA families and metazoan taxa explored: the continuous addition of miRNA families with only rare substitutions to the mature sequence, and only rare instances of secondary loss. Despite this conservation, we document evolutionarily stable shifts to the determination of position 1 of the mature sequence, a phenomenon we call seed shifting, as well as the ability to post-transcriptionally edit the 5' end of the mature read, changing the identity of the seed sequence and possibly the repertoire of downstream targets. Finally, we describe a novel type of miRNA in demosponges that, although it shows a different pre-miRNA structure, still shows remarkable conservation of the mature sequence in the two sponge species analyzed. We propose that miRNAs might be excellent phylogenetic markers, and suggest that the advent of morphological complexity might have its roots in miRNA innovation.


Subjects
Chordata/genetics, Molecular Evolution, Invertebrates/genetics, MicroRNAs/genetics, Phylogeny, Animals, Base Sequence, Northern Blotting, Computational Biology, Conserved Sequence/genetics, DNA Primers/genetics, Gene Library, Molecular Sequence Data, Porifera/genetics, DNA Sequence Analysis, Species Specificity
11.
Article in English | MEDLINE | ID: mdl-17277413

ABSTRACT

Accurate base-assignment in repeat regions of a whole genome shotgun assembly is an unsolved problem. Since reads in repeat regions cannot be easily attributed to a unique location in the genome, current assemblers may place these reads arbitrarily. As a result, the base-assignment error rate in repeats is likely to be much higher than that in the rest of the genome. We developed an iterative algorithm, EULER-AIR, that is able to correct base-assignment errors in finished genome sequences in public databases. The Wolbachia genome is among the best finished genomes. Using this genome project as an example, we demonstrated that EULER-AIR can 1) discover and correct base-assignment errors, 2) provide accurate read assignments, 3) utilize finishing reads for accurate base-assignment, and 4) provide guidance for designing finishing experiments. In the genome of Wolbachia, EULER-AIR found 16 positions with ambiguous base-assignment and two positions with erroneous bases. Besides Wolbachia, many other genome sequencing projects have significantly fewer finishing reads and, hence, are likely to contain more base-assignment errors in repeats. We demonstrate that EULER-AIR is a software tool that can be used to find and correct base-assignment errors in a genome assembly project.


Subjects
Computational Biology/methods, Statistical Models, Nucleic Acid Repetitive Sequences/genetics, DNA Sequence Analysis/methods, Algorithms, Campylobacter jejuni/genetics, Cluster Analysis, Bacterial Genome, Lactococcus lactis/genetics, Sequence Alignment, DNA Sequence Analysis/statistics & numerical data, Software, Staphylococcus epidermidis/genetics, Wolbachia/genetics
12.
Nucleic Acids Res ; 33(Web Server issue): W638-43, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980552

ABSTRACT

The Remote Analysis Computation for gene Expression data (RACE) suite is a collection of bioinformatics web tools designed for the analysis of DNA microarray data. RACE performs probe-level data preprocessing, extensive quality checks, data visualization and data normalization for Affymetrix GeneChips. In addition, it offers differential expression analysis on normalized expression levels from any array platform. RACE estimates the false discovery rates of lists of potentially regulated genes and provides a Gene Ontology-term analysis tool for GeneChip data to support the biological interpretation and annotation of results. The analysis is fully automated but can be customized by flexible parameter settings. To offer a convenient starting point for subsequent analyses, and to provide maximum transparency, the R scripts used to generate the results can be downloaded along with the output files. RACE is freely available for use at http://race.unil.ch.


Subjects
Computational Biology, Gene Expression Profiling/methods, Oligonucleotide Array Sequence Analysis/methods, Software, Computer Graphics, Gene Expression Profiling/standards, Internet, Oligonucleotide Array Sequence Analysis/standards, Quality Control, User-Computer Interface
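The abstract says RACE estimates false discovery rates for lists of candidate genes but does not name the procedure. As an illustration only, the sketch below uses the widely known Benjamini-Hochberg adjustment; that choice, the function name, and the toy p-values are assumptions and should not be read as the method RACE implements.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values for four genes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted value falls below a chosen threshold (0.05 is a common convention) would be reported as potentially regulated at that estimated false discovery rate.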
13.
J Leukoc Biol ; 102(6): 1371-1380, 2017 12.
Article in English | MEDLINE | ID: mdl-29021367

ABSTRACT

The vertebrate immune response comprises multiple molecular and cellular components that interface to provide defense against pathogens. Because of the dynamic complexity of the immune system and its interdependent innate and adaptive functionality, an understanding of the whole-organism response to pathogen exposure remains unresolved. Zebrafish larvae provide a unique model for overcoming this obstacle, because larvae are protected against pathogens while lacking a functional adaptive immune system during the first few weeks of life. Zebrafish larvae were exposed to immune agonists for various lengths of time, and a microarray transcriptome analysis was executed. This strategy identified known immune response genes, as well as genes with unknown immune function, including the E3 ubiquitin ligase tripartite motif-9 (Trim9). Although trim9 expression was originally described as "brain specific," its expression has been reported in stimulated human Mϕs. In this study, we found elevated levels of trim9 transcripts in vivo in zebrafish Mϕs after immune stimulation. Trim9 has been implicated in axonal migration, and we therefore investigated the impact of Trim9 disruption on Mϕ motility and found that Mϕ chemotaxis and cellular architecture are subsequently impaired in vivo. These results demonstrate that Trim9 mediates cellular movement and migration in Mϕs as well as neurons.


Subjects
Cell Movement, Macrophages/cytology, Macrophages/metabolism, Nerve Tissue Proteins/metabolism, Tripartite Motif Proteins/metabolism, Ubiquitin-Protein Ligases/metabolism, Zebrafish Proteins/metabolism, Animals, Cell Movement/genetics, Cell Shape, Chemotaxis, Humans, Nerve Tissue Proteins/genetics, Messenger RNA/genetics, Messenger RNA/metabolism, Tripartite Motif Proteins/genetics, U937 Cells, Ubiquitin-Protein Ligases/genetics, Zebrafish/genetics, Zebrafish/immunology, Zebrafish Proteins/genetics
14.
OMICS ; 10(3): 358-68, 2006.
Article in English | MEDLINE | ID: mdl-17069513

ABSTRACT

Affymetrix GeneChips are one of the best established microarray platforms. This powerful technique allows users to measure the expression of thousands of genes simultaneously. However, a microarray experiment is a sophisticated and time-consuming endeavor with many potential sources of unwanted variation that could compromise the results if left uncontrolled. Increasing data volume and data complexity have triggered growing concern and awareness of the importance of assessing the quality of generated microarray data. In this review, we give an overview of current methods and software tools for quality assessment of Affymetrix GeneChip data. We focus on quality metrics, diagnostic plots, probe-level methods, pseudo-images, and classification methods to identify corrupted chips. We also describe RNA quality assessment methods, which play an important role for challenging RNA sources like formalin-embedded biopsies, laser-microdissected samples, or single cells. No wet-lab methods are discussed in this paper.


Subjects
Oligonucleotide Array Sequence Analysis/methods, Oligonucleotide Array Sequence Analysis/trends, Animals, Gene Expression Profiling, Humans
15.
J Mass Spectrom ; 41(3): 281-8, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16538648

ABSTRACT

The number and wide dynamic range of components found in biological matrixes present several challenges for global proteomics. In this perspective, we will examine the potential of zero-dimensional (0D), one-dimensional (1D), and two-dimensional (2D) separations coupled with Fourier-transform ion cyclotron resonance (FT-ICR) and time-of-flight (TOF) mass spectrometry (MS) for the analysis of complex mixtures. We describe and further develop previous reports on the space occupied by peptides, to calculate the theoretical peak capacity available to each separations-mass spectrometry method examined. Briefly, the peak capacity attainable by each of the mass analyzers was determined from the mass resolving power (RP) and the m/z space occupied by peptides considered from the mass distribution of tryptic peptides from National Center for Biotechnology Information's (NCBI's) nonredundant database. Our results indicate that reverse-phase-nanoHPLC (RP-nHPLC) separation coupled with FT-ICR MS offers an order of magnitude improvement in peak capacity over RP-nHPLC separation coupled with TOF MS. The addition of an orthogonal separation method, strong cation exchange (SCX), for 2D LC-MS demonstrates an additional 10-fold improvement in peak capacity over 1D LC-MS methods. Peak capacity calculations for 0D LC, two different 1D RP-HPLC methods, and 2D LC (with various numbers of SCX fractions) for both RP-HPLC methods coupled to FT-ICR and TOF MS are examined in detail. Peak capacity production rates, which take into account the total analysis time, are also considered for each of the methods. Furthermore, the significance of the space occupied by peptides is discussed.


Subjects
Two-Dimensional Gel Electrophoresis, Mass Spectrometry, Peptides/analysis, Proteomics/instrumentation, Proteomics/methods, Biotechnology/instrumentation, Biotechnology/methods, Fourier Analysis
16.
Nucleic Acids Res ; 32(13): 3977-83, 2004.
Article in English | MEDLINE | ID: mdl-15292448

ABSTRACT

Alternative splicing essentially increases the diversity of the transcriptome and has important implications for physiology, development and the genesis of diseases. Conventionally, alternative splicing is investigated in a case-by-case fashion, but this becomes cumbersome and error prone if genes show a huge abundance of different splice variants. We use a different approach and integrate all transcripts derived from a gene into a single splicing graph. Each transcript corresponds to a path in the graph, and alternative splicing is displayed by bifurcations. This representation preserves the relationships between different splicing variants and allows us to investigate systematically all possible putative transcripts. We built a database of splicing graphs for human genes, using transcript information from various major sources (Ensembl, RefSeq, STACK, TIGR and UniGene). A Web interface allows users to display the splicing graphs, to interactively assemble transcripts and to access their sequences as well as neighboring genomic regions. We also provide for each gene an exhaustive pre-computed catalog of putative transcripts--in total more than 1.2 million sequences. We found that approximately 65% of the investigated genes show evidence for alternative splicing, and in 5% of the cases, a single gene might produce over 100 transcripts.


Subjects
Alternative Splicing, Software, Computer Graphics, Nucleic Acid Databases, Human Genome, Humans, Internet, RNA Splice Sites, Sequence Alignment, DNA Sequence Analysis, RNA Sequence Analysis, Genetic Transcription
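The splicing-graph representation described in this abstract, where each transcript is a path and alternative splicing appears as a bifurcation, can be sketched as path enumeration over a small directed acyclic graph. The adjacency-dictionary layout, the exon labels, and the function name are assumptions for illustration; the database's actual data model is not given in the abstract.

```python
def enumerate_transcripts(graph, source, sink):
    """List every path from source to sink in an acyclic splicing graph.

    graph: dict mapping a node (exon) to the nodes it can be spliced to.
    Each returned path is one putative transcript.
    """
    paths = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == sink:
            paths.append(path)
            continue
        for successor in graph.get(node, []):
            stack.append((successor, path + [successor]))
    return paths


# Toy gene: E1 can splice to E2 or E3, and E2 can splice to E3 or skip to E4.
splicing_graph = {
    "E1": ["E2", "E3"],
    "E2": ["E3", "E4"],
    "E3": ["E4"],
}
for transcript in sorted(enumerate_transcripts(splicing_graph, "E1", "E4")):
    print("-".join(transcript))
```

Because the number of paths multiplies at every bifurcation, a handful of alternative events is enough to produce a very large catalog, which fits the abstract's observation that a single gene might yield over 100 putative transcripts.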
17.
Genet Mol Res ; 5(1): 224-32, 2006 Mar 31.
Article in English | MEDLINE | ID: mdl-16755513

ABSTRACT

Analysis of gene deletions is a fundamental approach for investigating gene function. We evaluated an algorithm that uses classification techniques to predict the phenotypic effects of gene deletions in yeast. We used a modified simulated annealing algorithm for feature selection and weighting. The selected features with high weights were phylogenetic conservation scores for bacteria, fungi (excluding Ascomycota), Ascomycota (excluding Saccharomyces cerevisiae), plants, and mammals, degree of paralogy, and number of protein-protein interactions. Classification was performed by weighted k-nearest neighbor and with support vector machine algorithms. To demonstrate how this approach might complement existing experimental procedures, we applied our algorithm to predict essential genes and genes causing morphological alterations in yeast.


Subjects
Algorithms , Gene Deletion , Fungal Genes/genetics , Phenotype , Yeasts/genetics , Animals , Mutation
18.
IEEE Trans Nanobioscience ; 15(2): 148-57, 2016 03.
Article in English | MEDLINE | ID: mdl-26886998

ABSTRACT

Upstream open reading frames (uORFs) are open reading frames that occur within the 5' UTR of an mRNA. uORFs have been found in many organisms. They play an important role in gene regulation, cell development, and in various metabolic processes. It is believed that translated uORFs reduce the translational efficiency of the main coding region. However, only a few uORFs have been experimentally characterized. In this paper, we use ribosome footprinting together with a semi-supervised approach based on stacking classification models to identify translated uORFs in Arabidopsis thaliana. Our approach identified 5360 potentially translated uORFs in 2051 genes. GO terms enriched in genes with translated uORFs include catalytic activity, binding, transferase activity, phosphotransferase activity, kinase activity, and transcription regulator activity. The reported uORFs occur with a higher frequency in multi-isoform genes, and some uORFs are affected by alternative transcript start sites or alternative splicing events. Association rule mining revealed sequence features associated with the translation status of the uORFs. We hypothesize that uORF translation is a complex process that might be regulated by multiple factors. The identified uORFs are available online at: https://www.dropbox.com/sh/zdutupedxafhly8/AABFsdNR5zDfiozB7B4igFcja?dl=0. This paper is the extended version of our research presented at ISBRA 2015.
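
The definition of a uORF given at the start of this abstract can be made concrete with a short sequence scan. The sketch below only enumerates candidate uORFs from sequence. It assumes a DNA-alphabet mRNA string, a known 0-based position of the main start codon, and ATG as the only start codon. It does not reproduce the ribosome footprinting data or the stacking classifier used in the paper, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(mrna, cds_start):
    """Return (start, end) pairs, 0-based half-open, for ORFs whose ATG lies in the 5' UTR.

    `cds_start` is the 0-based index of the main coding region's start codon.
    """
    mrna = mrna.upper()
    uorfs = []
    # Only consider start codons that lie entirely upstream of the main start codon.
    for start in range(0, cds_start - 2):
        if mrna[start:start + 3] != "ATG":
            continue
        # Read codon by codon in the frame of this ATG until the first stop codon.
        for pos in range(start + 3, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


# Toy sequence: 13-nt 5' UTR containing ATG AAA TAG, followed by the main ORF ATG GCT TAA.
toy_mrna = "GGATGAAATAGCC" + "ATGGCTTAA"
candidates = find_uorfs(toy_mrna, cds_start=13)
```

An upstream ATG that is in frame with the main coding region and has no intervening stop would be reported here with the main stop codon as its end. Such cases are usually treated as N-terminal extensions and would need separate filtering in a real analysis.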


Subjects
Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Open Reading Frames/genetics , Protein Footprinting/methods , Ribosomes/metabolism , Algorithms , Cluster Analysis , Plant Genome , Plant RNA , Ribosomes/genetics , RNA Sequence Analysis , Supervised Machine Learning
19.
Drug Discov Today ; 8(3): 113-4, 2003 Feb 01.
Article in English | MEDLINE | ID: mdl-12568779

ABSTRACT

Highlights from the first European Conference on Computational Biology (ECCB 2002), held in conjunction with the German Conference on Bioinformatics (GCB 2002), 6-9 October 2002, Saarbrücken, Germany.


Subjects
Computational Biology/methods , Computational Biology/trends , Europe , Humans
20.
Neuron ; 84(2): 386-98, 2014 Oct 22.
Article in English | MEDLINE | ID: mdl-25284007

ABSTRACT

Molecular diversity of surface receptors has been hypothesized to provide a mechanism for selective synaptic connectivity. Neurexins are highly diversified receptors that drive the morphological and functional differentiation of synapses. Using a single cDNA sequencing approach, we detected 1,364 unique neurexin-α and 37 neurexin-β mRNAs produced by alternative splicing of neurexin pre-mRNAs. This molecular diversity results from near-exhaustive combinatorial use of alternative splice insertions in Nrxn1α and Nrxn2α. By contrast, Nrxn3α exhibits several highly stereotyped exon selections that incorporate novel elements for posttranscriptional regulation of a subset of transcripts. Complexity of Nrxn1α repertoires correlates with the cellular complexity of neuronal tissues, and a specific subset of isoforms is enriched in a purified cell type. Our analysis defines the molecular diversity of a critical synaptic receptor and provides evidence that neurexin diversity is linked to cellular diversity in the nervous system.


Subjects
Alternative Splicing , Brain/metabolism , Exons/genetics , Nerve Tissue Proteins/genetics , Messenger RNA/metabolism , Animals , Mice , Nerve Tissue Proteins/metabolism , Neurons/metabolism , Protein Isoforms/genetics , Protein Isoforms/metabolism , Synapses/metabolism