Results 1 - 11 of 11
1.
Nature ; 501(7468): 506-11, 2013 Sep 26.
Article in English | MEDLINE | ID: mdl-24037378

ABSTRACT

Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.


Subjects
Genetic Variation/genetics, Human Genome/genetics, High-Throughput Nucleotide Sequencing, RNA Sequence Analysis, Transcriptome/genetics, Alleles, Transformed Cell Line, Exons/genetics, Gene Expression Profiling, Humans, Single Nucleotide Polymorphism/genetics, Quantitative Trait Loci/genetics, Messenger RNA/analysis, Messenger RNA/genetics
2.
BMC Genomics ; 18(1): 7, 2017 01 03.
Article in English | MEDLINE | ID: mdl-28049418

ABSTRACT

BACKGROUND: Chimeric transcripts are commonly defined as transcripts linking two or more different genes in the genome, and can be explained by various biological mechanisms such as genomic rearrangement, read-through or trans-splicing, but also by technical or biological artefacts. Several studies have shown their importance in cancer, cell pluripotency and motility. Many programs have recently been developed to identify chimeras from Illumina RNA-seq data (mostly fusion genes in cancer). However outputs of different programs on the same dataset can be widely inconsistent, and tend to include many false positives. Other issues relate to simulated datasets restricted to fusion genes, real datasets with limited numbers of validated cases, result inconsistencies between simulated and real datasets, and gene rather than junction level assessment. RESULTS: Here we present ChimPipe, a modular and easy-to-use method to reliably identify fusion genes and transcription-induced chimeras from paired-end Illumina RNA-seq data. We have also produced realistic simulated datasets for three different read lengths, and enhanced two gold-standard cancer datasets by associating exact junction points to validated gene fusions. Benchmarking ChimPipe together with four other state-of-the-art tools on this data showed ChimPipe to be the top program at identifying exact junction coordinates for both kinds of datasets, and the one showing the best trade-off between sensitivity and precision. Applied to 106 ENCODE human RNA-seq datasets, ChimPipe identified 137 high confidence chimeras connecting the protein coding sequence of their parent genes. In subsequent experiments, three out of four predicted chimeras, two of which recurrently expressed in a large majority of the samples, could be validated. Cloning and sequencing of the three cases revealed several new chimeric transcript structures, 3 of which with the potential to encode a chimeric protein for which we hypothesized a new role. 
Applying ChimPipe to human and mouse ENCODE RNA-seq data led to the identification of 131 recurrent chimeras common to both species, and therefore potentially conserved. CONCLUSIONS: ChimPipe combines discordant paired-end reads and split-reads to detect any kind of chimera, including those originating from polymerase read-through, and shows an excellent trade-off between sensitivity and precision. The chimeras found by ChimPipe can be validated in vitro with high accuracy.
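The idea of combining two kinds of read evidence can be illustrated with a toy sketch. This is not ChimPipe's implementation or interface; the `ReadPair` record, the gene labels, and the thresholds are hypothetical, and each read is assumed to have already been assigned to genes by some upstream aligner. A split read is taken to be a mate whose segments hit two different genes, and a discordant pair is one whose mates each hit a single, different gene.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical record: genes hit by the aligned segments of each mate."""
    mate1_genes: Tuple[str, ...]
    mate2_genes: Tuple[str, ...]


GenePair = Tuple[str, str]


def chimera_support(pairs: Iterable[ReadPair]) -> Dict[GenePair, Dict[str, int]]:
    """Count split-read and discordant-pair evidence per unordered gene pair."""
    support: Dict[GenePair, Dict[str, int]] = defaultdict(
        lambda: {"split": 0, "discordant": 0}
    )
    for pair in pairs:
        # Split read: one mate spans a junction between two different genes.
        for genes in (pair.mate1_genes, pair.mate2_genes):
            distinct = sorted(set(genes))
            if len(distinct) == 2:
                support[(distinct[0], distinct[1])]["split"] += 1
        # Discordant pair: each mate maps to one gene, and the genes differ.
        g1, g2 = set(pair.mate1_genes), set(pair.mate2_genes)
        if len(g1) == 1 and len(g2) == 1 and g1 != g2:
            a, b = sorted((pair.mate1_genes[0], pair.mate2_genes[0]))
            support[(a, b)]["discordant"] += 1
    return dict(support)


def candidates(pairs: Iterable[ReadPair],
               min_split: int = 1,
               min_discordant: int = 1) -> List[GenePair]:
    """Keep gene pairs supported by both evidence types (assumed thresholds)."""
    return sorted(
        gene_pair
        for gene_pair, counts in chimera_support(pairs).items()
        if counts["split"] >= min_split and counts["discordant"] >= min_discordant
    )
```

Requiring both evidence types in this sketch is one simple way to trade sensitivity for precision; a gene pair seen only in discordant pairs is counted but not reported.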


Subjects
Oncogene Fusion Proteins, Genetic Recombination, Software, Genetic Transcription, Animals, Computational Biology/methods, Computer Simulation, Genomics/methods, High-Throughput Nucleotide Sequencing, Humans, Mice, Reproducibility of Results, RNA Sequence Analysis
3.
Nucleic Acids Res ; 40(20): 10073-83, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-22962361

ABSTRACT

High-throughput sequencing of cDNA libraries constructed from cellular RNA complements (RNA-Seq) naturally provides a digital quantitative measurement for every expressed RNA molecule. Nature, impact and mutual interference of biases in different experimental setups are, however, still poorly understood, mostly due to the lack of data from intermediate protocol steps. We analysed multiple RNA-Seq experiments, involving different sample preparation protocols and sequencing platforms: we broke them down into their common--and currently indispensable--technical components (reverse transcription, fragmentation, adapter ligation, PCR amplification, gel segregation and sequencing), investigating how such different steps influence abundance and distribution of the sequenced reads. For each of those steps, we developed universally applicable models, which can be parameterised by empirical attributes of any experimental protocol. Our models are implemented in a computer simulation pipeline called the Flux Simulator, and we show that read distributions generated by different combinations of these models closely reproduce the evidence obtained from the corresponding experimental setups. We further demonstrate that our in silico RNA-Seq provides insights about hidden precursors that determine the final configuration of reads along gene bodies; enhancing or compensatory effects that explain apparently controversial observations can be observed. Moreover, our simulations identify hitherto unreported sources of systematic bias from RNA hydrolysis, a fragmentation technique currently employed by most RNA-Seq protocols.


Subjects
Computer Simulation, Gene Expression Profiling, High-Throughput Nucleotide Sequencing, RNA Sequence Analysis, Hydrolysis, RNA/metabolism
4.
Sci Rep ; 11(1): 12848, 2021 06 18.
Article in English | MEDLINE | ID: mdl-34145303

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a destructive inflammatory disease and the genes expressed within the lung are crucial to its pathophysiology. We have determined the RNAseq transcriptome of bronchial brush cells from 312 stringently defined ex-smoker patients. Compared to healthy controls there were for males 40 differentially expressed genes (DEGs) and 73 DEGs for females with only 26 genes shared. The gene ontology (GO) term "response to bacterium" was shared, with several different DEGs contributing in males and females. Strongly upregulated genes TCN1 and CYP1B1 were unique to males and females, respectively. For male emphysema (E)-dominant and airway disease (A)-dominant COPD (defined by computed tomography) the term "response to stress" was found for both sub-phenotypes, but this included distinct up-regulated genes for the E-sub-phenotype (neutrophil-related CSF3R, CXCL1, MNDA) and for the A-sub-phenotype (macrophage-related KLF4, F3, CD36). In E-dominant disease, a cluster of mitochondria-encoded (MT) genes forms a signature, able to identify patients with emphysema features in a confirmation cohort. The MT-CO2 gene is upregulated transcriptionally in bronchial epithelial cells with the copy number essentially unchanged. Both MT-CO2 and the neutrophil chemoattractant CXCL1 are induced by reactive oxygen in bronchial epithelial cells. Of the female DEGs unique for E- and A-dominant COPD, 88% were detected in females only. In E-dominant disease we found a pronounced expression of mast cell-associated DEGs TPSB2, TPSAB1 and CPA3. The differential genes discovered in this study point towards involvement of different types of leukocytes in the E- and A-dominant COPD sub-phenotypes in males and females.


Subjects
Disease Susceptibility, Gene Expression, Leukocytes/metabolism, Mitochondria/genetics, Chronic Obstructive Pulmonary Disease/etiology, Chronic Obstructive Pulmonary Disease/metabolism, Respiratory Mucosa/metabolism, Biomarkers, Computational Biology/methods, Female, Gene Expression Profiling, Humans, Kruppel-Like Factor 4, Leukocytes/immunology, Leukocytes/pathology, Male, Mitochondria/metabolism, Chronic Obstructive Pulmonary Disease/pathology, Respiratory Mucosa/immunology, Respiratory Mucosa/pathology, Sex Factors, Transcriptome
5.
BMC Bioinformatics ; 10: 50, 2009 Feb 06.
Article in English | MEDLINE | ID: mdl-19200358

ABSTRACT

BACKGROUND: Understanding transcriptional regulation by genome-wide microarray studies can help unravel complex relationships between genes. Attempts to standardize the annotation of microarray data include the Minimum Information About a Microarray Experiment (MIAME) recommendations, the MAGE-ML format for data interchange, and the use of controlled vocabularies or ontologies. The existing software systems for microarray data analysis implement the mentioned standards only partially and are often hard to use and extend. Integration of genomic annotation data and other sources of external knowledge using open standards is therefore a key requirement for future integrated analysis systems. RESULTS: The EMMA 2 software has been designed to resolve shortcomings with respect to full MAGE-ML and ontology support and makes use of modern data integration techniques. We present a software system that features comprehensive data analysis functions for spotted arrays, and for the most common synthesized oligo arrays such as Agilent, Affymetrix and NimbleGen. The system is based on the full MAGE object model. Analysis functionality is based on R and Bioconductor packages and can make use of a compute cluster for distributed services. CONCLUSION: Our model-driven approach for automatically implementing a full MAGE object model provides high flexibility and compatibility. Data integration via SOAP-based web-services is advantageous in a distributed client-server environment as the collaborative analysis of microarray data is gaining more and more relevance in international research consortia. The adequacy of the EMMA 2 software design and implementation has been proven by its application in many distributed functional genomics projects. Its scalability makes the current architecture suited for extensions towards future transcriptomics methods based on high-throughput sequencing approaches which have much higher computational requirements than microarrays.


Subjects
Gene Expression Profiling/methods, Oligonucleotide Array Sequence Analysis/methods, Software, Genetic Databases, Genome, Internet, User-Computer Interface
6.
Bioinformatics ; 24(20): 2399-400, 2008 Oct 15.
Article in English | MEDLINE | ID: mdl-18632748

ABSTRACT

Estimating Phylogenies of Species (EPoS) is a modular software framework for phylogenetic analysis, visualization and data management. It provides a plugin-based system that integrates a storage facility, a rich user interface and the ability to easily incorporate new methods, functions and visualizations. EPoS ships with persistent data management, a set of well-known phylogenetic algorithms and a multitude of tree visualization methods and layouts. Implemented algorithms cover distance-based tree construction, consensus trees and various graph-based supertree methods. The rendering system can be customized for, say, different edge and node styles.


Subjects
Phylogeny, Software, Algorithms, Computational Biology/methods, User-Computer Interface
7.
Diabetes ; 63(6): 1978-93, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24379348

ABSTRACT

Pancreatic β-cell dysfunction and death are central in the pathogenesis of type 2 diabetes (T2D). Saturated fatty acids cause β-cell failure and contribute to diabetes development in genetically predisposed individuals. Here we used RNA sequencing to map transcripts expressed in five palmitate-treated human islet preparations, observing 1,325 modified genes. Palmitate induced fatty acid metabolism and endoplasmic reticulum (ER) stress. Functional studies identified novel mediators of adaptive ER stress signaling. Palmitate modified genes regulating ubiquitin and proteasome function, autophagy, and apoptosis. Inhibition of autophagic flux and lysosome function contributed to lipotoxicity. Palmitate inhibited transcription factors controlling β-cell phenotype, including PAX4 and GATA6. Fifty-nine T2D candidate genes were expressed in human islets, and 11 were modified by palmitate. Palmitate modified expression of 17 splicing factors and shifted alternative splicing of 3,525 transcripts. Ingenuity Pathway Analysis of modified transcripts and genes confirmed that top changed functions related to cell death. Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis of transcription factor binding sites in palmitate-modified transcripts revealed a role for PAX4, GATA, and the ER stress response regulators XBP1 and ATF6. This human islet transcriptome study identified novel mechanisms of palmitate-induced β-cell dysfunction and death. The data point to cross talk between metabolic stress and candidate genes at the β-cell level.


Subjects
Type 2 Diabetes Mellitus/genetics, Endoplasmic Reticulum Stress/genetics, Inflammation/genetics, Islets of Langerhans/metabolism, Palmitates/metabolism, RNA Sequence Analysis, Animals, Apoptosis/drug effects, Apoptosis/genetics, Western Blotting, Cell Line, Cultured Cells, Type 2 Diabetes Mellitus/metabolism, Endoplasmic Reticulum Stress/drug effects, Female, Enzymologic Gene Expression Regulation, Genetic Predisposition to Disease, Humans, Inflammation/metabolism, Islets of Langerhans/drug effects, Male, Signal Transduction, Transcriptome
8.
Adv Bioinformatics ; 2011: 524182, 2011.
Article in English | MEDLINE | ID: mdl-22229028

ABSTRACT

Supertree methods make it possible to reconstruct large phylogenetic trees by combining smaller trees with overlapping leaf sets into one, more comprehensive supertree. The most commonly used supertree method, matrix representation with parsimony (MRP), produces accurate supertrees but is rather slow due to the underlying hard optimization problem. In this paper, we present an extensive simulation study comparing the performance of MRP and the polynomial supertree methods MinCut Supertree, Modified MinCut Supertree, Build-with-distances, PhySIC, PhySIC_IST, and super distance matrix. We consider both quality and resolution of the reconstructed supertrees. Our findings illustrate the tradeoff between accuracy and running time in supertree construction, as well as the pros and cons of voting- and veto-based supertree approaches. Based on our results, we make some general suggestions for supertree methods yet to come.
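The matrix-representation step of MRP can be sketched briefly. This is a minimal illustration of the standard encoding, not the code used in the study. Each non-root clade of each source tree becomes one binary character: taxa inside the clade are coded 1, other taxa of that tree 0, and taxa absent from that tree `?`. Representing trees as nested tuples and the function names are assumptions made for this example, and the parsimony search that makes MRP slow is not shown.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# Assumed representation: a rooted tree is a leaf name or a tuple of subtrees,
# e.g. (("A", "B"), "C").
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    result: FrozenSet[str] = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def clusters(tree: Tree) -> List[FrozenSet[str]]:
    """Leaf sets of all internal nodes except the root, in preorder."""
    found: List[FrozenSet[str]] = []

    def visit(node: Tree, is_root: bool) -> None:
        if isinstance(node, str):
            return
        if not is_root:
            found.append(leaves(node))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found


def mrp_matrix(trees: List[Tree]) -> Dict[str, str]:
    """Encode source trees as one row of 1/0/? characters per taxon."""
    all_taxa = sorted(set().union(*(leaves(t) for t in trees)))
    rows: Dict[str, List[str]] = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        for cluster in clusters(tree):
            for taxon in all_taxa:
                if taxon in cluster:
                    rows[taxon].append("1")
                elif taxon in present:
                    rows[taxon].append("0")
                else:
                    rows[taxon].append("?")  # taxon missing from this tree
    return {taxon: "".join(chars) for taxon, chars in rows.items()}
```

For the source trees `(("A", "B"), "C")` and `(("B", "C"), "D")` the matrix has two columns, one per non-root clade, and a parsimony program would then search for the tree that best explains those columns.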

9.
PLoS One ; 6(11): e27507, 2011.
Article in English | MEDLINE | ID: mdl-22102902

ABSTRACT

Although the fungal order Mortierellales constitutes one of the largest classical groups of Zygomycota, its phylogeny is poorly understood and no modern taxonomic revision is currently available. In the present study, 90 type and reference strains were used to infer a comprehensive phylogeny of Mortierellales from the sequence data of the complete ITS region and the LSU and SSU genes, with special attention to the monophyly of the genus Mortierella. Out of 15 alternative partitioning strategies compared on the basis of Bayes factors, the one with the highest number of partitions was found optimal (with mixture models yielding the best likelihood and tree length values), implying a higher complexity of evolutionary patterns in the ribosomal genes than generally recognized. Modeling the ITS1, 5.8S, and ITS2 loci separately improved model fit significantly as compared to treating all as one and the same partition. Further, within-partition mixture models suggest that not only do the SSU, LSU and ITS regions evolve under qualitatively and/or quantitatively different constraints, but that significant heterogeneity can also be found within these loci. The phylogenetic analysis indicated that the genus Mortierella is paraphyletic with respect to the genera Dissophora, Gamsiella and Lobosporangium, and the resulting phylogeny contradicts the previous, morphology-based sectional classification of Mortierella. Based on tree structure and phenotypic traits, we recognize 12 major clades, for which we attempt to summarize phenotypic similarities. M. longicollis is closely related to the outgroup taxon Rhizopus oryzae, suggesting that it belongs to the Mucorales. Our results demonstrate that traits used in previous classifications of the Mortierellales are highly homoplastic and that the Mortierellales is in need of a reclassification, where new, phylogenetically informative phenotypic traits should be identified, with molecular phylogenies playing a decisive role.


Subjects
Bayes Theorem, Cell Nucleus/genetics, Fungal DNA/genetics, Ribosomal DNA/genetics, Fungi/genetics, Phylogeny, Molecular Evolution, Fungi/classification, Ribosomes/genetics
10.
PLoS One ; 5(1): e8735, 2010 Jan 15.
Article in English | MEDLINE | ID: mdl-20090945

ABSTRACT

BACKGROUND: Herbivore feeding elicits dramatic increases in defenses, most of which require jasmonate (JA) signaling, and against which specialist herbivores are thought to be better adapted than generalist herbivores. Unbiased transcriptional analyses of how neonate larvae cope with these induced plant defenses are lacking. METHODOLOGY/PRINCIPAL FINDINGS: We created cDNA microarrays for Manduca sexta and Heliothis virescens separately, by spotting normalized midgut-specific cDNA libraries created from larvae that fed for 24 hours on MeJA-elicited wild-type (WT) Nicotiana attenuata plants. These microarrays were hybridized with labeled probes from neonates that fed for 24 hours on WT and isogenic plants progressively silenced in JA-mediated defenses (N: nicotine; N/PI: N and trypsin protease inhibitors; JA: all JA-mediated defenses). H. virescens neonates regulated 16 times more genes than did M. sexta neonates when they fed on plants silenced in JA-mediated defenses, and for both species, the greater the number of defenses silenced in the host plant (JA > N/PI > N), the greater was the number of transcripts regulated in the larvae. M. sexta larvae tended to down-regulate while H. virescens larvae up- and down-regulated transcripts from the same functional categories of genes. M. sexta larvae regulated transcripts in a diet-specific manner, while H. virescens larvae regulated a similar suite of transcripts across all diet types. CONCLUSIONS/SIGNIFICANCE: The observations are consistent with the expectation that specialists are better adapted than generalist herbivores to the defense responses elicited in their host plants by their feeding. While M. sexta larvae appear to be better adapted to N. attenuata's defenses, some of the elicited responses remain effective defenses against both herbivore species. The regulated genes provide novel insights into larval adaptations to N. attenuata's induced defenses, and represent potential targets for plant-mediated RNAi to falsify hypotheses about the process of adaptation.


Subjects
Manduca/physiology, Moths/physiology, Nicotiana/parasitology, Genetic Transcription, Animals, Complementary DNA/genetics, Manduca/genetics, Moths/genetics, Oligonucleotide Array Sequence Analysis, Species Specificity
11.
Bioinformatics ; 22(7): 889-90, 2006 Apr 01.
Article in English | MEDLINE | ID: mdl-16418234

ABSTRACT

MOTIVATION: The first version of the graphical multiple sequence alignment environment QAlign was published in 2003. Heavy response from the molecular-biological user community clearly demonstrated the need for such a platform. RESULTS: Panta rhei extends QAlign by several features. Major redesigns of the user interface, for instance, allow users to flexibly compose views for multiple projects. The new sequence viewer handles datasets with arbitrarily many and arbitrarily large sequences that may still be edited by guided block moving. More distance-based algorithms are available to interactively reconstruct phylogenetic trees, which can now also be zoomed and navigated graphically. AVAILABILITY: Executables and the Java source code are available under the Apache license at http://gi.cebitec.uni-bielefeld.de/qalign CONTACT: qalign@cebitec.uni-bielefeld.de.


Subjects
Computer Graphics, Database Management Systems, Information Storage and Retrieval/methods, Sequence Analysis/methods, Software, User-Computer Interface, Algorithms, Phylogeny, Sequence Alignment/methods