ABSTRACT
Systemic lupus erythematosus (SLE) is a complex autoimmune disease involving multiple immune cells. To elucidate SLE pathogenesis, it is essential to understand the dysregulated gene expression pattern linked to various clinical statuses with a high cellular resolution. Here, we conducted a large-scale transcriptome study with 6,386 RNA sequencing data covering 27 immune cell types from 136 SLE and 89 healthy donors. We profiled two distinct cell-type-specific transcriptomic signatures: disease-state and disease-activity signatures, reflecting disease establishment and exacerbation, respectively. We then identified candidate biological processes unique to each signature. This study suggested the clinical value of disease-activity signatures, which were associated with organ involvement and therapeutic responses. However, disease-activity signatures were less enriched around SLE risk variants than disease-state signatures, suggesting that current genetic studies may not well capture clinically vital biology. Together, we identified comprehensive gene signatures of SLE, which will provide essential foundations for future genomic and genetic studies.
Subject(s)
Lupus Erythematosus, Systemic , Transcriptome , Humans , Lupus Erythematosus, Systemic/genetics , Sequence Analysis, RNA
ABSTRACT
Animal bodies are composed of cell types with unique expression programs that implement their distinct locations, shapes, structures, and functions. Based on these properties, cell types assemble into specific tissues and organs. To systematically explore the link between cell-type-specific gene expression and morphology, we registered an expression atlas to a whole-body electron microscopy volume of the nereid Platynereis dumerilii. Automated segmentation of cells and nuclei identifies major cell classes and establishes a link between gene activation, chromatin topography, and nuclear size. Clustering of segmented cells according to gene expression reveals spatially coherent tissues. In the brain, genetically defined groups of neurons match ganglionic nuclei with coherent projections. Besides interneurons, we uncover sensory-neurosecretory cells in the nereid mushroom bodies, which thus qualify as sensory organs. They furthermore resemble the vertebrate telencephalon by molecular anatomy. We provide an integrated browser as a Fiji plugin for remote exploration of all available multimodal datasets.
Subject(s)
Cell Shape , Gene Expression Regulation , Polychaeta/cytology , Polychaeta/genetics , Single-Cell Analysis , Animals , Cell Nucleus/metabolism , Ganglia, Invertebrate/metabolism , Gene Expression Profiling , Multigene Family , Multimodal Imaging , Mushroom Bodies/metabolism , Polychaeta/ultrastructure
ABSTRACT
Genetic studies have revealed many variant loci that are associated with immune-mediated diseases. To elucidate the disease pathogenesis, it is essential to understand the function of these variants, especially under disease-associated conditions. Here, we performed a large-scale immune cell gene-expression analysis, together with whole-genome sequence analysis. Our dataset consists of 28 distinct immune cell subsets from 337 patients diagnosed with 10 categories of immune-mediated diseases and 79 healthy volunteers. Our dataset captured distinctive gene-expression profiles across immune cell types and diseases. Expression quantitative trait loci (eQTL) analysis revealed dynamic variations of eQTL effects in the context of immunological conditions, as well as cell types. These cell-type-specific and context-dependent eQTLs showed significant enrichment in immune disease-associated genetic variants, and they implicated the disease-relevant cell types, genes, and environment. This atlas deepens our understanding of the immunogenetic functions of disease-associated variants under in vivo disease conditions.
Subject(s)
Gene Expression Regulation/genetics , Gene Expression/immunology , Immune System Diseases/genetics , Adult , Female , Gene Expression/genetics , Gene Expression Regulation/immunology , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Humans , Immune System/cytology , Immune System/metabolism , Immune System Diseases/metabolism , Immune System Diseases/physiopathology , Male , Middle Aged , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Quantitative Trait Loci/immunology , Transcriptome/genetics , Whole Genome Sequencing/methods
ABSTRACT
BACKGROUND: Fruit crops, including tropical and subtropical fruits such as avocado (Persea americana), fig (Ficus carica), date palm (Phoenix dactylifera), mango (Mangifera indica), guava (Psidium guajava), papaya (Carica papaya), pineapple (Ananas comosus), and banana (Musa acuminata), are economically vital, contributing significantly to global agricultural output, as classified by the FAO's World Programme for the Census of Agriculture. Advancements in next-generation sequencing have transformed fruit crop breeding by providing in-depth genomic and transcriptomic data. RNA sequencing enables high-throughput analysis of gene expression and functional genomics, which is crucial for addressing horticultural challenges and enhancing fruit production. Key tropical and subtropical fruit crops currently lack a comprehensive expression atlas, a significant gap in resources for horticulturists who require a unified platform with diverse datasets across various conditions and cultivars. RESULTS: The Fruit Expression Atlas (FEAtl), available at http://backlin.cabgrid.res.in/FEAtl/ , is the first extensive and unified expression atlas for tropical and subtropical fruit crops, developed using a 3-tier architecture. Covering the expression of coding and non-coding genes in 2,060 RNA-Seq samples across 91 tissue types and 177 BioProjects, it provides a comprehensive view of gene expression patterns for different tissues under various conditions. FEAtl features multiple tabs that cater to different aspects of the dataset, namely Home, About, Analyze, Statistics, and Team, and contains seven central functional modules: Transcript Information, Sample Information, Expression Profiles in FPKM and TPM, Functional Analysis, Genes Based on Tau Score, and Search for Specific Gene. The expression of a transcript of interest can be easily queried by searching by tissue ID and transcript type.
Expression data can be displayed as a heat map, along with functional descriptions as well as Gene Ontology and Kyoto Encyclopedia of Genes and Genomes annotations. CONCLUSIONS: This atlas is a groundbreaking compilation of information on eight distinct fruit crops; it serves as a fundamental resource for comparative analysis among fruit species and as a catalyst for functional genomic studies. Database availability: http://backlin.cabgrid.res.in/FEAtl/ .
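The "Tau Score" named among the FEAtl modules presumably refers to the tau tissue-specificity index; the abstract does not give a formula. The following is a minimal sketch, assuming the commonly used definition tau = sum_i (1 - x_i / x_max) / (n - 1) over n tissues with non-negative expression values. The function name `tau_index` is hypothetical and is not part of FEAtl.

```python
def tau_index(expression):
    """Tau tissue-specificity index for one gene.

    expression: non-negative expression values (e.g. TPM), one per tissue.
    Returns a value in [0, 1]: 0 for uniform expression across tissues,
    1 for expression confined to a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        # Assumption for this sketch: a gene with no expression anywhere
        # is reported as non-specific rather than raising an error.
        return 0.0
    return sum(1 - x / x_max for x in expression) / (n - 1)


if __name__ == "__main__":
    print(tau_index([5, 5, 5, 5]))   # uniform expression
    print(tau_index([10, 0, 0, 0]))  # single-tissue expression
```

Values near 1 would flag candidate tissue-specific genes, while values near 0 would flag broadly expressed ones; any cutoff used by the database itself is not stated in the abstract.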
Subject(s)
Crops, Agricultural , Fruit , Genomics , Crops, Agricultural/genetics , Fruit/genetics , Genomics/methods , Internet , Databases, Genetic , Persea/genetics , Carica/genetics , Musa/genetics , Transcriptome , Gene Expression Regulation, Plant
ABSTRACT
Eucalyptus is a widely planted hardwood tree species due to its fast growth, superior wood properties and adaptability. However, the post-transcriptional regulatory mechanisms controlling tissue development and stress responses in Eucalyptus remain poorly understood. In this study, we performed a comprehensive analysis of the gene expression profile and the alternative splicing (AS) landscape of E. grandis using strand-specific RNA-Seq, which encompassed 201 libraries including different organs, developmental stages, and environmental stresses. We identified 10 416 genes (33.49%) that underwent AS, and numerous differentially expressed and/or differential AS genes involved in critical biological processes, such as primary-to-secondary growth transition of stems, adventitious root formation, aging and responses to phosphorus- or boron-deficiency. Co-expression analysis of AS events and gene expression patterns highlighted the potential upstream regulatory role of AS events in multiple processes. Additionally, we highlighted the lignin biosynthetic pathway to showcase the potential regulatory functions of AS events in the KNAT3 and IRL3 genes within this pathway. Our high-quality expression atlas and AS landscape serve as valuable resources for unravelling the genetic control of woody plant development, long-term adaptation, and understanding transcriptional diversity in Eucalyptus. Researchers can conveniently access these resources through the interactive ePlant browser (https://bar.utoronto.ca/eplant_eucalyptus).
Subject(s)
Eucalyptus , Genes, Plant , Genes, Plant/genetics , Eucalyptus/physiology , Alternative Splicing/genetics , Wood , Transcriptome , Gene Expression Profiling , Gene Expression Regulation, Plant
ABSTRACT
Genome-wide transcriptome analysis provides systems-level insights into plant biology. Owing to the limited depth of quantitative proteomics, gene-protein-complex stoichiometry remains largely unknown in plants. Recently, the complexity of the proteome and its cell- and tissue-specific distribution have prompted the research community to integrate transcriptomic and proteomic landscapes in a proteogenomic approach. Herein, we generated a quantitative proteome and transcriptome abundance atlas of 15 major sweet cherry (Prunus avium L., cv 'Tragana Edessis') tissues represented by 29 247 genes and 7584 proteins. Additionally, 199 984 alternative splicing events, particularly exon skipping and alternative 3' splicing, were identified in 23 383 transcribed regions of the analyzed tissues. Common signatures as well as differences between mRNA and protein quantities, including genes encoding transcription factors and allergens, within and across the different tissues are reported. Using our integrated dataset, we identified key putative regulators of fruit development, notably genes involved in the biosynthesis of anthocyanins and flavonoids. We also provide proteogenomic-based evidence for the involvement of ethylene signaling and pectin degradation in cherry fruit ripening. Moreover, clusters of genes and proteins with similar and different expression and suppression trends across diverse tissues and developmental stages revealed a relatively low RNA abundance-to-protein correlation. The present proteogenomic analysis allowed us to identify 17 novel sweet cherry proteins without prior protein-level annotation evidence in the currently available databases. To facilitate use by the community, we also developed the Sweet Cherry Atlas Database (https://grcherrydb.com/) for viewing and data mining these resources.
This work provides new insights into the proteogenomics workflow in plants and a rich knowledge resource for future investigation of gene and protein functions in Prunus species.
Subject(s)
Ascomycota , Proteogenomics , Prunus avium , Anthocyanins/metabolism , Ascomycota/metabolism , Fruit/metabolism , Gene Expression Regulation, Plant/genetics , Plant Proteins/genetics , Plant Proteins/metabolism , Proteome/genetics , Proteome/metabolism , Prunus avium/genetics , Transcriptome/genetics , Trees/genetics
ABSTRACT
BACKGROUND: Caterpillars from the insect order Lepidoptera are some of the most widespread and destructive agricultural pests. Most of their impact is at the larval stage, where the midgut epithelium mediates the digestion and absorption of an astonishing amount of food. Although this tissue has been the subject of frequent investigation in Lepidoptera, a comprehensive expression atlas has yet to be generated. RESULTS: Here, we perform RNA-sequencing and proteomics on the gut of the polyphagous pest Helicoverpa armigera across life stages, diet types, and compartments of the anterior-posterior axis. A striking relationship between the structural homology and expression pattern of a group of sugar transporters was observed in the early larval stages. Further comparisons were made among the spatial compartments of the midgut, which suggested a putative role for vATPases and SLC9 transporters in the generation of alkaline conditions in the H. armigera midgut. CONCLUSIONS: This comprehensive resource will aid the scientific community in understanding lepidopteran gut physiology at unprecedented resolution. It is hoped that this study advances the understanding of the lepidopteran midgut and also facilitates functional work in this field.
Subject(s)
Moths , Animals , Digestive System , Hydrogen-Ion Concentration , Larva , Nutrients
ABSTRACT
Physcomitrella patens is a bryophyte model plant that is often used to study plant evolution and development. Its resources are of great importance for comparative genomics and evo-devo approaches. However, expression data from Physcomitrella patens were so far generated using different gene annotation versions and three different platforms: CombiMatrix and NimbleGen expression microarrays and RNA sequencing. The currently available P. patens expression data are distributed across three tools with different visualization methods to access the data. Here, we introduce an interactive expression atlas, Physcomitrella Expression Atlas Tool (PEATmoss), that unifies publicly available expression data for P. patens and provides multiple visualization methods to query the data in a single web-based tool. Moreover, PEATmoss includes 35 expression experiments not previously available in any other expression atlas. To facilitate gene expression queries across different gene annotation versions, and to access P. patens annotations and related resources, a lookup database and web tool linked to PEATmoss was implemented. PEATmoss can be accessed at https://peatmoss.online.uni-marburg.de.
Subject(s)
Bryopsida/genetics , Transcriptome , Atlases as Topic , Bryopsida/metabolism , Datasets as Topic , Gene Expression/genetics , Genes, Plant/genetics , Internet , Mycorrhizae/metabolism , Transcriptome/genetics
ABSTRACT
BACKGROUND: Summer squash (Cucurbita pepo: Cucurbitaceae) is a popular horticultural crop for which there is insufficient genomic and transcriptomic information. Gene expression atlases are crucial for the identification of genes expressed in different tissues at various plant developmental stages. Here, we present the first comprehensive gene expression atlas for a summer squash cultivar, including transcripts obtained from seeds, shoots, leaf stem, young and developed leaves, male and female flowers, fruits of seven developmental stages, as well as primary and lateral roots. RESULTS: In total, 27,868 genes and 2352 novel transcripts were annotated from these 16 tissues, with over 18,000 genes common to all tissue groups. Of these, 3812 were identified as housekeeping genes, half of which were assigned to known gene ontologies. Flowers, seeds, and young fruits had the largest number of specific genes, whilst intermediate-age fruits had the fewest. There were also genes that were differentially expressed in the various tissues, the male flower being the tissue with the most differentially expressed genes in pair-wise comparisons with the remaining tissues, and the leaf stem the least. The largest expression change during fruit development occurred early on, from female flower to fruit two days after pollination. A weighted correlation network analysis performed on the global gene expression dataset assigned 25,413 genes to 24 coexpression groups, and some of these groups exhibited strong tissue specificity. CONCLUSIONS: These findings enrich our understanding of the transcriptomic events associated with summer squash development and ripening. This comprehensive gene expression atlas is expected not only to provide a global view of gene expression patterns in all major tissues in C. pepo but also to serve as a valuable resource for functional genomics and gene discovery in Cucurbitaceae.
Subject(s)
Cucurbita , Cucurbita/genetics , Flowers/genetics , Fruit/genetics , Gene Expression Regulation, Plant , Pollination , RNA-Seq
ABSTRACT
KEY POINTS: Tendon is a hypocellular, matrix-rich tissue that has been excluded from comparative transcriptional atlases. These atlases have provided important knowledge about biological heterogeneity between tissues, and our study addresses this important gap. We performed measures on four of the most studied tendons, the Achilles, forepaw flexor, patellar and supraspinatus tendons of both mice and rats. These tendons are functionally distinct and are also among the most commonly injured, and therefore of important translational interest. Approximately one-third of the filtered transcriptome was differentially regulated between Achilles, forepaw flexor, patellar and supraspinatus tendons within either mice or rats. Nearly two-thirds of the transcripts that are expressed in anatomically similar tendons were different between mice and rats. The overall findings from this study identified that although tendons across the body share a common anatomical definition based on their physical location between skeletal muscle and bone, tendon is a surprisingly genetically heterogeneous tissue. ABSTRACT: Tendon is a functionally important connective tissue that transmits force between skeletal muscle and bone. Previous studies have evaluated the architectural designs and mechanical properties of different tendons throughout the body. However, less is known about the underlying transcriptional differences between tendons that may dictate their designs and properties. Therefore, our objective was to develop a comprehensive atlas of the transcriptome of limb tendons in adult mice and rats using systems biology techniques. We selected the Achilles, forepaw digit flexor, patellar, and supraspinatus tendons due to their divergent functions and high rates of injury and tendinopathies in patients. 
Using RNA sequencing data, we generated the Comparative Tendon Transcriptional Database (CTTDb) that identified substantial diversity in the transcriptomes of tendons both within and across species. Approximately 30% of filtered transcripts were differentially regulated between tendons of a given species, and nearly 60% of the filtered transcripts present in anatomically similar tendons were different between species. Many of the genes that differed between tendons and across species are important in tissue specification and limb morphogenesis, tendon cell biology and tenogenesis, growth factor signalling, and production and maintenance of the extracellular matrix. This study indicates that tendon is a surprisingly heterogeneous tissue with substantial genetic variation based on anatomical location and species.
Subject(s)
Achilles Tendon , Tendinopathy , Animals , Extracellular Matrix , Humans , Mice , Rats , Sequence Analysis, RNA , Transcriptome
ABSTRACT
BACKGROUND: Long noncoding RNAs (lncRNAs) have roles in gene regulation, epigenetics, and molecular scaffolding and it is hypothesized that they underlie some mammalian evolutionary adaptations. However, for many mammalian species, the absence of a genome assembly precludes the comprehensive identification of lncRNAs. The genome of the American beaver (Castor canadensis) has recently been sequenced, setting the stage for the systematic identification of beaver lncRNAs and the characterization of their expression in various tissues. The objective of this study was to discover and profile polyadenylated lncRNAs in the beaver using high-throughput short-read sequencing of RNA from sixteen beaver tissues and to annotate the resulting lncRNAs based on their potential for orthology with known lncRNAs in other species. RESULTS: Using de novo transcriptome assembly, we found 9528 potential lncRNA contigs and 187 high-confidence lncRNA contigs. Of the high-confidence lncRNA contigs, 147 have no known orthologs (and thus are putative novel lncRNAs) and 40 have mammalian orthologs. The novel lncRNAs mapped to the Oregon State University (OSU) reference beaver genome with greater than 90% sequence identity. While the novel lncRNAs were on average shorter than their annotated counterparts, they were similar to the annotated lncRNAs in terms of the relationships between contig length and minimum free energy (MFE) and between coverage and contig length. We identified beaver orthologs of known lncRNAs such as XIST, MEG3, TINCR, and NIPBL-DT. We profiled the expression of the 187 high-confidence lncRNAs across 16 beaver tissues (whole blood, brain, lung, liver, heart, stomach, intestine, skeletal muscle, kidney, spleen, ovary, placenta, castor gland, tail, toe-webbing, and tongue) and identified both tissue-specific and ubiquitous lncRNAs. CONCLUSIONS: To our knowledge this is the first report of systematic identification of lncRNAs and their expression atlas in beaver. 
LncRNAs, both novel and those with known orthologs, are expressed in each of the beaver tissues that we analyzed. For some beaver lncRNAs with known orthologs, the tissue-specific expression patterns were phylogenetically conserved. The lncRNA sequence data files and raw sequence files are available via the web supplement and the NCBI Sequence Read Archive, respectively.
Subject(s)
Gene Expression Profiling , RNA, Long Noncoding , Rodentia/genetics , Transcriptome , Animals , Computational Biology/methods , Gene Expression Regulation , Genome , Molecular Sequence Annotation , Nucleic Acid Conformation , Organ Specificity/genetics
ABSTRACT
Spatio-temporal and developmental stage-specific transcriptome analysis plays a crucial role in systems biology-based improvement of any species. In this context, we report here the Arachis hypogaea gene expression atlas (AhGEA) for the world's most widely cultivated subsp. fastigiata based on RNA-seq data using 20 diverse tissues across five key developmental stages. Approximately 480 million paired-end filtered reads were generated, followed by identification of 81 901 transcripts from an early-maturing, high-yielding, drought-tolerant groundnut variety, ICGV 91114. Further, 57 344 genome-wide transcripts were identified with ≥1 FPKM across different tissues and stages. Our in-depth analysis of the global transcriptome sheds light on complex regulatory networks, namely gravitropism and photomorphogenesis, seed development, allergens and oil biosynthesis in groundnut. Importantly, interesting insights into the molecular basis of seed development and nodulation have immense potential for translational genomics research. We have also identified a set of stably expressed transcripts across the selected tissues, which could be utilized as internal controls in groundnut functional genomics studies. The AhGEA revealed potential transcripts associated with allergens, which upon appropriate validation could be deployed in the coming years to develop consumer-friendly groundnut varieties. Taken together, the AhGEA touches upon various important and key features of cultivated groundnut and provides a reference for further functional, comparative and translational genomics research for various economically important traits.
Subject(s)
Arachis , Fabaceae , Arachis/genetics , Genomics , Phenotype , Seeds
ABSTRACT
The comparative study of cell types is a powerful approach toward deciphering animal evolution. To avoid selection biases, however, comparisons ideally involve all cell types present in a multicellular organism. Here, we use image registration and a newly developed "Profiling by Signal Probability Mapping" algorithm to generate a cellular resolution 3D expression atlas for an entire animal. We investigate three-segmented young worms of the marine annelid Platynereis dumerilii, with a rich diversity of differentiated cells present in relatively low number. Starting from whole-mount expression images for close to 100 neural specification and differentiation genes, our atlas identifies and molecularly characterizes 605 bilateral pairs of neurons at specific locations in the ventral nerve cord. Among these pairs, we identify sets of neurons expressing similar combinations of transcription factors, located at spatially coherent anterior-posterior, dorsal-ventral, and medial-lateral coordinates that we interpret as cell types. Comparison with motor and interneuron types in the vertebrate neural tube indicates conserved combinations, for example, of cell types cospecified by Gata1/2/3 and Tal transcription factors. These include V2b interneurons and the central spinal fluid-contacting Kolmer-Agduhr cells in the vertebrates, and several neuron types in the intermediate ventral ganglionic mass in the annelid. We propose that Kolmer-Agduhr cell-like mechanosensory neurons formed part of the mucociliary sole in protostome-deuterostome ancestors and diversified independently into several neuron types in annelid and vertebrate descendants.
Subject(s)
Biological Evolution , Polychaeta/genetics , Algorithms , Animals , Body Patterning/genetics , Cell Differentiation , Gene Expression Profiling/methods , Gene Expression Regulation, Developmental , Models, Biological , Neurons/cytology , Polychaeta/cytology
ABSTRACT
BACKGROUND: The domestic chicken (Gallus gallus) is widely used as a model in developmental biology and is also an important livestock species. We describe a novel approach to data integration to generate an mRNA expression atlas for the chicken spanning major tissue types and developmental stages, using a diverse range of publicly-archived RNA-seq datasets and new data derived from immune cells and tissues. RESULTS: Randomly down-sampling RNA-seq datasets to a common depth and quantifying expression against a reference transcriptome using the mRNA quantitation tool Kallisto ensured that disparate datasets explored comparable transcriptomic space. The network analysis tool Graphia was used to extract clusters of co-expressed genes from the resulting expression atlas, many of which were tissue or cell-type restricted, contained transcription factors that have previously been implicated in their regulation, or were otherwise associated with biological processes, such as the cell cycle. The atlas provides a resource for the functional annotation of genes that currently have only a locus ID. We cross-referenced the RNA-seq atlas to a publicly available embryonic Cap Analysis of Gene Expression (CAGE) dataset to infer the developmental time course of organ systems, and to identify a signature of the expansion of tissue macrophage populations during development. CONCLUSION: Expression profiles obtained from public RNA-seq datasets - despite being generated by different laboratories using different methodologies - can be made comparable to each other. This meta-analytic approach to RNA-seq can be extended with new datasets from novel tissues, and is applicable to any species.
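The down-sampling step described above can be illustrated as follows. This is a minimal sketch that assumes reads are held in memory as a list and that sampling is uniform without replacement; the abstract does not say which tool or sampling scheme the authors used, and the function name `downsample_reads` is hypothetical.

```python
import random


def downsample_reads(reads, target_depth, seed=0):
    """Randomly keep `target_depth` reads so libraries share a common depth.

    reads: sequence of read records (any objects, e.g. FASTQ entries).
    target_depth: number of reads to keep.
    seed: fixed seed so the subsample is reproducible.
    Libraries already at or below the target depth are returned unchanged.
    """
    if target_depth < 0:
        raise ValueError("target_depth must be non-negative")
    if len(reads) <= target_depth:
        return list(reads)
    rng = random.Random(seed)
    # Sample indices without replacement, then restore the original order.
    keep = sorted(rng.sample(range(len(reads)), target_depth))
    return [reads[i] for i in keep]


if __name__ == "__main__":
    library = [f"read_{i}" for i in range(1000)]
    subsample = downsample_reads(library, 100)
    print(len(subsample))
```

For real FASTQ files, which rarely fit in memory, a streaming approach such as reservoir sampling would be the usual substitute; paired-end libraries would also need both mates sampled together.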
Subject(s)
Chickens/genetics , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Animals , Atlases as Topic , Databases, Genetic , High-Throughput Nucleotide Sequencing
ABSTRACT
Melon (Cucumis melo L.) is an important Cucurbitaceae crop produced worldwide, exhibiting wide genetic variations and comprising both climacteric and non-climacteric fruit types. The muskmelon cultivar 'Earl's favorite Harukei-3' (Harukei-3) known for its sweetness and rich aroma is used for breeding of high-grade muskmelon in Japan. We conducted RNA sequencing (RNA-seq) transcriptome studies in 30 different tissues of the 'Harukei-3' melon. These included root, stems, leaves, flowers, regenerating callus and ovaries, in addition to the flesh and peel sampled at seven stages of fruit development. The expression patterns of 20,752 genes were determined with fragments per kilobase of transcript per million fragments sequenced (FPKM) >1 in at least one tissue. Principal component analysis distinguished 30 melon tissues based on the global gene expression profile and, further, the weighted gene correlation network analysis classified melon genes into 45 distinct coexpression groups. Some coexpression groups exhibited tissue-specific gene expression. Furthermore, we developed and published web application tools designated "Gene expression map viewer" and "Coexpression viewer" on our website Melonet-DB (http://melonet-db.agbi.tsukuba.ac.jp/) to promote functional genomics research in melon. By using both tools, we analyzed melon homologs of tomato fruit ripening regulators such as E8, RIPENING-INHIBITOR (RIN) and NON-RIPENING (NOR). The "Coexpression viewer" clearly distinguished fruit ripening-associated melon RIN/NOR/CNR homologs from those expressed in other tissues. In addition, several other MADS-box, NAM/ATAF/CUC (NAC) and homeobox transcription factor genes were identified as fruit ripening-associated genes. Our tools provide useful information for research not only on melon but also on other fleshy fruit plants.
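The FPKM unit and the "FPKM >1 in at least one tissue" filter mentioned above can be made concrete with a short sketch. It assumes the standard definition (fragments mapped to a transcript, divided by transcript length in kilobases and by total mapped fragments in millions); the function names and the example numbers are hypothetical and do not come from the melon dataset.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments sequenced."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def expressed_genes(fpkm_by_gene, cutoff=1.0):
    """Return genes whose FPKM exceeds `cutoff` in at least one tissue.

    fpkm_by_gene: dict mapping gene ID -> list of FPKM values, one per tissue.
    """
    return sorted(
        gene
        for gene, values in fpkm_by_gene.items()
        if any(value > cutoff for value in values)
    )


if __name__ == "__main__":
    # 500 fragments on a 2,000 bp transcript in a library of 10 million fragments.
    print(fpkm(500, 2000, 10_000_000))
    print(expressed_genes({"gene_a": [0.2, 1.5], "gene_b": [0.1, 0.9]}))
```

Because FPKM is normalised within each library, the sum of FPKM values can differ between samples; units such as TPM are often preferred when comparing proportions across tissues.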
Subject(s)
Computational Biology/methods , Cucumis melo/genetics , Databases, Genetic , Gene Expression Profiling , Gene Expression Regulation, Plant , Cucumis melo/growth & development , Flowers/genetics , Flowers/growth & development , Fruit/genetics , Fruit/growth & development , Gene Expression Regulation, Developmental , Gene Regulatory Networks , Genes, Plant/genetics , Internet , Sequence Analysis, RNA
ABSTRACT
In developing embryos, gene regulatory networks drive cells towards discrete terminal fates, a process called canalization. We studied the behavior of the anterior-posterior segmentation network in Drosophila melanogaster embryos by depleting a key maternal input, bicoid (bcd), and measuring gene expression patterns of the network at cellular resolution. This method results in a gene expression atlas containing the levels of mRNA or protein expression of 13 core patterning genes over six time points for every cell of the blastoderm embryo. This is the first cellular resolution dataset of a genetically perturbed Drosophila embryo that captures all cells in 3D. We describe the technical developments required to build this atlas and how the method can be employed and extended by others. We also analyze this novel dataset to characterize the degree and timing of cell fate canalization in the segmentation network. We find that in two layers of this gene regulatory network, following depletion of bcd, individual cells rapidly canalize towards normal cell fates. This result supports the hypothesis that the segmentation network directly canalizes cell fate, rather than an alternative hypothesis whereby cells are initially mis-specified and later eliminated by apoptosis. Our gene expression atlas provides a high resolution picture of a classic perturbation and will enable further computational modeling of canalization and gene regulation in this transcriptional network.
Subject(s)
Body Patterning/genetics , Cell Lineage/genetics , Databases, Genetic , Drosophila melanogaster/embryology , Gene Regulatory Networks/genetics , Transcriptome/genetics , Animals , Drosophila Proteins , Homeodomain Proteins , In Situ Hybridization , RNA Interference , Real-Time Polymerase Chain Reaction , Trans-Activators/deficiency
ABSTRACT
BACKGROUND: The availability of fast alignment-free algorithms has greatly reduced the computational burden of RNA-seq processing, especially for relatively poorly assembled genomes. Using these approaches, previous RNA-seq datasets could potentially be processed and integrated with newly sequenced libraries. Confounding factors in such integration include sequencing depth and methods of RNA extraction and selection. Different selection methods (typically, either polyA-selection or rRNA-depletion) omit different RNAs, resulting in different fractions of the transcriptome being sequenced. In particular, rRNA-depleted libraries sample a broader fraction of the transcriptome than polyA-selected libraries. This study aimed to develop a systematic means of accounting for library type that allows data from these two methods to be compared. RESULTS: The method was developed by comparing two RNA-seq datasets from ovine macrophages, identical except for RNA selection method. Gene-level expression estimates were obtained using a two-part process centred on the high-speed transcript quantification tool Kallisto. Firstly, a set of reference transcripts was defined that constitute a standardised RNA space, with expression from both datasets quantified against it. Secondly, a simple ratio-based correction was applied to the rRNA-depleted estimates. The outcome is an almost perfect correlation between gene expression estimates, independent of library type and across the full range of levels of expression. CONCLUSION: A combination of reference transcriptome filtering and a ratio-based correction can create equivalent expression profiles from both polyA-selected and rRNA-depleted libraries. This approach will allow meta-analysis and integration of existing RNA-seq data into transcriptional atlas projects.
Subject(s)
Poly A/genetics , RNA, Ribosomal/genetics , RNA/metabolism , Sequence Analysis, RNA , Transcriptome , Animals , Female , Gene Expression Profiling , Gene Library , Lipopolysaccharides/toxicity , Macrophages/cytology , Macrophages/drug effects , Macrophages/metabolism , Male , RNA/chemistry , RNA/isolation & purification , RNA, Ribosomal/metabolism , Sheep
ABSTRACT
BACKGROUND: Because experimental elucidation of gene function is often laborious, various in silico methods have been developed to predict the function of uncharacterized genes. Functionally related genes are often expressed in the same tissues, conditions and developmental stages (co-expressed), so the functional annotation of characterized genes can be transferred to co-expressed genes lacking annotation. With genome-wide expression data available, the construction of co-expression networks, where genes are nodes and edges connect significantly co-expressed genes, provides unprecedented opportunities to predict gene function. However, the construction of such networks requires large volumes of high-quality data, multiple processing steps and a considerable amount of computational power. While efficient tools exist to process RNA-Seq data, pipelines which combine them to construct co-expression networks efficiently are currently lacking. RESULTS: LSTrAP (Large-Scale Transcriptome Analysis Pipeline), presented here, combines all essential tools to construct co-expression networks based on RNA-Seq data into a single, efficient workflow. By supporting parallel computing on computer cluster infrastructure, processing hundreds of samples becomes feasible, as shown here for Arabidopsis thaliana and Sorghum bicolor, for which 876 and 215 samples were processed, respectively. The former was used here to show how the quality control included in LSTrAP can detect spurious or low-quality samples. The latter was used to show how co-expression networks are able to group known photosynthesis genes and implicate several currently uncharacterized genes in this process. CONCLUSIONS: LSTrAP combines the most popular and performant methods to construct co-expression networks from RNA-Seq data into a single workflow. This allows the large amounts of expression data required to construct co-expression networks to be processed efficiently and consistently across hundreds of samples.
LSTrAP is implemented in Python 3.4 (or higher) and is available under the MIT license from https://github.molgen.mpg.de/proost/LSTrAP.
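The network definition given above (genes as nodes, edges between significantly co-expressed genes) can be sketched in a few lines. This is a generic illustration, not LSTrAP code. It assumes the Pearson correlation as the co-expression measure and a fixed cut-off of 0.9 in place of a significance test, and the gene names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with r >= threshold.

    Assumption: a fixed correlation cut-off stands in for a significance test.
    """
    edges = []
    for gene1, gene2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene1], profiles[gene2])
        if r >= threshold:
            edges.append((gene1, gene2, r))
    return edges


# Hypothetical expression values across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}
edges = coexpression_edges(profiles)
```

Only the geneA/geneB pair passes the cut-off here. An annotation known for one of the two could then be proposed for the other, which is the transfer step the abstract describes.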
Subject(s)
Gene Expression Profiling , Gene Regulatory Networks , Sequence Analysis, RNA/methods , Software , Statistics as Topic , Arabidopsis/genetics , Gene Expression Regulation, Plant , Principal Component Analysis , Quality Control , Sorghum/genetics
ABSTRACT
While Brachypodium distachyon (Brachypodium) is an emerging model for grasses, no expression atlas or gene coexpression network is available. Such tools are of high importance to provide insights into the function of Brachypodium genes. We present a detailed Brachypodium expression atlas, capturing gene expression in its major organs at different developmental stages. The data were integrated into a large-scale coexpression database (www.gene2function.de), enabling identification of duplicated pathways and conserved processes across 10 plant species, thus allowing genome-wide inference of gene function. We highlight the importance of the atlas and the platform through the identification of duplicated cell wall modules, and show that a lignin biosynthesis module is conserved across angiosperms. We identified and functionally characterised a putative ferulate 5-hydroxylase gene by overexpressing it in Brachypodium, which resulted in an increase in lignin syringyl units and reduced lignin content of mature stems, and led to improved saccharification of the stem biomass. Our Brachypodium expression atlas thus provides a powerful resource to reveal functionally related genes, which may advance our understanding of important biological processes in grasses.
Subject(s)
Brachypodium/cytology , Brachypodium/genetics , Cell Wall/genetics , Gene Expression Regulation, Plant , Gene Regulatory Networks , Genes, Plant , Lignin/metabolism , Arabidopsis/genetics , Databases, Genetic , Oryza/genetics , Plant Stems/metabolism , Plants, Genetically Modified , Transcriptome/genetics
ABSTRACT
Pigeonpea (Cajanus cajan) is an important grain legume of the semi-arid tropics, mainly used for its protein-rich seeds. To link the genome sequence information with agronomic traits resulting from specific developmental processes, a Cajanus cajan gene expression atlas (CcGEA) was developed using the Asha genotype. Thirty tissues/organs representing developmental stages from germination to senescence were used to generate 590.84 million paired-end RNA-Seq reads. The CcGEA revealed a compendium of 28 793 genes with differential, specific, spatio-temporal and constitutive expression during various stages of development in different tissues. As an example to demonstrate the application of the CcGEA, a network of 28 flower-related genes, analysed for cis-regulatory elements and splicing variants, was identified. In addition, expression analysis of these candidate genes in male sterile and male fertile genotypes suggested their critical role in normal pollen development leading to seed formation. Gene network analysis also identified two regulatory genes, a pollen-specific SF3 and a sucrose-proton symporter, that could have implications for improvement of agronomic traits such as seed production and yield. In conclusion, the CcGEA provides a valuable resource for pigeonpea to identify candidate genes involved in specific developmental processes and to understand the well-orchestrated growth and developmental process in this resilient crop.