ABSTRACT
Electroencephalogram (EEG) interpretation plays a critical role in the clinical assessment of neurological conditions, most notably epilepsy. However, EEG recordings are typically analyzed manually by highly specialized and heavily trained personnel. Moreover, the low rate of capturing abnormal events during the procedure makes interpretation time-consuming, resource-hungry, and overall an expensive process. Automatic detection offers the potential to improve the quality of patient care by shortening the time to diagnosis, managing big data, and optimizing the allocation of human resources towards precision medicine. Here, we present MindReader, a novel unsupervised machine-learning method built on the interplay between an autoencoder network, a hidden Markov model (HMM), and a generative component: after dividing the signal into overlapping frames and performing a fast Fourier transform, MindReader trains an autoencoder neural network for dimensionality reduction and compact representation of different frequency patterns for each frame. Next, we processed the temporal patterns using an HMM, while a third, generative component hypothesized and characterized the different phases that were then fed back to the HMM. MindReader then automatically generates labels that the physician can interpret as pathological and non-pathological phases, thus effectively reducing the search space for trained personnel. We evaluated MindReader's predictive performance on 686 recordings, encompassing more than 980 h from the publicly available Physionet database. Compared to manual annotations, MindReader identified 197 of 198 epileptic events (99.45%), and is, as such, a highly sensitive method, which is a prerequisite for clinical use.
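The first preprocessing step described above (overlapping frames followed by a Fourier transform) can be sketched as follows. This is only an illustration under stated assumptions: the function names, frame length, hop size, and toy signal are hypothetical, and a plain discrete Fourier transform stands in for the fast Fourier transform (same values, slower to compute). It does not reproduce MindReader's autoencoder, HMM, or generative component.

```python
import cmath
import math


def overlapping_frames(signal, frame_len, hop):
    """Split a 1-D signal into frames of frame_len samples, advancing by hop samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive O(n^2) discrete Fourier transform; magnitudes for bins 0..n//2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


# Hypothetical toy signal: a sine with 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]

# 16-sample frames with 50% overlap (hop of 8 samples); both values are assumptions.
frames = overlapping_frames(signal, frame_len=16, hop=8)
spectra = [magnitude_spectrum(f) for f in frames]

# Index of the dominant frequency bin in the first frame.
peak_bin = max(range(len(spectra[0])), key=spectra[0].__getitem__)
```

Each row of `spectra` is the kind of per-frame frequency pattern that a downstream model could compress further.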
Subject(s)
Electroencephalography , Epilepsy , Humans , Electroencephalography/methods , Epilepsy/diagnosis , Neural Networks, Computer , Fourier Analysis , Unsupervised Machine Learning
ABSTRACT
BACKGROUND: Generating polygenic risk scores for diseases and complex traits requires high-quality GWAS summary statistics files. Often, these files are difficult to acquire because the data are unshared or incomplete. To date, bioinformatics tools that focus on restoring missing columns containing identification and association data are limited, even though such restoration has the potential to increase the number of usable GWAS summary statistics files. RESULTS: SumStatsRehab was able to restore rsID, effect/other alleles, chromosome, base pair position, effect allele frequencies, beta, standard error, and p-values to a better extent than any other currently available tool, with minimal loss. CONCLUSIONS: SumStatsRehab offers a unique tool utilizing both functional programming and pipeline-like architecture, allowing users to generate accurate data restorations for incomplete summary statistics files. This, in turn, increases the number of usable GWAS summary statistics files, which may be invaluable for less researched health traits.
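As an illustration of how one association column can be recovered from others, the sketch below recomputes a missing two-sided p-value from beta and standard error using the standard Wald z-statistic. The abstract does not state how SumStatsRehab performs its restorations, so the function names and the dictionary row format here are hypothetical and should not be read as the tool's API.

```python
import math


def restore_p_value(beta, se):
    """Two-sided p-value from a Wald z-statistic, z = beta / se."""
    z = beta / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


def restore_row(row):
    """Return a copy of row with 'p' filled in when beta and a nonzero se exist."""
    restored = dict(row)
    if (restored.get("p") is None
            and restored.get("beta") is not None
            and restored.get("se")):
        restored["p"] = restore_p_value(restored["beta"], restored["se"])
    return restored


# Hypothetical summary-statistics row with a missing p-value.
row = {"rsID": "rs0000001", "beta": 0.0, "se": 1.0, "p": None}
fixed = restore_row(row)
```

Rows lacking beta or standard error are returned unchanged, which keeps the step safe to apply to a whole file.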
Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Multifactorial Inheritance , Phenotype , Algorithms
ABSTRACT
The current gold standard of gait diagnostics is dependent on large, expensive motion-capture laboratories and highly trained clinical and technical staff. Wearable sensor systems combined with machine learning may help to improve the accessibility of objective gait assessments in a broad clinical context. However, current algorithms lack flexibility and require large training datasets with tedious manual labelling of data. The current study tests the validity of a novel machine learning algorithm for automated gait partitioning of laboratory-based and sensor-based gait data. The developed artificial intelligence tool was used in patients with a central neurological lesion and severe gait impairments. To build the novel algorithm, 2% and 3% of the entire dataset (567 and 368 steps in total, respectively) were required for assessments with laboratory equipment and inertial measurement units. The mean errors of machine learning-based gait partitions were 0.021 s for the laboratory-based datasets and 0.034 s for the sensor-based datasets. Combining reinforcement learning with a deep neural network allows significant reduction in the size of the training datasets to <5%. The low number of required training data provides end-users with a high degree of flexibility. Non-experts can easily adjust the developed algorithm and modify the training library depending on the measurement system and clinical population.
Subject(s)
Artificial Intelligence , Gait , Algorithms , Humans , Machine Learning , Neural Networks, Computer
ABSTRACT
Carbon storage and cycling in boreal forests, the largest terrestrial carbon store, is moderated by complex interactions between trees and soil microorganisms. However, existing methods limit our ability to predict how changes in environmental conditions will alter these associations and the essential ecosystem services they provide. To address this, we developed a metatranscriptomic approach to analyze the impact of nutrient enrichment on Norway spruce fine roots and the community structure, function, and tree-microbe coordination of over 350 root-associated fungal species. In response to altered nutrient status, host trees redefined their relationship with the fungal community by reducing sugar efflux carriers and enhancing defense processes. This resulted in a profound restructuring of the fungal community, a collapse in functional coordination between the tree and the dominant Basidiomycete species, and an increase in functional coordination with versatile Ascomycete species. As such, there was a functional shift in community dominance from Basidiomycete species, with important roles in enzymatically cycling recalcitrant carbon, to Ascomycete species that have melanized cell walls that are highly resistant to degradation. These changes were accompanied by prominent shifts in transcriptional coordination between over 60 predicted fungal effectors and more than 5,000 Norway spruce transcripts, providing mechanistic insight into the complex molecular dialogue coordinating host trees and their fungal partners. The host-microbe dynamics captured by this study functionally inform how these complex and sensitive biological relationships may mediate the carbon storage potential of boreal soils under changing nutrient conditions.
Subject(s)
Ascomycota , Basidiomycota , Mycorrhizae , Picea , Ascomycota/metabolism , Basidiomycota/metabolism , Carbon/metabolism , Ecosystem , Forests , Mycorrhizae/genetics , Mycorrhizae/physiology , Picea/genetics , Picea/microbiology , Soil/chemistry , Soil Microbiology , Taiga , Transcriptome , Trees/metabolism , Trees/microbiology
ABSTRACT
The vast majority of human traits, including many disease phenotypes, are affected by alleles at numerous genomic loci. With a continually increasing set of variants with published clinical disease or biomarker associations, an easy-to-use tool for non-programmers to rapidly screen VCF files for risk alleles is needed. We have developed EZTraits as a tool to quickly evaluate genotype data against a set of rules defined by the user. These rules can be defined directly in the scripting language Lua, for genotype calls using variant ID (RS number) or chromosomal position. Alternatively, EZTraits can parse simple and intuitive text including concepts like 'any' or 'all'. Thus, EZTraits is designed to support rapid genetic analysis and hypothesis-testing by researchers, regardless of programming experience or technical background. The software is implemented in C++ and compiles and runs on Linux and MacOS. The source code is available under the MIT license from https://github.com/selfdecode/rd-eztraits.
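The idea of screening genotype calls against 'any' or 'all' rules can be sketched as below. This is not EZTraits' rule syntax or its Lua interface: the tuple-based rule format, the function names, and the variant IDs are hypothetical and serve only to show the logic.

```python
def carries(genotypes, variant_id, risk_allele):
    """True if the genotype call for variant_id contains the risk allele."""
    return risk_allele in genotypes.get(variant_id, "")


def evaluate(rule, genotypes):
    """Evaluate a rule of the form (mode, [(variant_id, risk_allele), ...])."""
    mode, conditions = rule
    results = (carries(genotypes, vid, allele) for vid, allele in conditions)
    if mode == "any":
        return any(results)
    if mode == "all":
        return all(results)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical genotype calls keyed by variant ID (RS number).
genotypes = {"rs0000001": "AG", "rs0000002": "CC"}

conditions = [("rs0000001", "A"), ("rs0000002", "T")]
any_hit = evaluate(("any", conditions), genotypes)
all_hit = evaluate(("all", conditions), genotypes)
```

A variant missing from the genotype data is treated as not carrying the risk allele, which is one possible convention among several.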
Subject(s)
Genomics , Software , Alleles , Genotype , Phenotype
ABSTRACT
The health, growth, and fitness of boreal forest trees are impacted and improved by their associated microbiomes. Microbial gene expression and functional activity can be assayed with RNA sequencing (RNA-Seq) data from host samples. In contrast, phylogenetic marker gene amplicon sequencing data are used to assess taxonomic composition and community structure of the microbiome. Few studies have considered how much of this structural and taxonomic information is included in transcriptomic data from matched samples. Here, we described fungal communities using both host-derived RNA-Seq and fungal ITS1 DNA amplicon sequencing to compare the outcomes between the methods. We used a panel of root and needle samples from the coniferous tree species Picea abies (Norway spruce) growing in untreated (nutrient-deficient) and nutrient-enriched plots at the Flakaliden forest research site in boreal northern Sweden. We show that the relationship between samples and alpha and beta diversity indicated by the fungal transcriptome is in agreement with that generated by the ITS data, while also identifying a lack of taxonomic overlap due to limitations imposed by current database coverage. Furthermore, we demonstrate how metatranscriptomics data additionally provide biologically informative functional insights. At the community level, there were changes in starch and sucrose metabolism, biosynthesis of amino acids, and pentose and glucuronate interconversions, while processing of organic macromolecules, including aromatic and heterocyclic compounds, was enriched in transcripts assigned to the genus Cortinarius. IMPORTANCE: A deeper understanding of microbial communities associated with plants is revealing their importance for plant health and productivity. RNA extracted from plant field samples represents the host and other organisms present. Typically, gene expression studies focus on the plant component or, in a limited number of studies, expression in one or more associated organisms.
However, metatranscriptomic data are rarely used for taxonomic profiling, which is currently performed using amplicon approaches. We created an assembly-based, reproducible, and hardware-agnostic workflow to taxonomically and functionally annotate fungal RNA-Seq data obtained from Norway spruce roots, which we compared to matching ITS amplicon sequencing data. While we identified some limitations and caveats, we show that functional, taxonomic, and compositional insights can all be obtained from RNA-Seq data. These findings highlight the potential of metatranscriptomics to advance our understanding of interaction, response, and effect between host plants and their associated microbial communities.
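Alpha and beta diversity, which the abstract compares between RNA-Seq and ITS data, can be illustrated with two widely used measures. The abstract does not say which indices were computed, so the Shannon index and Bray-Curtis dissimilarity below are assumed choices, and the count vectors are made-up examples.

```python
import math


def shannon_index(counts):
    """Alpha diversity: Shannon index H = -sum(p_i * ln p_i) over nonzero taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Beta diversity: Bray-Curtis dissimilarity between two count vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Hypothetical per-taxon counts for two samples (same taxon order in both).
sample_a = [10, 10, 10, 10]
sample_b = [40, 0, 0, 0]

h_a = shannon_index(sample_a)   # evenly spread community
h_b = shannon_index(sample_b)   # single dominant taxon
dissimilarity = bray_curtis(sample_a, sample_b)
```

Running the same functions on taxon tables derived from each method is one simple way to check whether they rank samples alike.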
ABSTRACT
The advent of novel sequencing techniques has unraveled a tremendous diversity on Earth. Genomic data allow us to understand ecology and function of organisms that we would not otherwise know existed. However, major methodological challenges remain, in particular for multicellular organisms with large genomes. Arbuscular mycorrhizal (AM) fungi are important plant symbionts with cryptic and complex multicellular life cycles, thus representing a suitable model system for method development. Here, we report a novel method for large scale, unbiased nuclear sorting, sequencing, and de novo assembling of AM fungal genomes. After comparative analyses of three assembly workflows we discuss how sequence data from single nuclei can best be used for different downstream analyses such as phylogenomics and comparative genomics of single nuclei. Based on analysis of completeness, we conclude that comprehensive de novo genome assemblies can be produced from six to seven nuclei. The method is highly applicable for a broad range of taxa, and will greatly improve our ability to study multicellular eukaryotes with complex life cycles.
Subject(s)
Computational Biology/methods , Eukaryota/genetics , Genome , Genomics , Algorithms , Fungi/genetics , Genomics/methods , Workflow
ABSTRACT
BACKGROUND: During infection by intracellular pathogens, a highly complex interplay occurs between the infected cell trying to degrade the invader and the pathogen which actively manipulates the host cell to enable survival and proliferation. Many intracellular pathogens pose important threats to human health and major efforts have been undertaken to better understand the host-pathogen interactions that eventually determine the outcome of the infection. Over the last decades, the unicellular eukaryote Dictyostelium discoideum has become an established infection model, serving as a surrogate macrophage that can be infected with a wide range of intracellular pathogens. In this study, we use high-throughput RNA-sequencing to analyze the transcriptional response of D. discoideum when infected with Mycobacterium marinum and Legionella pneumophila. The results were compared to available data from human macrophages. RESULTS: The majority of the transcriptional regulation triggered by the two pathogens was found to be unique for each bacterial challenge. Hallmark transcriptional signatures were identified for each infection, e.g. induction of endosomal sorting complexes required for transport (ESCRT) and autophagy genes in response to M. marinum and inhibition of genes associated with the translation machinery and energy metabolism in response to L. pneumophila. However, a common response to the pathogenic bacteria was also identified, which was not induced by non-pathogenic food bacteria. Finally, comparison with available data sets of regulation in human monocyte derived macrophages shows that the elicited response in D. discoideum is in many aspects similar to what has been observed in human immune cells in response to Mycobacterium tuberculosis and L. pneumophila. CONCLUSIONS: Our study presents high-throughput characterization of D. discoideum transcriptional response to intracellular pathogens using RNA-seq. 
We demonstrate that the transcriptional response is in essence distinct for each pathogen and that in many cases, the corresponding regulation is recapitulated in human macrophages after infection by mycobacteria and L. pneumophila. This indicates that host-pathogen interactions are evolutionarily conserved, derived from the early interactions between free-living phagocytic cells and bacteria. Taken together, our results strengthen the use of D. discoideum as a general infection model.
Subject(s)
Bacterial Infections/microbiology , Dictyostelium/microbiology , Models, Biological , Protozoan Proteins/genetics , Cells, Cultured , Cytoplasm/microbiology , Gene Expression Profiling , Gene Expression Regulation , Host-Pathogen Interactions/genetics , Humans , Legionella pneumophila/physiology , Macrophages/microbiology , Mycobacterium marinum/physiology , Protozoan Proteins/metabolism , Species Specificity , Transcription, Genetic
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
Type 2 diabetes (T2D) mellitus is a complex metabolic disease commonly caused by insulin resistance in several tissues. We performed a matched two-dimensional metabolic screening in tissue samples from 43 multi-organ donors. The intra-individual analysis was assessed across five key metabolic tissues (serum, visceral adipose tissue, liver, pancreatic islets and skeletal muscle), and the inter-individual across three different groups reflecting T2D progression. We identified 92 metabolites differing significantly between non-diabetes and T2D subjects. In diabetes cases, carnitines were significantly higher in liver, while lysophosphatidylcholines were significantly lower in muscle and serum. We tracked the primary tissue of origin for multiple metabolites whose alterations were reflected in serum. An investigation of three major stages spanning from controls, to pre-diabetes and to overt T2D indicated that a subset of lysophosphatidylcholines was significantly lower in the muscle of pre-diabetes subjects. Moreover, glycodeoxycholic acid was significantly higher in liver of pre-diabetes subjects while additional increase in T2D was insignificant. We confirmed many previously reported findings and substantially expanded on them with altered markers for early and overt T2D. Overall, the analysis of this unique dataset can increase the understanding of the metabolic interplay between organs in the development of T2D.
Subject(s)
Biomarkers/metabolism , Carnitine/metabolism , Diabetes Mellitus, Type 2/metabolism , Lysophosphatidylcholines/metabolism , Metabolome , Prediabetic State/metabolism , Aged , Biomarkers/analysis , Case-Control Studies , Diabetes Mellitus, Type 2/pathology , Female , Humans , Insulin Resistance , Intra-Abdominal Fat/metabolism , Intra-Abdominal Fat/pathology , Liver/metabolism , Liver/pathology , Male , Metabolomics , Middle Aged , Muscle, Skeletal/metabolism , Muscle, Skeletal/pathology , Prediabetic State/pathology , Signal Transduction
ABSTRACT
Schizophrenia is a common mental disorder with high heritability. It is genetically complex and to date more than a hundred risk loci have been identified. Association of environmental factors and schizophrenia has also been reported, while epigenetic analyses have yielded ambiguous and sometimes conflicting results. Here, we analyzed fresh frozen post-mortem brain tissue from a cohort of 73 subjects diagnosed with schizophrenia and 52 control samples, using the Illumina Infinium HumanMethylation450 Bead Chip, to investigate genome-wide DNA methylation patterns in the two groups. Analysis of differential methylation was performed with the Bioconductor Minfi package and modern machine-learning and visualization techniques, which were shown previously to be successful in detecting and highlighting differentially methylated patterns in case-control studies. In this dataset, however, these methods did not uncover any significant signals discerning the patient group and healthy controls, suggesting that if there are methylation changes associated with schizophrenia, they are heterogeneous and complex with small effect.
Subject(s)
DNA Methylation/genetics , Machine Learning , Schizophrenia/genetics , Brain/metabolism , Case-Control Studies , Female , Gene Expression Regulation , Humans , Male , Middle Aged , Schizophrenia/metabolism
ABSTRACT
BACKGROUND: Studies that aim at explaining phenotypes or disease susceptibility by genetic or epigenetic variants often rely on clustering methods to stratify individuals or samples. While statistical associations may point at increased risk for certain parts of the population, the ultimate goal is to make precise predictions for each individual. This necessitates tools that allow for the rapid inspection of each data point, in particular to find explanations for outliers. RESULTS: ACES is an integrative cluster- and phenotype-browser, which implements standard clustering methods, as well as multiple visualization methods in which all sample information can be displayed quickly. In addition, ACES can automatically mine a list of phenotypes for cluster enrichment, whereby the number of clusters and their boundaries are estimated by a novel method. For visual data browsing, ACES provides a 2D or 3D PCA or Heat Map view. ACES is implemented in Java, with a focus on a user-friendly, interactive, graphical interface. CONCLUSIONS: ACES has proven to be an invaluable tool for analyzing large, pre-filtered DNA methylation data sets and RNA-Sequencing data, owing to the ease with which it links molecular markers to complex phenotypes. The source code is available from https://github.com/GrabherrGroup/ACES.
Subject(s)
User-Computer Interface , Cluster Analysis , DNA Methylation , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/pathology , Humans , Internet Access , Principal Component Analysis , RNA/chemistry , RNA/metabolism
ABSTRACT
The Populus genus is one of the major plant model systems, but genomic resources have thus far primarily been available for poplar species, and primarily Populus trichocarpa (Torr. & Gray), which was the first tree with a whole-genome assembly. To further advance evolutionary and functional genomic analyses in Populus, we produced genome assemblies and population genetics resources of two aspen species, Populus tremula L. and Populus tremuloides Michx. The two aspen species have distributions spanning the Northern Hemisphere, where they are keystone species supporting a wide variety of dependent communities and produce a diverse array of secondary metabolites. Our analyses show that the two aspens share a similar genome structure and a highly conserved gene content with P. trichocarpa but display substantially higher levels of heterozygosity. Based on population resequencing data, we observed widespread positive and negative selection acting on both coding and noncoding regions. Furthermore, patterns of genetic diversity and molecular evolution in aspen are influenced by a number of features, such as expression level, coexpression network connectivity, and regulatory variation. To maximize the community utility of these resources, we have integrated all presented data within the PopGenIE web resource (PopGenIE.org).
Subject(s)
Populus/genetics , Biological Evolution , DNA, Plant/genetics , Evolution, Molecular , Genetic Variation , Genetics, Population/methods , Genome, Plant , Genomics , Linkage Disequilibrium/genetics , Phylogeny , Selection, Genetic/genetics , Sequence Analysis, DNA/methods , Trees/genetics
ABSTRACT
The massive increase in computational power over the recent years and wider applications of machine learning methods, coincidental or not, were paralleled by remarkable advances in high-throughput DNA sequencing technologies.[...].
ABSTRACT
Micro (mi)RNAs regulate gene expression in many eukaryotic organisms where they control diverse biological processes. Their biogenesis, from primary transcripts to mature miRNAs, has been extensively characterized in animals and plants, showing distinct differences between these phylogenetically distant groups of organisms. However, comparably little is known about miRNA biogenesis in organisms whose evolutionary position is placed in between plants and animals and/or in unicellular organisms. Here, we investigate miRNA maturation in the unicellular amoeba Dictyostelium discoideum, belonging to Amoebozoa, which branched out after plants but before animals. High-throughput sequencing of small RNAs and poly(A)-selected RNAs demonstrated that the Dicer-like protein DrnB is required, and essentially specific, for global miRNA maturation in D. discoideum. Our RNA-seq data also showed that longer miRNA transcripts, generally preceded by a T-rich putative promoter motif, accumulate in a drnB knock-out strain. For two model miRNAs we defined the transcriptional start sites (TSSs) of primary (pri)-miRNAs and showed that they carry the RNA polymerase II specific m7G-cap. The generation of the 3'-ends of these pri-miRNAs differs, with pri-mir-1177 reading into the downstream gene, and pri-mir-1176 displaying a distinct end. This 3'-end is processed to shorter intermediates, stabilized in DrnB-depleted cells, of which some carry a short oligo(A)-tail. Furthermore, we identified 10 new miRNAs, all DrnB dependent and developmentally regulated. Thus, the miRNA machinery in D. discoideum shares features with both plants and animals, which is in agreement with its evolutionary position and perhaps also an adaptation to its complex lifestyle: unicellular growth and multicellular development.
Subject(s)
Dictyostelium/metabolism , MicroRNAs/biosynthesis , Protozoan Proteins/metabolism , RNA, Protozoan/biosynthesis , Ribonuclease III/metabolism , Adaptation, Biological , Biological Evolution , Dictyostelium/genetics , Gene Knockout Techniques , Genome, Protozoan/genetics , High-Throughput Nucleotide Sequencing , MicroRNAs/analysis , MicroRNAs/genetics , Oligonucleotide Probes/analysis , Oligonucleotide Probes/genetics , Oligonucleotide Probes/metabolism , Promoter Regions, Genetic/genetics , Protozoan Proteins/genetics , RNA, Protozoan/analysis , RNA, Protozoan/genetics , Ribonuclease III/genetics , Transcription, Genetic
ABSTRACT
BACKGROUND: The mammalian adipose tissue plays a central role in energy-balance control, whereas the avian visceral fat hardly expresses leptin, the key adipokine in mammals. Therefore, to assess the endocrine role of adipose tissue in birds, we compared the transcriptome and proteome between two metabolically different types of chickens, broilers and layers, bred towards efficient meat and egg production, respectively. RESULTS: Broilers and layer hens, grown up to sexual maturation under free-feeding conditions, differed 4.0-fold in weight and 1.6-fold in ovarian-follicle counts, yet the relative accumulation of visceral fat was comparable. RNA-seq and mass-spectrometry (MS) analyses of visceral fat revealed differentially expressed genes between broilers and layers, 1106 at the mRNA level (FDR ≤ 0.05), and 203 at the protein level (P ≤ 0.05). In broilers, Ingenuity Pathway Analysis revealed activation of the PTEN-pathway, and in layers increased response to external signals. The expression pattern of genes encoding fat-secreted proteins in broilers and layers was characterized in the RNA-seq and MS data, as well as by qPCR on visceral fat under free feeding and 24 h-feed deprivation. This characterization was expanded using available RNA-seq data of tissues from red junglefowl, and of visceral fat from broilers of different types. These comparisons revealed expression of new adipokines and secreted proteins (LCAT, LECT2, SERPINE2, SFTP1, ZP1, ZP3, APOV1, VTG1 and VTG2) at the mRNA and/or protein levels, with dynamic gene expression patterns in the selected chicken lines (except for ZP1; FDR/P ≤ 0.05) and feed deprivation (NAMPT, SFTPA1 and ZP3) (P ≤ 0.05). In contrast, some of the most prominent adipokines in mammals, leptin, TNF, IFNG, and IL6, were expressed at a low level (FPKM/RPKM < 1) and did not show differential mRNA expression either between broiler and layer lines or between fed and feed-deprived chickens.
CONCLUSIONS: Our study revealed that RNA and protein expression in visceral fat changes with selective breeding, suggesting endocrine roles of visceral fat in the selected phenotypes. In comparison to gene expression in visceral fat of mammals, our findings point to a more direct cross talk of the chicken visceral fat with the reproductive system and lower involvement in the regulation of appetite, inflammation and insulin resistance.
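The low-expression cutoff quoted in the results (FPKM/RPKM < 1) relies on the standard definition of FPKM: fragments per kilobase of transcript per million mapped fragments. A minimal sketch of that arithmetic follows; the function names and the example counts are hypothetical and are not taken from the study.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def is_low_expression(value, threshold=1.0):
    """Apply the FPKM < 1 cutoff mentioned in the abstract."""
    return value < threshold


# Hypothetical gene: 5 fragments on a 10 kb transcript, 1 million mapped fragments.
low = fpkm(5, 10_000, 1_000_000)
# Hypothetical gene: 100 fragments on a 1 kb transcript, same library size.
high = fpkm(100, 1_000, 1_000_000)
```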
Subject(s)
Chickens/genetics , Intra-Abdominal Fat/metabolism , Reproduction/genetics , Adipokines/genetics , Animals , Eating , Female , Gene Expression Profiling , Genomics , Intra-Abdominal Fat/chemistry , Nicotinamide Phosphoribosyltransferase/genetics , PTEN Phosphohydrolase/genetics , PTEN Phosphohydrolase/metabolism , Phenotype , Proteomics , Pulmonary Surfactant-Associated Protein A/genetics , RNA, Messenger/metabolism , Sequence Analysis, RNA , Signal Transduction/genetics , Transcriptome
ABSTRACT
BACKGROUND: Giardia intestinalis is a non-invasive protozoan parasite that causes giardiasis in humans, the most common form of parasite-induced diarrhea. Disease mechanisms are not completely defined and very few virulence factors are known. METHODOLOGY: To identify putative virulence factors and elucidate mechanistic pathways leading to disease, we have used proteomics to identify the major excretory-secretory products (ESPs) when Giardia trophozoites of WB and GS isolates (assemblages A and B, respectively) interact with intestinal epithelial cells (IECs) in vitro. FINDINGS: The main parts of the IEC and parasite secretomes are constitutively released proteins, the majority of which are associated with metabolism but several proteins are released in response to their interaction (87 and 41 WB and GS proteins, respectively, 76 and 45 human proteins in response to the respective isolates). In parasitized IECs, the secretome profile indicated effects on the cell actin cytoskeleton and the induction of immune responses whereas that of Giardia showed anti-oxidation, proteolysis (protease-associated) and induction of encystation responses. The Giardia secretome also contained immunodominant and glycosylated proteins as well as new candidate virulence factors and assemblage-specific differences were identified. A minor part of Giardia ESPs had signal peptides (29% for both isolates) and extracellular vesicles were detected in the ESPs fractions, suggesting alternative secretory pathways. Microscopic analyses showed ESPs binding to IECs and partial internalization. Parasite ESPs reduced ERK1/2 and P38 phosphorylation and NF-κB nuclear translocation. Giardia ESPs altered gene expression in IECs, with a transcriptional profile indicating recruitment of immune cells via chemokines, disturbances in glucose homeostasis, cholesterol and lipid metabolism, cell cycle and induction of apoptosis. 
CONCLUSIONS: This is the first study identifying Giardia ESPs and evaluating their effects on IECs. It highlights the importance of host and parasite ESPs during interactions and reveals the intricate cellular responses that can explain disease mechanisms and attenuated inflammatory responses during giardiasis.
Subject(s)
Giardia lamblia/pathogenicity , Host-Parasite Interactions , Intestinal Mucosa/parasitology , Proteomics , Caco-2 Cells , Extracellular Signal-Regulated MAP Kinases/physiology , Giardia lamblia/metabolism , Giardiasis/etiology , Humans , MAP Kinase Signaling System/physiology , Transcription, Genetic
ABSTRACT
Salamanders exhibit an extraordinary ability among vertebrates to regenerate complex body parts. However, scarce genomic resources have limited our understanding of regeneration in adult salamanders. Here, we present the ~20 Gb genome and transcriptome of the Iberian ribbed newt Pleurodeles waltl, a tractable species suitable for laboratory research. We find that embryonic stem cell-specific miRNAs mir-93b and mir-427/430/302, as well as Harbinger DNA transposons carrying the Myb-like proto-oncogene, have expanded dramatically in the Pleurodeles waltl genome and are co-expressed during limb regeneration. Moreover, we find that a family of salamander methyltransferases is expressed specifically in adult appendages. Using CRISPR/Cas9 technology to perturb transcription factors, we demonstrate that, unlike in the axolotl, Pax3 is present and necessary for development and that, contrary to mammals, muscle regeneration is normal without a functional Pax7 gene. Our data provide a foundation for comparative genomic studies that generate models for the uneven distribution of regenerative capacities among vertebrates.
Subject(s)
Extremities/physiology , Genome/genetics , MicroRNAs/genetics , Pleurodeles/genetics , Regeneration/genetics , Ambystoma mexicanum/genetics , Animals , CRISPR-Cas Systems , DNA Transposable Elements/genetics , Embryonic Stem Cells/metabolism , Gene Editing , Gene Expression Profiling , Genomics , Muscle, Skeletal/physiology , PAX3 Transcription Factor/genetics , PAX7 Transcription Factor/genetics , Proto-Oncogenes/genetics , Regeneration/physiology
ABSTRACT
BACKGROUND: Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Sequencing of RNA molecules, and their subsequent quantification, aims to assess global gene expression changes on the RNA level (transcriptome). While advances in high-throughput RNA-sequencing (RNA-seq) technologies allow for inexpensive data generation, accurate post-processing and normalization across samples are required to eliminate any systematic noise introduced by the biochemical and/or technical processes. Existing methods thus either normalize on selected known reference genes that are invariant in expression across the experiment, assume that the majority of genes are invariant, or assume that the effects of up- and down-regulated genes cancel each other out during the normalization. RESULTS: Here, we present a novel method, moose2, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. The method allows for specifying a set of known or experimentally validated invariant genes, which guides the DP. We experimentally verified the predictions of this method in the bacterium Escherichia coli, and show how moose2 is able to (i) estimate the expression value distances between RNA-seq samples, (ii) reduce the variation of expression values across all samples, and (iii) subsequently reveal new functional groups of genes during the late stages of DNA damage. We further applied the method to three eukaryotic data sets, on which its performance compares favourably to other methods. The software is implemented in C++ and is publicly available from http://grabherr.github.io/moose2/.
CONCLUSIONS: The proposed RNA-seq normalization method, moose2, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data are handled adequately to minimize variations between replicates.
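The idea of a quadratic normalization anchored on invariant genes can be sketched as a least-squares fit of a second-degree polynomial that maps one sample's expression values onto a reference sample. This sketch assumes the invariant gene set is already given; it does not reproduce the dynamic programming scheme that moose2 uses to predict that set, and all names and values below are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the 3x3 normal equations."""
    # Power sums s[k] = sum(x**k) and moments t[k] = sum(y * x**k).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix of the normal equations.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def normalize(values, coeffs):
    """Map raw expression values onto the reference scale with the fitted curve."""
    a, b, c = coeffs
    return [a + b * v + c * v * v for v in values]


# Hypothetical log-expression values of invariant genes in a sample and a reference.
sample_invariant = [1.0, 2.0, 3.0, 4.0, 5.0]
reference_invariant = [1 + 2 * x + 0.5 * x * x for x in sample_invariant]

coeffs = fit_quadratic(sample_invariant, reference_invariant)
# Apply the fitted curve to every gene in the sample, invariant or not.
normalized = normalize([2.5, 4.5], coeffs)
```

Fitting on the invariant subset only, then applying the curve to all genes, keeps genuinely regulated genes from distorting the correction.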
ABSTRACT
Whole genome sequencing (WGS) is a very valuable resource to understand the evolutionary history of poorly known species. However, in organisms with large genomes, such as most amphibians, WGS is still excessively challenging, and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome and the transcriptome must be assembled de novo. We used RNA-seq to obtain the transcriptomic profile for Oreobates cruralis, a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggested that genes related to immune system and defense mechanisms are abundant in the transcriptome of O. cruralis. We also present a pipeline to assist with pre-processing, assembling, evaluating and functionally annotating a de novo transcriptome from RNA-seq data of non-model organisms. Our pipeline guides the inexperienced user in an intuitive way through all the necessary steps to build de novo transcriptome assemblies using readily available software and is freely available at: https://github.com/biomendi/TRANSCRIPTOME-ASSEMBLY-PIPELINE/wiki.