Results 1 - 20 of 170
1.
Nat Methods ; 21(5): 793-797, 2024 May.
Article in English | MEDLINE | ID: mdl-38509328

ABSTRACT

SQANTI3 is a tool designed for the quality control, curation and annotation of long-read transcript models obtained with third-generation sequencing technologies. Leveraging its annotation framework, SQANTI3 calculates quality descriptors of transcript models, junctions and transcript ends. With this information, potential artifacts can be identified and replaced with reliable sequences. Furthermore, the integrated functional annotation feature enables subsequent functional iso-transcriptomics analyses.


Subject(s)
Molecular Sequence Annotation , Transcriptome , Humans , Molecular Sequence Annotation/methods , Software , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Protein Isoforms/genetics , High-Throughput Nucleotide Sequencing/methods
2.
Nucleic Acids Res ; 52(5): e28, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38340337

ABSTRACT

Advances in affordable transcriptome sequencing combined with better exon and gene prediction have motivated many to compare transcription across the tree of life. We develop a mathematical framework to calculate complexity and compare transcript models. Structural features, i.e. intron retention (IR), donor/acceptor site variation, alternative exon cassettes and alternative 5'/3' UTRs, are compared and the distance between transcript models is calculated with nucleotide-level precision. All metrics are implemented in a PyPI package, TranD, and the output can be used to summarize splicing patterns for a transcriptome (1GTF) and between transcriptomes (2GTF). TranD output enables quantitative comparisons between: annotations augmented by empirical RNA-seq data and the original transcript models; transcript model prediction tools for long-read RNA-seq (e.g. FLAIR versus Isoseq3); alternate annotations for a species (e.g. RefSeq vs Ensembl); and closely related species. In C. elegans, Z. mays, D. melanogaster, D. simulans and H. sapiens, alternative exons were observed more frequently in combination with an alternative donor/acceptor than alone. Transcript models in RefSeq and Ensembl are linked and both have unique transcript models with empirical support. D. melanogaster and D. simulans share many transcript models, and long-read RNA-seq data suggest that both species are under-annotated. We recommend combined references.
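The idea of a nucleotide-level distance between two transcript models can be illustrated with a minimal Python sketch. This is not the TranD API: the function names, the returned keys and the 1-based inclusive exon coordinates (as in GTF) are assumptions made only for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumption: exons are (start, end) pairs, 1-based and inclusive, as in GTF.
Exon = Tuple[int, int]


def exon_positions(exons: List[Exon]) -> Set[int]:
    """Return the set of genomic positions covered by a transcript's exons."""
    positions: Set[int] = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_distance(t1: List[Exon], t2: List[Exon]) -> Dict[str, float]:
    """Count shared and transcript-specific exonic nucleotides (hypothetical metric)."""
    p1, p2 = exon_positions(t1), exon_positions(t2)
    shared = len(p1 & p2)
    only_t1 = len(p1 - p2)
    only_t2 = len(p2 - p1)
    union = shared + only_t1 + only_t2
    return {
        "shared_nt": shared,
        "only_t1_nt": only_t1,
        "only_t2_nt": only_t2,
        # Proportion of exonic positions not shared by both models.
        "prop_different": (only_t1 + only_t2) / union if union else 0.0,
    }


# Two models that differ only by an alternative acceptor/donor on the last exon.
model_a = [(100, 200), (300, 400)]
model_b = [(100, 200), (300, 450)]
print(nucleotide_distance(model_a, model_b))
```

Using position sets keeps the sketch short; for whole transcriptomes an interval-based comparison would scale better.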


Subject(s)
Alternative Splicing , Transcriptome , Animals , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Gene Expression Profiling , Nucleotides , RNA Splicing , RNA Sequence Analysis , Species Specificity , Transcriptome/genetics , Software
3.
Nucleic Acids Res ; 50(W1): W551-W559, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35609982

ABSTRACT

PaintOmics is a web server for the integrative analysis and visualisation of multi-omics datasets using biological pathway maps. PaintOmics 4 has several notable updates that improve and extend analyses. Three pathway databases are now supported: KEGG, Reactome and MapMan, providing more comprehensive pathway knowledge for animals and plants. New metabolite analysis methods fill gaps in traditional pathway-based enrichment methods. The metabolite hub analysis selects compounds with a high number of significant genes in their neighbouring network, suggesting regulation by gene expression changes. The metabolite class activity analysis tests the hypothesis that a metabolic class has a higher-than-expected proportion of significant elements, indicating that these compounds are regulated in the experiment. Finally, PaintOmics 4 includes a regulatory omics module to analyse the contribution of trans-regulatory layers (microRNAs, transcription factors and RNA-binding proteins) to pathway regulation. We show the performance of PaintOmics 4 on both mouse and plant data to highlight how these new analysis features provide novel insights into regulatory biology. PaintOmics 4 is available at https://paintomics.org/.
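A "higher-than-expected proportion of significant elements" in a class is commonly tested with a one-sided hypergeometric (Fisher-type) test. The abstract does not state which test PaintOmics 4 uses, so the sketch below is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total: int, total_significant: int,
                         class_size: int, class_significant: int) -> float:
    """P(X >= class_significant) under sampling without replacement.

    total              -- all measured compounds
    total_significant  -- compounds flagged as significant overall
    class_size         -- compounds belonging to the class under test
    class_significant  -- significant compounds within that class
    """
    max_k = min(class_size, total_significant)
    favourable = sum(
        comb(total_significant, k) * comb(total - total_significant, class_size - k)
        for k in range(class_significant, max_k + 1)
    )
    return favourable / comb(total, class_size)


# Hypothetical numbers: 200 compounds, 30 significant, a class of 12 with 6 significant.
p_value = hypergeom_upper_tail(200, 30, 12, 6)
print(f"one-sided enrichment p-value: {p_value:.4g}")
```

A small p-value would suggest that the class contains more significant compounds than expected by chance; with many classes, a multiple-testing correction would normally follow.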


Subject(s)
MicroRNAs , Multiomics , Animals , Mice , Factual Databases , MicroRNAs/genetics , Transcription Factors , Computational Biology/methods
4.
Bioinformatics ; 38(9): 2657-2658, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35238331

ABSTRACT

MOTIVATION: Batch effects in omics datasets are usually a source of technical noise that masks the biological signal and hampers data analysis. Batch effect removal has been widely addressed for individual omics technologies. However, multi-omic datasets may combine data obtained in different batches where omics type and batch are often confounded. Moreover, systematic biases may be introduced without notice during data acquisition, which creates a hidden batch effect. Current methods fail to address batch effect correction in these cases. RESULTS: In this article, we introduce the MultiBaC R package, a tool for batch effect removal in multi-omics and hidden batch effect scenarios. The package includes a diversity of graphical outputs for model validation and assessment of the batch effect correction. AVAILABILITY AND IMPLEMENTATION: The MultiBaC package is available on Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/MultiBaC.html) and GitHub (https://github.com/ConesaLab/MultiBaC.git). The data underlying this article are available in the Gene Expression Omnibus repository (accession numbers GSE11521, GSE1002, GSE56622 and GSE43747). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology , Software
5.
Mol Syst Biol ; 17(6): e9864, 2021 06.
Article in English | MEDLINE | ID: mdl-34132490

ABSTRACT

Understanding stem cell regulatory circuits is the next challenge in plant biology, as these cells are essential for tissue growth and organ regeneration in response to stress. In the Arabidopsis primary root apex, stem cell-specific transcription factors BRAVO and WOX5 co-localize in the quiescent centre (QC) cells, where they commonly repress cell division so that these cells can act as a reservoir to replenish surrounding stem cells, yet their molecular connection remains unknown. Genetic and biochemical analysis indicates that BRAVO and WOX5 form a transcription factor complex that modulates gene expression in the QC cells to preserve overall root growth and architecture. Furthermore, by using mathematical modelling we establish that BRAVO uses the WOX5/BRAVO complex to promote WOX5 activity in the stem cells. Our results unveil the importance of transcriptional regulatory circuits in plant stem cell development.


Subject(s)
Arabidopsis Proteins , Arabidopsis , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Cell Division , Plant Gene Expression Regulation , Homeodomain Proteins/genetics , Meristem/genetics , Meristem/metabolism , Nitriles , Plant Roots/genetics , Plant Roots/metabolism
6.
PLoS Biol ; 17(4): e2006506, 2019 04.
Article in English | MEDLINE | ID: mdl-30978178

ABSTRACT

The differentiation of self-renewing progenitor cells requires not only the regulation of lineage- and developmental stage-specific genes but also the coordinated adaptation of housekeeping functions from a metabolically active, proliferative state toward quiescence. How metabolic and cell-cycle states are coordinated with the regulation of cell type-specific genes is an important question, because dissociation between differentiation, cell cycle, and metabolic states is a hallmark of cancer. Here, we use a model system to systematically identify key transcriptional regulators of Ikaros-dependent B cell-progenitor differentiation. We find that the coordinated regulation of housekeeping functions and tissue-specific gene expression requires a feedforward circuit whereby Ikaros down-regulates the expression of Myc. Our findings show how coordination between differentiation and housekeeping states can be achieved by interconnected regulators. Similar principles likely coordinate differentiation and housekeeping functions during progenitor cell differentiation in other cell lineages.


Subject(s)
B-Lymphocytes/cytology , myc Genes , B-Lymphoid Precursor Cells/cytology , Animals , B-Lymphocytes/metabolism , Cell Cycle/physiology , Cell Differentiation/genetics , Cell Lineage , Genetic Databases , Down-Regulation , Gene Expression Regulation , Essential Genes , Humans , Ikaros Transcription Factor/metabolism , Lymphocyte Activation , Mice , B-Lymphoid Precursor Cells/metabolism , Transcription Factors/metabolism
7.
Genome Res ; 2018 Feb 09.
Article in English | MEDLINE | ID: mdl-29440222

ABSTRACT

High-throughput sequencing of full-length transcripts using long reads has paved the way for the discovery of thousands of novel transcripts, even in well-annotated mammalian species. The advances in sequencing technology have created a need for studies and tools that can characterize these novel variants. Here, we present SQANTI, an automated pipeline for the classification of long-read transcripts that can assess the quality of data and the preprocessing pipeline using 47 unique descriptors. We apply SQANTI to a neuronal mouse transcriptome using Pacific Biosciences (PacBio) long reads and illustrate how the tool is effective in characterizing and describing the composition of the full-length transcriptome. We perform extensive evaluation of ToFU PacBio transcripts by PCR to reveal that a substantial number of the novel transcripts are technical artifacts of the sequencing approach and that SQANTI quality descriptors can be used to engineer a filtering strategy to remove them. Most novel transcripts in this curated transcriptome are novel combinations of existing splice sites, resulting more frequently in novel ORFs than novel UTRs, and are enriched in both general metabolic and neural-specific functions. We show that these new transcripts have a major impact on the correct quantification of transcript levels by state-of-the-art short-read-based quantification algorithms. By comparing our iso-transcriptome with public proteomics databases, we find that alternative isoforms are elusive to proteogenomics detection. SQANTI allows the user to maximize the analytical outcome of long-read technologies by providing the tools to deliver quality-evaluated and curated full-length transcriptomes.

8.
Brief Bioinform ; 20(2): 471-481, 2019 03 22.
Article in English | MEDLINE | ID: mdl-29040385

ABSTRACT

Over the last few years, RNA-seq has been used to study alterations in alternative splicing related to several diseases. Bioinformatics workflows used to perform these studies can be divided into two groups: those finding changes in the absolute isoform expression and those studying differential splicing. Many computational methods for transcriptomics analysis have been developed, evaluated and compared; however, there are not enough reports of systematic and objective assessment of processing pipelines as a whole. Moreover, comparative studies have considered changes in absolute and relative isoform expression levels separately. Consequently, no consensus exists about the best practices and appropriate workflows to analyse alternative and differential splicing. To assist in choosing an adequate pipeline, we present here a benchmarking of nine commonly used workflows to detect differential isoform expression and splicing. We evaluated workflow performance over different experimental scenarios where changes in absolute and relative isoform expression occurred simultaneously. In addition, the effects of the number of isoforms per gene and of the magnitude of the expression change on pipeline performance were also evaluated. Our results suggest that workflow performance is influenced by the number of replicates per condition and the heterogeneity of the conditions. In general, workflows based on DESeq2, DEXSeq, Limma and NOISeq performed well over a wide range of transcriptomics experiments. In particular, we suggest the use of workflows based on Limma when high precision is required, and DESeq2 and DEXSeq pipelines to prioritize sensitivity. When several replicates per condition are available, NOISeq and Limma pipelines are indicated.
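The precision and sensitivity trade-off mentioned above can be made concrete with a short Python sketch. It assumes each workflow returns a set of gene identifiers called as differential and that a simulated ground truth is available; the function name and the example gene sets are hypothetical and are not taken from the benchmark.

```python
from typing import Set, Tuple


def precision_and_sensitivity(predicted: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Return (precision, sensitivity) for a set of calls against a ground truth."""
    true_pos = len(predicted & truth)
    false_pos = len(predicted - truth)
    false_neg = len(truth - predicted)
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    sensitivity = true_pos / (true_pos + false_neg) if truth else 0.0
    return precision, sensitivity


# Hypothetical calls from two workflows on the same simulated experiment.
truth = {"g1", "g2", "g3", "g4", "g5"}
conservative_calls = {"g1", "g2"}                      # few calls, all correct
liberal_calls = {"g1", "g2", "g3", "g4", "g8", "g9"}   # more calls, some wrong

for name, calls in [("conservative", conservative_calls), ("liberal", liberal_calls)]:
    precision, sensitivity = precision_and_sensitivity(calls, truth)
    print(f"{name}: precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

A conservative workflow tends to score higher on precision and a liberal one on sensitivity, which is the kind of trade-off the recommendation above refers to.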


Subject(s)
Alternative Splicing , Benchmarking/methods , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Neoplasm Proteins/genetics , Prostatic Neoplasms/genetics , RNA Sequence Analysis/methods , Case-Control Studies , Gene Expression Profiling , Humans , Male , Neoplasm Proteins/metabolism , Prostate/metabolism , Prostatic Neoplasms/metabolism , Protein Isoforms , Workflow
9.
Bioinformatics ; 36(Suppl_2): i795-i803, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381819

ABSTRACT

MOTIVATION: Molecular pathway databases represent cellular processes in a structured and standardized way. These databases support the community-wide utilization of pathway information in biological research and the computational analysis of high-throughput biochemical data. Although pathway databases are critical in genomics research, the fast progress of biomedical sciences prevents databases from staying up-to-date. Moreover, the compartmentalization of cellular reactions into defined pathways reflects arbitrary choices that might not always be aligned with the needs of the researcher. Today, no tool exists that allows the easy creation of user-defined pathway representations. RESULTS: Here we present Padhoc, a pipeline for pathway ad hoc reconstruction. Based on a set of user-provided keywords, Padhoc combines natural language processing, database knowledge extraction, orthology search and powerful graph algorithms to create navigable pathways tailored to the user's needs. We validate Padhoc with a set of well-established Escherichia coli pathways and demonstrate usability to create not-yet-available pathways in model (human) and non-model (sweet orange) organisms. AVAILABILITY AND IMPLEMENTATION: Padhoc is freely available at https://github.com/ConesaLab/padhoc. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology , Software , Algorithms , Factual Databases , Genomics , Humans
10.
Bioinformatics ; 36(Suppl_2): i618-i624, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381847

ABSTRACT

MOTIVATION: microRNAs (miRNAs) are essential components of gene expression regulation at the post-transcriptional level. miRNAs have a well-defined molecular structure and this has facilitated the development of computational and high-throughput approaches to predict miRNA genes. However, due to their short size, miRNAs have often been incorrectly annotated in both plants and animals. Consequently, published miRNA annotations and miRNA databases are enriched for false miRNAs, jeopardizing their utility as molecular information resources. To address this problem, we developed MirCure, a new software tool for quality control, filtering and curation of miRNA candidates. MirCure is an easy-to-use tool with a graphical interface that allows both scoring of miRNA reliability and browsing of supporting evidence by manual curators. RESULTS: Given a list of miRNA candidates, MirCure evaluates a number of miRNA-specific features based on gene expression, biogenesis and conservation data, and generates a score that can be used to discard poorly supported miRNA annotations. MirCure can also curate and adjust the annotation of the 5p and 3p arms based on user-provided small RNA-seq data. We evaluated MirCure on a set of manually curated animal and plant miRNAs and demonstrated high accuracy. Moreover, we show that MirCure can be used to revisit previous bona fide miRNA annotations to improve miRNA databases. AVAILABILITY AND IMPLEMENTATION: The MirCure software and all the additional scripts used in this project are publicly available at https://github.com/ConesaLab/MirCure. A Docker image of MirCure is available at https://hub.docker.com/r/conesalab/mircure. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
MicroRNAs , Animals , Computational Biology , MicroRNAs/genetics , Plants/genetics , Quality Control , Reproducibility of Results , Software
11.
EMBO Rep ; 20(12): e47964, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31680439

ABSTRACT

RNA-binding proteins (RBPs) participate in all steps of gene expression, underscoring their potential as regulators of RNA homeostasis. We structurally and functionally characterize Mip6, a four-RNA recognition motif (RRM)-containing RBP, as a functional and physical interactor of the export factor Mex67. Mip6-RRM4 directly interacts with the ubiquitin-associated (UBA) domain of Mex67 through a loop containing tryptophan 442. Mip6 shuttles between the nucleus and the cytoplasm in a Mex67-dependent manner and concentrates in cytoplasmic foci under stress. Photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation experiments show preferential binding of Mip6 to mRNAs regulated by the stress-response Msn2/4 transcription factors. Consistent with this binding, MIP6 deletion affects their export and expression levels. Additionally, Mip6 interacts physically and/or functionally with proteins with a role in mRNA metabolism and transcription such as Rrp6, Xrn1, Sgf73, and Rpb1. These results reveal a novel role for Mip6 in the homeostasis of Msn2/4-dependent transcripts through its direct interaction with the Mex67 UBA domain.


Subject(s)
Cell Nucleus/metabolism , Nuclear Proteins/metabolism , Nucleocytoplasmic Transport Proteins/metabolism , RNA-Binding Proteins/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Cell Nucleus Active Transport , Binding Sites , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Nuclear Proteins/chemistry , Nuclear Proteins/genetics , Nucleocytoplasmic Transport Proteins/chemistry , Nucleocytoplasmic Transport Proteins/genetics , Protein Binding , Messenger RNA/genetics , Messenger RNA/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics , Saccharomyces cerevisiae , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/genetics , Physiological Stress , Transcription Factors/genetics , Transcription Factors/metabolism
12.
Cell Biol Toxicol ; 37(1): 129-149, 2021 02.
Article in English | MEDLINE | ID: mdl-33404927

ABSTRACT

Patients with liver cirrhosis may develop covert or minimal hepatic encephalopathy (MHE). Hyperammonemia (HA) and peripheral inflammation play synergistic roles in inducing the cognitive and motor alterations in MHE. The cerebellum is one of the main cerebral regions affected in MHE. Rats with chronic HA show some motor and cognitive alterations reproducing neurological impairment in cirrhotic patients with MHE. Neuroinflammation and altered neurotransmission and signal transduction in the cerebellum from hyperammonemic (HA) rats are associated with motor and cognitive dysfunction, but underlying mechanisms are not completely known. The aim of this work was to use a multi-omic approach to study molecular alterations in the cerebellum from hyperammonemic rats to uncover new molecular mechanisms associated with hyperammonemia-induced cerebellar function impairment. We analyzed metabolomic, transcriptomic, and proteomic data from the same cerebellums from control and HA rats and performed a multi-omic integrative analysis of signaling pathway enrichment with the PaintOmics tool. The histaminergic system, corticotropin-releasing hormone, cyclic GMP-protein kinase G pathway, and intercellular communication in the cerebellar immune system were some of the most relevant enriched pathways in HA rats. In summary, this is a good approach to find altered pathways, which helps to describe the molecular mechanisms involved in the alteration of brain function in rats with chronic HA and to propose possible therapeutic targets to improve MHE symptoms.


Subject(s)
Cerebellum/physiopathology , Hyperammonemia/complications , Animals , Antigen Presentation/immunology , Cell Adhesion Molecules/metabolism , Cyclic GMP/metabolism , Cyclic GMP-Dependent Protein Kinases/metabolism , Hyperammonemia/immunology , Ligands , Male , Wistar Rats , Synaptic Transmission/physiology
13.
Genome Res ; 27(11): 1807-1815, 2017 11.
Article in English | MEDLINE | ID: mdl-29025893

ABSTRACT

Genome-wide association studies (GWAS) have identified multiple, shared allelic associations with many autoimmune diseases. However, the pathogenic contributions of variants residing in risk loci remain unresolved. The location of the majority of shared disease-associated variants in noncoding regions suggests they contribute to risk of autoimmunity through effects on gene expression in the immune system. In the current study, we test this hypothesis by applying RNA sequencing to CD4+, CD8+, and CD19+ lymphocyte populations isolated from 81 subjects with type 1 diabetes (T1D). We characterize and compare the expression patterns across these cell types for three gene sets: all genes, the set of genes implicated in autoimmune disease risk by GWAS, and the subset of these genes specifically implicated in T1D. We performed RNA sequencing and aligned the reads to both the human reference genome and a catalog of all possible splicing events developed from the genome, thereby providing a comprehensive evaluation of the roles of gene expression and alternative splicing (AS) in autoimmunity. Autoimmune candidate genes displayed greater expression specificity in the three lymphocyte populations relative to other genes, with significantly increased levels of splicing events, particularly those predicted to have substantial effects on protein isoform structure and function (e.g., intron retention, exon skipping). The majority of single-nucleotide polymorphisms within T1D-associated loci were also associated with one or more cis-expression quantitative trait loci (cis-eQTLs) and/or splicing eQTLs. Our findings highlight a substantial, and previously underrecognized, role for AS in the pathogenesis of autoimmune disorders and particularly for T1D.


Subject(s)
Alternative Splicing , Type 1 Diabetes Mellitus/genetics , Gene Expression Profiling/methods , Lymphocytes/chemistry , RNA Sequence Analysis/methods , Adult , CD4-Positive T-Lymphocytes/chemistry , CD4-Positive T-Lymphocytes/immunology , CD8-Positive T-Lymphocytes/chemistry , CD8-Positive T-Lymphocytes/immunology , Type 1 Diabetes Mellitus/immunology , Female , Gene Regulatory Networks , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Lymphocytes/immunology , Male , Protein Isoforms/chemistry , Protein Isoforms/metabolism , Quantitative Trait Loci , CCR1 Receptors/metabolism
14.
BMC Plant Biol ; 20(1): 539, 2020 Nov 30.
Article in English | MEDLINE | ID: mdl-33256589

ABSTRACT

BACKGROUND: RNA sequencing has been widely used to profile genome-wide gene expression and identify candidate genes controlling disease resistance and other important traits in plants. Gerbera daisy is one of the most important flowers in the global floricultural trade, and powdery mildew (PM) is the most important disease of gerbera. Genetic improvement of gerbera PM resistance has become a crucial goal in gerbera breeding. A better understanding of the genetic control of gerbera resistance to PM can expedite the development of PM-resistant cultivars. RESULTS: The objectives of this study were to identify gerbera genotypes with contrasting phenotypes in PM resistance and to sequence and analyze their leaf transcriptomes to identify disease resistance and susceptibility genes differentially expressed and associated with PM resistance. An additional objective was to identify SNPs and SSRs for use in future genetic studies. We identified two gerbera genotypes, UFGE 4033 and 06-245-03, that were resistant and susceptible to PM, respectively. De novo assembly of their leaf transcriptomes using four complementary pipelines resulted in 145,348 transcripts with an N50 of 1124 bp, of which 67,312 transcripts contained open reading frames and 48,268 were expressed in both genotypes. A total of 494 transcripts were likely involved in disease resistance, and 17 and 24 transcripts were up- and down-regulated, respectively, in UFGE 4033 compared to 06-245-03. These gerbera disease resistance transcripts were most similar to the NBS-LRR class of plant resistance genes conferring resistance to various pathogens in plants. Four disease susceptibility transcripts (MLO-like) were expressed only or highly expressed in 06-245-03, offering excellent candidate targets for gene editing for PM resistance in gerbera. A total of 449,897 SNPs and 19,393 SSRs were revealed in the gerbera transcriptomes, which can be a valuable resource for developing new molecular markers. CONCLUSION: This study represents the first transcriptomic analysis of gerbera PM resistance, a highly important yet complex trait in a globally important floral crop. The differentially expressed disease resistance and susceptibility transcripts identified provide excellent targets for development of molecular markers and genetic maps, cloning of disease resistance genes, or targeted mutagenesis of disease susceptibility genes for PM resistance in gerbera.
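The N50 reported for the assembly is a standard contiguity statistic: the length L such that sequences of length L or longer account for at least half of the total assembled bases. A minimal Python sketch follows; the function name and the toy lengths are illustrative and unrelated to the gerbera data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half the total.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example: total is 20, and 6 + 5 = 11 reaches half of it, so N50 is 5.
print(n50([2, 3, 4, 5, 6]))
```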


Subject(s)
Ascomycota , Asteraceae/genetics , Disease Resistance/genetics , Plant Diseases/genetics , Transcriptome/genetics , Asteraceae/microbiology , Genotype , Microsatellite Repeats , Phenotype , Plant Breeding , Plant Diseases/microbiology , Plant Leaves/metabolism , Single Nucleotide Polymorphism , RNA-Seq , Real-Time Polymerase Chain Reaction
15.
PLoS Comput Biol ; 15(11): e1006555, 2019 11.
Article in English | MEDLINE | ID: mdl-31682608

ABSTRACT

Rapid advances in single-cell assays have outpaced methods for analysis of those data types. Different single-cell assays show extensive variation in sensitivity and signal-to-noise levels. In particular, scATAC-seq generates extremely sparse and noisy datasets. Existing methods developed to analyze these data require cells amenable to pseudo-time analysis or require datasets with drastically different cell types. We describe a novel approach using self-organizing maps (SOM) to link scATAC-seq regions with scRNA-seq genes that overcomes these challenges and can generate draft regulatory networks. Our SOMatic package generates chromatin and gene expression SOMs separately and combines them using a linking function. We applied SOMatic on a mouse pre-B cell differentiation time-course using controlled Ikaros over-expression to recover gene ontology enrichments, identify motifs in genomic regions showing similar single-cell profiles, and generate a gene regulatory network that both recovers known interactions and predicts new Ikaros targets during the differentiation process. The ability of linked SOMs to detect emergent properties from multiple types of high-dimensional genomic data with very different signal properties opens new avenues for integrative analysis of heterogeneous data.


Subject(s)
DNA Sequence Analysis/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Algorithms , Animals , Computational Biology/methods , Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Software
16.
Nucleic Acids Res ; 46(W1): W503-W509, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29800320

ABSTRACT

The increasing availability of multi-omic platforms poses new challenges to data analysis. Joint visualization of multi-omics data is instrumental in better understanding interconnections across molecular layers and in fully utilizing the multi-omic resources available to make biological discoveries. We present here PaintOmics 3, a web-based resource for the integrated visualization of multiple omic data types onto KEGG pathway diagrams. PaintOmics 3 combines server-end capabilities for data analysis with the potential of modern web resources for data visualization, providing researchers with a powerful framework for interactive exploration of their multi-omics information. Unlike other visualization tools, PaintOmics 3 covers a comprehensive pathway analysis workflow, including automatic feature name/identifier conversion, multi-layered feature matching, pathway enrichment, network analysis, interactive heatmaps, trend charts, and more. It accepts a wide variety of omic types, including transcriptomics, proteomics and metabolomics, as well as region-based approaches such as ATAC-seq or ChIP-seq data. The tool is freely available at www.paintomics.org.


Subject(s)
Gene Expression Regulation , Metabolic Networks and Pathways/genetics , Signal Transduction/genetics , Software , Transcriptome , Transformed Cell Line , Cellular Reprogramming , Computer Graphics , Fibroblasts/cytology , Fibroblasts/metabolism , Genomics/methods , Humans , Internet , Metabolomics/methods , Molecular Sequence Annotation , Proteomics/methods
17.
Mol Microbiol ; 107(1): 116-131, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29105190

ABSTRACT

Transcriptional regulation is the key to ensuring that proteins are expressed at the proper time and in the proper amount. In Escherichia coli, the transcription factor cAMP receptor protein (CRP) is responsible for much of this regulation. Questions remain, however, regarding the regulation of CRP activity itself. Here, we demonstrate that a lysine (K100) on the surface of CRP has a dual function: to promote CRP activity at Class II promoters, and to ensure proper CRP steady state levels. Both functions require the lysine's positive charge; intriguingly, the positive charge of K100 can be neutralized by acetylation using the central metabolite acetyl phosphate as the acetyl donor. We propose that CRP K100 acetylation could be a mechanism by which the cell downwardly tunes CRP-dependent Class II promoter activity, whilst elevating CRP steady state levels, thus indirectly increasing Class I promoter activity. This mechanism would operate under conditions that favor acetate fermentation, such as during growth on glucose as the sole carbon source or when carbon flux exceeds the capacity of the central metabolic pathways.


Subject(s)
Cyclic AMP Receptor Protein/genetics , Cyclic AMP Receptor Protein/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Lysine/metabolism , Acetylation , Binding Sites , Escherichia coli/genetics , Escherichia coli/metabolism , Bacterial Gene Expression Regulation/genetics , Genetic Promoter Regions/genetics , Post-Translational Protein Processing/genetics , Repressor Proteins/metabolism , Transcription Factors/metabolism
18.
Bioinformatics ; 34(9): 1547-1554, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29272325

ABSTRACT

Motivation: The best-performing named entity recognition (NER) methods for biomedical literature are based on hand-crafted features or task-specific rules, which are costly to produce and difficult to generalize to other corpora. End-to-end neural networks achieve state-of-the-art performance without hand-crafted features and task-specific knowledge in non-biomedical NER tasks. However, in the biomedical domain, using the same architecture does not yield competitive performance compared with conventional machine learning models. Results: We propose a novel end-to-end deep learning approach for biomedical NER tasks that leverages the local contexts based on n-gram character and word embeddings via Convolutional Neural Network (CNN). We call this approach GRAM-CNN. To automatically label a word, this method uses the local information around a word. Therefore, the GRAM-CNN method does not require any specific knowledge or feature engineering and can be theoretically applied to a wide range of existing NER problems. The GRAM-CNN approach was evaluated on three well-known biomedical datasets containing different BioNER entities. It obtained an F1-score of 87.26% on the Biocreative II dataset, 87.26% on the NCBI dataset and 72.57% on the JNLPBA dataset. Those results put GRAM-CNN in the lead of the biological NER methods. To the best of our knowledge, we are the first to apply CNN-based structures to BioNER problems. Availability and implementation: The GRAM-CNN source code, datasets and pre-trained model are available online at: https://github.com/valdersoul/GRAM-CNN. Contact: andyli@ece.ufl.edu or aconesa@ufl.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
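To make the notion of character n-grams as local context concrete, the Python sketch below extracts padded character n-grams from a token. It only illustrates the kind of input unit a character-level model can embed; the padding symbol, the function name and the example token are assumptions, and this is not code from the GRAM-CNN repository.

```python
from typing import List


def char_ngrams(word: str, n: int, pad: str = "#") -> List[str]:
    """Return all character n-grams of a word, padded at both ends."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Padding lets the first and last characters appear in n distinct n-grams.
    padded = pad * (n - 1) + word + pad * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


# Character trigrams for a gene-like token.
print(char_ngrams("BRCA1", 3))
```

In a model of this kind, each n-gram would typically be mapped to an embedding vector before convolution; that step is omitted here.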


Subject(s)
Computational Biology/methods , Deep Learning , Software
19.
Bioinformatics ; 34(3): 524-526, 2018 Feb 01.
Article in English | MEDLINE | ID: mdl-28968682

ABSTRACT

Motivation: As sequencing technologies improve their capacity to detect distinct transcripts of the same gene and to address complex experimental designs such as longitudinal studies, there is a need to develop statistical methods for the analysis of isoform expression changes in time series data. Results: Iso-maSigPro is a new functionality of the R package maSigPro for transcriptomics time series data analysis. Iso-maSigPro identifies genes with differential isoform usage across time. The package also includes new clustering and visualization functions that allow grouping of genes with similar expression patterns at the isoform level, as well as identification of genes with a shift in the major expressed isoform. Availability and implementation: The package is freely available under the LGPL license from the Bioconductor web site. Contact: mj.nueda@ua.es or aconesa@ufl.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
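
The idea of a shift in the major expressed isoform mentioned above can be made concrete with a small Python sketch. It is purely descriptive and is not the Iso-maSigPro method or API (maSigPro is an R package and its statistical modelling is not reproduced here): the function names, the input layout (one list of expression values per isoform, one value per time point) and the example numbers are all hypothetical.

```python
from typing import Dict, List


def isoform_usage(expr: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Convert per-isoform expression over time into usage proportions per time point."""
    n_times = len(next(iter(expr.values())))
    totals = [sum(values[t] for values in expr.values()) for t in range(n_times)]
    return {
        iso: [values[t] / totals[t] if totals[t] > 0 else 0.0 for t in range(n_times)]
        for iso, values in expr.items()
    }


def major_isoform_per_time(expr: Dict[str, List[float]]) -> List[str]:
    """Return the isoform with the highest usage at each time point."""
    usage = isoform_usage(expr)
    n_times = len(next(iter(usage.values())))
    return [max(usage, key=lambda iso: usage[iso][t]) for t in range(n_times)]


def has_major_isoform_switch(expr: Dict[str, List[float]]) -> bool:
    """True if the major isoform is not the same at every time point."""
    return len(set(major_isoform_per_time(expr))) > 1


if __name__ == "__main__":
    # Hypothetical gene with two isoforms measured at three time points.
    gene = {"iso_A": [80.0, 60.0, 20.0], "iso_B": [20.0, 40.0, 80.0]}
    print(major_isoform_per_time(gene))
    print(has_major_isoform_switch(gene))
```

A real analysis would test such changes statistically across replicates rather than compare raw proportions.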


Subject(s)
Gene Expression Profiling/methods , RNA Isoforms/analysis , Sequence Analysis, RNA/methods , Software , Animals , B-Lymphocytes/metabolism , B-Lymphocytes/physiology , Cell Differentiation , Gene Expression Regulation , Mice , RNA Isoforms/genetics
20.
Histopathology ; 75(4): 496-507, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31025430

ABSTRACT

AIMS: To discern the differences in expression profiling of two histological subtypes of colorectal carcinoma (CRC) arising from the serrated route [serrated adenocarcinoma (SAC) and CRC showing histological and molecular features of a high level of microsatellite instability (hmMSI-H)], both sharing common features (female gender, right-sided location, mucinous histology, and altered CpG methylation), but dramatically differing in terms of prognosis, development of an immune response, and treatment options. METHODS AND RESULTS: Molecular signatures of SAC and hmMSI-H were obtained by the use of transcriptomic arrays; quantitative polymerase chain reaction (qPCR) and immunohistochemistry (IHC) were used to validate differentially expressed genes. An over-representation of innate immunity functions (granulomonocytic recruitment, chemokine production, Toll-like receptor signalling, and antigen processing and presentation) was obtained from this comparison, and intercellular cell adhesion molecule-1 (ICAM1) was more highly expressed in hmMSI-H, whereas two genes [those encoding calcitonin gene-related peptide-receptor component protein and C-X-C motif chemokine ligand 14 (CXCL14)] were more highly expressed in SAC. These array results were subsequently validated by qPCR, and by IHC for CXCL14 and ICAM1. Information retrieved from public databanks confirmed our findings. CONCLUSIONS: Our findings highlight specific functions and genes that provide a better understanding of the role of the immune response in the serrated pathological route and may be of help in identifying actionable molecules.


Subject(s)
Adenocarcinoma/genetics , Adenocarcinoma/pathology , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Adult , Aged , Biomarkers, Tumor/analysis , Biomarkers, Tumor/genetics , Female , Gene Expression Profiling , Humans , Male , Microsatellite Instability , Middle Aged , Transcriptome