Results 1 - 20 of 67
1.
Nucleic Acids Res ; 51(D1): D167-D178, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399497

ABSTRACT

Dysregulation of RNA splicing contributes to both rare and complex diseases. RNA-sequencing data from human tissues has shown that this process can be inaccurate, resulting in the presence of novel introns detected at low frequency across samples and within an individual. To enable the full spectrum of intron use to be explored, we have developed IntroVerse, which offers an extensive catalogue on the splicing of 332,571 annotated introns and a linked set of 4,679,474 novel junctions covering 32,669 different genes. This dataset has been generated through the analysis of 17,510 human control RNA samples from 54 tissues provided by the Genotype-Tissue Expression Consortium. IntroVerse has two unique features: (i) it provides a complete catalogue of novel junctions and (ii) each novel junction has been assigned to a specific annotated intron. This unique, hierarchical structure offers multiple uses, including the identification of novel transcripts from known genes and their tissue-specific usage, and the assessment of background splicing noise for introns thought to be mis-spliced in disease states. IntroVerse provides a user-friendly web interface and is freely available at https://rytenlab.com/browser/app/introverse.
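The hierarchical structure described above, in which each novel junction is attached to an annotated intron, can be illustrated with a small sketch. The abstract does not state the linking rule, so the rule used here (a novel junction is linked to every annotated intron on the same chromosome and strand with which it shares a start or an end coordinate) is an assumption, and the function name `link_novel_junctions` is hypothetical rather than part of IntroVerse.

```python
from collections import defaultdict


def link_novel_junctions(annotated, novel):
    """Group novel junctions under annotated introns sharing a splice site.

    Both arguments are lists of (chrom, start, end, strand) tuples.
    ASSUMPTION: sharing a start or an end coordinate is the linking rule;
    the source abstract does not specify how junctions are assigned.
    """
    by_start = defaultdict(list)
    by_end = defaultdict(list)
    for intron in annotated:
        chrom, start, end, strand = intron
        by_start[(chrom, strand, start)].append(intron)
        by_end[(chrom, strand, end)].append(intron)

    annotated_set = set(annotated)
    links = defaultdict(list)
    for junction in novel:
        if junction in annotated_set:
            continue  # already annotated, so not novel
        chrom, start, end, strand = junction
        parents = (by_start.get((chrom, strand, start), [])
                   + by_end.get((chrom, strand, end), []))
        for parent in parents:
            links[parent].append(junction)
    return dict(links)


annotated = [("chr1", 100, 200, "+"), ("chr1", 300, 400, "+")]
novel = [("chr1", 100, 180, "+"), ("chr1", 320, 400, "+"), ("chr1", 500, 600, "+")]
links = link_novel_junctions(annotated, novel)
```

Under this rule a junction that shares its start with one intron and its end with another would be listed under both, and a junction sharing neither coordinate (the third one above) is left unlinked.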


Subject(s)
Databases, Genetic; Introns; RNA Splicing; Humans; Alternative Splicing; Base Sequence; Introns/genetics; RNA; RNA Splicing/genetics
2.
BMC Bioinformatics ; 24(1): 340, 2023 Sep 13.
Article in English | MEDLINE | ID: mdl-37704947

ABSTRACT

BACKGROUND: Bisulfite sequencing is a powerful tool for profiling genomic methylation, an epigenetic modification critical in the understanding of cancer, psychiatric disorders, and many other conditions. Raw data generated by whole genome bisulfite sequencing (WGBS) requires several computational steps before it is ready for statistical analysis, and particular care is required to process data in a timely and memory-efficient manner. Alignment to a reference genome is one of the most computationally demanding steps in a WGBS workflow, taking several hours or even days with commonly used WGBS-specific alignment software. This naturally motivates the creation of computational workflows that can utilize GPU-based alignment software to greatly speed up the bottleneck step. In addition, WGBS produces raw data that is large and often unwieldy; a lack of memory-efficient representation of data by existing pipelines renders WGBS impractical or impossible to many researchers. RESULTS: We present BiocMAP, a Bioconductor-friendly methylation analysis pipeline consisting of two modules, to address the above concerns. The first module performs computationally-intensive read alignment using Arioc, a GPU-accelerated short-read aligner. Since GPUs are not always available on the same computing environments where traditional CPU-based analyses are convenient, the second module may be run in a GPU-free environment. This module extracts and merges DNA methylation proportions, the fractions of methylated cytosines across all cells in a sample at a given genomic site. Bioconductor-based output objects in R utilize an on-disk data representation to drastically reduce required main memory and make WGBS projects computationally feasible to more researchers. CONCLUSIONS: BiocMAP is implemented using Nextflow and available at http://research.libd.org/BiocMAP/. To enable reproducible analysis across a variety of typical computing environments, BiocMAP can be containerized with Docker or Singularity, and executed locally or with the SLURM or SGE scheduling engines. By providing Bioconductor objects, BiocMAP's output can be integrated with powerful analytical open source software for analyzing methylation data.
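As a worked illustration of the methylation proportion defined in the abstract, the sketch below estimates it at a single cytosine from read counts. It assumes the usual estimate (methylated reads divided by all reads covering the site); the function name is hypothetical and this is not BiocMAP code.

```python
def methylation_proportion(methylated_reads, unmethylated_reads):
    """Estimate the fraction of methylated cytosines at one genomic site.

    ASSUMPTION: the proportion is estimated as methylated / total reads
    covering the site. Returns None when the site has no coverage.
    """
    total = methylated_reads + unmethylated_reads
    if total == 0:
        return None
    return methylated_reads / total


# Hypothetical counts for three sites: (methylated, unmethylated)
sites = {"chr1:1000": (3, 1), "chr1:1050": (0, 8), "chr1:2000": (0, 0)}
proportions = {site: methylation_proportion(m, u) for site, (m, u) in sites.items()}
```

Returning `None` for uncovered sites keeps "no data" distinct from "fully unmethylated", which matters when proportions are later merged across samples.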


Subject(s)
Genomics; Sulfites; Humans; Sequence Analysis, DNA; Whole Genome Sequencing
3.
Hippocampus ; 33(9): 1009-1027, 2023 09.
Article in English | MEDLINE | ID: mdl-37226416

ABSTRACT

Activity-regulated gene (ARG) expression patterns in the hippocampus (HPC) regulate synaptic plasticity, learning, and memory, and are linked to both risk and treatment responses for many neuropsychiatric disorders. The HPC contains discrete classes of neurons with specialized functions, but cell type-specific activity-regulated transcriptional programs are not well characterized. Here, we used single-nucleus RNA-sequencing (snRNA-seq) in a mouse model of acute electroconvulsive seizures (ECS) to identify cell type-specific molecular signatures associated with induced activity in HPC neurons. We used unsupervised clustering and a priori marker genes to computationally annotate 15,990 high-quality HPC neuronal nuclei from N = 4 mice across all major HPC subregions and neuron types. Activity-induced transcriptomic responses were divergent across neuron populations, with dentate granule cells being particularly responsive to activity. Differential expression analysis identified both upregulated and downregulated cell type-specific gene sets in neurons following ECS. Within these gene sets, we identified enrichment of pathways associated with varying biological processes such as synapse organization, cellular signaling, and transcriptional regulation. Finally, we used matrix factorization to reveal continuous gene expression patterns differentially associated with cell type, ECS, and biological processes. This work provides a rich resource for interrogating activity-regulated transcriptional responses in HPC neurons at single-nuclei resolution in the context of ECS, which can provide biological insight into the roles of defined neuronal subtypes in HPC function.


Subject(s)
Hippocampus; Neurons; Mice; Animals; Hippocampus/physiology; Neurons/physiology; Learning/physiology; Gene Expression Regulation/genetics; Seizures; Gene Expression
4.
Genome Res ; 30(7): 1073-1081, 2020 07.
Article in English | MEDLINE | ID: mdl-32079618

ABSTRACT

Long noncoding RNAs (lncRNAs) have emerged as key coordinators of biological and cellular processes. Characterizing lncRNA expression across cells and tissues is key to understanding their role in determining phenotypes, including human diseases. We present here FC-R2, a comprehensive expression atlas across a broadly defined human transcriptome, inclusive of over 109,000 coding and noncoding genes, as described in the FANTOM CAGE-Associated Transcriptome (FANTOM-CAT) study. This atlas greatly extends the gene annotation used in the original recount2 resource. We demonstrate the utility of the FC-R2 atlas by reproducing key findings from published large studies and by generating new results across normal and diseased human samples. In particular, we (a) identify tissue-specific transcription profiles for distinct classes of coding and noncoding genes, (b) perform differential expression analysis across thirteen cancer types, identifying novel noncoding genes potentially involved in tumor pathogenesis and progression, and (c) confirm the prognostic value for several enhancer lncRNAs expression in cancer. Our resource is instrumental for the systematic molecular characterization of lncRNA by the FANTOM6 Consortium. In conclusion, comprised of over 70,000 samples, the FC-R2 atlas will empower other researchers to investigate functions and biological roles of both known coding genes and novel lncRNAs.


Subject(s)
Transcriptome; Databases, Genetic; Enhancer Elements, Genetic; Gene Expression Profiling; Genome, Human; Humans; Neoplasms/genetics; Organ Specificity; Prognosis; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism; RNA, Messenger/metabolism
5.
Bioinformatics ; 38(11): 3128-3131, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35482478

ABSTRACT

SUMMARY: SpatialExperiment is a new data infrastructure for storing and accessing spatially-resolved transcriptomics data, implemented within the R/Bioconductor framework, which provides advantages of modularity, interoperability, standardized operations and comprehensive documentation. Here, we demonstrate the structure and user interface with examples from the 10x Genomics Visium and seqFISH platforms, and provide access to example datasets and visualization tools in the STexampleData, TENxVisiumData and ggspavis packages. AVAILABILITY AND IMPLEMENTATION: The SpatialExperiment, STexampleData, TENxVisiumData and ggspavis packages are available from Bioconductor. The package versions described in this manuscript are available in Bioconductor version 3.15 onwards. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software; Transcriptome; Genomics
6.
BMC Genomics ; 23(1): 434, 2022 Jun 10.
Article in English | MEDLINE | ID: mdl-35689177

ABSTRACT

BACKGROUND: Spatially-resolved transcriptomics has now enabled the quantification of high-throughput and transcriptome-wide gene expression in intact tissue while also retaining the spatial coordinates. Incorporating the precise spatial mapping of gene activity advances our understanding of intact tissue-specific biological processes. In order to interpret these novel spatial data types, interactive visualization tools are necessary. RESULTS: We describe spatialLIBD, an R/Bioconductor package to interactively explore spatially-resolved transcriptomics data generated with the 10x Genomics Visium platform. The package contains functions to interactively access, visualize, and inspect the observed spatial gene expression data and data-driven clusters identified with supervised or unsupervised analyses, either on the user's computer or through a web application. CONCLUSIONS: spatialLIBD is available at https://bioconductor.org/packages/spatialLIBD . It is fully compatible with SpatialExperiment and the Bioconductor ecosystem. Its functionality facilitates analyzing and interactively exploring spatially-resolved data from the Visium platform.


Subject(s)
Ecosystem; Transcriptome; Genomics; Software
7.
Bioinformatics ; 37(18): 3014-3016, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33693500

ABSTRACT

MOTIVATION: A common way to summarize sequencing datasets is to quantify data lying within genes or other genomic intervals. This can be slow and can require different tools for different input file types. RESULTS: Megadepth is a fast tool for quantifying alignments and coverage for BigWig and BAM/CRAM input files, using substantially less memory than the next-fastest competitor. Megadepth can summarize coverage within all disjoint intervals of the Gencode V35 gene annotation for more than 19 000 GTExV8 BigWig files in approximately 1 h using 32 threads. Megadepth is available both as a command-line tool and as an R/Bioconductor package providing much faster quantification compared to the rtracklayer package. AVAILABILITY AND IMPLEMENTATION: https://github.com/ChristopherWilks/megadepth, https://bioconductor.org/packages/megadepth. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
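To make "summarizing coverage within intervals" concrete, the sketch below computes mean per-base coverage over a set of intervals with a prefix-sum array. It is only an illustration of the task: it is not Megadepth's implementation or interface, and the function name and the 0-based half-open coordinate convention are assumptions chosen for the example.

```python
from itertools import accumulate


def mean_coverage(coverage, intervals):
    """Mean per-base coverage for each (start, end) interval.

    coverage: list of per-base depths for one contig.
    intervals: 0-based, half-open (start, end) pairs with end > start.
    A prefix-sum array makes each interval query O(1) after O(n) setup.
    """
    prefix = [0] + list(accumulate(coverage))
    return [(prefix[end] - prefix[start]) / (end - start)
            for start, end in intervals]


coverage = [0, 2, 2, 4, 4, 0]
means = mean_coverage(coverage, [(1, 3), (3, 5), (0, 6)])
```

Disjoint intervals, as used for the gene annotation in the abstract, avoid counting the same base twice when interval sums are later combined.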


Subject(s)
Genome; Genomics; Software; Molecular Sequence Annotation
8.
BMC Bioinformatics ; 22(1): 224, 2021 May 01.
Article in English | MEDLINE | ID: mdl-33932985

ABSTRACT

BACKGROUND: RNA sequencing (RNA-seq) is a common and widespread biological assay, and an increasing amount of data is generated with it. In practice, there are a large number of individual steps a researcher must perform before raw RNA-seq reads yield directly valuable information, such as differential gene expression data. Existing software tools are typically specialized, only performing one step (such as alignment of reads to a reference genome) of a larger workflow. The demand for a more comprehensive and reproducible workflow has led to the production of a number of publicly available RNA-seq pipelines. However, we have found that most require computational expertise to set up or share among several users, are not actively maintained, or lack features we have found to be important in our own analyses. RESULTS: In response to these concerns, we have developed a Scalable Pipeline for Expression Analysis and Quantification (SPEAQeasy), which is easy to install and share, and provides a bridge towards R/Bioconductor downstream analysis solutions. SPEAQeasy is portable across computational frameworks (SGE, SLURM, local, docker integration) and different configuration files are provided (http://research.libd.org/SPEAQeasy/). CONCLUSIONS: SPEAQeasy is user-friendly and lowers the computational-domain entry barrier for biologists and clinicians to RNA-seq data processing as the main input file is a table with sample names and their corresponding FASTQ files. The goal is to provide a flexible pipeline that is immediately usable by researchers, regardless of their technical background or computing environment.


Subject(s)
High-Throughput Nucleotide Sequencing; Software; RNA-Seq; Sequence Analysis, RNA; Workflow
9.
Bioinformatics ; 36(16): 4532-4534, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32573705

ABSTRACT

SUMMARY: RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools gives researchers the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks. AVAILABILITY AND IMPLEMENTATION: regutools is an R package available through Bioconductor at bioconductor.org/packages/regutools.


Subject(s)
Ecosystem; Escherichia coli K12; Computational Biology; Escherichia coli K12/genetics; Gene Regulatory Networks; Software
10.
Mol Psychiatry ; 25(12): 3267-3277, 2020 12.
Article in English | MEDLINE | ID: mdl-30131587

ABSTRACT

Cigarette smoking during pregnancy is a major public health concern. While there are well-described consequences in early child development, there is very little known about the effects of maternal smoking on human cortical biology during prenatal life. We therefore performed a genome-wide differential gene expression analysis using RNA sequencing (RNA-seq) on prenatal (N = 33; 16 smoking-exposed) as well as adult (N = 207; 57 active smokers) human postmortem prefrontal cortices. Smoking exposure during the prenatal period was directly associated with differential expression of 14 genes; in contrast, during adulthood, despite a much larger sample size, only two genes showed significant differential expression (FDR < 10%). Moreover, 1,315 genes showed significantly different exposure effects between maternal smoking during pregnancy and direct exposure in adulthood (FDR < 10%); these differences were largely driven by prenatal differences that were enriched for pathways previously implicated in addiction and synaptic function. Furthermore, prenatal and age-dependent differentially expressed genes were enriched for genes implicated in non-syndromic autism spectrum disorder (ASD) and were differentially expressed as a set between patients with ASD and controls in postmortem cortical regions. These results underscore the enhanced sensitivity to the biological effect of smoking exposure in the developing brain and offer insight into how maternal smoking during pregnancy affects gene expression in the prenatal human cortex. They also begin to address the relationship between in utero exposure to smoking and the heightened risks for the subsequent development of neuropsychiatric disorders.


Subject(s)
Autism Spectrum Disorder; Prenatal Exposure Delayed Effects; Adult; Brain; Female; Humans; Maternal Exposure; Pregnancy; Prenatal Exposure Delayed Effects/genetics; Sequence Analysis, RNA; Smoking/adverse effects; Smoking/genetics; Transcriptome/genetics
11.
Nucleic Acids Res ; 46(9): e54, 2018 05 18.
Article in English | MEDLINE | ID: mdl-29514223

ABSTRACT

Publicly available genomic data are a valuable resource for studying normal human variation and disease, but these data are often not well labeled or annotated. The lack of phenotype information for public genomic data severely limits their utility for addressing targeted biological questions. We develop an in silico phenotyping approach for predicting critical missing annotation directly from genomic measurements using well-annotated genomic and phenotypic data produced by consortia like TCGA and GTEx as training data. We apply in silico phenotyping to a set of 70 000 RNA-seq samples we recently processed on a common pipeline as part of the recount2 project. We use gene expression data to build and evaluate predictors for both biological phenotypes (sex, tissue, sample source) and experimental conditions (sequencing strategy). We demonstrate how these predictions can be used to study cross-sample properties of public genomic data, select genomic projects with specific characteristics, and perform downstream analyses using predicted phenotypes. The methods to perform phenotype prediction are available in the phenopredict R package and the predictions for recount2 are available from the recount R package. With data and phenotype information available for 70,000 human samples, expression data is available for use on a scale that was not previously feasible.


Subject(s)
Gene Expression Profiling; Phenotype; Sequence Analysis, RNA; Computer Simulation; Female; Humans; Male; Software
12.
Proteomics ; 19(15): e1800315, 2019 08.
Article in English | MEDLINE | ID: mdl-30983154

ABSTRACT

Understanding the molecular profile of every human cell type is essential for understanding its role in normal physiology and disease. Technological advancements in DNA sequencing, mass spectrometry, and computational methods allow us to carry out multiomics analyses, although such approaches are not yet routine. Human umbilical vein endothelial cells (HUVECs) are a widely used model system to study pathological and physiological processes associated with the cardiovascular system. In this study, next-generation sequencing and high-resolution mass spectrometry are employed to profile the transcriptome and proteome of primary HUVECs. Analysis of 145 million paired-end reads from next-generation sequencing confirmed expression of 12 186 protein-coding genes (FPKM ≥0.1) and 439 novel long non-coding RNAs, and revealed 6089 novel isoforms that were not annotated in GENCODE. Proteomics analysis identifies 6477 proteins, including confirmation of N-termini for 1091 proteins, isoforms for 149 proteins, and 1034 phosphosites. A database search to specifically identify other post-translational modifications provides evidence for a number of modification sites on 117 proteins, which include ubiquitylation, lysine acetylation, and mono-, di- and tri-methylation events. Evidence for 11 "missing proteins," which are proteins for which there was insufficient or no protein level evidence, is provided. Peptides supporting missing protein and novel events are validated by comparison of MS/MS fragmentation patterns with synthetic peptides. Finally, 245 variant peptides derived from 207 expressed proteins, in addition to alternate translational start sites for seven proteins and evidence for novel proteoforms for five proteins resulting from alternative splicing, are identified. Overall, it is believed that the integrated approach employed in this study is widely applicable to study any primary cell type for deeper molecular characterization.


Subject(s)
Proteomics/methods; Transcriptome/genetics; Alternative Splicing/genetics; Human Umbilical Vein Endothelial Cells; Humans
13.
BMC Genomics ; 20(1): 513, 2019 Jun 21.
Article in English | MEDLINE | ID: mdl-31226924

ABSTRACT

BACKGROUND: RNA sequencing offers advantages over other quantification methods for microRNA (miRNA), yet numerous biases make reliable quantification challenging. Previous evaluations of these biases have focused on adapter ligation bias with limited evaluation of reverse transcription bias or amplification bias. Furthermore, evaluations of the quantification of isomiRs (miRNA isoforms) or the influence of starting amount on performance have been very limited. No study had yet evaluated the quantification of isomiRs of altered length or compared the consistency of results derived from multiple moderate starting inputs. We therefore evaluated quantifications of miRNA and isomiRs using four library preparation kits, with various starting amounts, as well as quantifications following removal of duplicate reads using unique molecular identifiers (UMIs) to mitigate reverse transcription and amplification biases. RESULTS: All methods resulted in false isomiR detection; however, the adapter-free method tested was especially prone to false isomiR detection. We demonstrate that using UMIs improves accuracy and we provide a guide for input amounts to improve consistency. CONCLUSIONS: Our data show differences and limitations of current methods, thus raising concerns about the validity of quantification of miRNA and isomiRs across studies. We advocate for the use of UMIs to improve accuracy and reliability of miRNA quantifications.
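The idea behind removing duplicate reads with UMIs can be sketched as follows: reads that carry the same sequence and the same UMI are treated as copies of one original molecule and counted once. This is a minimal illustration under simplifying assumptions (exact UMI matching, no correction for sequencing errors in the UMI); the function name and the example reads are hypothetical and do not reproduce the study's pipeline.

```python
from collections import Counter


def umi_deduplicated_counts(reads):
    """Count molecules per miRNA sequence after UMI deduplication.

    reads: iterable of (sequence, umi) pairs, one per sequenced read.
    ASSUMPTION: identical (sequence, umi) pairs are amplification
    duplicates of a single molecule; UMI errors are not corrected.
    """
    unique_molecules = set(reads)
    return Counter(sequence for sequence, _umi in unique_molecules)


reads = [
    ("miR-A", "AAAC"),
    ("miR-A", "AAAC"),  # duplicate of the first read
    ("miR-A", "GGTT"),
    ("miR-B", "AAAC"),
]
raw_counts = Counter(sequence for sequence, _umi in reads)
dedup_counts = umi_deduplicated_counts(reads)
```

Comparing `raw_counts` with `dedup_counts` shows how much of a sequence's apparent abundance comes from amplification rather than from distinct molecules.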


Subject(s)
Sequence Analysis, RNA/standards; Animals; Bias; Humans; Mice; RNA Isoforms; RNA, Viral; Rats; Reproducibility of Results; Sequence Analysis, RNA/methods
14.
Acta Neuropathol ; 137(4): 557-569, 2019 04.
Article in English | MEDLINE | ID: mdl-30712078

ABSTRACT

Late-onset Alzheimer's disease (AD) is a complex age-related neurodegenerative disorder that likely involves epigenetic factors. To better understand the epigenetic state associated with AD, we surveyed 420,852 DNA methylation (DNAm) sites from neurotypical controls (N = 49) and late-onset AD patients (N = 24) across four brain regions (hippocampus, entorhinal cortex, dorsolateral prefrontal cortex and cerebellum). We identified 858 sites with robust differential methylation collectively annotated to 772 possible genes (FDR < 5%, within 10 kb). These sites were overrepresented in AD genetic risk loci (p = 0.00655) and were enriched for changes during normal aging (p < 2.2 × 10^-16), and nearby genes were enriched for processes related to cell-adhesion, immunity, and calcium homeostasis (FDR < 5%). To functionally validate these associations, we generated and analyzed corresponding transcriptome data to prioritize 130 genes within 10 kb of the differentially methylated sites. These 130 genes were differentially expressed between AD cases and controls and their expression was associated with nearby DNAm (p < 0.05). This integrated analysis implicates novel genes in Alzheimer's disease, such as ANKRD30B. These results highlight DNAm differences in Alzheimer's disease that have gene expression correlates, further implicating DNAm as an epigenetic mechanism underlying pathological molecular changes associated with AD. Furthermore, our framework illustrates the value of integrating epigenetic and transcriptomic data for understanding complex disease.


Subject(s)
Alzheimer Disease/genetics; Brain/metabolism; DNA Methylation; Gene Expression Profiling; Aged; Aged, 80 and over; Alzheimer Disease/metabolism; Alzheimer Disease/pathology; Brain/pathology; CpG Islands/genetics; Databases, Genetic; Epigenomics; Female; Humans; Male; Middle Aged
16.
Nucleic Acids Res ; 45(2): e9, 2017 01 25.
Article in English | MEDLINE | ID: mdl-27694310

ABSTRACT

Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single base resolution without relying on existing annotation or potentially inaccurate transcript assembly. We present the derfinder software that improves our annotation-agnostic approach to RNA-seq analysis by: (i) implementing a computationally efficient bump-hunting approach to identify DERs that permits genome-scale analyses in a large number of samples, (ii) introducing a flexible statistical modeling framework, including multi-group and time-course analyses and (iii) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries and can be used to visualize expressed regions at base-resolution. In simulations, our base resolution approaches enable discovery in the presence of incomplete annotation and are nearly as powerful as feature-level methods when the annotation is complete. derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis. The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.
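The notion of an annotation-free "expressed region" can be illustrated with a short sketch: scan a per-base coverage vector and report each maximal run of contiguous bases whose coverage exceeds a cutoff. This is a simplified stand-in, written in Python rather than R, and is not the derfinder implementation; the function name, the strict "greater than" rule, and the 0-based half-open coordinates are assumptions made for the example.

```python
def expressed_regions(mean_coverage, cutoff):
    """Return (start, end) runs where coverage stays above the cutoff.

    mean_coverage: per-base mean coverage across samples for one contig.
    Coordinates are 0-based and half-open. ASSUMPTION: a base is
    "expressed" when its coverage is strictly greater than the cutoff.
    """
    regions = []
    start = None
    for position, value in enumerate(mean_coverage):
        if value > cutoff:
            if start is None:
                start = position  # a new region opens here
        elif start is not None:
            regions.append((start, position))  # the region closes
            start = None
    if start is not None:
        regions.append((start, len(mean_coverage)))
    return regions


regions = expressed_regions([0, 5, 6, 0, 0, 7, 8, 9], cutoff=4)
```

Because no gene model is consulted, regions found this way can fall outside known gene boundaries, which is the property the abstract highlights.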


Subject(s)
Gene Expression Profiling/methods; Software; Gene Expression Regulation; Genomics/methods; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Organ Specificity/genetics; Transcriptome; Web Browser
17.
Bioinformatics ; 33(24): 4033-4040, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-27592709

ABSTRACT

MOTIVATION: RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it requires extra work to obtain analysis products that incorporate data from across samples. RESULTS: We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 667 RNA-seq samples from the GEUVADIS project on Amazon Web Services in under 16 h for US$0.91 per sample. Rail-RNA outputs alignments in SAM/BAM format; but it also outputs (i) base-level coverage bigWigs for each sample; (ii) coverage bigWigs encoding normalized mean and median coverages at each base across samples analyzed; and (iii) exon-exon splice junctions and indels (features) in columnar formats that juxtapose coverages in samples in which a given feature is found. Supplementary outputs are ready for use with downstream packages for reproducible statistical analysis. We use Rail-RNA to identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounding variables. AVAILABILITY AND IMPLEMENTATION: Rail-RNA is open-source software available at http://rail.bio. CONTACTS: anellore@gmail.com or langmea@cs.jhu.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA Splicing; Sequence Alignment/methods; Sequence Analysis, RNA/methods; Software; Exons; Gene Expression Profiling
19.
Proc Natl Acad Sci U S A ; 108(48): E1236-43, 2011 Nov 29.
Article in English | MEDLINE | ID: mdl-22074846

ABSTRACT

Many different systems of bacterial interactions have been described. However, relatively few studies have explored how interactions between different microorganisms might influence bacterial development. To explore such interspecies interactions, we focused on Bacillus subtilis, which characteristically develops into matrix-producing cannibals before entering sporulation. We investigated whether organisms from the natural environment of B. subtilis, the soil, were able to alter the development of B. subtilis. To test this possibility, we developed a coculture microcolony screen in which we used fluorescent reporters to identify soil bacteria able to induce matrix production in B. subtilis. Most of the bacteria that influence matrix production in B. subtilis are members of the genus Bacillus, suggesting that such interactions may be predominantly with close relatives. The interactions we observed were mediated via two different mechanisms. One resulted in increased expression of matrix genes via the activation of a sensor histidine kinase, KinD. The second was kinase independent and conceivably functions by altering the relative subpopulations of B. subtilis cell types by preferentially killing noncannibals. These two mechanisms were grouped according to the inducing strain's relatedness to B. subtilis. Our results suggest that bacteria preferentially alter their development in response to secreted molecules from closely related bacteria and do so using mechanisms that depend on the phylogenetic relatedness of the interacting bacteria.


Subject(s)
Bacillus subtilis/physiology; Bacterial Proteins/metabolism; Biofilms/growth & development; Extracellular Matrix/metabolism; Gene Expression Regulation, Bacterial/physiology; Quorum Sensing/physiology; Soil Microbiology; Base Sequence; Fluorescence; Likelihood Functions; Microbial Viability; Models, Genetic; Molecular Sequence Data; Phylogeny; RNA, Ribosomal, 16S/genetics; Sequence Analysis, DNA; Species Specificity
20.
bioRxiv ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38617294

ABSTRACT

Relative cell type fraction estimates in bulk RNA-sequencing data are important to control for cell composition differences across heterogenous tissue samples. Current computational tools estimate relative RNA abundances rather than cell type proportions in tissues with varying cell sizes, leading to biased estimates. We present lute, a computational tool to accurately deconvolute cell types with varying sizes. Our software wraps existing deconvolution algorithms in a standardized framework. Using simulated and real datasets, we demonstrate how lute adjusts for differences in cell sizes to improve the accuracy of cell composition. Software is available from https://bioconductor.org/packages/lute.
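A small numerical sketch shows why cell size matters here. It assumes, purely for illustration, that each cell type's share of bulk RNA is proportional to its cell proportion multiplied by a per-type size factor, so dividing the RNA fractions by the size factors and renormalizing recovers the cell proportions. This is not the algorithm implemented in lute, and the function name and size factors are hypothetical.

```python
def rna_fractions_to_cell_proportions(rna_fractions, cell_sizes):
    """Convert RNA abundance fractions into cell type proportions.

    ASSUMPTION: rna_fraction[k] is proportional to
    cell_proportion[k] * cell_size[k], so dividing by size and
    renormalizing recovers the proportions. Both arguments are dicts
    keyed by cell type; sizes must be positive.
    """
    scaled = {k: rna_fractions[k] / cell_sizes[k] for k in rna_fractions}
    total = sum(scaled.values())
    return {k: value / total for k, value in scaled.items()}


# Hypothetical example: neurons contribute 3x more RNA per cell than glia.
rna_fractions = {"neuron": 0.75, "glia": 0.25}
cell_sizes = {"neuron": 3.0, "glia": 1.0}
proportions = rna_fractions_to_cell_proportions(rna_fractions, cell_sizes)
```

In this example an uncorrected reading would report 75% neurons, while the size-adjusted estimate is an even split, which is the kind of bias the abstract describes.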
