Search | BVS CLAP/SMR-OPS/OMS

1.

mbQTL: an R/Bioconductor package for microbial quantitative trait loci (QTL) estimation.

Movassagh, Mercedeh; Schiff, Steven J; Paulson, Joseph N.

Bioinformatics ; 39(9), 2023 09 02.

Article in English | MEDLINE | ID: mdl-37707523

ABSTRACT

MOTIVATION: In recent years, significant strides have been made in the field of genomics, with the commencement of large-scale studies aimed at collecting host mutational profiles and microbiome data. The amalgamation of host gene mutational profiles in both healthy and diseased subjects with microbial abundance data holds immense promise in providing insights into several crucial research questions, including the development and progression of diseases, as well as individual responses to therapeutic interventions. With the advent of sequencing methods such as 16S ribosomal RNA (rRNA) sequencing and whole genome sequencing, there is increasing evidence of interplay between human genetics and microbial communities. Quantitative trait loci associated with microbial abundance (mbQTLs) are genetic variants that influence the abundance of microbial populations within the host. RESULTS: Here, we introduce mbQTL, the first R package integrating 16S rRNA sequencing with single-nucleotide variation (SNV) and single-nucleotide polymorphism (SNP) data. We describe various statistical methods implemented for the identification of microbe-SNV pairs, relevant statistical measures, and plot functionality for interpretation. AVAILABILITY AND IMPLEMENTATION: mbQTL is available on Bioconductor at https://bioconductor.org/packages/mbQTL/.
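As a rough illustration of what testing a single microbe-SNV pair can involve, the sketch below regresses a taxon's abundance on genotype dosage (0, 1 or 2 copies of the alternate allele) and reports the least-squares slope and the Pearson correlation. It is written in Python rather than R and is not the mbQTL API: the function name `snv_taxon_association` and the toy data are assumptions made for this example, and the abstract does not state which models the package actually uses.

```python
def snv_taxon_association(dosages, abundances):
    """Return (slope, pearson_r) for abundance regressed on genotype dosage.

    dosages    -- per-subject allele counts (0, 1 or 2); hypothetical input
    abundances -- per-subject abundance of one taxon, same subject order
    """
    n = len(dosages)
    if n != len(abundances) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 subjects")

    mean_x = sum(dosages) / n
    mean_y = sum(abundances) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in abundances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, abundances))

    if sxx == 0 or syy == 0:
        raise ValueError("no variation in dosage or abundance")

    slope = sxy / sxx                  # change in abundance per extra allele
    pearson_r = sxy / (sxx * syy) ** 0.5
    return slope, pearson_r


# Toy data (made up): abundance rises by 2 units per alternate allele.
toy_dosages = [0, 1, 2, 0, 1, 2]
toy_abundances = [1.0, 3.0, 5.0, 1.0, 3.0, 5.0]
toy_slope, toy_r = snv_taxon_association(toy_dosages, toy_abundances)
```

In a real analysis this calculation would be repeated for every taxon-variant pair and followed by multiple-testing correction, which this sketch leaves out.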

Subject(s)

Microbiota , Quantitative Trait Loci , Humans , RNA, Ribosomal, 16S/genetics , Genomics , Mutation , Nucleotides

2.

Neonatal Paenibacilliosis: Paenibacillus Infection as a Novel Cause of Sepsis in Term Neonates With High Risk of Sequelae in Uganda.

Ericson, Jessica E; Burgoine, Kathy; Kumbakumba, Elias; Ochora, Moses; Hehnly, Christine; Bajunirwe, Francis; Bazira, Joel; Fronterre, Claudio; Hagmann, Cornelia; Kulkarni, Abhaya V; Kumar, M Senthil; Magombe, Joshua; Mbabazi-Kabachelor, Edith; Morton, Sarah U; Movassagh, Mercedeh; Mugamba, John; Mulondo, Ronald; Natukwatsa, Davis; Kaaya, Brian Nsubuga; Olupot-Olupot, Peter; Onen, Justin; Sheldon, Kathryn; Smith, Jasmine; Ssentongo, Paddy; Ssenyonga, Peter; Warf, Benjamin; Wegoye, Emmanuel; Zhang, Lijun; Kiwanuka, Julius; Paulson, Joseph N; Broach, James R; Schiff, Steven J.

Clin Infect Dis ; 77(5): 768-775, 2023 09 11.

Article in English | MEDLINE | ID: mdl-37279589

ABSTRACT

BACKGROUND: Paenibacillus thiaminolyticus may be an underdiagnosed cause of neonatal sepsis. METHODS: We prospectively enrolled a cohort of 800 full-term neonates presenting with a clinical diagnosis of sepsis at 2 Ugandan hospitals. Quantitative polymerase chain reaction assays specific to P. thiaminolyticus and to the Paenibacillus genus were performed on the blood and cerebrospinal fluid (CSF) of 631 neonates who had both specimen types available. Neonates with Paenibacillus genus or species detected in either specimen type were considered to potentially have paenibacilliosis (37/631, 6%). We described antenatal, perinatal, and neonatal characteristics, presenting signs, and 12-month developmental outcomes for neonates with paenibacilliosis versus clinical sepsis due to other causes. RESULTS: Median age at presentation was 3 days (interquartile range 1, 7). Fever (92%), irritability (84%), and clinical signs of seizures (51%) were common. Eleven (30%) had an adverse outcome: 5 (14%) neonates died during the first year of life; 5 of 32 (16%) survivors developed postinfectious hydrocephalus (PIH) and 1 (3%) additional survivor had neurodevelopmental impairment without hydrocephalus. CONCLUSIONS: Paenibacillus species was identified in 6% of neonates with signs of sepsis who presented to 2 Ugandan referral hospitals; 70% were P. thiaminolyticus. Improved diagnostics for neonatal sepsis are urgently needed. Optimal antibiotic treatment for this infection is unknown but ampicillin and vancomycin will be ineffective in many cases. These results highlight the need to consider local pathogen prevalence and the possibility of unusual pathogens when determining antibiotic choice for neonatal sepsis.

Subject(s)

Hydrocephalus , Neonatal Sepsis , Paenibacillus , Sepsis , Infant, Newborn , Humans , Female , Pregnancy , Uganda/epidemiology , Sepsis/complications , Sepsis/epidemiology , Sepsis/drug therapy , Anti-Bacterial Agents/therapeutic use , Disease Progression

3.

GameRank: R package for feature selection and construction.

Henneges, Carsten; Paulson, Joseph N.

Bioinformatics ; 38(20): 4840-4842, 2022 10 14.

Article in English | MEDLINE | ID: mdl-35951761

ABSTRACT

MOTIVATION: Calibrated and discriminating predictive models can be developed through the direct optimization of model performance metrics with combinatorial search algorithms. Often, predictive algorithms are desired in clinical settings to identify patients who may be at high or low risk. However, due to the large combinatorial search space, these algorithms are slow and do not guarantee the global optimality of their selection. RESULTS: Here, we present a novel and quick maximum likelihood-based feature selection algorithm, named GameRank. The method is implemented in an R package that includes additional functions to build calibrated and discriminative predictive models. AVAILABILITY AND IMPLEMENTATION: GameRank is available at https://github.com/Genentech/GameRank and released under the MIT License.

Subject(s)

Algorithms , Software , Humans , Likelihood Functions , Research Design

4.

mirTarRnaSeq: An R/Bioconductor Statistical Package for miRNA-mRNA Target Identification and Interaction Analysis.

Movassagh, Mercedeh; Morton, Sarah U; Hehnly, Christine; Smith, Jasmine; Doan, Trang T; Irizarry, Rafael; Broach, James R; Schiff, Steven J; Bailey, Jeffrey A; Paulson, Joseph N.

BMC Genomics ; 23(1): 439, 2022 Jun 13.

Article in English | MEDLINE | ID: mdl-35698050

ABSTRACT

We introduce mirTarRnaSeq, an R/Bioconductor package for quantitative assessment of miRNA-mRNA relationships within sample cohorts. mirTarRnaSeq is a statistical package to explore predicted or pre-hypothesized miRNA-mRNA relationships following target prediction. We present two use cases applying mirTarRnaSeq. First, to identify miRNA targets, we examined EBV miRNAs for interaction with human and virus transcriptomes of stomach adenocarcinoma. This revealed enrichment of mRNA targets highly expressed in CD105+ endothelial cells, monocytes, CD4+ T cells, NK cells, CD19+ B cells, and CD34 cells. Next, to investigate miRNA-mRNA relationships in SARS-CoV-2 (COVID-19) infection across time, we used paired miRNA and RNA sequenced datasets of SARS-CoV-2 infected lung epithelial cells across three time points (4, 12, and 24 hours post-infection). mirTarRnaSeq identified evidence for human miRNAs targeting cytokine signaling and neutrophil regulation immune pathways from 4 to 24 hours after SARS-CoV-2 infection. Confirming the clinical relevance of these predictions, three of the immune-specific mRNA-miRNA relationships identified in human lung epithelial cells after SARS-CoV-2 infection were also observed to be differentially expressed in blood from patients with COVID-19. Overall, mirTarRnaSeq is a robust tool that can address a wide range of biological questions, providing improved prediction of miRNA-mRNA interactions.

Subject(s)

COVID-19 , MicroRNAs , COVID-19/genetics , Endothelial Cells , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , SARS-CoV-2

5.

MicrobiomeExplorer: an R package for the analysis and visualization of microbial communities.

Reeder, Janina; Huang, Mo; Kaminker, Joshua S; Paulson, Joseph N.

Bioinformatics ; 37(9): 1317-1318, 2021 06 09.

Article in English | MEDLINE | ID: mdl-32960962

ABSTRACT

SUMMARY: We developed the MicrobiomeExplorer R package to facilitate the analysis and visualization of microbial communities. The MicrobiomeExplorer R package allows a user to perform typical microbiome analytic workflows and visualize their results, either through the command line or an interactive Shiny application included with the package. In addition to applying common analytical workflows, the application enables automated analysis report generation. AVAILABILITY AND IMPLEMENTATION: Available at https://github.com/zoecastillo/microbiomeExplorer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Microbiota , Software

6.

Prognostic mutational subtyping in de novo diffuse large B-cell lymphoma.

Kim, Eugene; Jiang, Yanwen; Xu, Tao; Bazeos, Alexandra; Knapp, Andrea; Bolen, Christopher R; Humphrey, Kathryn; Nielsen, Tina G; Penuel, Elicia; Paulson, Joseph N.

BMC Cancer ; 22(1): 231, 2022 Mar 03.

Article in English | MEDLINE | ID: mdl-35236331

ABSTRACT

BACKGROUND: Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous disease defined using a number of well-established molecular subsets. Application of non-negative matrix factorization (NMF) to whole exome sequence data has previously been used to identify six distinct molecular clusters in DLBCL with potential clinical relevance. In this study, we applied NMF-clustering to targeted sequencing data utilizing the FoundationOne Heme® panel from the Phase III GOYA (NCT01287741) and Phase Ib/II CAVALLI studies (NCT02055820) in de novo DLBCL. Biopsy samples, survival outcomes, RNA-Seq and targeted exome-sequencing data were available for 423 patients in GOYA (obinutuzumab [G]-cyclophosphamide, doxorubicin, vincristine, and prednisone [CHOP] vs rituximab [R]-CHOP) and 86 patients in CAVALLI (venetoclax+[G/R]-CHOP). RESULTS: When the NMF algorithm was applied to samples from the GOYA study analyzed using a comprehensive genomic profiling platform, four of the six groups previously reported were observed: MYD88/CD79B, BCL2/EZH2, NOTCH2/TNFAIP3, and no mutations. Mutation profiles, cell-of-origin subset distributions and clinical associations of MYD88/CD79B and BCL2/EZH2 groups were similar to those described in previous NMF studies. In contrast, application of NMF to the CAVALLI study yielded only three groups: MYD88/CD79B-like and BCL2/EZH2-like clusters and a no-mutations group, and there was a trend towards improved outcomes for BCL2/EZH2 over MYD88/CD79B. CONCLUSIONS: This analysis supports the utility of NMF used in conjunction with targeted sequencing platforms for identifying patients in different prognostic subsets. The observed trend for improved overall survival in the BCL2/EZH2 group is consistent with the mechanism of action of venetoclax, suggesting that targeted sequencing combined with NMF has potential for identifying patients who are more likely to gain benefit from venetoclax therapy.

Subject(s)

Lymphoma, Large B-Cell, Diffuse/genetics , Mutation/genetics , Adult , Aged , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Bridged Bicyclo Compounds, Heterocyclic/therapeutic use , Clinical Trials, Phase II as Topic , Clinical Trials, Phase III as Topic , Enhancer of Zeste Homolog 2 Protein/genetics , Female , Humans , Lymphoma, Large B-Cell, Diffuse/drug therapy , Male , Middle Aged , Prognosis , Proto-Oncogene Proteins c-bcl-2/genetics , RNA-Seq , Sulfonamides/therapeutic use , Treatment Outcome , Exome Sequencing

7.

Multivariable association discovery in population-scale meta-omics studies.

Mallick, Himel; Rahnavard, Ali; McIver, Lauren J; Ma, Siyuan; Zhang, Yancong; Nguyen, Long H; Tickle, Timothy L; Weingart, George; Ren, Boyu; Schwager, Emma H; Chatterjee, Suvo; Thompson, Kelsey N; Wilkinson, Jeremy E; Subramanian, Ayshwarya; Lu, Yiren; Waldron, Levi; Paulson, Joseph N; Franzosa, Eric A; Bravo, Hector Corrada; Huttenhower, Curtis.

PLoS Comput Biol ; 17(11): e1009442, 2021 11.

Article in English | MEDLINE | ID: mdl-34784344

ABSTRACT

It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.

Subject(s)

Computational Biology , Gastrointestinal Microbiome , Multivariate Analysis , Computer Simulation , Humans , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/metabolism , Inflammatory Bowel Diseases/pathology

8.

metagenomeFeatures: an R package for working with 16S rRNA reference databases and marker-gene survey feature data.

Olson, Nathan D; Shah, Nidhi; Kancherla, Jayaram; Wagner, Justin; Paulson, Joseph N; Corrada Bravo, Hector.

Bioinformatics ; 35(19): 3870-3872, 2019 10 01.

Article in English | MEDLINE | ID: mdl-30821316

ABSTRACT

SUMMARY: We developed the metagenomeFeatures R Bioconductor package along with annotation packages for three 16S rRNA databases (Greengenes, RDP and SILVA) to facilitate working with 16S rRNA databases and marker-gene survey feature data. The metagenomeFeatures package defines two classes, MgDb for working with 16S rRNA sequence databases, and mgFeatures for marker-gene survey feature data. The associated annotation packages provide a consistent interface to the different databases facilitating database comparison and exploration. The mgFeatures-class represents a crucial step in the development of a common data structure for working with 16S marker-gene survey data in R. AVAILABILITY AND IMPLEMENTATION: https://bioconductor.org/packages/release/bioc/html/metagenomeFeatures.html. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.

Subject(s)

Databases, Nucleic Acid , Software , RNA, Ribosomal, 16S , Surveys and Questionnaires

9.

Prognostic impact of somatic mutations in diffuse large B-cell lymphoma and relationship to cell-of-origin: data from the phase III GOYA study.

Bolen, Christopher R; Klanova, Magdalena; Trneny, Marek; Sehn, Laurie H; He, Jie; Tong, Jing; Paulson, Joseph N; Kim, Eugene; Vitolo, Umberto; Di Rocco, Alice; Fingerle-Rowson, Günter; Nielsen, Tina; Lenz, Georg; Oestergaard, Mikkel Z.

Haematologica ; 105(9): 2298-2307, 2020 09 01.

Article in English | MEDLINE | ID: mdl-33054054

ABSTRACT

Diffuse large B-cell lymphoma represents a biologically and clinically heterogeneous diagnostic category with well-defined cell-of-origin subtypes. Using data from the GOYA study (NCT01287741), we characterized the mutational profile of diffuse large B-cell lymphoma and evaluated the prognostic impact of somatic mutations in relation to cell-of-origin. Targeted DNA next-generation sequencing was performed in 499 formalin-fixed paraffin-embedded tissue biopsies from previously untreated patients. Prevalence of genetic alterations/mutations was examined. Multivariate Cox regression was used to evaluate the prognostic effect of individual genomic alterations. Of 465 genes analyzed, 59 were identified with mutations occurring in at least 10 of 499 patients (≥2% prevalence); 334 additional genes had mutations occurring in ≥1 patient. Single nucleotide variants were the most common mutation type. On multivariate analysis, BCL2 alterations were most strongly associated with shorter progression-free survival (multivariate hazard ratio: 2.6; 95% confidence interval: 1.6 to 4.2). BCL2 alterations were detected in 102 of 499 patients; 92 had BCL2 translocations, 90% of whom had germinal center B-cell-like diffuse large B-cell lymphoma. BCL2 alterations were also significantly correlated with BCL2 gene and protein expression levels. Validation of published mutational subsets revealed consistent patterns of co-occurrence, but no consistent prognostic differences between subsets. Our data confirm the molecular heterogeneity of diffuse large B-cell lymphoma, with potential treatment targets occurring in distinct cell-of-origin subtypes. clinicaltrials.gov identifier: NCT01287741.

Subject(s)

Lymphoma, Large B-Cell, Diffuse , Proto-Oncogene Proteins c-myc , Antineoplastic Combined Chemotherapy Protocols , Cyclophosphamide/therapeutic use , Doxorubicin/therapeutic use , Humans , Lymphoma, Large B-Cell, Diffuse/diagnosis , Lymphoma, Large B-Cell, Diffuse/drug therapy , Lymphoma, Large B-Cell, Diffuse/genetics , Mutation , Prednisone/therapeutic use , Prognosis , Proto-Oncogene Proteins c-bcl-2/genetics , Proto-Oncogene Proteins c-myc/genetics , Rituximab/therapeutic use , Vincristine/therapeutic use

10.

Metaviz: interactive statistical and visual analysis of metagenomic data.

Wagner, Justin; Chelaru, Florin; Kancherla, Jayaram; Paulson, Joseph N; Zhang, Alexander; Felix, Victor; Mahurkar, Anup; Elmqvist, Niklas; Corrada Bravo, Héctor.

Nucleic Acids Res ; 46(6): 2777-2787, 2018 04 06.

Article in English | MEDLINE | ID: mdl-29529268

ABSTRACT

Large studies profiling microbial communities and their association with healthy or disease phenotypes are now commonplace. Processed data from many of these studies are publicly available but significant effort is required for users to effectively organize, explore and integrate it, limiting the utility of these rich data resources. Effective integrative and interactive visual and statistical tools to analyze many metagenomic samples can greatly increase the value of these data for researchers. We present Metaviz, a tool for interactive exploratory data analysis of annotated microbiome taxonomic community profiles derived from marker gene or whole metagenome shotgun sequencing. Metaviz is uniquely designed to address the challenge of browsing the hierarchical structure of metagenomic data features while rendering visualizations of data values that are dynamically updated in response to user navigation. We use Metaviz to provide the UMD Metagenome Browser web service, allowing users to browse and explore data for more than 7000 microbiomes from published studies. Users can also deploy Metaviz as a web service, or use it to analyze data through the metavizr package to interoperate with state-of-the-art analysis tools available through Bioconductor. Metaviz is free and open source with the code, documentation and tutorials publicly accessible.

Subject(s)

Computational Biology/methods , Metagenome/genetics , Metagenomics/methods , Whole Genome Sequencing/methods , Bacteria/classification , Bacteria/genetics , Child , Computational Biology/statistics & numerical data , Diarrhea/diagnosis , Diarrhea/genetics , Humans , Internet , Metagenomics/statistics & numerical data , Reproducibility of Results , Web Browser , Whole Genome Sequencing/statistics & numerical data

11.

Exploring regulation in tissues with eQTL networks.

Fagny, Maud; Paulson, Joseph N; Kuijjer, Marieke L; Sonawane, Abhijeet R; Chen, Cho-Yi; Lopes-Ramos, Camila M; Glass, Kimberly; Quackenbush, John; Platig, John.

Proc Natl Acad Sci U S A ; 114(37): E7841-E7850, 2017 09 12.

Article in English | MEDLINE | ID: mdl-28851834

ABSTRACT

Characterizing the collective regulatory impact of genetic variants on complex phenotypes is a major challenge in developing a genotype to phenotype map. Using expression quantitative trait locus (eQTL) analyses, we constructed bipartite networks in which edges represent significant associations between genetic variants and gene expression levels and found that the network structure informs regulatory function. We show, in 13 tissues, that these eQTL networks are organized into dense, highly modular communities grouping genes often involved in coherent biological processes. We find communities representing shared processes across tissues, as well as communities associated with tissue-specific processes that coalesce around variants in tissue-specific active chromatin regions. Node centrality is also highly informative, with the global and community hubs differing in regulatory potential and likelihood of being disease associated.
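The bipartite construction described in this abstract can be sketched in a few lines. The Python example below keeps only variant-gene associations that pass a significance cutoff and stores them as two adjacency maps, one per node type. The function names, the 0.05 cutoff, and the toy identifiers are assumptions made for illustration; the abstract does not give the authors' threshold or their community-detection method, and that step is not shown.

```python
from collections import defaultdict


def build_bipartite_network(associations, q_threshold=0.05):
    """Build a variant-gene bipartite network from eQTL test results.

    associations -- iterable of (variant_id, gene_id, q_value) tuples
    q_threshold  -- hypothetical significance cutoff for keeping an edge
    Returns (variant_to_genes, gene_to_variants) as dicts of sets.
    """
    variant_to_genes = defaultdict(set)
    gene_to_variants = defaultdict(set)
    for variant, gene, q_value in associations:
        if q_value <= q_threshold:
            # An edge joins a variant to a gene whose expression it is
            # significantly associated with; no variant-variant or
            # gene-gene edges exist in a bipartite graph.
            variant_to_genes[variant].add(gene)
            gene_to_variants[gene].add(variant)
    return dict(variant_to_genes), dict(gene_to_variants)


def degree(adjacency):
    """Number of neighbours per node, a simple centrality measure."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


# Toy associations (made up for the example).
toy_associations = [
    ("rs1", "GENE_A", 0.01),
    ("rs1", "GENE_B", 0.20),   # not significant, so no edge
    ("rs2", "GENE_A", 0.03),
]
variants, genes = build_bipartite_network(toy_associations)
gene_degrees = degree(genes)
```

Degree is only the simplest notion of a hub; the global and community hubs mentioned in the abstract would additionally require a partition of the network into communities.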

Subject(s)

Genome-Wide Association Study/methods , Organ Specificity/genetics , Quantitative Trait Loci/genetics , Gene Expression/genetics , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/physiology , Transcriptome/genetics

12.

Smooth quantile normalization.

Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N; Quackenbush, John; Irizarry, Rafael A; Bravo, Héctor Corrada.

Biostatistics ; 19(2): 185-198, 2018 04 01.

Article in English | MEDLINE | ID: mdl-29036413

ABSTRACT

Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and is unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume global differences in the distribution are induced by only technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example, if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here, we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but allowing that they may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff and root mean squared error of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
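To make the contrast in this abstract concrete, the Python sketch below implements ordinary quantile normalization and a simplified variant that applies it separately within each biological group. The second function is not the qsmooth algorithm and should be read only as an illustration of the "same distribution within groups, possibly different between groups" assumption. Function names and toy data are assumptions made for this example.

```python
from statistics import mean


def quantile_normalize(samples):
    """Give every sample the same distribution (standard quantile normalization).

    samples -- list of samples, each a list of feature values of equal length
    """
    n_features = len(samples[0])
    if any(len(s) != n_features for s in samples):
        raise ValueError("all samples must have the same number of features")

    # Reference distribution: the mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [mean(column) for column in zip(*sorted_samples)]

    normalized = []
    for s in samples:
        # Rank the features within this sample (ties broken by position).
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, feature_index in enumerate(order):
            out[feature_index] = reference[rank]
        normalized.append(out)
    return normalized


def groupwise_quantile_normalize(samples, groups):
    """Quantile-normalize samples separately within each group label."""
    result = [None] * len(samples)
    for label in set(groups):
        members = [i for i, g in enumerate(groups) if g == label]
        group_norm = quantile_normalize([samples[i] for i in members])
        for i, values in zip(members, group_norm):
            result[i] = values
    return result


# Toy data (made up): two groups with very different global distributions.
toy_samples = [[1, 2], [3, 4], [10, 20], [30, 40]]
toy_groups = ["a", "a", "b", "b"]
pooled = quantile_normalize(toy_samples)
by_group = groupwise_quantile_normalize(toy_samples, toy_groups)
```

On the toy data, pooled normalization forces all four samples onto one distribution and erases the difference between groups, whereas the group-wise version preserves it.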

Subject(s)

Biostatistics/methods , Data Interpretation, Statistical , Genomics/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Models, Statistical , Humans

13.

Estimating gene regulatory networks with pandaR.

Schlauch, Daniel; Paulson, Joseph N; Young, Albert; Glass, Kimberly; Quackenbush, John.

Bioinformatics ; 33(14): 2232-2234, 2017 Jul 15.

Article in English | MEDLINE | ID: mdl-28334344

ABSTRACT

CONTACT: johnq@jimmy.harvard.edu or dschlauch@fas.harvard.edu. AVAILABILITY AND IMPLEMENTATION: PandaR is provided as a Bioconductor R Package and is available at bioconductor.org/packages/pandaR.

Subject(s)

Computational Biology/methods , Gene Regulatory Networks , Software , Humans , Models, Biological , Protein Interaction Maps , Transcriptome

14.

Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data.

Paulson, Joseph N; Chen, Cho-Yi; Lopes-Ramos, Camila M; Kuijjer, Marieke L; Platig, John; Sonawane, Abhijeet R; Fagny, Maud; Glass, Kimberly; Quackenbush, John.

BMC Bioinformatics ; 18(1): 437, 2017 Oct 03.

Article in English | MEDLINE | ID: mdl-28974199

ABSTRACT

BACKGROUND: Although ultrahigh-throughput RNA-Sequencing has become the dominant technology for genome-wide transcriptional profiling, the vast majority of RNA-Seq studies typically profile only tens of samples, and most analytical pipelines are optimized for these smaller studies. However, projects are generating ever-larger data sets comprising RNA-Seq data from hundreds or thousands of samples, often collected at multiple centers and from diverse tissues. These complex data sets present significant analytical challenges due to batch and tissue effects, but provide the opportunity to revisit the assumptions and methods that we use to preprocess, normalize, and filter RNA-Seq data, which are critical first steps for any subsequent analysis. RESULTS: We find that analysis of large RNA-Seq data sets requires both careful quality control and the need to account for sparsity due to the heterogeneity intrinsic in multi-group studies. We developed the Yet Another RNA Normalization software pipeline (YARN), which includes quality control and preprocessing, gene filtering, and normalization steps designed to facilitate downstream analysis of large, heterogeneous RNA-Seq data sets, and we demonstrate its use with data from the Genotype-Tissue Expression (GTEx) project. CONCLUSIONS: An R package instantiating YARN is available at http://bioconductor.org/packages/yarn.

Subject(s)

Databases, Genetic , Organ Specificity/genetics , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards , Gene Expression Profiling , Gene Expression Regulation , Humans , Molecular Sequence Annotation , Principal Component Analysis , Quality Control , Reference Standards , Sample Size , Software

15.

Regulatory network changes between cell lines and their tissues of origin.

Lopes-Ramos, Camila M; Paulson, Joseph N; Chen, Cho-Yi; Kuijjer, Marieke L; Fagny, Maud; Platig, John; Sonawane, Abhijeet R; DeMeo, Dawn L; Quackenbush, John; Glass, Kimberly.

BMC Genomics ; 18(1): 723, 2017 Sep 12.

Article in English | MEDLINE | ID: mdl-28899340

ABSTRACT

BACKGROUND: Cell lines are an indispensable tool in biomedical research and often used as surrogates for tissues. Although there are recognized important cellular and transcriptomic differences between cell lines and tissues, a systematic overview of the differences between the regulatory processes of a cell line and those of its tissue of origin has not been conducted. The RNA-Seq data generated by the GTEx project is the first available data resource in which it is possible to perform a large-scale transcriptional and regulatory network analysis comparing cell lines with their tissues of origin. RESULTS: We compared 127 paired Epstein-Barr virus transformed lymphoblastoid cell lines (LCLs) and whole blood samples, and 244 paired primary fibroblast cell lines and skin samples. While gene expression analysis confirms that these cell lines carry the expression signatures of their primary tissues, albeit at reduced levels, network analysis indicates that expression changes are the cumulative result of many previously unreported alterations in transcription factor (TF) regulation. More specifically, cell cycle genes are over-expressed in cell lines compared to primary tissues, and this alteration in expression is a result of less repressive TF targeting. We confirmed these regulatory changes for four TFs, including SMAD5, using independent ChIP-seq data from ENCODE. CONCLUSIONS: Our results provide novel insights into the regulatory mechanisms controlling the expression differences between cell lines and tissues. The strong changes in TF regulation that we observe suggest that network changes, in addition to transcriptional levels, should be considered when using cell lines as models for tissues.

Subject(s)

Gene Expression Profiling , Gene Regulatory Networks , Cell Cycle/genetics , Cell Line , Humans , Organ Specificity

16.

Privacy-preserving microbiome analysis using secure computation.

Wagner, Justin; Paulson, Joseph N; Wang, Xiao; Bhattacharjee, Bobby; Corrada Bravo, Héctor.

Bioinformatics ; 32(12): 1873-9, 2016 06 15.

Article in English | MEDLINE | ID: mdl-26873931

ABSTRACT

MOTIVATION: Developing targeted therapeutics and identifying biomarkers relies on large amounts of research participant data. Beyond human DNA, scientists now investigate the DNA of micro-organisms inhabiting the human body. Recent work shows that an individual's collection of microbial DNA consistently identifies that person and could be used to link a real-world identity to a sensitive attribute in a research dataset. Unfortunately, the current suite of DNA-specific privacy-preserving analysis tools does not meet the requirements for microbiome sequencing studies. RESULTS: To address privacy concerns around microbiome sequencing, we implement metagenomic analyses using secure computation. Our implementation allows comparative analysis over combined data without revealing the feature counts for any individual sample. We focus on three analyses and perform an evaluation on datasets currently used by the microbiome research community. We use our implementation to simulate sharing data between four policy domains. Additionally, we describe an application of our implementation for patients to combine data that allows drug developers to query against and compensate patients for the analysis. AVAILABILITY AND IMPLEMENTATION: The software is freely available for download at: http://cbcb.umd.edu/~hcorrada/projects/secureseq.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: hcorrada@umiacs.umd.edu.

Subject(s)

Microbiota , DNA , Humans , Metagenomics , Privacy , Software

17.

Individual-specific changes in the human gut microbiota after challenge with enterotoxigenic Escherichia coli and subsequent ciprofloxacin treatment.

Pop, Mihai; Paulson, Joseph N; Chakraborty, Subhra; Astrovskaya, Irina; Lindsay, Brianna R; Li, Shan; Bravo, Héctor Corrada; Harro, Clayton; Parkhill, Julian; Walker, Alan W; Walker, Richard I; Sack, David A; Stine, O Colin.

BMC Genomics ; 17: 440, 2016 06 08.

Article in English | MEDLINE | ID: mdl-27277524

ABSTRACT

BACKGROUND: Enterotoxigenic Escherichia coli (ETEC) is a major cause of diarrhea in inhabitants of low-income countries and in visitors to these countries. The impact of the human intestinal microbiota on the initiation and progression of ETEC diarrhea is not yet well understood. RESULTS: We used 16S rRNA (ribosomal RNA) gene sequencing to study changes in the fecal microbiota of 12 volunteers during a human challenge study with ETEC (H10407) and subsequent treatment with ciprofloxacin. Five subjects developed severe diarrhea and seven experienced few or no symptoms. Diarrheal symptoms were associated with high concentrations of fecal E. coli as measured by quantitative culture, quantitative PCR, and normalized number of 16S rRNA gene sequences. Large changes in other members of the microbiota varied greatly from individual to individual, whether or not diarrhea occurred. Nonetheless, the variation within an individual was small compared to variation between individuals. Ciprofloxacin treatment reorganized microbiota populations; however, the original structure was largely restored at one- and three-month follow-up visits. CONCLUSION: Symptomatic ETEC infections, but not asymptomatic infections, were associated with high fecal concentrations of E. coli. Both infection and ciprofloxacin treatment caused variable changes in other bacteria that generally reverted to baseline levels after three months.

Subject(s)

Ciprofloxacin/therapeutic use , Enterotoxigenic Escherichia coli/drug effects , Enterotoxigenic Escherichia coli/physiology , Escherichia coli Infections/drug therapy , Escherichia coli Infections/microbiology , Gastrointestinal Microbiome/drug effects , Adult , Ciprofloxacin/pharmacology , Diarrhea/drug therapy , Diarrhea/microbiology , Feces/microbiology , Female , High-Throughput Nucleotide Sequencing , Humans , Male , Metagenome , Metagenomics/methods , Middle Aged , 16S Ribosomal RNA , ROC Curve , Treatment Outcome , Young Adult

18.

Differential abundance analysis for microbial marker-gene surveys.

Paulson, Joseph N; Stine, O Colin; Bravo, Héctor Corrada; Pop, Mihai.

Nat Methods ; 10(12): 1200-2, 2013 Dec.

Article in English | MEDLINE | ID: mdl-24076764

ABSTRACT

We introduce a methodology to assess differential abundance in sparse high-throughput microbial marker-gene survey data. Our approach, implemented in the metagenomeSeq Bioconductor package, relies on a novel normalization technique and a statistical model that accounts for undersampling, a common feature of large-scale marker-gene studies. Using simulated data and several published microbiota data sets, we show that metagenomeSeq outperforms the tools currently used in this field.

Subject(s)

Genetic Markers , Metagenomics/methods , Microbiota , 16S Ribosomal RNA/genetics , Algorithms , Animals , Area Under Curve , Cluster Analysis , Computer Simulation , Genetic Databases , Gene Expression Profiling/methods , Genetic Variation , Humans , Intestines/microbiology , Mice , Genetic Models , Statistical Models , Normal Distribution , Phenotype , DNA Sequence Analysis , Software

19.

Microbiota that affect risk for shigellosis in children in low-income countries.

Lindsay, Brianna; Oundo, Joe; Hossain, M Anowar; Antonio, Martin; Tamboura, Boubou; Walker, Alan W; Paulson, Joseph N; Parkhill, Julian; Omore, Richard; Faruque, Abu S G; Das, Suman Kumar; Ikumapayi, Usman N; Adeyemi, Mitchell; Sanogo, Doh; Saha, Debasish; Sow, Samba; Farag, Tamer H; Nasrin, Dilruba; Li, Shan; Panchalingam, Sandra; Levine, Myron M; Kotloff, Karen; Magder, Laurence S; Hungerford, Laura; Sommerfelt, Halvor; Pop, Mihai; Nataro, James P; Stine, O Colin.

Emerg Infect Dis ; 21(2): 242-50, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25625766

ABSTRACT

Pathogens in the gastrointestinal tract exist within a vast population of microbes. We examined associations between pathogens and composition of gut microbiota as they relate to Shigella spp./enteroinvasive Escherichia coli infection. We analyzed 3,035 stool specimens (1,735 nondiarrheal and 1,300 moderate-to-severe diarrheal) from the Global Enteric Multicenter Study for 9 enteropathogens. Diarrheal specimens had a higher number of enteropathogens (diarrheal mean 1.4, nondiarrheal mean 0.95; p<0.0001). Rotavirus showed a negative association with Shigella spp. in cases of diarrhea (odds ratio 0.31, 95% CI 0.17-0.55) and had a large combined effect on moderate-to-severe diarrhea (odds ratio 29, 95% CI 3.8-220). In 4 Lactobacillus taxa identified by 16S rRNA gene sequencing, the association between pathogen and disease was decreased, which is consistent with the possibility that Lactobacillus spp. are protective against Shigella spp.-induced diarrhea. Bacterial diversity of gut microbiota was associated with diarrhea status, not high levels of the Shigella spp. ipaH gene.

Subject(s)

Bacillary Dysentery/epidemiology , Bacillary Dysentery/microbiology , Microbiota , Shigella/genetics , Age Factors , Biodiversity , Case-Control Studies , Preschool Child , Developing Countries , Diarrhea/diagnosis , Diarrhea/epidemiology , Diarrhea/microbiology , Bacillary Dysentery/diagnosis , Feces/microbiology , Feces/virology , Gastrointestinal Tract/microbiology , Gastrointestinal Tract/virology , Bacterial Genes , Humans , Infant , Newborn Infant , Metagenome , Odds Ratio , 16S Ribosomal RNA/genetics , Risk , Severity of Illness Index , Shigella/classification

20.

bacLIFE: a user-friendly computational workflow for genome analysis and prediction of lifestyle-associated genes in bacteria.

Guerrero-Egido, Guillermo; Pintado, Adrian; Bretscher, Kevin M; Arias-Giraldo, Luisa-Maria; Paulson, Joseph N; Spaink, Herman P; Claessen, Dennis; Ramos, Cayo; Cazorla, Francisco M; Medema, Marnix H; Raaijmakers, Jos M; Carrión, Víctor J.

Nat Commun ; 15(1): 2072, 2024 Mar 07.

Article in English | MEDLINE | ID: mdl-38453959

ABSTRACT

Bacteria have an extensive adaptive ability to live in close association with eukaryotic hosts, exhibiting detrimental, neutral or beneficial effects on host growth and health. However, the genes involved in niche adaptation are mostly unknown and their functions poorly characterized. Here, we present bacLIFE (https://github.com/Carrion-lab/bacLIFE), a streamlined computational workflow for genome annotation, large-scale comparative genomics, and prediction of lifestyle-associated genes (LAGs). As a proof of concept, we analyzed 16,846 genomes from the Burkholderia/Paraburkholderia and Pseudomonas genera, which led to the identification of hundreds of genes potentially associated with a plant pathogenic lifestyle. Site-directed mutagenesis of 14 of these predicted LAGs of unknown function, followed by plant bioassays, showed that 6 predicted LAGs are indeed involved in the phytopathogenic lifestyle of Burkholderia plantarii and Pseudomonas syringae pv. phaseolicola. These 6 LAGs encompassed a glycosyltransferase, extracellular binding proteins, homoserine dehydrogenases and hypothetical proteins. Collectively, our results highlight bacLIFE as an effective computational tool for prediction of LAGs and the generation of hypotheses for a better understanding of bacteria-host interactions.

Subject(s)

Bacterial Genome , Pseudomonas syringae , Bacterial Genome/genetics , Pseudomonas syringae/genetics , Workflow , Genomics/methods
