ABSTRACT
The growth of omic data presents evolving challenges in data manipulation, analysis and integration. Addressing these challenges, Bioconductor provides an extensive community-driven biological data analysis platform. Meanwhile, tidy R programming offers a revolutionary data organization and manipulation standard. Here we present the tidyomics software ecosystem, bridging Bioconductor to the tidy R paradigm. This ecosystem aims to streamline omic analysis, ease learning and encourage cross-disciplinary collaborations. We demonstrate the effectiveness of tidyomics by analyzing 7.5 million peripheral blood mononuclear cells from the Human Cell Atlas, spanning six data frameworks and ten analysis tools.
Subject(s)
Software; Humans; Computational Biology/methods; Leukocytes, Mononuclear/metabolism; Leukocytes, Mononuclear/cytology; Genomics/methods; Data Analysis
ABSTRACT
Molecular quantitative trait loci (QTLs) allow us to understand the biology captured in genome-wide association studies (GWASs). The placenta regulates fetal development and shows sex differences in DNA methylation. We therefore hypothesized that placental methylation QTL (mQTL) explain variation in genetic risk for childhood onset traits, and that effects differ by sex. We analyzed 411 term placentas from two studies and found 49,252 methylation (CpG) sites with mQTL and 2,489 CpG sites with sex-dependent mQTL. All mQTL were enriched in regions that typically affect gene expression in prenatal tissues. All mQTL were also enriched in GWAS results for growth- and immune-related traits, but male- and female-specific mQTL were more enriched than cross-sex mQTL. mQTL colocalized with trait loci at 777 CpG sites, with 216 (28%) specific to males or females. Overall, mQTL specific to male and female placenta capture otherwise overlooked variation in childhood traits.
ABSTRACT
The growth of omic data presents evolving challenges in data manipulation, analysis, and integration. Addressing these challenges, Bioconductor [1] provides an extensive community-driven biological data analysis platform. Meanwhile, tidy R programming [2] offers a revolutionary standard for data organisation and manipulation. Here, we present the tidyomics software ecosystem, bridging Bioconductor to the tidy R paradigm. This ecosystem aims to streamline omic analysis, ease learning, and encourage cross-disciplinary collaborations. We demonstrate the effectiveness of tidyomics by analysing 7.5 million peripheral blood mononuclear cells from the Human Cell Atlas [3], spanning six data frameworks and ten analysis tools.