1.
Front Endocrinol (Lausanne) ; 14: 1237727, 2023.
Article En | MEDLINE | ID: mdl-37810879

The gut microbiome affects the inflammatory environment through effects on T-cells, which influence the production of immune mediators and inflammatory cytokines that stimulate osteoclastogenesis and bone loss in mice. However, there are few large human studies of the gut microbiome and skeletal health. We investigated the association between the human gut microbiome and high resolution peripheral quantitative computed tomography (HR-pQCT) scans of the radius and tibia in two large cohorts: the Framingham Heart Study (FHS [n=1227, age range: 32 - 89]) and the Osteoporosis in Men Study (MrOS [n=836, age range: 78 - 98]). Stool samples from study participants underwent amplification and sequencing of the V4 hypervariable region of the 16S rRNA gene. The resulting 16S rRNA sequencing data were processed separately for each cohort, with the DADA2 pipeline incorporated in the 16S bioBakery workflow. Resulting amplicon sequence variants were assigned taxonomies using the SILVA reference database. Controlling for multiple covariates, we tested for associations between microbial taxa abundances and HR-pQCT measures using general linear models as implemented in Microbiome Multivariable Association with Linear Models (MaAsLin 2). Abundances of 37 microbial genera in FHS and 4 genera in MrOS were associated with various skeletal measures (false discovery rate [FDR] ≤ 0.1), including the association of DTU089 with bone measures, which was independently replicated in the two cohorts. A meta-analysis of the taxa-bone associations further revealed (FDR ≤ 0.25) that greater abundances of the genera Akkermansia and DTU089 were associated with lower radius total vBMD and tibia cortical vBMD, respectively. Conversely, higher abundances of the genera Lachnospiraceae NK4A136 group and Faecalibacterium were associated with greater tibia cortical vBMD.
We also investigated functional capabilities of microbial taxa by testing for associations between predicted (based on 16S rRNA amplicon sequence data) metabolic pathway abundances and bone phenotypes in each cohort. While no concordant functional associations were observed in both cohorts, a meta-analysis revealed 8 pathways, including the super-pathway of histidine, purine, and pyrimidine biosynthesis, associated with bone measures of the tibia cortical compartment. In conclusion, our findings suggest that there is a link between the gut microbiome and skeletal metabolism.


Bone Density , Gastrointestinal Microbiome , Adult , Aged , Aged, 80 and over , Humans , Male , Middle Aged , Bone and Bones , Bone Density/genetics , Cohort Studies , Gastrointestinal Microbiome/genetics , RNA, Ribosomal, 16S/genetics
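The abstract above pools taxa-bone associations from two cohorts in a meta-analysis but does not say which pooling method was used. As an illustration only, the sketch below assumes a fixed-effect inverse-variance model, one common choice for combining per-cohort regression coefficients. The function name and the effect sizes are hypothetical and are not taken from the study.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool per-cohort effect estimates with inverse-variance weights.

    Assumption: a fixed-effect model; the study's actual method is not
    stated in the abstract.
    """
    # Each cohort is weighted by the inverse of its squared standard error.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    # Two-sided p-value under a standard normal reference distribution.
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, p_value


# Hypothetical coefficients for one genus in two cohorts (not study data).
pooled, pooled_se, p_value = fixed_effect_meta([-0.2, -0.1], [0.1, 0.1])
print(pooled, pooled_se, p_value)
```

With equal standard errors the pooled estimate is the plain average of the two coefficients, and its standard error shrinks by a factor of the square root of two.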
2.
Sci Transl Med ; 15(706): eabn4722, 2023 07 26.
Article En | MEDLINE | ID: mdl-37494472

Musculoskeletal diseases affect up to 20% of adults worldwide. The gut microbiome has been implicated in inflammatory conditions, but large-scale metagenomic evaluations have not yet traced the routes by which immunity in the gut affects inflammatory arthritis. To characterize the community structure and associated functional processes driving gut microbial involvement in arthritis, the Inflammatory Arthritis Microbiome Consortium investigated 440 stool shotgun metagenomes comprising 221 adults diagnosed with rheumatoid arthritis, ankylosing spondylitis, or psoriatic arthritis and 219 healthy controls and individuals with joint pain without an underlying inflammatory cause. Diagnosis explained about 2% of gut taxonomic variability, which is comparable in magnitude to inflammatory bowel disease. We identified several candidate microbes with differential carriage patterns in patients with elevated blood markers for inflammation. Our results confirm and extend previous findings of increased carriage of typically oral and inflammatory taxa and decreased abundance and prevalence of typical gut clades, indicating that distal inflammatory conditions, as well as local conditions, correspond to alterations to the gut microbial composition. We identified several differentially encoded pathways in the gut microbiome of patients with inflammatory arthritis, including changes in vitamin B salvage and biosynthesis and enrichment of iron sequestration. Although several of these changes characteristic of inflammation could have causal roles, we hypothesize that they are mainly positive feedback responses to changes in host physiology and immune homeostasis. By connecting taxonomic alterations to functional alterations, this work expands our understanding of the shifts in the gut ecosystem that occur in response to systemic inflammation during arthritis.


Arthritis, Rheumatoid , Gastrointestinal Microbiome , Microbiota , Humans , Gastrointestinal Microbiome/genetics , Inflammation , Phenotype , Metabolic Networks and Pathways
3.
Nat Biotechnol ; 41(11): 1633-1644, 2023 Nov.
Article En | MEDLINE | ID: mdl-36823356

Metagenomic assembly enables new organism discovery from microbial communities, but it can only capture few abundant organisms from most metagenomes. Here we present MetaPhlAn 4, which integrates information from metagenome assemblies and microbial isolate genomes for more comprehensive metagenomic taxonomic profiling. From a curated collection of 1.01 M prokaryotic reference and metagenome-assembled genomes, we define unique marker genes for 26,970 species-level genome bins, 4,992 of them taxonomically unidentified at the species level. MetaPhlAn 4 explains ~20% more reads in most international human gut microbiomes and >40% in less-characterized environments such as the rumen microbiome and proves more accurate than available alternatives on synthetic evaluations while also reliably quantifying organisms with no cultured isolates. Application of the method to >24,500 metagenomes highlights previously undetected species to be strong biomarkers for host conditions and lifestyles in human and mouse microbiomes and shows that even previously uncharacterized species can be genetically profiled at the resolution of single microbial strains.


Gastrointestinal Microbiome , Microbiota , Humans , Animals , Mice , Metagenome/genetics , Microbiota/genetics , Metagenomics/methods , Phylogeny
4.
Bioinformatics ; 38(Suppl 1): i378-i385, 2022 06 24.
Article En | MEDLINE | ID: mdl-35758795

MOTIVATION: Modern biological screens yield enormous numbers of measurements, and identifying and interpreting statistically significant associations among features are essential. In experiments featuring multiple high-dimensional datasets collected from the same set of samples, it is useful to identify groups of associated features between the datasets in a way that provides high statistical power and false discovery rate (FDR) control. RESULTS: Here, we present a novel hierarchical framework, HAllA (Hierarchical All-against-All association testing), for structured association discovery between paired high-dimensional datasets. HAllA efficiently integrates hierarchical hypothesis testing with FDR correction to reveal significant linear and non-linear block-wise relationships among continuous and/or categorical data. We optimized and evaluated HAllA using heterogeneous synthetic datasets of known association structure, where HAllA outperformed all-against-all and other block-testing approaches across a range of common similarity measures. We then applied HAllA to a series of real-world multiomics datasets, revealing new associations between gene expression and host immune activity, the microbiome and host transcriptome, metabolomic profiling and human health phenotypes. AVAILABILITY AND IMPLEMENTATION: An open-source implementation of HAllA is freely available at http://huttenhower.sph.harvard.edu/halla along with documentation, demo datasets and a user group. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Microbiota , Transcriptome
5.
Nature ; 606(7915): 754-760, 2022 06.
Article En | MEDLINE | ID: mdl-35614211

Microbial communities and their associated bioactive compounds [1-3] are often disrupted in conditions such as the inflammatory bowel diseases (IBD) [4]. However, even in well-characterized environments (for example, the human gastrointestinal tract), more than one-third of microbial proteins are uncharacterized and often expected to be bioactive [5-7]. Here we systematically identified more than 340,000 protein families as potentially bioactive with respect to gut inflammation during IBD, about half of which have not to our knowledge been functionally characterized previously on the basis of homology or experiment. To validate prioritized microbial proteins, we used a combination of metagenomics, metatranscriptomics and metaproteomics to provide evidence of bioactivity for a subset of proteins that are involved in host and microbial cell-cell communication in the microbiome; for example, proteins associated with adherence or invasion processes, and extracellular von Willebrand-like factors. Predictions from high-throughput data were validated using targeted experiments that revealed the differential immunogenicity of prioritized Enterobacteriaceae pilins and the contribution of homologues of von Willebrand factors to the formation of Bacteroides biofilms in a manner dependent on mucin levels. This methodology, which we term MetaWIBELE (workflow to identify novel bioactive elements in the microbiome), is generalizable to other environmental communities and human phenotypes. The prioritized results provide thousands of candidate microbial proteins that are likely to interact with the host immune system in IBD, thus expanding our understanding of potentially bioactive gene products in chronic disease states and offering a rational compendium of possible therapeutic compounds and targets.


Bacterial Proteins , Gastrointestinal Microbiome , Genes, Microbial , Inflammatory Bowel Diseases , Bacterial Proteins/analysis , Bacterial Proteins/genetics , Chronic Disease , Gastrointestinal Microbiome/genetics , Humans , Inflammatory Bowel Diseases/microbiology , Metagenomics , Proteomics , Reproducibility of Results , Transcriptome
6.
PLoS Comput Biol ; 17(11): e1009442, 2021 11.
Article En | MEDLINE | ID: mdl-34784344

It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.


Computational Biology , Gastrointestinal Microbiome , Multivariate Analysis , Computer Simulation , Humans , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/metabolism , Inflammatory Bowel Diseases/pathology
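The abstract above describes per-feature association testing with false discovery control but does not name the correction procedure. As a minimal sketch, the code below assumes the Benjamini-Hochberg procedure, a widely used FDR method, applied to a list of per-feature p-values. The function name, the p-values, and the 0.25 threshold are illustrative assumptions and not results from the paper.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values, one per microbial feature (not data from the paper).
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
qvals = benjamini_hochberg(pvals)
# Flag features whose q-value passes an assumed threshold of 0.25.
significant = [q <= 0.25 for q in qvals]
print(qvals, significant)
```

In a full analysis each p-value would come from a model fitted to one feature against the metadata of interest; only the correction step is shown here.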
7.
Elife ; 10, 2021 05 04.
Article En | MEDLINE | ID: mdl-33944776

Culture-independent analyses of microbial communities have progressed dramatically in the last decade, particularly due to advances in methods for biological profiling via shotgun metagenomics. Opportunities for improvement continue to accelerate, with greater access to multi-omics, microbial reference genomes, and strain-level diversity. To leverage these, we present bioBakery 3, a set of integrated, improved methods for taxonomic, strain-level, functional, and phylogenetic profiling of metagenomes newly developed to build on the largest set of reference sequences now available. Compared to current alternatives, MetaPhlAn 3 increases the accuracy of taxonomic profiling, and HUMAnN 3 improves that of functional potential and activity. These methods detected novel disease-microbiome links in applications to CRC (1262 metagenomes) and IBD (1635 metagenomes and 817 metatranscriptomes). Strain-level profiling of an additional 4077 metagenomes with StrainPhlAn 3 and PanPhlAn 3 unraveled the phylogenetic and functional structure of the common gut microbe Ruminococcus bromii, previously described by only 15 isolate genomes. With open-source implementations and cloud-deployable reproducible workflows, the bioBakery 3 platform can help researchers deepen the resolution, scale, and accuracy of multi-omic profiling for microbial community studies.


Bacteria/classification , Bacteria/genetics , Computational Biology/methods , Metagenome , Microbiota/genetics , Microbiota/physiology , Phylogeny , Bacteria/metabolism , Humans , Metagenomics/methods , Research Personnel , Ruminococcus/classification , Ruminococcus/genetics , Workflow
9.
Nat Protoc ; 16(6): 2724-2731, 2021 06.
Article En | MEDLINE | ID: mdl-33883746

A lack of prospective studies has been a major barrier for assessing the role of the microbiome in human health and disease on a population-wide scale. To address this significant knowledge gap, we have launched a large-scale collection targeting fecal and oral microbiome specimens from 20,000 women within the Nurses' Health Study II cohort (the Microbiome Among Nurses study, or Micro-N). Leveraging the rich epidemiologic data that have been repeatedly collected from this cohort since 1989; the established biorepository of archived blood, urine, buccal cell, and tumor tissue specimens; the available genetic and biomarker data; the cohort's ongoing follow-up; and the BIOM-Mass microbiome research platform, Micro-N furnishes unparalleled resources for future prospective studies to interrogate the interplay between host, environmental factors, and the microbiome in human health. These prospectively collected materials will provide much-needed evidence to infer causality in microbiome-associated outcomes, paving the way toward development of microbiota-targeted modulators, preventives, diagnostics and therapeutics. Here, we describe a generalizable, scalable and cost-effective platform used for stool and oral microbiome specimen and metadata collection in the Micro-N study as an example of how prospective studies of the microbiome may be carried out.


Gastrointestinal Microbiome , Specimen Handling/methods , Adult , Aged , Female , Humans , Middle Aged , Nurses , Prospective Studies , Specimen Handling/instrumentation , Surveys and Questionnaires
11.
Nat Microbiol ; 4(2): 293-305, 2019 02.
Article En | MEDLINE | ID: mdl-30531976

The inflammatory bowel diseases (IBDs), which include Crohn's disease (CD) and ulcerative colitis (UC), are multifactorial chronic conditions of the gastrointestinal tract. While IBD has been associated with dramatic changes in the gut microbiota, changes in the gut metabolome (the molecular interface between host and microbiota) are less well understood. To address this gap, we performed untargeted metabolomic and shotgun metagenomic profiling of cross-sectional stool samples from discovery (n = 155) and validation (n = 65) cohorts of CD, UC and non-IBD control patients. Metabolomic and metagenomic profiles were broadly correlated with faecal calprotectin levels (a measure of gut inflammation). Across >8,000 measured metabolite features, we identified chemicals and chemical classes that were differentially abundant in IBD, including enrichments for sphingolipids and bile acids, and depletions for triacylglycerols and tetrapyrroles. While >50% of differentially abundant metabolite features were uncharacterized, many could be assigned putative roles through metabolomic 'guilt by association' (covariation with known metabolites). Differentially abundant species and functions from the metagenomic profiles reflected adaptation to oxidative stress in the IBD gut, and were individually consistent with previous findings. Integrating these data, however, we identified 122 robust associations between differentially abundant species and well-characterized differentially abundant metabolites, indicating possible mechanistic relationships that are perturbed in IBD. Finally, we found that metabolome- and metagenome-based classifiers of IBD status were highly accurate and, like the vast majority of individual trends, generalized well to the independent validation cohort. Our findings thus provide an improved understanding of perturbations of the microbiome-metabolome interface in IBD, including identification of many potential diagnostic and therapeutic targets.


Gastrointestinal Microbiome , Inflammatory Bowel Diseases/metabolism , Inflammatory Bowel Diseases/microbiology , Biodiversity , Biomarkers/metabolism , Colitis, Ulcerative/immunology , Colitis, Ulcerative/metabolism , Colitis, Ulcerative/microbiology , Crohn Disease/immunology , Crohn Disease/metabolism , Crohn Disease/microbiology , Feces/chemistry , Feces/microbiology , Gastrointestinal Microbiome/genetics , Gastrointestinal Microbiome/immunology , Humans , Inflammation/metabolism , Inflammation/microbiology , Inflammatory Bowel Diseases/immunology , Leukocyte L1 Antigen Complex/analysis , Metabolome , Metagenome
12.
Nat Methods ; 15(11): 962-968, 2018 11.
Article En | MEDLINE | ID: mdl-30377376

Functional profiles of microbial communities are typically generated using comprehensive metagenomic or metatranscriptomic sequence read searches, which are time-consuming, prone to spurious mapping, and often limited to community-level quantification. We developed HUMAnN2, a tiered search strategy that enables fast, accurate, and species-resolved functional profiling of host-associated and environmental communities. HUMAnN2 identifies a community's known species, aligns reads to their pangenomes, performs translated search on unclassified reads, and finally quantifies gene families and pathways. Relative to pure translated search, HUMAnN2 is faster and produces more accurate gene family profiles. We applied HUMAnN2 to study clinal variation in marine metabolism, ecological contribution patterns among human microbiome pathways, variation in species' genomic versus transcriptional contributions, and strain profiling. Further, we introduce 'contributional diversity' to explain patterns of ecological assembly across different microbial community types.


Bacteria/classification , Bacteria/genetics , Bacterial Proteins/genetics , Gene Expression Profiling , Metagenome , Software , Transcriptome , Bacteria/isolation & purification , Bacterial Proteins/metabolism , High-Throughput Nucleotide Sequencing , Humans , Microbiota , Species Specificity
13.
Nat Microbiol ; 3(3): 337-346, 2018 03.
Article En | MEDLINE | ID: mdl-29311644

Inflammatory bowel disease (IBD) is a group of chronic diseases of the digestive tract that affects millions of people worldwide. Genetic, environmental and microbial factors have been implicated in the onset and exacerbation of IBD. However, the mechanisms associating gut microbial dysbioses and aberrant immune responses remain largely unknown. The integrative Human Microbiome Project seeks to close these gaps by examining the dynamics of microbiome functionality in disease by profiling the gut microbiomes of >100 individuals sampled over a 1-year period. Here, we present the first results based on 78 paired faecal metagenomes and metatranscriptomes, and 222 additional metagenomes from 59 patients with Crohn's disease, 34 with ulcerative colitis and 24 non-IBD control patients. We demonstrate several cases in which measures of microbial gene expression in the inflamed gut can be informative relative to metagenomic profiles of functional potential. First, although many microbial organisms exhibited concordant DNA and RNA abundances, we also detected species-specific biases in transcriptional activity, revealing predominant transcription of pathways by individual microorganisms per host (for example, by Faecalibacterium prausnitzii). Thus, a loss of these organisms in disease may have more far-reaching consequences than suggested by their genomic abundances. Furthermore, we identified organisms that were metagenomically abundant but inactive or dormant in the gut with little or no expression (for example, Dialister invisus). Last, certain disease-specific microbial characteristics were more pronounced or only detectable at the transcript level, such as pathways that were predominantly expressed by different organisms in patients with IBD (for example, Bacteroides vulgatus and Alistipes putredinis). 
This provides potential insights into gut microbial pathway transcription that can vary over time, inducing phenotypical changes that are complementary to those linked to metagenomic abundances. The study's results highlight the strength of analysing both the activity and the presence of gut microorganisms to provide insight into the role of the microbiome in IBD.


Gastrointestinal Microbiome/genetics , Inflammatory Bowel Diseases/microbiology , Metagenomics , Transcription, Genetic , Adolescent , Adult , Child , Colitis, Ulcerative/microbiology , Crohn Disease/microbiology , Dysbiosis , Feces/microbiology , Female , Gene Expression Profiling , Humans , Longitudinal Studies , Male , Phenotype , Young Adult
14.
Bioinformatics ; 34(7): 1235-1237, 2018 04 01.
Article En | MEDLINE | ID: mdl-29194469

Summary: bioBakery is a meta'omic analysis environment and collection of individual software tools with the capacity to process raw shotgun sequencing data into actionable microbial community feature profiles, summary reports, and publication-ready figures. It includes a collection of pre-configured analysis modules also joined into workflows for reproducibility. Availability and implementation: bioBakery (http://huttenhower.sph.harvard.edu/biobakery) is publicly available for local installation as individual modules and as a virtual machine image. Each individual module has been developed to perform a particular task (e.g. quantitative taxonomic profiling or statistical analysis), and they are provided with source code, tutorials, demonstration data, and validation results; the bioBakery virtual image includes the entire suite of modules and their dependencies pre-installed. Images are available for both Amazon EC2 and Google Compute Engine. All software is open source under the MIT license. bioBakery is actively maintained with a support group at biobakery-users@googlegroups.com and new tools being added upon their release. Contact: chuttenh@hsph.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Metagenomics/methods , Microbiota/genetics , Software , Reproducibility of Results , Workflow
15.
Sci Rep ; 6: 27722, 2016 06 09.
Article En | MEDLINE | ID: mdl-27278669

The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLAST-ing each contig against normal RNA-Seq samples, and (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLAST-ing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and FOSMID clones sequenced as part of the original human genome project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA.


Microsatellite Repeats , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Algorithms , Animals , Cell Line , Contig Mapping , Genome, Human , Genomics , Humans , Pan troglodytes/genetics
16.
Oncotarget ; 6(26): 22038-47, 2015 Sep 08.
Article En | MEDLINE | ID: mdl-26246470

The pluripotent cells of the embryonic ectodermal tissues are known to be a precursor for multiple tumor types. The adaptability of these cells is a trait exploited by cancer. We previously described cancer-associated microsatellite loci (CAML) shared between glioblastoma (GBM) and lower-grade gliomas. Therefore, we hypothesized that these variants, identified from germline DNA, are shared by cancers from tissues originating from ectodermal tissues: neural tube cells (NTC) and crest cells (NCC). Using exome sequencing data from four cancers with origins to NTC and NCC, a 'signature' of loci significant to each cancer (p-value ≤ 0.01) was created and compared with previously identified CAML from breast cancer. The results of this analysis show that variant loci among the cancers with tissue origins from NTC/NCC were closely linked. Signaling pathways linked to genes with non-coding CAML genotypes revealed enriched connections to hereditary, neurological, and developmental disease or disorders. Thus, variants in genes from tissues initiating from NTC/NCC, if recurrently detected, may indicate a common etiology. Additionally, CAML genotypes from non-tumor DNA may predict cancer phenotypes and are common to shared embryonic tissues of origin.


Exome , Glioblastoma/genetics , Glioblastoma/pathology , Microsatellite Repeats , Neoplasms, Germ Cell and Embryonal/genetics , Neoplasms, Germ Cell and Embryonal/pathology , Neural Crest/pathology , Neural Tube/pathology , Case-Control Studies , Female , Gene Frequency , Genotype , Humans , Male , Signal Transduction
17.
Oncotarget ; 6(13): 11407-20, 2015 May 10.
Article En | MEDLINE | ID: mdl-25779658

Ovarian cancer (OV) ranks fifth in cancer deaths among women, yet there remain few informative biomarkers for this disease. Microsatellites are repetitive genomic regions which we hypothesize could be a source of novel biomarkers for OV and have traditionally been under-appreciated relative to Single Nucleotide Polymorphisms (SNPs). In this study, we explore microsatellite variation as a potential novel source of genomic variation associated with OV. Exomes from 305 OV patient germline samples and 54 tumors, sequenced as part of The Cancer Genome Atlas, were analyzed for microsatellite variation and compared to healthy females sequenced as part of the 1,000 Genomes Project. We identified a subset of 60 microsatellite loci with genotypes that varied significantly between the OV and healthy female populations. Using these loci as a signature set, we classified germline genomes as 'at risk' for OV with a sensitivity of 90.1% and a specificity of 87.6%. Cross-analysis with a similar set of breast cancer associated loci identified individuals 'at risk' for both diseases. This study revealed a genotype-based microsatellite signature present in the germlines of individuals diagnosed with OV, and provides the basis for a potential novel risk assessment diagnostic for OV and new personal genomics targets in tumors.


Biomarkers, Tumor/genetics , Genetic Variation , Genetics, Population , Microsatellite Repeats , Ovarian Neoplasms/genetics , Area Under Curve , Case-Control Studies , Computational Biology , Databases, Genetic , Exome , Female , Gene Expression Profiling , Genetic Association Studies , Genetic Predisposition to Disease , Genetics, Population/methods , Humans , Ovarian Neoplasms/pathology , Phenotype , Precision Medicine , Predictive Value of Tests , Prognosis , ROC Curve , Risk Assessment , Risk Factors
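The abstract above reports classifier performance as sensitivity and specificity. As a brief sketch of how those two quantities are defined from a confusion matrix, the code below uses made-up counts; the abstract does not give the underlying counts, so the numbers and the function name here are hypothetical and do not reproduce the reported 90.1% and 87.6%.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-matrix counts."""
    # Sensitivity: fraction of true cases that the classifier flags.
    sensitivity = tp / (tp + fn)
    # Specificity: fraction of true non-cases that the classifier clears.
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(sens, spec)
```

Reporting both values matters for a risk signature because a classifier can reach high sensitivity simply by flagging most samples, which the specificity then exposes.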
18.
Genomics ; 104(6 Pt B): 453-8, 2014 Dec.
Article En | MEDLINE | ID: mdl-25173571

Several studies have demonstrated that unmapped reads in next generation sequencing data could be used to identify infectious agents or structural variants, but there has been no intensive effort to analyze and classify all non-human sequences found in individual large data sets. To identify commonality in non-human sequences by infectious agents and putative contamination events, we analyzed non-human sequences in 150 genomic sequencing data files from the 1000 Genomes Project and observed that 0.13% of reads on average showed similarities to non-human genomes. We compared results among different sample groups divided based on ethnicities, sequencing centers and enrichment methods (whole genome sequencing vs. exome sequencing) and found that sequencing centers had specific signatures of contaminating genomes as 'time stamps'. We also observed many unmapped reads that falsely indicated contamination because of the high similarity of human sequences to sequences in non-human genome assemblies such as mouse and Nicotiana.


DNA Contamination , Genome, Human , DNA, Bacterial/chemistry , DNA, Plant/chemistry , DNA, Viral/chemistry , Humans
19.
Oncotarget ; 5(15): 6003-14, 2014 Aug 15.
Article En | MEDLINE | ID: mdl-25153720

Genomic studies of glioma sub-types have amassed new disease specific mutations, yet these only partially explain how mutations are linked to predisposition or progression. We hypothesized that microsatellite variation could expand the understanding of glioma etiology. Furthermore, germline markers for gliomas are typically undetectable; therefore we also hypothesize that the predictability of cancer-associated microsatellite loci in germline DNA may support the current hypothesis of a glioma cell of origin. In this study, "normal" germline exome sequenced DNA from the 1000 Genomes Project (n=390) were compared with exome sequences from germlines of subjects with WHO grade II and III lower-grade glioma (LGG, n=136) and WHO grade IV glioblastoma (GBM, n=252) from The Cancer Genome Atlas to identify microsatellite loci non-randomly associated with glioma. From germline data, we identified 48 GBM-specific loci, 42 Lower-grade glioma specific loci and 29 loci that distinguish GBM from LGG (p≤ 0.01). We then attempted to distinguish WHO grade II glioma (n=67) from GBM resulting in 8 informative loci. Significantly, in all glioma grades, comparisons between tumor and matched germline sequences demonstrated no significant differences in these variants (p≥ 0.01). Therefore, these microsatellite loci are considered to be components of grade-specific signatures for glioma which distinguish germline sequences of individuals with cancer from those of individuals that are "normal". In order to better understand the significance of these loci, we identified biological processes enriched in genes with these variants. Most strikingly, six helicase genes were enriched in the GBM cohort (p≤ 1.0 x10⁻³). The preservation of these glioma-specific loci could therefore serve as valuable diagnostic and therapeutic markers; especially since the heterogeneity of tumor cell populations can obscure the identification of mutations preceding a metastatic phenotype.


Brain Neoplasms/genetics , Brain Neoplasms/pathology , Glioblastoma/genetics , Glioblastoma/pathology , Glioma/genetics , Glioma/pathology , Introns , Microsatellite Repeats , Brain Neoplasms/metabolism , Cell Differentiation/genetics , Female , Genomics , Glioblastoma/metabolism , Glioma/metabolism , Humans , Male
20.
Proc Natl Acad Sci U S A ; 111(29): 10630-5, 2014 Jul 22.
Article En | MEDLINE | ID: mdl-25006263

Repeat sequences, especially mobile elements, make up large portions of most eukaryotic genomes and provide enormous, albeit commonly underappreciated, evolutionary potential. We analyzed repeatomes of Drosophila melanogaster that have been diverging in response to a microclimate contrast in Evolution Canyon (Mount Carmel, Israel), a natural evolutionary laboratory with two abutting slopes at an average distance of only 200 m, which pose a constant ecological challenge to their local biotas. Flies inhabiting the colder and more humid north-facing slope carried about 6% more transposable elements than those from the hot and dry south-facing slope, in parallel to a suite of other genetic and phenotypic differences between the two populations. Nearly 50% of all mobile element insertions were slope unique, with many of them disrupting coding sequences of genes critical for cognition, olfaction, and thermotolerance, consistent with the observed patterns of thermotolerance differences and assortative mating.


Biological Evolution , Drosophila melanogaster/genetics , Genetic Variation , Microclimate , Repetitive Sequences, Nucleic Acid/genetics , Animals , Base Sequence , Chromosomes, Insect/genetics , DNA Transposable Elements/genetics , Israel , Microsatellite Repeats/genetics , Polymorphism, Single Nucleotide/genetics , X Chromosome/genetics
...