ABSTRACT
The gut microbiome of companion animals is relatively underexplored, despite its relevance to animal health, pet owner health, and basic microbial community biology. Here, we provide the most comprehensive analysis of the canine and feline gut microbiomes to date, incorporating 2639 stool shotgun metagenomes (2272 dog and 367 cat) spanning 14 publicly available datasets (n = 730) and 8 new study populations (n = 1909). These are compared with 238 and 112 baseline human gut metagenomes from the Human Microbiome Project 1-II and a traditionally living Malagasy cohort, respectively, processed in a manner identical to the animal metagenomes. All microbiomes were characterized using reference-based taxonomic and functional profiling, as well as de novo assembly yielding metagenomic assembled genomes clustered into species-level genome bins. Companion animals shared 184 species-level genome bins not found in humans, whereas 198 were found in all three hosts. We applied novel methodology to distinguish strains of these shared organisms either transferred or unique to host species, with phylogenetic patterns suggesting host-specific adaptation of microbial lineages. This corresponded with functional divergence of these lineages by host (e.g. differences in metabolic and antibiotic resistance genes) likely important to companion animal health. This study provides the largest resource to date of companion animal gut metagenomes and greatly contributes to our understanding of the "One Health" concept of a shared microbial environment among humans and companion animals, affecting infectious diseases, immune response, and specific genetic elements.
Subject(s)
Feces , Gastrointestinal Microbiome , Metagenome , Metagenomics , Pets , Phylogeny , Animals , Gastrointestinal Microbiome/genetics , Dogs/microbiology , Cats , Pets/microbiology , Feces/microbiology , Humans , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification
ABSTRACT
Diet impacts human health, influencing body adiposity and the risk of developing cardiometabolic diseases. The gut microbiome is a key player in the diet-health axis, but while its bacterial fraction is widely studied, the role of micro-eukaryotes, including Blastocystis, is underexplored. We performed a global-scale analysis on 56,989 metagenomes and showed that human Blastocystis exhibits distinct prevalence patterns linked to geography, lifestyle, and dietary habits. Blastocystis presence defined a specific bacterial signature and was positively associated with more favorable cardiometabolic profiles and negatively with obesity (p < 1e-16) and disorders linked to altered gut ecology (p < 1e-8). In a diet intervention study involving 1,124 individuals, improvements in dietary quality were linked to weight loss and increases in Blastocystis prevalence (p = 0.003) and abundance (p < 1e-7). Our findings suggest a potentially beneficial role for Blastocystis, which may help explain personalized host responses to diet and downstream disease etiopathogenesis.
Subject(s)
Blastocystis , Diet , Gastrointestinal Microbiome , Obesity , Humans , Blastocystis/metabolism , Male , Female , Blastocystis Infections , Adult , Middle Aged , Intestines/parasitology , Intestines/microbiology , Cardiovascular Diseases/prevention & control , Metagenome
ABSTRACT
The gut microbiome affects the inflammatory environment through effects on T-cells, which influence the production of immune mediators and inflammatory cytokines that stimulate osteoclastogenesis and bone loss in mice. However, there are few large human studies of the gut microbiome and skeletal health. We investigated the association between the human gut microbiome and high-resolution peripheral quantitative computed tomography (HR-pQCT) scans of the radius and tibia in two large cohorts: the Framingham Heart Study (FHS [n=1227, age range: 32-89]) and the Osteoporosis in Men Study (MrOS [n=836, age range: 78-98]). Stool samples from study participants underwent amplification and sequencing of the V4 hypervariable region of the 16S rRNA gene. The resulting 16S rRNA sequencing data were processed separately for each cohort with the DADA2 pipeline incorporated in the 16S bioBakery workflow. Resulting amplicon sequence variants were assigned taxonomies using the SILVA reference database. Controlling for multiple covariates, we tested for associations between microbial taxa abundances and HR-pQCT measures using general linear models as implemented in Microbiome Multivariable Association with Linear Models (MaAsLin2). Abundances of 37 microbial genera in FHS and 4 genera in MrOS were associated with various skeletal measures (false discovery rate [FDR] ≤ 0.1), including the association of DTU089 with bone measures, which was independently replicated in the two cohorts. A meta-analysis of the taxa-bone associations further revealed (FDR ≤ 0.25) that greater abundances of the genera Akkermansia and DTU089 were associated with lower radius total vBMD and tibia cortical vBMD, respectively. Conversely, higher abundances of the genera Lachnospiraceae NK4A136 group and Faecalibacterium were associated with greater tibia cortical vBMD.
We also investigated functional capabilities of microbial taxa by testing for associations between the predicted (based on 16S rRNA amplicon sequence data) abundance of metabolic pathways and bone phenotypes in each cohort. While no functional associations were concordant between the two cohorts, a meta-analysis revealed 8 pathways, including the super-pathway of histidine, purine, and pyrimidine biosynthesis, associated with bone measures of the tibia cortical compartment. In conclusion, our findings suggest that there is a link between the gut microbiome and skeletal metabolism.
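The FDR thresholds quoted in this abstract (FDR ≤ 0.1 per cohort, FDR ≤ 0.25 in the meta-analysis) can be illustrated with a minimal sketch of multiple-testing correction. It assumes the Benjamini-Hochberg procedure, which the abstract does not name, and the p-values are hypothetical placeholders rather than results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five genus-versus-bone-measure tests.
p = [0.001, 0.04, 0.03, 0.2, 0.8]
q = benjamini_hochberg(p)

# Keep the tests whose q-value passes the per-cohort threshold.
significant = [i for i, q_i in enumerate(q) if q_i <= 0.1]
print(significant)
```

With these placeholder inputs the first three tests pass the 0.1 threshold. Loosening the cutoff to 0.25, as in the meta-analysis, would also admit the fourth.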
Subject(s)
Bone Density , Gastrointestinal Microbiome , Adult , Aged , Aged, 80 and over , Humans , Male , Middle Aged , Bone and Bones , Bone Density/genetics , Cohort Studies , Gastrointestinal Microbiome/genetics , RNA, Ribosomal, 16S/genetics
ABSTRACT
Musculoskeletal diseases affect up to 20% of adults worldwide. The gut microbiome has been implicated in inflammatory conditions, but large-scale metagenomic evaluations have not yet traced the routes by which immunity in the gut affects inflammatory arthritis. To characterize the community structure and associated functional processes driving gut microbial involvement in arthritis, the Inflammatory Arthritis Microbiome Consortium investigated 440 stool shotgun metagenomes comprising 221 adults diagnosed with rheumatoid arthritis, ankylosing spondylitis, or psoriatic arthritis and 219 healthy controls and individuals with joint pain without an underlying inflammatory cause. Diagnosis explained about 2% of gut taxonomic variability, which is comparable in magnitude to inflammatory bowel disease. We identified several candidate microbes with differential carriage patterns in patients with elevated blood markers for inflammation. Our results confirm and extend previous findings of increased carriage of typically oral and inflammatory taxa and decreased abundance and prevalence of typical gut clades, indicating that distal inflammatory conditions, as well as local conditions, correspond to alterations in gut microbial composition. We identified several differentially encoded pathways in the gut microbiome of patients with inflammatory arthritis, including changes in vitamin B salvage and biosynthesis and enrichment of iron sequestration. Although several of these changes characteristic of inflammation could have causal roles, we hypothesize that they are mainly positive feedback responses to changes in host physiology and immune homeostasis. By connecting taxonomic alterations to functional alterations, this work expands our understanding of the shifts in the gut ecosystem that occur in response to systemic inflammation during arthritis.
Subject(s)
Arthritis, Rheumatoid , Gastrointestinal Microbiome , Microbiota , Humans , Gastrointestinal Microbiome/genetics , Inflammation , Phenotype , Metabolic Networks and Pathways
ABSTRACT
Metagenomic assembly enables new organism discovery from microbial communities, but it captures only a few abundant organisms from most metagenomes. Here we present MetaPhlAn 4, which integrates information from metagenome assemblies and microbial isolate genomes for more comprehensive metagenomic taxonomic profiling. From a curated collection of 1.01 M prokaryotic reference and metagenome-assembled genomes, we define unique marker genes for 26,970 species-level genome bins, 4,992 of them taxonomically unidentified at the species level. MetaPhlAn 4 explains ~20% more reads in most international human gut microbiomes and >40% more in less-characterized environments such as the rumen microbiome. It proves more accurate than available alternatives on synthetic evaluations while also reliably quantifying organisms with no cultured isolates. Application of the method to >24,500 metagenomes highlights previously undetected species as strong biomarkers for host conditions and lifestyles in human and mouse microbiomes and shows that even previously uncharacterized species can be genetically profiled at the resolution of single microbial strains.
Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , Animals , Mice , Metagenome/genetics , Microbiota/genetics , Metagenomics/methods , Phylogeny
ABSTRACT
MOTIVATION: Modern biological screens yield enormous numbers of measurements, and identifying and interpreting statistically significant associations among features are essential. In experiments featuring multiple high-dimensional datasets collected from the same set of samples, it is useful to identify groups of associated features between the datasets in a way that provides high statistical power and false discovery rate (FDR) control. RESULTS: Here, we present a novel hierarchical framework, HAllA (Hierarchical All-against-All association testing), for structured association discovery between paired high-dimensional datasets. HAllA efficiently integrates hierarchical hypothesis testing with FDR correction to reveal significant linear and non-linear block-wise relationships among continuous and/or categorical data. We optimized and evaluated HAllA using heterogeneous synthetic datasets of known association structure, where HAllA outperformed all-against-all and other block-testing approaches across a range of common similarity measures. We then applied HAllA to a series of real-world multiomics datasets, revealing new associations between gene expression and host immune activity, the microbiome and host transcriptome, metabolomic profiling and human health phenotypes. AVAILABILITY AND IMPLEMENTATION: An open-source implementation of HAllA is freely available at http://huttenhower.sph.harvard.edu/halla along with documentation, demo datasets and a user group. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Microbiota , Transcriptome
ABSTRACT
Microbial communities and their associated bioactive compounds [1-3] are often disrupted in conditions such as the inflammatory bowel diseases (IBD) [4]. However, even in well-characterized environments (for example, the human gastrointestinal tract), more than one-third of microbial proteins are uncharacterized and often expected to be bioactive [5-7]. Here we systematically identified more than 340,000 protein families as potentially bioactive with respect to gut inflammation during IBD, about half of which have not to our knowledge been functionally characterized previously on the basis of homology or experiment. To validate prioritized microbial proteins, we used a combination of metagenomics, metatranscriptomics and metaproteomics to provide evidence of bioactivity for a subset of proteins that are involved in host and microbial cell-cell communication in the microbiome; for example, proteins associated with adherence or invasion processes, and extracellular von Willebrand-like factors. Predictions from high-throughput data were validated using targeted experiments that revealed the differential immunogenicity of prioritized Enterobacteriaceae pilins and the contribution of homologues of von Willebrand factors to the formation of Bacteroides biofilms in a manner dependent on mucin levels. This methodology, which we term MetaWIBELE (workflow to identify novel bioactive elements in the microbiome), is generalizable to other environmental communities and human phenotypes. The prioritized results provide thousands of candidate microbial proteins that are likely to interact with the host immune system in IBD, thus expanding our understanding of potentially bioactive gene products in chronic disease states and offering a rational compendium of possible therapeutic compounds and targets.
Subject(s)
Bacterial Proteins , Gastrointestinal Microbiome , Genes, Microbial , Inflammatory Bowel Diseases , Bacterial Proteins/analysis , Bacterial Proteins/genetics , Chronic Disease , Gastrointestinal Microbiome/genetics , Humans , Inflammatory Bowel Diseases/microbiology , Metagenomics , Proteomics , Reproducibility of Results , Transcriptome
ABSTRACT
It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.
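The modelling idea in this abstract (normalize sparse counts, transform them, then fit a linear model per feature) can be sketched in a few lines. This is a hedged illustration, not MaAsLin 2's code: it assumes total-sum scaling, a log2 transform with a hypothetical fixed pseudo-count, and a single fixed-effect covariate with no random effects or p-values. All sample values are invented.

```python
import math


def tss_normalize(counts):
    """Scale one sample's counts so that they sum to 1 (total-sum scaling)."""
    total = sum(counts)
    return [c / total for c in counts]


def log_transform(values, pseudo_count):
    """log2-transform after adding a pseudo-count so that zeros stay finite."""
    return [math.log2(v + pseudo_count) for v in values]


def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x for a single covariate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical counts: four samples (rows) by two features (columns).
samples = [[10, 90], [20, 80], [40, 60], [80, 20]]
# Hypothetical binary metadata: 0 = control, 1 = case.
case_status = [0, 0, 1, 1]

PSEUDO_COUNT = 1e-6  # assumed value, chosen only for this illustration

relative = [tss_normalize(s) for s in samples]
feature_0 = log_transform([r[0] for r in relative], PSEUDO_COUNT)
slope = ols_slope(case_status, feature_0)
print(round(slope, 3))
```

For a binary covariate the slope equals the difference in mean log2 abundance between the two groups, so a slope near 2 corresponds to roughly a four-fold higher abundance in the hypothetical cases.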
Subject(s)
Computational Biology , Gastrointestinal Microbiome , Multivariate Analysis , Computer Simulation , Humans , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/metabolism , Inflammatory Bowel Diseases/pathology
ABSTRACT
Culture-independent analyses of microbial communities have progressed dramatically in the last decade, particularly due to advances in methods for biological profiling via shotgun metagenomics. Opportunities for improvement continue to accelerate, with greater access to multi-omics, microbial reference genomes, and strain-level diversity. To leverage these, we present bioBakery 3, a set of integrated, improved methods for taxonomic, strain-level, functional, and phylogenetic profiling of metagenomes newly developed to build on the largest set of reference sequences now available. Compared to current alternatives, MetaPhlAn 3 increases the accuracy of taxonomic profiling, and HUMAnN 3 improves that of functional potential and activity. These methods detected novel disease-microbiome links in applications to CRC (1262 metagenomes) and IBD (1635 metagenomes and 817 metatranscriptomes). Strain-level profiling of an additional 4077 metagenomes with StrainPhlAn 3 and PanPhlAn 3 unraveled the phylogenetic and functional structure of the common gut microbe Ruminococcus bromii, previously described by only 15 isolate genomes. With open-source implementations and cloud-deployable reproducible workflows, the bioBakery 3 platform can help researchers deepen the resolution, scale, and accuracy of multi-omic profiling for microbial community studies.
Subject(s)
Bacteria/classification , Bacteria/genetics , Computational Biology/methods , Metagenome , Microbiota/genetics , Microbiota/physiology , Phylogeny , Bacteria/metabolism , Humans , Metagenomics/methods , Research Personnel , Ruminococcus/classification , Ruminococcus/genetics , Workflow
ABSTRACT
A lack of prospective studies has been a major barrier for assessing the role of the microbiome in human health and disease on a population-wide scale. To address this significant knowledge gap, we have launched a large-scale collection targeting fecal and oral microbiome specimens from 20,000 women within the Nurses' Health Study II cohort (the Microbiome Among Nurses study, or Micro-N). Leveraging the rich epidemiologic data that have been repeatedly collected from this cohort since 1989; the established biorepository of archived blood, urine, buccal cell, and tumor tissue specimens; the available genetic and biomarker data; the cohort's ongoing follow-up; and the BIOM-Mass microbiome research platform, Micro-N furnishes unparalleled resources for future prospective studies to interrogate the interplay between host, environmental factors, and the microbiome in human health. These prospectively collected materials will provide much-needed evidence to infer causality in microbiome-associated outcomes, paving the way toward development of microbiota-targeted modulators, preventives, diagnostics and therapeutics. Here, we describe a generalizable, scalable and cost-effective platform used for stool and oral microbiome specimen and metadata collection in the Micro-N study as an example of how prospective studies of the microbiome may be carried out.
Subject(s)
Gastrointestinal Microbiome , Specimen Handling/methods , Adult , Aged , Female , Humans , Middle Aged , Nurses , Prospective Studies , Specimen Handling/instrumentation , Surveys and Questionnaires
ABSTRACT
In the Supplementary Tables 2, 4 and 6 originally published with this Article, the authors mistakenly included sample identifiers in the form of UMCGs rather than UMCG IBDs in the validation cohort; this has now been amended.
ABSTRACT
The inflammatory bowel diseases (IBDs), which include Crohn's disease (CD) and ulcerative colitis (UC), are multifactorial chronic conditions of the gastrointestinal tract. While IBD has been associated with dramatic changes in the gut microbiota, changes in the gut metabolome (the molecular interface between host and microbiota) are less well understood. To address this gap, we performed untargeted metabolomic and shotgun metagenomic profiling of cross-sectional stool samples from discovery (n = 155) and validation (n = 65) cohorts of CD, UC and non-IBD control patients. Metabolomic and metagenomic profiles were broadly correlated with faecal calprotectin levels (a measure of gut inflammation). Across >8,000 measured metabolite features, we identified chemicals and chemical classes that were differentially abundant in IBD, including enrichments for sphingolipids and bile acids, and depletions for triacylglycerols and tetrapyrroles. While >50% of differentially abundant metabolite features were uncharacterized, many could be assigned putative roles through metabolomic 'guilt by association' (covariation with known metabolites). Differentially abundant species and functions from the metagenomic profiles reflected adaptation to oxidative stress in the IBD gut, and were individually consistent with previous findings. Integrating these data, however, we identified 122 robust associations between differentially abundant species and well-characterized differentially abundant metabolites, indicating possible mechanistic relationships that are perturbed in IBD. Finally, we found that metabolome- and metagenome-based classifiers of IBD status were highly accurate and, like the vast majority of individual trends, generalized well to the independent validation cohort. Our findings thus provide an improved understanding of perturbations of the microbiome-metabolome interface in IBD, including identification of many potential diagnostic and therapeutic targets.
Subject(s)
Gastrointestinal Microbiome , Inflammatory Bowel Diseases/metabolism , Inflammatory Bowel Diseases/microbiology , Biodiversity , Biomarkers/metabolism , Colitis, Ulcerative/immunology , Colitis, Ulcerative/metabolism , Colitis, Ulcerative/microbiology , Crohn Disease/immunology , Crohn Disease/metabolism , Crohn Disease/microbiology , Feces/chemistry , Feces/microbiology , Gastrointestinal Microbiome/genetics , Gastrointestinal Microbiome/immunology , Humans , Inflammation/metabolism , Inflammation/microbiology , Inflammatory Bowel Diseases/immunology , Leukocyte L1 Antigen Complex/analysis , Metabolome , Metagenome
ABSTRACT
Functional profiles of microbial communities are typically generated using comprehensive metagenomic or metatranscriptomic sequence read searches, which are time-consuming, prone to spurious mapping, and often limited to community-level quantification. We developed HUMAnN2, a tiered search strategy that enables fast, accurate, and species-resolved functional profiling of host-associated and environmental communities. HUMAnN2 identifies a community's known species, aligns reads to their pangenomes, performs translated search on unclassified reads, and finally quantifies gene families and pathways. Relative to pure translated search, HUMAnN2 is faster and produces more accurate gene family profiles. We applied HUMAnN2 to study clinal variation in marine metabolism, ecological contribution patterns among human microbiome pathways, variation in species' genomic versus transcriptional contributions, and strain profiling. Further, we introduce 'contributional diversity' to explain patterns of ecological assembly across different microbial community types.
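The tiered strategy described in this abstract can be pictured with a toy sketch. It is not HUMAnN2's implementation: exact substring matching stands in for both nucleotide alignment and translated search, the species list is taken as given rather than detected, pathway quantification is omitted, and every name and sequence below is hypothetical.

```python
from collections import Counter


def tiered_profile(reads, detected_species, pangenomes, fallback_db):
    """Count gene-family hits per (family, species) with a two-tier search."""
    counts = Counter()
    for read in reads:
        hit = None
        # Tier 1: search only the pangenomes of species already detected.
        for species in detected_species:
            for family, sequence in pangenomes[species].items():
                if read in sequence:  # stand-in for nucleotide alignment
                    hit = (family, species)
                    break
            if hit is not None:
                break
        # Tier 2: reads still unclassified go to a broader fallback search.
        if hit is None:
            for family, sequence in fallback_db.items():
                if read in sequence:  # stand-in for translated search
                    hit = (family, "unclassified")
                    break
        # Reads that match nothing are tallied separately.
        counts[hit if hit is not None else ("UNMAPPED", "unclassified")] += 1
    return counts


# Hypothetical toy data.
pangenomes = {"species_A": {"famX": "ACGTACGTAA", "famY": "TTGGCCAATT"}}
fallback_db = {"famZ": "GGGGCCCCAAAA"}
reads = ["ACGTAC", "GGCCAA", "GCCCCA", "TTTTTT"]

profile = tiered_profile(reads, ["species_A"], pangenomes, fallback_db)
for key, n in sorted(profile.items()):
    print(key, n)
```

The point of the tiering is that most reads are resolved in the cheap, species-restricted first tier, so the expensive broad search only sees the remainder. The output keeps the contributing species alongside each gene family, which is what makes the profile species-resolved.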
Subject(s)
Bacteria/classification , Bacteria/genetics , Bacterial Proteins/genetics , Gene Expression Profiling , Metagenome , Software , Transcriptome , Bacteria/isolation & purification , Bacterial Proteins/metabolism , High-Throughput Nucleotide Sequencing , Humans , Microbiota , Species Specificity
ABSTRACT
Inflammatory bowel disease (IBD) is a group of chronic diseases of the digestive tract that affects millions of people worldwide. Genetic, environmental and microbial factors have been implicated in the onset and exacerbation of IBD. However, the mechanisms associating gut microbial dysbioses and aberrant immune responses remain largely unknown. The integrative Human Microbiome Project seeks to close these gaps by examining the dynamics of microbiome functionality in disease by profiling the gut microbiomes of >100 individuals sampled over a 1-year period. Here, we present the first results based on 78 paired faecal metagenomes and metatranscriptomes, and 222 additional metagenomes from 59 patients with Crohn's disease, 34 with ulcerative colitis and 24 non-IBD control patients. We demonstrate several cases in which measures of microbial gene expression in the inflamed gut can be informative relative to metagenomic profiles of functional potential. First, although many microbial organisms exhibited concordant DNA and RNA abundances, we also detected species-specific biases in transcriptional activity, revealing predominant transcription of pathways by individual microorganisms per host (for example, by Faecalibacterium prausnitzii). Thus, a loss of these organisms in disease may have more far-reaching consequences than suggested by their genomic abundances. Furthermore, we identified organisms that were metagenomically abundant but inactive or dormant in the gut with little or no expression (for example, Dialister invisus). Last, certain disease-specific microbial characteristics were more pronounced or only detectable at the transcript level, such as pathways that were predominantly expressed by different organisms in patients with IBD (for example, Bacteroides vulgatus and Alistipes putredinis). 
This provides potential insights into gut microbial pathway transcription that can vary over time, inducing phenotypical changes that are complementary to those linked to metagenomic abundances. The study's results highlight the strength of analysing both the activity and the presence of gut microorganisms to provide insight into the role of the microbiome in IBD.
Subject(s)
Gastrointestinal Microbiome/genetics , Inflammatory Bowel Diseases/microbiology , Metagenomics , Transcription, Genetic , Adolescent , Adult , Child , Colitis, Ulcerative/microbiology , Crohn Disease/microbiology , Dysbiosis , Feces/microbiology , Female , Gene Expression Profiling , Humans , Longitudinal Studies , Male , Phenotype , Young Adult
ABSTRACT
Summary: bioBakery is a meta'omic analysis environment and collection of individual software tools with the capacity to process raw shotgun sequencing data into actionable microbial community feature profiles, summary reports, and publication-ready figures. It includes a collection of pre-configured analysis modules also joined into workflows for reproducibility. Availability and implementation: bioBakery (http://huttenhower.sph.harvard.edu/biobakery) is publicly available for local installation as individual modules and as a virtual machine image. Each individual module has been developed to perform a particular task (e.g. quantitative taxonomic profiling or statistical analysis), and they are provided with source code, tutorials, demonstration data, and validation results; the bioBakery virtual image includes the entire suite of modules and their dependencies pre-installed. Images are available for both Amazon EC2 and Google Compute Engine. All software is open source under the MIT license. bioBakery is actively maintained with a support group at biobakery-users@googlegroups.com and new tools being added upon their release. Contact: chuttenh@hsph.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Subject(s)
Metagenomics/methods , Microbiota/genetics , Software , Reproducibility of Results , Workflow
ABSTRACT
The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLAST-ing each contig against normal RNA-Seq samples, and (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLAST-ing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and fosmid clones sequenced as part of the original Human Genome Project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA.
Subject(s)
Microsatellite Repeats , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Algorithms , Animals , Cell Line , Contig Mapping , Genome, Human , Genomics , Humans , Pan troglodytes/genetics
ABSTRACT
The pluripotent cells of the embryonic ectodermal tissues are known to be precursors of multiple tumor types. The adaptability of these cells is a trait exploited by cancer. We previously described cancer-associated microsatellite loci (CAML) shared between glioblastoma (GBM) and lower-grade gliomas. Therefore, we hypothesized that these variants, identified from germline DNA, are shared by cancers from tissues of ectodermal origin: neural tube cells (NTC) and neural crest cells (NCC). Using exome sequencing data from four cancers with origins in NTC and NCC, a 'signature' of loci significant to each cancer (p-value ≤ 0.01) was created and compared with previously identified CAML from breast cancer. The results of this analysis show that variant loci among the cancers with tissue origins in NTC/NCC were closely linked. Signaling pathways linked to genes with non-coding CAML genotypes revealed enriched connections to hereditary, neurological, and developmental diseases or disorders. Thus, variants in genes from tissues originating from NTC/NCC, if recurrently detected, may indicate a common etiology. Additionally, CAML genotypes from non-tumor DNA may predict cancer phenotypes and are common to shared embryonic tissues of origin.
Subject(s)
Exome , Glioblastoma/genetics , Glioblastoma/pathology , Microsatellite Repeats , Neoplasms, Germ Cell and Embryonal/genetics , Neoplasms, Germ Cell and Embryonal/pathology , Neural Crest/pathology , Neural Tube/pathology , Case-Control Studies , Female , Gene Frequency , Genotype , Humans , Male , Signal Transduction
ABSTRACT
Ovarian cancer (OV) ranks fifth in cancer deaths among women, yet there remain few informative biomarkers for this disease. Microsatellites are repetitive genomic regions that have traditionally been under-appreciated relative to single nucleotide polymorphisms (SNPs) and that we hypothesize could be a source of novel biomarkers for OV. In this study, we explore microsatellite variation as a potential novel source of genomic variation associated with OV. Exomes from 305 OV patient germline samples and 54 tumors, sequenced as part of The Cancer Genome Atlas, were analyzed for microsatellite variation and compared to healthy females sequenced as part of the 1000 Genomes Project. We identified a subset of 60 microsatellite loci with genotypes that varied significantly between the OV and healthy female populations. Using these loci as a signature set, we classified germline genomes as 'at risk' for OV with a sensitivity of 90.1% and a specificity of 87.6%. Cross-analysis with a similar set of breast cancer associated loci identified individuals 'at risk' for both diseases. This study revealed a genotype-based microsatellite signature present in the germlines of individuals diagnosed with OV, and provides the basis for a potential novel risk assessment diagnostic for OV and new personal genomics targets in tumors.
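For readers less familiar with the two classifier metrics quoted in this abstract, the sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are hypothetical placeholders chosen for round numbers. They are not the study's data, which the abstract does not report.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    # Sensitivity: share of true cases that the classifier flags.
    sensitivity = tp / (tp + fn)
    # Specificity: share of true non-cases that the classifier clears.
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts: 100 affected and 100 healthy germline genomes.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.1%} specificity={spec:.1%}")
```

In a risk-screening setting the two numbers trade off against each other: a looser 'at risk' cutoff raises sensitivity at the cost of specificity.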
Subject(s)
Biomarkers, Tumor/genetics , Genetic Variation , Genetics, Population , Microsatellite Repeats , Ovarian Neoplasms/genetics , Area Under Curve , Case-Control Studies , Computational Biology , Databases, Genetic , Exome , Female , Gene Expression Profiling , Genetic Association Studies , Genetic Predisposition to Disease , Genetics, Population/methods , Humans , Ovarian Neoplasms/pathology , Phenotype , Precision Medicine , Predictive Value of Tests , Prognosis , ROC Curve , Risk Assessment , Risk Factors
ABSTRACT
Several studies have demonstrated that unmapped reads in next generation sequencing data could be used to identify infectious agents or structural variants, but there has been no intensive effort to analyze and classify all non-human sequences found in individual large data sets. To identify commonality in non-human sequences by infectious agents and putative contamination events, we analyzed non-human sequences in 150 genomic sequencing data files from the 1000 Genomes Project and observed that 0.13% of reads on average showed similarities to non-human genomes. We compared results among different sample groups divided based on ethnicities, sequencing centers and enrichment methods (whole genome sequencing vs. exome sequencing) and found that sequencing centers had specific signatures of contaminating genomes as 'time stamps'. We also observed many unmapped reads that falsely indicated contamination because of the high similarity of human sequences to sequences in non-human genome assemblies such as mouse and Nicotiana.