ABSTRACT
Decades of genetic association testing in human cohorts have provided important insights into the genetic architecture and biological underpinnings of complex traits and diseases. However, for certain traits, genome-wide association studies (GWAS) for common SNPs are approaching signal saturation, which underscores the need to explore other types of genetic variation to understand the genetic basis of traits and diseases. Copy number variation (CNV) is an important source of heritability that is well known to functionally affect human traits. Recent technological and computational advances enable the large-scale, genome-wide evaluation of CNVs, with implications for downstream applications such as polygenic risk scoring and drug target identification. Here, we review the current state of CNV-GWAS, discuss current limitations in resource infrastructure that need to be overcome to enable the wider uptake of CNV-GWAS results, highlight emerging opportunities and suggest guidelines and standards for future GWAS for genetic variation beyond SNPs at scale.
ABSTRACT
The use of omic modalities to dissect the molecular underpinnings of common diseases and traits is becoming increasingly common. But multi-omic traits can be genetically predicted, which enables highly cost-effective and powerful analyses for studies that do not have multi-omics [1]. Here we examine a large cohort (the INTERVAL study [2]; n = 50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, n = 3,175; Olink, n = 4,822), plasma metabolomics (Metabolon HD4, n = 8,153), serum metabolomics (Nightingale, n = 37,359) and whole-blood Illumina RNA sequencing (n = 4,136), and use machine learning to train genetic scores for 17,227 molecular traits, including 10,521 that reach Bonferroni-adjusted significance. We evaluate the performance of genetic scores through external validation across cohorts of individuals of European, Asian and African American ancestries. In addition, we show the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of the UK Biobank [3] to identify disease associations using a phenome-wide scan. We highlight a series of biological insights with regard to genetic mechanisms in metabolism and canonical pathway associations with disease; for example, JAK-STAT signalling and coronary atherosclerosis. Finally, we develop a portal (https://www.omicspred.org/) to facilitate public access to all genetic scores and validation results, as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.
Subject(s)
Coronary Artery Disease; Multiomics; Humans; Coronary Artery Disease/genetics; Coronary Artery Disease/metabolism; Metabolomics/methods; Phenotype; Proteomics/methods; Machine Learning; Black or African American/genetics; Asian/genetics; European People/genetics; United Kingdom; Datasets as Topic; Internet; Reproducibility of Results; Cohort Studies; Proteome/analysis; Proteome/metabolism; Metabolome; Plasma/metabolism; Databases, Factual
ABSTRACT
Gene misexpression is the aberrant transcription of a gene in a context where it is usually inactive. Despite its known pathological consequences in specific rare diseases, we have a limited understanding of its wider prevalence and mechanisms in humans. To address this, we analyzed gene misexpression in 4,568 whole-blood bulk RNA sequencing samples from INTERVAL study blood donors. We found that while individual misexpression events occur rarely, in aggregate they were found in almost all samples and a third of inactive protein-coding genes. Using 2,821 paired whole-genome and RNA sequencing samples, we identified that misexpression events are enriched in cis for rare structural variants (SVs). We established putative mechanisms through which a subset of SVs lead to gene misexpression, including transcriptional readthrough, transcript fusions, and gene inversion. Overall, we develop misexpression as a type of transcriptomic outlier analysis and extend our understanding of the variety of mechanisms by which genetic variants can influence gene expression.
Subject(s)
Gene Expression Regulation; Humans; Sequence Analysis, RNA; Genetic Variation; Genomic Structural Variation/genetics; Transcriptome/genetics; Blood Donors
ABSTRACT
Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.
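For readers unfamiliar with the quantity being compared, a polygenic score can be sketched as a weighted sum of effect-allele dosages. The sketch below is only that generic definition; it does not implement any of the seven evaluated methods, which differ in how the per-variant weights are derived. The function name, variant identifiers and weights are hypothetical.

```python
def polygenic_score(weights, dosages):
    """Weighted sum of effect-allele dosages.

    weights: dict mapping variant ID -> per-allele effect size (hypothetical values)
    dosages: dict mapping variant ID -> effect-allele dosage for one person (0 to 2)
    Variants missing from either input are skipped (an assumption of this sketch).
    """
    shared = weights.keys() & dosages.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical example: three scored variants, one genotyped variant without a weight.
example_weights = {"var_A": 0.10, "var_B": -0.20, "var_C": 0.05}
example_dosages = {"var_A": 2, "var_B": 0, "var_C": 1, "var_D": 2}
score = polygenic_score(example_weights, example_dosages)
print(round(score, 4))
```

In practice the raw score is usually standardized within the target cohort before effect sizes are compared across methods or biobanks.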
Subject(s)
Biological Specimen Banks; Genome-Wide Association Study; Multifactorial Inheritance; Humans; Multifactorial Inheritance/genetics; Phenotype; Diabetes Mellitus, Type 1/genetics; Polymorphism, Single Nucleotide; Machine Learning
ABSTRACT
When B cells encounter an antigen, they alter their physiological state and anatomical localization and initiate a differentiation process that ultimately produces antibody-secreting cells (ASCs). We have defined the transcriptomes of many mature B cell populations and stages of plasma cell differentiation in mice. We provide a molecular signature of ASCs that highlights the stark transcriptional divide between B cells and plasma cells and enables the demarcation of ASCs on the basis of location and maturity. Changes in gene expression correlated with cell-division history and the acquisition of permissive histone modifications, and they included many regulators that had not been previously implicated in B cell differentiation. These findings both highlight and expand the core program that guides B cell terminal differentiation and the production of antibodies.
Subject(s)
Cell Differentiation/genetics; Plasma Cells/cytology; Plasma Cells/immunology; Transcriptome; Animals; B-Cell Maturation Antigen/genetics; Cell Division/genetics; Cell Movement/genetics; Cells, Cultured; Gene Expression Profiling; Histone Code/genetics; Lymphocyte Activation/genetics; Mice; Mice, Inbred C57BL; Mice, Knockout; Positive Regulatory Domain I-Binding Factor 1; RNA/analysis; Suppressor of Cytokine Signaling Proteins/genetics; Transcription Factors/genetics
ABSTRACT
Polygenic risk scores (PRSs), which often aggregate results from genome-wide association studies, can bridge the gap between initial discovery efforts and clinical applications for the estimation of disease risk using genetics. However, there is notable heterogeneity in the application and reporting of these risk scores, which hinders the translation of PRSs into clinical care. Here, in a collaboration between the Clinical Genome Resource (ClinGen) Complex Disease Working Group and the Polygenic Score (PGS) Catalog, we present the Polygenic Risk Score Reporting Standards (PRS-RS), in which we update the Genetic Risk Prediction Studies (GRIPS) Statement to reflect the present state of the field. Drawing on the input of experts in epidemiology, statistics, disease-specific applications, implementation and policy, this comprehensive reporting framework defines the minimal information that is needed to interpret and evaluate PRSs, especially with respect to downstream clinical applications. Items span detailed descriptions of study populations, statistical methods for the development and validation of PRSs and considerations for the potential limitations of these scores. In addition, we emphasize the need for data availability and transparency, and we encourage researchers to deposit and share PRSs through the PGS Catalog to facilitate reproducibility and comparative benchmarking. By providing these criteria in a structured format that builds on existing standards and ontologies, the use of this framework in publishing PRSs will facilitate translation into clinical care and progress towards defining best practice.
Subject(s)
Genetic Predisposition to Disease; Genetics, Medical/standards; Multifactorial Inheritance/genetics; Humans; Reproducibility of Results; Risk Assessment/standards
ABSTRACT
MOTIVATION: Protein-protein interactions (PPIs) are essential to understanding biological pathways as well as their roles in development and disease. Computational tools, based on classic machine learning, have been successful at predicting PPIs in silico, but the lack of consistent and reliable frameworks for this task has led to network models that are difficult to compare and discrepancies between algorithms that remain unexplained. RESULTS: To better understand the inference mechanisms that underpin these models, we designed an open-source framework for benchmarking that accounts for a range of biological and statistical pitfalls while facilitating reproducibility. We use it to shed light on the impact of network topology and how different algorithms deal with highly connected proteins. By studying functional genomics-based and sequence-based models on human PPIs, we show their complementarity as the former performs best on lone proteins while the latter specializes in interactions involving hubs. We also show that algorithm design has little impact on performance with functional genomic data. We replicate our results in both human and S. cerevisiae data and demonstrate that models using functional genomics are better suited to PPI prediction across species. With rapidly increasing amounts of sequence and functional genomics data, our study provides a principled foundation for future construction, comparison, and application of PPI networks. AVAILABILITY AND IMPLEMENTATION: The code and data are available on GitHub: https://github.com/Llannelongue/B4PPI.
Subject(s)
Protein Interaction Maps; Saccharomyces cerevisiae; Humans; Protein Interaction Maps/genetics; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Reproducibility of Results; Proteins/metabolism; Algorithms; Machine Learning; Protein Interaction Mapping/methods
ABSTRACT
Tissue-resident memory T cells (TRM cells) provide superior protection against infection in extralymphoid tissues. Here we found that CD103+CD8+ TRM cells developed in the skin from epithelium-infiltrating precursor cells that lacked expression of the effector-cell marker KLRG1. A combination of entry into the epithelium plus local signaling by interleukin 15 (IL-15) and transforming growth factor-β (TGF-β) was required for the formation of these long-lived memory cells. Notably, differentiation into TRM cells resulted in the progressive acquisition of a unique transcriptional profile that differed from that of circulating memory cells and other types of T cells that permanently reside in skin epithelium. We provide a comprehensive molecular framework for the local differentiation of a distinct peripheral population of memory cells that forms a first-line immunological defense system in barrier tissues.
Subject(s)
Antigens, CD/immunology; CD8-Positive T-Lymphocytes/immunology; Immunologic Memory/immunology; Integrin alpha Chains/immunology; Signal Transduction/immunology; Skin/immunology; Animals; Antigens, CD/genetics; Antigens, CD/metabolism; Antigens, Differentiation, T-Lymphocyte/genetics; Antigens, Differentiation, T-Lymphocyte/immunology; Antigens, Differentiation, T-Lymphocyte/metabolism; CD8-Positive T-Lymphocytes/metabolism; CD8-Positive T-Lymphocytes/virology; Cell Differentiation/genetics; Cell Differentiation/immunology; Flow Cytometry; Herpes Simplex/immunology; Herpes Simplex/virology; Herpesvirus 1, Human/immunology; Herpesvirus 1, Human/physiology; Host-Pathogen Interactions/immunology; Integrin alpha Chains/genetics; Integrin alpha Chains/metabolism; Interleukin-15/genetics; Interleukin-15/immunology; Interleukin-15/metabolism; Lectins, C-Type/genetics; Lectins, C-Type/immunology; Lectins, C-Type/metabolism; Mice; Mice, Inbred C57BL; Mice, Inbred Strains; Mice, Knockout; Mice, Transgenic; Oligonucleotide Array Sequence Analysis; Protein Serine-Threonine Kinases/genetics; Protein Serine-Threonine Kinases/immunology; Protein Serine-Threonine Kinases/metabolism; Receptor, Transforming Growth Factor-beta Type II; Receptors, Immunologic/genetics; Receptors, Immunologic/immunology; Receptors, Immunologic/metabolism; Receptors, Transforming Growth Factor beta/genetics; Receptors, Transforming Growth Factor beta/immunology; Receptors, Transforming Growth Factor beta/metabolism; Reverse Transcriptase Polymerase Chain Reaction; Signal Transduction/genetics; Skin/metabolism; Skin/virology; Transcriptome/genetics; Transcriptome/immunology
ABSTRACT
BACKGROUND: Dyslipidemia is treated effectively with statins, but treatment has the potential to induce new-onset type-2 diabetes (T2D). Gut microbiota may contribute to this outcome variability. We assessed the associations of gut microbiota diversity and composition with statins. Bacterial associations with statin-associated new-onset T2D risk were also prospectively evaluated. METHODS: We examined shallow-shotgun-sequenced fecal samples from 5755 individuals in the FINRISK-2002 population cohort with a 17+-year-long register-based follow-up. Alpha-diversity was quantified using Shannon index and beta-diversity with Aitchison distance. Species-specific differential abundances were analyzed using general multivariate regression. Prospective associations were assessed with Cox regression. Applicable results were validated using gradient boosting. RESULTS: Statin use was associated with differing taxonomic composition (R², 0.02%; q=0.02) and 13 differentially abundant species in fully adjusted models (MaAsLin; q<0.05). The strongest positive association was with Clostridium sartagoforme (β=0.37; SE=0.13; q=0.02) and the strongest negative association with Bacteroides cellulosilyticus (β=-0.31; SE=0.11; q=0.02). Twenty-five microbial features had significant associations with incident T2D in statin users, of which only Bacteroides vulgatus (HR, 1.286 [1.136-1.457]; q=0.03) was consistent regardless of model adjustment. Finally, higher statin-associated T2D risk was seen with [Ruminococcus] torques (ΔHRstatins, +0.11; q=0.03), Blautia obeum (ΔHRstatins, +0.06; q=0.01), Blautia sp. KLE 1732 (ΔHRstatins, +0.05; q=0.01), and beta-diversity principal component 1 (ΔHRstatins, +0.07; q=0.03) but only when adjusting for demographic covariates. CONCLUSIONS: Statin users have compositionally differing microbiotas from nonusers. The human gut microbiota is associated with incident T2D risk in statin users and possibly has additive effects on statin-associated new-onset T2D risk.
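The two diversity measures named in the methods can be sketched in a few lines. This is a generic illustration of the Shannon index and of the Aitchison distance (Euclidean distance between centred log-ratio transformed compositions), not the study's pipeline; the pseudocount used to handle zero counts and the function names are assumptions of this sketch.

```python
import math


def shannon_index(counts):
    """Shannon alpha-diversity (natural log) of one sample's taxon counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def clr(counts, pseudocount=1.0):
    """Centred log-ratio transform.

    The pseudocount is an assumed way of handling zeros; it must be
    positive whenever the sample contains a zero count.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(sample_a, sample_b, pseudocount=1.0):
    """Euclidean distance between the CLR-transformed samples."""
    a = clr(sample_a, pseudocount)
    b = clr(sample_b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Four equally abundant taxa give the maximum Shannon index for four taxa, ln(4).
even_sample = [10, 10, 10, 10]
print(shannon_index(even_sample))
```

Without a pseudocount the Aitchison distance depends only on the ratios between taxa, so two samples that differ only in sequencing depth are at distance zero.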
Subject(s)
Diabetes Mellitus, Type 2; Dyslipidemias; Gastrointestinal Microbiome; Hydroxymethylglutaryl-CoA Reductase Inhibitors; Humans; Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects; Cross-Sectional Studies; Diabetes Mellitus, Type 2/diagnosis; Diabetes Mellitus, Type 2/epidemiology; Dyslipidemias/diagnosis; Dyslipidemias/drug therapy; Dyslipidemias/epidemiology
ABSTRACT
The NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas) is a FAIR knowledgebase providing detailed, structured, standardised and interoperable genome-wide association study (GWAS) data to >200 000 users per year from academic research, healthcare and industry. The Catalog contains variant-trait associations and supporting metadata for >45 000 published GWAS across >5000 human traits, and >40 000 full P-value summary statistics datasets. Content is curated from publications or acquired via author submission of prepublication summary statistics through a new submission portal and validation tool. GWAS data volume has vastly increased in recent years. We have updated our software to meet this scaling challenge and to enable rapid release of submitted summary statistics. The scope of the repository has expanded to include additional data types of high interest to the community, including sequencing-based GWAS, gene-based analyses and copy number variation analyses. Community outreach has increased the number of shared datasets from under-represented traits, e.g. cancer, and we continue to contribute to awareness of the lack of population diversity in GWAS. Interoperability of the Catalog has been enhanced through links to other resources including the Polygenic Score Catalog and the International Mouse Phenotyping Consortium, refinements to GWAS trait annotation, and the development of a standard format for GWAS data.
Subject(s)
Genome-Wide Association Study; Knowledge Bases; Animals; Humans; Mice; DNA Copy Number Variations; National Human Genome Research Institute (U.S.); Phenotype; Polymorphism, Single Nucleotide; Software; United States
ABSTRACT
For Alzheimer's disease, a leading cause of dementia and global morbidity, improved identification of presymptomatic high-risk individuals and identification of new circulating biomarkers are key public health needs. Here, we tested the hypothesis that a polygenic predictor of risk for Alzheimer's disease would identify a subset of the population with increased risk of clinically diagnosed dementia, subclinical neurocognitive dysfunction, and a differing circulating proteomic profile. Using summary association statistics from a recent genome-wide association study, we first developed a polygenic predictor of Alzheimer's disease comprising 7.1 million common DNA variants. We noted a 7.3-fold (95% CI 4.8 to 11.0; p < 0.001) gradient in risk across deciles of the score among 288,289 middle-aged participants of the UK Biobank study. In cross-sectional analyses stratified by age, minimal differences in risk of Alzheimer's disease and performance on a digit recall test were present according to polygenic score decile at age 50 years, but significant gradients emerged by age 65. Similarly, among 30,541 participants of the Mass General Brigham Biobank, we again noted no significant differences in Alzheimer's disease diagnosis at younger ages across deciles of the score, but for those over 65 years we noted an odds ratio of 2.0 (95% CI 1.3 to 3.2; p = 0.002) in the top versus bottom decile of the polygenic score. To understand the proteomic signature of inherited risk, we performed aptamer-based profiling in 636 blood donors (mean age 43 years) with very high or low polygenic scores. In addition to the well-known apolipoprotein E biomarker, this analysis identified 27 additional proteins, several of which have known roles related to disease pathogenesis. Differences in protein concentrations were consistent even among the youngest subset of blood donors (mean age 33 years). Of these 28 proteins, 7 of the 8 proteins with concentrations available were similarly associated with the polygenic score in participants of the Multi-Ethnic Study of Atherosclerosis. These data highlight the potential for a DNA-based score to identify high-risk individuals during the prolonged presymptomatic phase of Alzheimer's disease and to enable biomarker discovery based on profiling of young individuals in the extremes of the score distribution.
Subject(s)
Alzheimer Disease; Adult; Aged; Alzheimer Disease/pathology; Biomarkers; Cross-Sectional Studies; Genome-Wide Association Study; Humans; Middle Aged; Proteomics
ABSTRACT
The number of publicly available microbiome samples is continually growing. As data set size increases, bottlenecks arise in standard analytical pipelines. Faith's phylogenetic diversity (Faith's PD) is a highly utilized phylogenetic alpha diversity metric that has thus far failed to effectively scale to trees with millions of vertices. Stacked Faith's phylogenetic diversity (SFPhD) enables calculation of this widely adopted diversity metric at a much larger scale by implementing a computationally efficient algorithm. The algorithm reduces the amount of computational resources required, resulting in more accessible software with a reduced carbon footprint, as compared to previous approaches. The new algorithm produces identical results to the previous method. We further demonstrate that the phylogenetic aspect of Faith's PD provides increased power in detecting diversity differences between younger and older populations in the FINRISK study's metagenomic data.
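For orientation, Faith's PD of a sample is the total branch length of the part of the phylogeny that connects the taxa observed in that sample. The sketch below is a naive reference computation on a small rooted tree, not the SFPhD algorithm described above; the tree encoding (parent pointers plus a branch length per non-root node), the convention of including the path to the root, and all names are assumptions made for illustration.

```python
def faith_pd(parent, branch_length, observed_tips):
    """Sum the lengths of all branches on the paths from observed tips to the root.

    parent: dict mapping each non-root node to its parent node
    branch_length: dict mapping each non-root node to the length of the
        branch joining it to its parent
    observed_tips: iterable of tip names present in the sample
    Each branch is counted once, however many observed tips share it.
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Climb towards the root; stop at the root (no parent entry)
        # or at a branch already counted for an earlier tip.
        while node in parent and node not in visited:
            visited.add(node)
            total += branch_length[node]
            node = parent[node]
    return total


# Hypothetical tree: root -> A (1.0) -> t1 (0.5), t2 (0.25); root -> t3 (2.0)
toy_parent = {"A": "root", "t1": "A", "t2": "A", "t3": "root"}
toy_length = {"A": 1.0, "t1": 0.5, "t2": 0.25, "t3": 2.0}
print(faith_pd(toy_parent, toy_length, ["t1", "t2"]))
```

Walking every tip to the root for every sample repeats work on shared branches, which is one reason a naive approach of this kind becomes costly on trees with millions of vertices.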
Subject(s)
Microbiota; Microbiota/genetics; Phylogeny
ABSTRACT
BACKGROUND: The gut-lung axis is generally recognized, but there are few large studies of the gut microbiome and incident respiratory disease in adults. OBJECTIVE: We sought to investigate the association and predictive capacity of the gut microbiome for incident asthma and chronic obstructive pulmonary disease (COPD). METHODS: Shallow metagenomic sequencing was performed for stool samples from a prospective, population-based cohort (FINRISK02; N = 7115 adults) with linked national administrative health register-derived classifications for incident asthma and COPD up to 15 years after baseline. Generalized linear models and Cox regressions were used to assess associations of microbial taxa and diversity with disease occurrence. Predictive models were constructed using machine learning with extreme gradient boosting. Models considered taxa abundances individually and in combination with other risk factors, including sex, age, body mass index, and smoking status. RESULTS: A total of 695 and 392 statistically significant associations were found between baseline taxonomic groups and incident asthma and COPD, respectively. Gradient boosting decision trees of baseline gut microbiome abundance predicted incident asthma and COPD in the validation data sets with mean areas under the curve of 0.608 and 0.780, respectively. Cox analysis showed that the baseline gut microbiome achieved higher predictive performance than individual conventional risk factors, with C-indices of 0.623 for asthma and 0.817 for COPD. The integration of the gut microbiome and conventional risk factors further improved prediction capacities. CONCLUSIONS: The gut microbiome is a significant risk factor for incident asthma and incident COPD and is largely independent of conventional risk factors.
Subject(s)
Asthma; Gastrointestinal Microbiome; Pulmonary Disease, Chronic Obstructive; Adult; Humans; Prospective Studies; Risk Factors
ABSTRACT
Bioinformatic research relies on large-scale computational infrastructures which have a nonzero carbon footprint, but so far no study has quantified the environmental costs of bioinformatic tools and commonly run analyses. In this work, we estimate the carbon footprint of bioinformatics (in kilograms of CO2 equivalent units, kgCO2e) using the freely available Green Algorithms calculator (www.green-algorithms.org, last accessed 2022). We assessed 1) bioinformatic approaches in genome-wide association studies (GWAS), RNA sequencing, genome assembly, metagenomics, phylogenetics, and molecular simulations, as well as 2) computation strategies, such as parallelization, CPU (central processing unit) versus GPU (graphics processing unit), cloud versus local computing infrastructure, and geography. In particular, we found that biobank-scale GWAS emitted substantial kgCO2e and that simple software upgrades could make it greener; for example, upgrading from BOLT-LMM v1 to v2.3 reduced carbon footprint by 73%. Moreover, switching from the average data center to a more efficient one can reduce carbon footprint by approximately 34%. Memory over-allocation can also be a substantial contributor to an algorithm's greenhouse gas emissions. The use of faster processors or greater parallelization reduces running time but can lead to greater carbon footprint. Finally, we provide guidance on how researchers can reduce power consumption and minimize kgCO2e. Overall, this work elucidates the carbon footprint of common analyses in bioinformatics and provides solutions which empower a move toward greener research.
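The trade-offs described above (memory over-allocation, parallelization versus running time, data-center efficiency) can be made concrete with a small calculation. The sketch below follows, to the best of our understanding, the general shape of the model behind such calculators: energy is running time multiplied by the power drawn by cores and memory and by the data-center overhead, and the footprint is that energy multiplied by the carbon intensity of the electricity used. The function name and every numeric value are illustrative placeholders, not figures from the study or the calculator.

```python
def carbon_footprint_kg(runtime_h, n_cores, core_power_w, core_usage,
                        memory_gb, memory_power_w_per_gb,
                        pue, carbon_intensity_g_per_kwh):
    """Estimate kgCO2e for one job under the assumptions stated above.

    pue: power usage effectiveness of the data center (>= 1.0)
    carbon_intensity_g_per_kwh: gCO2e emitted per kWh of electricity
    """
    power_w = n_cores * core_power_w * core_usage + memory_gb * memory_power_w_per_gb
    energy_kwh = runtime_h * power_w * pue / 1000.0
    return energy_kwh * carbon_intensity_g_per_kwh / 1000.0


# Placeholder settings shared by the scenarios below.
common = dict(core_power_w=10.0, core_usage=1.0,
              memory_power_w_per_gb=0.3725, pue=1.5,
              carbon_intensity_g_per_kwh=400.0)

baseline = carbon_footprint_kg(runtime_h=10, n_cores=8, memory_gb=32, **common)
# Same job with eight times the memory requested.
over_allocated = carbon_footprint_kg(runtime_h=10, n_cores=8, memory_gb=256, **common)
# Twice the cores, but the job finishes in 6 h rather than 5 h (imperfect scaling).
more_parallel = carbon_footprint_kg(runtime_h=6, n_cores=16, memory_gb=32, **common)

print(baseline, over_allocated, more_parallel)
```

With these placeholder inputs both the over-allocated and the more parallel runs come out above the baseline, which mirrors the qualitative statements in the abstract without reproducing its measurements.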
Subject(s)
Carbon Footprint; Computational Biology; Algorithms; Genome-Wide Association Study; Software
ABSTRACT
Allostery is a form of protein regulation, where ligands that bind sites located apart from the active site can modify the activity of the protein. The molecular mechanisms of allostery have been extensively studied, because allosteric sites are less conserved than active sites, and drugs targeting them are more specific than drugs binding the active sites. Here we quantify the importance of allostery in genetic disease. We show that 1) known allosteric proteins are central in disease networks, contribute to genetic disease and comorbidities much more than non-allosteric proteins, and there is an association between being allosteric and involvement in disease; 2) they are enriched in many major disease types like hematopoietic diseases, cardiovascular diseases, cancers, diabetes, or diseases of the central nervous system; 3) variants from cancer genome-wide association studies are enriched near allosteric proteins, indicating their importance to polygenic traits; and 4) the importance of allosteric proteins in disease is due, at least partly, to their central positions in protein-protein interaction networks, and less due to their dynamical properties.
Subject(s)
Genome-Wide Association Study; Proteins; Allosteric Regulation/genetics; Proteins/chemistry; Allosteric Site; Catalytic Domain
ABSTRACT
Cytokines are essential regulatory components of the immune system, and their aberrant levels have been linked to many disease states. Despite increasing evidence that cytokines operate in concert, many of the physiological interactions between cytokines, and the shared genetic architecture that underlies them, remain unknown. Here, we aimed to identify and characterize genetic variants with pleiotropic effects on cytokines. Using three population-based cohorts (n = 9,263), we performed multivariate genome-wide association studies (GWAS) for a correlation network of 11 circulating cytokines, then combined our results in meta-analysis. We identified a total of eight loci significantly associated with the cytokine network, of which two (PDGFRB and ABO) had not been detected previously. In addition, conditional analyses revealed a further four secondary signals at three known cytokine loci. Integration, through the use of Bayesian colocalization analysis, of publicly available GWAS summary statistics with the cytokine network associations revealed shared causal variants between the eight cytokine loci and other traits; in particular, cytokine network variants at the ABO, SERPINE2, and ZFPM2 loci showed pleiotropic effects on the production of immune-related proteins, on metabolic traits such as lipoprotein and lipid levels, on blood-cell-related traits such as platelet count, and on disease traits such as coronary artery disease and type 2 diabetes.
Subject(s)
Biomarkers/analysis; Cardiovascular Diseases/genetics; Cytokines/genetics; Genetic Pleiotropy; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Adolescent; Adult; Aged; Blood Proteins/genetics; Blood Proteins/immunology; Cardiovascular Diseases/immunology; Cardiovascular Diseases/pathology; Child; Cytokines/immunology; Female; Follow-Up Studies; Gene Regulatory Networks; Genetic Predisposition to Disease; Genome, Human; Humans; Longitudinal Studies; Male; Middle Aged; Prognosis; Prospective Studies; Young Adult
ABSTRACT
In asthma, a significant portion of the interaction between genetics and environment occurs through microbiota. The proposed mechanisms behind this interaction are complex and at times contradictory. This review covers recent developments in our understanding of this interaction: the "microbial hypothesis" and the "farm effect"; the role of endotoxin and genetic variation in pattern recognition systems; the interaction with allergen exposure; the additional involvement of host gut and airway microbiota; the role of viral respiratory infections in interaction with the 17q21 and CDHR3 genetic loci; and the importance of in utero and early-life timing of exposures. We propose a unified framework for understanding how all these phenomena interact to drive asthma pathogenesis. Finally, we point out some future challenges for continued research in this field, in particular the need for multiomic integration, as well as the potential utility of asthma endotyping.
Subject(s)
Asthma/immunology; Gastrointestinal Microbiome/immunology; Prenatal Exposure Delayed Effects/immunology; Animals; Asthma/genetics; Female; Gene-Environment Interaction; Humans; Phenotype; Pregnancy; Systems Biology
ABSTRACT
BACKGROUND: Studies indicate that the nasal microbiome may correlate strongly with the presence or future risk of childhood asthma. OBJECTIVES: In this study, we tested whether developmental trajectories of the nasopharyngeal microbiome in early life and the composition of the microbiome during illnesses were related to risk of childhood asthma. METHODS: Children participating in the Childhood Origins of Asthma study (N = 285) provided nasopharyngeal mucus samples in the first 2 years of life, during routine healthy study visits (at 2, 4, 6, 9, 12, 18, and 24 months of age), and during episodes of respiratory illnesses, all of which were analyzed for respiratory viruses and bacteria. We identified developmental trajectories of early-life microbiome composition, as well as predominant bacteria during respiratory illnesses, and we correlated these with presence of asthma at 6, 8, 11, 13, and 18 years of age. RESULTS: Of the 4 microbiome trajectories identified, a Staphylococcus-dominant microbiome in the first 6 months of life was associated with increased risk of recurrent wheezing by age 3 years and asthma that persisted throughout childhood. In addition, this trajectory was associated with the early onset of allergic sensitization. During wheezing illnesses, detection of rhinoviruses and predominance of Moraxella were associated with asthma that persisted throughout later childhood. CONCLUSION: In infancy, the developmental composition of the microbiome during healthy periods and the predominant microbes during acute wheezing illnesses are both associated with the subsequent risk of developing persistent childhood asthma.
Subject(s)
Asthma/epidemiology; Microbiota; Nasopharynx/microbiology; Adolescent; Bacteria/genetics; Bacteria/isolation & purification; Child; Child, Preschool; Female; Humans; Infant; Male; RNA, Ribosomal, 16S; Respiratory Sounds; Risk Factors; Viruses/genetics; Viruses/isolation & purification
ABSTRACT
Early prediction of risk of cardiovascular disease (CVD), including stroke, is a cornerstone of disease prevention. Clinical risk scores have been widely used for predicting CVD risk from known risk factors. Most CVDs have a substantial genetic component, which also has been confirmed for stroke in recent gene discovery efforts. However, the role of genetics in prediction of risk of CVD, including stroke, has been limited to testing for highly penetrant monogenic disorders. In contrast, the importance of polygenic variation, the aggregated effect of many common genetic variants across the genome with individually small effects, has become more apparent in the last 5 to 10 years, and powerful polygenic risk scores for CVD have been developed. Here we review the current state of the field of polygenic risk scores for CVD including stroke, and their potential to improve CVD risk prediction. We present findings and lessons from diseases such as coronary artery disease as these will likely be useful to inform future research in stroke polygenic risk prediction.