ABSTRACT
Rare genetic diseases are typically studied in referral populations, resulting in underdiagnosis and biased assessment of penetrance and phenotype. To address this, we develop a generalizable method of genotype inference based on distant relatedness and deploy this to identify undiagnosed Type 5 Long QT Syndrome (LQT5) rare variant carriers in a non-referral population. We identify 9 LQT5 families referred to a single specialty clinic, each carrying p.Asp76Asn, the most common LQT5 variant. We uncover recent common ancestry and a single shared haplotype among probands. Application to a non-referral population of 69,819 BioVU biobank subjects identifies 22 additional subjects sharing this haplotype, which we confirm to carry p.Asp76Asn. Referral and non-referral carriers have prolonged QT interval corrected for heart rate (QTc) compared to controls, and, among carriers, the QTc polygenic score is independently associated with QTc prolongation. Thus, our innovative analysis of shared chromosomal segments identifies undiagnosed cases of genetic disease and refines the understanding of LQT5 penetrance and phenotype.
Subjects
Biological Specimen Banks , Haplotypes , Long QT Syndrome , Humans , Long QT Syndrome/genetics , Long QT Syndrome/diagnosis , Female , Male , Adult , Penetrance , Middle Aged , Phenotype , Pedigree , Genetic Predisposition to Disease , Genotype , Electrocardiography
ABSTRACT
Rare genetic diseases are typically studied in referral populations, resulting in underdiagnosis and biased assessment of penetrance and phenotype. To address this, we developed a generalizable method of genotype inference based on distant relatedness and deployed this to identify undiagnosed Type 5 Long QT Syndrome (LQT5) rare variant carriers in a non-referral population. We identified 9 LQT5 families referred to a single specialty clinic, each carrying p.Asp76Asn, the most common LQT5 variant. We uncovered recent common ancestry and a single shared haplotype among probands. Application to a non-referral population of 69,819 BioVU biobank subjects identified 22 additional subjects sharing this haplotype, subsequently confirmed to carry p.Asp76Asn. Referral and non-referral carriers had prolonged QTc compared to controls, and, among carriers, QTc polygenic score additively associated with QTc prolongation. Thus, our novel analysis of shared chromosomal segments identified undiagnosed cases of genetic disease and refined the understanding of LQT5 penetrance and phenotype.
ABSTRACT
BACKGROUND: Adipokines are hormones secreted from adipose tissue and are associated with cardiometabolic diseases (CMD). Functional differences between adipokines (leptin, adiponectin, and resistin) are known, but inconsistently reported associations with CMD and lack of studies in Hispanic populations are research gaps. We investigated the relationship between subclinical atherosclerosis and multiple adipokine measures. METHODS: Cross-sectional data from the Cameron County Hispanic Cohort (N = 624; mean age = 50; Female = 70.8%) were utilized to assess associations between adipokines [continuous measures of adiponectin, leptin, resistin, leptin-to-adiponectin ratio (LAR), and adiponectin-resistin index (ARI)] and early atherosclerosis [carotid-intima media thickness (cIMT)]. We adjusted for sex, age, body mass index (BMI), smoking status, cytokines, fasting blood glucose levels, blood pressure, lipid levels, and medication usage in the fully adjusted linear regression model. We conducted sexes-combined and sex-stratified analyses to account for sex-specificity and additionally tested whether stratification of participants by their metabolic status (metabolically elevated risk for CMD as defined by having two or more of the following conditions: hypertension, dyslipidemia, insulin resistance, and inflammation vs. not) influenced the relationship between adipokines and cIMT. RESULTS: In the fully adjusted analyses, adiponectin, leptin, and LAR displayed significant interaction by sex (p < 0.1). Male-specific associations were between cIMT and LAR [β(SE) = 0.060 (0.016), p = 2.52 × 10⁻⁴], and female-specific associations were between cIMT and adiponectin [β(SE) = 0.010 (0.005), p = 0.043] and ARI [β(SE) = −0.011 (0.005), p = 0.036].
When stratified by metabolic health status, the male-specific positive association between LAR and cIMT was more evident among the metabolically healthy group [β(SE) = 0.127 (0.015), p = 4.70 × 10⁻¹⁰] (p for interaction by metabolic health < 0.1). However, the female-specific associations between adiponectin and cIMT and ARI and cIMT were observed only among the metabolically elevated risk group [β(SE) = 0.014 (0.005), p = 0.012 for adiponectin; β(SE) = −0.015 (0.006), p = 0.013 for ARI; p for interaction by metabolic health < 0.1]. CONCLUSION: Associations between adipokines and cIMT were sex-specific, and metabolic health status influenced the relationships between adipokines and cIMT. These heterogeneities by sex and metabolic health affirm the complex relationships between adipokines and atherosclerosis.
Subjects
Adipokines , Atherosclerosis , Female , Male , Humans , Middle Aged , Leptin , Resistin , Adiponectin , Carotid Intima-Media Thickness , Cross-Sectional Studies , Hispanic or Latino
ABSTRACT
Importance: The diagnosis and study of rare genetic disease is often limited to referral populations, leading to underdiagnosis and a biased assessment of penetrance and phenotype. Objective: To develop a generalizable method of genotype inference based on distant relatedness and to deploy this to identify undiagnosed Type 5 Long QT Syndrome (LQT5) rare variant carriers in a non-referral population. Participants: We identified 9 LQT5 probands and 3 first-degree relatives referred to a single Genetic Arrhythmia clinic, each carrying D76N (p.Asp76Asn), the most common variant implicated in LQT5. The non-referral population consisted of 69,879 ancestry-matched subjects in BioVU, a large biobank that links electronic health records to dense array data. Participants were enrolled from 2007-2022. Data analysis was performed in 2022. Exposures: We developed and applied a novel approach to genotype inference (Distant Relatedness for Identification and Variant Evaluation, or DRIVE) to identify shared, identical-by-descent (IBD) large chromosomal segments in array data. Main Outcomes and Measures: We sought to establish genetic relatedness among the probands and to use genomic segments underlying D76N to identify other potential carriers in BioVU. We then further studied the role of D76N in LQT5 pathogenesis. Results: Genetic reconstruction of pedigrees and distant relatedness detection among clinic probands using DRIVE revealed shared recent common ancestry and identified a single long shared haplotype. Interrogation of the non-referral population in BioVU identified a further 23 subjects sharing this haplotype, and sequencing confirmed D76N carrier status in 22, all previously undiagnosed with LQT5. The QTc was prolonged in D76N carriers compared to BioVU controls, with 40% penetrance of QTc ≥ 480 msec. Among D76N carriers, a QTc polygenic score was additively associated with QTc prolongation. 
Conclusions and Relevance: Detection of IBD shared chromosomal segments around D76N enabled identification of distantly related and previously undiagnosed rare-variant carriers, demonstrated the contribution of polygenic risk to monogenic disease penetrance, and further established LQT5 as a primary arrhythmia disorder. Analysis of shared chromosomal regions spanning disease-causing mutations can identify undiagnosed cases of genetic diseases.
ABSTRACT
SUMMARY: Genomic data are often processed in batches and analyzed together to save time. However, it is challenging to combine multiple large VCFs and properly handle imputation quality and missing variants due to the limitations of available tools. To address these concerns, we developed IMMerge, a Python-based tool that takes advantage of multiprocessing to reduce running time. For the first time in a publicly available tool, imputation quality scores are correctly combined with Fisher's z transformation. AVAILABILITY AND IMPLEMENTATION: IMMerge is an open-source project under MIT license. Source code and user manual are available at https://github.com/belowlab/IMMerge.
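The Fisher's z step mentioned in the summary can be illustrated with a minimal Python sketch. It assumes each batch reports an imputation quality score Rsq (treated here as a squared correlation) together with its sample size, and that batches are weighted by n − 3. The function name `combine_rsq`, the weighting scheme, and the clipping constant are illustrative assumptions and are not taken from the IMMerge source code.

```python
import math

def combine_rsq(rsq_values, sample_sizes, eps=1e-9):
    """Combine per-batch imputation Rsq values via Fisher's z transformation.

    Hypothetical helper: weights each batch by (n - 3), an assumed scheme.
    """
    if len(rsq_values) != len(sample_sizes) or not rsq_values:
        raise ValueError("need one sample size per Rsq value")
    weighted_sum = 0.0
    total_weight = 0.0
    for rsq, n in zip(rsq_values, sample_sizes):
        if not 0.0 <= rsq <= 1.0:
            raise ValueError("Rsq must lie in [0, 1]")
        if n <= 3:
            raise ValueError("each batch needs more than 3 samples")
        # Rsq is a squared correlation; Fisher's z applies to r itself.
        r = min(math.sqrt(rsq), 1.0 - eps)  # clip so atanh stays finite
        z = math.atanh(r)                   # Fisher's z transformation
        weight = n - 3
        weighted_sum += weight * z
        total_weight += weight
    r_combined = math.tanh(weighted_sum / total_weight)  # back-transform
    return r_combined ** 2

# Example: two batches with different quality and size
print(combine_rsq([0.25, 0.81], [500, 1500]))
```

Averaging on the z scale rather than averaging Rsq directly is the design point: the transformed values are closer to normally distributed, so a weighted mean is better behaved near the boundaries 0 and 1.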
Subjects
Genome , Genomics , Software
ABSTRACT
BACKGROUND: Concurrent variation in adiposity and inflammation suggests potential shared functional pathways and pleiotropic disease underpinning. Yet, exploration of pleiotropy in the context of adiposity-inflammation has been scarce, and none has included self-identified Hispanic/Latino populations. Given the high level of ancestral diversity in Hispanic American populations, genetic studies may reveal variants that are infrequent or monomorphic in more homogeneous populations. METHODS: Using the multi-trait Adaptive Sum of Powered Score (aSPU) method, we examined individual and shared genetic effects underlying inflammatory (C-reactive protein [CRP]), adiposity-related (Body Mass Index [BMI]), and central adiposity (Waist to Hip Ratio [WHR]) traits in Hispanic/Latino American (HLA) participants of the Population Architecture Using Genomics and Epidemiology (PAGE) cohort (N = 35,871), with replication of effects in the Cameron County Hispanic Cohort (CCHC), which consists of Mexican American individuals. RESULTS: Of the > 16 million SNPs tested, variants representing 7 independent loci showed significant association with multiple traits. Two of the 7 variants were replicated at a statistically significant level in multi-trait analyses in CCHC. The lead variant at APOE (rs439401) and rs11208712 showed multi-trait associations with adiposity and inflammation. CONCLUSIONS: Results from this study demonstrate the importance of considering pleiotropy for improving our understanding of the etiology of the various metabolic pathways that regulate cardiovascular disease development.
Subjects
Adiposity , Genetic Pleiotropy , Adiposity/genetics , Hispanic or Latino/genetics , Humans , Inflammation/genetics , Obesity/genetics
ABSTRACT
Despite a lifetime prevalence of at least 5%, developmental stuttering, characterized by prolongations, blocks, and repetitions of speech sounds, remains a largely idiopathic speech disorder. Family, twin, and segregation studies overwhelmingly support a strong genetic influence on stuttering risk; however, its complex mode of inheritance combined with thus-far underpowered genetic studies contribute to the challenge of identifying and reproducing genes implicated in developmental stuttering susceptibility. We conducted a trans-ancestry genome-wide association study (GWAS) and meta-analysis of developmental stuttering in two primary datasets: The International Stuttering Project comprising 1,345 clinically ascertained cases from multiple global sites and 6,759 matched population controls from the biobank at Vanderbilt University Medical Center (VUMC), and 785 self-reported stuttering cases and 7,572 controls ascertained from The National Longitudinal Study of Adolescent to Adult Health (Add Health). Meta-analysis of these genome-wide association studies identified a genome-wide significant (GWS) signal for clinically reported developmental stuttering in the general population: a protective variant in the intronic or genic upstream region of SSUH2 (rs113284510, protective allele frequency = 7.49%, Z = -5.576, p = 2.46 × 10⁻⁸) that acts as an expression quantitative trait locus (eQTL) in esophagus-muscularis tissue by reducing its gene expression. In addition, we identified 15 loci reaching suggestive significance (p < 5 × 10⁻⁶). This foundational population-based genetic study of a common speech disorder reports the findings of a clinically ascertained study of developmental stuttering and highlights the need for further research.
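As a worked check of the reported statistics, the short Python sketch below converts the meta-analysis Z score for rs113284510 into a two-sided p-value under a standard normal reference distribution and compares it with the conventional genome-wide threshold of 5 × 10⁻⁸. The function name and the choice of a two-sided normal test are assumptions made for illustration; the abstract does not state how the p-value was computed.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS significance threshold

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard-normal test statistic."""
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for Z ~ N(0, 1)
    return math.erfc(abs(z) / math.sqrt(2.0))

p = two_sided_p_from_z(-5.576)  # Z score quoted in the abstract
print(f"p = {p:.3g}")
print("genome-wide significant:", p < GENOME_WIDE_THRESHOLD)
```

Under these assumptions the result should agree closely with the quoted p = 2.46 × 10⁻⁸ and fall below the genome-wide threshold.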
ABSTRACT
AIMS/HYPOTHESIS: Type 2 diabetes is a growing global public health challenge. Investigating quantitative traits, including fasting glucose, fasting insulin and HbA1c, that serve as early markers of type 2 diabetes progression may lead to a deeper understanding of the genetic aetiology of type 2 diabetes development. Previous genome-wide association studies (GWAS) have identified over 500 loci associated with type 2 diabetes, glycaemic traits and insulin-related traits. However, most of these findings were based only on populations of European ancestry. To address this research gap, we examined the genetic basis of fasting glucose, fasting insulin and HbA1c in participants of the diverse Population Architecture using Genomics and Epidemiology (PAGE) Study. METHODS: We conducted a GWAS of fasting glucose (n = 52,267), fasting insulin (n = 48,395) and HbA1c (n = 23,357) in participants without diabetes from the diverse PAGE Study (23% self-reported African American, 46% Hispanic/Latino, 40% European, 4% Asian, 3% Native Hawaiian, 0.8% Native American), performing transethnic and population-specific GWAS meta-analyses, followed by fine-mapping to identify and characterise novel loci and independent secondary signals in known loci. RESULTS: Four novel associations were identified (p < 5 × 10⁻⁹), including three loci associated with fasting insulin, and a novel, low-frequency African American-specific locus associated with fasting glucose. Additionally, seven secondary signals were identified, including novel independent secondary signals for fasting glucose at the known GCK locus and for fasting insulin at the known PPP1R3B locus in transethnic meta-analysis. CONCLUSIONS/INTERPRETATION: Our findings provide new insights into the genetic architecture of glycaemic traits and highlight the continued importance of conducting genetic studies in diverse populations.
DATA AVAILABILITY: Full summary statistics from each of the population-specific and transethnic results are available at NHGRI-EBI GWAS catalog (https://www.ebi.ac.uk/gwas/downloads/summary-statistics).
Subjects
Diabetes Mellitus, Type 2 , Genome-Wide Association Study , Blood Glucose/genetics , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study/methods , Genomics , Humans , Polymorphism, Single Nucleotide/genetics
ABSTRACT
PURPOSE: This study aimed to identify cases of developmental stuttering and associated comorbidities in de-identified electronic health records (EHRs) at Vanderbilt University Medical Center, and, in turn, build and test a stuttering prediction model. METHODS: A multi-step process including a keyword search of medical notes, a text-mining algorithm, and manual review was employed to identify stuttering cases in the EHR. Confirmed cases were compared to matched controls in a phenotype code (phecode) enrichment analysis to reveal conditions associated with stuttering (i.e., comorbidities). These associated phenotypes were used as proxy variables to phenotypically predict stuttering in subjects within the EHR that were not otherwise identifiable using the multi-step identification process described above. RESULTS: The multi-step process resulted in the manually reviewed identification of 1,143 stuttering cases in the EHR. Highly enriched phecodes included codes related to childhood onset fluency disorder, adult-onset fluency disorder, hearing loss, sleep disorders, atopy, a multitude of codes for infections, neurological deficits, and body weight. These phecodes were used as variables to create a phenome risk classifier (PheRC) prediction model to identify additional high likelihood stuttering cases. The PheRC prediction model resulted in a positive predictive value of 83 %. CONCLUSIONS: This study demonstrates the feasibility of using EHRs in the study of stuttering and found phenotypic associations. The creation of the PheRC has the potential to enable future studies of stuttering using existing EHR data, including investigations into the genetic etiology.
Subjects
Stuttering , Algorithms , Child , Comorbidity , Electronic Health Records , Humans , Phenotype , Stuttering/diagnosis , Stuttering/epidemiology
ABSTRACT
Given the coronavirus disease 2019 (COVID-19) pandemic, investigations into host susceptibility to infectious diseases and downstream sequelae have never been more relevant. Pneumonia is a lung disease that can cause respiratory failure and hypoxia and is a common complication of infectious diseases, including COVID-19. Few genome-wide association studies (GWASs) of host susceptibility and severity of pneumonia have been conducted. We performed GWASs of pneumonia susceptibility and severity in the Vanderbilt University biobank (BioVU) with linked electronic health records (EHRs), including Illumina Expanded Multi-Ethnic Global Array (MEGAEX)-genotyped European ancestry (EA, n = 69,819) and African ancestry (AA, n = 15,603) individuals. Two regions of large effect were identified: the CFTR locus in EA (rs113827944; OR = 1.84, p value = 1.2 × 10⁻³⁶) and HBB in AA (rs334 [p.Glu7Val]; OR = 1.63, p value = 3.5 × 10⁻¹³). Mutations in these genes cause cystic fibrosis (CF) and sickle cell disease (SCD), respectively. After removing individuals diagnosed with CF and SCD, we assessed heterozygosity effects at our lead variants. Further GWASs after removing individuals with CF uncovered an additional association in R3HCC1L (rs10786398; OR = 1.22, p value = 3.5 × 10⁻⁸), which was replicated in two independent datasets: UK Biobank (n = 459,741) and 7,985 non-overlapping BioVU subjects, who are genotyped on arrays other than MEGAEX. This variant was also validated in GWASs of COVID-19 hospitalization and lung function. Our results highlight the importance of the host genome in infectious disease susceptibility and severity and offer crucial insight into genetic effects that could potentially influence severity of COVID-19 sequelae.
Subjects
COVID-19/complications , COVID-19/genetics , Host-Pathogen Interactions/genetics , Pneumonia, Viral/complications , Pneumonia, Viral/genetics , Bronchitis/genetics , COVID-19/pathology , COVID-19/physiopathology , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Databases, Genetic , Electronic Health Records , Female , Genome-Wide Association Study , Genotype , Hemoglobins/genetics , Humans , Inpatients , Linkage Disequilibrium , Male , Outpatients , Pneumonia, Viral/pathology , Pneumonia, Viral/physiopathology , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis , Pulmonary Disease, Chronic Obstructive/genetics , Reproducibility of Results , United Kingdom
ABSTRACT
CITRUS is a supervised machine learning algorithm designed to analyze single cell data, identify cell populations, and identify changes in the frequencies or functional marker expression patterns of those populations that are significantly associated with an outcome. The algorithm is a black box that includes steps to cluster cell populations, characterize these populations, and identify the significant characteristics. This chapter describes how to optimize the use of CITRUS by combining it with upstream and downstream data analysis and visualization tools.
Subjects
Algorithms , Biomarkers/analysis , Flow Cytometry/methods , Mass Spectrometry/methods , Single-Cell Analysis/methods , Supervised Machine Learning , Humans
ABSTRACT
CD40 expression is required for germinal center (GC) formation and function, but the kinetics and magnitude of signaling following CD40 engagement remain poorly characterized in human B cells undergoing GC reactions. Here, differences in CD40 expression and signaling responses were compared across differentiation stages of mature human tonsillar B cells. A combination of mass cytometry and phospho-specific flow cytometry was used to quantify protein expression and CD40L-induced signaling in primary human naïve, GC, and memory B cells. Protein expression signatures of cell subsets were quantified using viSNE and Marker Enrichment Modeling (MEM). This approach revealed enriched expression of CD40 protein in GC B cells, compared to naïve and memory B cells. Despite this, GC B cells responded to CD40L engagement with lower phosphorylation of NFκB p65 during the first 30 min following CD40L activation. Before CD40L stimulation, GC B cells expressed higher levels of suppressor protein IκBα than naïve and memory B cells. Following CD40 activation, IκBα was rapidly degraded and reached equivalently low levels in naïve, GC, and memory B cells at 30 min following CD40L. Quantifying CD40 signaling responses as a function of bound ligand revealed a correlation between bound CD40L and degree of induced NFκB p65 phosphorylation, whereas comparable IκBα degradation occurred at all measured levels of CD40L binding. These results characterize cell-intrinsic signaling differences that exist in mature human B cells undergoing GC reactions. © 2019 International Society for Advancement of Cytometry.
Subjects
B-Lymphocytes/physiology , CD40 Antigens/metabolism , CD40 Ligand/metabolism , Germinal Center/cytology , Immunologic Memory , B-Lymphocytes/cytology , B-Lymphocytes/metabolism , CD40 Ligand/physiology , Cells, Cultured , Germinal Center/immunology , Germinal Center/metabolism , Humans , NF-kappa B/metabolism , Phosphorylation , Signal Transduction/immunology
ABSTRACT
The application of machine learning in medicine has been productive in multiple fields, but has not previously been applied to analyze the complexity of organ involvement by chronic graft-versus-host disease. Chronic graft-versus-host disease is classified by an overall composite score as mild, moderate or severe, which may overlook clinically relevant patterns in organ involvement. Here we applied a novel computational approach to chronic graft-versus-host disease with the goal of identifying phenotypic groups based on the subcomponents of the National Institutes of Health Consensus Criteria. Computational analysis revealed seven distinct groups of patients with contrasting clinical risks. The high-risk group had an inferior overall survival compared to the low-risk group (hazard ratio 2.24; 95% confidence interval: 1.36-3.68), an effect that was independent of graft-versus-host disease severity as measured by the National Institutes of Health criteria. To test clinical applicability, knowledge was translated into a simplified clinical prognostic decision tree. Groups identified by the decision tree also stratified outcomes and closely matched those from the original analysis. Patients in the high- and intermediate-risk decision-tree groups had significantly shorter overall survival than those in the low-risk group (hazard ratio 2.79; 95% confidence interval: 1.58-4.91 and hazard ratio 1.78; 95% confidence interval: 1.06-3.01, respectively). Machine learning and other computational analyses may better reveal biomarkers and stratify risk than the current approach based on cumulative severity. This approach could now be explored in other disease models with complex clinical phenotypes. External validation must be completed prior to clinical application. Ultimately, this approach has the potential to reveal distinct pathophysiological mechanisms that may underlie clusters. Clinicaltrials.gov identifier: NCT00637689.
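The hazard ratios and confidence intervals quoted above can be sanity-checked with a few lines of Python. The sketch assumes the intervals are 95% Wald intervals that are symmetric on the log scale, which the abstract does not state; under that assumption the geometric mean of the two bounds should recover the point estimate, and the interval width gives the standard error of the log hazard ratio. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard-normal quantile for a two-sided 95% interval

def implied_hazard_ratio(lower, upper):
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(lower * upper)

def log_hr_standard_error(lower, upper):
    """Standard error of log(HR) implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)

# (reported HR, lower bound, upper bound) as quoted in the abstract
reported = [(2.24, 1.36, 3.68), (2.79, 1.58, 4.91), (1.78, 1.06, 3.01)]
for hr, lo, hi in reported:
    implied = implied_hazard_ratio(lo, hi)
    se = log_hr_standard_error(lo, hi)
    print(f"reported HR {hr:.2f}, implied HR {implied:.2f}, SE(log HR) {se:.3f}")
```

Agreement between the reported and implied estimates, up to rounding, suggests the three intervals are consistent with this log-scale construction.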
Subjects
Graft vs Host Disease , Hematologic Neoplasms/therapy , Hematopoietic Stem Cell Transplantation , Machine Learning , Adult , Biomarkers/blood , Chronic Disease , Consensus , Female , Graft vs Host Disease/blood , Graft vs Host Disease/diagnosis , Humans , Male , Middle Aged , National Institutes of Health (U.S.) , Prospective Studies , Transplantation, Homologous , United States
ABSTRACT
BACKGROUND: Although a preponderance of pre-clinical data demonstrates the immunosuppressive potential of mesenchymal stromal cells (MSCs), significant heterogeneity and lack of critical quality attributes (CQAs) based on immunosuppressive capacity likely have contributed to inconsistent clinical outcomes. This heterogeneity exists not only between MSC lots derived from different donors, tissues and manufacturing conditions, but also within a given MSC lot in the form of functional subpopulations. We therefore explored the potential of functionally relevant morphological profiling (FRMP) to identify morphological subpopulations predictive of the immunosuppressive capacity of MSCs derived from multiple donors, manufacturers and passages. METHODS: We profiled the single-cell morphological response of MSCs from different donors and passages to the functionally relevant inflammatory cytokine interferon (IFN)-γ. We used the machine learning approach visual stochastic neighbor embedding (viSNE) to identify distinct morphological subpopulations that could predict suppression of activated CD4+ and CD8+ T cells in a multiplexed quantitative assay. RESULTS: Multiple IFN-γ-stimulated subpopulations significantly correlated with the ability of MSCs to inhibit CD4+ and CD8+ T-cell activation and served as effective CQAs to predict the immunosuppressive capacity of additional manufactured MSC lots. We further characterized the emergence of morphological heterogeneity following IFN-γ stimulation, which provides a strategy for identifying functional subpopulations for future single-cell characterization and enrichment techniques. DISCUSSION: This work provides a generalizable analytical platform for assessing functional heterogeneity based on single-cell morphological responses that could be used to identify novel CQAs and inform cell manufacturing decisions.
Subjects
Immunosuppression Therapy , Interferon-gamma/pharmacology , Machine Learning , Mesenchymal Stem Cells/cytology , Mesenchymal Stem Cells/drug effects , CD4-Positive T-Lymphocytes/immunology , CD8-Positive T-Lymphocytes/immunology , Cell Plasticity , Cell Proliferation , Cells, Cultured , Coculture Techniques , Humans , Leukocytes, Mononuclear/cytology , Lymphocyte Activation , Stochastic Processes , Tissue Embedding/methods
ABSTRACT
Differences in the quality of BCR signaling control key steps of B cell maturation and differentiation. Endogenously produced H2O2 is thought to fine tune the level of BCR signaling by reversibly inhibiting phosphatases. However, relatively little is known about how B cells at different stages sense and respond to such redox cues. In this study, we used phospho-specific flow cytometry and high-dimensional mass cytometry (CyTOF) to compare BCR signaling responses in mature human tonsillar B cells undergoing germinal center (GC) reactions. GC B cells, in contrast to mature naive B cells, memory B cells, and plasmablasts, were hypersensitive to a range of H2O2 concentrations and responded by phosphorylating SYK and other membrane-proximal BCR effectors in the absence of BCR engagement. These findings reveal that stage-specific redox responses distinguish human GC B cells.