Results 1 - 20 of 28
1.
Cell ; 185(16): 3041-3055.e25, 2022 Aug 04.
Article in English | MEDLINE | ID: mdl-35917817

ABSTRACT

Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.


Subjects
DNA Copy Number Variations , Human Genome , DNA Copy Number Variations/genetics , Gene Dosage , Haploinsufficiency/genetics , Humans
2.
PLoS Comput Biol ; 16(4): e1007522, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32282793

ABSTRACT

Studies of complex disorders benefit from integrative analyses of multiple omics data. Yet, sample mix-ups frequently occur in multi-omics studies, weakening statistical power and risking false findings. Accurately aligning sample information, genotype, and corresponding omics data is critical for integrative analyses. We developed DRAMS (https://github.com/Yi-Jiang/DRAMS) to Detect and Re-Align Mixed-up Samples to address the sample mix-up problem. It uses a logistic regression model followed by a modified topological sorting algorithm to identify the potential true IDs based on relationships among the multi-omics data. According to tests using simulated data, the more types of omics data used or the smaller the proportion of mix-ups, the better DRAMS performs. Applying DRAMS to real data from the PsychENCODE BrainGVEX project, we detected and corrected 201 (12.5% of total data generated) mix-ups. Of the 21 mix-ups involving errors of racial identity, DRAMS re-assigned all data to the correct racial group in the 1000 Genomes project. In doing so, quantitative trait loci (QTL) (FDR<0.01) increased by an average of 1.62-fold. The use of DRAMS in multi-omics studies will strengthen the statistical power of the study and improve the quality of the results. Although very few studies currently have multi-omics data in place, we expect such data to increase quickly, and the need for DRAMS along with it.


Subjects
Computational Biology/methods , Frontal Lobe/metabolism , Genomics/methods , Single Nucleotide Polymorphism , Algorithms , Chromatin/chemistry , Computer Simulation , Ethnicity , Female , Genome , Genotype , Humans , Logistic Models , Male , Genetic Models , Oligonucleotide Array Sequence Analysis , RNA-Seq , Reproducibility of Results , Sex Factors , Software , User-Computer Interface , Whole Genome Sequencing
3.
Mult Scler ; 24(14): 1815-1824, 2018 Dec.
Article in English | MEDLINE | ID: mdl-28933650

ABSTRACT

BACKGROUND: A wealth of single-nucleotide polymorphisms (SNPs) responsible for multiple sclerosis (MS) susceptibility have been identified; however, they explain only a fraction of MS heritability. OBJECTIVES: We contributed to discovery of new MS susceptibility SNPs by studying a founder population with high MS prevalence. METHODS: We analyzed ImmunoChip data from 15 multiplex families and 94 unrelated controls from the Nuoro Province, Sardinia, Italy. We tested each SNP for both association and linkage with MS, the linkage being explored in terms of identity-by-descent (IBD) sharing excess and using gene dropping to compute a corresponding empirical p-value. By targeting regions that are both associated and in linkage with MS, we increase chances of identifying interesting genomic regions. RESULTS: We identified 486 MS-associated (p < 1 × 10⁻⁴) and 18,426 MS-linked (p < 0.05) SNPs. A total of 111 loci were both linked and associated with MS, 18 of them pointing to 14 non-major histocompatibility complex (MHC) genes, and 93 of them located in the MHC region. CONCLUSION: We discovered new suggestive signals and confirmed some previously identified ones. We believe this to represent a significant step toward an understanding of the genetic basis of MS.
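
The empirical p-value mentioned in the methods can be illustrated with a minimal sketch. It assumes that a set of null statistics (for example, IBD-sharing values from gene-dropping replicates) has already been simulated; the function name, the toy numbers, and the add-one correction are illustrative assumptions, not details taken from the study.

```python
def empirical_p_value(observed, null_statistics):
    """Share of null replicates at least as extreme as the observed value.

    The +1 in numerator and denominator is a common convention that keeps
    the estimate away from exactly zero; it is an assumption here, not a
    detail reported in the abstract.
    """
    exceed = sum(1 for s in null_statistics if s >= observed)
    return (exceed + 1) / (len(null_statistics) + 1)


# Hypothetical IBD-sharing statistics from nine simulated replicates.
null_sharing = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.14, 0.07]
p = empirical_p_value(0.145, null_sharing)
print(p)
```

With only nine replicates the smallest attainable value is 0.1, which is why such procedures typically rely on many more simulations.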


Subjects
Genetic Linkage/genetics , Genetic Predisposition to Disease/genetics , Multiple Sclerosis/genetics , Alleles , Humans , Italy , Single Nucleotide Polymorphism/genetics
4.
PLoS Comput Biol ; 11(3): e1004139, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25735005

ABSTRACT

Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.


Subjects
Algorithms , Founder Effect , Genetic Models , Pedigree , Software , Female , Human Genome , Genomics , Humans , Male , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis , South Dakota , White People/genetics
5.
J Allergy Clin Immunol ; 133(1): 248-55.e1-10, 2014 Jan.
Article in English | MEDLINE | ID: mdl-23932459

ABSTRACT

BACKGROUND: Lung function is a long-term predictor of mortality and morbidity. OBJECTIVE: We sought to identify single nucleotide polymorphisms (SNPs) associated with lung function. METHODS: We performed a genome-wide association study (GWAS) of FEV1, forced vital capacity (FVC), and FEV1/FVC in 1144 Hutterites aged 6 to 89 years, who are members of a founder population of European descent. We performed least absolute shrinkage and selection operator (LASSO) regression to select the minimum set of SNPs that best predict FEV1/FVC in the Hutterites and used the GRAIL algorithm to mine the Gene Ontology database for evidence of functional connections between genes near the predictive SNPs. RESULTS: Our GWAS identified significant associations between FEV1/FVC and SNPs at the THSD4-UACA-TLE3 locus on chromosome 15q23 (P = 5.7 × 10⁻⁸ to 3.4 × 10⁻⁹). Nine SNPs at or near 4 additional loci had P < 10⁻⁵ with FEV1/FVC. Only 2 SNPs were found with P < 10⁻⁵ for FEV1 or FVC. We found nominal levels of significance with SNPs at 9 of the 27 previously reported loci associated with lung function measures. Among a predictive set of 80 SNPs, 6 loci were identified that had a significant degree of functional connectivity (GRAIL P < .05), including 3 clusters of β-defensin genes, 2 chemokine genes (CCL18 and CXCL12), and TNFRSF13B. CONCLUSION: This study identifies genome-wide significant associations and replicates results of previous GWASs. Multimarker modeling implicated for the first time common variation in genes involved in antimicrobial immunity in airway mucosa that influences lung function.


Subjects
Chemokine CXCL12/genetics , CC Chemokines/genetics , Lung/physiology , Respiration/genetics , Transmembrane Activator and CAML Interactor Protein/genetics , beta-Defensins/genetics , Adolescent , Adult , Aged , Child , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Mucosal Immunity/genetics , Male , Middle Aged , Single Nucleotide Polymorphism , Respiration/immunology , Respiratory Function Tests , United States , Young Adult
6.
Phytother Res ; 28(12): 1822-8, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25098402

ABSTRACT

The roots and rhizomes of Smilax riparia, called 'Niu-Wei-Cai' in traditional Chinese medicine, are believed to be effective in treating the symptoms of gout. However, the active constituents and their uricosuric mechanisms are unknown. In this study, we isolated two steroidal glycosides, named smilaxchinoside A and smilaxchinoside C, from the total saponins obtained from the ethanol extract of the roots of S. riparia. We then examined if these two compounds were effective in reducing serum uric acid levels in a hyperuricemic mouse model induced by potassium oxonate. We observed that these two steroidal glycosides possess potent uricosuric activities, and the observed effects accompanied the reduction of renal mURAT1 and the inhibition of xanthine oxidase, which contribute to the enhancement of uric acid excretion and the reduction of hyperuricemia-induced renal dysfunction. Smilaxchinoside A and smilaxchinoside C may have a clinical utility in treating gout and other medical conditions caused by hyperuricemia.


Subjects
Glycosides/pharmacology , Hyperuricemia/drug therapy , Plant Extracts/pharmacology , Smilax/chemistry , Steroids/pharmacology , Uricosuric Agents/pharmacology , Animals , Animal Disease Models , Chinese Herbal Drugs/pharmacology , Facilitative Glucose Transport Proteins/metabolism , Glycosides/isolation & purification , Kidney/drug effects , Male , Mice , Organic Anion Transport Protein 1/metabolism , Organic Anion Transporters/metabolism , Oxonic Acid , Plant Roots/chemistry , Saponins/pharmacology , Steroids/isolation & purification , Uric Acid/blood , Uricosuric Agents/isolation & purification , Xanthine Oxidase/metabolism
7.
Biol Psychiatry Glob Open Sci ; 4(3): 100297, 2024 May.
Article in English | MEDLINE | ID: mdl-38645405

ABSTRACT

Background: Patients with schizophrenia have substantial comorbidity that contributes to reduced life expectancy of 10 to 20 years. Identifying modifiable comorbidities could improve rates of premature mortality. Conditions that frequently co-occur but lack shared genetic risk with schizophrenia are more likely to be products of treatment, behavior, or environmental factors and therefore are enriched for potentially modifiable associations. Methods: Phenome-wide comorbidity was calculated from electronic health records of 250,000 patients across 2 independent health care institutions (Vanderbilt University Medical Center and Mass General Brigham); associations with schizophrenia polygenic risk scores were calculated across the same phenotypes in linked biobanks. Results: Schizophrenia comorbidity was significantly correlated across institutions (r = 0.85), and the 77 identified comorbidities were consistent with prior literature. Overall, comorbidity and polygenic risk score associations were significantly correlated (r = 0.55, p = 1.29 × 10⁻¹¹⁸). However, directly testing for the absence of genetic effects identified 36 comorbidities that had significantly equivalent schizophrenia polygenic risk score distributions between cases and controls. This set included phenotypes known to be consequences of antipsychotic medications (e.g., movement disorders) or of the disease such as reduced hygiene (e.g., diseases of the nail), thereby validating the approach. It also highlighted phenotypes with less clear causal relationships and minimal genetic effects such as tobacco use disorder and diabetes. Conclusions: This work demonstrates the consistency and robustness of electronic health record-based schizophrenia comorbidities across independent institutions and with the existing literature. It identifies known and novel comorbidities with an absence of shared genetic risk, indicating other causes that may be modifiable and where further study of causal pathways could improve outcomes for patients.


Patients with schizophrenia have many co-occurring diseases that contribute substantially to premature mortality of 10 to 20 years. Conditions that are comorbid but lack shared genetic risk with schizophrenia are likely to have causes that are more modifiable. Here, we calculated comorbidity from electronic health records from 2 independent health care institutions and associations with schizophrenia polygenic risk scores across the same phenotypes in linked biobanks. We identified known and novel diseases comorbid with schizophrenia, thereby validating our approach.

8.
JAMA Netw Open ; 7(3): e243821, 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38536175

ABSTRACT

Importance: Despite consistent public health recommendations, obesity rates in the US continue to increase. Physical activity recommendations do not account for individual genetic variability, increasing risk of obesity. Objective: To use activity, clinical, and genetic data from the All of Us Research Program (AoURP) to explore the association of genetic risk of higher body mass index (BMI) with the level of physical activity needed to reduce incident obesity. Design, Setting, and Participants: In this US population-based retrospective cohort study, participants were enrolled in the AoURP between May 1, 2018, and July 1, 2022. Enrollees in the AoURP who were of European ancestry, owned a personal activity tracking device, and did not have obesity up to 6 months into activity tracking were included in the analysis. Exposure: Physical activity expressed as daily step counts and a polygenic risk score (PRS) for BMI, calculated as weight in kilograms divided by height in meters squared. Main Outcome and Measures: Incident obesity (BMI ≥30). Results: A total of 3124 participants met inclusion criteria. Among 3051 participants with available data, 2216 (73%) were women, and the median age was 52.7 (IQR, 36.4-62.8) years. The total cohort of 3124 participants walked a median of 8326 (IQR, 6499-10 389) steps/d over a median of 5.4 (IQR, 3.4-7.0) years of personal activity tracking. The incidence of obesity over the study period increased from 13% (101 of 781) to 43% (335 of 781) in the lowest and highest PRS quartiles, respectively (P = 1.0 × 10⁻²⁰). The BMI PRS demonstrated an 81% increase in obesity risk (P = 3.57 × 10⁻²⁰) while mean step count demonstrated a 43% reduction (P = 5.30 × 10⁻¹²) when comparing the 75th and 25th percentiles, respectively. Individuals with a PRS in the 75th percentile would need to walk a mean of 2280 (95% CI, 1680-3310) more steps per day (11 020 total) than those at the 50th percentile to have a comparable risk of obesity. To have a comparable risk of obesity to individuals at the 25th percentile of PRS, those at the 75th percentile with a baseline BMI of 22 would need to walk an additional 3460 steps/d; with a baseline BMI of 24, an additional 4430 steps/d; with a baseline BMI of 26, an additional 5380 steps/d; and with a baseline BMI of 28, an additional 6350 steps/d. Conclusions and Relevance: In this cohort study, the association between daily step count and obesity risk across genetic background and baseline BMI was quantified. Population-based recommendations may underestimate physical activity needed to prevent obesity among those at high genetic risk.
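
The BMI definition and obesity threshold stated in this abstract (weight in kilograms divided by height in meters squared, obesity at BMI ≥30) can be written out as a short sketch. The function names and the example measurements are illustrative assumptions and do not come from the study.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def has_obesity(weight_kg, height_m, threshold=30.0):
    """Apply the BMI >= 30 cut-off quoted in the abstract."""
    return bmi(weight_kg, height_m) >= threshold


# Hypothetical participant: 81 kg and 1.80 m.
print(round(bmi(81, 1.80), 1))
print(has_obesity(81, 1.80))
```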


Subjects
Population Health , Female , Humans , Middle Aged , Male , Cohort Studies , Retrospective Studies , Obesity , Exercise , Genetic Risk Score
9.
Nat Med ; 30(9): 2648-2656, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39030265

ABSTRACT

Poor sleep health is associated with increased all-cause mortality and incidence of many chronic conditions. Previous studies have relied on cross-sectional and self-reported survey data or polysomnograms, which have limitations with respect to data granularity, sample size and longitudinal information. Here, using objectively measured, longitudinal sleep data from commercial wearable devices linked to electronic health record data from the All of Us Research Program, we show that sleep patterns, including sleep stages, duration and regularity, are associated with chronic disease incidence. Of the 6,785 participants included in this study, 71% were female, 84% self-identified as white and 71% had a college degree; the median age was 50.2 years (interquartile range = 35.7, 61.5) and the median sleep monitoring period was 4.5 years (2.5, 6.5). We found that rapid eye movement sleep and deep sleep were inversely associated with the odds of incident atrial fibrillation and that increased sleep irregularity was associated with increased odds of incident obesity, hyperlipidemia, hypertension, major depressive disorder and generalized anxiety disorder. Moreover, J-shaped associations were observed between average daily sleep duration and hypertension, major depressive disorder and generalized anxiety disorder. These findings show that sleep stages, duration and regularity are all important factors associated with chronic disease development and may inform evidence-based recommendations on healthy sleeping habits.


Subjects
Sleep , Wearable Electronic Devices , Humans , Female , Middle Aged , Male , Chronic Disease , Adult , Sleep/physiology , United States/epidemiology , Polysomnography , Risk Factors , Cross-Sectional Studies , Aged
10.
Am J Psychiatry ; 181(7): 608-619, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38745458

ABSTRACT

OBJECTIVE: Treatment-resistant depression (TRD) occurs in roughly one-third of all individuals with major depressive disorder (MDD). Although research has suggested a significant common variant genetic component of liability to TRD, with heritability estimated at 8% when compared with non-treatment-resistant MDD, no replicated genetic loci have been identified, and the genetic architecture of TRD remains unclear. A key barrier to this work has been the paucity of adequately powered cohorts for investigation, largely because of the challenge in prospectively investigating this phenotype. The objective of this study was to perform a well-powered genetic study of TRD. METHODS: Using receipt of electroconvulsive therapy (ECT) as a surrogate for TRD, the authors applied standard machine learning methods to electronic health record data to derive predicted probabilities of receiving ECT. These probabilities were then applied as a quantitative trait in a genome-wide association study of 154,433 genotyped patients across four large biobanks. RESULTS: Heritability estimates ranged from 2% to 4.2%, and significant genetic overlap was observed with cognition, attention deficit hyperactivity disorder, schizophrenia, alcohol and smoking traits, and body mass index. Two genome-wide significant loci were identified, both previously implicated in metabolic traits, suggesting shared biology and potential pharmacological implications. CONCLUSIONS: This work provides support for the utility of estimation of disease probability for genomic investigation and provides insights into the genetic architecture and biology of TRD.


Subjects
Major Depressive Disorder , Treatment-Resistant Depressive Disorder , Electroconvulsive Therapy , Genome-Wide Association Study , Humans , Treatment-Resistant Depressive Disorder/genetics , Treatment-Resistant Depressive Disorder/therapy , Female , Male , Major Depressive Disorder/genetics , Major Depressive Disorder/therapy , Middle Aged , Machine Learning , Adult , Phenotype , Aged , Body Mass Index , Schizophrenia/genetics , Schizophrenia/therapy
11.
medRxiv ; 2024 May 27.
Article in English | MEDLINE | ID: mdl-38585743

ABSTRACT

Background: Electronic health records (EHR) are increasingly used for studying multimorbidities. However, concerns about accuracy, completeness, and EHRs being primarily designed for billing and administrative purposes raise questions about the consistency and reproducibility of EHR-based multimorbidity research. Methods: Utilizing phecodes to represent the disease phenome, we analyzed pairwise comorbidity strengths using a dual logistic regression approach and constructed multimorbidity as an undirected weighted graph. We assessed the consistency of the multimorbidity networks within and between two major EHR systems at local (nodes and edges), meso (neighboring patterns), and global (network statistics) scales. We present case studies to identify disease clusters and uncover clinically interpretable disease relationships. We provide an interactive web tool and a knowledge base combining data from multiple sources for online multimorbidity analysis. Findings: Analyzing data from 500,000 patients across Vanderbilt University Medical Center and Mass General Brigham health systems, we observed a strong correlation in disease frequencies (Kendall's τ = 0.643) and comorbidity strengths (Pearson ρ = 0.79). Consistent network statistics across EHRs suggest similar structures of multimorbidity networks at various scales. Comorbidity strengths and similarities of multimorbidity connection patterns align with the disease genetic correlations. Graph-theoretic analyses revealed a consistent core-periphery structure, implying efficient network clustering through threshold graph construction. Using hydronephrosis as a case study, we demonstrated the network's ability to uncover clinically relevant disease relationships and provide novel insights. Interpretation: Our findings demonstrate the robustness of large-scale EHR data for studying phenome-wide multimorbidities. The alignment of multimorbidity patterns with genetic data suggests the potential utility for uncovering shared biology of diseases. The consistent core-periphery structure offers analytical insights to discover complex disease interactions. This work also sets the stage for advanced disease modeling, with implications for precision medicine. Funding: VUMC Biostatistics Development Award, the National Institutes of Health, and the VA CSRD.
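
The idea of representing multimorbidity as an undirected weighted graph, and of thresholding its edges, can be sketched as follows. This is only an illustration: the pairwise strengths are made-up numbers rather than outputs of the dual logistic regression described above, and the node labels, function names, and the `min_strength` parameter are hypothetical.

```python
from collections import defaultdict


def build_comorbidity_graph(pair_strengths, min_strength=0.0):
    """Build an undirected weighted graph as a dict of neighbor dicts.

    pair_strengths maps (disease_a, disease_b) to a comorbidity strength.
    Edges weaker than min_strength are dropped (a simple threshold graph).
    """
    graph = defaultdict(dict)
    for (a, b), weight in pair_strengths.items():
        if a == b or weight < min_strength:
            continue
        graph[a][b] = weight  # store both directions so the graph is undirected
        graph[b][a] = weight
    return dict(graph)


def weighted_degree(graph, node):
    """Sum of edge weights attached to a node (0 if the node is isolated)."""
    return sum(graph.get(node, {}).values())


# Hypothetical strengths between three placeholder diseases.
strengths = {("A", "B"): 1.2, ("A", "C"): 0.3, ("B", "C"): 0.8}
graph = build_comorbidity_graph(strengths, min_strength=0.5)
print(sorted(graph["B"].items()))
```

Storing each edge in both directions keeps neighbor lookups simple, at the cost of doubling memory, which is usually acceptable for a phenome-sized network.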

12.
Res Sq ; 2023 Jun 08.
Article in English | MEDLINE | ID: mdl-37333237

ABSTRACT

Despite consistent public health recommendations, obesity rates continue to increase. Physical activity (e.g. daily steps) is a well-established modifier of body weight. Genetic background is an important, but typically uncaptured, contributor to obesity risk. Leveraging physical activity, clinical, and genetic data from the All of Us Research Program, we measured the impact of genetic risk of obesity on the level of physical activity needed to reduce incident obesity. For example, we show that an additional 3,310 steps per day (11,910 steps total) would be needed to mitigate a 25% higher than average genetic risk of obesity. We quantify the number of daily steps needed to mitigate obesity risk across the spectrum of genetic risk. This work quantifies the relationship between physical activity and genetic risk showing significant independent effects and provides a first step towards personalized activity recommendations that incorporate genetic information to reduce incident obesity risk.

13.
medRxiv ; 2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37333378

ABSTRACT

Patients with schizophrenia have substantial comorbidity contributing to reduced life expectancy of 10-20 years. Identifying which comorbidities might be modifiable could improve rates of premature mortality in this population. We hypothesize that conditions that frequently co-occur but lack shared genetic risk with schizophrenia are more likely to be products of treatment, behavior, or environmental factors and therefore potentially modifiable. To test this hypothesis, we calculated phenome-wide comorbidity from electronic health records (EHR) in 250,000 patients in each of two independent health care institutions (Vanderbilt University Medical Center and Mass General Brigham) and association with schizophrenia polygenic risk scores (PRS) across the same phenotypes (phecodes) in linked biobanks. Comorbidity with schizophrenia was significantly correlated across institutions (r = 0.85) and consistent with prior literature. After multiple test correction, there were 77 significant phecodes comorbid with schizophrenia. Overall, comorbidity and PRS association were highly correlated (r = 0.55, p = 1.29 × 10⁻¹¹⁸); however, 36 of the EHR-identified comorbidities had significantly equivalent schizophrenia PRS distributions between cases and controls. Fifteen of these lacked any PRS association and were enriched for phenotypes known to be side effects of antipsychotic medications (e.g., "movement disorders", "convulsions", "tachycardia") or other schizophrenia-related factors such as smoking ("bronchitis") or reduced hygiene (e.g., "diseases of the nail"), highlighting the validity of this approach. Other phenotypes implicated by this approach where the contribution from shared common genetic risk with schizophrenia was minimal included tobacco use disorder, diabetes, and dementia. This work demonstrates the consistency and robustness of EHR-based schizophrenia comorbidities across independent institutions and with the existing literature. It identifies comorbidities with an absence of shared genetic risk, indicating other causes that might be more modifiable and where further study of causal pathways could improve outcomes for patients.

14.
medRxiv ; 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37961557

ABSTRACT

Studies of the value of genetic information for improving the performance of clinical risk prediction models have yielded variable conclusions. Many methodological decisions have the potential to contribute to differential results across studies. Here, we performed multiple modeling experiments integrating clinical and demographic data from electronic health records (EHR) and genetic data to understand which decision points may affect performance. Clinical data in the form of structured diagnostic codes, medications, procedural codes, and demographics were extracted from two large independent health systems and polygenic risk scores (PRS) were generated across all patients with genetic data in the corresponding biobanks. Crohn's disease was used as the model phenotype based on its substantial genetic component, established EHR-based definition, and sufficient prevalence for model training and testing. We investigated the impact of PRS integration method, as well as choices regarding training sample, model complexity, and performance metrics. Overall, our results show that including PRS resulted in higher performance by some metrics but the gain in performance was only robust when combined with demographic data alone. Improvements were inconsistent or negligible after including additional clinical information. The impact of genetic information on performance also varied by PRS integration method, with a small improvement in some cases from combining PRS with the output of a clinical model (late-fusion) compared to its inclusion as an additional feature (early-fusion). The effects of other modeling decisions varied between institutions, though performance increased with more compute-intensive models such as random forest. This work highlights the importance of considering methodological decision points in interpreting the impact on prediction performance when including PRS information in clinical models.

15.
Cell Genom ; 3(4): 100277, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37082147

ABSTRACT

Autism spectrum disorder (ASD) is a heritable neurodevelopmental disorder characterized by deficits in social interactions and communication. Protein-altering variants in many genes have been shown to contribute to ASD; however, understanding the convergence across many genes remains a challenge. We demonstrate that coexpression patterns from 993 human postmortem brains are significantly correlated with the transcriptional consequences of CRISPR perturbations in human neurons. Across 71 ASD risk genes, there was significant tissue-specific convergence implicating synaptic pathways. Tissue-specific convergence was further demonstrated across schizophrenia and atrial fibrillation risk genes. The degree of ASD convergence was significantly correlated with ASD association from rare variation and differential expression in ASD brains. Positively convergent genes showed intolerance to functional mutations and had shorter coding lengths than known risk genes even after removing association with ASD. These results indicate that convergent coexpression can identify potentially novel genes that are unlikely to be discovered by sequencing studies.

16.
Genet Epidemiol ; 35(6): 557-67, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21769932

ABSTRACT

We present a novel method, IBDLD, for estimating the probability of identity by descent (IBD) for a pair of related individuals at a locus, given dense genotype data and a pedigree of arbitrary size and complexity. IBDLD overcomes the challenges of exact multipoint estimation of IBD in pedigrees of potentially large size and eliminates the difficulty of accommodating the background linkage disequilibrium (LD) that is present in high-density genotype data. We show that IBDLD is much more accurate at estimating the true IBD sharing than methods that remove LD by pruning SNPs and is highly robust to pedigree errors or other forms of misspecified relationships. The method is fast and can be used to estimate the probability for each possible IBD sharing state at every SNP from a high-density genotyping array for hundreds of thousands of pairs of individuals. We use it to estimate point-wise and genomewide IBD sharing between 185,745 pairs of subjects all of whom are related through a single, large and complex 13-generation pedigree and genotyped with the Affymetrix 500 k chip. We find that we are able to identify the true pedigree relationship for individuals who were misidentified in the collected data and estimate empirical kinship coefficients that can be used in follow-up QTL mapping studies. IBDLD is implemented as an open source software package and is freely available.
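
The link between per-locus IBD sharing probabilities and an empirical kinship coefficient can be shown with a small sketch. It uses the textbook relationship for a pair of non-inbred individuals, kinship = P(IBD1)/4 + P(IBD2)/2, averaged over loci. This is an assumption made for illustration: it is not IBDLD's estimator, the function names are hypothetical, and the formula would need adjusting for inbred pedigrees.

```python
def kinship_from_ibd(p_ibd0, p_ibd1, p_ibd2):
    """Kinship coefficient at one locus from IBD state probabilities.

    Assumes a non-inbred pair, for which kinship = P(IBD1)/4 + P(IBD2)/2.
    """
    total = p_ibd0 + p_ibd1 + p_ibd2
    if abs(total - 1.0) > 1e-6:
        raise ValueError("IBD state probabilities must sum to 1")
    return 0.25 * p_ibd1 + 0.5 * p_ibd2


def mean_kinship(per_locus_probs):
    """Average the per-locus kinship over a list of (p0, p1, p2) tuples."""
    values = [kinship_from_ibd(*probs) for probs in per_locus_probs]
    return sum(values) / len(values)


# Expected sharing for full siblings: (0.25, 0.5, 0.25).
print(kinship_from_ibd(0.25, 0.5, 0.25))
```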


Subjects
Genome-Wide Association Study , Molecular Epidemiology/methods , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Computer Simulation , Female , Genotype , Humans , Linkage Disequilibrium , Male , Genetic Models , Pedigree , Probability , Quantitative Trait Loci , Reproducibility of Results
17.
Nat Med ; 27(6): 1097-1104, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34083811

ABSTRACT

Around 5% of the population is affected by a rare genetic disease, yet most endure years of uncertainty before receiving a genetic test. A common feature of genetic diseases is the presence of multiple rare phenotypes that often span organ systems. Here, we use diagnostic billing information from longitudinal clinical data in the electronic health records (EHRs) of 2,286 patients who received a chromosomal microarray test, and 9,144 matched controls, to build a model to predict who should receive a genetic test. The model achieved high prediction accuracies in a held-out test sample (area under the receiver operating characteristic curve (AUROC), 0.97; area under the precision-recall curve (AUPRC), 0.92), in an independent hospital system (AUROC, 0.95; AUPRC, 0.62), and in an independent set of 172,265 patients in which cases were broadly defined as having an interaction with a genetics provider (AUROC, 0.9; AUPRC, 0.63). Patients carrying a putative pathogenic copy number variant were also accurately identified by the model. Compared with current approaches for genetic test determination, our model could identify more patients for testing while also increasing the proportion of those tested who have a genetic disease. We demonstrate that phenotypic patterns representative of a wide range of genetic diseases can be captured from EHRs to systematize decision-making for genetic testing, with the potential to speed up diagnosis, improve care and reduce costs.
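
The AUROC metric reported in this abstract can be computed directly from its rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The sketch below is a plain-Python illustration on made-up labels and scores; it is not the study's evaluation code, and the function name is hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Hypothetical model scores for two cases (1) and three controls (0).
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(auroc(labels, scores))
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a sort-based rank computation would be preferable for cohorts as large as those in the abstract.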


Subjects
DNA Copy Number Variations/genetics , Genetic Diseases, Inborn/diagnosis , Genetic Testing , Rare Diseases/diagnosis , Adolescent , Adult , Child , Child, Preschool , Electronic Health Records , Female , Genetic Diseases, Inborn/pathology , Humans , Infant , Male , Microarray Analysis , Phenotype , Rare Diseases/genetics , Rare Diseases/pathology
18.
Genetica ; 138(9-10): 1099-109, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20835884

ABSTRACT

Identity-by-descent (IBD) based variance component analysis is an important method for mapping quantitative trait loci (QTL) in outbred populations. The interval-mapping approach and its various modified versions may have limited use in evaluating the genetic variances of the entire genome because they require evaluation of multiple models and model selection. In this study, we developed a multiple variance component model for genome-wide evaluation using both the maximum likelihood (ML) method and the MCMC-implemented Bayesian method. We placed one QTL every few cM across the entire genome and estimated the QTL variances and positions simultaneously in a single model. Genomic regions that have no QTL usually showed no evidence of QTL, while regions with large QTL always showed strong evidence of QTL. While the Bayesian method produced the optimal result, the ML method is computationally more efficient than the Bayesian method. Simulation experiments were conducted to demonstrate the efficacy of the new methods.


Subjects
Genetic Association Studies , Models, Genetic , Quantitative Trait Loci , Alleles , Analysis of Variance , Bayes Theorem , Chromosome Mapping , Likelihood Functions , Markov Chains
19.
Nat Commun ; 11(1): 2990, 2020 06 12.
Article in English | MEDLINE | ID: mdl-32533064

ABSTRACT

Structural variants (SVs) contribute to many disorders, yet functionally annotating them remains a major challenge. Here, we integrate SVs with RNA-sequencing from human post-mortem brains to quantify their dosage and regulatory effects. We show that genic and regulatory SVs exist at significantly lower frequencies than intergenic SVs. The functional impact of copy number variants (CNVs) stems from both the proportion of genic and regulatory content altered and the loss-of-function intolerance of the gene. We train a linear model to predict expression effects of rare CNVs and use it to annotate regulatory disruption of CNVs from 14,891 independent genome-sequenced individuals. Pathogenic deletions implicated in neurodevelopmental disorders show significantly more extreme regulatory disruption scores and, if rank-ordered by these scores, would be prioritized higher than when using frequency or length alone. This work shows the deleteriousness of regulatory SVs, particularly those altering CTCF sites, and provides a simple approach for functionally annotating the regulatory consequences of CNVs.


Subjects
Brain/metabolism , DNA Copy Number Variations , Gene Expression Regulation , Genetic Variation , Genome, Human/genetics , Autopsy/methods , Brain/pathology , Female , Gene Expression Profiling/methods , Humans , Male , Neurodevelopmental Disorders/genetics , Sequence Analysis, RNA/methods
20.
Biosystems ; 91(1): 158-65, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17919808

ABSTRACT

DNA arrays measure the expression levels for thousands of genes simultaneously under different conditions. These measurements reflect many aspects of the underlying biological processes. A method based on the matrix of thresholding partial correlation coefficients (MTPCC) is proposed for network inference from expression profiles. It includes three main parts: (1) hierarchical cluster analysis, (2) establishment of cluster boundaries, and (3) regulatory network inference. The method was applied to the expression data of 2467 genes in Saccharomyces cerevisiae measured under 79 different conditions [Eisen, M.B., Spellman, P.T., Brown, P.O., Botstein, D., 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. 95, 14863-14868]. Using hierarchical clustering and the established cluster boundaries, the 2467 genes were grouped into 12 clusters. The expression profile of each cluster was represented as the set of expression levels averaged over the cluster's constituent genes under each condition. The expression data of these clusters were then subjected to partial correlation analysis, and the significance of each element in the resulting partial correlation coefficient matrix (PCCM) was examined by a permutation test. The corresponding undirected dependency graph (UDG) was obtained as a model of the regulatory network of S. cerevisiae. The veracity of the network was supported by the consistency of our results with the collected results from experimental studies.
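
The final stage described above (partial correlation followed by a permutation test) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the published MTPCC implementation: the function names are hypothetical, partial correlations are taken from the inverse of the correlation matrix, and the permutation scheme (shuffling each cluster's profile independently across conditions and using one pooled threshold) is one plausible choice that the abstract does not specify.

```python
import numpy as np

def partial_correlations(profiles):
    """Partial correlation matrix for rows of `profiles` (clusters x conditions).

    Requires more conditions than clusters so the correlation matrix is invertible.
    """
    corr = np.corrcoef(profiles)
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    # Off-diagonal partial correlations are the negated, rescaled precision entries.
    pcc = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcc, 1.0)
    return pcc

def dependency_graph(profiles, n_perm=200, alpha=0.05, seed=0):
    """Threshold |partial correlation| against a permutation null (assumed scheme)."""
    profiles = np.asarray(profiles, dtype=float)
    rng = np.random.default_rng(seed)
    observed = partial_correlations(profiles)
    off_diag = ~np.eye(profiles.shape[0], dtype=bool)
    null_values = []
    for _ in range(n_perm):
        # Shuffle each row independently to break dependence between clusters.
        shuffled = np.array([rng.permutation(row) for row in profiles])
        null_values.append(np.abs(partial_correlations(shuffled))[off_diag])
    threshold = np.quantile(np.concatenate(null_values), 1.0 - alpha)
    adjacency = (np.abs(observed) > threshold) & off_diag
    return observed, adjacency
```

Here `profiles` would be the 12 x 79 matrix of cluster-averaged expression levels, and the returned boolean `adjacency` matrix plays the role of the undirected dependency graph.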


Subjects
Gene Regulatory Networks/genetics , Gene Expression Profiling , Multigene Family/genetics , Saccharomyces cerevisiae/genetics