Results 1 - 14 of 14
1.
Cell ; 184(8): 2068-2083.e11, 2021 04 15.
Article in English | MEDLINE | ID: mdl-33861964

ABSTRACT

Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.


Subject(s)
Ethnicity/genetics , Population Health , Databases, Genetic , Electronic Health Records , Genomics , Humans , Self Report
2.
Genet Epidemiol ; 42(1): 49-63, 2018 02.
Article in English | MEDLINE | ID: mdl-29114909

ABSTRACT

BACKGROUND: Epistasis and gene-environment interactions are known to contribute significantly to variation of complex phenotypes in model organisms. However, their identification in human association studies remains challenging for myriad reasons. In the case of epistatic interactions, the large number of potential interacting sets of genes presents computational, multiple hypothesis correction, and other statistical power issues. In the case of gene-environment interactions, the lack of consistently measured environmental covariates in most disease studies precludes searching for interactions and creates difficulties for replicating studies. RESULTS: In this work, we develop a new statistical approach to address these issues that leverages genetic ancestry, defined as the proportion of ancestry derived from each ancestral population (e.g., the fraction of European/African ancestry in African Americans), in admixed populations. We applied our method to gene expression and methylation data from African American and Latino admixed individuals, respectively, identifying nine interactions that were significant at P < 5×10⁻⁸. We show that two of the interactions in methylation data replicate, and the remaining six are significantly enriched for low P-values (P < 1.8×10⁻⁶). CONCLUSION: We show that genetic ancestry can be a useful proxy for unknown and unmeasured covariates in the search for interaction effects. These results have important implications for our understanding of the genetic architecture of complex traits.


Subject(s)
Black People/genetics , Black or African American/genetics , Epistasis, Genetic/genetics , Gene-Environment Interaction , Hispanic or Latino/genetics , Models, Genetic , White People/genetics , DNA Methylation , Humans , Phenotype
3.
Proc Natl Acad Sci U S A ; 112(44): 13621-6, 2015 Nov 03.
Article in English | MEDLINE | ID: mdl-26483472

ABSTRACT

Nonrandom mating in human populations has important implications for genetics and medicine as well as for economics and sociology. In this study, we performed an integrative analysis of a large cohort of Mexican and Puerto Rican couples using detailed socioeconomic attributes and genotypes. We found that in ethnically homogeneous Latino communities, partners are significantly more similar in their genomic ancestries than expected by chance. Consistent with this, we also found that partners are more closely related (equivalent to between third and fourth cousins in Mexicans and Puerto Ricans) than matched random male-female pairs. Our analysis showed that this genomic ancestry similarity cannot be explained by the standard socioeconomic measurables alone. Strikingly, the assortment of genomic ancestry in couples was consistently stronger than even the assortment of education. We found enriched correlation of partners' genotypes at genes known to be involved in facial development. We replicated our results across multiple geographic locations. We discuss the implications of assortment and assortment-specific loci on disease dynamics and disease mapping methods in Latinos.


Subject(s)
Genetics, Medical , Hispanic or Latino , Interpersonal Relations , Socioeconomic Factors , Cohort Studies , Female , Heterozygote , Humans , Male , Mexico/ethnology , Puerto Rico/ethnology
4.
Bioinformatics ; 31(12): i181-9, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-26072481

ABSTRACT

MOTIVATION: Approaches to identifying new risk loci, training risk prediction models, imputing untyped variants and fine-mapping causal variants from summary statistics of genome-wide association studies are playing an increasingly important role in the human genetics community. Current summary statistics-based methods rely on global 'best guess' reference panels to model the genetic correlation structure of the dataset being studied. This approach, especially in admixed populations, has the potential to produce misleading results, ignores variation in local structure and is not feasible when appropriate reference panels are missing or small. Here, we develop a method, Adapt-Mix, that combines information across all available reference panels to produce estimates of local genetic correlation structure for summary statistics-based methods in arbitrary populations. RESULTS: We applied Adapt-Mix to estimate the genetic correlation structure of both admixed and non-admixed individuals using simulated and real data. We evaluated our method by measuring the performance of two summary statistics-based methods: imputation and joint-testing. When using our method as opposed to the current standard of 'best guess' reference panels, we observed a 28% decrease in mean-squared error for imputation and a 73.7% decrease in mean-squared error for joint-testing. AVAILABILITY AND IMPLEMENTATION: Our method is publicly available in a software package called ADAPT-Mix available at https://github.com/dpark27/adapt_mix.


Subject(s)
Genome-Wide Association Study/methods , Algorithms , Coronary Artery Disease/genetics , Data Interpretation, Statistical , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide , Software
5.
BMC Bioinformatics ; 16 Suppl 5: S9, 2015.
Article in English | MEDLINE | ID: mdl-25860540

ABSTRACT

Identifying segments in the genome of different individuals that are identical-by-descent (IBD) is a fundamental element of genetics. IBD data is used for numerous applications including demographic inference, heritability estimation, and mapping disease loci. Simultaneous detection of IBD over multiple haplotypes has proven to be computationally difficult. To overcome this, many state-of-the-art methods estimate the probability of IBD between each pair of haplotypes separately. While computationally efficient, these methods fail to leverage the clique structure of IBD, resulting in less powerful IBD identification, especially for small IBD segments.
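
The "clique structure" mentioned in this abstract can be illustrated with a small sketch. It assumes the idealized property that, at a single locus, IBD is transitive: if haplotype A is IBD with B and B with C, then A should also be IBD with C, so each connected group of pairwise calls should form a clique. The function names and the toy haplotype labels below are hypothetical, and this is not the method of the cited paper; it only flags pairs that separate pairwise testing would have missed.

```python
from itertools import combinations


def connected_components(nodes, edges):
    """Group haplotypes linked by at least one chain of pairwise IBD calls."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for node in nodes:
        if node in seen:
            continue
        stack = [node]
        component = set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        components.append(component)
    return components


def missing_clique_edges(nodes, edges):
    """Return pairs implied by transitivity but absent from the pairwise calls."""
    called = {frozenset(edge) for edge in edges}
    missing = []
    for component in connected_components(nodes, edges):
        for a, b in combinations(sorted(component), 2):
            if frozenset((a, b)) not in called:
                missing.append((a, b))
    return missing


# Toy example (hypothetical labels): h1-h2 and h2-h3 were called, h1-h3 was not.
haplotypes = ["h1", "h2", "h3", "h4"]
pairwise_calls = [("h1", "h2"), ("h2", "h3")]
print(missing_clique_edges(haplotypes, pairwise_calls))
```

In this toy input the only flagged pair is h1-h3, the kind of short or weakly supported segment that a pair-by-pair method might miss.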


Subject(s)
Asthma/genetics , Computational Biology/methods , Genetics, Population , Genome, Human , Haplotypes/genetics , Polymorphism, Single Nucleotide/genetics , Asthma/epidemiology , Cohort Studies , Computer Simulation , Hispanic or Latino/genetics , Humans , Probability
6.
Am J Hum Genet ; 90(6): 1046-63, 2012 Jun 08.
Article in English | MEDLINE | ID: mdl-22658545

ABSTRACT

We sought to comprehensively and systematically characterize the relationship between genetic variation, miRNA expression, and mRNA expression. Genome-wide expression profiling of samples of European and African ancestry identified in each population hundreds of miRNAs whose increased expression is correlated with correspondingly reduced expression of target mRNAs. We scanned 3' UTR SNPs with a potential functional effect on miRNA binding for cis-acting expression quantitative trait loci (eQTLs) for the corresponding proximal target genes. To extend sequence-based, localized analyses of SNP effect on miRNA binding, we proceeded to dissect the genetic basis of miRNA expression variation; we mapped miRNA expression levels, as quantitative traits, to loci in the genome as miRNA eQTLs, demonstrating that miRNA expression is under significant genetic control. We found that SNPs associated with miRNA expression are significantly enriched with those SNPs already shown to be associated with mRNA. Moreover, we discovered that many of the miRNA-associated genetic variations identified in our study are associated with a broad spectrum of human complex traits from the National Human Genome Research Institute catalog of published genome-wide association studies. Experimentally, we replicated miRNA-induced mRNA expression inhibition and the cis-eQTL relationship to the target gene for several identified relationships among SNPs, miRNAs, and mRNAs in an independent set of samples; furthermore, we conducted miRNA overexpression and inhibition experiments to functionally validate the miRNA-mRNA relationships. This study extends our understanding of the genetic regulation of the transcriptome and suggests that genetic variation might underlie observed relationships between miRNAs and mRNAs more commonly than has previously been appreciated.


Subject(s)
Gene Expression Regulation , MicroRNAs/metabolism , Transcriptome , 3' Untranslated Regions , Algorithms , Exons , Gene Expression Profiling , Genetic Variation , Genome , Genome, Human , Genome-Wide Association Study , Genotype , Humans , Models, Genetic , Polymorphism, Single Nucleotide , Quantitative Trait Loci , RNA, Messenger/metabolism , Transcription, Genetic
7.
Nat Genet ; 56(8): 1592-1596, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39103650

ABSTRACT

Coronavirus disease 2019 (COVID-19) and influenza are respiratory illnesses caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza viruses, respectively. Both diseases share symptoms and clinical risk factors (ref. 1), but the extent to which these conditions have a common genetic etiology is unknown. This is partly because host genetic risk factors are well characterized for COVID-19 but not for influenza, with the largest published genome-wide association studies for these conditions including >2 million individuals (ref. 2) and about 1,000 individuals (refs. 3-6), respectively. Shared genetic risk factors could point to targets to prevent or treat both infections. Through a genetic study of 18,334 cases with a positive test for influenza and 276,295 controls, we show that published COVID-19 risk variants are not associated with influenza. Furthermore, we discovered and replicated an association between influenza infection and noncoding variants in B3GALT5 and ST6GAL1, neither of which was associated with COVID-19. In vitro small interfering RNA knockdown of ST6GAL1 (an enzyme that adds sialic acid to the cell surface, which is used for viral entry) reduced influenza infectivity by 57%. These results mirror the observation that variants that downregulate ACE2, the SARS-CoV-2 receptor, protect against COVID-19 (ref. 7). Collectively, these findings highlight downregulation of key cell surface receptors used for viral entry as treatment opportunities to prevent COVID-19 and influenza.


Subject(s)
COVID-19 , Genetic Predisposition to Disease , Genome-Wide Association Study , Influenza, Human , SARS-CoV-2 , Humans , Influenza, Human/genetics , Influenza, Human/epidemiology , Influenza, Human/virology , COVID-19/genetics , COVID-19/virology , Risk Factors , SARS-CoV-2/genetics , Male , Female , Polymorphism, Single Nucleotide , Case-Control Studies , Middle Aged
8.
BMJ Open ; 12(10): e049657, 2022 10 12.
Article in English | MEDLINE | ID: mdl-36223959

ABSTRACT

OBJECTIVES: The enormous toll of the COVID-19 pandemic has heightened the urgency of collecting and analysing population-scale datasets in real time to monitor and better understand the evolving pandemic. The objectives of this study were to examine the relationship of risk factors to COVID-19 susceptibility and severity and to develop risk models to accurately predict COVID-19 outcomes using rapidly obtained self-reported data. DESIGN: A cross-sectional study. SETTING: AncestryDNA customers in the USA who consented to research. PARTICIPANTS: The AncestryDNA COVID-19 Study collected self-reported survey data on symptoms, outcomes, risk factors and exposures for over 563 000 adult individuals in the USA in just under 4 months, including over 4700 COVID-19 cases as measured by a self-reported positive test. RESULTS: We replicated previously reported associations between several risk factors and COVID-19 susceptibility and severity outcomes, and additionally found that differences in known exposures accounted for many of the susceptibility associations. A notable exception was elevated susceptibility for men even after adjusting for known exposures and age (adjusted OR=1.36, 95% CI=1.19 to 1.55). We also demonstrated that self-reported data can be used to build accurate risk models to predict individualised COVID-19 susceptibility (area under the curve (AUC)=0.84) and severity outcomes including hospitalisation and critical illness (AUC=0.87 and 0.90, respectively). The risk models achieved robust discriminative performance across different age, sex and genetic ancestry groups within the study. CONCLUSIONS: The results highlight the value of self-reported epidemiological data to rapidly provide public health insights into the evolving COVID-19 pandemic.


Subject(s)
COVID-19 , Adult , COVID-19/epidemiology , Cross-Sectional Studies , Humans , Male , Pandemics , Risk Factors , SARS-CoV-2
9.
Nat Genet ; 54(4): 374-381, 2022 04.
Article in English | MEDLINE | ID: mdl-35410379

ABSTRACT

Multiple COVID-19 genome-wide association studies (GWASs) have identified reproducible genetic associations indicating that there is a genetic component to susceptibility and severity risk. To complement these studies, we collected deep coronavirus disease 2019 (COVID-19) phenotype data from a survey of 736,723 AncestryDNA research participants. With these data, we defined eight phenotypes related to COVID-19 outcomes: four phenotypes that align with previously studied COVID-19 definitions and four 'expanded' phenotypes that focus on susceptibility given exposure, mild clinical manifestations and an aggregate score of symptom severity. We performed a replication analysis of 12 previously reported COVID-19 genetic associations with all eight phenotypes in a trans-ancestry meta-analysis of AncestryDNA research participants. In this analysis, we show distinct patterns of association at the 12 loci with the eight outcomes that we assessed. We also performed a genome-wide discovery analysis of all eight phenotypes, which did not yield new genome-wide significant loci but did suggest that three of the four 'expanded' COVID-19 phenotypes have enhanced power to capture protective genetic associations relative to the previously studied phenotypes. Thus, we conclude that continued large-scale ascertainment of deep COVID-19 phenotype data would likely represent a boon for COVID-19 therapeutic target identification.


Subject(s)
COVID-19 , Genome-Wide Association Study , COVID-19/genetics , Genetic Predisposition to Disease , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics
10.
Nat Genet ; 54(4): 382-392, 2022 04.
Article in English | MEDLINE | ID: mdl-35241825

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) enters human host cells via angiotensin-converting enzyme 2 (ACE2) and causes coronavirus disease 2019 (COVID-19). Here, through a genome-wide association study, we identify a variant (rs190509934, minor allele frequency 0.2-2%) that downregulates ACE2 expression by 37% (P = 2.7 × 10⁻⁸) and reduces the risk of SARS-CoV-2 infection by 40% (odds ratio = 0.60, P = 4.5 × 10⁻¹³), providing human genetic evidence that ACE2 expression levels influence COVID-19 risk. We also replicate the associations of six previously reported risk variants, of which four were further associated with worse outcomes in individuals infected with the virus (in/near LZTFL1, MHC, DPP9 and IFNAR2). Lastly, we show that common variants define a risk score that is strongly associated with severe disease among cases and modestly improves the prediction of disease severity relative to demographic and clinical factors alone.
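
The "40%" figure in this abstract follows from the reported odds ratio by simple arithmetic, sketched below. The helper name is hypothetical and the calculation is only an illustration: strictly, an odds ratio describes a change in odds, which approximates a change in risk when the outcome is uncommon.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio into a percent change in odds.

    Values below 0 mean lower odds in carriers; values above 0 mean higher odds.
    """
    return round((odds_ratio - 1.0) * 100.0, 1)


# Odds ratio reported in the abstract above for rs190509934.
change = percent_change_in_odds(0.60)
print(f"Change in odds: {change}%")
```

An odds ratio of 0.60 therefore corresponds to odds that are 40% lower in carriers of the variant.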


Subject(s)
COVID-19 , Angiotensin-Converting Enzyme 2/genetics , COVID-19/genetics , Genome-Wide Association Study , Humans , Risk Factors , SARS-CoV-2/genetics
11.
Front Genet ; 12: 673167, 2021.
Article in English | MEDLINE | ID: mdl-34108994

ABSTRACT

Genome-wide association studies (GWAS) are primarily conducted in single-ancestry settings. The low transferability of results has limited our understanding of human genetic architecture across a range of complex traits. In contrast to homogeneous populations, admixed populations provide an opportunity to capture genetic architecture contributed from multiple source populations and thus improve statistical power. Here, we provide a mechanistic simulation framework to investigate the statistical power and transferability of GWAS under directional polygenic selection or varying divergence. We focus on a two-way admixed population and show that GWAS in admixed populations can be enriched for power in discovery by up to 2-fold compared to the ancestral populations under similar sample size. Moreover, higher accuracy of cross-population polygenic score estimates is also observed if variants and weights are trained in the admixed group rather than in the ancestral groups. Common variant associations are also more likely to replicate if first discovered in the admixed group and then transferred to an ancestral population, than the other way around (across 50 iterations with 1,000 causal SNPs, training on 10,000 individuals, testing on 1,000 in each population, p = 3.78e-6, 6.19e-101, ∼0 for FST = 0.2, 0.5, 0.8, respectively). While some of these FST values may appear extreme, we demonstrate that they are found across the entire phenome in the GWAS catalog. This framework demonstrates that investigation of admixed populations harbors significant advantages over GWAS in single-ancestry cohorts for uncovering the genetic architecture of traits and will improve downstream applications such as personalized medicine across diverse populations.

12.
PLoS One ; 9(7): e100542, 2014.
Article in English | MEDLINE | ID: mdl-25057966

ABSTRACT

Klebsiella oxytoca is an opportunistic pathogen implicated in various clinical diseases in animals and humans. Studies suggest that in humans K. oxytoca exerts its pathogenicity in part through a cytotoxin. However, cytotoxin production in animal isolates of K. oxytoca and its pathogenic properties have not been characterized. Furthermore, neither the identity of the toxin nor a complete repertoire of genes involved in K. oxytoca pathogenesis has been fully elucidated. Here, we showed that several animal isolates of K. oxytoca, including the clinical isolates, produced secreted products in bacterial culture supernatant that display cytotoxicity on HEp-2 and HeLa cells, indicating the ability to produce cytotoxin. Cytotoxin production appears to be regulated by the environment, and a soy-based product was found to have a strong toxin induction property. The toxin was identified, by liquid chromatography-mass spectrometry and NMR spectroscopy, as a low-molecular-weight, heat-labile benzodiazepine, tilivalline, previously shown to cause cytotoxicity in several cell lines, including mouse L1210 leukemic cells. Genome sequencing and analyses of a cytotoxin-positive K. oxytoca strain isolated from an abscess of a mouse identified genes previously shown to promote pathogenesis in other enteric bacterial pathogens, including ecotin, several genes encoding type IV and type VI secretion systems, and proteins that show sequence similarity to known bacterial toxins including cholera toxin. To our knowledge, these results demonstrate for the first time that animal isolates of K. oxytoca produce a cytotoxin, and that cytotoxin production is under strict environmental regulation. We also confirmed tilivalline as the cytotoxin present in animal K. oxytoca strains. These findings, along with the discovery of a repertoire of genes with virulence potential, provide important insights into the pathogenesis of K. oxytoca.
As a novel diagnostic tool, tilivalline may serve as a biomarker for K. oxytoca-induced cytotoxicity in humans and animals through detection in various samples, from food to diseased samples, using LC-MS/MS. Induction of K. oxytoca cytotoxin by consumption of soy may be in part involved in the pathogenesis of gastrointestinal disease.


Subject(s)
Bacterial Toxins/toxicity , Benzodiazepinones/toxicity , Klebsiella Infections/veterinary , Klebsiella oxytoca/pathogenicity , Animals , Bacterial Secretion Systems/genetics , Bacterial Toxins/biosynthesis , Bacterial Toxins/chemistry , Bacterial Toxins/isolation & purification , Benzodiazepinones/chemistry , Benzodiazepinones/isolation & purification , Benzodiazepinones/metabolism , Cell Death/drug effects , Cell Line , Cell Survival/drug effects , Haplorhini , HeLa Cells , Humans , Klebsiella Infections/microbiology , Klebsiella oxytoca/drug effects , Klebsiella oxytoca/isolation & purification , Klebsiella oxytoca/metabolism , Mice , Plant Extracts/isolation & purification , Plant Extracts/pharmacology , Rats , Glycine max/chemistry , Swine
13.
Genetics ; 192(4): 1249-69, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23051646

ABSTRACT

Whole genome sequencing (WGS) allows researchers to pinpoint genetic differences between individuals and significantly shortcuts the costly and time-consuming part of forward genetic analysis in model organism systems. Currently, the most effort-intensive part of WGS is the bioinformatic analysis of the relatively short reads generated by second generation sequencing platforms. We describe here a novel, easily accessible and cloud-based pipeline, called CloudMap, which greatly simplifies the analysis of mutant genome sequences. Available on the Galaxy web platform, CloudMap requires no software installation when run on the cloud, but it can also be run locally or via Amazon's Elastic Compute Cloud (EC2) service. CloudMap uses a series of predefined workflows to pinpoint sequence variations in animal genomes, such as those of premutagenized and mutagenized Caenorhabditis elegans strains. In combination with a variant-based mapping procedure, CloudMap allows users to sharply define genetic map intervals graphically and to retrieve very short lists of candidate variants with a few simple clicks. Automated workflows and extensive video user guides are available to detail the individual analysis steps performed (http://usegalaxy.org/cloudmap). We demonstrate the utility of CloudMap for WGS analysis of C. elegans and Arabidopsis genomes and describe how other organisms (e.g., Zebrafish and Drosophila) can easily be accommodated by this software platform. To accommodate rapid analysis of many mutants from large-scale genetic screens, CloudMap contains an in silico complementation testing tool that allows users to rapidly identify instances where multiple alleles of the same gene are present in the mutant collection. Lastly, we describe the application of a novel mapping/WGS method ("Variant Discovery Mapping") that does not rely on a defined polymorphic mapping strain, and we integrate the application of this method into CloudMap. 
CloudMap tools and documentation are continually updated at http://usegalaxy.org/cloudmap.


Subject(s)
Chromosome Mapping/methods , Computational Biology/methods , Internet , Mutation , Software , Animals , Arabidopsis/genetics , Caenorhabditis elegans/genetics , Computer Simulation , Drosophila/genetics , Genetic Variation , Genome , Polymorphism, Single Nucleotide , Reproducibility of Results , Zebrafish/genetics
14.
PLoS One ; 7(8): e42842, 2012.
Article in English | MEDLINE | ID: mdl-22952616

ABSTRACT

The recently identified type VI secretion system (T6SS) of proteobacteria has been shown to promote pathogenicity, competitive advantage over competing microorganisms, and adaptation to environmental perturbation. By detailed phenotypic characterization of loss-of-function mutants, in silico, in vitro and in vivo analyses, we provide evidence that the enteric pathogen, Campylobacter jejuni, possesses a functional T6SS and that the secretion system exerts pleiotropic effects on two crucial processes: survival in a bile salt, deoxycholic acid (DCA), and host cell adherence and invasion. The expression of T6SS during initial exposure to the upper range of physiological levels of DCA (0.075%-0.2%) was detrimental to C. jejuni proliferation, whereas down-regulation or inactivation of T6SS enabled C. jejuni to resist this effect. The C. jejuni multidrug efflux transporter gene, cmeA, was significantly up-regulated during the initial exposure to DCA in the wild type C. jejuni relative to the T6SS-deficient strains, suggesting that inhibition of proliferation is the consequence of T6SS-mediated DCA influx. A sequential modulation of the efflux transporter activity and the T6SS represents, in part, an adaptive mechanism for C. jejuni to overcome this inhibitory effect, thereby ensuring its survival. C. jejuni T6SS plays important roles in host cell adhesion and invasion as T6SS inactivation resulted in a reduction of adherence to and invasion of in vitro cell lines, while over-expression of a hemolysin co-regulated protein, which encodes a secreted T6SS component, greatly enhanced these processes. When inoculated into B6.129P2-IL-10(tm1Cgn) mice, the T6SS-deficient C. jejuni strains did not effectively establish persistent colonization, indicating that T6SS contributes to colonization in vivo.
Taken together, our data demonstrate the importance of bacterial T6SS in host cell adhesion, invasion, colonization and, for the first time to our knowledge, adaptation to DCA, providing new insights into the role of T6SS in C. jejuni pathogenesis.


Subject(s)
Bacterial Secretion Systems , Campylobacter Infections/microbiology , Campylobacter jejuni/physiology , Deoxycholic Acid/pharmacology , Agar/chemistry , Animals , Bacterial Proteins/metabolism , Campylobacter jejuni/metabolism , Cell Adhesion , Cell Proliferation , Genes, Bacterial , Genetic Complementation Test , Hemolysin Proteins/metabolism , Interleukin-10/metabolism , Mice , Mice, Transgenic , Multigene Family , Mutation , Phenotype