1.
Front Plant Sci ; 15: 1338425, 2024.
Article in English | MEDLINE | ID: mdl-38571717

ABSTRACT

The introduction of dwarfing genes triggered a wave of "green revolution". A number of wheat dwarfing genes have been reported in previous studies, but only a small fraction of these have been applied in production. Therefore, the development of novel dwarfing genes for wheat is of great value. In this study, a novel dwarfing site, Rht-yz, identified in the Yanzhan mutation, is located on chromosome 4B (30-33 Mb) and its mechanism of action is different from that of Rht-B1b (C-T mutation), but whether it affects Rht-B1a (TraesCS4B02G043100) or other genes is unclear. Experiments with exogenously applied GA3 showed that Rht-yz is a gibberellin-insensitive dwarfing gene. The effects of the dwarf gene Rht-yz on agronomic traits in wheat were evaluated in the field using Yanzhan, Yanzhan mutations, F2:3 and F3:4 lines. The results showed that Rht-yz improved lodging resistance by reducing plant height and increasing the diameter, wall thickness and mechanical strength of the basal stem. In terms of yield traits, Rht-yz had negative effects on tiller number per plant, biomass per plant and yield per plant, but had no significant effect on harvest index, 1000-kernel weight or spike traits. In addition, Rht-yz significantly increased crude protein, wet gluten and starch content. Therefore, the rational use of the new dwarfing site Rht-yz has potential and value in dwarf wheat breeding.

2.
Front Plant Sci ; 14: 1134170, 2023.
Article in English | MEDLINE | ID: mdl-36993845

ABSTRACT

Compared to C3 species, C4 plants show higher photosynthetic capacity as well as water and nitrogen use efficiency due to the presence of the C4 photosynthetic pathway. Previous studies have shown that all genes required for the C4 photosynthetic pathway exist in the genomes of C3 species and are expressed. In this study, the genes encoding six key C4 photosynthetic pathway enzymes (β-CA, PEPC, ME, MDH, RbcS, and PPDK) in the genomes of five important gramineous crops (C4: maize, foxtail millet, and sorghum; C3: rice and wheat) were systematically identified and compared. Based on sequence characteristics and evolutionary relationships, their C4 functional gene copies were distinguished from non-photosynthetic functional gene copies. Furthermore, multiple sequence alignment revealed important sites affecting the activities of PEPC and RbcS between the C3 and C4 species. Comparisons of expression characteristics confirmed that the expression patterns of non-photosynthetic gene copies were relatively conserved among species, while C4 gene copies in C4 species acquired new tissue expression patterns during evolution. Additionally, multiple sequence features that may affect C4 gene expression and subcellular localization were found in the coding and promoter regions. Our work emphasized the diversity of the evolution of different genes in the C4 photosynthetic pathway and confirmed that the specific high expression in the leaf and appropriate intracellular distribution were the keys to the evolution of C4 photosynthesis. The results of this study will help determine the evolutionary mechanism of the C4 photosynthetic pathway in Gramineae and provide references for the transformation of C4 photosynthetic pathways in wheat, rice, and other major C3 cereal crops.

3.
Plants (Basel) ; 11(24), 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36559583

ABSTRACT

During the breeding process, screening excellent wheat varieties and lines takes considerable labor and time. Moreover, different climatic conditions bring more complex and unpredictable situations. Therefore, selection efficiency needs to be improved by applying a proper selection index. This study evaluates the capability of canopy temperature depression (CTD) as an index for evaluating wheat germplasm in field conditions and proposes a strategy for the proper and efficient application of CTD as an index in breeding programs. In this study, 186 bread wheat varieties were grown in the field and evaluated for three consecutive years with varied climatic conditions: normal, spring freezing, and early drought climatic conditions. The CTD and photosynthetic parameters were investigated at three key growth stages, canopy structural traits at the early grain filling stage, and yield traits at maturity. The variations in CTD among varieties were the highest in normal conditions and lowest in spring freezing conditions. CTD at the three growing stages was significantly and positively correlated for each growing season, and CTD at the middle grain filling stage was most significantly correlated across the three growing seasons, suggesting that CTD at the middle grain filling stage might be more important for evaluation. CTD was greatly affected by photosynthetic and canopy structural traits, which varied in different climatic conditions. Plant height, peduncle length, and the distance of the flag leaf to the spike were negatively correlated with CTD at the middle grain filling stage in both normal and drought conditions but positively correlated with CTD at the three stages in spring freezing conditions. Flag leaf length was positively correlated with CTD at the three stages in normal conditions but negatively correlated with CTD at the heading and middle grain filling stages in spring freezing conditions. 
Further analysis showed that CTD could be an index for evaluating the photosynthetic and yield traits of wheat germplasm in different environments, with varied characteristics in different climatic conditions. In normal conditions, the varieties with higher CTDs at the early filling stage had higher photosynthetic capacities and higher yields; in drought conditions, the varieties with high CTDs had better photosynthetic capacities, but those with moderate CTD had higher yield, while in spring freezing conditions, there were no differences in yield and biomass among the CTD groups. In sum, CTD could be used as an index to screen wheat varieties in specific climatic conditions, especially in normal and drought conditions, for photosynthetic parameters and some yield traits.

4.
Am J Hum Genet ; 109(11): 1998-2008, 2022 11 03.
Article in English | MEDLINE | ID: mdl-36240765

ABSTRACT

As most existing genome-wide association studies (GWASs) were conducted in European-ancestry cohorts, and as the existing polygenic risk score (PRS) models have limited transferability across ancestry groups, PRS research on non-European-ancestry groups needs to make efficient use of available data until we attain large sample sizes across all ancestry groups. Here we propose a PRS method using transfer learning techniques. Our approach, TL-PRS, uses gradient descent to fine-tune the baseline PRS model from an ancestry group with large sample GWASs to the dataset of target ancestry. In our application of constructing PRS for seven quantitative and two dichotomous traits for 10,285 individuals of South Asian ancestry and 8,168 individuals of African ancestry in UK Biobank, TL-PRS using PRS-CS as a baseline method obtained 25% average relative improvement for South Asian samples and 29% for African samples compared to the standard PRS-CS method in terms of predicted R². Our approach increases the transferability of PRSs across ancestries and thereby helps reduce existing inequities in genetics research.
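The fine-tuning idea in this abstract can be illustrated with a toy sketch: start from baseline effect sizes and take a few gradient-descent steps on target-ancestry data. This is not the TL-PRS implementation. It assumes individual-level genotypes and a squared-error loss, and every name and number in it (`fine_tune`, `baseline_weights`, the learning rate, the data) is hypothetical.

```python
# Toy sketch of fine-tuning baseline PRS weights by gradient descent.
# Assumptions (not from the abstract): individual-level data, squared-error
# loss, a fixed number of steps. All names and numbers are illustrative only.

def predict(genotypes, weights):
    """Polygenic score per individual: weighted sum of allele counts."""
    return [sum(g * w for g, w in zip(row, weights)) for row in genotypes]

def mse(genotypes, phenotypes, weights):
    """Mean squared error between scores and observed phenotypes."""
    preds = predict(genotypes, weights)
    return sum((p - y) ** 2 for p, y in zip(preds, phenotypes)) / len(phenotypes)

def fine_tune(genotypes, phenotypes, baseline_weights, lr=0.05, steps=20):
    """Run a few gradient-descent steps starting from the baseline weights."""
    weights = list(baseline_weights)
    n = len(phenotypes)
    for _ in range(steps):
        preds = predict(genotypes, weights)
        residuals = [p - y for p, y in zip(preds, phenotypes)]
        # Gradient of (1 / 2n) * sum of squared residuals w.r.t. each weight.
        grad = [
            sum(r * row[j] for r, row in zip(residuals, genotypes)) / n
            for j in range(len(weights))
        ]
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

# Hypothetical target-ancestry data: 6 individuals, 3 variants (allele counts).
target_genotypes = [[0, 1, 2], [1, 0, 1], [2, 1, 0], [1, 2, 1], [0, 0, 2], [2, 2, 1]]
target_phenotypes = [0.9, 0.3, 0.5, 0.8, 0.7, 1.1]
baseline_weights = [0.30, 0.10, 0.20]  # stands in for weights from a large GWAS

tuned_weights = fine_tune(target_genotypes, target_phenotypes, baseline_weights)
print(mse(target_genotypes, target_phenotypes, baseline_weights))
print(mse(target_genotypes, target_phenotypes, tuned_weights))
```

Stopping after a small number of steps keeps the tuned weights close to the baseline, which is one simple way to borrow strength from the larger source-ancestry study. A real analysis would evaluate the tuned weights on held-out data rather than on the data used for tuning.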


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Multifactorial Inheritance/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide/genetics , Risk Factors , Machine Learning
6.
Nat Genet ; 54(10): 1466-1469, 2022 10.
Article in English | MEDLINE | ID: mdl-36138231

ABSTRACT

Several biobanks, including UK Biobank (UKBB), are generating large-scale sequencing data. An existing method, SAIGE-GENE, performs well when testing variants with minor allele frequency (MAF) ≤ 1%, but inflation is observed in variance component set-based tests when restricting to variants with MAF ≤ 0.1% or 0.01%. Here, we propose SAIGE-GENE+ with greatly improved type I error control and computational efficiency to facilitate rare variant tests in large-scale data. We further show that incorporating multiple MAF cutoffs and functional annotations can improve power and thus uncover new gene-phenotype associations. In the analysis of UKBB whole exome sequencing data for 30 quantitative and 141 binary traits, SAIGE-GENE+ identified 551 gene-phenotype associations.


Subject(s)
Genome-Wide Association Study , Gene Frequency/genetics , Genome-Wide Association Study/methods , Phenotype , Exome Sequencing
7.
Planta ; 255(6): 114, 2022 May 04.
Article in English | MEDLINE | ID: mdl-35507093

ABSTRACT

MAIN CONCLUSION: Rht5 was narrowed to an approximately 1 Mb interval and had pleiotropic effects on plant height, spike length and grain size. TraesCS3B02G025600 was predicted as the possible candidate gene. Plant height is an important component related to plant architecture, lodging resistance, and yield performance. The utilization of dwarf genes has made great contributions to wheat breeding and production. In this study, two F2 populations derived from the crosses of Jinmai47 and Ningchun45 with Marfed M were employed to identify the genetic region of reduced height 5 (Rht5), and their derived lines were used to evaluate its effects on plant height and main agronomic traits. Rht5 was fine-mapped between markers Kasp-25 and Kasp-23, in an approximately 1 Mb region on chromosome 3BS, which harbored 17 high-confidence annotated genes based on the reference genome of Chinese Spring (IWGSC RefSeq v1.1). TraesCS3B02G025600 was predicted as the possible candidate gene based on its differential expression and sequence variation between dwarf and tall lines and parents. The results of phenotypic evaluation showed that Rht5 had pleiotropic effects on plant height, spike length, culm diameter, grain size and grain yield. The plant height of Rht5 dwarf lines was reduced by an average of 32.67% (32.53 cm) and 27.84% (33.62 cm) in the Jinmai47 and Ningchun45 populations, respectively. Rht5 also showed significant negative pleiotropic effects on culm diameter, aboveground biomass, grain yield, spike length, spikelet number, grain number per spike, grain size, grain weight and filling degree of the basal second internode. The culm lodging resistance index (CLRI) of dwarf lines was significantly higher than that of tall lines in both populations. In conclusion, these results lay a foundation for understanding the dwarfing mechanism of Rht5.


Subject(s)
Bread , Triticum , Edible Grain/genetics , Genes, Plant/genetics , Phenotype , Plant Breeding , Triticum/genetics
8.
J Clin Med ; 10(19), 2021 Sep 24.
Article in English | MEDLINE | ID: mdl-34640359

ABSTRACT

Testing for SARS-CoV-2 antibodies is commonly used to determine prior COVID-19 infections and to gauge levels of infection- or vaccine-induced immunity. Michigan Medicine, a primary regional health center, provided an ideal setting to understand serologic testing patterns over time. Between 27 April 2020 and 3 May 2021, characteristics for 10,416 individuals presenting for SARS-CoV-2 antibody tests (10,932 tests in total) were collected. Relative to the COVID-19 vaccine roll-out date, 14 December 2020, the data were split into a pre- (8026 individuals) and post-vaccine launch (2587 individuals) period and contrasted with untested individuals to identify factors associated with tested individuals and seropositivity. Exploratory analysis of vaccine-mediated seropositivity was performed in 347 fully vaccinated individuals. Predictors of tested individuals included age, sex, smoking, neighborhood variables, and pre-existing conditions. Seropositivity in the pre-vaccine launch period was 9.2% and increased to 46.7% in the post-vaccine launch period. In the pre-vaccine launch period, seropositivity was significantly associated with age (per 10 years; OR = 0.80 (0.73, 0.89)), ever-smoker status (0.49 (0.35, 0.67)), respiratory disease (4.38 (3.13, 6.12)), circulatory disease (2.09 (1.48, 2.96)), liver disease (2.06 (1.11, 3.84)), non-Hispanic Black race/ethnicity (2.18 (1.33, 3.58)), and population density (1.10 (1.03, 1.18)). Except for the latter two, these associations remained statistically significant in the post-vaccine launch period. The positivity rate of fully vaccinated individuals was 296/347 (85.3% (81.0%, 88.8%)).
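The final figure above is a binomial proportion with a 95% interval. The abstract does not say which interval method was used, so the sketch below assumes an exact (Clopper-Pearson) interval and computes it with the standard library only. The function names are hypothetical, and the result is meant for comparison with the reported interval, not as a reproduction of the authors' calculation.

```python
import math

# Sketch under an assumption: an exact (Clopper-Pearson) 95% interval.
# The abstract does not state the method; function names are illustrative.

def binom_pmf(k, n, p):
    """Binomial probability mass, computed in log space for stability."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)

def upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

def lower_tail(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

def clopper_pearson(x, n, alpha=0.05, iters=100):
    """Exact two-sided interval by bisection; assumes 0 < x < n."""
    # Lower bound: p at which P(X >= x) equals alpha / 2 (increasing in p).
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(iters):
        mid = (lo + hi) / 2
        if upper_tail(x, n, mid) < alpha / 2:
            lo = mid
        else:
            hi = mid
    lower = (lo + hi) / 2
    # Upper bound: p at which P(X <= x) equals alpha / 2 (decreasing in p).
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(iters):
        mid = (lo + hi) / 2
        if lower_tail(x, n, mid) > alpha / 2:
            lo = mid
        else:
            hi = mid
    upper = (lo + hi) / 2
    return lower, upper

positives, vaccinated = 296, 347
ci_low, ci_high = clopper_pearson(positives, vaccinated)
print(f"{100 * positives / vaccinated:.1f}% ({100 * ci_low:.1f}%, {100 * ci_high:.1f}%)")
```

Other common choices, such as the Wilson score interval, give slightly narrower bounds for the same counts.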

9.
BMC Genomics ; 22(1): 519, 2021 Jul 08.
Article in English | MEDLINE | ID: mdl-34238217

ABSTRACT

BACKGROUND: Amino acid transporters (AATs) play essential roles in the growth and development of plants, including long-range amino acid transport, seed germination, quality formation, and responsiveness to pathogenic bacteria and abiotic stress, by modulating the transmembrane transfer of amino acids. In this study, we performed a genome-wide screening to analyze the AAT genes in foxtail millet (Setaria italica L.), especially those associated with quality formation and abiotic stress responses. RESULTS: A total of 94 AAT genes were identified and divided into 12 subfamilies by their sequence characteristics and phylogenetic relationship. A large number (58/94, 62%) of AAT genes in foxtail millet were expanded via gene duplication, involving 13 tandem and 12 segmental duplication events. Tandemly duplicated genes had a significant impact on their functional differentiation via sequence variation, structural variation and expression variation. Further comparison in multiple species showed that in addition to paralogous genes, the expression variations of the orthologous AAT genes also contributed to their functional differentiation. The transcriptomic comparison of two millet cultivars verified the direct contribution of AAT genes such as SiAAP1, SiAAP8, and SiAUX2 to the formation of grain quality. In addition, the qRT-PCR analysis suggested that several AAT genes, such as SiATLb1 and SiANT1, continuously responded to diverse abiotic stresses. Finally, combined with the previous studies and analysis on sequence characteristics and expression patterns of AAT genes, the possible functions of the foxtail millet AAT genes were predicted. 
CONCLUSION: This study for the first time reported the evolutionary features, functional differentiation, roles in the quality formation and response to abiotic stresses of foxtail millet AAT gene family, thus providing a framework for further functional analysis of SiAAT genes, and also contributing to the applications of AAT genes in improving the quality and resistance to abiotic stresses of foxtail millet, and other cereal crops.


Subject(s)
Setaria Plant , Amino Acid Transport Systems , Gene Expression Regulation, Plant , Phylogeny , Plant Proteins/genetics , Plant Proteins/metabolism , Setaria Plant/genetics , Setaria Plant/metabolism , Stress, Physiological/genetics
10.
medRxiv ; 2020 Jul 29.
Article in English | MEDLINE | ID: mdl-32793922

ABSTRACT

IMPORTANCE: The diagnostic tests for COVID-19 have a high false negative rate, but not everyone with an initial negative result is re-tested. Michigan Medicine, being one of the primary regional centers accepting COVID-19 cases, provided an ideal setting for studying COVID-19 repeated testing patterns during the first wave of the pandemic. OBJECTIVE: To identify the characteristics of patients who underwent repeated testing for COVID-19 and determine if repeated testing was associated with patient characteristics and with downstream outcomes among positive cases. DESIGN: This cross-sectional study described the pattern of testing for COVID-19 at Michigan Medicine. The main hypothesis under consideration is whether patient characteristics differed between those tested once and those who underwent multiple tests. We then restrict our attention to those that had at least one positive test and study repeated testing patterns in patients with severe COVID-19 related outcomes (testing positive, hospitalization and ICU care). SETTING: Demographic and clinical characteristics, test results, and health outcomes for 15,920 patients presenting to Michigan Medicine between March 10 and June 4, 2020 for a diagnostic test for COVID-19 were collected from their electronic medical records on June 24, 2020. Data on the number and types of tests administered to a given patient, as well as the sequences of patient-specific test results were derived from records of patient laboratory results. PARTICIPANTS: Anyone tested between March 10 and June 4, 2020 at Michigan Medicine with a diagnostic test for COVID-19 in their Electronic Health Records were included in our analysis. 
EXPOSURES: Comparison of repeated testing across patient demographics, clinical characteristics, and patient outcomes. MAIN OUTCOMES AND MEASURES: Whether patients underwent repeated diagnostic testing for SARS-CoV-2 in Michigan Medicine. RESULTS: Between March 10th and June 4th, 19,540 tests were ordered for 15,920 patients, with most patients only tested once (13596, 85.4%) and never testing positive (14753, 92.7%). There were 5 patients who got tested 10 or more times and there were substantial variations in test results within a patient. After fully adjusting for patient and neighborhood socioeconomic status (NSES) and demographic characteristics, patients with circulatory diseases (OR: 1.42; 95% CI: (1.18, 1.72)), any cancer (OR: 1.14; 95% CI: (1.01, 1.29)), Type 2 diabetes (OR: 1.22; 95% CI: (1.06, 1.39)), kidney diseases (OR: 1.95; 95% CI: (1.71, 2.23)), and liver diseases (OR: 1.30; 95% CI: (1.11, 1.50)) were found to have higher odds of undergoing repeated testing when compared to those without. Additionally, as compared to non-Hispanic whites, non-Hispanic blacks were found to have higher odds (OR: 1.21; 95% CI: (1.03, 1.43)) of receiving additional testing. Females were found to have lower odds (OR: 0.86; 95% CI: (0.76, 0.96)) of receiving additional testing than males. Neighborhood poverty level also affected whether to receive additional testing. For 1% increase in proportion of population with annual income below the federal poverty level, the odds ratio of receiving repeated testing is 1.01 (OR: 1.01; 95% CI: (1.00, 1.01)). Focusing on only those 1167 patients with at least one positive result in their full testing history, patient age in years (OR: 1.01; 95% CI: (1.00, 1.03)), prior history of kidney diseases (OR: 2.15; 95% CI: (1.36, 3.41)) remained significantly different between patients who underwent repeated testing and those who did not. 
After adjusting for both patient demographic factors and NSES, hospitalization (OR: 7.44; 95% CI: (4.92, 11.41)) and ICU-level care (OR: 6.97; 95% CI: (4.48, 10.98)) were significantly associated with repeated testing. Of these 1167 patients, 306 got repeated testing and 1118 tests were done on these 306 patients, of which 810 (72.5%) were done during inpatient stays, substantiating that most repeated tests for test positive patients were done during hospitalization or ICU care. Additionally, using repeated testing data we estimate the "real world" false negative rate of the RT-PCR diagnostic test was 23.8% (95% CI: (19.5%, 28.5%)). CONCLUSIONS AND RELEVANCE: This study sought to quantify the pattern of repeated testing for COVID-19 at Michigan Medicine. While most patients were tested once and received a negative result, a meaningful subset of patients (2324, 14.6% of the population who got tested) underwent multiple rounds of testing (5,944 tests were done in total on these 2324 patients, with an average of 2.6 tests per person), with 10 or more tests for five patients. Both hospitalizations and ICU care differed significantly between patients who underwent repeated testing versus those only tested once as expected. These results shed light on testing patterns and have important implications for understanding the variation of repeated testing results within and between patients.
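The counts quoted in this abstract are internally consistent, which a few lines of arithmetic can confirm. The sketch below uses only numbers stated in the abstract; the variable names and the `pct` helper are hypothetical.

```python
# Consistency check on counts quoted in the abstract.
# Variable names and the pct helper are illustrative, not from the source.

patients = 15920
tests = 19540
tested_once = 13596
repeat_patients = 2324
repeat_tests = 5944
never_positive = 14753
positive_patients = 1167
repeat_tests_in_positives = 1118
inpatient_repeat_tests = 810

def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * part / whole, 1)

# Patients split into tested-once and repeatedly tested groups.
assert tested_once + repeat_patients == patients
# Each once-tested patient contributes one test; the rest are repeat tests.
assert tested_once + repeat_tests == tests
# Patients split into never-positive and ever-positive groups.
assert never_positive + positive_patients == patients

print(pct(tested_once, patients))       # share tested once
print(pct(repeat_patients, patients))   # share tested repeatedly
print(pct(never_positive, patients))    # share never testing positive
print(pct(inpatient_repeat_tests, repeat_tests_in_positives))  # inpatient share
print(round(repeat_tests / repeat_patients, 1))  # mean tests per repeat patient
```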

11.
Front Plant Sci ; 11: 1091, 2020.
Article in English | MEDLINE | ID: mdl-32849679

ABSTRACT

In wheat breeding, improved quality traits, including grain quality and dough rheological properties, have long been a critical goal. To understand the genetic basis of key quality traits of wheat, two single-locus and five multi-locus GWAS models were applied to six grain quality traits and three dough rheological properties based on 19,254 SNPs in 267 bread wheat accessions. As a result, 299 quantitative trait nucleotides (QTNs) within 105 regions were identified to be associated with these quality traits in four environments. Of these, 40 core QTN regions were stably detected in at least three environments, 19 of which were novel. Compared with the previous studies, these novel QTN regions explained smaller phenotypic variation, which verified the advantages of the multi-locus GWAS models in detecting important small effect QTNs associated with complex traits. After in-depth characterization of function and expression, 67 core candidate genes involved in protein/sugar synthesis, histone modification and the regulation of transcription factors were observed to be associated with the formation of grain quality, which showed that multi-level regulations influenced wheat grain quality. Finally, a preliminary network of gene regulation that may affect wheat quality formation was inferred. This study verified the power and reliability of multi-locus GWAS methods in wheat quality trait research, and increased the understanding of wheat quality formation mechanisms. The detected QTN regions and candidate genes in this study could be further used for gene cloning and marker-assisted selection in high-quality breeding of bread wheat.

12.
Nat Genet ; 52(6): 634-639, 2020 06.
Article in English | MEDLINE | ID: mdl-32424355

ABSTRACT

With very large sample sizes, biobanks provide an exciting opportunity to identify genetic components of complex traits. To analyze rare variants, region-based multiple-variant aggregate tests are commonly used to increase power for association tests. However, because of the substantial computational cost, existing region-based tests cannot analyze hundreds of thousands of samples while accounting for confounders such as population stratification and sample relatedness. Here we propose a scalable generalized mixed-model region-based association test, SAIGE-GENE, that is applicable to exome-wide and genome-wide region-based analysis for hundreds of thousands of samples and can account for unbalanced case-control ratios for binary traits. Through extensive simulation studies and analysis of the HUNT study with 69,716 Norwegian samples and the UK Biobank data with 408,910 White British samples, we show that SAIGE-GENE can efficiently analyze large-sample data (N > 400,000) with type I error rates well controlled.


Subject(s)
Biological Specimen Banks/statistics & numerical data , Case-Control Studies , Exome , Linear Models , Genetic Markers , Humans , Lipoproteins, HDL/genetics , Models, Genetic , Multifactorial Inheritance , Norway , United Kingdom , Waist-Hip Ratio
13.
Front Immunol ; 11: 621757, 2020.
Article in English | MEDLINE | ID: mdl-33603751

ABSTRACT

Evasion of immunosurveillance is critical for cancer initiation and development. The expression of "don't eat me" signals protects cancer cells from being phagocytosed by macrophages, and the blockade of such signals demonstrates therapeutic potential by restoring the susceptibility of cancer cells to macrophage-mediated phagocytosis. However, whether additional self-protective mechanisms play a role against macrophage surveillance remains unexplored. Here, we derived a macrophage-resistant cancer model from cells deficient in the expression of CD47, a major "don't eat me" signal, via a macrophage selection assay. Comparative studies performed between the parental and resistant cells identified self-protective traits independent of CD47, which were examined with both pharmacological or genetic approaches in in vitro phagocytosis assays and in vivo tumor models for their roles in protecting against macrophage surveillance. Here we demonstrated that extracellular acidification resulting from glycolysis in cancer cells protected them against macrophage-mediated phagocytosis. The acidic tumor microenvironment resulted in direct inhibition of macrophage phagocytic ability and recruitment of weakly phagocytic macrophages. Targeting V-ATPase which transports excessive protons in cancer cells to acidify extracellular medium elicited a pro-phagocytic microenvironment with an increased ratio of M1-/M2-like macrophage populations, therefore inhibiting tumor development and metastasis. In addition, blockade of extracellular acidification enhanced cell surface exposure of CD71, targeting which by antibodies promoted cancer cell phagocytosis. Our results reveal that extracellular acidification due to the Warburg effect confers immune evasion ability on cancer cells. This previously unrecognized role highlights the components mediating the Warburg effect as potential targets for new immunotherapy harnessing the tumoricidal capabilities of macrophages.


Subject(s)
Immunologic Surveillance , Macrophages/immunology , Neoplasms, Experimental/immunology , Tumor Escape , Warburg Effect, Oncologic , Animals , Cell Line, Tumor , Humans , Macrophages/pathology , Mice , Mice, Inbred BALB C , Mice, Inbred NOD , Mice, Knockout , Neoplasms, Experimental/pathology
14.
Epidemiology ; 31(2): 194-204, 2020 03.
Article in English | MEDLINE | ID: mdl-31809338

ABSTRACT

Latent class models have become a popular means of summarizing survey questionnaires and other large sets of categorical variables. Often these classes are of primary interest to better understand complex patterns in data. Increasingly, these latent classes are reified into predictors of other outcomes of interest, treating the most likely class as the true class to which an individual belongs even though there is uncertainty in class membership. This uncertainty can be viewed as a form of measurement error in predictors, leading to bias in the estimates of the regression parameters associated with the latent classes. Despite this fact, there is very limited literature treating latent class predictors as measurement error models. Most applications ignore this issue and fit a two-stage model that treats the modal class prediction as truth. Here, we develop two approaches, one likelihood-based and the other Bayesian, to implement a joint model for latent class analysis and outcome prediction. We apply these methods to an analysis of how acculturation behaviors predict depression in South Asian immigrants to the United States. A simulation study gives guidance for when a two-stage model can be safely implemented and when the joint model may be required.


Subject(s)
Epidemiologic Methods , Latent Class Analysis , Uncertainty , Acculturation , Asia/ethnology , Bayes Theorem , Depression/epidemiology , Emigrants and Immigrants/psychology , Emigrants and Immigrants/statistics & numerical data , Female , Humans , Likelihood Functions , Male , Models, Statistical , United States/epidemiology
15.
Am J Hum Genet ; 106(1): 3-12, 2020 01 02.
Article in English | MEDLINE | ID: mdl-31866045

ABSTRACT

In biobank data analysis, most binary phenotypes have unbalanced case-control ratios, and this can cause inflation of type I error rates. Recently, a saddle point approximation (SPA) based single-variant test has been developed to provide an accurate and scalable method to test for associations of such phenotypes. For gene- or region-based multiple-variant tests, a few methods exist that can adjust for unbalanced case-control ratios; however, these methods are either less accurate when case-control ratios are extremely unbalanced or not scalable for large data analyses. To address these problems, we propose SKAT- and SKAT-O-type region-based tests; in these tests, the single-variant score statistic is calibrated based on SPA and efficient resampling (ER). Through simulation studies, we show that the proposed method provides well-calibrated p values. In contrast, when the case-control ratio is 1:99, the unadjusted approach has greatly inflated type I error rates (90 times that of exome-wide sequencing α = 2.5 × 10⁻⁶). Additionally, the proposed method has similar computation time to the unadjusted approaches and is scalable for large sample data. In our application, the UK Biobank whole-exome sequence data analysis of 45,596 unrelated European samples and 791 PheCode phenotypes identified 10 rare-variant associations with p value < 10⁻⁷, including the associations between JAK2 and myeloproliferative disease, HOXB13 and cancer of prostate, and F11 and congenital coagulation defects. All analysis summary results are publicly available through a web-based visual server, and this availability can help facilitate the identification of the genetic basis of complex diseases.
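To make the size of the reported inflation concrete, the short calculation below works out the empirical type I error rate implied by "90 times" the exome-wide threshold. The Bonferroni reading of that threshold (0.05 divided by roughly 20,000 genes) is a common convention assumed here, not something the abstract states.

```latex
% Assumption: the exome-wide threshold is a Bonferroni correction
% for about 20,000 genes; the abstract gives only the value of alpha.
\alpha = \frac{0.05}{20\,000} = 2.5 \times 10^{-6}
\qquad
\text{unadjusted type I error} \approx 90 \times \alpha
  = 90 \times 2.5 \times 10^{-6} = 2.25 \times 10^{-4}
```

Under that reading, an unadjusted test at a 1:99 case-control ratio would produce roughly 90 false positives for every one expected at the nominal level.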


Subject(s)
Biological Specimen Banks , Exome Sequencing/methods , Exome/genetics , Genome-Wide Association Study , Phenomics , Polymorphism, Single Nucleotide , Case-Control Studies , Computer Simulation , Humans , Numerical Analysis, Computer-Assisted , Phenotype , United Kingdom
16.
Am J Hum Genet ; 105(6): 1182-1192, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31735295

ABSTRACT

The etiology of most complex diseases involves genetic variants, environmental factors, and gene-environment interaction (G × E) effects. Compared with marginal genetic association studies, G × E analysis requires more samples and detailed measures of environmental exposures, and this limits the possible discoveries. Large-scale population-based biobanks with detailed phenotypic and environmental information, such as UK-Biobank, can be ideal resources for identifying G × E effects. However, due to the large computation cost and the presence of case-control imbalance, existing methods often fail. Here we propose a scalable and accurate method, SPAGE (SaddlePoint Approximation implementation of G × E analysis), that is applicable for genome-wide scale phenome-wide G × E studies. SPAGE fits a genotype-independent logistic model only once across the genome-wide analysis in order to reduce computation cost, and SPAGE uses a saddlepoint approximation (SPA) to calibrate the test statistics for analysis of phenotypes with unbalanced case-control ratios. Simulation studies show that SPAGE is 33-79 times faster than the Wald test and 72-439 times faster than Firth's test, and SPAGE can control type I error rates at the genome-wide significance level even when case-control ratios are extremely unbalanced. Through the analysis of UK-Biobank data of 344,341 white British European-ancestry samples, we show that SPAGE can efficiently analyze large samples while controlling for unbalanced case-control ratios.


Subject(s)
Biological Specimen Banks , Gene-Environment Interaction , Genetic Diseases, Inborn/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , Case-Control Studies , Female , Genetic Diseases, Inborn/epidemiology , Humans , Logistic Models , Male , Phenomics , Phenotype , United Kingdom/epidemiology
17.
Environ Health ; 16(1): 102, 2017 09 26.
Article in English | MEDLINE | ID: mdl-28950902

ABSTRACT

BACKGROUND: There is growing concern of health effects of exposure to pollutant mixtures. We initially proposed an Environmental Risk Score (ERS) as a summary measure to examine the risk of exposure to multi-pollutants in epidemiologic research considering only pollutant main effects. We expand the ERS by consideration of pollutant-pollutant interactions using modern machine learning methods. We illustrate the multi-pollutant approaches to predicting a marker of oxidative stress (gamma-glutamyl transferase (GGT)), a common disease pathway linking environmental exposure and numerous health endpoints. METHODS: We examined 20 metal biomarkers measured in urine or whole blood from 6 cycles of the National Health and Nutrition Examination Survey (NHANES 2003-2004 to 2013-2014, n = 9664). We randomly split the data evenly into training and testing sets and constructed ERS's of metal mixtures for GGT using adaptive elastic-net with main effects and pairwise interactions (AENET-I), Bayesian additive regression tree (BART), Bayesian kernel machine regression (BKMR), and Super Learner in the training set and evaluated their performances in the testing set. We also evaluated the associations between GGT-ERS and cardiovascular endpoints. RESULTS: ERS based on AENET-I performed better than other approaches in terms of prediction errors in the testing set. Important metals identified in relation to GGT include cadmium (urine), dimethylarsonic acid, monomethylarsonic acid, cobalt, and barium. All ERS's showed significant associations with systolic and diastolic blood pressure and hypertension. For hypertension, one SD increase in each ERS from AENET-I, BART and SuperLearner were associated with odds ratios of 1.26 (95% CI, 1.15, 1.38), 1.17 (1.09, 1.25), and 1.30 (1.20, 1.40), respectively. ERS's showed non-significant positive associations with mortality outcomes. 
CONCLUSIONS: ERS is a useful tool for characterizing cumulative risk from pollutant mixtures while accounting for statistical challenges such as high degrees of correlation and pollutant-pollutant interactions. ERS constructed for an intermediate marker like GGT is predictive of related disease endpoints.


Subject(s)
Cardiovascular Diseases/epidemiology , Environmental Pollutants/adverse effects , Linear Models , Machine Learning , Metals/adverse effects , Oxidative Stress/drug effects , Risk Assessment/methods , Adult , Aged , Cardiovascular Diseases/chemically induced , Female , Humans , Male , Middle Aged , Prevalence , United States/epidemiology , Young Adult