1.
Hum Mol Genet ; 32(18): 2842-2855, 2023 09 05.
Article in English | MEDLINE | ID: mdl-37471639

ABSTRACT

Pulmonary surfactant is a lipoprotein synthesized and secreted by alveolar type II cells in the lung. We evaluated the associations between 200,139 single nucleotide polymorphisms (SNPs) of 40 surfactant-related genes and lung cancer risk using genotyped data from two independent lung cancer genome-wide association studies. Discovery data included 18,082 cases and 13,780 controls of European ancestry. Replication data included 1,914 cases and 3,065 controls of European descent. Using multivariate logistic regression, we found novel SNPs in surfactant-related genes CTSH [rs34577742 C > T, odds ratio (OR) = 0.90, 95% confidence interval (CI) = 0.89-0.93, P = 7.64 × 10^-9] and SFTA2 (rs3095153 G > A, OR = 1.16, 95% CI = 1.10-1.21, P = 1.27 × 10^-9) associated with overall lung cancer in the discovery data and validated in independent replication data: CTSH (rs34577742 C > T, OR = 0.88, 95% CI = 0.80-0.96, P = 5.76 × 10^-3) and SFTA2 (rs3095153 G > A, OR = 1.14, 95% CI = 1.01-1.28, P = 3.25 × 10^-2). Among ever smokers, we found SNPs in CTSH (rs34577742 C > T, OR = 0.89, 95% CI = 0.85-0.92, P = 1.94 × 10^-7) and SFTA2 (rs3095152 G > A, OR = 1.20, 95% CI = 1.14-1.27, P = 4.25 × 10^-11) associated with overall lung cancer in the discovery data and validated in the replication data: CTSH (rs34577742 C > T, OR = 0.88, 95% CI = 0.79-0.97, P = 1.64 × 10^-2) and SFTA2 (rs3095152 G > A, OR = 1.15, 95% CI = 1.01-1.30, P = 3.81 × 10^-2). A subsequent transcriptome-wide association study (TWAS) using expression weights from a lung expression quantitative trait loci study revealed that the genes most strongly associated with lung cancer were CTSH (P_TWAS = 2.44 × 10^-4) and SFTA2 (P_TWAS = 2.32 × 10^-6).
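
As a rough consistency check on figures of this kind, the relation between an odds ratio, its 95% confidence interval and the P value can be sketched as follows. This is a minimal illustration that assumes a Wald test on the log-odds scale (beta = ln OR, with the standard error recovered from the CI width); the abstract does not state which test was used, and the function name is hypothetical.

```python
import math

def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided Wald P value from an OR and its 95% CI.

    Assumes the CI was built as exp(beta +/- z_crit * SE). That is an
    illustrative assumption, not a method stated in the abstract.
    """
    beta = math.log(odds_ratio)                      # log-odds effect size
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se                                    # Wald statistic
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided normal tail
    return beta, se, z, p

# SFTA2 rs3095153 discovery estimate quoted in the abstract.
beta, se, z, p = wald_p_from_or_ci(1.16, 1.10, 1.21)
print(f"beta={beta:.4f} SE={se:.4f} z={z:.2f} p={p:.2e}")
```

Because the published OR and CI are rounded to two decimals, the recovered P value should only be expected to agree with the reported one in order of magnitude.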


Subject(s)
Lung Neoplasms , Pulmonary Surfactants , Humans , Genome-Wide Association Study , Lung/metabolism , Genotype , Pulmonary Surfactants/metabolism , Surface-Active Agents/metabolism , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease , Cathepsin H/genetics , Cathepsin H/metabolism
2.
Hum Mol Genet ; 31(16): 2831-2843, 2022 08 23.
Article in English | MEDLINE | ID: mdl-35138370

ABSTRACT

Differences by sex in lung cancer incidence and mortality have been reported that cannot be fully explained by sex differences in smoking behavior, implying the existence of a genetic and molecular basis for sex disparity in lung cancer development. However, information about sex dimorphism in lung cancer risk is quite limited despite the great success of lung cancer association studies. By adopting a stringent two-stage analysis strategy, we performed a genome-wide gene-sex interaction analysis using genotypes from a lung cancer cohort including ~47 000 individuals with European ancestry. Three low-frequency variants (minor allele frequency < 0.05), rs17662871 [odds ratio (OR) = 0.71, P = 4.29 × 10^-8], rs79942605 (OR = 2.17, P = 2.81 × 10^-8) and rs208908 (OR = 0.70, P = 4.54 × 10^-8), were identified with different effects on lung cancer risk in men and women. Further expression quantitative trait loci and functional annotation analysis suggested that rs208908 affects lung cancer risk through differential regulation of Coxsackie virus and adenovirus receptor gene expression in lung tissues between men and women. Our study is one of the first to provide novel insights into the genetic and molecular basis for sex disparity in lung cancer development.


Subject(s)
Genome-Wide Association Study , Lung Neoplasms , Case-Control Studies , Female , Genetic Predisposition to Disease , Humans , Lung , Lung Neoplasms/epidemiology , Lung Neoplasms/genetics , Male , Polymorphism, Single Nucleotide/genetics
3.
Am J Hematol ; 99(7): 1230-1239, 2024 07.
Article in English | MEDLINE | ID: mdl-38654461

ABSTRACT

Venous thromboembolism (VTE) poses a significant risk to cancer patients receiving systemic therapy. The generalizability of pan-cancer models to lymphomas is limited. Currently, there are no reliable risk prediction models for thrombosis in patients with lymphoma. Our objective was to create a risk assessment model (RAM) specifically for lymphomas. We performed a retrospective cohort study to develop Fine and Gray subdistribution hazard models for VTE and for pulmonary embolism (PE)/lower extremity deep vein thrombosis (LE-DVT), respectively, in adult lymphoma patients from the Veterans Affairs national healthcare system (VA). External validations were performed at the Harris Health System (HHS) and the MD Anderson Cancer Center (MDACC). Time-dependent c-statistics and calibration curves were used to assess discrimination and fit. There were 10,313 (VA), 854 (HHS), and 1,858 (MDACC) patients in the derivation and validation cohorts, with diverse baseline characteristics. At 6 months, the VTE incidence was 5.8% (VA), 8.2% (HHS), and 8.8% (MDACC), respectively. The corresponding estimates for PE/LE-DVT were 3.9% (VA), 4.5% (HHS), and 3.7% (MDACC), respectively. The variables in the final RAM included lymphoma histology, body mass index, therapy type, recent hospitalization, history of VTE, history of paralysis/immobilization, and time to treatment initiation. The RAM had c-statistics of 0.68 in the derivation cohort and 0.69 and 0.72 in the two external validation cohorts. The two models achieved a clear differentiation in risk stratification in each cohort. Our findings suggest that an easy-to-implement, clinically based model could be used to predict personalized VTE risk for lymphoma patients.


Subject(s)
Lymphoma , Venous Thromboembolism , Humans , Retrospective Studies , Lymphoma/complications , Lymphoma/epidemiology , Middle Aged , Female , Male , Aged , Risk Assessment , Venous Thromboembolism/etiology , Venous Thromboembolism/epidemiology , Adult , Pulmonary Embolism/etiology , Pulmonary Embolism/epidemiology , Venous Thrombosis/etiology , Venous Thrombosis/epidemiology , Risk Factors , Incidence , Aged, 80 and over
4.
PLoS Genet ; 17(3): e1009254, 2021 03.
Article in English | MEDLINE | ID: mdl-33667223

ABSTRACT

Squamous cell carcinomas (SqCC) of the aerodigestive tract have similar etiological risk factors. Although genetic risk variants for individual cancers have been identified, an agnostic, genome-wide search for shared genetic susceptibility has not been performed. To identify novel and pleiotropic SqCC risk variants, we performed a meta-analysis of GWAS data on lung SqCC (LuSqCC), oro/pharyngeal SqCC (OSqCC), laryngeal SqCC (LaSqCC) and esophageal SqCC (ESqCC) cancers, totaling 13,887 cases and 61,961 controls of European ancestry. We identified one novel genome-wide significant (P_meta < 5 × 10^-8) aerodigestive SqCC susceptibility locus in the 2q33.1 region (rs56321285, TMEM273). Additionally, three previously unknown loci reached suggestive significance (P_meta < 5 × 10^-7): 1q32.1 (rs12133735, near MDM4), 5q31.2 (rs13181561, TMEM173) and 19p13.11 (rs61494113, ABHD8). Multiple previously identified loci for aerodigestive SqCC also showed evidence of pleiotropy in at least one other SqCC site; these include 4q23 (ADH1B), 6p21.33 (STK19), 6p21.32 (HLA-DQB1), 9p21.33 (CDKN2B-AS1) and 13q13.1 (BRCA2). Gene-based association and gene set enrichment identified a set of 48 SqCC-related genes related to DNA damage and epigenetic regulation pathways. Our study highlights the importance of cross-cancer analyses to identify pleiotropic risk loci of histology-related cancers arising at distinct anatomical sites.


Subject(s)
Carcinoma, Squamous Cell/genetics , Digestive System Neoplasms/genetics , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Alleles , Biomarkers, Tumor , Carcinoma, Squamous Cell/metabolism , Carcinoma, Squamous Cell/pathology , Digestive System Neoplasms/metabolism , Digestive System Neoplasms/pathology , Genotype , Humans , Odds Ratio , Signal Transduction
5.
Hum Mol Genet ; 31(1): 146-155, 2021 12 17.
Article in English | MEDLINE | ID: mdl-34368847

ABSTRACT

Genotype imputation is widely used in genetic studies to boost the power of GWAS, to combine multiple studies for meta-analysis and to perform fine mapping. With advances of imputation tools and large reference panels, genotype imputation has become mature and accurate. However, the uncertain nature of imputed genotypes can cause bias in the downstream analysis. Many studies have compared the performance of popular imputation approaches, but few investigated bias characteristics of downstream association analyses. Herein, we showed that the imputation accuracy is diminished if the real genotypes contain minor alleles. Although these genotypes are less common, which is particularly true for loci with low minor allele frequency, a large discordance between imputed and observed genotypes significantly inflated the association results, especially in data with a large portion of uncertain SNPs. The significant discordance of P-values happened as the P-value approached 0 or the imputation quality was poor. Although elimination of poorly imputed SNPs can remove false positive (FP) SNPs, it sacrificed, sometimes, more than 80% true positive (TP) SNPs. For top ranked SNPs, removing variants with moderate imputation quality cannot reduce the proportion of FP SNPs, and increasing sample size in reference panels did not greatly benefit the results as well. Additionally, samples with a balanced ratio between cases and controls can dramatically improve the number of TP SNPs observed in the imputation based GWAS. These results raise concerns about results from analysis of association studies when rare variants are studied, particularly when case-control studies are unbalanced.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Alleles , Gene Frequency/genetics , Genome-Wide Association Study/methods , Genotype , Polymorphism, Single Nucleotide/genetics
6.
Nano Lett ; 22(13): 5553-5560, 2022 07 13.
Article in English | MEDLINE | ID: mdl-35708317

ABSTRACT

With the development of flexible devices, it is necessary to design high-performance power supplies with superior flexibility, durability, safety, etc., to ensure that they can be deformed with the device while retaining their electrochemical functions. Herein, we have designed a flexible lithium-ion battery inspired by the DNA helix structure. The battery structure is mainly composed of multiple thick energy stacks for energy storage and some grooves for stress buffers, which realized the spiral deformation of batteries. According to the results, the batteries exhibit less than 3% capacity degradation even after more than 31000 times of in situ dynamic mechanical loadings. Moreover, the mechanism of the battery with spiral deformability is further revealed. It is anticipated that this bioinspired design strategy could create unique opportunities for the commercialization of flexible batteries and fill the current gap in realizing battery-specific deformations to meet various requirements for future complex device designs.


Subject(s)
Electric Power Supplies , Lithium , DNA , Ions , Lithium/chemistry
7.
Hum Genet ; 141(2): 229-238, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34981173

ABSTRACT

Genome wide association studies (GWASs) have identified tens of thousands of single nucleotide polymorphisms (SNPs) associated with human diseases and characteristics. A significant fraction of GWAS findings can be false positives. The gold standard for true positives is an independent validation. The goal of this study was to identify SNP features associated with validation success. Summary statistics from the Catalog of Published GWASs were used in the analysis. Since our goal was an analysis of reproducibility, we focused on the diseases/phenotypes targeted by at least 10 GWASs. GWASs were arranged in discovery-validation pairs based on the time of publication, with the discovery GWAS published before validation. We used four definitions of the validation success that differ by stringency. Associations of SNP features with validation success were consistent across the definitions. The strongest predictor of SNP validation was the level of statistical significance in the discovery GWAS. The magnitude of the effect size was associated with validation success in a non-linear manner. SNPs with risk allele frequencies in the range 30-70% showed a higher validation success rate compared to rarer or more common SNPs. Missense, 5'UTR, stop gained, and SNPs located in transcription factor binding sites had a higher validation success rate compared to intergenic, intronic and synonymous SNPs. There was a positive association between validation success and the level of evolutionary conservation of the sites. In addition, validation success was higher when discovery and validation GWASs targeted the same ethnicity. All predictors of validation success remained significant in a multivariate logistic regression model indicating their independent contribution. To conclude, we identified SNP features predicting validation success of GWAS hits. These features can be used to select SNPs for validation and downstream functional studies.


Subject(s)
Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Conserved Sequence , Ethnicity/genetics , Gene Frequency , Genetic Association Studies/methods , Genetic Association Studies/statistics & numerical data , Genetic Predisposition to Disease , Genome-Wide Association Study/statistics & numerical data , Humans , Logistic Models , Multivariate Analysis , Odds Ratio , Racial Groups/genetics , Reproducibility of Results
8.
Small ; 18(45): e2204745, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36148862

ABSTRACT

Emerging directions in the growing wearable electronics market have spurred the development of flexible energy storage systems that require deformability while maintaining electrochemical performance. However, the traditional fabrication approaches of lithium-ion batteries (LIBs) are challenging to withstand long-cycle bending alternating loads due to the stress concentration caused by the nonuniformity of the actual deformation. Herein, inspired by kirigami, a segmented deformation design of full-cell scale thin-type flexible lithium-ion batteries (FLIBs) with large-scale manufacturing characteristics via the current collector's mechanical blanking process is reported. This strategy allows the battery's elliptical deformation of the actual state to be transformed into the circular strain of the ideal configuration, thereby dispersing the stress concentration on the top of the battery. According to the results, the designed battery maintains >95% capacity after >20 000 harsh in situ dynamic tests. In addition, finite element analysis further reveals the mechanism that the segmented deformation strategy bears the mechanical stress. This work can enlighten the rational design and customization of electrode patterns for high compatibility with various devices, thereby providing potential opportunities for the application of FLIBs.

9.
J Anim Physiol Anim Nutr (Berl) ; 106(5): 1036-1045, 2022 Sep.
Article in English | MEDLINE | ID: mdl-34668247

ABSTRACT

Yucca schidigera extract (YE) can decrease ammonia concentration in livestock housing, which could be associated with the inhibition of urease. The aim of this study was to investigate other possible mechanisms by which dietary YE supplementation reduces nitrogen emission in weaned piglets. A total of 14 crossbred weaned barrows were allotted to two groups fed diets supplemented with 0 or 120 mg/kg YE for 14 days. YE administration decreased the F/G ratio and hindgut NH3-N production in weaned piglets (p < 0.05). Dietary YE supplementation decreased serum urea nitrogen levels and increased nutrient digestibility, which could be related to the improvement of morphology, digestive and absorptive enzyme activities, and nutrient transporter mRNA expression in the jejunal mucosa of weaned piglets (p < 0.05). The mRNA expression of tight junction proteins, mucins and apoptosis-related genes was also improved by YE treatment in the jejunal mucosa of weaned piglets (p < 0.05). In addition, dietary YE supplementation regulated the microbiota structure and volatile fatty acid content in the distal intestine of weaned piglets (p < 0.05). These results suggest that YE administration can decrease hindgut NH3-N production in weaned piglets, which is associated with increased nutrient utilization and gut-barrier function.


Subject(s)
Yucca , Animals , Dietary Supplements , Nitrogen , Nutrients , Plant Extracts/pharmacology , RNA, Messenger , Swine
10.
IUBMB Life ; 73(2): 328-340, 2021 02.
Article in English | MEDLINE | ID: mdl-33368980

ABSTRACT

Cancer seriously impairs human health and survival. Many perturbations, such as increased oxidative stress, pathogen infection, and inflammation, promote the accumulation of DNA mutations, and ultimately lead to carcinogenesis. Tea is one of the most highly consumed beverages worldwide and has been linked to improvements in human health. Tea contains many active components, including tea polyphenols, tea polysaccharides, L-theanine, tea pigments, and caffeine among other common components. Several studies have identified components in tea that can directly or indirectly reduce carcinogenesis with some being used in a clinical setting. Many previous studies, in vitro and in vivo, have focused on the mechanisms that functional components of tea utilized to protect against cancer. One particular mechanism that has been well described is an improvement in antioxidant capacity seen with tea consumption. However, other mechanisms, including anti-pathogen, anti-inflammation and alterations in cell survival pathways, are also involved. The current review focuses on these anti-cancer mechanisms. This will be beneficial for clinical utilization of tea components in preventing and treating cancer in the future.


Subject(s)
Anti-Infective Agents/pharmacology , Anti-Inflammatory Agents/pharmacology , Carcinogenesis/drug effects , Infections/drug therapy , Neoplasms/drug therapy , Plant Extracts/pharmacology , Tea/chemistry , Animals , Antioxidants/pharmacology , Cell Survival , Host-Pathogen Interactions , Humans , Oxidative Stress
11.
Bioinformatics ; 35(17): 2891-2898, 2019 09 01.
Article in English | MEDLINE | ID: mdl-30649252

ABSTRACT

MOTIVATION: Integration of multiple genetic sources for copy number variation detection (CNV) is a powerful approach to improve the identification of variants associated with complex traits. Although it has been shown that the widely used change point based methods can increase statistical power to identify variants, it remains challenging to effectively detect CNVs with weak signals due to the noisy nature of genotyping intensity data. We previously developed modSaRa, a normal mean-based model on a screening and ranking algorithm for copy number variation identification which presented desirable sensitivity with high computational efficiency. To boost statistical power for the identification of variants, here we present a novel improvement that integrates the relative allelic intensity with external information from empirical statistics with modeling, which we called modSaRa2. RESULTS: Simulation studies illustrated that modSaRa2 markedly improved both sensitivity and specificity over existing methods for analyzing array-based data. The improvement in weak CNV signal detection is the most substantial, while it also simultaneously improves stability when CNV size varies. The application of the new method to a whole genome melanoma dataset identified novel candidate melanoma risk associated deletions on chromosome bands 1p22.2 and duplications on 6p22, 6q25 and 19p13 regions, which may facilitate the understanding of the possible roles of germline copy number variants in the etiology of melanoma. AVAILABILITY AND IMPLEMENTATION: http://c2s2.yale.edu/software/modSaRa2 or https://github.com/FeifeiXiaoUSC/modSaRa2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , DNA Copy Number Variations , Genome-Wide Association Study , Alleles , Data Interpretation, Statistical , Polymorphism, Single Nucleotide , Sensitivity and Specificity , Software
12.
BMC Genet ; 20(1): 85, 2019 11 12.
Article in English | MEDLINE | ID: mdl-31718536

ABSTRACT

BACKGROUND: Over the relatively short history of Genome Wide Association Studies (GWASs), hundreds of GWASs have been published and thousands of disease risk-associated SNPs have been identified. Summary statistics from the conducted GWASs are often available and can be used to identify SNP features associated with the level of GWAS statistical significance. Those features could be used to select SNPs from gray zones (SNPs that are nominally significant but do not reach the genome-wide level of significance) for targeted analyses. METHODS: We used summary statistics from recently published breast and lung cancer and scleroderma GWASs to explore the association between the level of the GWAS statistical significance and the expression quantitative trait loci (eQTL) status of the SNP. Data from the Genotype-Tissue Expression Project (GTEx) were used to identify eQTL SNPs. RESULTS: We found that SNPs reported as eQTLs were more significant in GWAS (higher -log10 p) regardless of the tissue specificity of the eQTL. Pan-tissue eQTLs (those reported as eQTLs in multiple tissues) tended to be more significant in the GWAS compared to those reported as eQTL in only one tissue type. eQTL density in the ±5 kb adjacent region of a given SNP was also positively associated with the level of GWAS statistical significance regardless of the eQTL status of the SNP. We found that SNPs located in the regions of high eQTL density were more likely to be located in regulatory elements (transcription factor or miRNA binding sites). When SNPs were stratified by the level of statistical significance, the proportion of eQTLs was positively associated with the mean level of statistical significance in the group. The association curve reaches a plateau around -log10 p ≈ 5. The observed associations suggest that quasi-significant SNPs (5 × 10^-8 < p < 10^-5) and SNPs at the genome-wide level of statistical significance (p < 5 × 10^-8) may have similar proportions of risk-associated SNPs. CONCLUSIONS: The results of this study indicate that the SNP's eQTL status, as well as eQTL density in the adjacent region, are positively associated with the level of statistical significance of the SNP in GWAS.


Subject(s)
Breast Neoplasms/genetics , Genome-Wide Association Study/methods , Lung Neoplasms/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Scleroderma, Systemic/genetics , Female , Gene Expression Profiling , Gene Expression Regulation , Genetic Predisposition to Disease , Humans , Male , Models, Statistical , Organ Specificity , Regulatory Elements, Transcriptional
13.
Int J Mol Sci ; 20(21)2019 Oct 23.
Article in English | MEDLINE | ID: mdl-31652732

ABSTRACT

Cancer is a worldwide epidemic and represents a major threat to human health and survival. Reactive oxygen species (ROS) play a dual role in cancer cells, which includes both promoting and inhibiting carcinogenesis. Tea remains one of the most prevalent beverages consumed due in part to its anti- or pro-oxidative properties. The active compounds in tea, particularly tea polyphenols, can directly or indirectly scavenge ROS to reduce oncogenesis and cancer metastasis. Interestingly, the excessive levels of ROS induced by consuming tea could induce programmed cell death (PCD) or non-PCD of cancer cells. After outlining the relationship between ROS and cancer, the current review discusses the composition and efficacy of tea, including the redox-related (anti-oxidative and pro-oxidative) mechanisms and their role, along with other components, in preventing and treating cancer. This information will highlight the basis for the clinical utilization of tea extracts in the prevention or treatment of cancer in the future.


Subject(s)
Antineoplastic Agents, Phytogenic/pharmacology , Carcinogenesis/drug effects , Reactive Oxygen Species/metabolism , Tea/chemistry , Animals , Carcinogenesis/metabolism , Humans , Oxidation-Reduction , Plant Extracts/pharmacology
14.
Carcinogenesis ; 39(3): 336-346, 2018 03 08.
Article in English | MEDLINE | ID: mdl-29059373

ABSTRACT

Non-small cell lung cancer is the most common type of lung cancer. Both environmental and genetic risk factors contribute to lung carcinogenesis. We conducted a genome-wide interaction analysis between single nucleotide polymorphisms (SNPs) and smoking status (never- versus ever-smokers) in a European-descent population. We adopted a two-step analysis strategy in the discovery stage: we first conducted a case-only interaction analysis to assess the relationship between SNPs and smoking behavior using 13336 non-small cell lung cancer cases. Candidate SNPs with P-value < 0.001 were further analyzed using a standard case-control interaction analysis including 13970 controls. The significant SNPs with P-value < 3.5 × 10^-5 (correcting for multiple tests) from the case-control analysis in the discovery stage were further validated using an independent replication dataset comprising 5377 controls and 3054 non-small cell lung cancer cases. We further stratified the analysis by histological subtypes. Two novel SNPs, rs6441286 and rs17723637, were identified for overall lung cancer risk. The interaction odds ratio and meta-analysis P-value for these two SNPs were 1.24 with 6.96 × 10^-7 and 1.37 with 3.49 × 10^-7, respectively. In addition, interaction of smoking with rs4751674 was identified in squamous cell lung carcinoma with an odds ratio of 0.58 and P-value of 8.12 × 10^-7. This study is by far the largest genome-wide SNP-smoking interaction analysis reported for lung cancer. The three identified novel SNPs provide potential candidate biomarkers for lung cancer risk screening and intervention. The results from our study reinforce that gene-smoking interactions play important roles in the etiology of lung cancer and account for part of the missing heritability of this disease.
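
The case-only screening step described in this abstract can be illustrated with the textbook 2x2 form: among cases only, the odds ratio between carrier status and smoking estimates the multiplicative gene-smoking interaction, provided genotype and smoking are independent in the source population. The sketch below uses made-up counts and a hypothetical function name; the study itself presumably used regression models with covariates, which the abstract does not detail.

```python
import math

def case_only_interaction_or(ever_carrier, ever_noncarrier,
                             never_carrier, never_noncarrier, z_crit=1.96):
    """Interaction OR from a 2x2 table of cases only, with a Woolf 95% CI.

    Rows are smoking status (ever/never); columns are carrier status.
    Valid as an interaction estimate only under gene-exposure independence.
    """
    or_int = (ever_carrier * never_noncarrier) / (ever_noncarrier * never_carrier)
    se = math.sqrt(1 / ever_carrier + 1 / ever_noncarrier
                   + 1 / never_carrier + 1 / never_noncarrier)
    low = math.exp(math.log(or_int) - z_crit * se)
    high = math.exp(math.log(or_int) + z_crit * se)
    return or_int, low, high

# Hypothetical counts of cases, chosen only for illustration.
or_int, ci_low, ci_high = case_only_interaction_or(300, 200, 400, 400)
print(f"interaction OR = {or_int:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

A case-only screen needs no controls, which is why it is a cheap first filter before the standard case-control interaction test.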


Subject(s)
Carcinoma, Non-Small-Cell Lung/etiology , Carcinoma, Non-Small-Cell Lung/genetics , Lung Neoplasms/etiology , Lung Neoplasms/genetics , Smoking/adverse effects , Case-Control Studies , Gene-Environment Interaction , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Genotype , Humans , Polymorphism, Single Nucleotide , White People
15.
Bioinformatics ; 33(4): 561-563, 2017 02 15.
Article in English | MEDLINE | ID: mdl-28035028

ABSTRACT

Motivation: Checking concordance between reported sex and genotype-inferred sex is a crucial quality control measure in genome-wide association studies (GWAS). However, limited insights exist regarding the true accuracy of software that infer sex from genotype array data. Results: We present seXY, a logistic regression model trained on both X chromosome heterozygosity and Y chromosome missingness, that consistently demonstrated >99.5% sex inference accuracy in cross-validation for 889 males and 5,361 females enrolled in prostate cancer and ovarian cancer GWAS. Compared to PLINK, one of the most popular tools for sex inference in GWAS that assesses only X chromosome heterozygosity, seXY achieved marginally better male classification and 3% more accurate female classification. Availability and Implementation: https://github.com/Christopher-Amos-Lab/seXY. Contact: Christopher.I.Amos@dartmouth.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
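
The idea of classifying sex from X chromosome heterozygosity and Y chromosome missingness can be sketched with a small logistic regression. The code below is not the seXY implementation: the data are synthetic, the feature distributions are assumptions chosen only so that the two groups separate, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Batch gradient descent; returns [bias, w_x_het, w_y_miss]."""
    w = [0.0, 0.0, 0.0]
    n = len(features)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (x_het, y_miss), label in zip(features, labels):
            err = sigmoid(w[0] + w[1] * x_het + w[2] * y_miss) - label
            grad[0] += err
            grad[1] += err * x_het
            grad[2] += err * y_miss
        w = [wi - lr * gi / n for wi, gi in zip(w, grad)]
    return w

def make_synthetic(n_per_sex=100, seed=0):
    """Synthetic (X heterozygosity, Y missingness) pairs; label 1 = female."""
    rng = random.Random(seed)
    clip = lambda v: min(1.0, max(0.0, v))
    features, labels = [], []
    for _ in range(n_per_sex):
        # Assumed male profile: near-zero X heterozygosity, Y mostly called.
        features.append((clip(rng.gauss(0.01, 0.01)), clip(rng.gauss(0.02, 0.02))))
        labels.append(0)
        # Assumed female profile: heterozygous X calls, Y mostly missing.
        features.append((clip(rng.gauss(0.25, 0.05)), clip(rng.gauss(0.95, 0.03))))
        labels.append(1)
    return features, labels

features, labels = make_synthetic()
weights = fit_logistic(features, labels)
predicted = [int(sigmoid(weights[0] + weights[1] * x + weights[2] * y) >= 0.5)
             for x, y in features]
accuracy = sum(p == t for p, t in zip(predicted, labels)) / len(labels)
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

Using both features lets the model fall back on Y missingness when X heterozygosity alone is ambiguous, which is the motivation the abstract gives for going beyond an X-only check.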


Subject(s)
Chromosomes, Human , Genome-Wide Association Study/methods , Sex Chromosomes , Sex Determination Analysis/methods , Software , Female , Humans , Male , Quality Control
16.
J Am Soc Nephrol ; 28(8): 2311-2321, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28360221

ABSTRACT

Disorders of water balance, an excess or deficit of total body water relative to body electrolyte content, are common and ascertained by plasma hypo- or hypernatremia, respectively. We performed a two-stage genome-wide association study meta-analysis on plasma sodium concentration in 45,889 individuals of European descent (stage 1 discovery) and 17,637 additional individuals of European descent (stage 2 replication), and a transethnic meta-analysis of replicated single-nucleotide polymorphisms in 79,506 individuals (63,526 individuals of European descent, 8765 individuals of Asian Indian descent, and 7215 individuals of African descent). In stage 1, we identified eight loci associated with plasma sodium concentration at P < 5.0 × 10^-6. Of these, rs9980 at NFAT5 replicated in stage 2 meta-analysis (P = 3.1 × 10^-5), with combined stages 1 and 2 genome-wide significance of P = 5.6 × 10^-10. Transethnic meta-analysis further supported the association at rs9980 (P = 5.9 × 10^-12). Additionally, rs16846053 at SLC4A10 showed nominally, but not genome-wide, significant association in combined stages 1 and 2 meta-analysis (P = 6.7 × 10^-8). NFAT5 encodes a ubiquitously expressed transcription factor that coordinates the intracellular response to hypertonic stress but was not previously implicated in the regulation of systemic water balance. SLC4A10 encodes a sodium bicarbonate transporter with a brain-restricted expression pattern, and variant rs16846053 affects a putative intronic NFAT5 DNA binding motif. The lead variants for NFAT5 and SLC4A10 are cis expression quantitative trait loci in tissues of the central nervous system and relevant to transcriptional regulation. Thus, genetic variation in NFAT5 and SLC4A10 expression and function in the central nervous system may affect the regulation of systemic water balance.


Subject(s)
Genetic Loci , Plasma/chemistry , Sodium-Bicarbonate Symporters/genetics , Sodium/analysis , Transcription Factors/genetics , Water-Electrolyte Imbalance/blood , Water-Electrolyte Imbalance/genetics , Aged , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Osmolar Concentration , Racial Groups
17.
BMC Bioinformatics ; 17: 122, 2016 Mar 09.
Article in English | MEDLINE | ID: mdl-26961892

ABSTRACT

BACKGROUND: Identifying subpopulations within a study and inferring intercontinental ancestry of the samples are important steps in genome-wide association studies. Two software packages are widely used in analysis of substructure: Structure and Eigenstrat. Structure assigns each individual to a population by using a Bayesian method with multiple tuning parameters. It requires considerable computational time when dealing with thousands of samples and lacks the ability to create scores that could be used as covariates. Eigenstrat uses a principal component analysis method to model all sources of sampling variation. However, it does not readily provide information directly relevant to ancestral origin; the eigenvectors generated by Eigenstrat are sample specific and thus cannot be generalized to other individuals. RESULTS: We developed FastPop, an efficient R package that fills the gap between Structure and Eigenstrat. It can (1) generate PCA scores that identify ancestral origins and can be used for multiple studies, and (2) infer ancestry information for data arising from two or more intercontinental origins. We demonstrate the use of FastPop using 2318 SNP markers selected from the genome based on high variability among European, Asian and West African (African) populations. We conducted an analysis of 505 HapMap samples with European, African or Asian ancestry along with 19661 additional samples of unknown ancestry. The results from FastPop are highly consistent with those obtained by Structure across the 19661 samples we studied. The correlations of the results between FastPop and Structure are 0.99, 0.97 and 0.99 for European, African and Asian ancestry scores, respectively. Compared with Structure, FastPop is more efficient, as it finished ancestry inference for 19661 samples in 16 min compared with the 21-24 h required by Structure. FastPop also provides scores based on SNP weights, so the scores of the reference population can be applied to other studies provided the same set of markers is used. We also present an application of the method for studying four continental populations (European, Asian, African, and Native American). CONCLUSIONS: We developed an algorithm that can infer ancestries on data involving two or more intercontinental origins. It is efficient for analyzing large datasets. Additionally, the PCA-derived scores can be applied to multiple data sets to ensure the same ancestry analysis is applied to all studies.
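The idea of reusing SNP weights across studies can be illustrated with a minimal Python sketch. It is not FastPop's implementation (FastPop is an R package); the function name `ancestry_score`, the marker IDs, and all numeric values below are hypothetical. It only shows that once per-marker means and weights are fixed from a reference panel, a score for any new sample is a weighted sum of centered allele counts over the same marker set.

```python
def ancestry_score(genotypes, snp_means, snp_weights):
    """Weighted sum of centered allele counts over a fixed marker set.

    genotypes:   marker ID -> allele count (0, 1 or 2) for one sample
    snp_means:   marker ID -> mean allele count in the reference panel
    snp_weights: marker ID -> weight (loading) derived from the reference panel
    """
    # The weights are only meaningful if the sample has the same markers.
    missing = set(snp_weights) - set(genotypes)
    if missing:
        raise ValueError(f"sample lacks markers: {sorted(missing)}")
    return sum(
        snp_weights[snp] * (genotypes[snp] - snp_means[snp])
        for snp in snp_weights
    )


# Hypothetical values for three illustrative markers (not from the paper).
snp_means = {"snpA": 1.0, "snpB": 0.5, "snpC": 1.5}
snp_weights = {"snpA": 0.5, "snpB": -0.25, "snpC": 0.25}

sample = {"snpA": 2, "snpB": 0, "snpC": 1}
score = ancestry_score(sample, snp_means, snp_weights)
print(score)  # 0.5
```

Because the means and weights are stored rather than re-estimated, samples from different studies are placed on the same scale, which is the property the abstract contrasts with sample-specific eigenvectors.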


Subject(s)
Algorithms , Ethnicity/genetics , Genetics, Population , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis , Racial Groups/genetics , Software , Bayes Theorem , Genotype , HapMap Project , Humans
18.
Carcinogenesis ; 37(1): 96-105, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26590902

ABSTRACT

Chromosome 5p15.33 has been identified as a lung cancer susceptibility locus; however, the underlying causal mechanisms were not fully elucidated. Previous fine-mapping studies of this locus have relied on imputation or investigated a small number of known, common variants. This study represents a significant advance over previous research by investigating a large number of novel, rare variants, as well as their underlying mechanisms through telomere length. Variants for this fine-mapping study were identified through targeted deep sequencing (average depth of coverage greater than 4000×) of 576 individuals. Subsequently, 4652 SNPs, including 1108 novel SNPs, were genotyped in 5164 cases and 5716 controls of European ancestry. After adjusting for known risk loci, rs2736100 and rs401681, we identified a new, independent lung cancer susceptibility variant in LPCAT1: rs139852726 (OR = 0.46, P = 4.73 × 10⁻⁹), and three new adenocarcinoma risk variants in TERT: rs61748181 (OR = 0.53, P = 2.64 × 10⁻⁶), rs112290073 (OR = 1.85, P = 1.27 × 10⁻⁵), rs138895564 (OR = 2.16, P = 2.06 × 10⁻⁵; among young cases, OR = 3.77, P = 8.41 × 10⁻⁴). In addition, we found that rs139852726 (P = 1.44 × 10⁻³) was associated with telomere length in a sample of 922 healthy individuals. The gene-based SKAT-O analysis implicated TERT as the most relevant gene in the 5p15.33 region for adenocarcinoma (P = 7.84 × 10⁻⁷) and lung cancer (P = 2.37 × 10⁻⁵) risk. In this largest fine-mapping study to investigate a large number of rare and novel variants within 5p15.33, we identified novel lung and adenocarcinoma susceptibility loci with large effects and provided support for the role of telomere length as the potential underlying mechanism.


Subject(s)
Chromosomes, Human, Pair 5 , Genetic Loci , Lung Neoplasms/genetics , Case-Control Studies , Chromosome Mapping/methods , Female , Genetic Predisposition to Disease , Genotyping Techniques/methods , Humans , Male , Middle Aged
19.
Hum Mol Genet ; 22(17): 3597-607, 2013 Sep 01.
Article in English | MEDLINE | ID: mdl-23669352

ABSTRACT

Genetic loci for body mass index (BMI) in adolescence and young adulthood, a period of high risk for weight gain, are understudied, yet may yield important insight into the etiology of obesity and early intervention. To identify novel genetic loci and examine the influence of known loci on BMI during this critical time period in late adolescence and early adulthood, we performed a two-stage meta-analysis using 14 genome-wide association studies in populations of European ancestry with data on BMI between ages 16 and 25 in up to 29 880 individuals. We identified seven independent loci (P < 5.0 × 10⁻⁸) near FTO (P = 3.72 × 10⁻²³), TMEM18 (P = 3.24 × 10⁻¹⁷), MC4R (P = 4.41 × 10⁻¹⁷), TNNI3K (P = 4.32 × 10⁻¹¹), SEC16B (P = 6.24 × 10⁻⁹), GNPDA2 (P = 1.11 × 10⁻⁸) and POMC (P = 4.94 × 10⁻⁸) as well as a potential secondary signal at the POMC locus (rs2118404, P = 2.4 × 10⁻⁵ after conditioning on the established single-nucleotide polymorphism at this locus) in adolescents and young adults. To evaluate the impact of the established genetic loci on BMI at these young ages, we examined differences between the effect sizes of 32 published BMI loci in European adult populations (aged 18-90) and those observed in our adolescent and young adult meta-analysis. Four loci (near PRKD1, TNNI3K, SEC16B and CADM2) had larger effects and one locus (near SH2B1) had a smaller effect on BMI during adolescence and young adulthood compared with older adults (P < 0.05). These results suggest that genetic loci for BMI can vary in their effects across the life course, underscoring the importance of evaluating BMI at different ages.


Subject(s)
Body Mass Index , Genetic Loci , Weight Gain/genetics , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Cohort Studies , Genome-Wide Association Study , Humans , Middle Aged , Polymorphism, Single Nucleotide , White People/genetics , Young Adult
20.
Tumour Biol ; 36(11): 8993-9003, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26081616

ABSTRACT

Lung adenocarcinoma is caused by the combination of genetic and environmental effects, and smoking plays an important role in the disease development. Exploring the gene expression profile and identifying genes that are shared or vary between smokers and nonsmokers with lung adenocarcinoma will provide insights into the etiology of this complex cancer. We obtained RNA-seq data from paired normal and tumor tissues from 34 nonsmoking and 34 smoking patients with lung adenocarcinoma (GEO: GSE40419). The R/Bioconductor package edgeR was used to conduct differential gene expression analysis between paired normal and tumor tissues. A generalized linear model was applied to identify genes that were differentially expressed in nonsmoker and smoker patients as well as genes that varied between these two groups. We identified 2273 genes that showed differential expression with FDR < 0.05 and |logFC| > 1 in nonsmoker tumor versus normal tissues and 3030 genes in the smoking group; 1967 genes were common to both groups. Of the identified genes, 68% and 70% were downregulated in the nonsmoking and smoking groups, respectively. The 20 genes with the largest fold changes in smokers, such as SPP1, SPINK1, and FAM83A, also showed similarly large and highly significant fold changes in nonsmokers and vice versa, showing commonalities in expression changes for adenocarcinomas in both smokers and nonsmokers for these genes. We also identified 175 genes that were significantly differentially expressed between tumor samples from nonsmoker and smoker patients. Gene expression profiles varied substantially between smoker and nonsmoker patients with lung adenocarcinoma. Smoking patients overall showed a far more complicated disease mechanism and had more dysregulation in their gene expression profiles. Our study reveals pathogenetic differences in smoking and nonsmoking patients with lung adenocarcinoma from transcriptome analysis. We provided a list of candidate genes for further study for disease detection and treatment in both smoking and nonsmoking patients with lung adenocarcinoma.
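The selection criteria stated in the abstract (FDR < 0.05 and |logFC| > 1, followed by the overlap between groups) can be sketched in a few lines of Python. This is an illustration only: the study used edgeR in R, and the gene labels and numbers below are made-up placeholders, not results from GSE40419. The helper name `significant_genes` is hypothetical.

```python
def significant_genes(results, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Return genes passing both thresholds.

    results: gene label -> (logFC, FDR) from a tumor-versus-normal comparison
    """
    return {
        gene
        for gene, (log_fc, fdr) in results.items()
        if fdr < fdr_cutoff and abs(log_fc) > lfc_cutoff
    }


# Placeholder tables (invented values for illustration only).
nonsmoker = {"G1": (2.3, 0.001), "G2": (-1.5, 0.01), "G3": (0.4, 0.001), "G4": (3.0, 0.2)}
smoker = {"G1": (2.0, 0.002), "G2": (-0.8, 0.01), "G3": (1.2, 0.03), "G4": (-2.2, 0.04)}

ns_hits = significant_genes(nonsmoker)  # genes passing in nonsmokers
sm_hits = significant_genes(smoker)     # genes passing in smokers
shared = ns_hits & sm_hits              # genes passing in both groups
print(sorted(shared))  # ['G1']
```

Applying the same cutoffs to each group separately and then intersecting the sets mirrors how the abstract reports per-group counts alongside the number of genes common to both.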


Subject(s)
Adenocarcinoma/genetics , Gene Expression Regulation, Neoplastic , Lung Neoplasms/genetics , Neoplasm Proteins/biosynthesis , Smoking/genetics , Adenocarcinoma/pathology , Adenocarcinoma of Lung , High-Throughput Nucleotide Sequencing , Humans , Lung Neoplasms/pathology , Neoplasm Proteins/genetics , RNA/genetics , Smoking/pathology , Transcriptome/genetics