1.
Am J Hum Genet ; 109(4): 727-737, 2022 04 07.
Article En | MEDLINE | ID: mdl-35298920

Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing millions of individuals and genetic variants. We introduce SCOPE, a method for population structure inference that is orders of magnitude faster than existing methods while achieving comparable accuracy. SCOPE infers population structure in about a day on a dataset containing one million individuals and variants as well as on the UK Biobank dataset containing 488,363 individuals and 569,346 variants. Furthermore, SCOPE can leverage allele frequencies from previous studies to improve the interpretability of population structure estimates.


Biological Specimen Banks , Genetics, Population , Gene Frequency/genetics , Genomics , Humans
2.
Am J Hum Genet ; 108(5): 799-808, 2021 05 06.
Article En | MEDLINE | ID: mdl-33811807

The proportion of variation in complex traits that can be attributed to non-additive genetic effects has been a topic of intense debate. The availability of biobank-scale datasets of genotype and trait data from unrelated individuals opens up the possibility of obtaining precise estimates of the contribution of non-additive genetic effects. We present an efficient method to estimate the variation in a complex trait that can be attributed to additive (additive heritability) and dominance deviation (dominance heritability) effects across all genotyped SNPs in a large collection of unrelated individuals. Over a wide range of genetic architectures, our method yields unbiased estimates of additive and dominance heritability. We applied our method, in turn, to array genotypes as well as imputed genotypes (at common SNPs with minor allele frequency [MAF] > 1%) and 50 quantitative traits measured in 291,273 unrelated white British individuals in the UK Biobank. Averaged across these 50 traits, we find that additive heritability on array SNPs is 21.86% while dominance heritability is 0.13% (about 0.48% of the additive heritability) with qualitatively similar results for imputed genotypes. We find no statistically significant evidence for dominance heritability (p<0.05/50 accounting for the number of traits tested) and estimate that dominance heritability is unlikely to exceed 1% for the traits analyzed. Our analyses indicate a limited contribution of dominance heritability to complex trait variation.
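The multiple-testing threshold quoted in this abstract (0.05/50) is a Bonferroni correction, and a short sketch makes the arithmetic concrete. The variable names and the `is_significant` helper are illustrative assumptions, not part of the authors' software; the numbers are those given in the abstract.

```python
# Figures quoted in the abstract (array SNPs, averaged across traits).
N_TRAITS = 50
ALPHA = 0.05
ADDITIVE_H2_PCT = 21.86
DOMINANCE_H2_PCT = 0.13

# Bonferroni correction: divide the family-wise error rate by the number of tests.
bonferroni_threshold = ALPHA / N_TRAITS


def is_significant(p_value: float, threshold: float = bonferroni_threshold) -> bool:
    """Hypothetical helper: True if a per-trait p-value passes the corrected threshold."""
    return p_value < threshold


# Ratio of the two averaged figures, in percent. The abstract does not say how
# its own ratio was derived; an average of per-trait ratios (an assumption)
# need not equal this ratio of averages.
ratio_of_means_pct = 100 * DOMINANCE_H2_PCT / ADDITIVE_H2_PCT

print(f"Bonferroni threshold: {bonferroni_threshold:.4f}")
print(f"Dominance / additive (ratio of averages): {ratio_of_means_pct:.2f}%")
```

Under this correction, a per-trait p-value has to fall below 0.001 for the dominance component to count as significant for that trait.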


Biological Specimen Banks , Datasets as Topic , Genes, Dominant/genetics , Genetic Variation , Multifactorial Inheritance/genetics , Female , Humans , Male , Models, Genetic , Polymorphism, Single Nucleotide/genetics
3.
NPJ Genom Med ; 5(1): 55, 2020 Dec 11.
Article En | MEDLINE | ID: mdl-33311498

Pancreatic ductal adenocarcinoma (PDAC) is an aggressive cancer with a 5-year survival rate of <8%. Unsupervised clustering of 76 PDAC patients based on intron retention (IR) events resulted in two clusters of tumors (IR-1 and IR-2). While gene expression-based clusters are not predictive of patient outcome in this cohort, the clusters we developed based on intron retention were associated with differences in progression-free interval. IR levels are lower and clinical outcome is worse in IR-1 compared with IR-2. Oncogenes were significantly enriched in the set of 262 differentially retained introns between the two IR clusters. Higher IR levels in IR-2 correlate with higher gene expression, consistent with detention of intron-containing transcripts in the nucleus in IR-2. Out of 258 genes encoding RNA-binding proteins (RBPs) that were differentially expressed between IR-1 and IR-2, the motifs for seven RBPs were significantly enriched in the 262-intron set, and the expression of 25 RBPs was highly correlated with retention levels of 139 introns. Network analysis suggested that retention of introns in IR-2 could result from disruption of an RBP protein-protein interaction network previously linked to efficient intron removal. Finally, IR-based clusters developed for the majority of the 20 cancer types surveyed had two clusters with asymmetrical distributions of IR events like PDAC, with one cluster containing mostly intron loss events. Taken together, our findings suggest IR may be an important biomarker for subclassifying tumors.

4.
PLoS Genet ; 16(5): e1008773, 2020 05.
Article En | MEDLINE | ID: mdl-32469896

Principal component analysis (PCA) is a key tool for understanding population structure and controlling for population stratification in genome-wide association studies (GWAS). With the advent of large-scale datasets of genetic variation, there is a need for methods that can compute principal components (PCs) with scalable computational and memory requirements. We present ProPCA, a highly scalable method based on a probabilistic generative model, which computes the top PCs on genetic variation data efficiently. We applied ProPCA to compute the top five PCs on genotype data from the UK Biobank, consisting of 488,363 individuals and 146,671 SNPs, in about thirty minutes. To illustrate the utility of computing PCs in large samples, we leveraged the population structure inferred by ProPCA within White British individuals in the UK Biobank to identify several novel genome-wide signals of recent putative selection including missense mutations in RPGRIP1L and TLR4.
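To make the idea of computing top PCs from genotype data concrete, here is a minimal sketch that uses a plain SVD on a standardized genotype matrix. It is not ProPCA, whose probabilistic generative model and scalable algorithm are not described in this abstract. The function name `top_pcs` and the two-population toy data are assumptions made for illustration only.

```python
import numpy as np


def top_pcs(genotypes, k):
    """Return top-k PC scores (individuals x k) for an individuals x SNPs matrix of 0/1/2 counts."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0            # sample allele frequency per SNP
    keep = (freq > 0) & (freq < 1)         # drop monomorphic SNPs (zero variance)
    g, freq = g[:, keep], freq[keep]
    # Center by 2p and scale by the binomial standard deviation sqrt(2p(1-p)).
    z = (g - 2 * freq) / np.sqrt(2 * freq * (1 - freq))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :k] * s[:k]


# Toy data (assumed): two populations whose allele frequencies differ.
rng = np.random.default_rng(0)
n_per_pop, n_snps = 50, 200
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_snps), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_pop, n_snps)),
    rng.binomial(2, freq_b, size=(n_per_pop, n_snps)),
])

pcs = top_pcs(geno, k=2)
print(pcs.shape)
```

A full SVD like this scales poorly with sample size, which is the motivation the abstract gives for methods with scalable computational and memory requirements.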


Adaptor Proteins, Signal Transducing/genetics , Computational Biology/methods , Mutation, Missense , Toll-Like Receptor 4/genetics , White People/genetics , Algorithms , Biological Specimen Banks , Genetics, Population , Genome-Wide Association Study/methods , Humans , Models, Genetic , Polymorphism, Single Nucleotide , Principal Component Analysis , United Kingdom/ethnology
5.
Sci Rep ; 8(1): 11807, 2018 08 07.
Article En | MEDLINE | ID: mdl-30087365

Triple-negative breast cancers (TNBC) lack estrogen and progesterone receptors and HER2 amplification, and are resistant to therapies that target these receptors. Tumors from TNBC patients are heterogeneous based on genetic variations, tumor histology, and clinical outcomes. We used high throughput genomic data for TNBC patients (n = 137) from TCGA to characterize inter-tumor heterogeneity. Similarity network fusion (SNF)-based integrative clustering combining gene expression, miRNA expression, and copy number variation, revealed three distinct patient clusters. Integrating multiple types of data resulted in more distinct clusters than analyses with a single datatype. Whereas most TNBCs are classified by PAM50 as basal subtype, one of the clusters was enriched in the non-basal PAM50 subtypes, exhibited more aggressive clinical features and had a distinctive signature of oncogenic mutations, miRNAs and expressed genes. Our analyses provide a new classification scheme for TNBC based on multiple omics datasets and provide insight into molecular features that underlie TNBC heterogeneity.


Databases, Nucleic Acid , Gene Expression Regulation, Neoplastic , MicroRNAs , RNA, Neoplasm , Receptor, ErbB-2 , Triple Negative Breast Neoplasms , Adult , Aged , Female , Humans , MicroRNAs/biosynthesis , MicroRNAs/genetics , Middle Aged , RNA, Neoplasm/biosynthesis , RNA, Neoplasm/genetics , Receptor, ErbB-2/biosynthesis , Receptor, ErbB-2/genetics , Triple Negative Breast Neoplasms/classification , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/metabolism , Triple Negative Breast Neoplasms/pathology
6.
J Mammary Gland Biol Neoplasia ; 22(1): 59-69, 2017 03.
Article En | MEDLINE | ID: mdl-28124184

Reelin is a regulator of cell migration in the nervous system, and has other functions in the development of a number of non-neuronal tissues. In addition, alterations in reelin expression levels have been reported in breast, pancreatic, liver, gastric, and other cancers. Reelin is normally expressed in mammary gland stromal cells, but whether stromal reelin contributes to breast cancer progression is unknown. Herein, we used a syngeneic mouse mammary tumor transplantation model to examine the impact of host-derived reelin on breast cancer progression. We found that transplanted syngeneic tumors grew more slowly in reelin-deficient (rl^Orl-/-) mice and had delayed metastatic colonization of the lungs. Immunohistochemistry of primary tumors revealed that tumors grown in rl^Orl-/- animals had fewer blood vessels and increased macrophage infiltration. Gene expression studies from tumor tissues indicate that loss of host-derived reelin alters the balance of M1- and M2-associated macrophage markers, suggesting that reelin may influence the polarization of these cells. Consistent with this, rl^Orl-/- M1-polarized bone marrow-derived macrophages have heightened levels of the M1-associated cytokines iNOS and IL-6. Based on these observations, we propose a novel function for the reelin protein in breast cancer progression.


Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Cell Adhesion Molecules, Neuronal/metabolism , Cell Proliferation/physiology , Extracellular Matrix Proteins/metabolism , Mammary Neoplasms, Animal/metabolism , Mammary Neoplasms, Animal/pathology , Nerve Tissue Proteins/metabolism , Serine Endopeptidases/metabolism , Animals , Breast/metabolism , Breast/pathology , Cell Line , Cell Line, Tumor , Cytokines/metabolism , Disease Progression , Female , Gene Expression/physiology , HEK293 Cells , Humans , Macrophages/metabolism , Macrophages/pathology , Mice , Mice, Inbred BALB C , Reelin Protein
...