1.
Nature ; 629(8012): 679-687, 2024 May.
Article in English | MEDLINE | ID: mdl-38693266

ABSTRACT

Pancreatic intraepithelial neoplasias (PanINs) are the most common precursors of pancreatic cancer, but their small size and inaccessibility in humans make them challenging to study¹. Critically, the number, dimensions and connectivity of human PanINs remain largely unknown, precluding important insights into early cancer development. Here, we provide a microanatomical survey of human PanINs by analysing 46 large samples of grossly normal human pancreas with a machine-learning pipeline for quantitative 3D histological reconstruction at single-cell resolution. To elucidate genetic relationships between and within PanINs, we developed a workflow in which 3D modelling guides multi-region microdissection and targeted and whole-exome sequencing. From these samples, we calculated a mean burden of 13 PanINs per cm³ and extrapolated that the normal intact adult pancreas harbours hundreds of PanINs, almost all with oncogenic KRAS hotspot mutations. We found that most PanINs originate as independent clones with distinct somatic mutation profiles. Some spatially continuous PanINs were found to contain multiple KRAS mutations; computational and in situ analyses demonstrated that different KRAS mutations localize to distinct cell subpopulations within these neoplasms, indicating their polyclonal origins. The extensive multifocality and genetic heterogeneity of PanINs raise important questions about mechanisms that drive precancer initiation and confer differential progression risk in the human pancreas. This detailed 3D genomic mapping of molecular alterations in human PanINs provides an empirical foundation for early detection and rational interception of pancreatic cancer.


Subject(s)
Genetic Heterogeneity , Genomics , Imaging, Three-Dimensional , Pancreatic Neoplasms , Precancerous Conditions , Single-Cell Analysis , Adult , Female , Humans , Male , Clone Cells/metabolism , Clone Cells/pathology , Exome Sequencing , Machine Learning , Mutation , Pancreas/anatomy & histology , Pancreas/cytology , Pancreas/metabolism , Pancreas/pathology , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/pathology , Precancerous Conditions/genetics , Precancerous Conditions/pathology , Workflow , Disease Progression , Early Detection of Cancer , Oncogenes/genetics
2.
Nature ; 570(7761): 385-389, 2019 06.
Article in English | MEDLINE | ID: mdl-31142840

ABSTRACT

Cell-free DNA in the blood provides a non-invasive diagnostic avenue for patients with cancer¹. However, characteristics of the origins and molecular features of cell-free DNA are poorly understood. Here we developed an approach to evaluate fragmentation patterns of cell-free DNA across the genome, and found that profiles of healthy individuals reflected nucleosomal patterns of white blood cells, whereas patients with cancer had altered fragmentation profiles. We used this method to analyse the fragmentation profiles of 236 patients with breast, colorectal, lung, ovarian, pancreatic, gastric or bile duct cancer and 245 healthy individuals. A machine learning model that incorporated genome-wide fragmentation features had sensitivities of detection ranging from 57% to more than 99% among the seven cancer types at 98% specificity, with an overall area under the curve value of 0.94. Fragmentation profiles could be used to identify the tissue of origin of the cancers to a limited number of sites in 75% of cases. Combining our approach with mutation-based cell-free DNA analyses detected 91% of patients with cancer. The results of these analyses highlight important properties of cell-free DNA and provide a proof-of-principle approach for the screening, early detection and monitoring of human cancer.


Subject(s)
Circulating Tumor DNA/blood , Circulating Tumor DNA/genetics , DNA Fragmentation , Genome, Human/genetics , Neoplasms/diagnosis , Neoplasms/genetics , Case-Control Studies , Cohort Studies , DNA Mutational Analysis , Humans , Machine Learning , Mutation , Neoplasms/blood , Neoplasms/pathology
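
The abstract above reports sensitivity at a fixed 98% specificity. As a minimal sketch of how such a figure is typically computed from classifier scores (the function name, the score lists, and the thresholding convention are illustrative assumptions, not the authors' pipeline), the threshold is chosen from the healthy scores and sensitivity is then read off the cancer scores:

```python
import math


def sensitivity_at_specificity(healthy_scores, cancer_scores, specificity=0.98):
    """Return (threshold, sensitivity) for a score-based classifier.

    The threshold is the smallest healthy score such that at least
    `specificity` of healthy scores are at or below it; a sample is
    called positive when its score is strictly above the threshold.
    All inputs are hypothetical illustration data.
    """
    if not 0 < specificity <= 1:
        raise ValueError("specificity must be in (0, 1]")
    if not healthy_scores or not cancer_scores:
        raise ValueError("both score lists must be non-empty")

    ranked = sorted(healthy_scores)
    # Index of the healthy score that leaves at most (1 - specificity)
    # of healthy samples above it; the small epsilon guards against
    # floating-point noise in the product.
    k = max(math.ceil(specificity * len(ranked) - 1e-9) - 1, 0)
    threshold = ranked[k]

    detected = sum(score > threshold for score in cancer_scores)
    return threshold, detected / len(cancer_scores)
```

With 100 healthy scores 0..99 and cancer scores 50, 98, 99 and 120, the threshold is 97 and three of the four cancer samples are detected.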
3.
Genet Epidemiol ; 46(3-4): 170-181, 2022 04.
Article in English | MEDLINE | ID: mdl-35312098

ABSTRACT

Genome-wide association studies (GWAS) have successfully identified thousands of single nucleotide polymorphisms (SNPs) associated with complex traits; however, the identified SNPs account for a fraction of trait heritability, and identifying the functional elements through which genetic variants exert their effects remains a challenge. Recent evidence suggests that SNPs associated with complex traits are more likely to be expression quantitative trait loci (eQTL). Thus, incorporating eQTL information can potentially improve power to detect causal variants missed by traditional GWAS approaches. Using genomic, transcriptomic, and platelet phenotype data from the Genetic Study of Atherosclerosis Risk family-based study, we investigated the potential to detect novel genomic risk loci by incorporating information from eQTL in the relevant target tissues (i.e., platelets and megakaryocytes) using established statistical principles in a novel way. Permutation analyses were performed to obtain family-wise error rates for eQTL associations, substantially lowering the genome-wide significance threshold for SNP-phenotype associations. In addition to confirming the well known association between PEAR1 and platelet aggregation, our eQTL-focused approach identified a novel locus (rs1354034) and gene (ARHGEF3) not previously identified in a GWAS of platelet aggregation phenotypes. A colocalization analysis showed strong evidence for a functional role of this eQTL.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Humans , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics , Receptors, Cell Surface , Transcriptome
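
The abstract above uses permutation analyses to control the family-wise error rate. A generic max-statistic permutation sketch is shown here for orientation only: it assumes unrelated samples and a simple correlation statistic, whereas a family-based study such as the one described would need a permutation scheme that respects pedigree structure. All function names and parameters are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def max_abs_correlation(genotypes, phenotype):
    """Largest |correlation| between the phenotype and any SNP."""
    return max(abs(pearson(snp, phenotype)) for snp in genotypes)


def fwer_threshold(genotypes, phenotype, alpha=0.05, n_perm=200, seed=0):
    """Empirical (1 - alpha) quantile of the permutation null of the
    maximum statistic; an observed statistic above this value is
    significant at family-wise level alpha (approximately)."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        null_maxima.append(max_abs_correlation(genotypes, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_maxima[index]
```

Restricting the tested set to fewer SNPs (for example, only eQTL) shrinks the null distribution of the maximum, which is why the resulting significance threshold is less stringent than a genome-wide one.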
4.
Bioinformatics ; 38(15): 3677-3683, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35642899

ABSTRACT

MOTIVATION: Multi-region sequencing of solid tumors can improve our understanding of intratumor subclonal diversity and the evolutionary history of mutational events. Due to uncertainty in clonal composition and the multitude of possible ancestral relationships between clones, elucidating the most probable relationships from bulk tumor sequencing poses statistical and computational challenges. RESULTS: We developed a Bayesian hierarchical model called PICTograph to model uncertainty in assigning mutations to subclones, to enable posterior distributions of cancer cell fractions (CCFs) and to visualize the most probable ancestral relationships between subclones. Compared with available methods, PICTograph provided more consistent and accurate estimates of CCFs and improved tree inference over a range of simulated clonal diversity. Application of PICTograph to multi-region whole-exome sequencing of tumors from individuals with pancreatic cancer precursor lesions confirmed known early-occurring mutations and indicated substantial molecular diversity, including 6-12 distinct subclones and intra-sample mixing of subclones. Using ensemble-based visualizations, we highlight highly probable evolutionary relationships recovered in multiple models. PICTograph provides a useful approximation to evolutionary inference from cross-sectional multi-region sequencing, particularly for complex cases. AVAILABILITY AND IMPLEMENTATION: https://github.com/KarchinLab/pictograph. The data underlying this article will be shared on reasonable request to the corresponding author. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neoplasms , Humans , Bayes Theorem , Cross-Sectional Studies , Neoplasms/genetics , Sequence Analysis , Mutation , Clone Cells , Phylogeny , Software
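
For readers unfamiliar with cancer cell fractions (CCFs), the widely used point estimate from a variant allele frequency is sketched below. This is not PICTograph's Bayesian hierarchical model, which places posterior distributions on CCFs; it is only the deterministic formula that such models build on, and the parameter names are illustrative assumptions.

```python
def cancer_cell_fraction(vaf, purity, total_copy_number=2,
                         multiplicity=1, normal_copy_number=2):
    """Point estimate of the fraction of cancer cells carrying a mutation.

    vaf               observed variant allele frequency (0..1)
    purity            fraction of cells in the sample that are tumor cells
    total_copy_number tumor copy number at the locus
    multiplicity      number of mutated copies per mutated cancer cell
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell in the mixed sample.
    average_copies = (purity * total_copy_number
                      + (1 - purity) * normal_copy_number)
    return vaf * average_copies / (purity * multiplicity)
```

For example, a heterozygous mutation at a diploid locus with VAF 0.25 in a sample of 50% purity gives a CCF of 1.0, that is, a clonal mutation. Noisy VAFs can push the raw estimate above 1, which is one motivation for modelling the uncertainty explicitly.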
5.
Bioinformatics ; 38(19): 4647-4649, 2022 09 30.
Article in English | MEDLINE | ID: mdl-35959988

ABSTRACT

SUMMARY: Because of their high abundance, easy accessibility in peripheral blood, and relative stability ex vivo, antibodies serve as excellent records of environmental exposures and immune responses. Phage Immuno-Precipitation Sequencing (PhIP-Seq) is the most efficient technique available for assessing antibody binding to hundreds of thousands of peptides at a cohort scale. PhIP-Seq is a high-throughput approach for assessing antibody reactivity to hundreds of thousands of candidate epitopes. Accurate detection of weakly reactive peptides is particularly important for characterizing the development and decline of antibody responses. Here, we present BEER (Bayesian Enrichment Estimation in R), a software package specifically developed for the quantification of peptide reactivity from PhIP-Seq experiments. BEER implements a hierarchical model and produces posterior probabilities for peptide reactivity and a fold change estimate to quantify the magnitude. BEER also offers functionality to infer peptide reactivity based on the edgeR package, though the improvement in speed is offset by slightly lower sensitivity compared to the Bayesian approach, specifically for weakly reactive peptides. AVAILABILITY AND IMPLEMENTATION: BEER is implemented in R and freely available from the Bioconductor repository at https://bioconductor.org/packages/release/bioc/html/beer.html.


Subject(s)
Beer , Software , Humans , Bayes Theorem , Antibodies , Peptides
6.
BMC Genomics ; 23(1): 654, 2022 Sep 15.
Article in English | MEDLINE | ID: mdl-36109689

ABSTRACT

Phage ImmunoPrecipitation Sequencing (PhIP-Seq) is a recently developed technology to assess antibody reactivity, quantifying antibody binding towards hundreds of thousands of candidate epitopes. The output from PhIP-Seq experiments are read count matrices, similar to RNA-Seq data; however some important differences do exist. In this manuscript we investigated whether the publicly available method edgeR (Robinson et al., Bioinformatics 26(1):139-140, 2010) for normalization and analysis of RNA-Seq data is also suitable for PhIP-Seq data. We find that edgeR is remarkably effective, but improvements can be made and introduce a Bayesian framework specifically tailored for data from PhIP-Seq experiments (Bayesian Enrichment Estimation in R, BEER).


Subject(s)
Bacteriophages , Antibodies , Bacteriophages/genetics , Bayes Theorem , Epitopes , Gene Expression Profiling/methods , Immunoprecipitation , Sequence Analysis, RNA/methods
7.
N Engl J Med ; 378(21): 1976-1986, 2018 May 24.
Article in English | MEDLINE | ID: mdl-29658848

ABSTRACT

BACKGROUND: Antibodies that block programmed death 1 (PD-1) protein improve survival in patients with advanced non-small-cell lung cancer (NSCLC) but have not been tested in resectable NSCLC, a condition in which little progress has been made during the past decade. METHODS: In this pilot study, we administered two preoperative doses of PD-1 inhibitor nivolumab in adults with untreated, surgically resectable early (stage I, II, or IIIA) NSCLC. Nivolumab (at a dose of 3 mg per kilogram of body weight) was administered intravenously every 2 weeks, with surgery planned approximately 4 weeks after the first dose. The primary end points of the study were safety and feasibility. We also evaluated the tumor pathological response, expression of programmed death ligand 1 (PD-L1), mutational burden, and mutation-associated, neoantigen-specific T-cell responses. RESULTS: Neoadjuvant nivolumab had an acceptable side-effect profile and was not associated with delays in surgery. Of the 21 tumors that were removed, 20 were completely resected. A major pathological response occurred in 9 of 20 resected tumors (45%). Responses occurred in both PD-L1-positive and PD-L1-negative tumors. There was a significant correlation between the pathological response and the pretreatment tumor mutational burden. The number of T-cell clones that were found in both the tumor and peripheral blood increased systemically after PD-1 blockade in eight of nine patients who were evaluated. Mutation-associated, neoantigen-specific T-cell clones from a primary tumor with a complete response on pathological assessment rapidly expanded in peripheral blood at 2 to 4 weeks after treatment; some of these clones were not detected before the administration of nivolumab. CONCLUSIONS: Neoadjuvant nivolumab was associated with few side effects, did not delay surgery, and induced a major pathological response in 45% of resected tumors. 
The tumor mutational burden was predictive of the pathological response to PD-1 blockade. Treatment induced expansion of mutation-associated, neoantigen-specific T-cell clones in peripheral blood. (Funded by Cancer Research Institute-Stand Up 2 Cancer and others; ClinicalTrials.gov number, NCT02259621.)


Subject(s)
Antibodies, Monoclonal/therapeutic use , Antineoplastic Agents/therapeutic use , B7-H1 Antigen/antagonists & inhibitors , Carcinoma, Non-Small-Cell Lung/drug therapy , Lung Neoplasms/drug therapy , Adenocarcinoma/pathology , Aged , Aged, 80 and over , Antibodies, Monoclonal/adverse effects , Antineoplastic Agents/adverse effects , Biopsy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/pathology , Carcinoma, Non-Small-Cell Lung/surgery , Carcinoma, Squamous Cell/pathology , Female , Humans , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Lung Neoplasms/surgery , Male , Middle Aged , Mutation , Neoadjuvant Therapy , Nivolumab , Pilot Projects
8.
Gastroenterology ; 157(4): 1123-1137.e22, 2019 10.
Article in English | MEDLINE | ID: mdl-31175866

ABSTRACT

BACKGROUND & AIMS: Intraductal papillary mucinous neoplasms (IPMNs) are lesions that can progress to invasive pancreatic cancer and constitute an important system for studies of pancreatic tumorigenesis. We performed comprehensive genomic analyses of entire IPMNs to determine the diversity of somatic mutations in genes that promote tumorigenesis. METHODS: We microdissected neoplastic tissues from 6-24 regions each of 20 resected IPMNs, resulting in 227 neoplastic samples that were analyzed by capture-based targeted sequencing. Somatic mutations in genes associated with pancreatic tumorigenesis were assessed across entire IPMN lesions, and the resulting data were supported by evolutionary modeling, whole-exome sequencing, and in situ detection of mutations. RESULTS: We found a high prevalence of heterogeneity among mutations in IPMNs. Heterogeneity in mutations in KRAS and GNAS was significantly more prevalent in IPMNs with low-grade dysplasia than in IPMNs with high-grade dysplasia (P < .02). Whole-exome sequencing confirmed that IPMNs contained multiple independent clones, each with distinct mutations, as originally indicated by targeted sequencing and evolutionary modeling. We also found evidence for convergent evolution of mutations in RNF43 and TP53, which are acquired during later stages of tumorigenesis. CONCLUSIONS: In an analysis of the heterogeneity of mutations throughout IPMNs, we found that early-stage IPMNs contain multiple independent clones, each with distinct mutations, indicating their polyclonal origin. These findings challenge the model in which pancreatic neoplasms arise from a single clone. Increasing our understanding of the mechanisms of IPMN polyclonality could lead to strategies to identify patients at increased risk for pancreatic cancer.


Subject(s)
Biomarkers, Tumor/genetics , Cell Transformation, Neoplastic/genetics , Mutation , Pancreatic Intraductal Neoplasms/genetics , Pancreatic Neoplasms/genetics , Aged , Aged, 80 and over , Cell Transformation, Neoplastic/pathology , Chromogranins/genetics , Clonal Evolution , DNA Mutational Analysis , DNA-Binding Proteins/genetics , Evolution, Molecular , Female , GTP-Binding Protein alpha Subunits, Gs/genetics , Genetic Predisposition to Disease , Humans , Male , Middle Aged , Mutation Rate , Neoplasm Staging , Oncogene Proteins/genetics , Pancreatic Intraductal Neoplasms/pathology , Pancreatic Neoplasms/pathology , Phenotype , Proto-Oncogene Proteins p21(ras)/genetics , Retrospective Studies , Ubiquitin-Protein Ligases
9.
Bioinformatics ; 35(14): 2509-2511, 2019 07 15.
Article in English | MEDLINE | ID: mdl-30500888

ABSTRACT

SUMMARY: Family-based sequencing studies enable researchers to identify highly penetrant genetic variants too rare to be tested in conventional case-control studies, by studying co-segregation of variant and disease phenotypes. When multiple affected subjects in a family are sequenced, the probability that a variant or a set of variants is shared identical-by-descent by some or all affected relatives provides evidence against the null hypothesis of complete absence of linkage and association. The Rare Variant Sharing software package RVS implements a suite of tools to assess association and linkage between rare genetic variants and a dichotomous disease indicator in family pedigrees. AVAILABILITY AND IMPLEMENTATION: RVS is available as open source software from the Bioconductor webpage at https://bioconductor.org/packages/release/bioc/html/RVS.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Rare Diseases , Software , Genetic Linkage , Humans , Pedigree , Phenotype
10.
Bioinformatics ; 35(4): 571-578, 2019 02 15.
Article in English | MEDLINE | ID: mdl-30084993

ABSTRACT

MOTIVATION: De novo copy number deletions have been implicated in many diseases, but there is no formal method to date that identifies de novo deletions in parent-offspring trios from capture-based sequencing platforms. RESULTS: We developed Minimum Distance for Targeted Sequencing (MDTS) to fill this void. MDTS has similar sensitivity (recall), but a much lower false positive rate compared to less specific CNV callers, resulting in a much higher positive predictive value (precision). MDTS also exhibited much better scalability. AVAILABILITY AND IMPLEMENTATION: MDTS is freely available as open source software from the Bioconductor repository. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , DNA Copy Number Variations , Sequence Deletion , Software , Computational Biology
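
The "minimum distance" idea named in the abstract above can be illustrated with a small sketch. It assumes, as a simplification, that each capture target has one normalized coverage value per trio member and that the statistic is the offspring-minus-parent difference with the smaller absolute value; the cutoff and function names are hypothetical and this is not the MDTS implementation.

```python
def minimum_distance(offspring, father, mother):
    """Signed offspring-parent difference with the smaller magnitude.

    If the offspring resembles at least one parent, the result is near
    zero; it is strongly negative only when the offspring has lower
    coverage than BOTH parents, the signature of a de novo deletion.
    """
    d_father = offspring - father
    d_mother = offspring - mother
    return d_father if abs(d_father) <= abs(d_mother) else d_mother


def flag_candidate_de_novo_deletions(trio_profiles, cutoff=-0.5):
    """Indices of targets whose minimum distance is at or below `cutoff`.

    trio_profiles: list of (offspring, father, mother) normalized
    coverage values, one tuple per capture target (hypothetical input).
    """
    return [
        index
        for index, (offspring, father, mother) in enumerate(trio_profiles)
        if minimum_distance(offspring, father, mother) <= cutoff
    ]
```

In this sketch an inherited deletion is not flagged, because the offspring matches the carrier parent and the minimum distance collapses to roughly zero.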
11.
BMC Cancer ; 20(1): 856, 2020 Sep 07.
Article in English | MEDLINE | ID: mdl-32894098

ABSTRACT

BACKGROUND: Germline copy number variants (CNVs) increase risk for many diseases, yet detection of CNVs and quantifying their contribution to disease risk in large-scale studies is challenging due to biological and technical sources of heterogeneity that vary across the genome within and between samples. METHODS: We developed an approach called CNPBayes to identify latent batch effects in genome-wide association studies involving copy number, to provide probabilistic estimates of integer copy number across the estimated batches, and to fully integrate the copy number uncertainty in the association model for disease. RESULTS: Applying a hidden Markov model (HMM) to identify CNVs in a large multi-site Pancreatic Cancer Case Control study (PanC4) of 7598 participants, we found CNV inference was highly sensitive to technical noise that varied appreciably among participants. Applying CNPBayes to this dataset, we found that the major sources of technical variation were linked to sample processing by the centralized laboratory and not the individual study sites. Modeling the latent batch effects at each CNV region hierarchically, we developed probabilistic estimates of copy number that were directly incorporated in a Bayesian regression model for pancreatic cancer risk. Candidate associations aided by this approach include deletions of 8q24 near regulatory elements of the tumor oncogene MYC and of Tumor Suppressor Candidate 3 (TUSC3). CONCLUSIONS: Laboratory effects may not account for the major sources of technical variation in genome-wide association studies. This study provides a robust Bayesian inferential framework for identifying latent batch effects, estimating copy number, and evaluating the role of copy number in heritable diseases.


Subject(s)
DNA Copy Number Variations/genetics , Genetic Predisposition to Disease , Genome, Human/genetics , Pancreatic Neoplasms/genetics , Bayes Theorem , Case-Control Studies , Genome-Wide Association Study , Humans , Membrane Proteins/genetics , Pancreatic Neoplasms/pathology , Proto-Oncogene Proteins c-myc/genetics , Tumor Suppressor Proteins/genetics
12.
Genet Epidemiol ; 41(1): 61-69, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27910131

ABSTRACT

By sequencing the exomes of distantly related individuals in multiplex families, rare mutational and structural changes to coding DNA can be characterized and their relationship to disease risk can be assessed. Recently, several rare single nucleotide variants (SNVs) were associated with an increased risk of nonsyndromic oral cleft, highlighting the importance of rare sequence variants in oral clefts and illustrating the strength of family-based study designs. However, the extent to which rare deletions in coding regions of the genome occur and contribute to risk of nonsyndromic clefts is not well understood. To identify putative structural variants underlying risk, we developed a pipeline for rare hemizygous deletions in families from whole exome sequencing and statistical inference based on rare variant sharing. Among 56 multiplex families with 115 individuals, we identified 53 regions with one or more rare hemizygous deletions. We found 45 of the 53 regions contained rare deletions occurring in only one family member. Members of the same family shared a rare deletion in only eight regions. We also devised a scalable global test for enrichment of shared rare deletions.


Subject(s)
Biomarkers/analysis , Cleft Palate/genetics , Exome/genetics , Gene Deletion , Genetic Variation/genetics , Algorithms , Family , Female , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Male
13.
Proc Natl Acad Sci U S A ; 112(37): 11583-8, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26324937

ABSTRACT

The tumor protein 53 (TP53) tumor suppressor gene is the most frequently somatically altered gene in human cancers. Here we show expression of N-Myc down-regulated gene 1 (NDRG1) is induced by p53 during physiologic low proliferative states, and mediates centrosome homeostasis, thus maintaining genome stability. When placed in physiologic low-proliferating conditions, human TP53 null cells fail to increase expression of NDRG1 compared with isogenic wild-type controls and TP53 R248W knockin cells. Overexpression and RNA interference studies demonstrate that NDRG1 regulates centrosome number and amplification. Mechanistically, NDRG1 physically associates with γ-tubulin, a key component of the centrosome, with reduced association in p53 null cells. Strikingly, TP53 homozygous loss was mutually exclusive of NDRG1 overexpression in over 96% of human cancers, supporting the broad applicability of these results. Our study elucidates a mechanism of how TP53 loss leads to abnormal centrosome numbers and genomic instability mediated by NDRG1.


Subject(s)
Cell Cycle Proteins/metabolism , Centrosome/ultrastructure , Gene Expression Regulation, Neoplastic , Intracellular Signaling Peptides and Proteins/metabolism , Tumor Suppressor Protein p53/metabolism , Aneuploidy , Animals , Breast/metabolism , Cell Line , Cell Proliferation , Centrosome/metabolism , Female , Genome , Heterozygote , Homeostasis , Homozygote , Humans , In Situ Hybridization, Fluorescence , Mice , Mice, Knockout , Neoplasms/pathology , Phenotype , RNA Interference , Tubulin/metabolism
14.
Genet Epidemiol ; 40(1): 81-8, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26643968

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a progressive disease with both environmental and genetic risk factors. Genome-wide association studies (GWAS) have identified multiple genomic regions influencing risk of COPD. To thoroughly investigate the genetic etiology of COPD, however, it is also important to explore the role of copy number variants (CNVs) because the presence of structural variants can alter gene expression and can be causal for some diseases. Here, we investigated effects of polymorphic CNVs on quantitative measures of pulmonary function and chest computed tomography (CT) phenotypes among subjects enrolled in COPDGene, a multisite study. COPDGene subjects consist of roughly one-third African American (AA) and two-thirds non-Hispanic white adult smokers (with or without COPD). We estimated CNVs using PennCNV on 9,076 COPDGene subjects using Illumina's Omni-Express genome-wide marker array. We tested for association between polymorphic CNV components (defined as disjoint intervals of copy number regions) for several quantitative phenotypes associated with COPD within each racial group. Among the AAs, we identified a polymorphic CNV on chromosome 5q35.2 located between two genes (FAM153B and SIMK1, but also harboring several pseudo-genes) giving genome-wide significance in tests of association with total lung capacity (TLC_CT) as measured by chest CT scans. This is the first study of genome-wide association tests of polymorphic CNVs and TLC_CT. Although the ARIC cohort did not have the phenotype of TLC_CT, we found similar counts of CNV deletions and amplifications among AA and European subjects in this second cohort.


Subject(s)
Chromosome Deletion , Chromosomes, Human, Pair 5 , DNA Copy Number Variations , Pulmonary Disease, Chronic Obstructive/genetics , Smoking , Black or African American/genetics , Aged , Biomarkers , Cohort Studies , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Markov Chains , Middle Aged , Total Lung Capacity , White People/genetics
15.
Nat Rev Genet ; 11(10): 733-9, 2010 10.
Article in English | MEDLINE | ID: mdl-20838408

ABSTRACT

High-throughput technologies are widely used, for example to assay genetic variants, gene and protein expression, and epigenetic modifications. One often overlooked complication with such studies is batch effects, which occur because measurements are affected by laboratory conditions, reagent lots and personnel differences. This becomes a major problem when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. Using both published studies and our own analyses, we argue that batch effects (as well as other technical and biological artefacts) are widespread and critical to address. We review experimental and computational approaches for doing so.


Subject(s)
Biotechnology/methods , Genomics/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Biotechnology/standards , Biotechnology/statistics & numerical data , Computational Biology/methods , Genomics/standards , Genomics/statistics & numerical data , Oligonucleotide Array Sequence Analysis/standards , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Periodicals as Topic/standards , Research Design/standards , Research Design/statistics & numerical data , Sequence Analysis, DNA/standards , Sequence Analysis, DNA/statistics & numerical data
16.
Genet Epidemiol ; 38(6): 516-22, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25048299

ABSTRACT

Case-parent trio studies are commonly employed in genetics to detect variants underlying common complex disease risk. Both commercial and freely available software suites for genetic data analysis usually contain methods for case-parent trio designs. A user might, however, experience limitations with these packages, which can include missing functionality to extend the software if a desired analysis has not been implemented, and the inability to programmatically capture all the software versions used for low-level processing and high-level inference of genomic data, a critical consideration in particular for high-throughput experiments. Here, we present a software vignette (i.e., a manual with step by step instructions and examples to demonstrate software functionality) for reproducible genome-wide analyses of case-parent trio data using the open source Bioconductor package trio. The workflow for the practitioner uses data from previous genetic trio studies to illustrate functions for marginal association tests, assessment of parent-of-origin effects, power and sample size calculations, and functions to detect gene-gene and gene-environment interactions associated with disease.


Subject(s)
Genetic Variation , Software , Child , Gene-Environment Interaction , Genetic Association Studies , Genotype , Humans , Parents , Polymorphism, Single Nucleotide
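
The marginal association tests mentioned in the abstract above are, in their simplest allelic form, transmission disequilibrium tests. The sketch below shows only the classical McNemar-type statistic computed from transmission counts of heterozygous parents; it is an illustration in Python with hypothetical names and does not reproduce the interface of the Bioconductor trio package.

```python
import math


def transmission_disequilibrium_test(transmitted, untransmitted):
    """Classical allelic TDT from heterozygous-parent transmissions.

    transmitted   times the candidate allele was passed to the affected child
    untransmitted times it was not passed
    Returns (chi-square statistic with 1 df, p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    statistic = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```

With 60 transmissions against 40 non-transmissions the statistic is 4.0 and the p-value is about 0.046. Because the comparison is made within families, the test is robust to population stratification, one reason the trio design is popular.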
17.
Birth Defects Res A Clin Mol Teratol ; 103(4): 276-83, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25776870

ABSTRACT

BACKGROUND: DNA copy number variants play an important part in the development of common birth defects such as oral clefts. Individual patients with multiple birth defects (including oral clefts) have been shown to carry small and large chromosomal deletions. METHODS: We investigated the role of polymorphic copy number deletions by comparing transmission rates of deletions from parents to offspring in case-parent trios of European ancestry ascertained through a cleft proband with trios ascertained through a normal offspring. DNA copy numbers in trios were called using the joint hidden Markov model in the freely available PennCNV software. All statistical analyses were performed using Bioconductor tools in the open source environment R. RESULTS: We identified a 67 kb region in the gene MGAM on chromosome 7q34, and a 206 kb region overlapping genes ADAM3A and ADAM5 on chromosome 8p11, where deletions are more frequently transmitted to cleft offspring than control offspring. CONCLUSIONS: These genes or nearby regulatory elements may be involved in the etiology of oral clefts.


Subject(s)
Chromosome Deletion , Chromosomes, Human, Pair 7/genetics , Chromosomes, Human, Pair 8/genetics , Cleft Lip/genetics , Cleft Palate/genetics , DNA Copy Number Variations/genetics , Inheritance Patterns/genetics , Genomics/methods , Humans , Markov Chains , Models, Genetic
18.
BMC Genet ; 15: 24, 2014 Feb 14.
Article in English | MEDLINE | ID: mdl-24528994

ABSTRACT

BACKGROUND: Copy number variants (CNVs) may play an important part in the development of common birth defects such as oral clefts, and individual patients with multiple birth defects (including clefts) have been shown to carry small and large chromosomal deletions. In this paper we investigate de novo deletions defined as DNA segments missing in an oral cleft proband but present in both unaffected parents. We compare de novo deletion frequencies in children of European ancestry with an isolated, non-syndromic oral cleft to frequencies in children of European ancestry from randomly sampled trios. RESULTS: We identified a genome-wide significant 62 kilobase (kb) non-coding region on chromosome 7p14.1 where de novo deletions occur more frequently among oral cleft cases than controls. We also observed wider de novo deletions among cleft lip and palate (CLP) cases than seen among cleft palate (CP) and cleft lip (CL) cases. CONCLUSIONS: This study presents a region where de novo deletions appear to be involved in the etiology of oral clefts, although the underlying biological mechanisms are still unknown. Larger de novo deletions are more likely to interfere with normal craniofacial development and may result in more severe clefts. Study protocol and sample DNA source can severely affect estimates of de novo deletion frequencies. Follow-up studies are needed to further validate these findings and to potentially identify additional structural variants underlying oral clefts.


Subject(s)
Chromosome Deletion , Cleft Lip/genetics , Cleft Palate/genetics , DNA Copy Number Variations , Algorithms , Alleles , Child , Gene Frequency , Genome-Wide Association Study , Humans , White People/genetics
19.
BMC Genet ; 15: 81, 2014 Jul 09.
Article in English | MEDLINE | ID: mdl-25007794

ABSTRACT

BACKGROUND: Hyperuricemia is associated with multiple diseases, including gout, cardiovascular disease, and renal disease. Serum urate is highly heritable, yet association studies of single nucleotide polymorphisms (SNPs) and serum uric acid explain a small fraction of the heritability. Whether copy number polymorphisms (CNPs) contribute to uric acid levels is unknown. RESULTS: We assessed copy number on a genome-wide scale among 8,411 individuals of European ancestry (EA) who participated in the Atherosclerosis Risk in Communities (ARIC) study. CNPs upstream of the urate transporter SLC2A9 on chromosome 4p16.1 are associated with uric acid (χ²(df 2) = 3545, p = 3.19×10⁻²³). Effect sizes, expressed as the percentage change in uric acid per deleted copy, are most pronounced among women (4.93 [3.97, 5.87], given as the 50th [2.5th, 97.5th] percentiles; p = 4.57×10⁻²³) and independent of previously reported SNPs in SLC2A9 as assessed by SNP and CNP regression models and the phasing SNP and CNP haplotypes (χ²(df 2) = 3190, p = 7.23×10⁻⁸). Our finding is replicated in the Framingham Heart Study (FHS), where the effect size estimated from 4,089 women is comparable to ARIC in direction and magnitude (4.70 [1.41, 7.88], p = 5.46×10⁻³). CONCLUSIONS: This is the first study to characterize CNPs in ARIC and the first genome-wide analysis of CNPs and uric acid. Our findings suggest a novel, non-coding regulatory mechanism for SLC2A9-mediated modulation of serum uric acid, and detail a bioinformatic approach for assessing the contribution of CNPs to heritable traits in large population-based studies where technical sources of variation are substantial.


Subject(s)
DNA Copy Number Variations , Glucose Transport Proteins, Facilitative/genetics , Uric Acid/blood , Female , Gene Frequency , Genome-Wide Association Study , Genotype , Humans , Male , Middle Aged , Models, Statistical , Organic Anion Transporters/genetics , Polymorphism, Single Nucleotide , Regression Analysis , White People/genetics
20.
ArXiv ; 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38410652

ABSTRACT

Most neoplastic tumors originate from a single cell, and their evolution can be genetically traced through lineages characterized by common alterations such as small somatic mutations (SSMs), copy number alterations (CNAs), structural variants (SVs), and aneuploidies. Due to the complexity of these alterations in most tumors and the errors introduced by sequencing protocols and calling algorithms, tumor subclonal reconstruction algorithms are necessary to recapitulate the DNA sequence composition and tumor evolution in silico. With a growing number of these algorithms available, there is a pressing need for consistent and comprehensive benchmarking, which relies on realistic tumor sequencing generated by simulation tools. Here, we examine the current simulation methods, identifying their strengths and weaknesses, and provide recommendations for their improvement. Our review also explores potential new directions for research in this area. This work aims to serve as a resource for understanding and enhancing tumor genomic simulations, contributing to the advancement of the field.
