1.
Cell ; 185(23): 4409-4427.e18, 2022 11 10.
Article in English | MEDLINE | ID: mdl-36368308

ABSTRACT

Fully understanding autism spectrum disorder (ASD) genetics requires whole-genome sequencing (WGS). We present the latest release of the Autism Speaks MSSNG resource, which includes WGS data from 5,100 individuals with ASD and 6,212 non-ASD parents and siblings (total n = 11,312). Examining a wide variety of genetic variants in MSSNG and the Simons Simplex Collection (SSC; n = 9,205), we identified ASD-associated rare variants in 718/5,100 individuals with ASD from MSSNG (14.1%) and 350/2,419 from SSC (14.5%). Considering genomic architecture, 52% were nuclear sequence-level variants, 46% were nuclear structural variants (including copy-number variants, inversions, large insertions, uniparental isodisomies, and tandem repeat expansions), and 2% were mitochondrial variants. Our study provides a guidebook for exploring genotype-phenotype correlations in families who carry ASD-associated rare variants and serves as an entry point to the expanded studies required to dissect the etiology in the ∼85% of the ASD population that remain idiopathic.
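The yield percentages quoted in this abstract can be re-derived from the counts it reports. The snippet below is a minimal arithmetic check rather than anything taken from the study; the variable names are illustrative.

```python
# Counts are taken from the abstract above; the names are illustrative only.
mssng_asd, mssng_non_asd = 5100, 6212
cohorts = {"MSSNG": (718, 5100), "SSC": (350, 2419)}  # (carriers, individuals with ASD)

# Total number of individuals with WGS data in this MSSNG release.
total_mssng = mssng_asd + mssng_non_asd

# Percentage of individuals with ASD who carry an ASD-associated rare variant.
yields = {name: round(100 * carriers / n, 1) for name, (carriers, n) in cohorts.items()}

print(total_mssng)  # 11312
print(yields)       # {'MSSNG': 14.1, 'SSC': 14.5}
```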


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Humans , Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , DNA Copy Number Variations/genetics , Genomics
2.
Nature ; 586(7827): 80-86, 2020 10.
Article in English | MEDLINE | ID: mdl-32717741

ABSTRACT

Tandem DNA repeats vary in the size and sequence of each unit (motif). When expanded, these tandem DNA repeats have been associated with more than 40 monogenic disorders1. Their involvement in disorders with complex genetics is largely unknown, as is the extent of their heterogeneity. Here we investigated the genome-wide characteristics of tandem repeats that had motifs with a length of 2-20 base pairs in 17,231 genomes of families containing individuals with autism spectrum disorder (ASD)2,3 and population control individuals4. We found extensive polymorphism in the size and sequence of motifs. Many of the tandem repeat loci that we detected correlated with cytogenetic fragile sites. At 2,588 loci, gene-associated expansions of tandem repeats that were rare among population control individuals were significantly more prevalent among individuals with ASD than their siblings without ASD, particularly in exons and near splice junctions, and in genes related to the development of the nervous system and cardiovascular system or muscle. Rare tandem repeat expansions had a prevalence of 23.3% in children with ASD compared with 20.7% in children without ASD, which suggests that tandem repeat expansions make a collective contribution to the risk of ASD of 2.6%. These rare tandem repeat expansions included previously undescribed ASD-linked expansions in DMPK and FXN, which are associated with neuromuscular conditions, and in previously unknown loci such as FGF14 and CACNB1. Rare tandem repeat expansions were associated with lower IQ and adaptive ability. Our results show that tandem DNA repeat expansions contribute strongly to the genetic aetiology and phenotypic complexity of ASD.


Subject(s)
Autism Spectrum Disorder/genetics , DNA Repeat Expansion/genetics , Genome, Human/genetics , Genomics , Tandem Repeat Sequences/genetics , Female , Fibroblast Growth Factors/genetics , Genetic Predisposition to Disease , Humans , Intelligence/genetics , Iron-Binding Proteins/genetics , Male , Myotonin-Protein Kinase/genetics , Nucleotide Motifs , Polymorphism, Genetic , Frataxin
3.
Hum Mol Genet ; 32(15): 2411-2421, 2023 07 20.
Article in English | MEDLINE | ID: mdl-37154571

ABSTRACT

We assessed the relationship of gene copy number variation (CNV) to mental health/neurodevelopmental traits and diagnoses, physical health and cognition in a community sample of 7100 unrelated children and youth of European or East Asian ancestry (Spit for Science). Clinically significant or susceptibility CNVs were present in 3.9% of participants and were associated with elevated scores on a continuous measure of attention-deficit/hyperactivity disorder (ADHD) traits (P = 5.0 × 10⁻³), longer response inhibition (a cognitive deficit found in several mental health and neurodevelopmental disorders; P = 1.0 × 10⁻²) and increased prevalence of mental health diagnoses (P = 1.9 × 10⁻⁶, odds ratio: 3.09), specifically ADHD, autism spectrum disorder, anxiety and learning problems/learning disorder (P's < 0.01). There was an increased burden of rare deletions in gene-sets related to brain function or expression in brain associated with more ADHD traits. With the current mental health crisis, our data established a baseline for delineating genetic contributors in pediatric-onset conditions.


Subject(s)
Attention Deficit Disorder with Hyperactivity , Autism Spectrum Disorder , Adolescent , Humans , Child , Mental Health , DNA Copy Number Variations/genetics , Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Attention Deficit Disorder with Hyperactivity/epidemiology , Attention Deficit Disorder with Hyperactivity/genetics , Gene Dosage
4.
Mol Psychiatry ; 28(1): 475-482, 2023 01.
Article in English | MEDLINE | ID: mdl-36380236

ABSTRACT

Tandem repeat expansions (TREs) are associated with over 60 monogenic disorders and have recently been implicated in complex disorders such as cancer and autism spectrum disorder. The role of TREs in schizophrenia is now emerging. In this study, we have performed a genome-wide investigation of TREs in schizophrenia. Using genome sequence data from 1154 Swedish schizophrenia cases and 934 ancestry-matched population controls, we have detected genome-wide rare (<0.1% population frequency) TREs that have motifs with a length of 2-20 base pairs. We find that the proportion of individuals carrying rare TREs is significantly higher in the schizophrenia group. There is a significantly higher burden of rare TREs in schizophrenia cases than in controls in genic regions, particularly in postsynaptic genes, in genes overlapping brain expression quantitative trait loci, and in brain-expressed genes that are differentially expressed between schizophrenia cases and controls. We demonstrate that TRE-associated genes are more constrained and primarily impact synaptic and neuronal signaling functions. These results have been replicated in an independent Canadian sample that consisted of 252 schizophrenia cases of European ancestry and 222 ancestry-matched controls. Our results support the involvement of rare TREs in schizophrenia etiology.


Subject(s)
Autism Spectrum Disorder , Schizophrenia , Humans , Schizophrenia/genetics , Genome-Wide Association Study , Canada , Gene Frequency , Genetic Predisposition to Disease/genetics
5.
J Med Genet ; 60(12): 1153-1160, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-37290907

ABSTRACT

BACKGROUND: We present genomic and phenotypic findings of a transgenerational family consisting of three male offspring, each with a maternally inherited distal 220 kb deletion at locus 16p11.2 (BP2-BP3). Genomic analysis of all family members was prompted by a diagnosis of autism spectrum disorder (ASD) in the eldest child, who also presented with a low body mass index. METHODS: All male offspring underwent extensive neuropsychiatric evaluation. Both parents were also assessed for social functioning and cognition. The family underwent whole-genome sequencing. Further data curation was undertaken from samples ascertained for neurodevelopmental disorders and congenital abnormalities. RESULTS: On medical examination, both the second and third-born male offspring presented with obesity. The second-born male offspring met research diagnostic criteria for ASD at 8 years of age and presented with mild attention deficits. The third-born male offspring was only noted as having motor deficits and received a diagnosis of developmental coordination disorder. Other than the 16p11.2 distal deletion, no additional contributing variants of clinical significance were observed. The mother was clinically evaluated and noted as having a broader autism phenotype. CONCLUSION: In this family, the phenotypes observed are most likely caused by the 16p11.2 distal deletion. The lack of other overt pathogenic mutations identified by genomic sequencing reinforces the variable expressivity that should be heeded in a clinical setting. Importantly, distal 16p11.2 deletions can present with a highly variable phenotype even within a single family. Our additional data curation provides further evidence on the variable clinical presentation among those with pathogenetic 16p11.2 (BP2-BP3) mutations.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Intellectual Disability , Child , Humans , Male , Chromosome Deletion , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Autistic Disorder/genetics , Family , Phenotype , Biological Variation, Population , Chromosomes, Human, Pair 16/genetics , Intellectual Disability/diagnosis , Intellectual Disability/genetics
6.
Psychiatry Clin Neurosci ; 78(7): 405-415, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38751214

ABSTRACT

AIM: Short tandem repeats (STRs) are repetitive, highly mutable DNA sequences that have been implicated in various human disorders. While the involvement of STRs in various genetic disorders has been extensively studied, their role in autism spectrum disorder (ASD) remains largely unexplored. In this study, we aimed to investigate the genetic association of STR expansions with ASD using whole genome sequencing (WGS) and to identify risk loci associated with ASD phenotypes. METHODS: We analyzed WGS data of 634 ASD families and performed genome-wide evaluation for 12,929 STR loci. We found rare STR expansions that exceeded normal repeat lengths in autism cases compared to unaffected controls. By integrating single cell RNA and ATAC sequencing datasets of human postmortem brains, we prioritized STR loci in genes specifically expressed in cortical development stages. A deep learning method was used to predict functionality of ASD-associated STR loci. RESULTS: In ASD cases, rare STR expansions predominantly occurred in early cortical layer-specific genes involved in neurodevelopment, highlighting the cellular specificity of STR-associated genes in ASD risk. Leveraging deep learning prediction models, we demonstrated that these STR expansions disrupted the regulatory activity of enhancers and promoters, suggesting a potential mechanism through which they contribute to ASD pathogenesis. We found that individuals with ASD-associated STR expansions exhibited more severe ASD phenotypes and diminished adaptability compared to non-carriers. CONCLUSION: Short tandem repeat expansions in cortical layer-specific genes are associated with ASD and could potentially be a genetic risk factor for ASD. Our study is the first to show evidence of STR expansion associated with ASD in an under-investigated population.


Subject(s)
Autism Spectrum Disorder , Microsatellite Repeats , Humans , Autism Spectrum Disorder/genetics , Microsatellite Repeats/genetics , Male , Female , Cerebral Cortex/pathology , Phenotype , Child , Whole Genome Sequencing , Deep Learning , Severity of Illness Index , Adult , DNA Repeat Expansion/genetics
7.
Article in English | MEDLINE | ID: mdl-38967411

ABSTRACT

This study investigated the neurodevelopmental impact of pathogenic adenomatous polyposis coli (APC) gene variants in patients with familial adenomatous polyposis (FAP), a cancer predisposition syndrome. We hypothesized that certain pathogenic APC variants result in behavioral-cognitive challenges. We compared 66 FAP patients (cases) and 34 unaffected siblings (controls) to explore associations between APC variants and behavioral and cognitive challenges. Our findings indicate that FAP patients exhibited higher Social Responsiveness Scale (SRS) scores, suggesting a greater prevalence of autistic traits when compared to unaffected siblings (mean 53.8 vs. 47.4, Wilcoxon p = 0.018). The distribution of SRS scores in cases suggested a bimodal pattern, potentially linked to the location of the APC variant, with scores increasing from the 5' to 3' end of the gene (Pearson's r = 0.33, p = 0.022). While we observed a trend toward lower educational attainment in cases, this difference was not statistically significant. This study is the first to explore the connection between APC variant location and neurodevelopmental traits in FAP, expanding our understanding of the genotype-phenotype correlation. Our results emphasize the importance of clinical assessment for autistic traits in FAP patients, shedding light on the potential role of APC gene variants in these behavioral and cognitive challenges.

8.
Hum Mol Genet ; 30(R2): R174-R186, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34296264

ABSTRACT

Over the past 30 years (the timespan of a generation), advances in genomics technologies have revealed tremendous and unexpected variation in the human genome and have provided increasingly accurate answers to long-standing questions of how much genetic variation exists in human populations and to what degree the DNA complement changes between parents and offspring. Tracking the characteristics of these inherited and spontaneous (or de novo) variations has been the basis of the study of human genetic disease. From genome-wide microarray and next-generation sequencing scans, we now know that each human genome contains over 3 million single nucleotide variants when compared with the ~3 billion base pairs in the human reference genome, along with roughly an order of magnitude more DNA, approximately 30 megabase pairs (Mb), being 'structurally variable', mostly in the form of indels and copy number changes. Additional large-scale variations include balanced inversions (average of 18 Mb) and complex, difficult-to-resolve alterations. Collectively, ~1% of an individual's genome will differ from the human reference sequence. When comparing across a generation, fewer than 100 new genetic variants are typically detected in the euchromatic portion of a child's genome. Driven by increasingly higher-resolution and higher-throughput sequencing technologies, newer and more accurate databases of genetic variation (for instance, more comprehensive structural variation data and phasing of combinations of variants along chromosomes) of worldwide populations will emerge to underpin the next era of discovery in human molecular genetics.


Subject(s)
Genetic Variation , Genome, Human , Genomics , Female , Genome-Wide Association Study , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Male , Mutation , Whole Genome Sequencing
9.
Hum Genet ; 142(2): 201-216, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36376761

ABSTRACT

Copy number variants (CNVs) represent major etiologic factors in rare genetic diseases. Current clinical CNV interpretation workflows require extensive back-and-forth with multiple tools and databases. This increases complexity and time burden, potentially resulting in missed genetic diagnoses. We present the Suite for CNV Interpretation and Prioritization (SCIP), a software package for the clinical interpretation of CNVs detected by whole-genome sequencing (WGS). The SCIP Visualization Module near-instantaneously displays all information necessary for CNV interpretation (variant quality, population frequency, inheritance pattern, and clinical relevance) on a single page, supported by modules providing variant filtration and prioritization. SCIP was comprehensively evaluated using WGS data from 1027 families with congenital cardiac disease and/or autism spectrum disorder, containing 187 pathogenic or likely pathogenic (P/LP) CNVs identified in previous curations. SCIP was efficient in filtration and prioritization: a median of just two CNVs per case were selected for review, yet it captured all P/LP findings (92.5% of which ranked 1st). SCIP was also able to identify one pathogenic CNV previously missed. SCIP was benchmarked against AnnotSV and a spreadsheet-based manual workflow and outperformed both. In conclusion, SCIP is a novel software package for efficient clinical CNV interpretation, substantially faster and more accurate than previous tools (available at https://github.com/qd29/SCIP, with a video tutorial series at https://bit.ly/SCIPVideos).


Subject(s)
Autism Spectrum Disorder , DNA Copy Number Variations , Humans , Whole Genome Sequencing , Software , Rare Diseases
10.
Mol Psychiatry ; 27(9): 3692-3698, 2022 09.
Article in English | MEDLINE | ID: mdl-35546631

ABSTRACT

Tandem repeat expansions (TREs) can cause neurological diseases but their impact in schizophrenia is unclear. Here we analyzed genome sequences of adults with schizophrenia and found that they have a higher burden of TREs that are near exons and rare in the general population, compared with non-psychiatric controls. These TREs are disproportionately found at loci known to be associated with schizophrenia from genome-wide association studies, in individuals with clinically-relevant genetic variants at other schizophrenia loci, and in families where multiple individuals have schizophrenia. We showed that rare TREs in schizophrenia may impact synaptic functions by disrupting the splicing process of their associated genes in a loss-of-function manner. Our findings support the involvement of genome-wide rare TREs in the polygenic nature of schizophrenia.


Subject(s)
Schizophrenia , Adult , Humans , Schizophrenia/genetics , Schizophrenia/epidemiology , Genome-Wide Association Study , Genetic Predisposition to Disease/genetics , Multifactorial Inheritance/genetics , Tandem Repeat Sequences , Polymorphism, Single Nucleotide/genetics
11.
Am J Hum Genet ; 104(6): 1116-1126, 2019 06 06.
Article in English | MEDLINE | ID: mdl-31104771

ABSTRACT

Huntington disease (HD) is caused by a CAG repeat expansion in the huntingtin (HTT) gene. Although the length of this repeat is inversely correlated with age of onset (AOO), it does not fully explain the variability in AOO. We assessed the sequence downstream of the CAG repeat in HTT [reference: (CAG)n-CAA-CAG], since variants within this region have been previously described, but no study of AOO has been performed. These analyses identified a variant that results in complete loss of interrupting (LOI) adenine nucleotides in this region [(CAG)n-CAG-CAG]. Analysis of multiple HD pedigrees showed that this LOI variant is associated with dramatically earlier AOO (average of 25 years) despite the same polyglutamine length as in individuals with the interrupting penultimate CAA codon. This LOI allele is particularly frequent in persons with reduced penetrance alleles who manifest with HD and increases the likelihood of presenting clinically with HD with a CAG of 36-39 repeats. Further, we show that the LOI variant is associated with increased somatic repeat instability, highlighting this as a significant driver of this effect. These findings indicate that the number of uninterrupted CAG repeats, which is lengthened by the LOI, is the most significant contributor to AOO of HD and is more significant than polyglutamine length, which is not altered in these individuals. In addition, we identified another variant in this region, where the CAA-CAG sequence is duplicated, which was associated with later AOO. Identification of these cis-acting modifiers have potentially important implications for genetic counselling in HD-affected families.
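The distinction this abstract draws between uninterrupted CAG length and polyglutamine length can be made concrete with a small sketch. It relies on the fact that CAG and CAA both encode glutamine; the repeat count `n = 40` and the helper names are hypothetical, chosen only for illustration, and the code is not the authors' analysis method.

```python
GLUTAMINE_CODONS = {"CAG", "CAA"}  # both codons encode glutamine


def codons(seq):
    """Split an in-frame DNA string into codons, ignoring any trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def longest_run(seq, allowed):
    """Length of the longest run of consecutive codons drawn from `allowed`."""
    best = run = 0
    for codon in codons(seq):
        run = run + 1 if codon in allowed else 0
        best = max(best, run)
    return best


n = 40  # hypothetical repeat count, for illustration only
reference = "CAG" * n + "CAA" + "CAG"  # (CAG)n-CAA-CAG
loi = "CAG" * n + "CAG" + "CAG"        # (CAG)n-CAG-CAG, loss of interruption

for name, allele in [("reference", reference), ("LOI", loi)]:
    uninterrupted = longest_run(allele, {"CAG"})
    polyq = longest_run(allele, GLUTAMINE_CODONS)
    print(name, "uninterrupted CAG:", uninterrupted, "polyglutamine:", polyq)
```

In this toy case both alleles encode 42 glutamines, but the uninterrupted CAG tract is 40 codons in the reference allele and 42 in the LOI allele, which is the difference the abstract associates with earlier onset.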


Subject(s)
Codon/genetics , Huntington Disease/genetics , Huntington Disease/pathology , Peptides/genetics , Trinucleotide Repeat Expansion/genetics , Adolescent , Adult , Age of Onset , Child , Female , Humans , Male , Middle Aged , Pedigree
12.
N Engl J Med ; 380(15): 1433-1441, 2019 04 11.
Article in English | MEDLINE | ID: mdl-30970188

ABSTRACT

We report an inborn error of metabolism caused by an expansion of a GCA-repeat tract in the 5' untranslated region of the gene encoding glutaminase (GLS) that was identified through detailed clinical and biochemical phenotyping, combined with whole-genome sequencing. The expansion was observed in three unrelated patients who presented with an early-onset delay in overall development, progressive ataxia, and elevated levels of glutamine. In addition to ataxia, one patient also showed cerebellar atrophy. The expansion was associated with a relative deficiency of GLS messenger RNA transcribed from the expanded allele, which probably resulted from repeat-mediated chromatin changes upstream of the GLS repeat. Our discovery underscores the importance of careful examination of regions of the genome that are typically excluded from or poorly captured by exome sequencing.


Subject(s)
Amino Acid Metabolism, Inborn Errors/genetics , Ataxia/genetics , Developmental Disabilities/genetics , Glutaminase/deficiency , Glutaminase/genetics , Glutamine/metabolism , Microsatellite Repeats , Mutation , Atrophy/genetics , Cerebellum/pathology , Child, Preschool , Female , Genotype , Glutamine/analysis , Humans , Male , Phenotype , Polymerase Chain Reaction , Whole Genome Sequencing
13.
Am J Hum Genet ; 102(1): 142-155, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29304372

ABSTRACT

A remaining hurdle to whole-genome sequencing (WGS) becoming a first-tier genetic test has been accurate detection of copy-number variations (CNVs). Here, we used several datasets to empirically develop a detailed workflow for identifying germline CNVs >1 kb from short-read WGS data using read depth-based algorithms. Our workflow is comprehensive in that it addresses all stages of the CNV-detection process, including DNA library preparation, sequencing, quality control, reference mapping, and computational CNV identification. We used our workflow to detect rare, genic CNVs in individuals with autism spectrum disorder (ASD), and 120/120 such CNVs tested using orthogonal methods were successfully confirmed. We also identified 71 putative genic de novo CNVs in this cohort, which had a confirmation rate of 70%; the remainder were incorrectly identified as de novo due to false positives in the proband (7%) or parental false negatives (23%). In individuals with an ASD diagnosis in which both microarray and WGS experiments were performed, our workflow detected all clinically relevant CNVs identified by microarrays, as well as additional potentially pathogenic CNVs < 20 kb. Thus, CNVs of clinical relevance can be discovered from WGS with a detection rate exceeding microarrays, positioning WGS as a single assay for genetic variation detection.


Subject(s)
DNA Copy Number Variations/genetics , Whole Genome Sequencing , Workflow , Algorithms , Child , Female , Haplotypes/genetics , Humans , Male , Reproducibility of Results , Sequence Analysis, DNA
14.
Europace ; 23(6): 844-850, 2021 06 07.
Article in English | MEDLINE | ID: mdl-33682005

ABSTRACT

AIMS: Atrial fibrillation (AF) is a complex heritable disease whose genetic underpinnings remain largely unexplained, though recent work has suggested that the arrhythmia may develop secondary to an underlying atrial cardiomyopathy. We sought to evaluate for enrichment of loss-of-function (LOF) and copy number variants (CNVs) in genes implicated in ventricular cardiomyopathy in 'lone' AF. METHODS AND RESULTS: Whole-exome sequencing was performed in 255 early onset 'lone' AF cases, defined as arrhythmia onset prior to 60 years of age in the absence of known clinical risk factors. Subsequent evaluations were restricted to 195 cases of European genetic ancestry, as defined by principal component analysis, and focused on a pre-defined set of 43 genes previously implicated in ventricular cardiomyopathy. Bioinformatic analysis identified 6 LOF variants (3.1%), including 3 within the TTN gene, among cases in comparison with 4 of 503 (0.80%) controls [odds ratio: 3.96; 95% confidence interval (CI): 1.11-14.2; P = 0.033]. Further, two AF cases possessed a novel heterozygous 8521 base pair TTN deletion, confirmed with Sanger sequencing and breakpoint validation, which was absent from 4958 controls (P = 0.0014). Subsequent cascade screening in two families revealed evidence of co-segregation of a LOF variant with 'lone' AF. CONCLUSION: 'Lone' AF cases are enriched in rare LOF variants from cardiomyopathy genes, findings primarily driven by TTN, and a novel TTN deletion, providing additional evidence to implicate atrial cardiomyopathy as an AF genetic sub-phenotype. Our results also highlight that AF may develop in the context of these variants in the absence of a discernable ventricular cardiomyopathy.
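The odds ratio in this abstract can be recomputed from the reported counts (6 of 195 cases versus 4 of 503 controls). The abstract does not say how the confidence interval was obtained; the sketch below assumes a standard Wald interval on the log odds ratio, and the function name is hypothetical. It does not attempt to reproduce the P value.

```python
import math


def odds_ratio_wald(case_hits, n_cases, control_hits, n_controls, z=1.96):
    """Odds ratio with a Wald 95% CI on the log scale (an assumed method)."""
    a, b = case_hits, n_cases - case_hits            # carriers / non-carriers among cases
    c, d = control_hits, n_controls - control_hits   # carriers / non-carriers among controls
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


odds_ratio, lower, upper = odds_ratio_wald(6, 195, 4, 503)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.1f}")
```

Under that assumption the result agrees with the reported odds ratio of 3.96 and interval of 1.11-14.2, with a lower bound only just above 1.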


Subject(s)
Atrial Fibrillation , Cardiomyopathies , Atrial Fibrillation/diagnosis , Atrial Fibrillation/genetics , Cardiomyopathies/diagnosis , Cardiomyopathies/genetics , DNA Copy Number Variations , Genetic Predisposition to Disease , Heterozygote , Humans , Phenotype
15.
Rheumatology (Oxford) ; 59(5): 1066-1075, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32321162

ABSTRACT

OBJECTIVE: To identify discrete clusters comprising clinical features and inflammatory biomarkers in children with JIA and to determine cluster alignment with JIA categories. METHODS: A Canadian prospective inception cohort comprising 150 children with JIA was evaluated at baseline (visit 1) and after six months (visit 2). Data included clinical manifestations and inflammation-related biomarkers. Probabilistic principal component analysis identified sets of composite variables, or principal components, from 191 original variables. To discern new clinical-biomarker clusters (clusters), Gaussian mixture models were fit to the data. Newly-defined clusters and JIA categories were compared. Agreement between the two was assessed using Kruskal-Wallis analyses and contingency plots. RESULTS: Three principal components recovered 35% (three clusters) and 40% (five clusters) of the variance in patient profiles in visits 1 and 2, respectively. None of the clusters aligned precisely with any of the seven JIA categories but rather spanned multiple categories. Results demonstrated that the newly defined clinical-biomarker clusters are more homogeneous than JIA categories. CONCLUSION: Applying unsupervised data mining to clinical and inflammatory biomarker data discerns discrete clusters that intersect multiple JIA categories. Results suggest that certain groups of patients within different JIA categories are more aligned pathobiologically than their separate clinical categorizations suggest. Applying data mining analyses to complex datasets can generate insights into JIA pathogenesis and could contribute to biologically based refinements in JIA classification.


Subject(s)
Arthritis, Juvenile/blood , Arthritis, Juvenile/physiopathology , Inflammation Mediators/blood , Adolescent , Age Factors , Arthritis, Juvenile/epidemiology , Biomarkers/blood , Canada/epidemiology , Child , Cluster Analysis , Cohort Studies , Data Mining , Female , Humans , Incidence , Male , Normal Distribution , Prospective Studies , Risk Assessment , Severity of Illness Index , Sex Factors , Syndrome
16.
Rheumatology (Oxford) ; 59(9): 2402-2411, 2020 09 01.
Article in English | MEDLINE | ID: mdl-31919503

ABSTRACT

OBJECTIVE: To identify early predictors of disease activity at 18 months in JIA using clinical and biomarker profiling. METHODS: Clinical and biomarker data were collected at JIA diagnosis in a prospective longitudinal inception cohort of 82 children with non-systemic JIA, and their ability to predict an active joint count of 0, a physician global assessment of disease activity of ≤1 cm, and inactive disease by Wallace 2004 criteria 18 months later was assessed. Correlation-based feature selection and ReliefF were used to shortlist predictors and random forest models were trained to predict outcomes. RESULTS: From the original 112 features, 13 effectively predicted 18-month outcomes. They included age, number of active/effused joints, wrist, ankle and/or knee involvement, ESR, ANA positivity and plasma levels of five inflammatory biomarkers (IL-10, IL-17, IL-12p70, soluble low-density lipoprotein receptor-related protein 1 and vitamin D), at enrolment. The clinical plus biomarker panel predicted active joint count = 0, physician global assessment ≤ 1, and inactive disease after 18 months with 0.79, 0.80 and 0.83 accuracy and 0.84, 0.83, 0.88 area under the curve, respectively. Using clinical features alone resulted in 0.75, 0.72 and 0.80 accuracy, and area under the curve values of 0.81, 0.78 and 0.83, respectively. CONCLUSION: A panel of five plasma biomarkers combined with clinical features at the time of diagnosis more accurately predicted short-term disease activity in JIA than clinical characteristics alone. If validated in external cohorts, such a panel may guide more rationally conceived, biologically based, personalized treatment strategies in early JIA.


Subject(s)
Arthritis, Juvenile/diagnosis , Interleukins/blood , Low Density Lipoprotein Receptor-Related Protein-1/blood , Severity of Illness Index , Vitamin D/blood , Adolescent , Ankle Joint/pathology , Area Under Curve , Arthritis, Juvenile/blood , Arthritis, Juvenile/pathology , Biomarkers/blood , Canada , Child , Child, Preschool , Female , Humans , Interleukin-10/blood , Interleukin-12/blood , Interleukin-17/blood , Knee Joint/pathology , Longitudinal Studies , Male , Predictive Value of Tests , Prospective Studies , Wrist Joint/pathology
17.
J Med Genet ; 56(12): 809-817, 2019 12.
Article in English | MEDLINE | ID: mdl-31515274

ABSTRACT

BACKGROUND: Whole blood is currently the most common DNA source for whole-genome sequencing (WGS), but for studies requiring non-invasive collection, self-collection, greater sample stability or additional tissue references, saliva or buccal samples may be preferred. However, the relative quality of sequencing data and accuracy of genetic variant detection from blood-derived, saliva-derived and buccal-derived DNA need to be thoroughly investigated. METHODS: Matched blood, saliva and buccal samples from four unrelated individuals were used to compare sequencing metrics and variant-detection accuracy among these DNA sources. RESULTS: We observed significant differences among DNA sources for sequencing quality metrics such as percentage of reads aligned and mean read depth (p<0.05). Differences were negligible in the accuracy of detecting short insertions and deletions; however, the false positive rate for single nucleotide variation detection was slightly higher in some saliva and buccal samples. The sensitivity of copy number variant (CNV) detection was up to 25% higher in blood samples, depending on CNV size and type, and appeared to be worse in saliva and buccal samples with high bacterial concentration. We also show that methylation-based enrichment for eukaryotic DNA in saliva and buccal samples increased alignment rates but also reduced read-depth uniformity, hampering CNV detection. CONCLUSION: For WGS, we recommend using DNA extracted from blood rather than saliva or buccal swabs; if saliva or buccal samples are used, we recommend against using methylation-based eukaryotic DNA enrichment. All data used in this study are available for further open-science investigation.


Subject(s)
DNA Copy Number Variations/genetics , DNA/genetics , Whole Genome Sequencing/standards , Adult , DNA/blood , DNA/chemistry , DNA/standards , DNA Methylation/genetics , Female , Genotype , Humans , Male , Middle Aged , Mouth Mucosa/chemistry , Polymorphism, Single Nucleotide/genetics , Saliva/chemistry , Sequence Analysis, DNA/standards
18.
BMC Genomics ; 19(1): 23, 2018 01 05.
Article in English | MEDLINE | ID: mdl-29304736

ABSTRACT

BACKGROUND: Clubroot is an important disease caused by the obligate parasite Plasmodiophora brassicae that infects the Brassicaceae. As a soil-borne pathogen, P. brassicae induces the generation of abnormal tissue in the root, resulting in the formation of galls. Root infection negatively affects the uptake of water and nutrients in host plants, severely reducing their growth and productivity. Many studies have emphasized the molecular and physiological effects of the clubroot disease on root tissues. The aim of the present study is to better understand the effect of P. brassicae on the transcriptome of both shoot and root tissues of Arabidopsis thaliana. RESULTS: Transcriptome profiling using RNA-seq was performed on both shoot and root tissues at 17, 20 and 24 days post inoculation (dpi) of A. thaliana, a model plant host for P. brassicae. The number of differentially expressed genes (DEGs) between infected and uninfected samples was larger in shoot than in root. In both shoot and root, more genes were differentially regulated at 24 dpi than the two earlier time points. Genes that were highly regulated in response to infection in both shoot and root primarily were involved in the metabolism of cell wall compounds, lipids, and shikimate pathway metabolites. Among hormone-related pathways, several jasmonic acid biosynthesis genes were upregulated in both shoot and root tissue. Genes encoding enzymes involved in cell wall modification, biosynthesis of sucrose and starch, and several classes of transcription factors were generally differently regulated in shoot and root. CONCLUSIONS: These results highlight the similarities and differences in the transcriptomic response of above- and below-ground tissues of the model host Arabidopsis following P. brassicae infection. The main transcriptomic changes in root metabolism during clubroot disease progression were identified. An overview of DEGs in the shoot underlined the physiological changes in above-ground tissues following pathogen establishment and disease progression. This study provides insights into host tissue-specific molecular responses to clubroot development and may have applications in the development of clubroot markers for more effective breeding strategies.


Subject(s)
Arabidopsis/genetics , Arabidopsis/parasitology , Gene Expression Regulation, Plant , Plant Diseases/parasitology , Plasmodiophorida , Transcriptome , Arabidopsis/anatomy & histology , Arabidopsis/metabolism , Gene Expression Profiling , Plant Diseases/genetics , Plant Growth Regulators/biosynthesis , Plant Roots/genetics , Plant Roots/metabolism , Plant Roots/parasitology , Plant Shoots/genetics , Plant Shoots/metabolism , Plant Shoots/parasitology , Transcription Factors/genetics , Transcription Factors/metabolism
19.
CMAJ ; 190(5): E126-E136, 2018 02 05.
Article in English | MEDLINE | ID: mdl-29431110

ABSTRACT

BACKGROUND: The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. METHODS: Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. RESULTS: Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set (n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. INTERPRETATION: Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Sequence Analysis, DNA/methods , Whole Genome Sequencing/methods , Canada , Female , Genes, Recessive/genetics , Genetic Predisposition to Disease/genetics , Humans , Male
20.
Brief Bioinform ; 16(5): 820-9, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25380664

ABSTRACT

The majority of scientific resources are devoted to studying a relatively small number of model species, meaning that the ability to translate knowledge across species is of considerable importance. Obtaining species-specific knowledge enables targeted investigations of the biology and pathobiology of a particular species, and facilitates comparative analyses. Phosphorylation is the most widespread posttranslational modification in eukaryotes, and although many phosphorylation sites have been experimentally identified for some species, little or no data are available for others. Using the honeybee as a test organism, this case study illustrates the process of using protein sequence homology to identify putative phosphorylation sites in a species of interest using experimentally determined sites from other species. A number of issues associated with this process are examined and discussed. Several databases of experimentally determined phosphorylation sites exist; however, it can be difficult for the nonspecialist to ascertain how their contents compare. Thus, this case study assesses the content and comparability of several phosphorylation site databases. Additional issues examined include the efficacy of homology-based phosphorylation site prediction, the impact of the level of evolutionary relatedness between species in making these predictions, the ability to translate knowledge of phosphorylation sites across large evolutionary distances and the criteria that should be used in selecting probable phosphorylation sites in the species of interest. Although focusing on phosphorylation, the issues discussed here also apply to the homology-based cross-species prediction of other posttranslational modifications, as well as to sequence motifs in general.
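
The homology-based transfer described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `transfer_sites`, the `flank` and `min_identity` parameters, and the toy sequences are all hypothetical, and the pairwise alignment is assumed to have been computed beforehand by any standard alignment tool. A site is carried over only when the aligned target residue is the same phosphorylatable residue (S, T or Y) and the surrounding alignment window is sufficiently conserved.

```python
# Illustrative sketch only: names, thresholds and sequences are assumptions,
# not taken from the cited study.
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def transfer_sites(ref_aln, tgt_aln, ref_sites, flank=2, min_identity=0.5):
    """Map known phosphosites from a reference protein onto a target protein.

    ref_aln, tgt_aln: the two rows of a gapped pairwise alignment ('-' = gap).
    ref_sites: 1-based positions of known sites in the ungapped reference.
    Returns tuples (ref_position, target_position, residue, window_identity).
    """
    if len(ref_aln) != len(tgt_aln):
        raise ValueError("aligned sequences must have equal length")

    # Index alignment columns by ungapped reference and target positions.
    col_of_ref = {}
    tgt_pos_at_col = []
    ref_pos = tgt_pos = 0
    for col, (r, t) in enumerate(zip(ref_aln, tgt_aln)):
        if r != "-":
            ref_pos += 1
            col_of_ref[ref_pos] = col
        if t != "-":
            tgt_pos += 1
            tgt_pos_at_col.append(tgt_pos)
        else:
            tgt_pos_at_col.append(None)

    predictions = []
    for site in ref_sites:
        col = col_of_ref.get(site)
        if col is None:
            continue  # site lies outside the aligned reference
        r, t = ref_aln[col], tgt_aln[col]
        # Require the same phosphorylatable residue in the target.
        if t not in PHOSPHO_RESIDUES or r != t:
            continue
        # Fraction of identical, non-gap columns in the flanking window.
        lo, hi = max(0, col - flank), min(len(ref_aln), col + flank + 1)
        window = list(zip(ref_aln[lo:hi], tgt_aln[lo:hi]))
        identity = sum(a == b and a != "-" for a, b in window) / len(window)
        if identity >= min_identity:
            predictions.append((site, tgt_pos_at_col[col], t, round(identity, 2)))
    return predictions


# Toy alignment with known reference sites at positions 4, 6 and 9.
ref_row = "MKRSPSLTY-AE"
tgt_row = "M-RSPALTYQAE"
print(transfer_sites(ref_row, tgt_row, [4, 6, 9]))
```

In this toy case the site at reference position 6 is dropped because the target carries an alanine there. Raising `min_identity` or widening `flank` makes the transfer more conservative, which corresponds to the abstract's question of which criteria should be used to select probable sites across larger evolutionary distances.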


Subject(s)
Bees/metabolism , Biological Evolution , Insect Proteins/metabolism , Animals , Humans , Insect Proteins/chemistry , Likelihood Functions , Phosphorylation , Sequence Homology, Amino Acid , Species Specificity