Results 1 - 9 of 9
1.
Genome Med ; 15(1): 81, 2023 10 07.
Article in English | MEDLINE | ID: mdl-37805537

ABSTRACT

BACKGROUND: Autism spectrum disorder (ASD) is a neurodevelopmental condition characterized by impaired social and communication skills, restricted interests, and repetitive behaviors. The prevalence of ASD among children in Qatar was recently estimated to be 1.1%, though the genetic architecture underlying ASD both in Qatar and the greater Middle East has been largely unexplored. Here, we describe the first genomic data release from the BARAKA-Qatar Study, a nationwide program building a broadly consented biorepository of individuals with ASD and their families available for sample and data sharing and multi-omics research. METHODS: In this first release, we present a comprehensive analysis of whole-genome sequencing (WGS) data of the first 100 families (372 individuals), investigating the genetic architecture, including single-nucleotide variants (SNVs), copy number variants (CNVs), tandem repeat expansions (TREs), as well as mitochondrial DNA variants (mtDNA) segregating with ASD in local families. RESULTS: Overall, we identify potentially pathogenic variants in known genes or regions in 27 out of 100 families (27%), of which 11 variants (40.7%) were classified as pathogenic or likely-pathogenic based on American College of Medical Genetics (ACMG) guidelines. Dominant variants, including de novo and inherited, contributed to 15 (55.6%) of these families, consisting of SNVs/indels (66.7%), CNVs (13.3%), TREs (13.3%), and mtDNA variants (6.7%). Moreover, homozygous variants were found in 7 families (25.9%), with a sixfold increase in homozygous burden in consanguineous versus non-consanguineous families (13.6% and 1.8%, respectively). Furthermore, 28 novel ASD candidate genes were identified in 20 families, 23 of which had recurrent hits in MSSNG and SSC cohorts. CONCLUSIONS: This study illustrates the value of ASD studies in under-represented populations and the importance of WGS as a comprehensive tool for establishing a molecular diagnosis for families with ASD. Moreover, it uncovers a significant role for recessive variation in ASD architecture in consanguineous settings and provides a unique resource of Middle Eastern genomes for future research to the global ASD community.


Subject(s)
Autism Spectrum Disorder , Child , Humans , Autism Spectrum Disorder/epidemiology , Autism Spectrum Disorder/genetics , Qatar/epidemiology , Genome , DNA Copy Number Variations , Genomics , DNA, Mitochondrial , Genetic Predisposition to Disease
2.
HGG Adv ; 4(1): 100156, 2023 01 12.
Article in English | MEDLINE | ID: mdl-36386424

ABSTRACT

Phasing of heterozygous alleles is critical for interpretation of cis-effects of disease-relevant variation. We sequenced 477 individuals with cystic fibrosis (CF) using linked-read sequencing, yielding an average phase block N50 of 4.39 Mb. We use these samples to construct a graph representation of CFTR haplotypes, demonstrating its utility for understanding complex CF alleles. These are visualized in a Web app, CFTbaRcodes, that enables interactive exploration of CFTR haplotypes present in this cohort. We perform fine-mapping and phasing of the chr7q35 trypsinogen locus associated with CF meconium ileus, an intestinal obstruction at birth associated with more severe CF outcomes and pancreatic disease. A 20-kb deletion polymorphism and a PRSS2 missense variant p.Thr8Ile (rs62473563) are shown to independently contribute to meconium ileus risk (p = 0.0028, p = 0.011, respectively) and are PRSS2 pancreas eQTLs (p = 9.5 × 10⁻⁷ and p = 1.4 × 10⁻⁴, respectively), suggesting the mechanism by which these polymorphisms contribute to CF. The phase information from linked reads provides a putative causal explanation for variation at a CF-relevant locus, which also has implications for the genetic basis of non-CF pancreatitis, to which this locus has been reported to contribute.
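
The abstract reports a phase block N50 without defining it. As a rough illustration only, the sketch below uses the common definition of N50 (the block length at which blocks of that size or larger cover at least half of the total phased length). The function name `n50` and the example lengths are assumptions for illustration and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of block lengths.

    Assumes the usual definition: sort blocks from longest to shortest
    and return the length of the block at which the running total first
    reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical phase block lengths (arbitrary units), not study data.
example_blocks = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_blocks)
```

A per-sample N50 computed this way could then be averaged across samples, which is one plausible reading of "average phase block N50" in the abstract.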


Subject(s)
Cystic Fibrosis , Intestinal Obstruction , Meconium Ileus , Infant, Newborn , Humans , Cystic Fibrosis/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Meconium Ileus/complications , Meconium , Intestinal Obstruction/complications , Trypsin , Trypsinogen/genetics
3.
Cell ; 185(23): 4409-4427.e18, 2022 11 10.
Article in English | MEDLINE | ID: mdl-36368308

ABSTRACT

Fully understanding autism spectrum disorder (ASD) genetics requires whole-genome sequencing (WGS). We present the latest release of the Autism Speaks MSSNG resource, which includes WGS data from 5,100 individuals with ASD and 6,212 non-ASD parents and siblings (total n = 11,312). Examining a wide variety of genetic variants in MSSNG and the Simons Simplex Collection (SSC; n = 9,205), we identified ASD-associated rare variants in 718/5,100 individuals with ASD from MSSNG (14.1%) and 350/2,419 from SSC (14.5%). Considering genomic architecture, 52% were nuclear sequence-level variants, 46% were nuclear structural variants (including copy-number variants, inversions, large insertions, uniparental isodisomies, and tandem repeat expansions), and 2% were mitochondrial variants. Our study provides a guidebook for exploring genotype-phenotype correlations in families who carry ASD-associated rare variants and serves as an entry point to the expanded studies required to dissect the etiology in the ∼85% of the ASD population that remain idiopathic.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Humans , Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , DNA Copy Number Variations/genetics , Genomics
4.
Front Genet ; 11: 612515, 2020.
Article in English | MEDLINE | ID: mdl-33335541

ABSTRACT

Population sequencing often requires collaboration across a distributed network of sequencing centers for the timely processing of thousands of samples. In such massive efforts, it is important that participating scientists can be confident that the accuracy of the sequence data produced is not affected by which center generates the data. A study was conducted across three established sequencing centers, located in Montreal, Toronto, and Vancouver, constituting Canada's Genomics Enterprise (www.cgen.ca). Whole genome sequencing was performed at each center, on three genomic DNA replicates from three well-characterized cell lines. Secondary analysis pipelines employed by each site were applied to sequence data from each of the sites, resulting in three datasets for each of four variables (cell line, replicate, sequencing center, and analysis pipeline), for a total of 81 datasets. These datasets were each assessed according to multiple quality metrics including concordance with benchmark variant truth sets to assess consistent quality across all three conditions for each variable. Three-way concordance analysis of variants across conditions for each variable was performed. Our results showed that the variant concordance between datasets differing only by sequencing center was similar to the concordance for datasets differing only by replicate, using the same analysis pipeline. We also showed that the statistically significant differences between datasets result from the analysis pipeline used, which can be unified and updated as new approaches become available. We conclude that genome sequencing projects can rely on the quality and reproducibility of aggregate data generated across a network of distributed sites.
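
The design described above (four variables with three conditions each) can be enumerated to see where the total of 81 datasets comes from. This is a minimal sketch, not the study's code: the city names come from the abstract, while the cell-line labels, pipeline labels, and the helper `differ_only_by` are hypothetical names introduced for illustration.

```python
from itertools import combinations, product

# Three conditions per variable; labels other than the cities are placeholders.
cell_lines = ["cell_line_A", "cell_line_B", "cell_line_C"]
replicates = [1, 2, 3]
centers = ["Montreal", "Toronto", "Vancouver"]
pipelines = ["pipeline_Montreal", "pipeline_Toronto", "pipeline_Vancouver"]

# Full factorial design: 3 * 3 * 3 * 3 combinations.
datasets = list(product(cell_lines, replicates, centers, pipelines))

CENTER = 2  # position of the sequencing-center field in each tuple


def differ_only_by(a, b, index):
    """True if datasets a and b differ in the field at `index` and nowhere else."""
    return all((x == y) != (i == index) for i, (x, y) in enumerate(zip(a, b)))


# Pairs of datasets that share cell line, replicate, and pipeline
# but were sequenced at different centers.
center_pairs = [
    (a, b) for a, b in combinations(datasets, 2) if differ_only_by(a, b, CENTER)
]

print(len(datasets))      # 81
print(len(center_pairs))  # 81
```

Selecting pairs that differ in exactly one variable mirrors the comparison in the abstract, where concordance between datasets differing only by sequencing center is set against concordance between datasets differing only by replicate.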

5.
Science ; 360(6386): 327-331, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29674594

ABSTRACT

The genetic basis of autism spectrum disorder (ASD) is known to consist of contributions from de novo mutations in variant-intolerant genes. We hypothesize that rare inherited structural variants in cis-regulatory elements (CRE-SVs) of these genes also contribute to ASD. We investigated this by assessing the evidence for natural selection and transmission distortion of CRE-SVs in whole genomes of 9274 subjects from 2600 families affected by ASD. In a discovery cohort of 829 families, structural variants were depleted within promoters and untranslated regions, and paternally inherited CRE-SVs were preferentially transmitted to affected offspring and not to their unaffected siblings. The association of paternal CRE-SVs was replicated in an independent sample of 1771 families. Our results suggest that rare inherited noncoding variants predispose children to ASD, with differing contributions from each parent.


Subject(s)
Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , Genetic Variation , Paternal Inheritance , Promoter Regions, Genetic/genetics , Exons , Gene Expression Regulation , Genome, Human , Humans , Mutation , Pedigree , RNA, Untranslated/genetics , Selection, Genetic , Sequence Deletion , Transcription Factors/genetics
6.
CMAJ ; 190(5): E126-E136, 2018 02 05.
Article in English | MEDLINE | ID: mdl-29431110

ABSTRACT

BACKGROUND: The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. METHODS: Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. RESULTS: Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set (n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. INTERPRETATION: Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Sequence Analysis, DNA/methods , Whole Genome Sequencing/methods , Canada , Female , Genes, Recessive/genetics , Genetic Predisposition to Disease/genetics , Humans , Male
7.
Am J Hum Genet ; 102(1): 142-155, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29304372

ABSTRACT

A remaining hurdle to whole-genome sequencing (WGS) becoming a first-tier genetic test has been accurate detection of copy-number variations (CNVs). Here, we used several datasets to empirically develop a detailed workflow for identifying germline CNVs >1 kb from short-read WGS data using read depth-based algorithms. Our workflow is comprehensive in that it addresses all stages of the CNV-detection process, including DNA library preparation, sequencing, quality control, reference mapping, and computational CNV identification. We used our workflow to detect rare, genic CNVs in individuals with autism spectrum disorder (ASD), and 120/120 such CNVs tested using orthogonal methods were successfully confirmed. We also identified 71 putative genic de novo CNVs in this cohort, which had a confirmation rate of 70%; the remainder were incorrectly identified as de novo due to false positives in the proband (7%) or parental false negatives (23%). In individuals with an ASD diagnosis in which both microarray and WGS experiments were performed, our workflow detected all clinically relevant CNVs identified by microarrays, as well as additional potentially pathogenic CNVs < 20 kb. Thus, CNVs of clinical relevance can be discovered from WGS with a detection rate exceeding microarrays, positioning WGS as a single assay for genetic variation detection.


Subject(s)
DNA Copy Number Variations/genetics , Whole Genome Sequencing , Workflow , Algorithms , Child , Female , Haplotypes/genetics , Humans , Male , Reproducibility of Results , Sequence Analysis, DNA
8.
Nat Neurosci ; 20(4): 602-611, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28263302

ABSTRACT

We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible on a cloud platform and through a controlled-access internet portal. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertions and deletions or copy number variations per ASD subject. We identified 18 new candidate ASD-risk genes and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (P = 6 × 10⁻⁴). In 294 of 2,620 ASD cases (11.2%), a molecular basis could be determined and 7.2% of these carried copy number variations and/or chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD.


Subject(s)
Autism Spectrum Disorder/genetics , Databases, Genetic , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Chromosome Aberrations , DNA Copy Number Variations , Humans , Mutagenesis, Insertional/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Sequence Deletion/genetics
9.
BMC Bioinformatics ; 6: 23, 2005 Feb 08.
Article in English | MEDLINE | ID: mdl-15701178

ABSTRACT

BACKGROUND: Sequence similarity searching is a powerful tool to help develop hypotheses in the quest to assign functional, structural and evolutionary information to DNA and protein sequences. As sequence databases continue to grow exponentially, it becomes increasingly important to repeat searches at frequent intervals, and similarity searches retrieve larger and larger sets of results. New and potentially significant results may be buried in a long list of previously obtained sequence hits from past searches. RESULTS: ReHAB (Recent Hits Acquired from BLAST) is a tool for finding new protein hits in repeated PSI-BLAST searches. ReHAB compares results from PSI-BLAST searches performed with two versions of a protein sequence database and highlights hits that are present only in the updated database. Results are presented in an easily comprehended table, or in a BLAST-like report, using colors to highlight the new hits. ReHAB is designed to handle large numbers of query sequences, such as whole genomes or sets of genomes. Advanced computer skills are not needed to use ReHAB; the graphics interface is simple to use and was designed with the bench biologist in mind. CONCLUSIONS: This software greatly simplifies the problem of evaluating the output of large numbers of protein database searches.
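
The core idea described for ReHAB, reporting hits that appear only in the search against the updated database, amounts to a per-query set difference. The sketch below illustrates that idea only. It is not ReHAB's implementation, it does not parse PSI-BLAST output, and the function name `new_hits`, the dictionary layout, and all identifiers in the example are hypothetical.

```python
def new_hits(old_results, new_results):
    """Map each query to the hits found only in the newer search.

    Both arguments are assumed to be dicts of the form
    {query_id: iterable of hit identifiers}. Queries with no new hits
    are omitted from the result.
    """
    fresh = {}
    for query, hits in new_results.items():
        added = set(hits) - set(old_results.get(query, ()))
        if added:
            fresh[query] = sorted(added)
    return fresh


# Made-up identifiers standing in for two search runs.
old_run = {"query1": ["hitA", "hitB"], "query2": ["hitC"]}
new_run = {"query1": ["hitA", "hitB", "hitD"], "query2": ["hitC"], "query3": ["hitE"]}

print(new_hits(old_run, new_run))
# {'query1': ['hitD'], 'query3': ['hitE']}
```

Keying the comparison by query keeps the approach workable for large query sets such as whole genomes, which the abstract names as a design goal.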


Subject(s)
Computational Biology/methods , Software , Algorithms , Amino Acid Sequence , Computers , DNA/chemistry , Data Interpretation, Statistical , Databases, Factual , Databases, Genetic , Databases, Protein , Genome , Information Storage and Retrieval , Internet , Molecular Sequence Data , Sequence Alignment , Sequence Analysis, Protein , Sequence Homology, Amino Acid , User-Computer Interface