ABSTRACT
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Physical Chromosome Mapping , Amino Acid Sequence , Genetic Predisposition to Disease , Genetics, Medical , Genetics, Population , Genome-Wide Association Study , Genomics , Genotype , Haplotypes/genetics , Homozygote , Humans , Molecular Sequence Data , Mutation Rate , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Sequence Analysis, DNA , Sequence Deletion/genetics
ABSTRACT
Brain aging is a complex process involving multiple pathways and components, from the cellular to the molecular level. This study aimed to investigate gene expression changes in zebrafish brains from young adulthood to adulthood, and from adulthood to old age. RNA sequencing was performed on neuronal cells isolated from zebrafish brains. The cells were enriched in progenitor cell markers, which are known to diminish throughout the aging process. We found 176 statistically significant, differentially expressed genes among the groups, and identified, based on gene ontology descriptions, a group of genes classified as cell adhesion molecules. The relevance of these genes was further tested in another set of zebrafish brains, in healthy human and Alzheimer's disease brain samples, and in Allen Brain Atlas data. We observed that the expression changes of two genes, GJC2 and ALCAM, during the aging process were consistent across all experimental sets. Our findings provide a new set of markers for healthy brain aging and suggest new targets for therapeutic approaches to neurodegenerative diseases.
Subject(s)
Aging/genetics , Aging/metabolism , Brain/metabolism , Cell Adhesion Molecules/genetics , Cell Adhesion Molecules/metabolism , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods , Activated-Leukocyte Cell Adhesion Molecule/genetics , Activated-Leukocyte Cell Adhesion Molecule/metabolism , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Animals , Antigens, CD/genetics , Antigens, CD/metabolism , Cell Adhesion Molecules, Neuronal/genetics , Cell Adhesion Molecules, Neuronal/metabolism , Connexins/genetics , Fetal Proteins/genetics , Fetal Proteins/metabolism , Gene Expression/genetics , Gene Expression Regulation, Developmental/genetics , Humans , Zebrafish
ABSTRACT
With developments in high-throughput sequencing (HTS) technologies, researchers have gained a powerful tool for identifying structural variants (SVs) in genomes at substantially lower cost than before. SVs can be broadly classified into two main categories: balanced rearrangements and copy number variations (CNVs). Many algorithms have been developed to characterize CNVs using HTS data, each focusing on different types and size ranges of variants and using different read signatures. Read depth (RD) based tools are more commonly used to characterize large (>10 kb) CNVs, since the RD strategy does not rely on fragment size and read length, which are limiting factors in read pair and split read analysis. Here we provide a guideline for a user-friendly tool that detects large segmental duplications and deletions and can also predict integer copy numbers for duplicated genes.
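The read depth idea summarized in this abstract can be illustrated with a minimal sketch. It is not the tool's actual algorithm: the function names, the fixed window size, and the use of a simple mean over control windows as the diploid baseline are all assumptions made for illustration, and the sketch omits steps such as GC correction that a real RD pipeline would typically need.

```python
from statistics import mean


def estimate_copy_numbers(window_depths, control_depths, ploidy=2):
    """Convert per-window read depth into integer copy numbers.

    Hypothetical helper: the baseline is the mean depth over control
    windows that are assumed to be present at the normal ploidy.
    """
    baseline = mean(control_depths)
    if baseline <= 0:
        raise ValueError("control regions must have positive mean depth")
    return [round(ploidy * depth / baseline) for depth in window_depths]


def call_segments(copy_numbers, window_size, ploidy=2, min_length=10_000):
    """Merge runs of adjacent windows that share a non-normal copy number.

    Returns (start, end, kind, copy_number) tuples. Runs shorter than
    min_length are dropped, mirroring the focus on large (>10 kb) events.
    """
    segments = []
    start = 0
    for i in range(1, len(copy_numbers) + 1):
        # Close the current run at the end of the data or when the value changes.
        if i == len(copy_numbers) or copy_numbers[i] != copy_numbers[start]:
            cn = copy_numbers[start]
            length = (i - start) * window_size
            if cn != ploidy and length >= min_length:
                kind = "deletion" if cn < ploidy else "duplication"
                segments.append((start * window_size, i * window_size, kind, cn))
            start = i
    return segments
```

Given depths for consecutive 1 kb windows, `estimate_copy_numbers` followed by `call_segments(..., window_size=1000)` would report runs of at least 10 kb whose depth is consistently below or above the diploid baseline, along with the integer copy number of each run.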