Results 1 - 10 of 10
1.
Nucleic Acids Res ; 44(8): 3750-62, 2016 05 05.
Article in English | MEDLINE | ID: mdl-27060133

ABSTRACT

Despite representing an important source of genetic variation, tandem repeats (TRs) remain poorly studied due to technical difficulties. We hypothesized that TRs can operate as expression (eQTLs) and methylation (mQTLs) quantitative trait loci. To test this, we analyzed the effect of variation at 4849 promoter-associated TRs, genotyped in 120 individuals, on neighboring gene expression and DNA methylation. Polymorphic promoter TRs were associated with increased variance in local gene expression and DNA methylation, suggesting functional consequences related to TR variation. We identified >100 TRs associated with expression/methylation levels of adjacent genes. These potential eQTL/mQTL TRs were enriched for overlaps with transcription factor binding and DNaseI hypersensitivity sites, providing a rationale for their effects. Moreover, we showed that most TR variants are poorly tagged by nearby single nucleotide polymorphism (SNP) markers, indicating that many functional TR variants are not effectively assayed by SNP-based approaches. Our study assigns biological significance to TR variations in the human genome, and suggests that a significant fraction of TR variations exert functional effects via alterations of local gene expression or epigenetics. We conclude that targeted studies that focus on genotyping TR variants are required to fully ascertain functional variation in the genome.


Subject(s)
DNA Methylation , Gene Expression Regulation , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Tandem Repeat Sequences , Genotyping Techniques , Humans , Linkage Disequilibrium , Quantitative Trait Loci , Sequence Analysis, DNA
2.
Am J Hum Genet ; 97(6): 922-32, 2015 Dec 03.
Article in English | MEDLINE | ID: mdl-26637982

ABSTRACT

We describe an X-linked genetic syndrome associated with mutations in TAF1 and manifesting with global developmental delay, intellectual disability (ID), characteristic facial dysmorphology, generalized hypotonia, and variable neurologic features, all in male individuals. Simultaneous studies using diverse strategies led to the identification of nine families with overlapping clinical presentations and affected by de novo or maternally inherited single-nucleotide changes. Two additional families harboring large duplications involving TAF1 were also found to share phenotypic overlap with the probands harboring single-nucleotide changes, but they also demonstrated a severe neurodegeneration phenotype. Functional analysis with RNA-seq for one of the families suggested that the phenotype is associated with downregulation of a set of genes notably enriched with genes regulated by E-box proteins. In addition, knockdown and mutant studies of this gene in zebrafish have shown a quantifiable, albeit small, effect on a neuronal phenotype. Our results suggest that mutations in TAF1 play a critical role in the development of this X-linked ID syndrome.


Subject(s)
Developmental Disabilities/genetics , Histone Acetyltransferases/genetics , Intellectual Disability/genetics , Neurodegenerative Diseases/genetics , TATA-Binding Protein Associated Factors/genetics , Transcription Factor TFIID/genetics , Adolescent , Animals , Child , Child, Preschool , Developmental Disabilities/metabolism , Developmental Disabilities/pathology , Disease Models, Animal , E-Box Elements , Facies , Family , Gene Expression Regulation , Histone Acetyltransferases/metabolism , Humans , Infant , Inheritance Patterns , Intellectual Disability/metabolism , Intellectual Disability/pathology , Male , Mutation , Neurodegenerative Diseases/metabolism , Neurodegenerative Diseases/pathology , Pedigree , Phenotype , Signal Transduction , TATA-Binding Protein Associated Factors/metabolism , Transcription Factor TFIID/metabolism , Young Adult , Zebrafish
3.
Genome Res ; 25(11): 1591-9, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26290536

ABSTRACT

Tandem repeats (TRs) are stretches of DNA that are highly variable in length and mutate rapidly. They are thus an important source of genetic variation. This variation is highly informative for population and conservation genetics. It has also been associated with several pathological conditions and with gene expression regulation. However, genome-wide surveys of TR variation in humans and closely related species have been scarce due to technical difficulties derived from short-read technology. Here we explored the genome-wide diversity of TRs in a panel of 83 human and nonhuman great ape genomes, in a total of six different species, and studied their impact on gene expression evolution. We found that population diversity patterns can be efficiently captured with short TRs (repeat unit length, 1-5 bp). We examined the potential evolutionary role of TRs in gene expression differences between humans and primates by using 30,275 larger TRs (repeat unit length, 2-50 bp). Genes that contained TRs in the promoters, in their 3' untranslated region, in introns, and in exons had higher expression divergence than genes without repeats in these regions. Genes with polymorphic small repeats (1-5 bp) in their promoters also had higher expression divergence than genes with fixed or no TRs in their promoters. Our findings highlight the potential contribution of TRs to human evolution through gene regulation.


Subject(s)
Gene Expression Regulation , Genetic Variation , Microsatellite Repeats , Primates/genetics , 3' Untranslated Regions , Animals , Chromosome Mapping , Evolution, Molecular , Exons , Female , Genetic Loci , Genome, Human , Genotyping Techniques , Humans , Introns , Male , Promoter Regions, Genetic
4.
PLoS One ; 10(4): e0122968, 2015.
Article in English | MEDLINE | ID: mdl-25849548

ABSTRACT

Y-chromosomal haplogroup G1 is a minor component of the overall gene pool of South-West and Central Asia but reaches up to 80% frequency in some populations scattered within this area. We have genotyped the G1-defining marker M285 in 27 Eurasian populations (n = 5,346), analyzed 367 M285-positive samples using 17 Y-STRs, and sequenced ~11 Mb of the Y-chromosome in 20 of these samples to an average coverage of 67X. This allowed detailed phylogenetic reconstruction. We identified five branches, all with high geographical specificity: G1-L1323 in Kazakhs, the closely related G1-GG1 in Mongols, G1-GG265 in Armenians and its distant brother clade G1-GG162 in Bashkirs, and G1-GG362 in West Indians. The haplotype diversity, which decreased from West Iran to Central Asia, allows us to hypothesize that this rare haplogroup could have been carried by the expansion of Iranic speakers northwards to the Eurasian steppe and via founder effects became a predominant genetic component of some populations, including the Argyn tribe of the Kazakhs. The remarkable agreement between genetic and genealogical trees of Argyns allowed us to calibrate the molecular clock using a historical date (1405 AD) of the most recent common genealogical ancestor. The mutation rate for Y-chromosomal sequence data obtained was 0.78×10⁻⁹ per bp per year, falling within the range of published rates. The mutation rate for Y-chromosomal STRs was 0.0022 per locus per generation, very close to the so-called genealogical rate. The "clan-based" approach to estimating the mutation rate provides a third, middle way between direct father-to-son comparisons and using archeologically known migrations, whose dates are subject to revision and of uncertain relationship to genetic events.


Subject(s)
Chromosomes, Human, Y/genetics , Gene Frequency , Haplotypes , Human Migration , Humans , Iran , Language , Microsatellite Repeats , Phylogeny , Polymorphism, Single Nucleotide
5.
Nat Commun ; 6: 6275, 2015 Feb 25.
Article in English | MEDLINE | ID: mdl-25711446

ABSTRACT

The standardization and performance testing of analysis tools is a prerequisite to widespread adoption of genome-wide sequencing, particularly in the clinic. However, performance testing is currently complicated by the paucity of standards and comparison metrics, as well as by the heterogeneity in sequencing platforms, applications and protocols. Here we present the genome comparison and analytic testing (GCAT) platform to facilitate development of performance metrics and comparisons of analysis tools across these metrics. Performance is reported through interactive visualizations of benchmark and performance testing data, with support for data slicing and filtering. The platform is freely accessible at http://www.bioplanet.com/gcat.


Subject(s)
Genomics , Software , Benchmarking , Genetic Variation , Humans
6.
Genome Res ; 24(11): 1894-904, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25135957

ABSTRACT

Short tandem repeats are among the most polymorphic loci in the human genome. These loci play a role in the etiology of a range of genetic diseases and have been frequently utilized in forensics, population genetics, and genetic genealogy. Despite this plethora of applications, little is known about the variation of most STRs in the human population. Here, we report the largest-scale analysis of human STR variation to date. We collected information for nearly 700,000 STR loci across more than 1000 individuals in Phase 1 of the 1000 Genomes Project. Extensive quality controls show that reliable allelic spectra can be obtained for close to 90% of the STR loci in the genome. We utilize this call set to analyze determinants of STR variation, assess the human reference genome's representation of STR alleles, find STR loci with common loss-of-function alleles, and obtain initial estimates of the linkage disequilibrium between STRs and common SNPs. Overall, these analyses further elucidate the scale of genetic variation beyond classical point mutations.


Subject(s)
Genetics, Population/methods , Genome, Human/genetics , Microsatellite Repeats/genetics , Polymorphism, Single Nucleotide , Alleles , Gene Frequency , Genetic Variation , Genomics/methods , Genotype , Humans , Linkage Disequilibrium
7.
Genome Res ; 24(7): 1193-208, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24714809

ABSTRACT

The Drosophila melanogaster Genetic Reference Panel (DGRP) is a community resource of 205 sequenced inbred lines, derived to improve our understanding of the effects of naturally occurring genetic variation on molecular and organismal phenotypes. We used an integrated genotyping strategy to identify 4,853,802 single nucleotide polymorphisms (SNPs) and 1,296,080 non-SNP variants. Our molecular population genomic analyses show higher deletion than insertion mutation rates and stronger purifying selection on deletions. Weaker selection on insertions than deletions is consistent with our observed distribution of genome size determined by flow cytometry, which is skewed toward larger genomes. Insertion/deletion and single nucleotide polymorphisms are positively correlated with each other and with local recombination, suggesting that their nonrandom distributions are due to hitchhiking and background selection. Our cytogenetic analysis identified 16 polymorphic inversions in the DGRP. Common inverted and standard karyotypes are genetically divergent and account for most of the variation in relatedness among the DGRP lines. Intriguingly, variation in genome size and many quantitative traits are significantly associated with inversions. Approximately 50% of the DGRP lines are infected with Wolbachia, and four lines have germline insertions of Wolbachia sequences, but effects of Wolbachia infection on quantitative traits are rarely significant. The DGRP complements ongoing efforts to functionally annotate the Drosophila genome. Indeed, 15% of all D. melanogaster genes segregate for potentially damaged proteins in the DGRP, and genome-wide analyses of quantitative traits identify novel candidate genes. The DGRP lines, sequence data, genotypes, quality scores, phenotypes, and analysis and visualization tools are publicly available.


Subject(s)
Drosophila melanogaster/genetics , Genetic Variation , Genome, Insect , Phenotype , Animals , Chromatin/genetics , Chromatin/metabolism , Drosophila melanogaster/microbiology , Female , Genetic Linkage , Genome Size , Genome-Wide Association Study , Genotype , Genotyping Techniques , High-Throughput Nucleotide Sequencing , INDEL Mutation , Linkage Disequilibrium , Male , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , Reproducibility of Results
8.
Hum Mutat ; 34(9): 1304-11, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23696428

ABSTRACT

Although simple tandem repeats (STRs) comprise ~2% of the human genome and represent an important source of polymorphism, this class of variation remains understudied. We have developed a cost-effective strategy for performing targeted enrichment of STR regions that utilizes capture probes targeting the flanking sequences of STR loci, enabling specific capture of DNA fragments containing STRs for subsequent high-throughput sequencing. Utilizing a capture design targeting 6,243 STR loci <94 bp and multiplexing eight individuals in a single Illumina HiSeq2000 sequencing lane we were able to call genotypes in at least one individual for 67.5% of the targeted STRs. We observed a strong relationship between (G+C) content and genotyping rate. STRs with moderate (G+C) content were recovered with >90% success rate, whereas only 12% of STRs with ≥ 80% (G+C) were genotyped in our assay. Analysis of a parent-offspring trio, complete hydatidiform mole samples, repeat analyses of the same individual, and Sanger sequencing-based validation indicated genotyping error rates between 7.6% and 12.4%. The majority of such errors were a single repeat unit at mono- or dinucleotide repeats. Altogether, our STR capture assay represents a cost-effective method that enables multiplexed genotyping of thousands of STR loci suitable for large-scale population studies.


Subject(s)
Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Tandem Repeat Sequences , Base Composition , Genetic Variation , Genome, Human , Genotype , HapMap Project , Humans , Reproducibility of Results
9.
Nucleic Acids Res ; 41(1): e32, 2013 Jan 07.
Article in English | MEDLINE | ID: mdl-23090981

ABSTRACT

Repetitive sequences are biologically and clinically important because they can influence traits and disease, but repeats are challenging to analyse using short-read sequencing technology. We present a tool for genotyping microsatellite repeats called RepeatSeq, which uses Bayesian model selection guided by an empirically derived error model that incorporates sequence and read properties. Next, we apply RepeatSeq to high-coverage genomes from the 1000 Genomes Project to evaluate performance and accuracy. The software uses common formats, such as VCF, for compatibility with existing genome analysis pipelines. Source code and binaries are available at http://github.com/adaptivegenome/repeatseq.


Subject(s)
Genotyping Techniques , Microsatellite Repeats , Software , Bayes Theorem , Genome, Human , Genomics/methods , Genotype , High-Throughput Nucleotide Sequencing , Humans
10.
Genome Biol ; 13(12): 324, 2012 Dec 19.
Article in English | MEDLINE | ID: mdl-23253090

ABSTRACT

A report of the fifth annual Personal Genomes and Medical Genomics meeting, held at Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA, November 14-17, 2012.


Subject(s)
Genome, Human , Genomics , Precision Medicine , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/therapy , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA