Results 1 - 20 of 38
1.
Genome Biol ; 21(1): 202, 2020 08 10.
Article in English | MEDLINE | ID: mdl-32778141

ABSTRACT

BACKGROUND: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human-ape gene families, nuclear pore interacting protein (NPIP). RESULTS: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis. CONCLUSIONS: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.


Subject(s)
Molecular Evolution, Gene Duplication, Primates/genetics, Genomic Segmental Duplications, Animals, Biodiversity, Brain, Chromosome Mapping, Chromosomes, Exons, Gene Fusion, Human Genome, Genomic Instability, Hominidae, Humans, Phylogeny
2.
Methods Mol Biol ; 2161: 209-228, 2020.
Article in English | MEDLINE | ID: mdl-32681515

ABSTRACT

R-loops are three-stranded structures that form during transcription when the nascent RNA hybridizes with the template DNA, resulting in a DNA:RNA hybrid and a looped-out single-stranded DNA (ssDNA) strand. These structures are important for normal cellular processes, and aberrant R-loop formation has been implicated in a number of pathological outcomes, including certain cancers and neurodegenerative diseases. Mapping R-loops has primarily been performed using DRIP (DNA:RNA immunoprecipitation) based methods that are dependent on the anti-DNA:RNA hybrid S9.6 antibody and short-read sequencing. While DRIP-based methods are robust and report R-loop formation genome-wide, they only do so at the population average level; interrogating R-loop formation at the single-molecule level is not feasible with such approaches. Here we present single-molecule R-loop footprinting (SMRF-seq), a method that relies on the chemical reactivity of the displaced ssDNA strand to non-denaturing sodium bisulfite and single-molecule long-read sequencing as a readout, to characterize R-loops. SMRF-seq can be used independently of S9.6 to generate high-resolution, strand-specific maps of individual R-loops at ultra-deep coverage on kilobase-length DNA fragments.


Subject(s)
R-Loop Structures, RNA Sequence Analysis/methods, HeLa Cells, Humans
3.
J Mol Biol ; 432(7): 2271-2288, 2020 03 27.
Article in English | MEDLINE | ID: mdl-32105733

ABSTRACT

R-loops are a prevalent class of non-B DNA structures that have been associated with both positive and negative cellular outcomes. DNA:RNA immunoprecipitation (DRIP) approaches based on the anti-DNA:RNA hybrid S9.6 antibody revealed that R-loops form dynamically over conserved genic hotspots. We have developed an orthogonal approach that queries R-loops via the presence of long stretches of single-stranded DNA on their looped-out strand. Nondenaturing sodium bisulfite treatment catalyzes the conversion of unpaired cytosines to uracils, creating permanent genetic tags for the position of an R-loop. Long-read, single-molecule PacBio sequencing allows the identification of R-loop 'footprints' at near nucleotide resolution in a strand-specific manner on long single DNA molecules and at ultra-deep coverage. Single-molecule R-loop footprinting coupled with PacBio sequencing (SMRF-seq) revealed a strong agreement between S9.6-based and bisulfite-based R-loop mapping and confirmed that R-loops form over genic hotspots, including gene bodies and terminal gene regions. Based on the largest single-molecule R-loop dataset to date, we show that individual R-loops form nonrandomly, defining discrete sets of overlapping molecular clusters that pileup through larger R-loop zones. R-loops most often map to intronic regions and their individual start and stop positions do not match with intron-exon boundaries, reinforcing the model that they form cotranscriptionally from unspliced transcripts. SMRF-seq further established that R-loop distribution patterns are not simply driven by intrinsic DNA sequence features but most likely also reflect DNA topological constraints. Overall, DRIP-based and SMRF-based approaches independently provide a complementary and congruent view of R-loop distribution, consolidating our understanding of the principles underlying R-loop formation.
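
The footprinting idea in this abstract (unpaired cytosines on the displaced strand read out as C-to-T conversions) can be illustrated with a minimal sketch. This is not the published SMRF-seq pipeline: the function name `call_footprints`, the `min_converted` threshold, and the assumption that the read is already aligned to the reference without indels are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def call_footprints(reference: str, read: str,
                    min_converted: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, of runs of consecutive
    converted cytosines (C in the reference, T in the read).

    Assumption: `read` is gap-free and aligned to `reference` base by base.
    `min_converted` is a hypothetical threshold, not a published value.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have equal length")

    footprints: List[Tuple[int, int]] = []
    run_start = 0
    run_end = 0
    run_count = 0

    for pos, (ref_base, read_base) in enumerate(zip(reference.upper(),
                                                    read.upper())):
        if ref_base != "C":
            continue  # only reference cytosines are informative
        if read_base == "T":
            # Converted cytosine: start or extend the current run.
            if run_count == 0:
                run_start = pos
            run_end = pos + 1
            run_count += 1
        else:
            # Unconverted cytosine closes the current run.
            if run_count >= min_converted:
                footprints.append((run_start, run_end))
            run_count = 0

    # Flush a run that extends to the end of the molecule.
    if run_count >= min_converted:
        footprints.append((run_start, run_end))
    return footprints
```

A run is broken only by an unconverted cytosine, so non-cytosine bases inside a footprint do not interrupt it; a real analysis would also need to handle strand, alignment gaps, and sequencing errors.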


Subject(s)
DNA/chemistry, Embryonal Carcinoma Stem Cells/metabolism, R-Loop Structures, RNA/chemistry, Single-Cell Analysis/methods, Genetic Transcription, Embryonal Carcinoma Stem Cells/cytology, Humans
4.
Proc Natl Acad Sci U S A ; 116(13): 6260-6269, 2019 03 26.
Article in English | MEDLINE | ID: mdl-30850542

ABSTRACT

R-loops are abundant three-stranded nucleic-acid structures that form in cis during transcription. Experimental evidence suggests that R-loop formation is affected by DNA sequence and topology. However, the exact manner by which these factors interact to determine R-loop susceptibility is unclear. To investigate this, we developed a statistical mechanical equilibrium model of R-loop formation in superhelical DNA. In this model, the energy involved in forming an R-loop includes four terms: junctional and base-pairing energies, and energies associated with superhelicity and with the torsional winding of the displaced DNA single strand around the RNA:DNA hybrid. This model shows that the significant energy barrier imposed by the formation of junctions can be overcome in two ways. First, base-pairing energy can favor RNA:DNA over DNA:DNA duplexes in favorable sequences. Second, R-loops, by absorbing negative superhelicity, partially or fully relax the rest of the DNA domain, thereby returning it to a lower energy state. In vitro transcription assays confirmed that R-loops cause plasmid relaxation and that negative superhelicity is required for R-loops to form, even in a favorable region. Single-molecule R-loop footprinting following in vitro transcription showed a strong agreement between theoretical predictions and experimental mapping of stable R-loop positions and further revealed the impact of DNA topology on the R-loop distribution landscape. Our results clarify the interplay between base sequence and DNA superhelicity in controlling R-loop stability. They also reveal R-loops as powerful and reversible topology sinks that cells may use to nonenzymatically relieve superhelical stress during transcription.


Subject(s)
Base Sequence, Superhelical DNA/chemistry, DNA/chemistry, Nucleic Acid Conformation, Single-Stranded DNA/chemistry, Genetic Models, Nucleic Acid Hybridization, Plasmids/chemistry, RNA/chemistry, Genetic Transcription
5.
Nat Ecol Evol ; 1(3): 69, 2017.
Article in English | MEDLINE | ID: mdl-28580430

ABSTRACT

Segmental duplications contribute to human evolution, adaptation and genomic instability but are often poorly characterized. We investigate the evolution, genetic variation and coding potential of human-specific segmental duplications (HSDs). We identify 218 HSDs based on analysis of 322 deeply sequenced archaic and contemporary hominid genomes. We sequence 550 human and nonhuman primate genomic clones to reconstruct the evolution of the largest, most complex regions with protein-coding potential (n=80 genes/33 gene families). We show that HSDs are non-randomly organized, associate preferentially with ancestral ape duplications termed "core duplicons", and evolved primarily in an interspersed inverted orientation. In addition to Homo sapiens-specific gene expansions (e.g., TCAF1/2), we highlight ten gene families (e.g., ARHGAP11B and SRGAP2C) where copy number never returns to the ancestral state, there is evidence of mRNA splicing, and no common gene-disruptive mutations are observed in the general population. Such duplicates are candidates for the evolution of human-specific adaptive traits.

6.
Sci Rep ; 7: 41980, 2017 02 03.
Article in English | MEDLINE | ID: mdl-28155877

ABSTRACT

Most evolutionary new centromeres (ENC) are composed of large arrays of satellite DNA and surrounded by segmental duplications. However, the hypothesis is that ENCs are seeded in an anonymous sequence and only over time have acquired the complexity of "normal" centromeres. Up to now evidence to test this hypothesis was lacking. We recently discovered that the well-known polymorphism of orangutan chromosome 12 was due to the presence of an ENC. We sequenced the genome of an orangutan homozygous for the ENC, and we focused our analysis on the comparison of the ENC domain with respect to its wild type counterpart. No significant variations were found. This finding is the first clear evidence that ENC seedings are epigenetic in nature. The compaction of the ENC domain was found significantly higher than the corresponding WT region and, interestingly, the expression of the only gene embedded in the region was significantly repressed.


Subject(s)
Centromere/genetics, Genetic Epigenesis, Molecular Evolution, Animals, Cell Line, Conserved Sequence, Satellite DNA/genetics, Humans, Pongo abelii
7.
Nature ; 536(7615): 205-9, 2016 08 11.
Article in English | MEDLINE | ID: mdl-27487209

ABSTRACT

Genetic differences that specify unique aspects of human evolution have typically been identified by comparative analyses between the genomes of humans and closely related primates, including more recently the genomes of archaic hominins. Not all regions of the genome, however, are equally amenable to such study. Recurrent copy number variation (CNV) at chromosome 16p11.2 accounts for approximately 1% of cases of autism and is mediated by a complex set of segmental duplications, many of which arose recently during human evolution. Here we reconstruct the evolutionary history of the locus and identify bolA family member 2 (BOLA2) as a gene duplicated exclusively in Homo sapiens. We estimate that a 95-kilobase-pair segment containing BOLA2 duplicated across the critical region approximately 282 thousand years ago (ka), one of the latest among a series of genomic changes that dramatically restructured the locus during hominid evolution. All humans examined carried one or more copies of the duplication, which nearly fixed early in the human lineage, a pattern unlikely to have arisen so rapidly in the absence of selection (P < 0.0097). We show that the duplication of BOLA2 led to a novel, human-specific in-frame fusion transcript and that BOLA2 copy number correlates with both RNA expression (r = 0.36) and protein level (r = 0.65), with the greatest expression difference between human and chimpanzee in experimentally derived stem cells. Analyses of 152 patients carrying a chromosome 16p11.2 rearrangement show that more than 96% of breakpoints occur within the H. sapiens-specific duplication. In summary, the duplicative transposition of BOLA2 at the root of the H. sapiens lineage about 282 ka simultaneously increased copy number of a gene associated with iron homeostasis and predisposed our species to recurrent rearrangements associated with disease.


Subject(s)
Human Chromosome Pair 16/genetics, DNA Copy Number Variations/genetics, Molecular Evolution, Genetic Predisposition to Disease, Proteins/genetics, Animals, Autistic Disorder/genetics, Chromosome Breakage, Gene Duplication, Homeostasis/genetics, Humans, Iron/metabolism, Pan troglodytes/genetics, Pongo/genetics, Proteins/analysis, Genetic Recombination, Species Specificity, Time Factors
8.
G3 (Bethesda) ; 6(7): 2213-23, 2016 07 07.
Article in English | MEDLINE | ID: mdl-27207956

ABSTRACT

Skeletal atavism in Shetland ponies is a heritable disorder characterized by abnormal growth of the ulna and fibula that extend the carpal and tarsal joints, respectively. This causes abnormal skeletal structure and impaired movements, and affected foals are usually killed. In order to identify the causal mutation we subjected six confirmed Swedish cases and a DNA pool consisting of 21 control individuals to whole genome resequencing. We screened for polymorphisms where the cases and the control pool were fixed for opposite alleles and observed this signature for only 25 SNPs, most of which were scattered on genome assembly unassigned scaffolds. Read depth analysis at these loci revealed homozygosity or compound heterozygosity for two partially overlapping large deletions in the pseudoautosomal region (PAR) of chromosome X/Y in cases but not in the control pool. One of these deletions removes the entire coding region of the SHOX gene and both deletions remove parts of the CRLF2 gene located downstream of SHOX. The horse reference assembly of the PAR is highly fragmented, and in order to characterize this region we sequenced bacterial artificial chromosome (BAC) clones by single-molecule real-time (SMRT) sequencing technology. This considerably improved the assembly and enabled size estimations of the two deletions to 160-180 kb and 60-80 kb, respectively. Complete association between the presence of these deletions and disease status was verified in eight other affected horses. The result of the present study is consistent with previous studies in humans showing crucial importance of SHOX for normal skeletal development.
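
The screening step described in this abstract (keeping only sites where cases and the control pool are fixed for opposite alleles) can be sketched as a simple filter. This is an illustrative assumption, not the authors' code: the function name `opposite_fixed_sites`, the site-key format, and the use of alternate-allele frequencies per group are hypothetical.

```python
from typing import Dict, List


def opposite_fixed_sites(case_freq: Dict[str, float],
                         control_freq: Dict[str, float]) -> List[str]:
    """Return sites where cases and the control pool are fixed for
    opposite alleles.

    Inputs map a site identifier to the alternate-allele frequency in
    each group (hypothetical representation). A site qualifies when one
    group has frequency 1.0 and the other has frequency 0.0.
    """
    hits: List[str] = []
    # Only sites observed in both groups can be compared.
    for site in sorted(case_freq.keys() & control_freq.keys()):
        pair = (case_freq[site], control_freq[site])
        if pair in ((1.0, 0.0), (0.0, 1.0)):
            hits.append(site)
    return hits
```

In practice, pooled allele frequencies are estimates from read counts, so an exact comparison against 0.0 and 1.0 would likely be replaced by thresholds that tolerate sequencing error.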


Subject(s)
Bones/metabolism, Chromosome Mapping, Genome, Homeodomain Proteins/genetics, Horses/genetics, Pseudoautosomal Regions/chemistry, Sequence Deletion, Animals, Base Sequence, Bones/abnormalities, Female, Genetic Loci, Heterozygote, High-Throughput Nucleotide Sequencing, Homeodomain Proteins/metabolism, Homozygote, Male, Pseudoautosomal Regions/metabolism, Cytokine Receptors/genetics, Cytokine Receptors/metabolism
9.
Science ; 352(6281): aae0344, 2016 Apr 01.
Article in English | MEDLINE | ID: mdl-27034376

ABSTRACT

Accurate sequence and assembly of genomes is a critical first step for studies of genetic variation. We generated a high-quality assembly of the gorilla genome using single-molecule, real-time sequence technology and a string graph de novo assembly algorithm. The new assembly improves contiguity by two to three orders of magnitude with respect to previously released assemblies, recovering 87% of missing reference exons and incomplete gene models. Although regions of large, high-identity segmental duplications remain largely unresolved, this comprehensive assembly provides new biological insight into genetic diversity, structural variation, gene loss, and representation of repeat structures within the gorilla genome. The approach provides a path forward for the routine assembly of mammalian genomes at a level approaching that of the current quality of the human genome.


Subject(s)
Gorilla gorilla/genetics, DNA Sequence Analysis/methods, Animals, Contig Mapping, Molecular Evolution, Expressed Sequence Tags, Female, Genetic Variation, Human Genome, Genomics, Humans, Sequence Alignment
10.
Proc Natl Acad Sci U S A ; 112(52): E7223-9, 2015 Dec 29.
Article in English | MEDLINE | ID: mdl-26668394

ABSTRACT

NK-lysin is an antimicrobial peptide and effector protein in the host innate immune system. It is coded by a single gene in humans and most other mammalian species. In this study, we provide evidence for the existence of four NK-lysin genes in a repetitive region on cattle chromosome 11. The NK2A, NK2B, and NK2C genes are tandemly arrayed as three copies in ∼30-35-kb segments, located 41.8 kb upstream of NK1. All four genes are functional, albeit with differential tissue expression. NK1, NK2A, and NK2B exhibited the highest expression in intestine Peyer's patch, whereas NK2C was expressed almost exclusively in lung. The four peptide products were synthesized ex vivo, and their antimicrobial effects against both Gram-positive and Gram-negative bacteria were confirmed with a bacteria-killing assay. Transmission electron microscopy indicated that bovine NK-lysins exhibited their antimicrobial activities by lytic action in the cell membranes. In summary, the single NK-lysin gene in other mammals has expanded to a four-member gene family by tandem duplications in cattle; all four genes are transcribed, and the synthetic peptides corresponding to the core regions are biologically active and likely contribute to innate immunity in ruminants.


Subject(s)
Cattle/genetics, Gene Dosage, Multigene Family, Proteolipids/genetics, Amino Acid Sequence, Animals, Base Sequence, Mammalian Chromosomes/genetics, Escherichia coli/drug effects, Escherichia coli/growth & development, Escherichia coli/ultrastructure, Gene Expression Profiling, Gene Order, Transmission Electron Microscopy, Molecular Sequence Data, Organ Specificity/genetics, Peptides/pharmacology, Phylogeny, Proteolipids/classification, Proteolipids/pharmacology, Amino Acid Sequence Homology, Nucleic Acid Sequence Homology
11.
Nature ; 526(7571): 75-81, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26432246

ABSTRACT

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.


Subject(s)
Genetic Variation/genetics, Human Genome/genetics, Physical Chromosome Mapping, Amino Acid Sequence, Genetic Predisposition to Disease, Medical Genetics, Population Genetics, Genome-Wide Association Study, Genomics, Genotype, Haplotypes/genetics, Homozygote, Humans, Molecular Sequence Data, Mutation Rate, Single Nucleotide Polymorphism/genetics, Quantitative Trait Loci/genetics, DNA Sequence Analysis, Sequence Deletion/genetics
12.
Genes Immun ; 16(1): 24-34, 2015.
Article in English | MEDLINE | ID: mdl-25338678

ABSTRACT

Germline variation at immunoglobulin (IG) loci is critical for pathogen-mediated immunity, but establishing complete haplotype sequences in these regions has been problematic because of complex sequence architecture and diploid source DNA. We sequenced BAC clones from the effectively haploid human hydatidiform mole cell line, CHM1htert, across the light chain IG loci, kappa (IGK) and lambda (IGL), creating single haplotype representations of these regions. The IGL haplotype generated here is 1.25 Mb of contiguous sequence, including four novel IGLV alleles, one novel IGLC allele, and an 11.9-kb insertion. The CH17 IGK haplotype consists of two 644 kb proximal and 466 kb distal contigs separated by a large gap of unknown size; these assemblies added 49 kb of unique sequence extending into this gap. Our analysis also resulted in the characterization of seven novel IGKV alleles and a 16.7-kb region exhibiting signatures of interlocus sequence exchange between distal and proximal IGKV gene clusters. Genetic diversity in IGK/IGL was compared with that of the IG heavy chain (IGH) locus within the same haploid genome, revealing threefold (IGK) and sixfold (IGL) higher diversity in the IGH locus, potentially associated with increased levels of segmental duplication and the telomeric location of IGH.


Subject(s)
Immunoglobulin Light Chain Genes, Hydatidiform Mole/genetics, Tumor Cell Line, Bacterial Artificial Chromosomes, Female, Immunoglobulin Heavy Chain Genes, Humans, Molecular Sequence Data, Single Nucleotide Polymorphism, Pregnancy
13.
Nature ; 517(7536): 608-11, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25383537

ABSTRACT

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.


Subject(s)
Genetic Variation/genetics, Human Genome/genetics, Genomics, DNA Sequence Analysis/methods, Chromosome Inversion/genetics, Human Chromosome Pair 10/genetics, Molecular Cloning, GC Rich Sequence/genetics, Haploidy, Humans, Insertional Mutagenesis/genetics, Reference Standards, Tandem Repeat Sequences/genetics
14.
Nat Genet ; 46(12): 1293-302, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25326701

ABSTRACT

Recurrent deletions of chromosome 15q13.3 associate with intellectual disability, schizophrenia, autism and epilepsy. To gain insight into the instability of this region, we sequenced it in affected individuals, normal individuals and nonhuman primates. We discovered five structural configurations of the human chromosome 15q13.3 region ranging in size from 2 to 3 Mb. These configurations arose recently (∼0.5-0.9 million years ago) as a result of human-specific expansions of segmental duplications and two independent inversion events. All inversion breakpoints map near GOLGA8 core duplicons, a ∼14-kb primate-specific chromosome 15 repeat that became organized into larger palindromic structures. GOLGA8-flanked palindromes also demarcate the breakpoints of recurrent 15q13.3 microdeletions, the expansion of chromosome 15 segmental duplications in the human lineage and independent structural changes in apes. The significant clustering (P = 0.002) of breakpoints provides mechanistic evidence for the role of this core duplicon and its palindromic architecture in promoting the evolutionary and disease-related instability of chromosome 15.


Subject(s)
Chromosome Disorders/genetics, Intellectual Disability/genetics, Repetitive Nucleic Acid Sequences, Genomic Segmental Duplications, Seizures/genetics, Animals, Biological Evolution, Chromosome Deletion, Bacterial Artificial Chromosomes, Human Chromosome Pair 15/genetics, Cluster Analysis, Comparative Genomic Hybridization, Gene Dosage, Human Genome, Humans, Fluorescence In Situ Hybridization, Genetic Models, Genetic Polymorphism, Primates, DNA Sequence Analysis
15.
PLoS One ; 9(8): e104396, 2014.
Article in English | MEDLINE | ID: mdl-25116239

ABSTRACT

Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; Odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS.


Subject(s)
Asthma/genetics, Founder Effect, Genetic Predisposition to Disease, Human Genome, Genome-Wide Association Study, Alleles, Chromosome Mapping, Comparative Genomic Hybridization, DNA Copy Number Variations, Endosomal Sorting Complexes Required for Transport/genetics, Female, Gene Frequency, Genetic Variation, High-Throughput Nucleotide Sequencing, Humans, Introns, Male, Nedd4 Ubiquitin Protein Ligases, Single Nucleotide Polymorphism, Population Groups/genetics, Sequence Deletion, Ubiquitin-Protein Ligases/genetics
16.
Genome Res ; 24(4): 688-96, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24418700

ABSTRACT

Obtaining high-quality sequence continuity of complex regions of recent segmental duplication remains one of the major challenges of finishing genome assemblies. In the human and mouse genomes, this was achieved by targeting large-insert clones using costly and laborious capillary-based sequencing approaches. Sanger shotgun sequencing of clone inserts, however, has now been largely abandoned, leaving most of these regions unresolved in newer genome assemblies generated primarily by next-generation sequencing hybrid approaches. Here we show that it is possible to resolve regions that are complex in a genome-wide context but simple in isolation for a fraction of the time and cost of traditional methods using long-read single molecule, real-time (SMRT) sequencing and assembly technology from Pacific Biosciences (PacBio). We sequenced and assembled BAC clones corresponding to a 1.3-Mbp complex region of chromosome 17q21.31, demonstrating 99.994% identity to Sanger assemblies of the same clones. We targeted 44 differences using Illumina sequencing and find that PacBio and Sanger assemblies share a comparable number of validated variants, albeit with different sequence context biases. Finally, we targeted a poorly assembled 766-kbp duplicated region of the chimpanzee genome and resolved the structure and organization for a fraction of the cost and time of traditional finishing approaches. Our data suggest a straightforward path for upgrading genomes to a higher quality finished state.


Subject(s)
Human Chromosome Pair 17/genetics, Bacterial Genome/genetics, High-Throughput Nucleotide Sequencing/methods, Animals, Bacterial Artificial Chromosomes/genetics, Humans, Mice, Molecular Sequence Data, Pan troglodytes/genetics
17.
Genome Res ; 23(11): 1763-73, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24077392

ABSTRACT

Ape chromosomes homologous to human chromosomes 14 and 15 were generated by a fission event of an ancestral submetacentric chromosome, where the two chromosomes were joined head-to-tail. The hominoid ancestral chromosome most closely resembles the macaque chromosome 7. In this work, we provide insights into the evolution of human chromosomes 14 and 15, performing a comparative study between macaque boundary region 14/15 and the orthologous human regions. We construct a 1.6-Mb contig of macaque BAC clones in the region orthologous to the ancestral hominoid fission site and use it to define the structural changes that occurred on human 14q pericentromeric and 15q subtelomeric regions. We characterize the novel euchromatin-heterochromatin transition region (∼20 Mb) acquired during the neocentromere establishment on chromosome 14, and find it was mainly derived through pericentromeric duplications from ancestral hominoid chromosomes homologous to human 2q14-qter and 10. Further, we show a relationship between evolutionary hotspots and low-copy repeat loci for chromosome 15, revealing a possible role of segmental duplications not only in mediating but also in "stitching" together rearrangement breakpoints.


Subject(s)
Human Chromosome Pair 14/genetics, Human Chromosome Pair 15/genetics, Mammalian Chromosomes/genetics, Molecular Evolution, Hominidae/genetics, Genomic Segmental Duplications, Animals, Chromosome Breakpoints, Chromosome Duplication, Bacterial Artificial Chromosomes, Molecular Cloning, Euchromatin/genetics, Heterochromatin/genetics, Humans, Molecular Sequence Data, Phylogeny
18.
Proc Natl Acad Sci U S A ; 110(33): 13457-62, 2013 Aug 13.
Article in English | MEDLINE | ID: mdl-23884656

ABSTRACT

We analyzed 83 fully sequenced great ape genomes for mobile element insertions, predicting a total of 49,452 fixed and polymorphic Alu and long interspersed element 1 (L1) insertions not present in the human reference assembly and assigning each retrotransposition event to a different time point during great ape evolution. We used these homoplasy-free markers to construct a mobile element insertions-based phylogeny of humans and great apes and demonstrate their differential power to discern ape subspecies and populations. Within this context, we find a good correlation between L1 diversity and single-nucleotide polymorphism heterozygosity (r² = 0.65) in contrast to Alu repeats, which show little correlation (r² = 0.07). We estimate that the "rate" of Alu retrotransposition has differed by a factor of 15-fold in these lineages. Humans, chimpanzees, and bonobos show the highest rates of Alu accumulation, the latter two since divergence 1.5 Mya. The L1 insertion rate, in contrast, has remained relatively constant, with rates differing by less than a factor of three. We conclude that Alu retrotransposition has been the most variable form of genetic variation during recent human-great ape evolution, with increases and decreases occurring over very short periods of evolutionary time.


Subject(s)
Genetic Variation , Genome/genetics , Hominidae/genetics , Phylogeny , Alu Elements/genetics , Animals , Cluster Analysis , DNA Primers/genetics , Genomics , Hominidae/classification , Humans , Likelihood Functions , Long Interspersed Nucleotide Elements/genetics , Polymerase Chain Reaction , Polymorphism, Single Nucleotide , Principal Component Analysis , Species Specificity
19.
Nature ; 499(7459): 471-5, 2013 Jul 25.
Article in English | MEDLINE | ID: mdl-23823723

ABSTRACT

Most great ape genetic variation remains uncharacterized; however, its study is critical for understanding population history, recombination, selection and susceptibility to disease. Here we sequence to high coverage a total of 79 wild- and captive-born individuals representing all six great ape species and seven subspecies and report 88.8 million single nucleotide polymorphisms. Our analysis provides support for genetically distinct populations within each species, signals of gene flow, and the split of common chimpanzees into two distinct groups: Nigeria-Cameroon/western and central/eastern populations. We find extensive inbreeding in almost all wild populations, with eastern gorillas being the most extreme. Inferred effective population sizes have varied radically over time in different lineages and this appears to have a profound effect on the genetic diversity at, or close to, genes in almost all species. We discover and assign 1,982 loss-of-function variants throughout the human and great ape lineages, determining that the rate of gene loss has not been different in the human branch compared to other internal branches in the great ape phylogeny. This comprehensive catalogue of great ape genome diversity provides a framework for understanding evolution and a resource for more effective management of wild and captive great ape populations.


Subject(s)
Genetic Variation , Hominidae/genetics , Africa , Animals , Animals, Wild/genetics , Animals, Zoo/genetics , Asia, Southeastern , Evolution, Molecular , Gene Flow/genetics , Genetics, Population , Genome/genetics , Gorilla gorilla/classification , Gorilla gorilla/genetics , Hominidae/classification , Humans , Inbreeding , Pan paniscus/classification , Pan paniscus/genetics , Pan troglodytes/classification , Pan troglodytes/genetics , Phylogeny , Polymorphism, Single Nucleotide/genetics , Population Density
20.
Genome Res ; 23(9): 1373-82, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23825009

ABSTRACT

Copy number variation (CNV) contributes to disease and has restructured the genomes of great apes. The diversity and rate of this process, however, have not been extensively explored among great ape lineages. We analyzed 97 deeply sequenced great ape and human genomes and estimate 16% (469 Mb) of the hominid genome has been affected by recent CNV. We identify a comprehensive set of fixed gene deletions (n = 340) and duplications (n = 405) as well as >13.5 Mb of sequence that has been specifically lost on the human lineage. We compared the diversity and rates of copy number and single nucleotide variation across the hominid phylogeny. We find that CNV diversity partially correlates with single nucleotide diversity (r² = 0.5) and recapitulates the phylogeny of apes with few exceptions. Duplications significantly outpace deletions (2.8-fold). The load of segregating duplications remains significantly higher in bonobos, Western chimpanzees, and Sumatran orangutans, populations that have experienced recent genetic bottlenecks (P = 0.0014, 0.02, and 0.0088, respectively). The rate of fixed deletion has been more clocklike with the exception of the chimpanzee lineage, where we observe a twofold increase in the chimpanzee-bonobo ancestor (P = 4.79 × 10⁻⁹) and increased deletion load among Western chimpanzees (P = 0.002). The latter includes the first genomic disorder in a chimpanzee with features resembling Smith-Magenis syndrome mediated by a chimpanzee-specific increase in segmental duplication complexity. We hypothesize that demographic effects, such as bottlenecks, have contributed to larger and more gene-rich segments being deleted in the chimpanzee lineage and that this effect, more generally, may account for episodic bursts in CNV during hominid evolution.


Subject(s)
DNA Copy Number Variations , Evolution, Molecular , Hominidae/genetics , Phylogeny , Animals , Base Sequence , Gene Deletion , Gene Duplication , Genetic Load , Genome, Human , Humans , Molecular Sequence Data , Pedigree , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
...