1.
medRxiv ; 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38853970

ABSTRACT

Background: Cytogenetic analysis encompasses a suite of standard-of-care diagnostic testing methods that is routinely applied in cases of acute myeloid leukemia (AML) to assess chromosomal changes that are clinically relevant for risk classification and treatment decisions. Objective: In this study, we assess the use of Genomic Proximity Mapping (GPM) for cytogenomic analysis of AML diagnostic specimens for detection of cytogenetic risk variants included in the European Leukemia Network (ELN) risk stratification guidelines. Methods: Archival patient samples (N=48) from the Fred Hutchinson Cancer Center leukemia bank with historical clinical cytogenetic data were processed for GPM and analyzed with the CytoTerra® cloud-based analysis platform. Results: GPM showed 100% concordance for all specific variants that have associated impacts on risk stratification as defined by ELN 2022 criteria, and a 72% concordance rate when considering all variants reported by the FH cytogenetic lab. GPM identified 39 additional variants, including variants of known clinical impact, not observed by cytogenetics. Conclusions: GPM is an effective solution for the evaluation of known AML-associated risk variants and a source for biomarker discovery.

2.
Genome Biol ; 21(1): 202, 2020 08 10.
Article in English | MEDLINE | ID: mdl-32778141

ABSTRACT

BACKGROUND: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human-ape gene families, nuclear pore interacting protein (NPIP). RESULTS: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis. CONCLUSIONS: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.


Subjects
Molecular Evolution , Gene Duplication , Primates/genetics , Genomic Segmental Duplications , Animals , Biodiversity , Brain , Chromosome Mapping , Chromosomes , Exons , Gene Fusion , Human Genome , Genomic Instability , Hominidae , Humans , Phylogeny
3.
Methods Mol Biol ; 2161: 209-228, 2020.
Article in English | MEDLINE | ID: mdl-32681515

ABSTRACT

R-loops are three-stranded structures that form during transcription when the nascent RNA hybridizes with the template DNA resulting in a DNA:RNA hybrid and a looped-out single-stranded DNA (ssDNA) strand. These structures are important for normal cellular processes and aberrant R-loop formation has been implicated in a number of pathological outcomes, including certain cancers and neurodegenerative diseases. Mapping R-loops has primarily been performed using DRIP (DNA:RNA immunoprecipitation) based methods that are dependent on the anti-DNA:RNA hybrid S9.6 antibody and short-read sequencing. While DRIP-based methods are robust and report R-loop formation genome-wide, they only do so at the population average level; interrogating R-loop formation at the single molecule level is not feasible with such approaches. Here we present single molecule R-loop footprinting (SMRF-seq), a method that relies on the chemical reactivity of the displaced ssDNA strand to non-denaturing sodium bisulfite and single molecule long-read sequencing as a readout, to characterize R-loops. SMRF-seq can be used independently of S9.6 to generate high resolution, strand-specific, maps of individual R-loops at ultra-deep coverage on kilobases-length DNA fragments.


Subjects
R-Loop Structures , RNA Sequence Analysis/methods , HeLa Cells , Humans
4.
J Mol Biol ; 432(7): 2271-2288, 2020 03 27.
Article in English | MEDLINE | ID: mdl-32105733

ABSTRACT

R-loops are a prevalent class of non-B DNA structures that have been associated with both positive and negative cellular outcomes. DNA:RNA immunoprecipitation (DRIP) approaches based on the anti-DNA:RNA hybrid S9.6 antibody revealed that R-loops form dynamically over conserved genic hotspots. We have developed an orthogonal approach that queries R-loops via the presence of long stretches of single-stranded DNA on their looped-out strand. Nondenaturing sodium bisulfite treatment catalyzes the conversion of unpaired cytosines to uracils, creating permanent genetic tags for the position of an R-loop. Long-read, single-molecule PacBio sequencing allows the identification of R-loop 'footprints' at near nucleotide resolution in a strand-specific manner on long single DNA molecules and at ultra-deep coverage. Single-molecule R-loop footprinting coupled with PacBio sequencing (SMRF-seq) revealed a strong agreement between S9.6-based and bisulfite-based R-loop mapping and confirmed that R-loops form over genic hotspots, including gene bodies and terminal gene regions. Based on the largest single-molecule R-loop dataset to date, we show that individual R-loops form nonrandomly, defining discrete sets of overlapping molecular clusters that pileup through larger R-loop zones. R-loops most often map to intronic regions and their individual start and stop positions do not match with intron-exon boundaries, reinforcing the model that they form cotranscriptionally from unspliced transcripts. SMRF-seq further established that R-loop distribution patterns are not simply driven by intrinsic DNA sequence features but most likely also reflect DNA topological constraints. Overall, DRIP-based and SMRF-based approaches independently provide a complementary and congruent view of R-loop distribution, consolidating our understanding of the principles underlying R-loop formation.


Subjects
DNA/chemistry , Embryonal Carcinoma Stem Cells/metabolism , R-Loop Structures , RNA/chemistry , Single-Cell Analysis/methods , Genetic Transcription , Embryonal Carcinoma Stem Cells/cytology , Humans
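
Illustrative note on entry 4: the footprinting idea in the abstract (unpaired cytosines are converted by bisulfite, so runs of converted cytosines mark the looped-out strand) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `call_footprints`, the `min_converted` threshold, and the gap-free pre-aligned input are all hypothetical, and only one strand (C read as T) is considered, whereas the published method is strand-specific.

```python
def call_footprints(ref, read, min_converted=3):
    """Return (start, end) index pairs of runs of consecutive converted cytosines.

    Assumptions (hypothetical, for illustration only):
      - `ref` and `read` are already aligned, gap-free, and of equal length;
      - a converted cytosine appears as 'T' in the read (uracil is read as
        thymine after amplification);
      - a footprint is a run of reference cytosines that are all converted,
        containing at least `min_converted` cytosines.
    """
    if len(ref) != len(read):
        raise ValueError("ref and read must have the same length")

    footprints = []
    start = None   # index of first converted C in the current run
    last = None    # index of most recent converted C in the current run
    count = 0      # number of converted Cs in the current run

    for i, (r, q) in enumerate(zip(ref, read)):
        if r != "C":
            continue  # only reference cytosines are informative
        if q == "T":
            if start is None:
                start, count = i, 0
            last = i
            count += 1
        else:
            # an unconverted (or otherwise mismatched) C ends the current run
            if start is not None and count >= min_converted:
                footprints.append((start, last))
            start = None

    if start is not None and count >= min_converted:
        footprints.append((start, last))
    return footprints


if __name__ == "__main__":
    reference = "ACCGCATCCGTCA"
    converted = "ATTGTATCTGTTA"  # Cs at 1, 2, 4, 8, 11 converted; C at 7 protected
    print(call_footprints(reference, converted, min_converted=3))
```

Lowering `min_converted` makes the caller more sensitive to short single-stranded stretches at the cost of more spurious calls; the threshold shown here is arbitrary.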
5.
Proc Natl Acad Sci U S A ; 116(13): 6260-6269, 2019 03 26.
Article in English | MEDLINE | ID: mdl-30850542

ABSTRACT

R-loops are abundant three-stranded nucleic-acid structures that form in cis during transcription. Experimental evidence suggests that R-loop formation is affected by DNA sequence and topology. However, the exact manner by which these factors interact to determine R-loop susceptibility is unclear. To investigate this, we developed a statistical mechanical equilibrium model of R-loop formation in superhelical DNA. In this model, the energy involved in forming an R-loop includes four terms: junctional and base-pairing energies, and energies associated with superhelicity and with the torsional winding of the displaced DNA single strand around the RNA:DNA hybrid. This model shows that the significant energy barrier imposed by the formation of junctions can be overcome in two ways. First, base-pairing energy can favor RNA:DNA over DNA:DNA duplexes in favorable sequences. Second, R-loops, by absorbing negative superhelicity, partially or fully relax the rest of the DNA domain, thereby returning it to a lower energy state. In vitro transcription assays confirmed that R-loops cause plasmid relaxation and that negative superhelicity is required for R-loops to form, even in a favorable region. Single-molecule R-loop footprinting following in vitro transcription showed a strong agreement between theoretical predictions and experimental mapping of stable R-loop positions and further revealed the impact of DNA topology on the R-loop distribution landscape. Our results clarify the interplay between base sequence and DNA superhelicity in controlling R-loop stability. They also reveal R-loops as powerful and reversible topology sinks that cells may use to nonenzymatically relieve superhelical stress during transcription.


Subjects
Base Sequence , Superhelical DNA/chemistry , DNA/chemistry , Nucleic Acid Conformation , Single-Stranded DNA/chemistry , Genetic Models , Nucleic Acid Hybridization , Plasmids/chemistry , RNA/chemistry , Genetic Transcription
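
Illustrative note on entry 5: the four energy terms named in the abstract can be written schematically as a generic equilibrium (Boltzmann) model. The notation below is an assumption made for illustration and is not taken from the paper: s indexes candidate R-loop states, s = 0 denotes the state with no R-loop, and the functional forms of the individual terms are left unspecified.

```latex
% Assumed notation (not from the abstract): s indexes candidate R-loop states,
% s = 0 is the state with no R-loop, and k_B T is the thermal energy.
\[
  G(s) = G_{\mathrm{junction}} + G_{\mathrm{bp}}(s)
       + G_{\mathrm{superhelicity}}(s) + G_{\mathrm{winding}}(s)
\]
\[
  P(s) = \frac{e^{-G(s)/k_B T}}{Z},
  \qquad
  Z = \sum_{s'} e^{-G(s')/k_B T}
\]
```

In this form, a positive junction term penalizes every R-loop state relative to s = 0, while favorable base-pairing or superhelical relaxation lowers G(s) and raises the equilibrium probability of that state, which is the qualitative behavior the abstract describes.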
6.
Nat Ecol Evol ; 1(3): 69, 2017.
Article in English | MEDLINE | ID: mdl-28580430

ABSTRACT

Segmental duplications contribute to human evolution, adaptation and genomic instability but are often poorly characterized. We investigate the evolution, genetic variation and coding potential of human-specific segmental duplications (HSDs). We identify 218 HSDs based on analysis of 322 deeply sequenced archaic and contemporary hominid genomes. We sequence 550 human and nonhuman primate genomic clones to reconstruct the evolution of the largest, most complex regions with protein-coding potential (n=80 genes/33 gene families). We show that HSDs are non-randomly organized, associate preferentially with ancestral ape duplications termed "core duplicons", and evolved primarily in an interspersed inverted orientation. In addition to Homo sapiens-specific gene expansions (e.g., TCAF1/2), we highlight ten gene families (e.g., ARHGAP11B and SRGAP2C) where copy number never returns to the ancestral state, there is evidence of mRNA splicing, and no common gene-disruptive mutations are observed in the general population. Such duplicates are candidates for the evolution of human-specific adaptive traits.

7.
Sci Rep ; 7: 41980, 2017 02 03.
Article in English | MEDLINE | ID: mdl-28155877

ABSTRACT

Most evolutionary new centromeres (ENC) are composed of large arrays of satellite DNA and surrounded by segmental duplications. However, the hypothesis is that ENCs are seeded in an anonymous sequence and only over time have acquired the complexity of "normal" centromeres. Up to now evidence to test this hypothesis was lacking. We recently discovered that the well-known polymorphism of orangutan chromosome 12 was due to the presence of an ENC. We sequenced the genome of an orangutan homozygous for the ENC, and we focused our analysis on the comparison of the ENC domain with respect to its wild type counterpart. No significant variations were found. This finding is the first clear evidence that ENC seedings are epigenetic in nature. The compaction of the ENC domain was found significantly higher than the corresponding WT region and, interestingly, the expression of the only gene embedded in the region was significantly repressed.


Subjects
Centromere/genetics , Genetic Epigenesis , Molecular Evolution , Animals , Cell Line , Conserved Sequence , Satellite DNA/genetics , Humans , Pongo abelii
8.
Nature ; 536(7615): 205-9, 2016 08 11.
Article in English | MEDLINE | ID: mdl-27487209

ABSTRACT

Genetic differences that specify unique aspects of human evolution have typically been identified by comparative analyses between the genomes of humans and closely related primates, including more recently the genomes of archaic hominins. Not all regions of the genome, however, are equally amenable to such study. Recurrent copy number variation (CNV) at chromosome 16p11.2 accounts for approximately 1% of cases of autism and is mediated by a complex set of segmental duplications, many of which arose recently during human evolution. Here we reconstruct the evolutionary history of the locus and identify bolA family member 2 (BOLA2) as a gene duplicated exclusively in Homo sapiens. We estimate that a 95-kilobase-pair segment containing BOLA2 duplicated across the critical region approximately 282 thousand years ago (ka), one of the latest among a series of genomic changes that dramatically restructured the locus during hominid evolution. All humans examined carried one or more copies of the duplication, which nearly fixed early in the human lineage, a pattern unlikely to have arisen so rapidly in the absence of selection (P < 0.0097). We show that the duplication of BOLA2 led to a novel, human-specific in-frame fusion transcript and that BOLA2 copy number correlates with both RNA expression (r = 0.36) and protein level (r = 0.65), with the greatest expression difference between human and chimpanzee in experimentally derived stem cells. Analyses of 152 patients carrying a chromosome 16p11.2 rearrangement show that more than 96% of breakpoints occur within the H. sapiens-specific duplication. In summary, the duplicative transposition of BOLA2 at the root of the H. sapiens lineage about 282 ka simultaneously increased copy number of a gene associated with iron homeostasis and predisposed our species to recurrent rearrangements associated with disease.


Subjects
Human Chromosomes Pair 16/genetics , DNA Copy Number Variations/genetics , Molecular Evolution , Genetic Predisposition to Disease , Proteins/genetics , Animals , Autistic Disorder/genetics , Chromosome Breakage , Gene Duplication , Homeostasis/genetics , Humans , Iron/metabolism , Pan troglodytes/genetics , Pongo/genetics , Proteins/analysis , Genetic Recombination , Species Specificity , Time Factors
9.
G3 (Bethesda) ; 6(7): 2213-23, 2016 07 07.
Article in English | MEDLINE | ID: mdl-27207956

ABSTRACT

Skeletal atavism in Shetland ponies is a heritable disorder characterized by abnormal growth of the ulna and fibula that extend the carpal and tarsal joints, respectively. This causes abnormal skeletal structure and impaired movements, and affected foals are usually killed. In order to identify the causal mutation we subjected six confirmed Swedish cases and a DNA pool consisting of 21 control individuals to whole genome resequencing. We screened for polymorphisms where the cases and the control pool were fixed for opposite alleles and observed this signature for only 25 SNPs, most of which were scattered on genome assembly unassigned scaffolds. Read depth analysis at these loci revealed homozygosity or compound heterozygosity for two partially overlapping large deletions in the pseudoautosomal region (PAR) of chromosome X/Y in cases but not in the control pool. One of these deletions removes the entire coding region of the SHOX gene and both deletions remove parts of the CRLF2 gene located downstream of SHOX. The horse reference assembly of the PAR is highly fragmented, and in order to characterize this region we sequenced bacterial artificial chromosome (BAC) clones by single-molecule real-time (SMRT) sequencing technology. This considerably improved the assembly and enabled size estimations of the two deletions to 160-180 kb and 60-80 kb, respectively. Complete association between the presence of these deletions and disease status was verified in eight other affected horses. The result of the present study is consistent with previous studies in humans showing crucial importance of SHOX for normal skeletal development.


Subjects
Bone and Bones/metabolism , Chromosome Mapping , Genome , Homeodomain Proteins/genetics , Horses/genetics , Pseudoautosomal Regions/chemistry , Sequence Deletion , Animals , Base Sequence , Bone and Bones/abnormalities , Female , Genetic Loci , Heterozygote , High-Throughput Nucleotide Sequencing , Homeodomain Proteins/metabolism , Homozygote , Male , Pseudoautosomal Regions/metabolism , Cytokine Receptors/genetics , Cytokine Receptors/metabolism
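
Illustrative note on entry 9: the screen described in the abstract (keeping only polymorphisms where the cases and the control pool are fixed for opposite alleles) can be sketched as a simple filter over per-site allele counts. This is a minimal sketch, not the authors' code: the function name `opposite_fixed`, the record layout, and the example counts are hypothetical, and real pooled sequencing data would normally need a tolerance for sequencing errors rather than the strict zero-count test used here.

```python
def opposite_fixed(sites):
    """Return positions where cases and the control pool are fixed for opposite alleles.

    Each site is a dict (hypothetical layout) with:
      - "pos":         position label
      - "case_counts": (ref_reads, alt_reads) summed over cases
      - "pool_counts": (ref_reads, alt_reads) in the control pool
    """
    hits = []
    for site in sites:
        case_ref, case_alt = site["case_counts"]
        pool_ref, pool_alt = site["pool_counts"]

        # skip sites with no coverage in either group
        if case_ref + case_alt == 0 or pool_ref + pool_alt == 0:
            continue

        cases_alt_pool_ref = case_ref == 0 and pool_alt == 0
        cases_ref_pool_alt = case_alt == 0 and pool_ref == 0
        if cases_alt_pool_ref or cases_ref_pool_alt:
            hits.append(site["pos"])
    return hits


if __name__ == "__main__":
    example = [  # made-up counts for illustration only
        {"pos": "scaffold1:100", "case_counts": (0, 40), "pool_counts": (55, 0)},
        {"pos": "scaffold1:250", "case_counts": (20, 20), "pool_counts": (50, 0)},
        {"pos": "scaffold2:17", "case_counts": (38, 0), "pool_counts": (0, 61)},
        {"pos": "scaffold3:9", "case_counts": (0, 0), "pool_counts": (30, 0)},
    ]
    print(opposite_fixed(example))
```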
10.
Science ; 352(6281): aae0344, 2016 Apr 01.
Article in English | MEDLINE | ID: mdl-27034376

ABSTRACT

Accurate sequence and assembly of genomes is a critical first step for studies of genetic variation. We generated a high-quality assembly of the gorilla genome using single-molecule, real-time sequence technology and a string graph de novo assembly algorithm. The new assembly improves contiguity by two to three orders of magnitude with respect to previously released assemblies, recovering 87% of missing reference exons and incomplete gene models. Although regions of large, high-identity segmental duplications remain largely unresolved, this comprehensive assembly provides new biological insight into genetic diversity, structural variation, gene loss, and representation of repeat structures within the gorilla genome. The approach provides a path forward for the routine assembly of mammalian genomes at a level approaching that of the current quality of the human genome.


Subjects
Gorilla gorilla/genetics , DNA Sequence Analysis/methods , Animals , Contig Mapping , Molecular Evolution , Expressed Sequence Tags , Female , Genetic Variation , Human Genome , Genomics , Humans , Sequence Alignment
11.
Proc Natl Acad Sci U S A ; 112(52): E7223-9, 2015 Dec 29.
Article in English | MEDLINE | ID: mdl-26668394

ABSTRACT

NK-lysin is an antimicrobial peptide and effector protein in the host innate immune system. It is coded by a single gene in humans and most other mammalian species. In this study, we provide evidence for the existence of four NK-lysin genes in a repetitive region on cattle chromosome 11. The NK2A, NK2B, and NK2C genes are tandemly arrayed as three copies in ∼30-35-kb segments, located 41.8 kb upstream of NK1. All four genes are functional, albeit with differential tissue expression. NK1, NK2A, and NK2B exhibited the highest expression in intestine Peyer's patch, whereas NK2C was expressed almost exclusively in lung. The four peptide products were synthesized ex vivo, and their antimicrobial effects against both Gram-positive and Gram-negative bacteria were confirmed with a bacteria-killing assay. Transmission electron microscopy indicated that bovine NK-lysins exhibited their antimicrobial activities by lytic action in the cell membranes. In summary, the single NK-lysin gene in other mammals has expanded to a four-member gene family by tandem duplications in cattle; all four genes are transcribed, and the synthetic peptides corresponding to the core regions are biologically active and likely contribute to innate immunity in ruminants.


Subjects
Cattle/genetics , Gene Dosage , Multigene Family , Proteolipids/genetics , Amino Acid Sequence , Animals , Base Sequence , Mammalian Chromosomes/genetics , Escherichia coli/drug effects , Escherichia coli/growth & development , Escherichia coli/ultrastructure , Gene Expression Profiling , Gene Order , Transmission Electron Microscopy , Molecular Sequence Data , Organ Specificity/genetics , Peptides/pharmacology , Phylogeny , Proteolipids/classification , Proteolipids/pharmacology , Amino Acid Sequence Homology , Nucleic Acid Sequence Homology
12.
Nature ; 526(7571): 75-81, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26432246

ABSTRACT

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.


Subjects
Genetic Variation/genetics , Human Genome/genetics , Physical Chromosome Mapping , Amino Acid Sequence , Genetic Predisposition to Disease , Medical Genetics , Population Genetics , Genome-Wide Association Study , Genomics , Genotype , Haplotypes/genetics , Homozygote , Humans , Molecular Sequence Data , Mutation Rate , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , DNA Sequence Analysis , Sequence Deletion/genetics
13.
Genes Immun ; 16(1): 24-34, 2015.
Article in English | MEDLINE | ID: mdl-25338678

ABSTRACT

Germline variation at immunoglobulin (IG) loci is critical for pathogen-mediated immunity, but establishing complete haplotype sequences in these regions has been problematic because of complex sequence architecture and diploid source DNA. We sequenced BAC clones from the effectively haploid human hydatidiform mole cell line, CHM1htert, across the light chain IG loci, kappa (IGK) and lambda (IGL), creating single haplotype representations of these regions. The IGL haplotype generated here is 1.25 Mb of contiguous sequence, including four novel IGLV alleles, one novel IGLC allele, and an 11.9-kb insertion. The CH17 IGK haplotype consists of two 644 kb proximal and 466 kb distal contigs separated by a large gap of unknown size; these assemblies added 49 kb of unique sequence extending into this gap. Our analysis also resulted in the characterization of seven novel IGKV alleles and a 16.7-kb region exhibiting signatures of interlocus sequence exchange between distal and proximal IGKV gene clusters. Genetic diversity in IGK/IGL was compared with that of the IG heavy chain (IGH) locus within the same haploid genome, revealing threefold (IGK) and sixfold (IGL) higher diversity in the IGH locus, potentially associated with increased levels of segmental duplication and the telomeric location of IGH.


Subjects
Immunoglobulin Light Chain Genes , Hydatidiform Mole/genetics , Tumor Cell Line , Bacterial Artificial Chromosomes , Female , Immunoglobulin Heavy Chain Genes , Humans , Molecular Sequence Data , Single Nucleotide Polymorphism , Pregnancy
14.
Nature ; 517(7536): 608-11, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25383537

ABSTRACT

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.


Subjects
Genetic Variation/genetics , Human Genome/genetics , Genomics , DNA Sequence Analysis/methods , Chromosome Inversion/genetics , Human Chromosomes Pair 10/genetics , Molecular Cloning , GC Rich Sequence/genetics , Haploidy , Humans , Insertional Mutagenesis/genetics , Reference Standards , Tandem Repeat Sequences/genetics
15.
Nat Genet ; 46(12): 1293-302, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25326701

ABSTRACT

Recurrent deletions of chromosome 15q13.3 associate with intellectual disability, schizophrenia, autism and epilepsy. To gain insight into the instability of this region, we sequenced it in affected individuals, normal individuals and nonhuman primates. We discovered five structural configurations of the human chromosome 15q13.3 region ranging in size from 2 to 3 Mb. These configurations arose recently (∼0.5-0.9 million years ago) as a result of human-specific expansions of segmental duplications and two independent inversion events. All inversion breakpoints map near GOLGA8 core duplicons, a ∼14-kb primate-specific chromosome 15 repeat that became organized into larger palindromic structures. GOLGA8-flanked palindromes also demarcate the breakpoints of recurrent 15q13.3 microdeletions, the expansion of chromosome 15 segmental duplications in the human lineage and independent structural changes in apes. The significant clustering (P = 0.002) of breakpoints provides mechanistic evidence for the role of this core duplicon and its palindromic architecture in promoting the evolutionary and disease-related instability of chromosome 15.


Subjects
Chromosome Disorders/genetics , Intellectual Disability/genetics , Nucleic Acid Repetitive Sequences , Genomic Segmental Duplications , Seizures/genetics , Animals , Biological Evolution , Chromosome Deletion , Bacterial Artificial Chromosomes , Human Chromosomes Pair 15/genetics , Cluster Analysis , Comparative Genomic Hybridization , Gene Dosage , Human Genome , Humans , Fluorescence In Situ Hybridization , Genetic Models , Genetic Polymorphism , Primates , DNA Sequence Analysis
16.
PLoS One ; 9(8): e104396, 2014.
Article in English | MEDLINE | ID: mdl-25116239

ABSTRACT

Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; Odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS.


Subjects
Asthma/genetics , Founder Effect , Genetic Predisposition to Disease , Human Genome , Genome-Wide Association Study , Alleles , Chromosome Mapping , Comparative Genomic Hybridization , DNA Copy Number Variations , Endosomal Sorting Complexes Required for Transport/genetics , Female , Gene Frequency , Genetic Variation , High-Throughput Nucleotide Sequencing , Humans , Introns , Male , Nedd4 Ubiquitin Protein Ligases , Single Nucleotide Polymorphism , Population Groups/genetics , Sequence Deletion , Ubiquitin-Protein Ligases/genetics
17.
Genome Res ; 24(4): 688-96, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24418700

ABSTRACT

Obtaining high-quality sequence continuity of complex regions of recent segmental duplication remains one of the major challenges of finishing genome assemblies. In the human and mouse genomes, this was achieved by targeting large-insert clones using costly and laborious capillary-based sequencing approaches. Sanger shotgun sequencing of clone inserts, however, has now been largely abandoned, leaving most of these regions unresolved in newer genome assemblies generated primarily by next-generation sequencing hybrid approaches. Here we show that it is possible to resolve regions that are complex in a genome-wide context but simple in isolation for a fraction of the time and cost of traditional methods using long-read single molecule, real-time (SMRT) sequencing and assembly technology from Pacific Biosciences (PacBio). We sequenced and assembled BAC clones corresponding to a 1.3-Mbp complex region of chromosome 17q21.31, demonstrating 99.994% identity to Sanger assemblies of the same clones. We targeted 44 differences using Illumina sequencing and find that PacBio and Sanger assemblies share a comparable number of validated variants, albeit with different sequence context biases. Finally, we targeted a poorly assembled 766-kbp duplicated region of the chimpanzee genome and resolved the structure and organization for a fraction of the cost and time of traditional finishing approaches. Our data suggest a straightforward path for upgrading genomes to a higher quality finished state.


Subjects
Human Chromosomes Pair 17/genetics , Bacterial Genome/genetics , High-Throughput Nucleotide Sequencing/methods , Animals , Bacterial Artificial Chromosomes/genetics , Humans , Mice , Molecular Sequence Data , Pan troglodytes/genetics
18.
Genome Res ; 23(11): 1763-73, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24077392

ABSTRACT

Ape chromosomes homologous to human chromosomes 14 and 15 were generated by a fission event of an ancestral submetacentric chromosome, where the two chromosomes were joined head-to-tail. The hominoid ancestral chromosome most closely resembles the macaque chromosome 7. In this work, we provide insights into the evolution of human chromosomes 14 and 15, performing a comparative study between macaque boundary region 14/15 and the orthologous human regions. We construct a 1.6-Mb contig of macaque BAC clones in the region orthologous to the ancestral hominoid fission site and use it to define the structural changes that occurred on human 14q pericentromeric and 15q subtelomeric regions. We characterize the novel euchromatin-heterochromatin transition region (∼20 Mb) acquired during the neocentromere establishment on chromosome 14, and find it was mainly derived through pericentromeric duplications from ancestral hominoid chromosomes homologous to human 2q14-qter and 10. Further, we show a relationship between evolutionary hotspots and low-copy repeat loci for chromosome 15, revealing a possible role of segmental duplications not only in mediating but also in "stitching" together rearrangement breakpoints.


Subjects
Chromosomes, Human, Pair 14/genetics; Chromosomes, Human, Pair 15/genetics; Chromosomes, Mammalian/genetics; Evolution, Molecular; Hominidae/genetics; Segmental Duplications, Genomic; Animals; Chromosome Breakpoints; Chromosome Duplication; Chromosomes, Artificial, Bacterial; Cloning, Molecular; Euchromatin/genetics; Heterochromatin/genetics; Humans; Molecular Sequence Data; Phylogeny
19.
Genome Res ; 23(9): 1373-82, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23825009

ABSTRACT

Copy number variation (CNV) contributes to disease and has restructured the genomes of great apes. The diversity and rate of this process, however, have not been extensively explored among great ape lineages. We analyzed 97 deeply sequenced great ape and human genomes and estimate 16% (469 Mb) of the hominid genome has been affected by recent CNV. We identify a comprehensive set of fixed gene deletions (n = 340) and duplications (n = 405) as well as >13.5 Mb of sequence that has been specifically lost on the human lineage. We compared the diversity and rates of copy number and single nucleotide variation across the hominid phylogeny. We find that CNV diversity partially correlates with single nucleotide diversity (r² = 0.5) and recapitulates the phylogeny of apes with few exceptions. Duplications significantly outpace deletions (2.8-fold). The load of segregating duplications remains significantly higher in bonobos, Western chimpanzees, and Sumatran orangutans, populations that have experienced recent genetic bottlenecks (P = 0.0014, 0.02, and 0.0088, respectively). The rate of fixed deletion has been more clocklike with the exception of the chimpanzee lineage, where we observe a twofold increase in the chimpanzee-bonobo ancestor (P = 4.79 × 10⁻⁹) and increased deletion load among Western chimpanzees (P = 0.002). The latter includes the first genomic disorder in a chimpanzee with features resembling Smith-Magenis syndrome mediated by a chimpanzee-specific increase in segmental duplication complexity. We hypothesize that demographic effects, such as bottlenecks, have contributed to larger and more gene-rich segments being deleted in the chimpanzee lineage and that this effect, more generally, may account for episodic bursts in CNV during hominid evolution.


Subjects
DNA Copy Number Variations; Evolution, Molecular; Hominidae/genetics; Phylogeny; Animals; Base Sequence; Gene Deletion; Gene Duplication; Genetic Load; Genome, Human; Humans; Molecular Sequence Data; Pedigree; Polymorphism, Single Nucleotide; Sequence Analysis, DNA
20.
Nature ; 499(7459): 471-5, 2013 Jul 25.
Article in English | MEDLINE | ID: mdl-23823723

ABSTRACT

Most great ape genetic variation remains uncharacterized; however, its study is critical for understanding population history, recombination, selection and susceptibility to disease. Here we sequence to high coverage a total of 79 wild- and captive-born individuals representing all six great ape species and seven subspecies and report 88.8 million single nucleotide polymorphisms. Our analysis provides support for genetically distinct populations within each species, signals of gene flow, and the split of common chimpanzees into two distinct groups: Nigeria-Cameroon/western and central/eastern populations. We find extensive inbreeding in almost all wild populations, with eastern gorillas being the most extreme. Inferred effective population sizes have varied radically over time in different lineages and this appears to have a profound effect on the genetic diversity at, or close to, genes in almost all species. We discover and assign 1,982 loss-of-function variants throughout the human and great ape lineages, determining that the rate of gene loss has not been different in the human branch compared to other internal branches in the great ape phylogeny. This comprehensive catalogue of great ape genome diversity provides a framework for understanding evolution and a resource for more effective management of wild and captive great ape populations.


Subjects
Genetic Variation; Hominidae/genetics; Africa; Animals; Animals, Wild/genetics; Animals, Zoo/genetics; Asia, Southeastern; Evolution, Molecular; Gene Flow/genetics; Genetics, Population; Genome/genetics; Gorilla gorilla/classification; Gorilla gorilla/genetics; Hominidae/classification; Humans; Inbreeding; Pan paniscus/classification; Pan paniscus/genetics; Pan troglodytes/classification; Pan troglodytes/genetics; Phylogeny; Polymorphism, Single Nucleotide/genetics; Population Density