1.
Nature ; 621(7978): 344-354, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.


Subjects
Chromosomes, Human, Y; Genomics; Sequence Analysis, DNA; Humans; Base Sequence; Chromosomes, Human, Y/genetics; DNA, Satellite/genetics; Genetic Variation/genetics; Genetics, Population; Genomics/methods; Genomics/standards; Heterochromatin/genetics; Multigene Family/genetics; Reference Standards; Segmental Duplications, Genomic/genetics; Sequence Analysis, DNA/standards; Tandem Repeat Sequences/genetics; Telomere/genetics
2.
Genome Res ; 32(2): 242-257, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35042723

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) enables molecular characterization of complex biological tissues at high resolution. The requirement of single-cell extraction, however, makes it challenging for profiling tissues such as adipose tissue, for which collection of intact single adipocytes is complicated by their fragile nature. For such tissues, single-nucleus extraction is often much more efficient and therefore single-nucleus RNA sequencing (snRNA-seq) presents an alternative to scRNA-seq. However, nuclear transcripts represent only a fraction of the transcriptome in a single cell, with snRNA-seq marked with inherent transcript enrichment and detection biases. Therefore, snRNA-seq may be inadequate for mapping important transcriptional signatures in adipose tissue. In this study, we compare the transcriptomic landscape of single nuclei isolated from preadipocytes and mature adipocytes across human white and brown adipocyte lineages, with whole-cell transcriptome. We show that snRNA-seq is capable of identifying the broad cell types present in scRNA-seq at all states of adipogenesis. However, we also explore how and why the nuclear transcriptome is biased and limited, as well as how it can be advantageous. We robustly characterize the enrichment of nuclear-localized transcripts and adipogenic regulatory lncRNAs in snRNA-seq, while also providing a detailed understanding for the preferential detection of long genes upon using this technique. To remove such technical detection biases, we propose a normalization strategy for a more accurate comparison of nuclear and cellular data. Finally, we show successful integration of scRNA-seq and snRNA-seq data sets with existing bioinformatic tools. Overall, our results illustrate the applicability of snRNA-seq for the characterization of cellular diversity in the adipose tissue.


Subjects
Adipocytes/cytology; Cell Lineage; Gene Expression Profiling; RNA-Seq; Single-Cell Analysis; Bias; Gene Expression Profiling/methods; Humans; RNA-Seq/methods; Single-Cell Analysis/methods; Transcriptome
3.
Nat Methods ; 19(6): 711-723, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35396487

ABSTRACT

Studies of genome regulation routinely use high-throughput DNA sequencing approaches to determine where specific proteins interact with DNA, and they rely on DNA amplification and short-read sequencing, limiting their quantitative application in complex genomic regions. To address these limitations, we developed directed methylation with long-read sequencing (DiMeLo-seq), which uses antibody-tethered enzymes to methylate DNA near a target protein's binding sites in situ. These exogenous methylation marks are then detected simultaneously with endogenous CpG methylation on unamplified DNA using long-read, single-molecule sequencing technologies. We optimized and benchmarked DiMeLo-seq by mapping chromatin-binding proteins and histone modifications across the human genome. Furthermore, we identified where centromere protein A localizes within highly repetitive regions that were unmappable with short sequencing reads, and we estimated the density of centromere protein A molecules along single chromatin fibers. DiMeLo-seq is a versatile method that provides multimodal, genome-wide information for investigating protein-DNA interactions.


Subjects
DNA Methylation; High-Throughput Nucleotide Sequencing; Centromere Protein A/genetics; Chromatin/genetics; DNA/chemistry; DNA/genetics; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Humans; Sequence Analysis, DNA/methods
4.
Semin Cell Dev Biol ; 128: 2-14, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35487859

ABSTRACT

The classical human satellite DNAs, also referred to as human satellites 1, 2 and 3 (HSat1, HSat2, HSat3, or collectively HSat1-3), occur on most human chromosomes as large, pericentromeric tandem repeat arrays, which together constitute roughly 3% of the human genome (100 megabases, on average). Even though HSat1-3 were among the first human DNA sequences to be isolated and characterized at the dawn of molecular biology, they have remained almost entirely missing from the human genome reference assembly for 20 years, hindering studies of their sequence, regulation, and potential structural roles in the nucleus. Recently, the Telomere-to-Telomere Consortium produced the first truly complete assembly of a human genome, paving the way for new studies of HSat1-3 with modern genomic tools. This review provides an account of the history and current understanding of HSat1-3, with a view towards future studies of their evolution and roles in health and disease.


Subjects
DNA, Satellite; Genomics; DNA, Satellite/genetics; Genome, Human/genetics; Humans
5.
Nature ; 530(7589): 171-176, 2016 Feb 11.
Article in English | MEDLINE | ID: mdl-26840484

ABSTRACT

The DNA-binding protein PRDM9 directs positioning of the double-strand breaks (DSBs) that initiate meiotic recombination in mice and humans. Prdm9 is the only mammalian speciation gene yet identified and is responsible for sterility phenotypes in male hybrids of certain mouse subspecies. To investigate PRDM9 binding and its role in fertility and meiotic recombination, we humanized the DNA-binding domain of PRDM9 in C57BL/6 mice. This change repositions DSB hotspots and completely restores fertility in male hybrids. Here we show that alteration of one Prdm9 allele impacts the behaviour of DSBs controlled by the other allele at chromosome-wide scales. These effects correlate strongly with the degree to which each PRDM9 variant binds both homologues at the DSB sites it controls. Furthermore, higher genome-wide levels of such 'symmetric' PRDM9 binding associate with increasing fertility measures, and comparisons of individual hotspots suggest binding symmetry plays a downstream role in the recombination process. These findings reveal that subspecies-specific degradation of PRDM9 binding sites by meiotic drive, which steadily increases asymmetric PRDM9 binding, has impacts beyond simply changing hotspot positions, and strongly support a direct involvement in hybrid infertility. Because such meiotic drive occurs across mammals, PRDM9 may play a wider, yet transient, role in the early stages of speciation.


Subjects
Genetic Speciation; Histone-Lysine N-Methyltransferase/chemistry; Histone-Lysine N-Methyltransferase/metabolism; Hybridization, Genetic/genetics; Infertility/genetics; Protein Engineering; Zinc Fingers/genetics; Alleles; Animals; Binding Sites; Chromosome Pairing/genetics; Chromosomes, Mammalian/genetics; Chromosomes, Mammalian/metabolism; DNA Breaks, Double-Stranded; Female; Histone-Lysine N-Methyltransferase/genetics; Humans; Male; Meiosis/genetics; Mice; Mice, Inbred C57BL; Protein Binding; Protein Structure, Tertiary/genetics; Recombination, Genetic/genetics
6.
Genome Res ; 24(4): 697-707, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24501022

ABSTRACT

The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes.


Subjects
Centromere/genetics; DNA, Satellite/genetics; Sequence Analysis, DNA; Tandem Repeat Sequences/genetics; Chromosomes, Human, X/genetics; Chromosomes, Human, Y/genetics; Genome, Human; Humans; Molecular Sequence Data
7.
PLoS Genet ; 10(7): e1004503, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25033397

ABSTRACT

The pseudoautosomal region (PAR) is a short region of homology between the mammalian X and Y chromosomes, which has undergone rapid evolution. A crossover in the PAR is essential for the proper disjunction of X and Y chromosomes in male meiosis, and PAR deletion results in male sterility. This leads the human PAR with the obligatory crossover, PAR1, to having an exceptionally high male crossover rate, which is 17-fold higher than the genome-wide average. However, the mechanism by which this obligatory crossover occurs remains unknown, as does the fine-scale positioning of crossovers across this region. Recent research in mice has suggested that crossovers in PAR may be mediated independently of the protein PRDM9, which localises virtually all crossovers in the autosomes. To investigate recombination in this region, we construct the most fine-scale genetic map containing directly observed crossovers to date using African-American pedigrees. We leverage recombination rates inferred from the breakdown of linkage disequilibrium in human populations and investigate the signatures of DNA evolution due to recombination. Further, we identify direct PRDM9 binding sites using ChIP-seq in human cells. Using these independent lines of evidence, we show that, in contrast with mouse, PRDM9 does localise peaks of recombination in the human PAR1. We find that recombination is a far more rapid and intense driver of sequence evolution in PAR1 than it is on the autosomes. We also show that PAR1 hotspot activities differ significantly among human populations. Finally, we find evidence that PAR1 hotspot positions have changed between human and chimpanzee, with no evidence of sharing among the hottest hotspots. We anticipate that the genetic maps built and validated in this work will aid research on this vital and fascinating region of the genome.


Subjects
Crossing Over, Genetic; Histone-Lysine N-Methyltransferase/genetics; Infertility, Male/genetics; Recombination, Genetic; Chromosomes, Human, X/genetics; Chromosomes, Human, Y/genetics; Female; Genetics, Population; HapMap Project; Humans; Linkage Disequilibrium; Male; Meiosis/genetics
8.
PLoS Comput Biol ; 10(5): e1003628, 2014 May.
Article in English | MEDLINE | ID: mdl-24831296

ABSTRACT

The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3). The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb) and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.


Subjects
Chromosome Mapping/methods; Chromosomes, Human, Y/genetics; DNA, Satellite/genetics; Genome, Human/genetics; Heterochromatin/genetics; Sequence Analysis, DNA/methods; Base Sequence; Humans; Molecular Sequence Data
9.
Science ; 376(6588): eabj5089, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35357915

ABSTRACT

The completion of a telomere-to-telomere human reference genome, T2T-CHM13, has resolved complex regions of the genome, including repetitive and homologous regions. Here, we present a high-resolution epigenetic study of previously unresolved sequences, representing entire acrocentric chromosome short arms, gene family expansions, and a diverse collection of repeat classes. This resource precisely maps CpG methylation (32.28 million CpGs), DNA accessibility, and short-read datasets (166,058 previously unresolved chromatin immunoprecipitation sequencing peaks) to provide evidence of activity across previously unidentified or corrected genes and reveals clinically relevant paralog-specific regulation. Probing CpG methylation across human centromeres from six diverse individuals generated an estimate of variability in kinetochore localization. This analysis provides a framework with which to investigate the most elusive regions of the human genome, granting insights into epigenetic regulation.


Subjects
CpG Islands; DNA Methylation; Epigenesis, Genetic; Genome, Human; Centromere/genetics; Centromere/metabolism; Disease/genetics; Genetic Loci; Genomics/standards; Humans; Reference Standards; Sequence Analysis, DNA
10.
Science ; 376(6588): eabk3112, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35357925

ABSTRACT

Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.


Subjects
Epigenesis, Genetic; Genome, Human; Repetitive Sequences, Nucleic Acid; Telomere/genetics; Transcription, Genetic; Humans
11.
Science ; 376(6588): eabl4178, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35357911

ABSTRACT

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.


Subjects
Centromere/genetics; Chromosome Mapping; Epigenesis, Genetic; Genome, Human; Evolution, Molecular; Genomics; Humans; Repetitive Sequences, Nucleic Acid
12.
Science ; 376(6588): 44-53, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35357919

ABSTRACT

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.


Subjects
Genome, Human; Human Genome Project; Sequence Analysis, DNA/standards; Cell Line; Chromosomes, Artificial, Bacterial/genetics; Chromosomes, Human/genetics; Humans; Reference Values
13.
Cell Syst ; 11(4): 354-366.e9, 2020 Oct 21.
Article in English | MEDLINE | ID: mdl-33099405

ABSTRACT

DNA adenine methyltransferase identification (DamID) measures a protein's DNA-binding history by methylating adenine bases near each protein-DNA interaction site and then selectively amplifying and sequencing these methylated regions. Additionally, these interactions can be visualized using m6A-Tracer, a fluorescent protein that binds to methyladenines. Here, we combine these imaging and sequencing technologies in an integrated microfluidic platform (µDamID) that enables single-cell isolation, imaging, and sorting, followed by DamID. We use µDamID and an improved m6A-Tracer protein to generate paired imaging and sequencing data from individual human cells. We validate interactions between Lamin-B1 protein and lamina-associated domains (LADs), observe variable 3D chromatin organization and broad gene regulation patterns, and jointly measure single-cell heterogeneity in Dam expression and background methylation. µDamID provides the unique ability to compare paired imaging and sequencing data for each cell and between cells, enabling the joint analysis of the nuclear localization, sequence identity, and variability of protein-DNA interactions. A record of this paper's transparent peer review process is included in the Supplemental Information.


Subjects
Microfluidics/methods; Sequence Analysis, DNA/methods; Single-Cell Analysis/methods; Adenine/metabolism; Cell Nucleus/metabolism; Chromatin/metabolism; DNA/metabolism; DNA Methylation/genetics; DNA-Binding Proteins/genetics; Genomics/methods; HEK293 Cells; Humans; Lamin Type B/metabolism; Receptors, Purinergic/metabolism
14.
Sci Rep ; 10(1): 16902, 2020 Oct 9.
Article in English | MEDLINE | ID: mdl-33037294

ABSTRACT

Epidemiological studies have suggested differences in the rate of multiple sclerosis (MS) in individuals of European ancestry compared to African ancestry, motivating genetic scans to identify variants that could contribute to such patterns. In a whole-genome scan in 899 African-American cases and 1155 African-American controls, we confirm that African-Americans who inherit segments of the genome of European ancestry at a chromosome 1 locus are at increased risk for MS [logarithm of odds (LOD) = 9.8], although the signal weakens when adding an additional 406 cases, reflecting heterogeneity in the two sets of cases [logarithm of odds (LOD) = 2.7]. The association in the 899 individuals can be fully explained by two variants previously associated with MS in European ancestry individuals. These variants tag a MS susceptibility haplotype associated with decreased CD58 gene expression (odds ratio of 1.37; frequency of 84% in Europeans and 22% in West Africans for the tagging variant) as well as another haplotype near the FCRL3 gene (odds ratio of 1.07; frequency of 49% in Europeans and 8% in West Africans). Controlling for all other genetic and environmental factors, the two variants predict a 1.44-fold higher rate of MS in European-Americans compared to African-Americans.


Subjects
Black or African American/genetics; Genetic Predisposition to Disease/genetics; Multiple Sclerosis/genetics; Polymorphism, Single Nucleotide/genetics; White People/genetics; Female; Genome-Wide Association Study/methods; Haplotypes/genetics; Humans; Male; Odds Ratio
15.
Nat Commun ; 10(1): 3900, 2019 Aug 29.
Article in English | MEDLINE | ID: mdl-31467277

ABSTRACT

During meiotic recombination, homologue-templated repair of programmed DNA double-strand breaks (DSBs) produces relatively few crossovers and many difficult-to-detect non-crossovers. By intercrossing two diverged mouse subspecies over five generations and deep-sequencing 119 offspring, we detect thousands of crossover and non-crossover events genome-wide with unprecedented power and spatial resolution. We find that both crossovers and non-crossovers are strongly depleted at DSB hotspots where the DSB-positioning protein PRDM9 fails to bind to the unbroken homologous chromosome, revealing that PRDM9 also functions to promote homologue-templated repair. Our results show that complex non-crossovers are much rarer in mice than humans, consistent with complex events arising from accumulated non-programmed DNA damage. Unexpectedly, we also find that GC-biased gene conversion is restricted to non-crossover tracts containing only one mismatch. These results demonstrate that local genetic diversity profoundly alters meiotic repair pathway decisions via at least two distinct mechanisms, impacting genome evolution and Prdm9-related hybrid infertility.


Subjects
DNA Breaks, Double-Stranded; Genetic Variation; Homologous Recombination; Alleles; Animals; Cell Cycle Proteins/genetics; Chromosomes; Crossing Over, Genetic; DNA Damage; DNA Mismatch Repair; Female; Gene Conversion; Histone-Lysine N-Methyltransferase/genetics; Histones/genetics; Humans; Hybridization, Genetic; Male; Mice; Mice, Inbred C57BL; Models, Genetic; Phosphate-Binding Proteins/genetics; Polymorphism, Single Nucleotide; Recombinational DNA Repair
16.
Elife ; 6, 2017 Oct 26.
Article in English | MEDLINE | ID: mdl-29072575

ABSTRACT

PRDM9 binding localizes almost all meiotic recombination sites in humans and mice. However, most PRDM9-bound loci do not become recombination hotspots. To explore factors that affect binding and subsequent recombination outcomes, we mapped human PRDM9 binding sites in a transfected human cell line and measured PRDM9-induced histone modifications. These data reveal varied DNA-binding modalities of PRDM9. We also find that human PRDM9 frequently binds promoters, despite their low recombination rates, and it can activate expression of a small number of genes including CTCFL and VCX. Furthermore, we identify specific sequence motifs that predict consistent, localized meiotic recombination suppression around a subset of PRDM9 binding sites. These motifs strongly associate with KRAB-ZNF protein binding, TRIM28 recruitment, and specific histone modifications. Finally, we demonstrate that, in addition to binding DNA, PRDM9's zinc fingers also mediate its multimerization, and we show that a pair of highly diverged alleles preferentially form homo-multimers.


Subjects
DNA/metabolism; Histone-Lysine N-Methyltransferase/metabolism; Homologous Recombination; Meiosis; Binding Sites; Chromosome Mapping; HEK293 Cells; Humans; Protein Binding; Protein Multimerization
17.
Elife ; 4, 2015 Mar 25.
Article in English | MEDLINE | ID: mdl-25806687

ABSTRACT

Although the past decade has seen tremendous progress in our understanding of fine-scale recombination, little is known about non-crossover (NCO) gene conversion. We report the first genome-wide study of NCO events in humans. Using SNP array data from 98 meioses, we identified 103 sites affected by NCO, of which 50/52 were confirmed in sequence data. Overlap with double strand break (DSB) hotspots indicates that most of the events are likely of meiotic origin. We estimate that a site is involved in a NCO at a rate of 5.9 × 10⁻⁶/bp/generation, consistent with sperm-typing studies, and infer that tract lengths span at least an order of magnitude. Observed NCO events show strong allelic bias at heterozygous AT/GC SNPs, with 68% (58-78%) transmitting GC alleles (p = 5 × 10⁻⁴). Strikingly, in 4 of 15 regions with resequencing data, multiple disjoint NCO tracts cluster in close proximity (∼20-30 kb), a phenomenon not previously seen in mammals.


Subjects
Base Composition/genetics; Crossing Over, Genetic; Gene Conversion; Alleles; Base Sequence; Cluster Analysis; Female; Humans; Male; Pedigree; Polymorphism, Single Nucleotide/genetics
18.
Nat Genet ; 45(4): 406-14, 414e1-2, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23435088

ABSTRACT

Tens of millions of base pairs of euchromatic human genome sequence, including many protein-coding genes, have no known location in the human genome. We describe an approach for localizing the human genome's missing pieces using the patterns of genome sequence variation created by population admixture. We mapped the locations of 70 scaffolds spanning 4 million base pairs of the human genome's unplaced euchromatic sequence, including more than a dozen protein-coding genes, and identified 8 new large interchromosomal segmental duplications. We find that most of these sequences are hidden in the genome's heterochromatin, particularly its pericentromeric regions. Many cryptic, pericentromeric genes are expressed at the RNA level and have been maintained intact for millions of years while their expression patterns diverged from those of paralogous genes elsewhere in the genome. We describe how knowledge of the locations of these sequences can inform disease association and genome biology studies.


Subjects
Chromosome Mapping; Euchromatin/genetics; Evolution, Molecular; Genetic Variation/genetics; Genetics, Population; Genome, Human; Heterochromatin/genetics; Computational Biology; Gene Duplication; Humans; In Situ Hybridization, Fluorescence