Results 1 - 20 of 21
1.
Nature ; 630(8016): 401-411, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38811727

ABSTRACT

Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility. The X chromosome is vital for reproduction and cognition. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.


Subject(s)
Hominidae, X Chromosome, Y Chromosome, Animals, Female, Male, Gorilla gorilla/genetics, Hominidae/genetics, Hominidae/classification, Hylobatidae/genetics, Pan paniscus/genetics, Pan troglodytes/genetics, Phylogeny, Pongo abelii/genetics, Pongo pygmaeus/genetics, Telomere/genetics, X Chromosome/genetics, Y Chromosome/genetics, Molecular Evolution, DNA Copy Number Variations/genetics, Humans, Endangered Species, Reference Standards
2.
Nature ; 585(7823): 79-84, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32663838

ABSTRACT

After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist. Here we present a human genome assembly that surpasses the continuity of GRCh38, along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome, we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.


Subject(s)
Human X Chromosomes/genetics, Human Genome/genetics, Telomere/genetics, Centromere/genetics, CpG Islands/genetics, DNA Methylation, Satellite DNA/genetics, Female, Humans, Hydatidiform Mole/genetics, Male, Pregnancy, Reproducibility of Results, Testis/metabolism
3.
PLoS Genet ; 10(3): e1004190, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24603370

ABSTRACT

Although a considerable proportion of serum lipid loci identified in European ancestry individuals (EA) replicate in African Americans (AA), interethnic differences in the distribution of serum lipids suggest that some genetic determinants differ by ethnicity. We conducted a comprehensive evaluation of five lipid candidate genes to identify variants with ethnicity-specific effects. We sequenced ABCA1, LCAT, LPL, PON1, and SERPINE1 in 48 AA individuals with extreme serum lipid concentrations (high HDLC/low TG or low HDLC/high TG). Identified variants were genotyped in the full population-based sample of AA (n = 1694) and tested for an association with serum lipids. rs328 (LPL) and correlated variants were associated with higher HDLC and lower TG. Interestingly, a stronger effect was observed on a "European" vs. "African" genetic background at this locus. To investigate this effect, we evaluated the region among West Africans (WA). For TG, the effect size among WA was the same as in AA with only African local ancestry (2-3% lower TG), while the larger association among AA with local European ancestry matched previous reports in EA (10%). For HDLC, there was no association with rs328 in AA with only African local ancestry or in WA, while the association among AA with European local ancestry was much greater than what has been observed for EA (15 vs. ∼5 mg/dl), suggesting an interaction with an environmental or genetic factor that differs by ethnicity. Beyond this ancestry effect, the importance of African ancestry-focused, sequence-based work was also highlighted by serum lipid associations of variants that were in higher frequency (or present only) among those of African ancestry. By beginning our study with the sequence variation present in AA individuals, investigating local ancestry effects, and seeking replication in WA, we were able to comprehensively evaluate the role of a set of candidate genes in serum lipids in AA.


Subject(s)
Black or African American/genetics, Ethnicity/genetics, Genome-Wide Association Study, Lipids/genetics, Genetic Variation, Genotype, High-Throughput Nucleotide Sequencing, Humans, Linkage Disequilibrium, Lipids/blood, Single Nucleotide Polymorphism, White People/genetics
4.
bioRxiv ; 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38077089

ABSTRACT

Apes possess two sex chromosomes: the male-specific Y and the X shared by males and females. The Y chromosome is crucial for male reproduction, with deletions linked to infertility. The X chromosome carries genes vital for reproduction and cognition. Variation in mating patterns and brain function among great apes suggests corresponding differences in their sex chromosome structure and evolution. However, due to their highly repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the state-of-the-art experimental and computational methods developed for the telomere-to-telomere (T2T) human genome, we produced gapless, complete assemblies of the X and Y chromosomes for five great apes (chimpanzee, bonobo, gorilla, Bornean and Sumatran orangutans) and a lesser ape, the siamang gibbon. These assemblies completely resolved ampliconic, palindromic, and satellite sequences, including the entire centromeres, allowing us to untangle the intricacies of ape sex chromosome evolution. We found that, compared to the X, ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements. This divergence on the Y arises from the accumulation of lineage-specific ampliconic regions and palindromes (which are shared more broadly among species on the X) and from the abundance of transposable elements and satellites (which have a lower representation on the X). Our analysis of Y chromosome genes revealed lineage-specific expansions of multi-copy gene families and signatures of purifying selection. In summary, the Y exhibits dynamic evolution, while the X is more stable. Finally, mapping short-read sequencing data from >100 great ape individuals revealed the patterns of diversity and selection on their sex chromosomes, demonstrating the utility of these reference assemblies for studies of great ape evolution. These complete sex chromosome assemblies are expected to further inform conservation genetics of nonhuman apes, all of which are endangered species.

5.
Genome Res ; 19(9): 1665-74, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19602640

ABSTRACT

ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicine, including the exploration of issues related to the genetic architecture of disease, implementation of genomic technology, informed consent, disclosure of genetic information, and archiving, analyzing, and displaying sequence data. In the initial phase of ClinSeq, we are enrolling roughly 1000 participants; the evaluation of each includes obtaining a detailed family and medical history, as well as a clinical evaluation. The participants are being consented broadly for research on many traits and for whole-genome sequencing. Initially, Sanger-based sequencing of 300-400 genes thought to be relevant to atherosclerosis is being performed, with the resulting data analyzed for rare, high-penetrance variants associated with specific clinical traits. The participants are also being consented to allow the contact of family members for additional studies of sequence variants to explore their potential association with specific phenotypes. Here, we present the general considerations in designing ClinSeq, preliminary results based on the generation of an initial 826 Mb of sequence data, the findings for several genes that serve as positive controls for the project, and our views about the potential implications of ClinSeq. The early experiences with ClinSeq illustrate how large-scale medical sequencing can be a practical, productive, and critical component of research in genomic medicine.


Subject(s)
Atherosclerosis/genetics, Biomedical Research, Cardiovascular Diseases/genetics, Human Genome, Genomics, Pilot Projects, DNA Sequence Analysis/methods, Aged, Cohort Studies, Female, Humans, Male, Pedigree, Phenotype
6.
Science ; 376(6588): 44-53, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35357919

ABSTRACT

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.


Subject(s)
Human Genome, Human Genome Project, DNA Sequence Analysis/standards, Cell Line, Bacterial Artificial Chromosomes/genetics, Human Chromosomes/genetics, Humans, Reference Values
7.
BMC Genomics ; 11: 21, 2010 Jan 11.
Article in English | MEDLINE | ID: mdl-20064230

ABSTRACT

BACKGROUND: The approaches for shotgun-based sequencing of vertebrate genomes are now well-established, and have resulted in the generation of numerous draft whole-genome sequence assemblies. In contrast, the process of refining those assemblies to improve contiguity and increase accuracy (known as 'sequence finishing') remains tedious, labor-intensive, and expensive. As a result, the vast majority of vertebrate genome sequences generated to date remain at a draft stage. RESULTS: To date, our genome sequencing efforts have focused on comparative studies of targeted genomic regions, requiring sequence finishing of large blocks of orthologous sequence (average size 0.5-2 Mb) from various subsets of 75 vertebrates. This experience has provided a unique opportunity to compare the relative effort required to finish shotgun-generated genome sequence assemblies from different species, which we report here. Importantly, we found that the sequence assemblies generated for the same orthologous regions from various vertebrates show substantial variation with respect to misassemblies and, in particular, the frequency and characteristics of sequence gaps. As a consequence, the work required to finish different species' sequences varied greatly. Application of the same standardized methods for finishing provided a novel opportunity to "assay" characteristics of genome sequences among many vertebrate species. It is important to note that many of the problems we have encountered during sequence finishing reflect unique architectural features of a particular vertebrate's genome, which in some cases may have important functional and/or evolutionary implications. Finally, based on our analyses, we have been able to improve our procedures to overcome some of these problems and to increase the overall efficiency of the sequence-finishing process, although significant challenges still remain. 
CONCLUSION: Our findings have important implications for the eventual finishing of the draft whole-genome sequences that have now been generated for a large number of vertebrates.


Subject(s)
Genomics/methods, DNA Sequence Analysis/methods, Vertebrates/genetics, Animals, Chromosome Mapping, Bacterial Artificial Chromosomes, Genome
8.
Nucleic Acids Res ; 32(Database issue): D572-4, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681483

ABSTRACT

Hembase (http://hembase.niddk.nih.gov) is an integrated browser and genome portal designed for web-based examination of the human erythroid transcriptome. To date, Hembase contains 15,752 entries from erythroblast Expressed Sequence Tags (ESTs) and 380 referenced genes relevant for erythropoiesis. The database is organized to provide a cytogenetic band position, a unique name, and a concise annotation for each entry. Search queries may be performed by name, keyword or cytogenetic location. Search results are linked to primary sequence data and three major human genome browsers for access to information considered current at the time of each search. Hembase provides interested scientists and clinical hematologists with a genome-based approach toward the study of erythroid biology.


Subject(s)
Genetic Databases, Erythrocytes/metabolism, Erythropoiesis/genetics, Genomics, Hematology, Computational Biology, Cytogenetics, Expressed Sequence Tags, Human Genome, Humans, Information Storage and Retrieval, Internet, Genetic Transcription/genetics
9.
Nucleic Acids Res ; 30(11): 2469-77, 2002 Jun 01.
Article in English | MEDLINE | ID: mdl-12034835

ABSTRACT

In parallel with the production of genomic sequence data, attention is being focused on the generation of comprehensive cDNA-sequence resources. Such efforts are increasingly emphasizing the production of high-accuracy sequence corresponding to the entire insert of cDNA clones, especially those presumed to reflect the full-length mRNA. The complete sequencing of cDNA clones on a large scale presents unique challenges because of the generally small, yet heterogeneous, sizes of the cloned inserts. We have developed a strategy for high-throughput sequencing of cDNA clones using the transposon Tn5. This approach has been tailored for implementation within an existing large-scale 'shotgun-style' sequencing program, although it could be readily adapted for use in virtually any sequencing environment. In addition, we have developed a modified version of our strategy that can be applied to cDNA clones with large cloning vectors, thereby overcoming a potential limitation of transposon-based approaches. Here we describe the details of our cDNA-sequencing pipeline, including a summary of the experience in sequencing more than 4200 cDNA clones to produce more than 8 million base pairs of high-accuracy cDNA sequence. These data provide both convincing evidence that the insertion of Tn5 into cDNA clones is sufficiently random for its effective use in large-scale cDNA sequencing as well as interesting insight about the sequence context preferred for insertion by Tn5.


Subject(s)
DNA Transposable Elements/genetics, Complementary DNA/genetics, DNA Sequence Analysis/methods, Base Composition, Binomial Distribution, Molecular Cloning, Genetic Vectors/genetics, Insertional Mutagenesis/genetics, Physical Chromosome Mapping/methods, Genetic Recombination/genetics, Sensitivity and Specificity
10.
Sci Transl Med ; 6(254): 254ra126, 2014 Sep 17.
Article in English | MEDLINE | ID: mdl-25232178

ABSTRACT

Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem resistance genes on a wide array of plasmids. K. pneumoniae and E. cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, indicating that plasmid transfer between organisms was unlikely within this patient. We did, however, find evidence of horizontal transfer of carbapenemase-encoding plasmids between K. pneumoniae, E. cloacae, and C. freundii in the hospital environment. Our data, including full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by K. pneumoniae, E. coli, E. cloacae, and Pantoea species, in unrelated patients and in the hospital environment.


Subject(s)
Bacterial Proteins/biosynthesis, Cross Infection, Enterobacteriaceae/enzymology, Plasmids, beta-Lactamases/biosynthesis, Enterobacteriaceae/classification, Enterobacteriaceae/genetics, Public Hospitals, Humans, National Institutes of Health (U.S.), Population Surveillance, Real-Time Polymerase Chain Reaction, United States
11.
Genome Biol ; 13(7): R64, 2012 Jul 25.
Article in English | MEDLINE | ID: mdl-22830599

ABSTRACT

BACKGROUND: While Staphylococcus epidermidis is commonly isolated from healthy human skin, it is also the most frequent cause of nosocomial infections on indwelling medical devices. Despite its importance, few genome sequences existed and the most frequent hospital-associated lineage, ST2, had not been fully sequenced. RESULTS: We cultivated 71 commensal S. epidermidis isolates from 15 skin sites and compared them with 28 nosocomial isolates from venous catheters and blood cultures. We produced 21 commensal and 9 nosocomial draft genomes, and annotated and compared their gene content, phylogenetic relatedness and biochemical functions. The commensal strains had an open pan-genome with 80% core genes and 20% variable genes. The variable genome was characterized by an overabundance of transposable elements, transcription factors and transporters. Biochemical diversity, as assayed by antibiotic resistance and in vitro biofilm formation, demonstrated the varied phenotypic consequences of this genomic diversity. The nosocomial isolates exhibited both large-scale rearrangements and single-nucleotide variation. We showed that S. epidermidis genomes separate into two phylogenetic groups, one consisting only of commensals. The formate dehydrogenase gene, present only in commensals, is a discriminatory marker between the two groups. CONCLUSIONS: Commensal skin S. epidermidis have an open pan-genome and show considerable diversity between isolates, even when derived from a single individual or body site. For ST2, the most common nosocomial lineage, we detect variation between three independent isolates sequenced. Finally, phylogenetic analyses revealed a previously unrecognized group of S. epidermidis strains characterized by reduced virulence and formate dehydrogenase, which we propose as a clinical molecular marker.


Subject(s)
Catheter-Related Infections/microbiology, Cross Infection/microbiology, DNA Sequence Analysis/methods, Skin/microbiology, Staphylococcus epidermidis/classification, Staphylococcus epidermidis/genetics, Bacterial Drug Resistance, Molecular Evolution, Genetic Variation, Bacterial Genome, Humans, Molecular Sequence Data, Molecular Typing, Phylogeny, Staphylococcus epidermidis/isolation & purification
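
As a rough illustration of the core versus variable gene split reported in the abstract of entry 11, the sketch below partitions a pan-genome from per-isolate gene sets. It is not the authors' pipeline: the function name `split_pan_genome`, the isolate and gene identifiers, and the strict definition of "core" (present in every isolate) are all assumptions made for this example, since the abstract does not state the threshold that was used.

```python
from typing import Dict, Set, Tuple


def split_pan_genome(presence: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split a pan-genome into core and variable genes.

    `presence` maps an isolate name to the set of gene identifiers found in it.
    Assumption for this sketch: a gene is "core" only if every isolate carries it.
    """
    if not presence:
        raise ValueError("at least one isolate is required")
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)          # every gene seen in any isolate
    core = set.intersection(*gene_sets)    # genes shared by all isolates
    return core, pan - core


# Hypothetical toy data, not taken from the study.
toy = {
    "isolate_A": {"g1", "g2", "g3", "g4", "g5"},
    "isolate_B": {"g1", "g2", "g3", "g4"},
    "isolate_C": {"g1", "g2", "g3", "g4", "g6"},
}
core, variable = split_pan_genome(toy)
core_fraction = len(core) / (len(core) + len(variable))
print(sorted(core), sorted(variable), round(core_fraction, 2))
```

With real data, the gene sets would come from an ortholog clustering step, and a softer threshold (for example, presence in most isolates) may be more appropriate for draft genomes with gaps.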
12.
Nat Genet ; 43(3): 189-96, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21258341

ABSTRACT

Ciliary dysfunction leads to a broad range of overlapping phenotypes, collectively termed ciliopathies. This grouping is underscored by genetic overlap, where causal genes can also contribute modifier alleles to clinically distinct disorders. Here we show that mutations in TTC21B, which encodes the retrograde intraflagellar transport protein IFT139, cause both isolated nephronophthisis and syndromic Jeune asphyxiating thoracic dystrophy. Moreover, although resequencing of TTC21B in a large, clinically diverse ciliopathy cohort and matched controls showed a similar frequency of rare changes, in vivo and in vitro evaluations showed a significant enrichment of pathogenic alleles in cases (P < 0.003), suggesting that TTC21B contributes pathogenic alleles to ∼5% of ciliopathy cases. Our data illustrate how genetic lesions can be both causally associated with diverse ciliopathies and interact in trans with other disease-causing genes and highlight how saturated resequencing followed by functional analysis of all variants informs the genetic architecture of inherited disorders.


Subject(s)
Signal Transducing Adaptor Proteins/genetics, Alleles, Ciliary Motility Disorders/genetics, Animals, Genetic Variation, Humans, Mice, Mutation, Pedigree, Photoreceptor Cells/physiology, Zebrafish/genetics
13.
Science ; 324(5931): 1190-2, 2009 May 29.
Article in English | MEDLINE | ID: mdl-19478181

ABSTRACT

Human skin is a large, heterogeneous organ that protects the body from pathogens while sustaining microorganisms that influence human health and disease. Our analysis of 16S ribosomal RNA gene sequences obtained from 20 distinct skin sites of healthy humans revealed that physiologically comparable sites harbor similar bacterial communities. The complexity and stability of the microbial community are dependent on the specific characteristics of the skin site. This topographical and temporal survey provides a baseline for studies that examine the role of bacterial communities in disease states and the microbial interdependencies required to maintain healthy skin.


Subject(s)
Bacteria/isolation & purification, Metagenome, Skin/microbiology, Actinobacteria/classification, Actinobacteria/genetics, Actinobacteria/isolation & purification, Adult, Bacteria/classification, Bacteria/genetics, Bacteroidetes/classification, Bacteroidetes/genetics, Bacteroidetes/isolation & purification, Biodiversity, Female, rRNA Genes, Humans, Male, Molecular Sequence Data, Phylogeny, Proteobacteria/classification, Proteobacteria/genetics, Proteobacteria/isolation & purification, 16S Ribosomal RNA, Time Factors, Young Adult
14.
Genome Res ; 18(7): 1043-50, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18502944

ABSTRACT

The many layers and structures of the skin serve as elaborate hosts to microbes, including a diversity of commensal and pathogenic bacteria that contribute to both human health and disease. To determine the complexity and identity of the microbes inhabiting the skin, we sequenced bacterial 16S small-subunit ribosomal RNA genes isolated from the inner elbow of five healthy human subjects. This analysis revealed 113 operational taxonomic units (OTUs; "phylotypes") at the level of 97% similarity that belong to six bacterial divisions. To survey all depths of the skin, we sampled using three methods: swab, scrape, and punch biopsy. Proteobacteria dominated the skin microbiota at all depths of sampling. Interpersonal variation is approximately equal to intrapersonal variation when considering bacterial community membership and structure. Finally, we report strong similarities in the complexity and identity of mouse and human skin microbiota. This study of healthy human skin microbiota will serve to direct future research addressing the role of skin microbiota in health and disease, and metagenomic projects addressing the complex physiological interactions between the skin and the microbes that inhabit this environment.


Subject(s)
Bacteria/genetics, Genetic Variation, Skin/microbiology, Adult, Aged, Animals, Bacterial DNA/analysis, Bacterial DNA/genetics, Ribosomal DNA/genetics, Female, Humans, Male, Mice, Inbred C57BL Mice, Middle Aged, 16S Ribosomal RNA/genetics
15.
J Mol Evol ; 65(3): 207-14, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17676366

ABSTRACT

It is understood that DNA and amino acid substitution rates are highly sequence context-dependent, e.g., C → T substitutions in vertebrates may occur much more frequently at CpG sites and that cysteine substitution rates may depend on support of the context for participation in a disulfide bond. Furthermore, many applications rely on quantitative models of nucleotide or amino acid substitution, including phylogenetic inference and identification of amino acid sequence positions involved in functional specificity. We describe quantification of the context dependence of nucleotide substitution rates using baboon, chimpanzee, and human genomic sequence data generated by the NISC Comparative Sequencing Program. Relative mutation rates are reported for the 96 classes of mutations of the form 5′ αβγ 3′ → 5′ αδγ 3′, where α, β, γ, and δ are nucleotides and β ≠ δ, based on maximum likelihood calculations. Our results confirm that C → T substitutions are enhanced at CpG sites compared with other transitions, relatively independent of the identity of the preceding nucleotide. While, as expected, transitions generally occur more frequently than transversions, we find that the most frequent transversions involve the C at CpG sites (CpG transversions) and that their rate is comparable to the rate of transitions at non-CpG sites. A four-class model of the rates of context-dependent evolution of primate DNA sequences, CpG transitions > non-CpG transitions ≈ CpG transversions > non-CpG transversions, captures qualitative features of the mutation spectrum. We find that despite qualitative similarity of mutation rates among different genomic regions, there are statistically significant differences.


Subject(s)
Genomic Instability, Mutation, Primates/genetics, Animals, Base Composition, Base Sequence, CpG Islands, DNA Mutational Analysis, Humans, Likelihood Functions, Genetic Models
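
The abstract of entry 15 refers to 96 classes of trinucleotide-context substitutions and a four-class summary model. The sketch below shows one way such a count can arise; it is an assumption of this example, not something the abstract states, that the 96 classes come from merging each substitution with its reverse-complement equivalent on the opposite strand (4 × 4 × 4 × 3 = 192 strand-specific changes, halved). All function names here are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(before: str, after: str) -> tuple:
    """Pick one representative for a substitution and its opposite-strand twin."""
    other = (reverse_complement(before), reverse_complement(after))
    return min((before, after), other)


def substitution_classes() -> set:
    """Enumerate strand-collapsed classes 5' abg 3' -> 5' adg 3' with b != d."""
    classes = set()
    for alpha, beta, gamma, delta in product(BASES, repeat=4):
        if beta != delta:
            classes.add(canonical(alpha + beta + gamma, alpha + delta + gamma))
    return classes


def four_class_label(before: str, after: str) -> str:
    """Label a class as CpG/non-CpG and transition/transversion."""
    alpha, beta, gamma = before
    delta = after[1]
    # The mutated base sits in a CpG if it is the C of "CG" or the G of "CG".
    at_cpg = (beta == "C" and gamma == "G") or (beta == "G" and alpha == "C")
    kind = "transition" if (beta, delta) in _TRANSITIONS else "transversion"
    return ("CpG " if at_cpg else "non-CpG ") + kind


classes = substitution_classes()
counts = Counter(four_class_label(b, a) for b, a in classes)
print(len(classes))
print(dict(counts))
```

Under this strand-collapsing assumption the enumeration gives 96 classes, of which 4 are CpG transitions, 28 non-CpG transitions, 8 CpG transversions, and 56 non-CpG transversions. The code only counts and labels classes; estimating their relative rates, as the study does by maximum likelihood, would require aligned sequence data.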
16.
Genome Res ; 17(6): 760-74, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567995

ABSTRACT

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization.


Subject(s)
Molecular Evolution, Human Genome, Mammals/genetics, Open Reading Frames, Phylogeny, Sequence Alignment, Animals, Human Genome Project, Humans
17.
Proc Natl Acad Sci U S A ; 103(4): 1030-5, 2006 Jan 24.
Article in English | MEDLINE | ID: mdl-16418266

ABSTRACT

Identification of the specific cytogenetic abnormality is one of the critical steps for classification of acute myeloblastic leukemia (AML), which influences the selection of appropriate therapy and provides information about disease prognosis. However, at present, the genetic complexity of AML is only partially understood. To obtain a comprehensive, unbiased, quantitative measure, we performed serial analysis of gene expression (SAGE) on CD15(+) myeloid progenitor cells from 22 AML patients who had four of the most common translocations, namely t(8;21), t(15;17), t(9;11), and inv(16). The quantitative data provide clear evidence that the major change in all these translocation-carrying leukemias is a decrease in expression of the majority of transcripts compared with normal CD15(+) cells. From a total of 1,247,535 SAGE tags, we identified 2,604 transcripts whose expression was significantly altered in these leukemias compared with normal myeloid progenitor cells. The gene ontology of the 1,110 transcripts that matched known genes revealed that each translocation had a uniquely altered profile in various functional categories including regulation of transcription, cell cycle, protein synthesis, and apoptosis. Our global analysis of gene expression of common translocations in AML can focus attention on the function of the genes with altered expression for future biological studies as well as highlight genes/pathways for more specifically targeted therapy.


Subject(s)
Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Gene Expression Regulation , Leukemia, Myeloid, Acute/genetics , Leukemia/genetics , Translocation, Genetic , Apoptosis , Cell Differentiation , Chromosomes, Human, Pair 11/genetics , Chromosomes, Human, Pair 9/genetics , Computational Biology , DNA, Complementary/metabolism , Expressed Sequence Tags , Gene Library , Humans , Leukocytes, Mononuclear/cytology , Lewis X Antigen/biosynthesis , Myeloid Progenitor Cells/cytology , Oligonucleotide Array Sequence Analysis , RNA/chemistry , RNA, Messenger/metabolism , Time Factors
18.
Genome Res ; 16(6): 796-803, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16672307

ABSTRACT

Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization.


Subject(s)
Base Sequence , Gene Library , Polyploidy , Xenopus laevis/genetics , Xenopus/genetics , Animals , Evolution, Molecular , Gene Expression , Genes, Duplicate , Genome , Molecular Sequence Data , Open Reading Frames/genetics , Phylogeny , Sequence Homology, Nucleic Acid
19.
Genome Res ; 13(1): 55-63, 2003 Jan.
Article in English | MEDLINE | ID: mdl-12529306

ABSTRACT

Duplications have long been postulated to be an important mechanism by which genomes evolve. Interspecies genomic comparisons are one method by which the origin and molecular mechanism of duplications can be inferred. By comparative mapping in human, mouse, and rat, we previously found evidence for a recent chromosome-fission event that occurred in the mouse lineage. Cytogenetic mapping revealed that the genomic segments flanking the fission site appeared to be duplicated, with copies residing near the centromere of multiple mouse chromosomes. Here we report the mapping and sequencing of the regions of mouse chromosomes 5 and 6 involved in this chromosome-fission event as well as the results of comparative sequence analysis with the orthologous human and rat genomic regions. Our data indicate that the duplications associated with mouse chromosomes 5 and 6 are recent and that the resulting duplicated segments share significant sequence similarity with a series of regions near the centromeres of the mouse chromosomes previously identified by cytogenetic mapping. We also identified pericentromeric duplicated segments shared between mouse chromosomes 5 and 1. Finally, novel mouse satellite sequences as well as putative chimeric transcripts were found to be associated with the duplicated segments. Together, these findings demonstrate that pericentromeric duplications are not restricted to primates and may be a common mechanism for genome evolution in mammals.


Subject(s)
Centromere/genetics , Gene Duplication , Animals , Chimera/genetics , Chromosomes/genetics , Chromosomes, Human/genetics , Conserved Sequence/genetics , DNA, Satellite/genetics , Evolution, Molecular , Genetic Markers/genetics , Humans , Mice , Physical Chromosome Mapping/methods , Rats
20.
Genome Res ; 12(1): 3-15, 2002 Jan.
Article in English | MEDLINE | ID: mdl-11779826

ABSTRACT

Williams syndrome is a complex developmental disorder that results from the heterozygous deletion of an approximately 1.6-Mb segment of human chromosome 7q11.23. These deletions are mediated by large (approximately 300 kb) duplicated blocks of DNA of near-identical sequence. Previously, we showed that the orthologous region of the mouse genome is devoid of such duplicated segments. Here, we extend our studies to include the generation of approximately 3.3 Mb of genomic sequence from the mouse Williams syndrome region, of which just over 1.4 Mb is finished to high accuracy. Comparative analyses of the mouse and human sequences within and immediately flanking the interval commonly deleted in Williams syndrome have facilitated the identification of nine previously unreported genes, provided detailed sequence-based information regarding 30 genes residing in the region, and revealed a number of potentially interesting conserved noncoding sequences. Finally, to facilitate comparative sequence analysis, we implemented several enhancements to the program, including the addition of links from annotated features within a generated percent-identity plot to specific records in public databases. Taken together, the results reported here provide an important comparative sequence resource that should catalyze additional studies of Williams syndrome, including those that aim to characterize genes within the commonly deleted interval and to develop mouse models of the disorder.


Subject(s)
Chromosomes, Human, Pair 7/genetics , Sequence Analysis, DNA/methods , Sequence Homology, Nucleic Acid , Williams Syndrome/genetics , Animals , Base Composition , Conserved Sequence/genetics , Humans , Mice , Molecular Sequence Data , Physical Chromosome Mapping