1.
PLoS Biol ; 21(5): e3001822, 2023 05.
Article in English | MEDLINE | ID: mdl-37205709

ABSTRACT

Candida albicans is a frequent colonizer of human mucosal surfaces as well as an opportunistic pathogen. C. albicans is remarkably versatile in its ability to colonize diverse host sites with differences in oxygen and nutrient availability, pH, immune responses, and resident microbes, among other cues. It is unclear how the genetic background of a commensal colonizing population can influence the shift to pathogenicity. Therefore, we examined 910 commensal isolates from 35 healthy donors to identify host niche-specific adaptations. We demonstrate that healthy people are reservoirs for genotypically and phenotypically diverse C. albicans strains. Using limited diversity exploitation, we identified a single nucleotide change in the uncharacterized ZMS1 transcription factor that was sufficient to drive hyper invasion into agar. We found that SC5314 was significantly different from the majority of both commensal and bloodstream isolates in its ability to induce host cell death. However, our commensal strains retained the capacity to cause disease in the Galleria model of systemic infection, including outcompeting the SC5314 reference strain during systemic competition assays. This study provides a global view of commensal strain variation and within-host strain diversity of C. albicans and suggests that selection for commensalism in humans does not result in a fitness cost for invasive disease.


Subject(s)
Candida albicans , Symbiosis , Humans , Candida albicans/genetics , Transcription Factors/genetics , Gene Expression Regulation
2.
Nat Rev Genet ; 21(3): 171-189, 2020 03.
Article in English | MEDLINE | ID: mdl-31729472

ABSTRACT

Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.


Subject(s)
Genomic Structural Variation , Sequence Analysis/methods , Algorithms , Genome, Human , Humans
3.
Cell ; 141(7): 1253-61, 2010 Jun 25.
Article in English | MEDLINE | ID: mdl-20603005

ABSTRACT

Two abundant classes of mobile elements, namely Alu and L1 elements, continue to generate new retrotransposon insertions in human genomes. Estimates suggest that these elements have generated millions of new germline insertions in individual human genomes worldwide. Unfortunately, current technologies are not capable of detecting most of these young insertions, and the true extent of germline mutagenesis by endogenous human retrotransposons has been difficult to examine. Here, we describe technologies for detecting these young retrotransposon insertions and demonstrate that such insertions indeed are abundant in human populations. We also found that new somatic L1 insertions occur at high frequencies in human lung cancer genomes. Genome-wide analysis suggests that altered DNA methylation may be responsible for the high levels of L1 mobilization observed in these tumors. Our data indicate that transposon-mediated mutagenesis is extensive in human genomes and is likely to have a major impact on human biology and diseases.


Subject(s)
Alu Elements , Genome, Human , Long Interspersed Nucleotide Elements , Mutagenesis , Sequence Analysis, DNA/methods , Brain Neoplasms/genetics , Humans , Lung Neoplasms/genetics , Methylation
4.
Am J Hum Genet ; 108(5): 919-928, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33789087

ABSTRACT

Virtually all genome sequencing efforts in national biobanks, complex and Mendelian disease programs, and medical genetic initiatives are reliant upon short-read whole-genome sequencing (srWGS), which presents challenges for the detection of structural variants (SVs) relative to emerging long-read WGS (lrWGS) technologies. Given this ubiquity of srWGS in large-scale genomics initiatives, we sought to establish expectations for routine SV detection from this data type by comparison with lrWGS assembly, as well as to quantify the genomic properties and added value of SVs uniquely accessible to each technology. Analyses from the Human Genome Structural Variation Consortium (HGSVC) of three families captured ~11,000 SVs per genome from srWGS and ~25,000 SVs per genome from lrWGS assembly. Detection power and precision for SV discovery varied dramatically by genomic context and variant class: 9.7% of the current GRCh38 reference is defined by segmental duplication (SD) and simple repeat (SR), yet 91.4% of deletions that were specifically discovered by lrWGS localized to these regions. Across the remaining 90.3% of reference sequence, we observed extremely high (93.8%) concordance between technologies for deletions in these datasets. In contrast, lrWGS was superior for detection of insertions across all genomic contexts. Given that non-SD/SR sequences encompass 95.9% of currently annotated disease-associated exons, improved sensitivity from lrWGS to discover novel pathogenic deletions in these currently interpretable genomic regions is likely to be incremental. However, these analyses highlight the considerable added value of assembly-based lrWGS to create new catalogs of insertions and transposable elements, as well as disease-associated repeat expansions in genomic sequences that were previously recalcitrant to routine assessment.


Subject(s)
Genome, Human/genetics , Genomic Structural Variation , Genomics/methods , Goals , Whole Genome Sequencing/methods , Whole Genome Sequencing/standards , DNA Copy Number Variations , Exons/genetics , Humans , Research Design , Segmental Duplications, Genomic , Sequence Alignment
5.
Nucleic Acids Res ; 48(3): 1146-1163, 2020 02 20.
Article in English | MEDLINE | ID: mdl-31853540

ABSTRACT

Long Interspersed Element-1 (LINE-1) retrotransposition contributes to inter- and intra-individual genetic variation and occasionally can lead to human genetic disorders. Various strategies have been developed to identify human-specific LINE-1 (L1Hs) insertions from short-read whole genome sequencing (WGS) data; however, they have limitations in detecting insertions in complex repetitive genomic regions. Here, we developed a computational tool (PALMER) and used it to identify 203 non-reference L1Hs insertions in the NA12878 benchmark genome. Using PacBio long-read sequencing data, we identified L1Hs insertions that were absent in previous short-read studies (90/203). Approximately 81% (73/90) of the L1Hs insertions reside within endogenous LINE-1 sequences in the reference assembly and the analysis of unique breakpoint junction sequences revealed 63% (57/90) of these L1Hs insertions could be genotyped in 1000 Genomes Project sequences. Moreover, we observed that amplification biases encountered in single-cell WGS experiments led to a wide variation in L1Hs insertion detection rates between four individual NA12878 cells; under-amplification limited detection to 32% (65/203) of insertions, whereas over-amplification increased false positive calls. In sum, these data indicate that L1Hs insertions are often missed using standard short-read sequencing approaches and long-read sequencing approaches can significantly improve the detection of L1Hs insertions present in individual genomes.
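
The percentages quoted in this abstract follow directly from the reported counts. The short Python check below recomputes them; the constant and function names are illustrative assumptions, and only the counts themselves come from the abstract.

```python
# Counts quoted in the abstract; the names are illustrative, not from PALMER.
TOTAL_NON_REFERENCE = 203  # non-reference L1Hs insertions found in NA12878
LONG_READ_ONLY = 90        # insertions absent from previous short-read studies
WITHIN_REFERENCE_L1 = 73   # of the 90, located inside reference LINE-1 sequence
GENOTYPABLE = 57           # of the 90, genotypable in 1000 Genomes Project data
UNDER_AMPLIFIED = 65       # of the 203, detected in an under-amplified single cell


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


if __name__ == "__main__":
    print("within reference LINE-1:", percent(WITHIN_REFERENCE_L1, LONG_READ_ONLY), "%")
    print("genotypable:", percent(GENOTYPABLE, LONG_READ_ONLY), "%")
    print("under-amplified detection:", percent(UNDER_AMPLIFIED, TOTAL_NON_REFERENCE), "%")
    print("missed by short reads:", percent(LONG_READ_ONLY, TOTAL_NON_REFERENCE), "%")
```

The last figure (90 of 203, about 44%) is not stated as a percentage in the abstract; it is derived here from the two reported counts.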


Subject(s)
Long Interspersed Nucleotide Elements , Sequence Analysis, DNA/methods , Cell Line , Genome, Human , Humans , Polymorphism, Genetic , Single-Cell Analysis , Software , Whole Genome Sequencing
6.
Proc Natl Acad Sci U S A ; 116(41): 20612-20622, 2019 10 08.
Article in English | MEDLINE | ID: mdl-31548405

ABSTRACT

Long interspersed element-1 (LINE-1 or L1) amplifies via retrotransposition. Active L1s encode 2 proteins (ORF1p and ORF2p) that bind their encoding transcript to promote retrotransposition in cis. The L1-encoded proteins also promote the retrotransposition of small-interspersed element RNAs, noncoding RNAs, and messenger RNAs in trans. Some L1-mediated retrotransposition events consist of a copy of U6 RNA conjoined to a variably 5'-truncated L1, but how U6/L1 chimeras are formed requires elucidation. Here, we report the following: The RNA ligase RtcB can join U6 RNAs ending in a 2',3'-cyclic phosphate to L1 RNAs containing a 5'-OH in vitro; depletion of endogenous RtcB in HeLa cell extracts reduces U6/L1 RNA ligation efficiency; retrotransposition of U6/L1 RNAs leads to U6/L1 pseudogene formation; and a unique cohort of U6/L1 chimeric RNAs is present in multiple human cell lines. Thus, these data suggest that U6 small nuclear RNA (snRNA) and RtcB participate in the formation of chimeric RNAs and that retrotransposition of chimeric RNA contributes to interindividual genetic variation.


Subject(s)
Embryonic Stem Cells/metabolism , Long Interspersed Nucleotide Elements/genetics , Neoplasms/genetics , Neural Stem Cells/metabolism , RNA, Small Nuclear/genetics , RNA/genetics , Retroelements/genetics , HeLa Cells , Humans , Pseudogenes , RNA/chemistry , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Small Nuclear/chemistry
7.
Cancer ; 127(19): 3531-3540, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34160069

ABSTRACT

BACKGROUND: Human papillomavirus (HPV) is a well-established driver of malignant transformation at a number of sites, including head and neck, cervical, vulvar, anorectal, and penile squamous cell carcinomas; however, the impact of HPV integration into the host human genome on this process remains largely unresolved. This is due to the technical challenge of identifying HPV integration sites, which includes limitations of existing informatics approaches to discovering viral-host breakpoints from low-read-coverage sequencing data. METHODS: To overcome this limitation, the authors developed SearcHPV, a new HPV detection pipeline based on targeted capture technology, and applied the algorithm to targeted capture data. They performed an integrated analysis of SearcHPV-defined breakpoints with genome-wide linked-read sequencing to identify potential HPV-related structural variations. RESULTS: Through an analysis of HPV+ models, the authors showed that SearcHPV detected HPV-host integration sites with a higher sensitivity and specificity than 2 other commonly used HPV detection callers. SearcHPV uncovered HPV integration sites adjacent to known cancer-related genes, including TP63, MYC, and TRAF2, and near regions of large structural variation. The authors further validated the junction contig assembly feature of SearcHPV, which helped to accurately identify viral-host junction breakpoint sequences. They found that viral integration occurred through a variety of DNA repair mechanisms, including nonhomologous end joining, alternative end joining, and microhomology-mediated repair. CONCLUSIONS: In summary, SearcHPV is a new optimized tool for the accurate detection of HPV-human integration sites from targeted capture DNA sequencing data.


Subject(s)
Alphapapillomavirus , Carcinoma, Squamous Cell , Papillomavirus Infections , Uterine Cervical Neoplasms , Alphapapillomavirus/genetics , DNA, Viral/genetics , Female , Genomics , Humans , Papillomaviridae/genetics , Papillomavirus Infections/complications , Papillomavirus Infections/genetics
8.
Gastroenterology ; 156(5): 1404-1415, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30578782

ABSTRACT

BACKGROUND & AIMS: African American and European American individuals have a similar prevalence of gastroesophageal reflux disease (GERD), yet esophageal adenocarcinoma (EAC) disproportionately affects European American individuals. We investigated whether the esophageal squamous mucosa of African American individuals has features that protect against GERD-induced damage, compared with European American individuals. METHODS: We performed transcriptional profile analysis of esophageal squamous mucosa tissues from 20 African American and 20 European American individuals (24 with no disease and 16 with Barrett's esophagus and/or EAC). We confirmed our findings in a cohort of 56 patients and analyzed DNA samples from patients to identify associated variants. Observations were validated using matched genomic sequence and expression data from lymphoblasts from the 1000 Genomes Project. A panel of esophageal samples from African American and European American subjects was used to confirm allele-related differences in protein levels. The esophageal squamous-derived cell line Het-1A and a rat esophagogastroduodenal anastomosis model for reflux-generated esophageal damage were used to investigate the effects of the DNA-damaging agent cumene-hydroperoxide (cum-OOH) and a chemopreventive cranberry proanthocyanidin (C-PAC) extract, respectively, on levels of protein and messenger RNA (mRNA). RESULTS: We found significantly higher levels of glutathione S-transferase theta 2 (GSTT2) mRNA in squamous mucosa from African American compared with European American individuals and associated these with variants within the GSTT2 locus in African American individuals. We confirmed that 2 previously identified genomic variants at the GSTT2 locus, a 37-kb deletion and a 17-bp promoter duplication, reduce expression of GSTT2 in tissues from European American individuals. The nonduplicated 17-bp promoter was more common in tissue samples from populations of African descent. GSTT2 protected Het-1A esophageal squamous cells from cum-OOH-induced DNA damage. Addition of C-PAC increased GSTT2 expression in Het-1A cells incubated with cum-OOH and in rats with reflux-induced esophageal damage. C-PAC also reduced levels of DNA damage in reflux-exposed rat esophagi, as observed by reduced levels of phospho-H2A histone family member X. CONCLUSIONS: We found that GSTT2 protects esophageal squamous cells against DNA damage from genotoxic stress and that GSTT2 expression can be induced by C-PAC. Increased levels of GSTT2 in esophageal tissues of African American individuals might protect them from GERD-induced damage and contribute to the low incidence of EAC in this population.


Subject(s)
Adenocarcinoma/genetics , Barrett Esophagus/genetics , Black or African American/genetics , DNA Damage , Esophageal Mucosa/enzymology , Esophageal Neoplasms/genetics , Gastroesophageal Reflux/genetics , Glutathione Transferase/genetics , White People/genetics , Adenocarcinoma/enzymology , Adenocarcinoma/ethnology , Adenocarcinoma/pathology , Animals , Barrett Esophagus/enzymology , Barrett Esophagus/ethnology , Barrett Esophagus/pathology , Disease Models, Animal , Esophageal Mucosa/pathology , Esophageal Neoplasms/enzymology , Esophageal Neoplasms/ethnology , Esophageal Neoplasms/pathology , Female , Gastroesophageal Reflux/enzymology , Gastroesophageal Reflux/ethnology , Gastroesophageal Reflux/pathology , Glutathione Transferase/metabolism , HeLa Cells , Histones/metabolism , Humans , Incidence , Male , Middle Aged , Phosphoproteins/metabolism , Phosphorylation , Protective Factors , Rats, Sprague-Dawley , Risk Factors , United States/epidemiology , Up-Regulation
9.
Genome Res ; 27(11): 1916-1929, 2017 11.
Article in English | MEDLINE | ID: mdl-28855259

ABSTRACT

Mobile element insertions (MEIs) represent ∼25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms of genetic variation in whole genome sequencing (WGS) projects involving population genetics, human diseases, and clinical genomics. Here, we describe the Mobile Element Locator Tool (MELT), which was developed as part of the 1000 Genomes Project to perform MEI discovery on a population scale. Using both Illumina WGS data and simulations, we demonstrate that MELT outperforms existing MEI discovery tools in terms of speed, scalability, specificity, and sensitivity, while also detecting a broader spectrum of MEI-associated features. Several run modes were developed to perform MEI discovery on local and cloud systems. In addition to using MELT to discover MEIs in modern humans as part of the 1000 Genomes Project, we also used it to discover MEIs in chimpanzees and ancient (Neanderthal and Denisovan) hominids. We detected diverse patterns of MEI stratification across these populations that likely were caused by (1) diverse rates of MEI production from source elements, (2) diverse patterns of MEI inheritance, and (3) the introgression of ancient MEIs into modern human genomes. Overall, our study provides the most comprehensive map of MEIs to date spanning chimpanzees, ancient hominids, and modern humans and reveals new aspects of MEI biology in these lineages. We also demonstrate that MELT is a robust platform for MEI discovery and analysis in a variety of experimental settings.


Subject(s)
Computational Biology/methods , DNA Transposable Elements , Neanderthals/genetics , Pan troglodytes/genetics , Animals , Databases, Genetic , Evolution, Molecular , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Single Nucleotide , Software , Whole Genome Sequencing/methods
10.
BMC Genomics ; 20(1): 391, 2019 May 20.
Article in English | MEDLINE | ID: mdl-31109297

ABSTRACT

BACKGROUND: Upstream open reading frames (uORFs) initiate translation within mRNA 5' leaders, and have the potential to alter main coding sequence (CDS) translation on transcripts in which they reside. Ribosome profiling (RP) studies suggest that translating ribosomes are pervasive within 5' leaders across model systems. However, the significance of this observation remains unclear. To explore a role for uORF usage in a model of neuronal differentiation, we performed RP on undifferentiated and differentiated human neuroblastoma cells. RESULTS: Using a spectral coherence algorithm (SPECtre), we identify 4954 consistently translated uORFs across 31% of all neuroblastoma transcripts. These uORFs predominantly utilize non-AUG initiation codons and exhibit translational efficiencies (TE) comparable to annotated coding regions. On a population basis, the global impact of both AUG and non-AUG initiated uORFs on basal CDS translation were small, even when analysis is limited to conserved and consistently translated uORFs. However, uORFs did alter the translation of a subset of genes, including the Diamond-Blackfan Anemia associated ribosomal gene RPS24. With retinoic acid induced differentiation, we observed an overall positive correlation in translational shifts between uORF/CDS pairs. However, CDSs downstream of uORFs show smaller shifts in TE with differentiation relative to CDSs without a predicted uORF, suggesting that uORF translation buffers cell state dependent fluctuations in CDS translation. CONCLUSION: This work provides insights into the dynamic relationships and potential regulatory functions of uORF/CDS pairs in a model of neuronal differentiation.


Subject(s)
Cell Differentiation/genetics , Neurons/metabolism , Open Reading Frames , Protein Biosynthesis , Algorithms , Cell Line, Tumor , Gene Expression Regulation , Humans , Models, Biological , Neurons/cytology , Ribosomes/metabolism
11.
Nature ; 470(7332): 59-65, 2011 Feb 03.
Article in English | MEDLINE | ID: mdl-21293372

ABSTRACT

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.


Subject(s)
DNA Copy Number Variations/genetics , Genetics, Population , Genome, Human/genetics , Genomics , Gene Duplication/genetics , Genetic Predisposition to Disease/genetics , Genotype , Humans , Mutagenesis, Insertional/genetics , Reproducibility of Results , Sequence Analysis, DNA , Sequence Deletion/genetics
12.
BMC Bioinformatics ; 17(1): 482, 2016 Nov 25.
Article in English | MEDLINE | ID: mdl-27884106

ABSTRACT

BACKGROUND: Active protein translation can be assessed and measured using ribosome profiling sequencing strategies. Prevailing analytical approaches applied to this technology use sequence fragment length profiling or reading frame occupancy enrichment to differentiate between active translation and background noise; however, they do not consider additional characteristics inherent to the technology, which limits their overall accuracy. RESULTS: Here, we present an analytical tool that models the overall tri-nucleotide periodicity of ribosomal occupancy using a classifier based on spectral coherence. Our software, SPECtre, examines the relationship of normalized ribosome profiling read coverage over a rolling series of windows along a transcript relative to an idealized reference signal, without requiring matched mRNA-Seq. CONCLUSIONS: A comparison of SPECtre against previously published methods on existing data shows a marked improvement in accuracy for detecting active translation, with overall high accuracy at a low false discovery rate. In addition, SPECtre performs comparably to a recently published method similarly based on spectral coherence, but with reduced runtime and memory requirements. SPECtre is available as an open source software package at https://github.com/mills-lab/spectreok .
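
The idea of scoring read coverage against an idealized three-nucleotide reference signal can be illustrated with a small Python sketch. This is not SPECtre's implementation: the function names, the 30-nt non-overlapping segments, and the noise tolerance are all assumptions made for illustration. It estimates the magnitude-squared coherence between coverage and a reference signal (1, 0, 0, 1, 0, 0, ...) at a single frequency, one cycle per codon, by averaging spectra over segments.

```python
import cmath


def dft_at_codon_frequency(segment):
    """DFT coefficient of `segment` at one cycle per three positions."""
    return sum(v * cmath.exp(-2j * cmath.pi * i / 3) for i, v in enumerate(segment))


def codon_coherence(coverage, segment_len=30, tol=1e-12):
    """Magnitude-squared coherence (0..1) between coverage and an idealized
    in-frame reference signal, evaluated at the codon frequency only.

    Hypothetical helper: spectra are averaged over non-overlapping segments
    with no tapering window, which is a simplifying assumption.
    """
    if segment_len % 3 != 0:
        raise ValueError("segment_len must be a multiple of 3")
    reference = [1.0 if i % 3 == 0 else 0.0 for i in range(segment_len)]
    ref_coef = dft_at_codon_frequency(reference)

    cross = 0j   # summed cross-spectrum between coverage and reference
    pxx = 0.0    # summed coverage power at the codon frequency
    pyy = 0.0    # summed reference power at the codon frequency
    for start in range(0, len(coverage) - segment_len + 1, segment_len):
        x = dft_at_codon_frequency(coverage[start:start + segment_len])
        cross += x * ref_coef.conjugate()
        pxx += abs(x) ** 2
        pyy += abs(ref_coef) ** 2

    # No usable segments, or no periodic power beyond floating-point noise.
    if pxx <= tol or pyy <= tol:
        return 0.0
    return abs(cross) ** 2 / (pxx * pyy)


if __name__ == "__main__":
    in_frame = [5, 1, 1] * 40                     # consistent frame throughout
    flat = [2.0] * 120                            # no periodicity
    frame_shift = [5, 1, 1] * 20 + [1, 5, 1] * 20 # frame changes midway
    for name, cov in [("in-frame", in_frame), ("flat", flat), ("shifted", frame_shift)]:
        print(name, round(codon_coherence(cov), 3))
```

Because the reference is deterministic, this score mainly measures how consistent the phase of the three-nucleotide signal is from segment to segment: coverage that keeps one reading frame scores near 1, while coverage whose frame changes along the transcript scores lower.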


Subject(s)
Algorithms , RNA, Messenger/metabolism , Ribosomes/metabolism , Sequence Analysis, RNA/methods , Software , Transcriptome/genetics , Gene Expression Profiling , HEK293 Cells , Humans , Open Reading Frames , Protein Biosynthesis , RNA, Messenger/genetics , Ribosomes/genetics
13.
Nucleic Acids Res ; 42(20): 12640-9, 2014 Nov 10.
Article in English | MEDLINE | ID: mdl-25348406

ABSTRACT

The transfer of mitochondrial genetic material into the nuclear genomes of eukaryotes is a well-established phenomenon that has been previously limited to the study of static reference genomes. The recent advancement of high throughput sequencing has enabled an expanded exploration into the diversity of polymorphic nuclear mitochondrial insertions (NumtS) within human populations. We have developed an approach to discover and genotype novel Numt insertions using whole genome, paired-end sequencing data. We have applied this method to a thousand individuals in 20 populations from the 1000 Genomes Project and other datasets and identified 141 new sites of Numt insertions, extending our current knowledge of existing NumtS by almost 20%. We find that recent Numt insertions are derived from throughout the mitochondrial genome, including the D-loop, and have integration biases that differ in some respects from previous studies on older, fixed NumtS in the reference genome. We determined the complete inserted sequence for a subset of these events and have identified a number of nearly full-length mitochondrial genome insertions into nuclear chromosomes. We further define their age and origin of insertion and present an analysis of their potential impact to ongoing studies of mitochondrial heteroplasmy and disease.


Subject(s)
Cell Nucleus/genetics , Genome, Mitochondrial , Polymorphism, Genetic , Genome, Human , Genomics/methods , Humans , Molecular Sequence Data , Mutagenesis, Insertional , Phylogeny
14.
Proc Natl Acad Sci U S A ; 110(39): 15764-9, 2013 Sep 24.
Article in English | MEDLINE | ID: mdl-24014587

ABSTRACT

Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.


Subject(s)
Genome/genetics , Genomic Structural Variation/genetics , Primates/genetics , Animals , Gene Duplication , Gene Expression Profiling , Gene Expression Regulation , Humans , Nucleotides/genetics , Organ Specificity/genetics , Species Specificity
15.
Nature ; 460(7258): 1011-5, 2009 Aug 20.
Article in English | MEDLINE | ID: mdl-19587683

ABSTRACT

Recent advances in sequencing technologies have initiated an era of personal genome sequences. To date, human genome sequences have been reported for individuals with ancestry in three distinct geographical regions: a Yoruba African, two individuals of northwest European origin, and a person from China. Here we provide a highly annotated, whole-genome sequence for a Korean individual, known as AK1. The genome of AK1 was determined by an exacting, combined approach that included whole-genome shotgun sequencing (27.8x coverage), targeted bacterial artificial chromosome sequencing, and high-resolution comparative genomic hybridization using custom microarrays featuring more than 24 million probes. Alignment to the NCBI reference, a composite of several ethnic clades, disclosed nearly 3.45 million single nucleotide polymorphisms (SNPs), including 10,162 non-synonymous SNPs, and 170,202 deletion or insertion polymorphisms (indels). SNP and indel densities were strongly correlated genome-wide. Applying very conservative criteria yielded highly reliable copy number variants for clinical considerations. Potential medical phenotypes were annotated for non-synonymous SNPs, coding domain indels, and structural variants. The integration of several human whole-genome sequences derived from several ethnic groups will assist in understanding genetic ancestry, migration patterns and population bottlenecks.


Subject(s)
Asian People/genetics , Genome, Human/genetics , Chromosomes, Artificial, Bacterial/genetics , Comparative Genomic Hybridization , Computational Biology , Humans , INDEL Mutation/genetics , Korea , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA
16.
Proc Natl Acad Sci U S A ; 109(31): 12656-61, 2012 Jul 31.
Article in English | MEDLINE | ID: mdl-22797897

ABSTRACT

Gene expression differences are shaped by selective pressures and contribute to phenotypic differences between species. We identified 964 copy number differences (CNDs) of conserved sequences across three primate species and examined their potential effects on gene expression profiles. Samples with copy number-different genes had significantly different expression than samples with neutral copy number. Genes encoding regulatory molecules differed in copy number and were associated with significant expression differences. Additionally, we identified 127 CNDs that were processed pseudogenes, some of which were expressed. Furthermore, there were copy number-different regulatory regions such as ultraconserved elements and long intergenic noncoding RNAs with the potential to affect expression. We postulate that CNDs of these conserved sequences fine-tune developmental pathways by altering the levels of RNA.


Subject(s)
DNA, Intergenic/physiology , Gene Dosage/physiology , Gene Expression Regulation/physiology , Pseudogenes/physiology , RNA, Untranslated/physiology , Regulatory Elements, Transcriptional/physiology , Animals , Cell Line , Humans , Macaca mulatta , Pan troglodytes , Species Specificity
17.
Proc Natl Acad Sci U S A ; 109(2): 529-34, 2012 Jan 10.
Article in English | MEDLINE | ID: mdl-22203992

ABSTRACT

Copy number variants (CNVs) represent a substantial source of genomic variation in vertebrates and have been associated with numerous human diseases. Despite this, the extent of CNVs in the zebrafish, an important model for human disease, remains unknown. Using 80 zebrafish genomes, representing three commonly used laboratory strains and one native population, we constructed a genome-wide, high-resolution CNV map for the zebrafish comprising 6,080 CNV elements and encompassing 14.6% of the zebrafish reference genome. This amount of copy number variation is four times that previously observed in other vertebrates, including humans. Moreover, 69% of the CNV elements exhibited strain specificity, with the highest number observed for Tubingen. This variation likely arose, in part, from Tubingen's large founding size and composite population origin. Additional population genetic studies also provided important insight into the origins and substructure of these commonly used laboratory strains. This extensive variation among and within zebrafish strains may have functional effects that impact phenotype and, if not properly addressed, such extensive levels of germ-line variation and population substructure in this commonly used model organism can potentially confound studies intended for translation to human diseases.


Subject(s)
DNA Copy Number Variations/genetics , Genetic Variation , Genomics/methods , Zebrafish/genetics , Animals , Comparative Genomic Hybridization , DNA Primers/genetics , Genetics, Population , Species Specificity , Zebrafish/classification
18.
Genome Res ; 21(6): 830-9, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21460062

ABSTRACT

Human genetic variation is expected to play a central role in personalized medicine. Yet only a fraction of the natural genetic variation that is harbored by humans has been discovered to date. Here we report almost 2 million small insertions and deletions (INDELs) that range from 1 bp to 10,000 bp in length in the genomes of 79 diverse humans. These variants include 819,363 small INDELs that map to human genes. Small INDELs frequently were found in the coding exons of these genes, and several lines of evidence indicate that such variation is a major determinant of human biological diversity. Microarray-based genotyping experiments revealed several interesting observations regarding the population genetics of small INDEL variation. For example, we found that many of our INDELs had high levels of linkage disequilibrium (LD) with both HapMap SNPs and with high-scoring SNPs from genome-wide association studies. Overall, our study indicates that small INDEL variation is likely to be a key factor underlying inherited traits and diseases in humans.


Subject(s)
Genetic Variation , Genome, Human/genetics , INDEL Mutation/genetics , Genomics/methods , Genotype , Humans , Microarray Analysis , Precision Medicine/methods
19.
Nat Commun ; 15(1): 4220, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38760338

ABSTRACT

When somatic cells acquire complex karyotypes, they often are removed by the immune system. Mutant somatic cells that evade immune surveillance can lead to cancer. Neurons with complex karyotypes arise during neurotypical brain development, but neurons are almost never the origin of brain cancers. Instead, somatic mutations in neurons can bring about neurodevelopmental disorders, and contribute to the polygenic landscape of neuropsychiatric and neurodegenerative disease. A subset of human neurons harbors idiosyncratic copy number variants (CNVs, "CNV neurons"), but previous analyses of CNV neurons are limited by relatively small sample sizes. Here, we develop an allele-based validation approach, SCOVAL, to corroborate or reject read-depth based CNV calls in single human neurons. We apply this approach to 2,125 frontal cortical neurons from a neurotypical human brain. SCOVAL identifies 226 CNV neurons, which include a subclass of 65 CNV neurons with highly aberrant karyotypes containing whole or substantial losses on multiple chromosomes. Moreover, we find that CNV location appears to be nonrandom. Recurrent regions of neuronal genome rearrangement contain fewer, but longer, genes.


Subject(s)
DNA Copy Number Variations , Mosaicism , Neurons , Humans , Neurons/metabolism , Alleles
20.
BMC Bioinformatics ; 14: 157, 2013 May 09.
Article in English | MEDLINE | ID: mdl-23656838

ABSTRACT

BACKGROUND: In recent years there has been a growing interest in the role of copy number variations (CNV) in genetic diseases. Though there has been rapid development of technologies and statistical methods devoted to the detection of CNVs from array data, the inherent data-quality challenges associated with most hybridization techniques remain a problem in CNV association studies. RESULTS: To help address these data quality issues in the context of family-based association studies, we introduce a statistical framework for intensity-based array data that takes family information into account for copy-number assignment. The method is an adaptation of traditional methods for modeling SNP genotype data that assume a Gaussian mixture model, whereby CNV calling is performed for all family members simultaneously, leveraging within-family data to reduce CNV calls that are incompatible with Mendelian inheritance while still allowing de novo CNVs. Applying this method to simulation studies and a genome-wide association study in asthma, we find that our approach significantly improves CNV call accuracy and reduces Mendelian inconsistency rates and false-positive genotype calls. The results were validated using qPCR experiments. CONCLUSIONS: We have demonstrated that the use of family information can improve the quality of CNV calling and may yield more powerful association tests of CNVs.


Subject(s)
DNA Copy Number Variations , Genotyping Techniques , Asthma/genetics , Family , Genome-Wide Association Study , Humans , Polymerase Chain Reaction