Results 1 - 20 of 55
1.
PLoS Biol ; 21(5): e3001822, 2023 05.
Article in English | MEDLINE | ID: mdl-37205709

ABSTRACT

Candida albicans is a frequent colonizer of human mucosal surfaces as well as an opportunistic pathogen. C. albicans is remarkably versatile in its ability to colonize diverse host sites with differences in oxygen and nutrient availability, pH, immune responses, and resident microbes, among other cues. It is unclear how the genetic background of a commensal colonizing population can influence the shift to pathogenicity. Therefore, we examined 910 commensal isolates from 35 healthy donors to identify host niche-specific adaptations. We demonstrate that healthy people are reservoirs for genotypically and phenotypically diverse C. albicans strains. Using limited diversity exploitation, we identified a single nucleotide change in the uncharacterized ZMS1 transcription factor that was sufficient to drive hyper invasion into agar. We found that SC5314 was significantly different from the majority of both commensal and bloodstream isolates in its ability to induce host cell death. However, our commensal strains retained the capacity to cause disease in the Galleria model of systemic infection, including outcompeting the SC5314 reference strain during systemic competition assays. This study provides a global view of commensal strain variation and within-host strain diversity of C. albicans and suggests that selection for commensalism in humans does not result in a fitness cost for invasive disease.


Subjects
Candida albicans , Symbiosis , Humans , Candida albicans/genetics , Transcription Factors/genetics , Gene Expression Regulation
2.
Nat Rev Genet ; 21(3): 171-189, 2020 03.
Article in English | MEDLINE | ID: mdl-31729472

ABSTRACT

Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.


Subjects
Genomic Structural Variation , Sequence Analysis/methods , Algorithms , Human Genome , Humans
3.
Cell ; 141(7): 1253-61, 2010 Jun 25.
Article in English | MEDLINE | ID: mdl-20603005

ABSTRACT

Two abundant classes of mobile elements, namely Alu and L1 elements, continue to generate new retrotransposon insertions in human genomes. Estimates suggest that these elements have generated millions of new germline insertions in individual human genomes worldwide. Unfortunately, current technologies are not capable of detecting most of these young insertions, and the true extent of germline mutagenesis by endogenous human retrotransposons has been difficult to examine. Here, we describe technologies for detecting these young retrotransposon insertions and demonstrate that such insertions indeed are abundant in human populations. We also found that new somatic L1 insertions occur at high frequencies in human lung cancer genomes. Genome-wide analysis suggests that altered DNA methylation may be responsible for the high levels of L1 mobilization observed in these tumors. Our data indicate that transposon-mediated mutagenesis is extensive in human genomes and is likely to have a major impact on human biology and diseases.


Subjects
Alu Elements , Human Genome , Long Interspersed Nucleotide Elements , Mutagenesis , DNA Sequence Analysis/methods , Brain Neoplasms/genetics , Humans , Lung Neoplasms/genetics , Methylation
4.
Am J Hum Genet ; 108(5): 919-928, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33789087

ABSTRACT

Virtually all genome sequencing efforts in national biobanks, complex and Mendelian disease programs, and medical genetic initiatives are reliant upon short-read whole-genome sequencing (srWGS), which presents challenges for the detection of structural variants (SVs) relative to emerging long-read WGS (lrWGS) technologies. Given this ubiquity of srWGS in large-scale genomics initiatives, we sought to establish expectations for routine SV detection from this data type by comparison with lrWGS assembly, as well as to quantify the genomic properties and added value of SVs uniquely accessible to each technology. Analyses from the Human Genome Structural Variation Consortium (HGSVC) of three families captured ~11,000 SVs per genome from srWGS and ~25,000 SVs per genome from lrWGS assembly. Detection power and precision for SV discovery varied dramatically by genomic context and variant class: 9.7% of the current GRCh38 reference is defined by segmental duplication (SD) and simple repeat (SR), yet 91.4% of deletions that were specifically discovered by lrWGS localized to these regions. Across the remaining 90.3% of reference sequence, we observed extremely high (93.8%) concordance between technologies for deletions in these datasets. In contrast, lrWGS was superior for detection of insertions across all genomic contexts. Given that non-SD/SR sequences encompass 95.9% of currently annotated disease-associated exons, improved sensitivity from lrWGS to discover novel pathogenic deletions in these currently interpretable genomic regions is likely to be incremental. However, these analyses highlight the considerable added value of assembly-based lrWGS to create new catalogs of insertions and transposable elements, as well as disease-associated repeat expansions in genomic sequences that were previously recalcitrant to routine assessment.


Subjects
Human Genome/genetics , Genomic Structural Variation , Genomics/methods , Goals , Whole Genome Sequencing/methods , Whole Genome Sequencing/standards , DNA Copy Number Variations , Exons/genetics , Humans , Research Design , Genomic Segmental Duplications , Sequence Alignment
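
The enrichment implied by the percentages in the abstract above can be worked out directly. The sketch below is illustrative arithmetic only: it uses the two figures quoted in the abstract (9.7% of GRCh38 in SD/SR sequence, 91.4% of lrWGS-specific deletions in those regions), and `fold_enrichment` is a hypothetical helper name, not part of any HGSVC tooling.

```python
def fold_enrichment(fraction_of_events: float, fraction_of_genome: float) -> float:
    """Ratio of the observed share of events to the share expected
    if events were spread uniformly over the reference."""
    if not 0 < fraction_of_genome <= 1:
        raise ValueError("fraction_of_genome must be in (0, 1]")
    return fraction_of_events / fraction_of_genome


# Figures quoted in the abstract (assumed here to be exact).
SD_SR_GENOME_FRACTION = 0.097     # share of GRCh38 defined by SD/SR sequence
LRWGS_ONLY_DEL_IN_SD_SR = 0.914   # share of lrWGS-specific deletions in SD/SR

enrichment_in_repeats = fold_enrichment(
    LRWGS_ONLY_DEL_IN_SD_SR, SD_SR_GENOME_FRACTION
)
enrichment_elsewhere = fold_enrichment(
    1 - LRWGS_ONLY_DEL_IN_SD_SR, 1 - SD_SR_GENOME_FRACTION
)

print(f"SD/SR regions: {enrichment_in_repeats:.1f}x the uniform expectation")
print(f"Remaining sequence: {enrichment_elsewhere:.2f}x the uniform expectation")
```

Under these assumptions, lrWGS-specific deletions are roughly nine times more frequent in SD/SR sequence than a uniform distribution would predict, and roughly ten times less frequent in the rest of the reference.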
5.
Nucleic Acids Res ; 48(3): 1146-1163, 2020 02 20.
Article in English | MEDLINE | ID: mdl-31853540

ABSTRACT

Long Interspersed Element-1 (LINE-1) retrotransposition contributes to inter- and intra-individual genetic variation and occasionally can lead to human genetic disorders. Various strategies have been developed to identify human-specific LINE-1 (L1Hs) insertions from short-read whole genome sequencing (WGS) data; however, they have limitations in detecting insertions in complex repetitive genomic regions. Here, we developed a computational tool (PALMER) and used it to identify 203 non-reference L1Hs insertions in the NA12878 benchmark genome. Using PacBio long-read sequencing data, we identified L1Hs insertions that were absent in previous short-read studies (90/203). Approximately 81% (73/90) of the L1Hs insertions reside within endogenous LINE-1 sequences in the reference assembly and the analysis of unique breakpoint junction sequences revealed 63% (57/90) of these L1Hs insertions could be genotyped in 1000 Genomes Project sequences. Moreover, we observed that amplification biases encountered in single-cell WGS experiments led to a wide variation in L1Hs insertion detection rates between four individual NA12878 cells; under-amplification limited detection to 32% (65/203) of insertions, whereas over-amplification increased false positive calls. In sum, these data indicate that L1Hs insertions are often missed using standard short-read sequencing approaches and long-read sequencing approaches can significantly improve the detection of L1Hs insertions present in individual genomes.


Subjects
Long Interspersed Nucleotide Elements , DNA Sequence Analysis/methods , Cell Line , Human Genome , Humans , Genetic Polymorphism , Single-Cell Analysis , Software , Whole Genome Sequencing
6.
Proc Natl Acad Sci U S A ; 116(41): 20612-20622, 2019 10 08.
Article in English | MEDLINE | ID: mdl-31548405

ABSTRACT

Long interspersed element-1 (LINE-1 or L1) amplifies via retrotransposition. Active L1s encode 2 proteins (ORF1p and ORF2p) that bind their encoding transcript to promote retrotransposition in cis. The L1-encoded proteins also promote the retrotransposition of small-interspersed element RNAs, noncoding RNAs, and messenger RNAs in trans. Some L1-mediated retrotransposition events consist of a copy of U6 RNA conjoined to a variably 5'-truncated L1, but how U6/L1 chimeras are formed requires elucidation. Here, we report the following: The RNA ligase RtcB can join U6 RNAs ending in a 2',3'-cyclic phosphate to L1 RNAs containing a 5'-OH in vitro; depletion of endogenous RtcB in HeLa cell extracts reduces U6/L1 RNA ligation efficiency; retrotransposition of U6/L1 RNAs leads to U6/L1 pseudogene formation; and a unique cohort of U6/L1 chimeric RNAs are present in multiple human cell lines. Thus, these data suggest that U6 small nuclear RNA (snRNA) and RtcB participate in the formation of chimeric RNAs and that retrotransposition of chimeric RNA contributes to interindividual genetic variation.


Subjects
Embryonic Stem Cells/metabolism , Long Interspersed Nucleotide Elements/genetics , Neoplasms/genetics , Neural Stem Cells/metabolism , Small Nuclear RNA/genetics , RNA/genetics , Retroelements/genetics , HeLa Cells , Humans , Pseudogenes , RNA/chemistry , Messenger RNA/chemistry , Messenger RNA/genetics , Small Nuclear RNA/chemistry
7.
Cancer ; 127(19): 3531-3540, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34160069

ABSTRACT

BACKGROUND: Human papillomavirus (HPV) is a well-established driver of malignant transformation at a number of sites, including head and neck, cervical, vulvar, anorectal, and penile squamous cell carcinomas; however, the impact of HPV integration into the host human genome on this process remains largely unresolved. This is due to the technical challenge of identifying HPV integration sites, which includes limitations of existing informatics approaches to discovering viral-host breakpoints from low-read-coverage sequencing data. METHODS: To overcome this limitation, the authors developed SearcHPV, a new HPV detection pipeline based on targeted capture technology, and applied the algorithm to targeted capture data. They performed an integrated analysis of SearcHPV-defined breakpoints with genome-wide linked-read sequencing to identify potential HPV-related structural variations. RESULTS: Through an analysis of HPV+ models, the authors showed that SearcHPV detected HPV-host integration sites with a higher sensitivity and specificity than 2 other commonly used HPV detection callers. SearcHPV uncovered HPV integration sites adjacent to known cancer-related genes, including TP63, MYC, and TRAF2, and near regions of large structural variation. The authors further validated the junction contig assembly feature of SearcHPV, which helped to accurately identify viral-host junction breakpoint sequences. They found that viral integration occurred through a variety of DNA repair mechanisms, including nonhomologous end joining, alternative end joining, and microhomology-mediated repair. CONCLUSIONS: In summary, SearcHPV is a new optimized tool for the accurate detection of HPV-human integration sites from targeted capture DNA sequencing data.


Subjects
Alphapapillomavirus , Squamous Cell Carcinoma , Papillomavirus Infections , Uterine Cervical Neoplasms , Alphapapillomavirus/genetics , Viral DNA/genetics , Female , Genomics , Humans , Papillomaviridae/genetics , Papillomavirus Infections/complications , Papillomavirus Infections/genetics
8.
Gastroenterology ; 156(5): 1404-1415, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30578782

ABSTRACT

BACKGROUND & AIMS: African American and European American individuals have a similar prevalence of gastroesophageal reflux disease (GERD), yet esophageal adenocarcinoma (EAC) disproportionately affects European American individuals. We investigated whether the esophageal squamous mucosa of African American individuals has features that protect against GERD-induced damage, compared with European American individuals. METHODS: We performed transcriptional profile analysis of esophageal squamous mucosa tissues from 20 African American and 20 European American individuals (24 with no disease and 16 with Barrett's esophagus and/or EAC). We confirmed our findings in a cohort of 56 patients and analyzed DNA samples from patients to identify associated variants. Observations were validated using matched genomic sequence and expression data from lymphoblasts from the 1000 Genomes Project. A panel of esophageal samples from African American and European American subjects was used to confirm allele-related differences in protein levels. The esophageal squamous-derived cell line Het-1A and a rat esophagogastroduodenal anastomosis model for reflux-generated esophageal damage were used to investigate the effects of the DNA-damaging agent cumene-hydroperoxide (cum-OOH) and a chemopreventive cranberry proanthocyanidin (C-PAC) extract, respectively, on levels of protein and messenger RNA (mRNA). RESULTS: We found significantly higher levels of glutathione S-transferase theta 2 (GSTT2) mRNA in squamous mucosa from African American compared with European American individuals and associated these with variants within the GSTT2 locus in African American individuals. We confirmed that 2 previously identified genomic variants at the GSTT2 locus, a 37-kb deletion and a 17-bp promoter duplication, reduce expression of GSTT2 in tissues from European American individuals. The nonduplicated 17-bp promoter was more common in tissue samples from populations of African descent. GSTT2 protected Het-1A esophageal squamous cells from cum-OOH-induced DNA damage. Addition of C-PAC increased GSTT2 expression in Het-1A cells incubated with cum-OOH and in rats with reflux-induced esophageal damage. C-PAC also reduced levels of DNA damage in reflux-exposed rat esophagi, as observed by reduced levels of phospho-H2A histone family member X. CONCLUSIONS: We found GSTT2 to protect esophageal squamous cells against DNA damage from genotoxic stress and that GSTT2 expression can be induced by C-PAC. Increased levels of GSTT2 in esophageal tissues of African American individuals might protect them from GERD-induced damage and contribute to the low incidence of EAC in this population.


Subjects
Adenocarcinoma/genetics , Barrett Esophagus/genetics , Black or African American/genetics , DNA Damage , Esophageal Mucosa/enzymology , Esophageal Neoplasms/genetics , Gastroesophageal Reflux/genetics , Glutathione Transferase/genetics , White People/genetics , Adenocarcinoma/enzymology , Adenocarcinoma/ethnology , Adenocarcinoma/pathology , Animals , Barrett Esophagus/enzymology , Barrett Esophagus/ethnology , Barrett Esophagus/pathology , Animal Disease Models , Esophageal Mucosa/pathology , Esophageal Neoplasms/enzymology , Esophageal Neoplasms/ethnology , Esophageal Neoplasms/pathology , Female , Gastroesophageal Reflux/enzymology , Gastroesophageal Reflux/ethnology , Gastroesophageal Reflux/pathology , Glutathione Transferase/metabolism , HeLa Cells , Histones/metabolism , Humans , Incidence , Male , Middle Aged , Phosphoproteins/metabolism , Phosphorylation , Protective Factors , Sprague-Dawley Rats , Risk Factors , United States/epidemiology , Up-Regulation
9.
Genome Res ; 27(11): 1916-1929, 2017 11.
Article in English | MEDLINE | ID: mdl-28855259

ABSTRACT

Mobile element insertions (MEIs) represent ∼25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms of genetic variation in whole genome sequencing (WGS) projects involving population genetics, human diseases, and clinical genomics. Here, we describe the Mobile Element Locator Tool (MELT), which was developed as part of the 1000 Genomes Project to perform MEI discovery on a population scale. Using both Illumina WGS data and simulations, we demonstrate that MELT outperforms existing MEI discovery tools in terms of speed, scalability, specificity, and sensitivity, while also detecting a broader spectrum of MEI-associated features. Several run modes were developed to perform MEI discovery on local and cloud systems. In addition to using MELT to discover MEIs in modern humans as part of the 1000 Genomes Project, we also used it to discover MEIs in chimpanzees and ancient (Neanderthal and Denisovan) hominids. We detected diverse patterns of MEI stratification across these populations that likely were caused by (1) diverse rates of MEI production from source elements, (2) diverse patterns of MEI inheritance, and (3) the introgression of ancient MEIs into modern human genomes. Overall, our study provides the most comprehensive map of MEIs to date spanning chimpanzees, ancient hominids, and modern humans and reveals new aspects of MEI biology in these lineages. We also demonstrate that MELT is a robust platform for MEI discovery and analysis in a variety of experimental settings.


Subjects
Computational Biology/methods , DNA Transposable Elements , Neanderthals/genetics , Pan troglodytes/genetics , Animals , Genetic Databases , Molecular Evolution , High-Throughput Nucleotide Sequencing/methods , Humans , Single Nucleotide Polymorphism , Software , Whole Genome Sequencing/methods
10.
BMC Genomics ; 20(1): 391, 2019 May 20.
Article in English | MEDLINE | ID: mdl-31109297

ABSTRACT

BACKGROUND: Upstream open reading frames (uORFs) initiate translation within mRNA 5' leaders, and have the potential to alter main coding sequence (CDS) translation on transcripts in which they reside. Ribosome profiling (RP) studies suggest that translating ribosomes are pervasive within 5' leaders across model systems. However, the significance of this observation remains unclear. To explore a role for uORF usage in a model of neuronal differentiation, we performed RP on undifferentiated and differentiated human neuroblastoma cells. RESULTS: Using a spectral coherence algorithm (SPECtre), we identify 4954 consistently translated uORFs across 31% of all neuroblastoma transcripts. These uORFs predominantly utilize non-AUG initiation codons and exhibit translational efficiencies (TE) comparable to annotated coding regions. On a population basis, the global impact of both AUG and non-AUG initiated uORFs on basal CDS translation was small, even when analysis is limited to conserved and consistently translated uORFs. However, uORFs did alter the translation of a subset of genes, including the Diamond-Blackfan Anemia associated ribosomal gene RPS24. With retinoic acid induced differentiation, we observed an overall positive correlation in translational shifts between uORF/CDS pairs. However, CDSs downstream of uORFs show smaller shifts in TE with differentiation relative to CDSs without a predicted uORF, suggesting that uORF translation buffers cell state dependent fluctuations in CDS translation. CONCLUSION: This work provides insights into the dynamic relationships and potential regulatory functions of uORF/CDS pairs in a model of neuronal differentiation.


Subjects
Cell Differentiation/genetics , Neurons/metabolism , Open Reading Frames , Protein Biosynthesis , Algorithms , Tumor Cell Line , Gene Expression Regulation , Humans , Biological Models , Neurons/cytology , Ribosomes/metabolism
11.
Nature ; 470(7332): 59-65, 2011 Feb 03.
Article in English | MEDLINE | ID: mdl-21293372

ABSTRACT

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.


Subjects
DNA Copy Number Variations/genetics , Population Genetics , Human Genome/genetics , Genomics , Gene Duplication/genetics , Genetic Predisposition to Disease/genetics , Genotype , Humans , Insertional Mutagenesis/genetics , Reproducibility of Results , DNA Sequence Analysis , Sequence Deletion/genetics
12.
BMC Bioinformatics ; 17(1): 482, 2016 Nov 25.
Article in English | MEDLINE | ID: mdl-27884106

ABSTRACT

BACKGROUND: Active protein translation can be assessed and measured using ribosome profiling sequencing strategies. Prevailing analytical approaches applied to this technology make use of sequence fragment length profiling or reading frame occupancy enrichment to differentiate between active translation and background noise; however, they do not consider additional characteristics inherent to the technology, which limits their overall accuracy. RESULTS: Here, we present an analytical tool that models the overall tri-nucleotide periodicity of ribosomal occupancy using a classifier based on spectral coherence. Our software, SPECtre, examines the relationship of normalized ribosome profiling read coverage over a rolling series of windows along a transcript relative to an idealized reference signal without the matched requirement of mRNA-Seq. CONCLUSIONS: A comparison of SPECtre against previously published methods on existing data shows a marked improvement in accuracy for detecting active translation and exhibits overall high accuracy at a low false discovery rate. In addition, SPECtre performs comparably to a recently published method similarly based on spectral coherence, but with reduced runtime and memory requirements. SPECtre is available as an open source software package at https://github.com/mills-lab/spectreok .


Subjects
Algorithms , Messenger RNA/metabolism , Ribosomes/metabolism , RNA Sequence Analysis/methods , Software , Transcriptome/genetics , Gene Expression Profiling , HEK293 Cells , Humans , Open Reading Frames , Protein Biosynthesis , Messenger RNA/genetics , Ribosomes/genetics
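
The tri-nucleotide periodicity mentioned in the abstract above can be illustrated with a much simpler score than the one SPECtre uses. The sketch below is not SPECtre's spectral coherence classifier. It is a plain periodogram, assuming only a list of per-nucleotide read counts, that reports what fraction of the non-constant signal power sits at the codon frequency. The function name `power_at_period` and the toy coverage vectors are hypothetical.

```python
import cmath
import math


def power_at_period(coverage, period=3):
    """Fraction of one-sided, non-DC spectral power at frequency 1/period.

    coverage: per-nucleotide read counts for one window of a transcript.
    Returns a value in [0, 1]; values near 1 mean the coverage repeats
    almost perfectly every `period` nucleotides.
    """
    n = len(coverage)
    if n < 2 * period or n % period != 0:
        raise ValueError("window length must be a multiple of the period "
                         "and span at least two periods")
    mean = sum(coverage) / n
    centered = [c - mean for c in coverage]  # remove the constant (DC) term

    def dft_power(k):
        # Squared magnitude of the k-th discrete Fourier coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        return abs(coeff) ** 2

    total = sum(dft_power(k) for k in range(1, n // 2 + 1))
    if total == 0:
        return 0.0  # flat coverage carries no periodic signal
    return dft_power(n // period) / total


# Toy windows (invented for illustration, not real ribosome profiling data).
codon_like = [9, 1, 1] * 10   # strong frame-0 preference
flat = [5] * 30               # uniform coverage
period_two = [4, 0] * 15      # periodic, but not at the codon frequency

print(power_at_period(codon_like))
print(power_at_period(flat))
print(power_at_period(period_two))
```

A coherence-based method compares the coverage against a reference signal across windows. This single-window score only shows why actively translated regions stand out in the frequency domain.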
13.
Nucleic Acids Res ; 42(20): 12640-9, 2014 Nov 10.
Article in English | MEDLINE | ID: mdl-25348406

ABSTRACT

The transfer of mitochondrial genetic material into the nuclear genomes of eukaryotes is a well-established phenomenon that has been previously limited to the study of static reference genomes. The recent advancement of high throughput sequencing has enabled an expanded exploration into the diversity of polymorphic nuclear mitochondrial insertions (NumtS) within human populations. We have developed an approach to discover and genotype novel Numt insertions using whole genome, paired-end sequencing data. We have applied this method to a thousand individuals in 20 populations from the 1000 Genomes Project and other datasets and identified 141 new sites of Numt insertions, extending our current knowledge of existing NumtS by almost 20%. We find that recent Numt insertions are derived from throughout the mitochondrial genome, including the D-loop, and have integration biases that differ in some respects from previous studies on older, fixed NumtS in the reference genome. We determined the complete inserted sequence for a subset of these events and have identified a number of nearly full-length mitochondrial genome insertions into nuclear chromosomes. We further define their age and origin of insertion and present an analysis of their potential impact to ongoing studies of mitochondrial heteroplasmy and disease.


Subjects
Cell Nucleus/genetics , Mitochondrial Genome , Genetic Polymorphism , Human Genome , Genomics/methods , Humans , Molecular Sequence Data , Insertional Mutagenesis , Phylogeny
14.
Proc Natl Acad Sci U S A ; 110(39): 15764-9, 2013 Sep 24.
Article in English | MEDLINE | ID: mdl-24014587

ABSTRACT

Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.


Subjects
Genome/genetics , Genomic Structural Variation/genetics , Primates/genetics , Animals , Gene Duplication , Gene Expression Profiling , Gene Expression Regulation , Humans , Nucleotides/genetics , Organ Specificity/genetics , Species Specificity
15.
Nature ; 460(7258): 1011-5, 2009 Aug 20.
Article in English | MEDLINE | ID: mdl-19587683

ABSTRACT

Recent advances in sequencing technologies have initiated an era of personal genome sequences. To date, human genome sequences have been reported for individuals with ancestry in three distinct geographical regions: a Yoruba African, two individuals of northwest European origin, and a person from China. Here we provide a highly annotated, whole-genome sequence for a Korean individual, known as AK1. The genome of AK1 was determined by an exacting, combined approach that included whole-genome shotgun sequencing (27.8x coverage), targeted bacterial artificial chromosome sequencing, and high-resolution comparative genomic hybridization using custom microarrays featuring more than 24 million probes. Alignment to the NCBI reference, a composite of several ethnic clades, disclosed nearly 3.45 million single nucleotide polymorphisms (SNPs), including 10,162 non-synonymous SNPs, and 170,202 deletion or insertion polymorphisms (indels). SNP and indel densities were strongly correlated genome-wide. Applying very conservative criteria yielded highly reliable copy number variants for clinical considerations. Potential medical phenotypes were annotated for non-synonymous SNPs, coding domain indels, and structural variants. The integration of several human whole-genome sequences derived from several ethnic groups will assist in understanding genetic ancestry, migration patterns and population bottlenecks.


Subjects
Asian People/genetics , Human Genome/genetics , Bacterial Artificial Chromosomes/genetics , Comparative Genomic Hybridization , Computational Biology , Humans , INDEL Mutation/genetics , Korea , Oligonucleotide Array Sequence Analysis , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis
16.
Proc Natl Acad Sci U S A ; 109(31): 12656-61, 2012 Jul 31.
Article in English | MEDLINE | ID: mdl-22797897

ABSTRACT

Gene expression differences are shaped by selective pressures and contribute to phenotypic differences between species. We identified 964 copy number differences (CNDs) of conserved sequences across three primate species and examined their potential effects on gene expression profiles. Samples with copy number-different genes had significantly different expression than samples with neutral copy number. Genes encoding regulatory molecules differed in copy number and were associated with significant expression differences. Additionally, we identified 127 CNDs that were processed pseudogenes, some of which were expressed. Furthermore, there were copy number-different regulatory regions such as ultraconserved elements and long intergenic noncoding RNAs with the potential to affect expression. We postulate that CNDs of these conserved sequences fine-tune developmental pathways by altering the levels of RNA.


Subjects
Intergenic DNA/physiology , Gene Dosage/physiology , Gene Expression Regulation/physiology , Pseudogenes/physiology , Untranslated RNA/physiology , Transcriptional Regulatory Elements/physiology , Animals , Cell Line , Humans , Macaca mulatta , Pan troglodytes , Species Specificity
17.
Proc Natl Acad Sci U S A ; 109(2): 529-34, 2012 Jan 10.
Article in English | MEDLINE | ID: mdl-22203992

ABSTRACT

Copy number variants (CNVs) represent a substantial source of genomic variation in vertebrates and have been associated with numerous human diseases. Despite this, the extent of CNVs in the zebrafish, an important model for human disease, remains unknown. Using 80 zebrafish genomes, representing three commonly used laboratory strains and one native population, we constructed a genome-wide, high-resolution CNV map for the zebrafish comprising 6,080 CNV elements and encompassing 14.6% of the zebrafish reference genome. This amount of copy number variation is four times that previously observed in other vertebrates, including humans. Moreover, 69% of the CNV elements exhibited strain specificity, with the highest number observed for Tubingen. This variation likely arose, in part, from Tubingen's large founding size and composite population origin. Additional population genetic studies also provided important insight into the origins and substructure of these commonly used laboratory strains. This extensive variation among and within zebrafish strains may have functional effects that impact phenotype and, if not properly addressed, such extensive levels of germ-line variation and population substructure in this commonly used model organism can potentially confound studies intended for translation to human diseases.


Subjects
DNA Copy Number Variations/genetics , Genetic Variation , Genomics/methods , Zebrafish/genetics , Animals , Comparative Genomic Hybridization , DNA Primers/genetics , Population Genetics , Species Specificity , Zebrafish/classification
18.
Genome Res ; 21(6): 830-9, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21460062

ABSTRACT

Human genetic variation is expected to play a central role in personalized medicine. Yet only a fraction of the natural genetic variation that is harbored by humans has been discovered to date. Here we report almost 2 million small insertions and deletions (INDELs) that range from 1 bp to 10,000 bp in length in the genomes of 79 diverse humans. These variants include 819,363 small INDELs that map to human genes. Small INDELs frequently were found in the coding exons of these genes, and several lines of evidence indicate that such variation is a major determinant of human biological diversity. Microarray-based genotyping experiments revealed several interesting observations regarding the population genetics of small INDEL variation. For example, we found that many of our INDELs had high levels of linkage disequilibrium (LD) with both HapMap SNPs and with high-scoring SNPs from genome-wide association studies. Overall, our study indicates that small INDEL variation is likely to be a key factor underlying inherited traits and diseases in humans.


Subjects
Genetic Variation , Genome, Human/genetics , INDEL Mutation/genetics , Genomics/methods , Genotype , Humans , Microarray Analysis , Precision Medicine/methods
19.
Nat Commun ; 15(1): 4220, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38760338

ABSTRACT

When somatic cells acquire complex karyotypes, they often are removed by the immune system. Mutant somatic cells that evade immune surveillance can lead to cancer. Neurons with complex karyotypes arise during neurotypical brain development, but neurons are almost never the origin of brain cancers. Instead, somatic mutations in neurons can bring about neurodevelopmental disorders, and contribute to the polygenic landscape of neuropsychiatric and neurodegenerative disease. A subset of human neurons harbors idiosyncratic copy number variants (CNVs, "CNV neurons"), but previous analyses of CNV neurons are limited by relatively small sample sizes. Here, we develop an allele-based validation approach, SCOVAL, to corroborate or reject read-depth based CNV calls in single human neurons. We apply this approach to 2,125 frontal cortical neurons from a neurotypical human brain. SCOVAL identifies 226 CNV neurons, which include a subclass of 65 CNV neurons with highly aberrant karyotypes containing whole or substantial losses on multiple chromosomes. Moreover, we find that CNV location appears to be nonrandom. Recurrent regions of neuronal genome rearrangement contain fewer, but longer, genes.


Subjects
DNA Copy Number Variations , Mosaicism , Neurons , Humans , Neurons/metabolism , Alleles
20.
BMC Bioinformatics ; 14: 157, 2013 May 09.
Article in English | MEDLINE | ID: mdl-23656838

ABSTRACT

BACKGROUND: In recent years there has been a growing interest in the role of copy number variations (CNVs) in genetic diseases. Although technologies and statistical methods for detecting CNVs from array data have developed rapidly, the data quality issues inherent to most hybridization techniques remain a challenging problem in CNV association studies. RESULTS: To help address these data quality issues in the context of family-based association studies, we introduce a statistical framework for intensity-based array data that takes family information into account for copy-number assignment. The method is an adaptation of traditional methods for modeling SNP genotype data that assume a Gaussian mixture model, whereby CNV calling is performed for all family members simultaneously, leveraging within-family data to reduce CNV calls that are incompatible with Mendelian inheritance while still allowing de novo CNVs. Applying this method to simulation studies and a genome-wide association study in asthma, we find that our approach significantly improves CNV call accuracy and reduces Mendelian inconsistency rates and false-positive genotype calls. The results were validated using qPCR experiments. CONCLUSIONS: We have demonstrated that the use of family information can improve the quality of CNV calling and may yield more powerful association tests for CNVs.


Subjects
DNA Copy Number Variations , Genotyping Techniques , Asthma/genetics , Family , Genome-Wide Association Study , Humans , Polymerase Chain Reaction