Results 1 - 12 of 12
1.
Sci Rep ; 9(1): 1382, 2019 02 04.
Article in English | MEDLINE | ID: mdl-30718733

ABSTRACT

We introduce the design and implementation of a new array, the Korea Biobank Array (referred to as KoreanChip), optimized for the Korean population and demonstrate findings from GWAS of blood biochemical traits. KoreanChip comprised >833,000 markers including >247,000 rare-frequency or functional variants estimated from >2,500 sequencing data in Koreans. Of the 833 K markers, 208 K functional markers were directly genotyped. Particularly, >89 K markers were presented in East Asians. KoreanChip achieved higher imputation performance owing to the excellent genomic coverage of 95.38% for common and 73.65% for low-frequency variants. From GWAS (Genome-wide association study) using 6,949 individuals, 28 associations were successfully recapitulated. Moreover, 9 missense variants were newly identified, of which we identified new associations between a common population-specific missense variant, rs671 (p.Glu457Lys) of ALDH2, and two traits including aspartate aminotransferase (P = 5.20 × 10⁻¹³) and alanine aminotransferase (P = 4.98 × 10⁻⁸). Furthermore, two novel missense variants of GPT with rare frequency in East Asians but extreme rarity in other populations were associated with alanine aminotransferase (rs200088103; p.Arg133Trp, P = 2.02 × 10⁻⁹ and rs748547625; p.Arg143Cys, P = 1.41 × 10⁻⁶). These variants were successfully replicated in 6,000 individuals (P = 5.30 × 10⁻⁸ and P = 1.24 × 10⁻⁶). GWAS results suggest the promising utility of KoreanChip with a substantial number of damaging variants to identify new population-specific disease-associated rare/functional variants.


Subjects
Biological Specimen Banks; Blood/metabolism; Genetic Variation; Adult; Aged; Genetic Loci; Genome, Human; Genome-Wide Association Study; Genotype; Humans; Middle Aged; Mutation, Missense/genetics; Polymorphism, Single Nucleotide/genetics; Reproducibility of Results; Republic of Korea
2.
Transfusion ; 59(1): 101-111, 2019 01.
Article in English | MEDLINE | ID: mdl-30456907

ABSTRACT

BACKGROUND: Many aspects of transfusion medicine are affected by genetics. Current single-nucleotide polymorphism (SNP) arrays are limited in the number of targets that can be interrogated and cannot detect all variation of interest. We designed a transfusion medicine array (TM-Array) for study of both common and rare transfusion-relevant variations in genetically diverse donor and recipient populations. STUDY DESIGN AND METHODS: The array was designed by conducting extensive bioinformatics mining and consulting experts to identify genes and genetic variation related to a wide range of clinically relevant and research-related transfusion medicine topics. Copy number polymorphisms were added in the alpha globin, beta globin, and Rh gene clusters. RESULTS: The final array contains approximately 879,000 SNP and copy number polymorphism markers. Over 99% of SNPs were called reliably. Technical replication showed the array to be robust and reproducible, with an error rate less than 0.03%. The array also had a very low Mendelian error rate (average parent-child trio accuracy of 0.9997). Blood group results were in concordance with serology testing results, and the array accurately identifies rare variants (minor allele frequency of 0.5%). The array achieved high genome-wide imputation coverage for African-American (97.5%), Hispanic (96.1%), East Asian (94.6%), and white (96.1%) genomes at a minor allele frequency of 5%. CONCLUSIONS: A custom array for transfusion medicine research has been designed and evaluated. It gives wide coverage and accurate identification of rare SNPs in diverse populations. The TM-Array will be useful for future genetic studies in the diverse fields of transfusion medicine research.


Subjects
Genome, Human/genetics; Transfusion Medicine/methods; Black or African American; Asian People; Computational Biology; Gene Frequency/genetics; Genome-Wide Association Study; Genotype; Humans; Polymorphism, Single Nucleotide/genetics; White People
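The Mendelian error rate quoted in the abstract of record 2 is a trio-based quality metric. The abstract does not say how it was computed; the following is only a minimal sketch of one common way to count Mendelian inconsistencies, assuming autosomal biallelic SNPs with genotypes coded as the number of alternate alleles (0, 1, or 2). The function names are hypothetical.

```python
def transmissible(genotype):
    """Alleles (0 = reference, 1 = alternate) that a parent with this
    alternate-allele count can pass on to a child."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele of each parent."""
    possible = {a + b for a in transmissible(father) for b in transmissible(mother)}
    return child in possible


def mendelian_error_rate(trios):
    """Fraction of (father, mother, child) genotype triples that are inconsistent.

    Assumes every triple has all three genotypes called (no missing data).
    """
    trios = list(trios)
    if not trios:
        return 0.0
    errors = sum(not is_mendelian_consistent(f, m, c) for f, m, c in trios)
    return errors / len(trios)
```

Under these assumptions, a trio accuracy such as the one reported would correspond to one minus this error rate, averaged over trios.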
3.
Genome Biol Evol ; 9(12): 3225-3237, 2017 12 01.
Article in English | MEDLINE | ID: mdl-29165562

ABSTRACT

The human population displays wide variety in demographic history, ancestry, content of DNA derived from hominins or ancient populations, adaptation, traits, copy number variation, drug response, and more. These polymorphisms are of broad interest to population geneticists, forensics investigators, and medical professionals. Historically, much of that knowledge was gained from population survey projects. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism genotyping, their design specifications are limited and they do not allow a full exploration of biodiversity. We thereby aimed to design the Diversity of REcent and Ancient huMan (DREAM), an all-inclusive microarray that would allow both identification of known associations and exploration of standing questions in genetic anthropology, forensics, and personalized medicine. DREAM includes probes to interrogate ancestry informative markers obtained from over 450 human populations, over 200 ancient genomes, and 10 archaic hominins. DREAM can identify 94% and 61% of all known Y and mitochondrial haplogroups, respectively, and was vetted to avoid interrogation of clinically relevant markers. To demonstrate its capabilities, we compared its FST distributions with those of the 1000 Genomes Project and commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, DREAM's autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. DREAM performances are further illustrated in biogeographical, identical by descent, and copy number variation analyses. In summary, with approximately 800,000 markers spanning nearly 2,000 genes, DREAM is a useful tool for genetic anthropology, forensic, and personalized medicine studies.


Subjects
Anthropology/methods; Genetics, Population/methods; Genome, Human; Precision Medicine/methods; DNA Copy Number Variations; DNA, Ancient; Evolution, Molecular; Genetic Markers; Genotype; Humans; Microarray Analysis; Pedigree; Polymorphism, Single Nucleotide
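The abstract of record 3 compares arrays by their mean FST but does not state which estimator was used. As an illustration only, the sketch below computes a simple heterozygosity-based FST for one biallelic SNP in two equally weighted populations, then averages it over SNPs. The function names and the equal-weighting choice are assumptions, not the authors' procedure.

```python
def fst_two_populations(p1, p2):
    """Heterozygosity-based FST for one biallelic SNP.

    p1, p2: allele frequencies of the same allele in two populations,
    weighted equally (an assumption of this sketch).
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_total == 0:
        return 0.0  # monomorphic site carries no differentiation signal
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    return (h_total - h_within) / h_total


def mean_fst(frequency_pairs):
    """Average per-SNP FST over an iterable of (p1, p2) pairs."""
    values = [fst_two_populations(p1, p2) for p1, p2 in frequency_pairs]
    return sum(values) / len(values) if values else 0.0
```

A marker set enriched for ancestry-informative SNPs would be expected to show a higher mean under this kind of measure, which is the sense in which the abstract uses mean FST.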
4.
Genome Med ; 7: 90, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26423053

ABSTRACT

BACKGROUND: In addition to HLA genetic incompatibility, non-HLA differences between donors and recipients of transplantation leading to allograft rejection are now becoming evident. We aimed to create a unique genome-wide platform to facilitate genomic research in transplant-related studies. We designed a genome-wide genotyping tool based on the most recent human genomic reference datasets, and included customization for known and potentially relevant metabolic and pharmacological loci relevant to transplantation. METHODS: We describe here the design and implementation of a customized genome-wide genotyping array, the 'TxArray', comprising approximately 782,000 markers with tailored content for deeper capture of variants across HLA, KIR, pharmacogenomic, and metabolic loci important in transplantation. To test concordance and genotyping quality, we genotyped 85 HapMap samples on the array, including eight trios. RESULTS: We show low Mendelian error rates and high concordance rates for HapMap samples (average parent-parent-child heritability of 0.997, and concordance of 0.996). We performed genotype imputation across autosomal regions, masking directly genotyped SNPs to assess imputation accuracy and report an accuracy of >0.962 for directly genotyped SNPs. We demonstrate much higher capture of the natural killer cell immunoglobulin-like receptor (KIR) region versus comparable platforms. Overall, we show that the genotyping quality and coverage of the TxArray is very high when compared to reference samples and to other genome-wide genotyping platforms. CONCLUSIONS: We have designed a comprehensive genome-wide genotyping tool which enables accurate association testing and imputation of ungenotyped SNPs, facilitating powerful and cost-effective large-scale genotyping of transplant-related studies.


Subjects
Genome-Wide Association Study; Genotype; DNA Copy Number Variations; HLA Antigens/genetics; Humans; Polymorphism, Single Nucleotide; Receptors, KIR/genetics
5.
Genetics ; 200(4): 1051-60, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26092718

ABSTRACT

The Kaiser Permanente (KP) Research Program on Genes, Environment and Health (RPGEH), in collaboration with the University of California-San Francisco, undertook genome-wide genotyping of >100,000 subjects that constitute the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. The project, which generated >70 billion genotypes, represents the first large-scale use of the Affymetrix Axiom Genotyping Solution. Because genotyping took place over a short 14-month period, creating a near-real-time analysis pipeline for experimental assay quality control and final optimized analyses was critical. Because of the multi-ethnic nature of the cohort, four different ethnic-specific arrays were employed to enhance genome-wide coverage. All assays were performed on DNA extracted from saliva samples. To improve sample call rates and significantly increase genotype concordance, we partitioned the cohort into disjoint packages of plates with similar assay contexts. Using strict QC criteria, the overall genotyping success rate was 103,067 of 109,837 samples assayed (93.8%), with a range of 92.1-95.4% for the four different arrays. Similarly, the SNP genotyping success rate ranged from 98.1 to 99.4% across the four arrays, the variation depending mostly on how many SNPs were included as single copy vs. double copy on a particular array. The high quality and large scale of genotype data created on this cohort, in conjunction with comprehensive longitudinal data from the KP electronic health records of participants, will enable a broad range of highly powered genome-wide association studies on a diversity of traits and conditions.


Subjects
Aging/genetics; Computational Biology/methods; Genotyping Techniques/methods; Health; Adult; Cohort Studies; Female; Humans; Male; Molecular Epidemiology; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide; Quality Control
6.
Genomics ; 98(6): 422-30, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21903159

ABSTRACT

Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies.


Subjects
Asian People/genetics; Black or African American/genetics; Genome-Wide Association Study/methods; Hispanic or Latino/genetics; Polymorphism, Single Nucleotide; Algorithms; Asia, Eastern; Genome, Human; Genotype; Humans; Oligonucleotide Array Sequence Analysis/methods; Pilot Projects; White People/genetics
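The abstract of record 6 names greedy pairwise SNP selection but gives no algorithmic detail. The sketch below shows one generic form of greedy pairwise tagging: repeatedly pick the SNP that covers the most still-uncovered targets, where "covers" means pairwise r² at or above a threshold. The threshold of 0.8, the input layout, and the function name are assumptions for illustration. The hybrid step described in the abstract (removing targets already covered by imputation) would need an imputation engine and is not shown.

```python
def greedy_tag_snps(targets, r2, threshold=0.8):
    """Greedy pairwise tag-SNP selection.

    targets:   list of SNP identifiers to be covered.
    r2:        dict mapping (snp_a, snp_b) tuples to pairwise r^2 values.
    threshold: minimum r^2 for one SNP to tag another (assumed value).
    Returns the chosen tag SNPs in selection order.
    """
    targets = list(targets)
    # Every SNP tags itself plus each SNP it is in strong LD with.
    tags = {snp: {snp} for snp in targets}
    for (a, b), value in r2.items():
        if value >= threshold and a in tags and b in tags:
            tags[a].add(b)
            tags[b].add(a)

    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Pick the SNP covering the most uncovered targets (first one wins ties).
        best = max(targets, key=lambda snp: len(tags[snp] & uncovered))
        chosen.append(best)
        uncovered -= tags[best]
    return chosen
```

For example, with targets A, B, C, D and strong LD only between A-B and B-C, the sketch selects B first (it covers A, B, and C) and then D.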
7.
Genomics ; 98(2): 79-89, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21565264

ABSTRACT

The success of genome-wide association studies has paralleled the development of efficient genotyping technologies. We describe the development of a next-generation microarray based on the new highly-efficient Affymetrix Axiom genotyping technology that we are using to genotype individuals of European ancestry from the Kaiser Permanente Research Program on Genes, Environment and Health (RPGEH). The array contains 674,517 SNPs, and provides excellent genome-wide as well as gene-based and candidate-SNP coverage. Coverage was calculated using an approach based on imputation and cross validation. Preliminary results for the first 80,301 saliva-derived DNA samples from the RPGEH demonstrate very high quality genotypes, with sample success rates above 94% and over 98% of successful samples having SNP call rates exceeding 98%. At steady state, we have produced 462 million genotypes per week for each Axiom system. The new array provides a valuable addition to the repertoire of tools for large scale genome-wide association studies.


Subjects
Genome-Wide Association Study/methods; High-Throughput Screening Assays; Oligonucleotide Array Sequence Analysis/methods; Polymorphism, Single Nucleotide/genetics; White People/genetics; Humans
8.
Proc Natl Acad Sci U S A ; 107(28): 12587-92, 2010 Jul 13.
Article in English | MEDLINE | ID: mdl-20616066

ABSTRACT

A unique microarray-based method for determining the extent of DNA methylation has been developed. It relies on a selective enrichment of the regions to be assayed by target amplification by capture and ligation (mTACL). The assay is quantitatively accurate, relatively precise, and lends itself to high-throughput determination using nanogram amounts of DNA. The measurements using mTACLs are highly reproducible and in excellent agreement with those obtained by sequencing (r = 0.94). In the present work, the methylation status of >145,000 CpGs from 5,472 promoters in 221 samples was measured. The methylation levels of nearby CpGs are correlated, but the correlation falls off dramatically over several hundred base pairs. In some instances, nearby CpGs have very different levels of methylation. Comparison of normal and tumor samples indicates that in tumors, the promoter regions of genes involved in differentiation and signaling are preferentially hypermethylated, whereas those of housekeeping genes remain hypomethylated. mTACL is a platform for profiling the state of methylation of a large number of CpGs in many samples in a cost-effective fashion, and is capable of scaling to much larger numbers of CpGs than those collected here.


Subjects
DNA Methylation; Cell Differentiation/genetics; DNA/genetics; Dinucleoside Phosphates; Genome; Humans; Methylation
9.
BMC Bioinformatics ; 11: 305, 2010 Jun 04.
Article in English | MEDLINE | ID: mdl-20525369

ABSTRACT

BACKGROUND: Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. RESULTS: Our two-step dimensionality reduction compressed 90% of the original data, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. CONCLUSIONS: Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.


Subjects
Breast Neoplasms/genetics; Cell Line, Tumor; DNA Methylation; Epigenesis, Genetic; Gene Expression Regulation, Neoplastic; Antigens, Neoplasm/genetics; Breast Neoplasms/metabolism; Breast Neoplasms/pathology; Collagen Type I/genetics; Collagen Type I, alpha 1 Chain; CpG Islands; DNA Topoisomerases, Type II/genetics; DNA-Binding Proteins/genetics; Gene Expression Profiling; Genes, Tumor Suppressor; Genome-Wide Association Study; Humans; Oligonucleotide Array Sequence Analysis; Poly-ADP-Ribose Binding Proteins; Promoter Regions, Genetic; Proto-Oncogene Proteins c-vav/genetics; Trefoil Factor-1; Tumor Suppressor Proteins/genetics
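The abstract of record 9 says that the significance of methylation-expression associations was quantified through permutation analysis, without giving details. The sketch below illustrates the general idea of a permutation test for a negative association between one methylation profile and one expression profile across cell lines. It swaps in a plain Pearson correlation as the test statistic rather than the logistic regression ranking the abstract mentions, and the function names, permutation count, and seed are assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(methylation, expression, n_perm=2000, seed=0):
    """One-sided permutation p-value for a negative correlation.

    Shuffles the expression values across samples to break any real
    pairing, and counts how often the shuffled correlation is at least
    as negative as the observed one.
    """
    rng = random.Random(seed)
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(methylation, shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

In a genome-wide setting such as the one described, p-values from many gene-level tests would also need a multiple-testing correction, which this sketch leaves out.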
10.
Genome Res ; 17(6): 839-51, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17568002

ABSTRACT

Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.


Subjects
Evolution, Molecular; Gene Duplication; Pseudogenes; Transcription, Genetic; Animals; Cell Line; Humans; Primates/genetics; Retroelements; Sequence Analysis, DNA; Species Specificity
11.
Genome Res ; 14(10B): 2034-40, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489323

ABSTRACT

The NCBI Reference Sequence (RefSeq) project and the NIH Mammalian Gene Collection (MGC) together define a set of approximately 30,000 nonredundant human mRNA sequences with identified coding regions representing 17,000 distinct loci. These high-quality mRNA sequences allow for the identification of transcribed regions in the human genome sequence, and many researchers accept them as the correct representation of each defined gene sequence. Computational comparison of these mRNA sequences and the recently published essentially finished human genome sequence reveals several thousand undocumented nonsynonymous substitution and frame shift discrepancies between the two resources. Additional analysis is undertaken to verify that the euchromatic human genome is sufficiently complete, containing nearly the whole mRNA collection, thus allowing for a comprehensive analysis to be undertaken. Many of the discrepancies will prove to be genuine polymorphisms in the human population, somatic cell genomic variants, or examples of RNA editing. It is observed that the genome sequence variant has significant additional support from other mRNAs and ESTs, almost four times more often than does the mRNA variant, suggesting that the genome sequence is more accurate. In approximately 15% of these cases, there is substantial support for both variants, suggestive of an undocumented polymorphism. An initial screening against a 24-individual genomic DNA diversity panel verified 60% of a small set of potential single nucleotide polymorphisms from which successful results could be obtained. We also find statistical evidence that a few of these discrepancies are due to RNA editing. Overall, these results suggest that the mRNA collections may contain a substantial number of errors. For current and future mRNA collections, it may be prudent to fully reconcile each genome sequence discrepancy, classifying each as a polymorphism, site of RNA editing or somatic cell variation, or genome sequence error.


Subjects
Genetic Variation; Genome, Human; Human Genome Project; Polymorphism, Genetic; RNA Editing; RNA, Messenger/analysis; Computational Biology; Expressed Sequence Tags; Humans; Sequence Analysis, DNA
12.
Genome Res ; 14(4): 528-38, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15059993

ABSTRACT

Levels of recombination vary among species, among chromosomes within species, and among regions within chromosomes in mammals. This heterogeneity may affect levels of diversity, efficiency of selection, and genome composition, as well as have practical consequences for the genetic mapping of traits. We compared the genetic maps to the genome sequence assemblies of rat, mouse, and human to estimate local recombination rates across these genomes. Humans have greater overall levels of recombination, as well as greater variance. In rat and mouse, the size of the chromosome and proximity to telomere have less effect on local recombination rate than in human. At the chromosome level, rat and mouse X chromosomes have the lowest recombination rates, whereas human chromosome X does not show the same pattern. In all species, local recombination rate is significantly correlated with several sequence variables, including GC%, CpG density, repetitive elements, and the neutral mutation rate, with some pronounced differences between species. Recombination rate in one species is not strongly correlated with the rate in another, when comparing homologous syntenic blocks of the genome. This comparative approach provides additional insight into the causes and consequences of genomic heterogeneity in recombination.


Subjects
Genome, Human; Genome; Recombination, Genetic/genetics; Animals; Base Composition/genetics; Chromosomes/genetics; Crosses, Genetic; Evolution, Molecular; Genetic Variation/genetics; Humans; Mice; Mice, Inbred Strains; Mice, Obese; Rats; Rats, Inbred BN; Rats, Inbred SHR; Species Specificity
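The abstract of record 12 estimates local recombination rates by comparing genetic maps with genome sequence assemblies, but does not describe the estimator. One simple way to express the idea is to take, for each pair of adjacent markers, the genetic distance in centimorgans divided by the physical distance in megabases. The sketch below does only that; the function name and input layout are assumptions, and the published analysis may well have used smoothing or windowing that is not reproduced here.

```python
def local_recombination_rates(markers):
    """Per-interval recombination rate in cM/Mb.

    markers: list of (physical_position_bp, genetic_position_cM) tuples
             for one chromosome, sorted by physical position.
    Returns a list of (interval_midpoint_bp, cM_per_Mb) tuples.
    """
    rates = []
    for (bp1, cm1), (bp2, cm2) in zip(markers, markers[1:]):
        span_mb = (bp2 - bp1) / 1e6
        if span_mb <= 0:
            continue  # skip duplicated or mis-ordered physical positions
        rates.append(((bp1 + bp2) / 2, (cm2 - cm1) / span_mb))
    return rates
```

With markers at 1 Mb (0 cM), 3 Mb (1 cM), and 4 Mb (3 cM), the two intervals come out at 0.5 and 2.0 cM/Mb, which shows how the rate can vary along a chromosome.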