Results 1 - 6 of 6
1.
PLoS Genet ; 6(3): e1000866, 2010 Mar 05.
Article in English | MEDLINE | ID: mdl-20221249

ABSTRACT

As we move forward from the current generation of genome-wide association (GWA) studies, additional cohorts of different ancestries will be studied to increase power, fine map association signals, and generalize association results to additional populations. Knowledge of genetic ancestry as well as population substructure will become increasingly important for GWA studies in populations of unknown ancestry. Here we propose genotyping pooled DNA samples using genome-wide SNP arrays as a viable option to efficiently and inexpensively estimate admixture proportion and identify ancestry informative markers (AIMs) in populations of unknown origin. We constructed DNA pools from African American, Native Hawaiian, Latina, and Jamaican samples and genotyped them using the Affymetrix 6.0 array. Aided by individual genotype data from the African American cohort, we established quality control filters to remove poorly performing SNPs and estimated allele frequencies for the remaining SNPs in each panel. We then applied a regression-based method to estimate the proportion of admixture in each cohort using the allele frequencies estimated from pooling and populations from the International HapMap Consortium as reference panels, and identified AIMs unique to each population. In this study, we demonstrated that genotyping pooled DNA samples yields estimates of admixture proportion that are both consistent with our knowledge of population history and similar to those obtained by genotyping known AIMs. Furthermore, through validation by individual genotyping, we demonstrated that pooling is quite effective for identifying SNPs with large allele frequency differences (i.e., AIMs) and that these AIMs are able to differentiate two closely related populations (HapMap JPT and CHB).


Subject(s)
Gene Pool , Genetics, Population/methods , Genome, Human/genetics , Phylogeny , Asian People/genetics , Gene Frequency/genetics , Genetic Markers , Genotype , Humans , Principal Component Analysis , Quality Control , Reproducibility of Results
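
The regression step described in the abstract above can be illustrated with a minimal sketch. It assumes a two-way admixture model in which each pooled allele frequency is a weighted mix of two reference-panel frequencies, and it fits the single mixing weight by ordinary least squares. The function names, the two-panel restriction, and the clamping to [0, 1] are assumptions of this sketch, not details taken from the paper, which may use more panels and a different fitting procedure. A second helper ranks SNPs by absolute allele frequency difference, which is one simple way to shortlist candidate AIMs.

```python
def estimate_admixture(pooled, ref_a, ref_b):
    """Least-squares estimate of the ancestry fraction from reference panel A.

    Assumed model: pooled[i] ~ a * ref_a[i] + (1 - a) * ref_b[i] for each SNP i.
    """
    if not (len(pooled) == len(ref_a) == len(ref_b)):
        raise ValueError("all frequency lists must have the same length")
    numerator = 0.0
    denominator = 0.0
    for p, freq_a, freq_b in zip(pooled, ref_a, ref_b):
        delta = freq_a - freq_b          # how informative this SNP is
        numerator += (p - freq_b) * delta
        denominator += delta * delta
    if denominator == 0.0:
        raise ValueError("reference panels are identical at every SNP")
    # Clamp to a valid proportion (an assumption of this sketch).
    return min(1.0, max(0.0, numerator / denominator))


def rank_candidate_aims(freq_x, freq_y, top):
    """Return indices of the `top` SNPs with the largest frequency difference."""
    order = sorted(range(len(freq_x)),
                   key=lambda i: abs(freq_x[i] - freq_y[i]),
                   reverse=True)
    return order[:top]


if __name__ == "__main__":
    # Made-up frequencies for illustration only.
    ref_a = [0.9, 0.1, 0.5, 0.8]
    ref_b = [0.2, 0.7, 0.5, 0.3]
    pooled = [0.8 * a + 0.2 * b for a, b in zip(ref_a, ref_b)]
    print(estimate_admixture(pooled, ref_a, ref_b))
    print(rank_candidate_aims(ref_a, ref_b, top=2))
```

SNPs where the two panels have the same frequency contribute nothing to the estimate, which is why markers with large frequency differences carry most of the information.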
2.
Nat Genet ; 40(10): 1253-60, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18776909

ABSTRACT

Accurate and complete measurement of single nucleotide (SNP) and copy number (CNV) variants, both common and rare, will be required to understand the role of genetic variation in disease. We present Birdsuite, a four-stage analytical framework instantiated in software for deriving integrated and mutually consistent copy number and SNP genotypes. The method sequentially assigns copy number across regions of common copy number polymorphisms (CNPs), calls genotypes of SNPs, identifies rare CNVs via a hidden Markov model (HMM), and generates an integrated sequence and copy number genotype at every locus (for example, including genotypes such as A-null, AAB and BBB in addition to AA, AB and BB calls). Such genotypes more accurately depict the underlying sequence of each individual, reducing the rate of apparent mendelian inconsistencies. The Birdsuite software is applied here to data from the Affymetrix SNP 6.0 array. Additionally, we describe a method, implemented in PLINK, to utilize these combined SNP and CNV genotypes for association testing with a phenotype.


Subject(s)
Chromosomes, Human, Pair 4/genetics , Chromosomes, Human/genetics , DNA/genetics , Gene Dosage/genetics , Haplotypes/genetics , Models, Statistical , Polymorphism, Single Nucleotide , Algorithms , Female , Genome, Human , Genotype , Humans , Male , Markov Chains , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Software
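
The abstract above mentions that rare CNVs are identified with a hidden Markov model. As a rough illustration of that general idea, the sketch below runs Viterbi decoding over a sequence of probe log2 intensity ratios with three copy-number states. It is not the Birdsuite model: the state set, the Gaussian emission means and standard deviation, the prior, and the transition probability are all hypothetical values chosen for this example.

```python
import math

# Hypothetical copy-number states and emission parameters (assumptions).
STATES = (1, 2, 3)                      # one copy, two copies, three copies
MEANS = {1: -1.0, 2: 0.0, 3: 0.58}      # roughly log2(copies / 2)
SIGMA = 0.3                             # assumed probe noise
STAY = 0.99                             # assumed probability of keeping state
PRIOR = {1: 0.005, 2: 0.99, 3: 0.005}   # assumed initial distribution


def log_emission(x, state):
    """Log density of a Gaussian centred on the state's expected log2 ratio."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def log_transition(prev, cur):
    """Log probability of moving from `prev` to `cur` between adjacent probes."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1.0 - STAY) / (len(STATES) - 1))


def viterbi(log_ratios):
    """Return the most likely copy-number state for each probe."""
    if not log_ratios:
        return []
    score = {s: math.log(PRIOR[s]) + log_emission(log_ratios[0], s)
             for s in STATES}
    backpointers = []
    for x in log_ratios[1:]:
        new_score = {}
        pointers = {}
        for cur in STATES:
            best_prev = max(STATES,
                            key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (score[best_prev]
                              + log_transition(best_prev, cur)
                              + log_emission(x, cur))
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Made-up probe data: a five-probe deletion inside a normal region.
    data = [0.0] * 5 + [-1.0] * 5 + [0.0] * 5
    print(viterbi(data))
```

The high stay probability is what lets the decoder ignore single noisy probes while still calling a run of consistently shifted probes as a copy-number change.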
3.
Nat Genet ; 40(10): 1166-74, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18776908

ABSTRACT

Dissecting the genetic basis of disease risk requires measuring all forms of genetic variation, including SNPs and copy number variants (CNVs), and is enabled by accurate maps of their locations, frequencies and population-genetic properties. We designed a hybrid genotyping array (Affymetrix SNP 6.0) to simultaneously measure 906,600 SNPs and copy number at 1.8 million genomic locations. By characterizing 270 HapMap samples, we developed a map of human CNV (at 2-kb breakpoint resolution) informed by integer genotypes for 1,320 copy number polymorphisms (CNPs) that segregate at an allele frequency >1%. More than 80% of the sequence in previously reported CNV regions fell outside our estimated CNV boundaries, indicating that large (>100 kb) CNVs affect much less of the genome than initially reported. Approximately 80% of observed copy number differences between pairs of individuals were due to common CNPs with an allele frequency >5%, and more than 99% derived from inheritance rather than new mutation. Most common, diallelic CNPs were in strong linkage disequilibrium with SNPs, and most low-frequency CNVs segregated on specific SNP haplotypes.


Subject(s)
Chromosomes, Human/genetics , DNA/genetics , Gene Dosage/genetics , Haplotypes/genetics , Polymorphism, Single Nucleotide , Population Groups/genetics , Genetic Variation , Genome, Human , Humans , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction
4.
Nat Genet ; 40(5): 638-45, 2008 May.
Article in English | MEDLINE | ID: mdl-18372903

ABSTRACT

Genome-wide association (GWA) studies have identified multiple loci at which common variants modestly but reproducibly influence risk of type 2 diabetes (T2D). Established associations to common and rare variants explain only a small proportion of the heritability of T2D. As previously published analyses had limited power to identify variants with modest effects, we carried out meta-analysis of three T2D GWA scans comprising 10,128 individuals of European descent and approximately 2.2 million SNPs (directly genotyped and imputed), followed by replication testing in an independent sample with an effective sample size of up to 53,975. We detected at least six previously unknown loci with robust evidence for association, including the JAZF1 (P = 5.0 × 10^-14), CDC123-CAMK1D (P = 1.2 × 10^-10), TSPAN8-LGR5 (P = 1.1 × 10^-9), THADA (P = 1.1 × 10^-9), ADAMTS9 (P = 1.2 × 10^-8) and NOTCH2 (P = 4.1 × 10^-8) gene regions. Our results illustrate the value of large discovery and follow-up samples for gaining further insights into the inherited basis of T2D.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Humans , Polymorphism, Single Nucleotide
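
The abstract above does not say how the three scans were combined. One widely used approach for combining per-study effect estimates at a single SNP is fixed-effect, inverse-variance-weighted meta-analysis, sketched below. The function name and the example inputs are hypothetical, and this is offered as a generic illustration of why pooling studies increases power, not as the method used in the paper.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Combine per-study effect sizes with inverse-variance weights.

    Returns (pooled beta, pooled standard error, z score, two-sided p-value).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


if __name__ == "__main__":
    # Made-up log odds ratios and standard errors from three studies.
    print(fixed_effect_meta([0.10, 0.10, 0.10], [0.05, 0.05, 0.05]))
```

With three equally precise studies, the pooled standard error shrinks by a factor of the square root of three, so an effect that is only nominally significant in each study can become much more convincing once combined.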
5.
Genome Biol ; 3(3): RESEARCH0011, 2002.
Article in English | MEDLINE | ID: mdl-11897023

ABSTRACT

BACKGROUND: Data from thousands of transcription-profiling experiments in organisms ranging from yeast to humans are now publicly available. How best to analyze these data remains an important challenge. A variety of tools have been used for this purpose, including hierarchical clustering, self-organizing maps and principal components analysis. In particular, concepts from vector algebra have proven useful in the study of genome-wide expression data. RESULTS: Here we present a framework based on vector algebra for the analysis of transcription profiles that is geometrically intuitive and computationally efficient. Concepts in vector algebra such as angles, magnitudes, subspaces, singular value decomposition, bases and projections have natural and powerful interpretations in the analysis of microarray data. Angles in particular offer a rigorous method of defining 'similarity' and are useful in evaluating the claims of a microarray-based study. We present a sample analysis of cells treated with rapamycin, an immunosuppressant whose effects have been extensively studied with microarrays. In addition, the algebraic concept of a basis for a space affords the opportunity to simplify data analysis and uncover a limited number of expression vectors to span the transcriptional range of cell behavior. CONCLUSIONS: This framework represents a compact, powerful and scalable construction for analysis and computation. As the amount of microarray data in the public domain grows, these vector-based methods are relevant in determining statistical significance. These approaches are also well suited to extract biologically meaningful information in the analysis of signaling networks.


Subject(s)
Computational Biology/statistics & numerical data , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Genome, Fungal , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Saccharomyces cerevisiae/genetics , Cluster Analysis , Computational Biology/methods , Genes, Fungal/genetics , Humans , Immunosuppressive Agents/pharmacology , Mutation/genetics , Saccharomyces cerevisiae/drug effects , Sirolimus/pharmacology
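
The use of angles as a similarity measure, described in the abstract above, can be sketched in a few lines. Each expression profile is treated as a vector of per-gene log ratios, and the angle between two profiles comes from the normalized dot product. The function name and the example vectors are illustrative assumptions, and the paper's own framework covers more than this single calculation.

```python
import math


def angle_degrees(u, v):
    """Angle between two expression profiles treated as vectors."""
    if len(u) != len(v):
        raise ValueError("profiles must cover the same genes")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    # Made-up log ratios for three genes under two treatments.
    treatment_a = [1.0, 2.0, 3.0]
    treatment_b = [2.0, 4.0, 6.0]   # same direction, twice the magnitude
    print(angle_degrees(treatment_a, treatment_b))
    print(angle_degrees([1.0, 0.0], [0.0, 1.0]))
```

Because the angle ignores magnitude, two treatments that shift the same genes in the same proportions give an angle near 0 degrees even if one response is stronger, while unrelated responses sit near 90 degrees and opposite responses near 180 degrees.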
6.
Nature ; 416(6881): 653-7, 2002 Apr 11.
Article in English | MEDLINE | ID: mdl-11948353

ABSTRACT

Small molecules that alter protein function provide a means to modulate biological networks with temporal resolution. Here we demonstrate a potentially general and scalable method of identifying such molecules by application to a particular protein, Ure2p, which represses the transcription factors Gln3p and Nil1p. By probing a high-density microarray of small molecules generated by diversity-oriented synthesis with fluorescently labelled Ure2p, we performed 3,780 protein-binding assays in parallel and identified several compounds that bind Ure2p. One compound, which we call uretupamine, specifically activates a glucose-sensitive transcriptional pathway downstream of Ure2p. Whole-genome transcription profiling and chemical epistasis demonstrate the remarkable Ure2p specificity of uretupamine and its ability to modulate the glucose-sensitive subset of genes downstream of Ure2p. These results demonstrate that diversity-oriented synthesis and small-molecule microarrays can be used to identify small molecules that bind to a protein of interest, and that these small molecules can regulate specific functions of the protein.


Subject(s)
Dioxanes/metabolism , Gene Expression Regulation, Fungal , Glucose/metabolism , Oxazoles/metabolism , Prions , Saccharomyces cerevisiae Proteins/metabolism , Signal Transduction , Dioxanes/chemical synthesis , Dioxanes/chemistry , Dioxanes/pharmacokinetics , Dioxanes/pharmacology , Dose-Response Relationship, Drug , Gene Expression Profiling , Gene Expression Regulation, Fungal/drug effects , Glutathione Peroxidase , Ligands , Models, Biological , Oligonucleotide Array Sequence Analysis , Oxazoles/chemical synthesis , Oxazoles/chemistry , Oxazoles/pharmacokinetics , Oxazoles/pharmacology , Protein Binding , RNA, Messenger/genetics , RNA, Messenger/metabolism , Repressor Proteins/agonists , Repressor Proteins/antagonists & inhibitors , Repressor Proteins/metabolism , Saccharomyces cerevisiae/drug effects , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/agonists , Saccharomyces cerevisiae Proteins/antagonists & inhibitors , Saccharomyces cerevisiae Proteins/genetics , Signal Transduction/drug effects , Structure-Activity Relationship , Substrate Specificity , Transcription, Genetic/drug effects