Results 1 - 16 of 16
1.
BMC Bioinformatics ; 22(1): 254, 2021 May 17.
Article in English | MEDLINE | ID: mdl-34000989

ABSTRACT

BACKGROUND: Colocalization is a statistical method used in genetics to determine whether the same variant is causal for multiple phenotypes, for example, complex traits and gene expression. It provides stronger mechanistic evidence than shared significance, which can be produced through separate causal variants in linkage disequilibrium. Current colocalization methods require full summary statistics for both traits, limiting their use with the majority of reported GWAS associations (e.g. GWAS Catalog). We propose a new approximation to the popular coloc method that can be applied when limited summary statistics are available. Our method (POint EstiMation of Colocalization, POEMColoc) imputes missing summary statistics for one or both traits using LD structure in a reference panel, and performs colocalization using the imputed summary statistics. RESULTS: We evaluate the performance of POEMColoc using real (UK Biobank phenotypes and GTEx eQTL) and simulated datasets. We show good correlation between posterior probabilities of colocalization computed from imputed and observed datasets and similar accuracy in simulation. We evaluate scenarios that might reduce performance and show that multiple independent causal variants in a region and imputation from a limited subset of typed variants have a larger effect while mismatched ancestry in the reference panel has a modest effect. Further, we find that POEMColoc is a better approximation of coloc when the imputed association statistics are from a well powered study (e.g., relatively larger sample size or effect size). Applying POEMColoc to estimate colocalization of GWAS Catalog entries and GTEx eQTL, we find evidence for colocalization of 150,000 trait-gene-tissue triplets. CONCLUSIONS: We find that colocalization analysis performed with full summary statistics can be closely approximated when only the summary statistics of the top SNP are available for one or both traits. 
When applied to the full GWAS Catalog and GTEx eQTL, we find that colocalized trait-gene pairs are enriched in tissues relevant to disease etiology and for matches to approved drug mechanisms. The POEMColoc R package is available at https://github.com/AbbVie-ComputationalGenomics/POEMColoc.
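
For orientation, the posterior probabilities of colocalization mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard coloc calculation under a single-causal-variant assumption, not the POEMColoc implementation and not its imputation step. The function names, the prior values `p1`, `p2`, `p12`, and the effect-size prior `prior_sd` are assumptions chosen for illustration.

```python
import math


def log_sum(values):
    """Return log(sum(exp(v))) computed stably."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_diff(a, b):
    """Return log(exp(a) - exp(b)); requires a > b."""
    return a + math.log1p(-math.exp(b - a))


def wakefield_labf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP (Wakefield's approximation).

    prior_sd is an assumed prior standard deviation of the effect size.
    """
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0..H4 for one region.

    labf1, labf2: per-SNP log Bayes factors for the two traits, in the same
    SNP order. The region must contain at least two SNPs.
    H3 = two distinct causal variants, H4 = one shared causal variant.
    """
    joint = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(joint)
    log_h = [
        0.0,                                                    # H0: no association
        math.log(p1) + s1,                                      # H1: trait 1 only
        math.log(p2) + s2,                                      # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),   # H3: distinct variants
        math.log(p12) + s12,                                    # H4: shared variant
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]
```

With per-SNP effect estimates and standard errors for both traits, `wakefield_labf` would be applied SNP by SNP and the two resulting lists passed to `coloc_posteriors`; the last element of the returned list is the posterior probability of a shared causal variant.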


Subjects
Genome-Wide Association Study , Quantitative Trait Loci , Linkage Disequilibrium , Multifactorial Inheritance , Single Nucleotide Polymorphism , Probability
2.
Dev Cell ; 55(5): 648-664.e9, 2020 Dec 07.
Article in English | MEDLINE | ID: mdl-33171098

ABSTRACT

Enhancers are essential drivers of cell states, yet the relationship between accessibility, regulatory activity, and in vivo lineage commitment during embryogenesis remains poorly understood. Here, we measure chromatin accessibility in isolated neural and mesodermal lineages across a time course of Drosophila embryogenesis. Promoters, including tissue-specific genes, are often constitutively open, even in contexts where the gene is not expressed. In contrast, the majority of distal elements have dynamic, tissue-specific accessibility. Enhancer priming appears rarely within a lineage, perhaps reflecting the speed of Drosophila embryogenesis. However, many tissue-specific enhancers are accessible in other lineages early on and become progressively closed as embryogenesis proceeds. We demonstrate the usefulness of this tissue- and time-resolved resource to definitively identify single-cell clusters, to uncover predictive motifs, and to identify many regulators of tissue development. For one such predicted neural regulator, l(3)neo38, we generate a loss-of-function mutant and uncover an essential role for neuromuscular junction and brain development.


Subjects
Drosophila melanogaster/embryology , Drosophila melanogaster/genetics , Embryonic Development/genetics , Genetic Enhancer Elements , Genetic Promoter Regions , Animals , Cell Lineage/genetics , Chromatin , Genetic Epigenesis , Developmental Gene Expression Regulation , Mesoderm/embryology , Muscles/embryology , Neurons/cytology , Organ Specificity/genetics , Protein Binding , Single-Cell Analysis , Time Factors , Transcription Factors/metabolism
3.
PLoS Genet ; 15(12): e1008489, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31830040

ABSTRACT

Despite strong vetting for disease activity, only 10% of candidate new molecular entities in early stage clinical trials are eventually approved. Analyzing historical pipeline data, Nelson et al. 2015 (Nat. Genet.) concluded pipeline drug targets with human genetic evidence of disease association are twice as likely to lead to approved drugs. Taking advantage of recent clinical development advances and rapid growth in GWAS datasets, we extend the original work using updated data, test whether genetic evidence predicts future successes and introduce statistical models adjusting for target and indication-level properties. Our work confirms drugs with genetically supported targets were more likely to be successful in Phases II and III. When causal genes are clear (Mendelian traits and GWAS associations linked to coding variants), we find the use of human genetic evidence increases approval by greater than two-fold, and, for Mendelian associations, the positive association holds prospectively. Our findings suggest investments into genomics and genetics are likely to be beneficial to companies deploying this strategy.


Subjects
Genetic Databases , Drug Approval/statistics & numerical data , Genetic Predisposition to Disease , Genome-Wide Association Study , Genomics/methods , Humans , Statistical Models , Pharmacogenomic Variants , Phenotype , Precision Medicine , Quantitative Trait Loci
4.
Nat Genet ; 49(4): 550-558, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28191888

ABSTRACT

Animal promoters initiate transcription either at precise positions (narrow promoters) or dispersed regions (broad promoters), a distinction referred to as promoter shape. Although highly conserved, the functional properties of promoters with different shapes and the genetic basis of their evolution remain unclear. Here we used natural genetic variation across a panel of 81 Drosophila lines to measure changes in transcriptional start site (TSS) usage, identifying thousands of genetic variants affecting transcript levels (strength) or the distribution of TSSs within a promoter (shape). Our results identify promoter shape as a molecular trait that can evolve independently of promoter strength. Broad promoters typically harbor shape-associated variants, with signatures of adaptive selection. Single-cell measurements demonstrate that variants modulating promoter shape often increase expression noise, whereas heteroallelic interactions with other promoter variants alleviate these effects. These results uncover new functional properties of natural promoters and suggest the minimization of expression noise as an important factor in promoter evolution.


Subjects
Genetic Variation/genetics , Genetic Promoter Regions/genetics , Animals , Biological Evolution , Drosophila/genetics , Noise , Transcription Initiation Site/physiology , Genetic Transcription/genetics
5.
Nature ; 541(7637): 402-406, 2017 Jan 19.
Article in English | MEDLINE | ID: mdl-28024300

ABSTRACT

Embryonic development is driven by tightly regulated patterns of gene expression, despite extensive genetic variation among individuals. Studies of expression quantitative trait loci (eQTL) indicate that genetic variation frequently alters gene expression in cell-culture models and differentiated tissues. However, the extent and types of genetic variation impacting embryonic gene expression, and their interactions with developmental programs, remain largely unknown. Here we assessed the effect of genetic variation on transcriptional (expression levels) and post-transcriptional (3' RNA processing) regulation across multiple stages of metazoan development, using 80 inbred Drosophila wild isolates, identifying thousands of developmental-stage-specific and shared QTL. Given the small blocks of linkage disequilibrium in Drosophila, we obtain near base-pair resolution, resolving causal mutations in developmental enhancers, validated transcription-factor-binding sites and RNA motifs. This fine-grain mapping uncovered extensive allelic interactions within enhancers that have opposite effects, thereby buffering their impact on enhancer activity. QTL affecting 3' RNA processing identify new functional motifs leading to transcript isoform diversity and changes in the lengths of 3' untranslated regions. These results highlight how developmental stage influences the effects of genetic variation and uncover multiple mechanisms that regulate and buffer expression variation during embryogenesis.


Subjects
Drosophila melanogaster/embryology , Drosophila melanogaster/genetics , Embryonic Development/genetics , Developmental Gene Expression Regulation , Genetic Variation , 3' Untranslated Regions/genetics , Alleles , Animals , Binding Sites , Genetic Enhancer Elements , Linkage Disequilibrium , Mutation , Quantitative Trait Loci , RNA 3' End Processing , Transcription Factors/metabolism , Genetic Transcription
6.
PLoS Genet ; 10(9): e1004663, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25233095

ABSTRACT

DNA methylation is an important epigenetic regulator of gene expression. Recent studies have revealed widespread associations between genetic variation and methylation levels. However, the mechanistic links between genetic variation and methylation remain unclear. To begin addressing this gap, we collected methylation data at ∼300,000 loci in lymphoblastoid cell lines (LCLs) from 64 HapMap Yoruba individuals, and genome-wide bisulfite sequence data in ten of these individuals. We identified (at an FDR of 10%) 13,915 cis methylation QTLs (meQTLs), i.e., CpG sites in which changes in DNA methylation are associated with genetic variation at proximal loci. We found that meQTLs are frequently associated with changes in methylation at multiple CpGs across regions of up to 3 kb. Interestingly, meQTLs are also frequently associated with variation in other properties of gene regulation, including histone modifications, DNase I accessibility, chromatin accessibility, and expression levels of nearby genes. These observations suggest that genetic variants may lead to coordinated molecular changes in all of these regulatory phenotypes. One plausible driver of coordinated changes in different regulatory mechanisms is variation in transcription factor (TF) binding. Indeed, we found that SNPs that change predicted TF binding affinities are significantly enriched for associations with DNA methylation at nearby CpGs.


Subjects
DNA Methylation , Gene Expression Regulation , Histones/metabolism , Quantitative Trait Loci , Transcription Factors/metabolism , Binding Sites , Transformed Cell Line , Computational Biology , Genome-Wide Association Study , Genomics/methods , Genotype , Humans , Phenotype , Single Nucleotide Polymorphism , Protein Binding
7.
Science ; 342(6159): 747-9, 2013 Nov 08.
Article in English | MEDLINE | ID: mdl-24136359

ABSTRACT

Histone modifications are important markers of function and chromatin state, yet the DNA sequence elements that direct them to specific genomic locations are poorly understood. Here, we identify hundreds of quantitative trait loci, genome-wide, that affect histone modification or RNA polymerase II (Pol II) occupancy in Yoruba lymphoblastoid cell lines (LCLs). In many cases, the same variant is associated with quantitative changes in multiple histone marks and Pol II, as well as in deoxyribonuclease I sensitivity and nucleosome positioning. Transcription factor binding site polymorphisms are correlated overall with differences in local histone modification, and we identify specific transcription factors whose binding leads to histone modification in LCLs. Furthermore, variants that affect chromatin at distal regulatory sites frequently also direct changes in chromatin and gene expression at associated promoters.


Subjects
Gene Expression Regulation , Genetic Variation , Histones/metabolism , Post-Translational Protein Processing/genetics , RNA Polymerase II/metabolism , Transcription Factors/metabolism , Binding Sites/genetics , Tumor Cell Line , Cells/metabolism , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Human Genome , Histones/chemistry , Histones/genetics , Humans , Genetic Polymorphism , Genetic Promoter Regions , Quantitative Trait Loci , RNA Polymerase II/chemistry , Transcription Factors/genetics
8.
PLoS Genet ; 8(10): e1003000, 2012.
Article in English | MEDLINE | ID: mdl-23071454

ABSTRACT

Recent gene expression QTL (eQTL) mapping studies have provided considerable insight into the genetic basis for inter-individual regulatory variation. However, a limitation of all eQTL studies to date, which have used measurements of steady-state gene expression levels, is the inability to directly distinguish between variation in transcription and decay rates. To address this gap, we performed a genome-wide study of variation in gene-specific mRNA decay rates across individuals. Using a time-course study design, we estimated mRNA decay rates for over 16,000 genes in 70 Yoruban HapMap lymphoblastoid cell lines (LCLs), for which extensive genotyping data are available. Considering mRNA decay rates across genes, we found that: (i) as expected, highly expressed genes are generally associated with lower mRNA decay rates, (ii) genes with rapid mRNA decay rates are enriched with putative binding sites for miRNA and RNA binding proteins, and (iii) genes with similar functional roles tend to exhibit correlated rates of mRNA decay. Focusing on variation in mRNA decay across individuals, we estimate that steady-state expression levels are significantly correlated with variation in decay rates in 10% of genes. Somewhat counter-intuitively, for about half of these genes, higher expression is associated with faster decay rates, possibly due to a coupling of mRNA decay with transcriptional processes in genes involved in rapid cellular responses. Finally, we used these data to map genetic variation that is specifically associated with variation in mRNA decay rates across individuals. We found 195 such loci, which we named RNA decay quantitative trait loci ("rdQTLs"). All the observed rdQTLs are located near the regulated genes and therefore are assumed to act in cis. By analyzing our data within the context of known steady-state eQTLs, we estimate that a substantial fraction of eQTLs are associated with inter-individual variation in mRNA decay rates.
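
As an illustration of how a decay rate can be estimated from a time course like the one described above, here is a minimal sketch. It assumes simple first-order (exponential) decay and fits a least-squares line to log expression against time; the abstract does not state the model the authors used, and the function names are hypothetical.

```python
import math


def estimate_decay_rate(times, levels):
    """Estimate a first-order decay rate from a time course.

    times: sampling times after transcription is halted (any consistent unit).
    levels: positive mRNA abundance measured at each time.
    Returns the negated least-squares slope of log(level) versus time.
    """
    logs = [math.log(level) for level in levels]
    n = len(times)
    t_mean = sum(times) / n
    log_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - log_mean) for t, v in zip(times, logs))
    return -sxy / sxx


def half_life(rate):
    """Half-life implied by a first-order decay rate."""
    return math.log(2) / rate
```

Per-gene rates estimated this way in each individual could then be treated as a quantitative trait and tested against genotype, which is the general idea behind mapping decay-rate QTLs.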


Subjects
Gene Expression , Genetic Variation , Quantitative Trait Loci , RNA Stability , Cell Line , Chromosome Mapping , Gene Expression Profiling , Gene Expression Regulation , Genome-Wide Association Study , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Interference
9.
Genome Biol ; 13(1): R7, 2012 Jan 31.
Article in English | MEDLINE | ID: mdl-22293038

ABSTRACT

BACKGROUND: Expression quantitative trait loci (eQTLs) are likely to play an important role in the genetics of complex traits; however, their functional basis remains poorly understood. Using the HapMap lymphoblastoid cell lines, we combine 1000 Genomes genotypes and an extensive catalogue of human functional elements to investigate the biological mechanisms that eQTLs perturb. RESULTS: We use a Bayesian hierarchical model to estimate the enrichment of eQTLs in a wide variety of regulatory annotations. We find that approximately 40% of eQTLs occur in open chromatin, and that they are particularly enriched in transcription factor binding sites, suggesting that many directly impact protein-DNA interactions. Analysis of core promoter regions shows that eQTLs also frequently disrupt some known core promoter motifs but, surprisingly, are not enriched in other well-known motifs such as the TATA box. We also show that information from regulatory annotations alone, when weighted by the hierarchical model, can provide a meaningful ranking of the SNPs that are most likely to drive gene expression variation. CONCLUSIONS: Our study demonstrates how regulatory annotation and the association signal derived from eQTL-mapping can be combined into a single framework. We used this approach to further our understanding of the biology that drives human gene expression variation, and of the putatively causal SNPs that underlie it.


Subjects
DNA-Binding Proteins/genetics , Deoxyribonuclease I , Gene Expression , Quantitative Trait Loci/genetics , Nucleic Acid Regulatory Sequences/genetics , Bayes Theorem , Cell Line , Chromatin/genetics , Deoxyribonuclease I/genetics , Deoxyribonuclease I/metabolism , Human Genome , Genotype , HapMap Project , Humans , Single Nucleotide Polymorphism , Genetic Promoter Regions , Transcription Factors/genetics
10.
Nature ; 482(7385): 390-4, 2012 Feb 05.
Article in English | MEDLINE | ID: mdl-22307276

ABSTRACT

The mapping of expression quantitative trait loci (eQTLs) has emerged as an important tool for linking genetic variation to changes in gene regulation. However, it remains difficult to identify the causal variants underlying eQTLs, and little is known about the regulatory mechanisms by which they act. Here we show that genetic variants that modify chromatin accessibility and transcription factor binding are a major mechanism through which genetic variation leads to gene expression differences among humans. We used DNase I sequencing to measure chromatin accessibility in 70 Yoruba lymphoblastoid cell lines, for which genome-wide genotypes and estimates of gene expression levels are also available. We obtained a total of 2.7 billion uniquely mapped DNase I-sequencing (DNase-seq) reads, which allowed us to produce genome-wide maps of chromatin accessibility for each individual. We identified 8,902 locations at which the DNase-seq read depth correlated significantly with genotype at a nearby single nucleotide polymorphism or insertion/deletion (false discovery rate = 10%). We call such variants 'DNase I sensitivity quantitative trait loci' (dsQTLs). We found that dsQTLs are strongly enriched within inferred transcription factor binding sites and are frequently associated with allele-specific changes in transcription factor binding. A substantial fraction (16%) of dsQTLs are also associated with variation in the expression levels of nearby genes (that is, these loci are also classified as eQTLs). Conversely, we estimate that as many as 55% of eQTL single nucleotide polymorphisms are also dsQTLs. Our observations indicate that dsQTLs are highly abundant in the human genome and are likely to be important contributors to phenotypic variation.


Subjects
DNA Footprinting , Deoxyribonuclease I/metabolism , Gene Expression Regulation/genetics , Genetic Variation/genetics , Quantitative Trait Loci/genetics , Chromatin/genetics , Chromatin/metabolism , Gene Expression Profiling , Human Genome/genetics , Humans , Phenotype , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis , Transcription Factors/metabolism
11.
Genome Biol ; 12(1): R10, 2011.
Article in English | MEDLINE | ID: mdl-21251332

ABSTRACT

BACKGROUND: DNA methylation is an essential epigenetic mechanism involved in gene regulation and disease, but little is known about the mechanisms underlying inter-individual variation in methylation profiles. Here we measured methylation levels at 22,290 CpG dinucleotides in lymphoblastoid cell lines from 77 HapMap Yoruba individuals, for which genome-wide gene expression and genotype data were also available. RESULTS: Association analyses of methylation levels with more than three million common single nucleotide polymorphisms (SNPs) identified 180 CpG-sites in 173 genes that were associated with nearby SNPs (putatively in cis, usually within 5 kb) at a false discovery rate of 10%. The most intriguing trans signal was obtained for SNP rs10876043 in the disco-interacting protein 2 homolog B gene (DIP2B, previously postulated to play a role in DNA methylation), that had a genome-wide significant association with the first principal component of patterns of methylation; however, we found only modest signal of trans-acting associations overall. As expected, we found significant negative correlations between promoter methylation and gene expression levels measured by RNA-sequencing across genes. Finally, there was a significant overlap of SNPs that were associated with both methylation and gene expression levels. CONCLUSIONS: Our results demonstrate a strong genetic component to inter-individual variation in DNA methylation profiles. Furthermore, there was an enrichment of SNPs that affect both methylation and gene expression, providing evidence for shared mechanisms in a fraction of genes.


Subjects
DNA Methylation , Genetic Epigenesis , Gene Expression Regulation , HapMap Project , Cell Line , Human Genome , Genome-Wide Association Study , Genotype , Histones/metabolism , Humans , Single Nucleotide Polymorphism , Genetic Promoter Regions , Quantitative Trait Loci , Genetic Transcription
12.
Genome Res ; 21(3): 447-55, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21106904

ABSTRACT

Accurate functional annotation of regulatory elements is essential for understanding global gene regulation. Here, we report a genome-wide map of 827,000 transcription factor binding sites in human lymphoblastoid cell lines, which comprises sites corresponding to 239 position weight matrices of known transcription factor binding motifs, and 49 novel sequence motifs. To generate this map, we developed a probabilistic framework that integrates cell- or tissue-specific experimental data such as histone modifications and DNase I cleavage patterns with genomic information such as gene annotation and evolutionary conservation. Comparison to empirical ChIP-seq data suggests that our method is highly accurate yet has the advantage of targeting many factors in a single assay. We anticipate that this approach will be a valuable tool for genome-wide studies of gene regulation in a wide variety of cell types or tissues under diverse conditions.
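
To make the notion of a position weight matrix concrete, the sketch below scores candidate sites in a DNA sequence against a motif. It covers only the sequence-scoring ingredient, not the probabilistic framework that integrates histone, DNase I, and conservation data. The uniform background frequency, the pseudocount, and the function names are assumptions for illustration.

```python
import math

BASES = "ACGT"


def pwm_log_odds(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into a log2-odds weight matrix.

    counts: list of dicts mapping base -> observed count, one per motif position.
    background: assumed uniform background frequency for every base.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_match(pwm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

A genome-wide scan would apply the same window scoring along each chromosome and keep windows above a chosen threshold as candidate binding sites.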


Subjects
Binding Sites/genetics , Chromatin/metabolism , Oligonucleotide Array Sequence Analysis/methods , DNA Sequence Analysis , Transcription Factors/metabolism , Cultured Cells , Chromatin/genetics , Chromatin Immunoprecipitation , Computational Biology , DNA Cleavage , Genome , Histones/metabolism , Humans , Molecular Sequence Annotation , Position-Specific Scoring Matrices , Protein Binding , Nucleic Acid Regulatory Sequences , Transcription Factors/genetics , Genetic Transcription
13.
Mol Ecol ; 19(12): 2501-15, 2010 Jun 01.
Article in English | MEDLINE | ID: mdl-20497321

ABSTRACT

The southeastern coastal plain of the United States is a region marked by extraordinary phylogeographic congruence that is frequently attributed to the changing sea levels that occurred during the glacial-interglacial cycles of the Pleistocene epoch. A phylogeographic break corresponding to the Apalachicola River has been suggested for many species studied to date that are endemic to this region. Here, we used this pattern of phylogeographic congruence to develop and test explicit hypotheses about the genetic structure in the ornate chorus frog (Pseudacris ornata). Using 1299 bp of mtDNA sequence and seven nuclear microsatellite markers in 13 natural populations of P. ornata, we found three clades corresponding to geographically distinct regions; one spans the Apalachicola River (Southern Clade), one encompasses Georgia and South Carolina (Central Clade) and a third comprises more northerly individuals (Northern Clade). However, it does not appear that typical phylogeographic barriers demarcate these clades. Instead, isolation by distance across the range of the entire species explained the pattern of genetic variation that we observed. We propose that P. ornata was historically widespread in the southeastern United States, and that a balance between genetic drift and migration was the root of the genetic divergence among populations. Additionally, we investigated fine-scale patterns of genetic structure and found the spatial scale at which there was significant genetic structure varied among the regions studied. Furthermore, we discuss our results in light of other phylogeographic studies of southeastern coastal plain organisms and in relation to amphibian conservation and management.


Subjects
Anura/genetics , Molecular Evolution , Population Genetics , Phylogeny , Algorithms , Animals , Bayes Theorem , Cluster Analysis , Mitochondrial DNA/genetics , Genotype , Geography , Microsatellite Repeats , Molecular Sequence Data , DNA Sequence Analysis , United States
14.
Nature ; 464(7289): 768-72, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-20220758

ABSTRACT

Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals.


Subjects
Gene Expression Profiling , Gene Expression Regulation/genetics , Genetic Variation/genetics , Messenger RNA/analysis , Messenger RNA/genetics , Genetic Transcription/genetics , Alleles , Black People/genetics , Consensus Sequence/genetics , Complementary DNA/genetics , Exons/genetics , Humans , Nigeria , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , RNA Splice Sites/genetics , RNA Sequence Analysis
15.
Bioinformatics ; 25(24): 3207-12, 2009 Dec 15.
Article in English | MEDLINE | ID: mdl-19808877

ABSTRACT

MOTIVATION: Next-generation sequencing has become an important tool for genome-wide quantification of DNA and RNA. However, a major technical hurdle lies in the need to map short sequence reads back to their correct locations in a reference genome. Here, we investigate the impact of SNP variation on the reliability of read-mapping in the context of detecting allele-specific expression (ASE). RESULTS: We generated 16 million 35 bp reads from mRNA of each of two HapMap Yoruba individuals. When we mapped these reads to the human genome we found that, at heterozygous SNPs, there was a significant bias toward higher mapping rates of the allele in the reference sequence, compared with the alternative allele. Masking known SNP positions in the genome sequence eliminated the reference bias but, surprisingly, did not lead to more reliable results overall. We find that even after masking, approximately 5-10% of SNPs still have an inherent bias toward more effective mapping of one allele. Filtering out inherently biased SNPs removes 40% of the top signals of ASE. The remaining SNPs showing ASE are enriched in genes previously known to harbor cis-regulatory variation or known to show uniparental imprinting. Our results have implications for a variety of applications involving detection of alternate alleles from short-read sequence data. AVAILABILITY: Scripts, written in Perl and R, for simulating short reads, masking SNP variation in a reference genome and analyzing the simulation output are available upon request from JFD. Raw short read data were deposited in GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE18156. CONTACT: jdegner@uchicago.edu; marioni@uchicago.edu; gilad@uchicago.edu; pritch@uchicago.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
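
The masking step mentioned in this abstract can be illustrated with a short sketch. It is not the authors' Perl or R code: the function name is hypothetical, positions are taken to be 0-based, and replacing each known SNP with `N` is an assumption, since the abstract does not say which character was used.

```python
def mask_snps(reference, snp_positions, mask_char="N"):
    """Return a copy of the reference with known SNP positions masked.

    reference: reference sequence as a string.
    snp_positions: iterable of 0-based positions of known SNPs.
    mask_char: character written at each SNP position (assumed to be 'N').
    """
    bases = list(reference)
    for pos in snp_positions:
        if not 0 <= pos < len(bases):
            raise IndexError(f"SNP position {pos} is outside the reference")
        bases[pos] = mask_char
    return "".join(bases)
```

Masking makes reads carrying either allele mismatch the reference equally at the SNP itself. As the abstract notes, this removes the reference bias but does not by itself guarantee unbiased mapping at every SNP.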


Subjects
Alleles , Computational Biology/methods , RNA Sequence Analysis/methods , Base Sequence , Gene Expression Profiling , Human Genome , Humans , Single Nucleotide Polymorphism , Software
16.
Mol Ecol Resour ; 9(2): 622-4, 2009 Mar.
Article in English | MEDLINE | ID: mdl-21564710

ABSTRACT

We describe the cloning and characterization of eight novel tetranucleotide microsatellite loci in the ornate chorus frog (Pseudacris ornata). We also screened 26 loci from GenBank that were isolated from other Pseudacris species and obtained consistent product from five of these dinucleotide loci. All loci are polymorphic. In our sample of 26 frogs from a natural population, polymorphism ranged from 1 to 22 alleles per locus with expected heterozygosities ranging from 0 to 0.958. These loci enable high-resolution studies of P. ornata. Moreover, cross-species amplification success suggests they will also be useful for other chorus frog species.
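
For readers unfamiliar with the statistic, expected heterozygosity at a locus is one minus the sum of squared allele frequencies. The sketch below computes that basic form from a list of observed allele copies; the abstract does not say whether a small-sample correction was applied, and the function name is hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Expected heterozygosity, 1 - sum(p_i^2), for one locus.

    alleles: list of observed allele labels, two entries per diploid individual.
    """
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())
```

A locus with a single allele gives a value of 0, and the value approaches 1 as the number of equally frequent alleles grows, which is consistent with the range reported above. A commonly used unbiased variant multiplies the result by n / (n - 1), where n is the number of sampled allele copies.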
