1.
Nature ; 482(7385): 390-4, 2012 Feb 05.
Article in English | MEDLINE | ID: mdl-22307276

ABSTRACT

The mapping of expression quantitative trait loci (eQTLs) has emerged as an important tool for linking genetic variation to changes in gene regulation. However, it remains difficult to identify the causal variants underlying eQTLs, and little is known about the regulatory mechanisms by which they act. Here we show that genetic variants that modify chromatin accessibility and transcription factor binding are a major mechanism through which genetic variation leads to gene expression differences among humans. We used DNase I sequencing to measure chromatin accessibility in 70 Yoruba lymphoblastoid cell lines, for which genome-wide genotypes and estimates of gene expression levels are also available. We obtained a total of 2.7 billion uniquely mapped DNase I-sequencing (DNase-seq) reads, which allowed us to produce genome-wide maps of chromatin accessibility for each individual. We identified 8,902 locations at which the DNase-seq read depth correlated significantly with genotype at a nearby single nucleotide polymorphism or insertion/deletion (false discovery rate = 10%). We call such variants 'DNase I sensitivity quantitative trait loci' (dsQTLs). We found that dsQTLs are strongly enriched within inferred transcription factor binding sites and are frequently associated with allele-specific changes in transcription factor binding. A substantial fraction (16%) of dsQTLs are also associated with variation in the expression levels of nearby genes (that is, these loci are also classified as eQTLs). Conversely, we estimate that as many as 55% of eQTL single nucleotide polymorphisms are also dsQTLs. Our observations indicate that dsQTLs are highly abundant in the human genome and are likely to be important contributors to phenotypic variation.
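
The 10% false discovery rate threshold used to call dsQTLs can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure. This is an assumption made for illustration: the abstract does not say which FDR method was used, and the function name and the p-values below are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return a list of booleans: True where the test is called significant.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k with p_(k) <= (k / m) * fdr, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_k = rank
    significant = [False] * m
    for idx in order[:largest_k]:
        significant[idx] = True
    return significant


# Hypothetical p-values for five genotype / read-depth association tests.
example_p = [0.001, 0.04, 0.03, 0.20, 0.80]
calls = benjamini_hochberg(example_p, fdr=0.10)
```

With these made-up inputs the three smallest p-values pass the threshold. A genome-wide analysis would apply the same rule to millions of tests.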


Subjects
DNA Footprinting , Deoxyribonuclease I/metabolism , Gene Expression Regulation/genetics , Genetic Variation/genetics , Quantitative Trait Loci/genetics , Chromatin/genetics , Chromatin/metabolism , Gene Expression Profiling , Human Genome/genetics , Humans , Phenotype , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis , Transcription Factors/metabolism
2.
BMC Genomics ; 18(1): 286, 2017 Apr 08.
Article in English | MEDLINE | ID: mdl-28390408

ABSTRACT

BACKGROUND: Human endogenous retroviruses (HERVs) have received much attention for their implications in the etiology of many human diseases and their profound effect on evolution. Notably, recent studies have highlighted associations between HERV expression and cancers (Yu et al., Int J Mol Med 32, 2013), autoimmunity (Balada et al., Int Rev Immunol 29:351-370, 2010) and neurological (Christensen, J Neuroimmune Pharmacol 5:326-335, 2010) conditions. Their repetitive nature makes their study particularly challenging, and expression studies have largely focused on individual loci (De Parseval et al., J Virol 77:10414-10422, 2003) or general trends within families (Forsman et al., J Virol Methods 129:16-30, 2005; Seifarth et al., J Virol 79:341-352, 2005; Pichon et al., Nucleic Acids Res 34:e46, 2006). METHODS: To refine our understanding of HERV activity, we introduce here a new microarray, HERV-V3. This work was made possible by the careful detection and annotation of genomic HERV/MaLR sequences as well as the development of a new hybridization model, allowing the optimization of probe performance and the control of cross-reactions. RESULTS: HERV-V3 offers an almost complete coverage of HERVs and their ancestors (mammalian apparent LTR-retrotransposons, MaLRs) at the locus level along with four other repertoires (active LINE-1 elements, lncRNA, a selection of 1559 human genes and common infectious viruses). We demonstrate that the analytical performance of HERV-V3 is comparable with that of commercial Affymetrix arrays, and that for a selection of tissue/pathological specific loci, the patterns of expression measured on HERV-V3 are consistent with those reported in the literature. CONCLUSIONS: Given its large HERV/MaLR coverage and additional repertoires, HERV-V3 opens the door to multiple applications such as the identification of enhancers, alternative promoters and biomarkers, as well as the characterization of gene and HERV/MaLR modulation caused by viral infection.


Subjects
Endogenous Retroviruses/genetics , Gene Expression Profiling , Genetic Hybridization , Genetic Models , Transcriptome , Algorithms , Cluster Analysis , Computational Biology/methods , Nucleic Acid Databases , Gene Expression Profiling/methods , Genetic Loci , Humans , Nucleic Acid Hybridization , Reproducibility of Results , Workflow
3.
Bioinformatics ; 32(12): 1779-87, 2016 Jun 15.
Article in English | MEDLINE | ID: mdl-26833346

ABSTRACT

MOTIVATION: Alignment-based taxonomic binning for metagenome characterization proceeds in two steps: reads mapping against a reference database (RDB) and taxonomic assignment according to the best hits. Beyond the sequencing technology and the completeness of the RDB, selecting the optimal configuration of the workflow, in particular the mapper parameters and the best hit selection threshold, to get the highest binning performance remains quite empirical. RESULTS: We developed a statistical framework to perform such optimization at a minimal computational cost. Using an optimization experimental design and simulated datasets for three sequencing technologies, we built accurate prediction models for five performance indicators and then derived the parameter configuration providing the optimal performance. Whatever the mapper and the dataset, we observed that the optimal configuration yielded better performance than the default configuration and that the best hit selection threshold had a large impact on performance. Finally, on a reference dataset from the Human Microbiome Project, we confirmed that the optimized configuration increased the performance compared with the default configuration. AVAILABILITY AND IMPLEMENTATION: Not applicable. CONTACT: magali.dancette@biomerieux.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Metagenomics , Algorithms , Humans , Metagenome , Microbiota , Theoretical Models
4.
Bioinformatics ; 32(7): 1023-32, 2016 Apr 01.
Article in English | MEDLINE | ID: mdl-26589281

ABSTRACT

MOTIVATION: Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions. RESULTS: We propose a new rank-flexible machine learning-based compositional approach for taxonomic assignment of metagenomics reads and show that it benefits from increasing the number of fragments sampled from reference genomes to tune its parameters, up to a coverage of about 10, and from increasing the k-mer size to about 12. Tuning the method involves training machine learning models on about 10^8 samples in 10^7 dimensions, which is out of reach of standard software but can be done efficiently with modern implementations for large-scale machine learning. The resulting method is competitive in terms of accuracy with well-established alignment and composition-based tools for problems involving a small to moderate number of candidate species and for reasonable amounts of sequencing errors. We show, however, that machine learning-based compositional approaches are still limited in their ability to deal with problems involving a greater number of species and are more sensitive to sequencing errors. We finally show that the new method outperforms the state-of-the-art in its ability to classify reads from species of lineage absent from the reference database and confirm that compositional approaches achieve faster prediction times, with a gain of 2-17 times with respect to the BWA-MEM short read mapper, depending on the number of candidate species and the level of sequencing noise. AVAILABILITY AND IMPLEMENTATION: Data and codes are available at http://cbio.ensmp.fr/largescalemetagenomics CONTACT: pierre.mahe@biomerieux.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
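
The compositional idea above, representing a read by the k-mers it contains, can be sketched as follows. This is only an illustration of k-mer counting, not the authors' implementation; the function names and the toy read are hypothetical, and the classifier that would consume these vectors is omitted.

```python
from collections import Counter
from itertools import product


def kmer_counts(read, k):
    """Count every overlapping k-mer made only of A, C, G, T in a read."""
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip k-mers containing N, etc.
            counts[kmer] += 1
    return counts


def kmer_vector(read, k):
    """Dense feature vector with one dimension per possible k-mer (4**k)."""
    counts = kmer_counts(read, k)
    vocabulary = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[kmer] for kmer in vocabulary]


# A small k keeps the dense vector readable; the abstract reports k of about 12.
profile = kmer_counts("ACGTACGT", 3)
vector = kmer_vector("ACGTACGT", 2)
dimensions_at_k12 = 4 ** 12
```

At k = 12 there are 4**12 = 16,777,216 possible k-mers, which is of the same order as the roughly ten million dimensions quoted in the abstract. A practical implementation would store such vectors sparsely rather than as dense lists.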


Subjects
Machine Learning , Metagenomics , DNA Sequence Analysis , Algorithms , Metagenome , Software
5.
Nature ; 464(7289): 768-72, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-20220758

ABSTRACT

Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals.


Subjects
Gene Expression Profiling , Gene Expression Regulation/genetics , Genetic Variation/genetics , Messenger RNA/analysis , Messenger RNA/genetics , Genetic Transcription/genetics , Alleles , Black People/genetics , Consensus Sequence/genetics , Complementary DNA/genetics , Exons/genetics , Humans , Nigeria , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , RNA Splice Sites/genetics , RNA Sequence Analysis
6.
BMC Bioinformatics ; 16: 106, 2015 Mar 28.
Article in English | MEDLINE | ID: mdl-25880752

ABSTRACT

BACKGROUND: Construction and validation of a prognostic model for survival data in the clinical domain is still an active field of research. Nevertheless there is no consensus on how to develop routine prognostic tests based on a combination of RT-qPCR biomarkers and clinical or demographic variables. In particular, the estimation of the model performance requires to properly account for the RT-qPCR experimental design. RESULTS: We present a strategy to build, select, and validate a prognostic model for survival data based on a combination of RT-qPCR biomarkers and clinical or demographic data and we provide an illustration on a real clinical dataset. First, we compare two cross-validation schemes: a classical outcome-stratified cross-validation scheme and an alternative one that accounts for the RT-qPCR plate design, especially when samples are processed by batches. The latter is intended to limit the performance discrepancies, also called the validation surprise, between the training and the test sets. Second, strategies for model building (covariate selection, functional relationship modeling, and statistical model) as well as performance indicators estimation are presented. Since in practice several prognostic models can exhibit similar performances, complementary criteria for model selection are discussed: the stability of the selected variables, the model optimism, and the impact of the omitted variables on the model performance. CONCLUSION: On the training dataset, appropriate resampling methods are expected to prevent from any upward biases due to unaccounted technical and biological variability that may arise from the experimental and intrinsic design of the RT-qPCR assay. Moreover, the stability of the selected variables, the model optimism, and the impact of the omitted variables on the model performances are pivotal indicators to select the optimal model to be validated on the test dataset.
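
The second cross-validation scheme described above, one that accounts for the RT-qPCR plate design, can be sketched as a fold assignment that never splits a plate between folds. This is a simplified illustration under stated assumptions: the balancing heuristic, function name, and sample/plate identifiers are hypothetical, and outcome stratification is not handled.

```python
from collections import defaultdict


def plate_aware_folds(plate_of_sample, n_folds):
    """Assign whole plates to folds so no plate is split across folds.

    plate_of_sample maps a sample ID to its RT-qPCR plate ID. Plates are
    placed largest first into the currently smallest fold to balance sizes.
    Returns a list of folds, each a sorted list of sample IDs.
    """
    samples_by_plate = defaultdict(list)
    for sample, plate in plate_of_sample.items():
        samples_by_plate[plate].append(sample)

    folds = [[] for _ in range(n_folds)]
    # Sort by descending plate size, then plate ID, for a deterministic result.
    plates = sorted(samples_by_plate, key=lambda p: (-len(samples_by_plate[p]), p))
    for plate in plates:
        smallest = min(folds, key=len)
        smallest.extend(samples_by_plate[plate])
    return [sorted(fold) for fold in folds]


# Hypothetical design: six samples processed on three plates.
design = {"s1": "P1", "s2": "P1", "s3": "P2", "s4": "P2", "s5": "P3", "s6": "P3"}
folds = plate_aware_folds(design, n_folds=3)
```

Because every plate lands in exactly one fold, a model is never tested on samples from a batch it was trained on, which is the kind of leakage such a scheme is meant to limit.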


Subjects
Gene Expression , Proportional Hazards Models , Real-Time Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction , Biomarkers , Humans , Prognosis , Septic Shock/mortality
7.
Bioinformatics ; 30(1): 40-9, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-24130309

ABSTRACT

MOTIVATION: Paired-end sequencing allows circumventing the shortness of the reads produced by second generation sequencers and is essential for de novo assembly of genomes. However, obtaining a finished genome from short reads is still an open challenge. We present an algorithm that exploits the pairing information issued from inserts of potentially any length. The method determines paths through an overlaps graph by using a constrained search tree. We also present a method that automatically determines suited overlaps cutoffs according to the contextual coverage, reducing thus the need for manual parameterization. Finally, we introduce an interactive mode that allows querying an assembly at targeted regions. RESULTS: We assess our methods by assembling two Staphylococcus aureus strains that were sequenced on the Illumina platform. Using 100 bp paired-end reads and minimal manual curation, we produce a finished genome sequence for the previously undescribed isolate SGH-10-168. AVAILABILITY AND IMPLEMENTATION: The presented algorithms are implemented in the standalone Edena software, freely available under the General Public License (GPLv3) at www.genomic.ch/edena.php.
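
The overlaps graph mentioned above can be illustrated with a naive sketch that links reads whose suffix matches another read's prefix. This is not how Edena is implemented: it ignores reverse complements, pairing information and the constrained search tree, runs in quadratic time, and uses hypothetical function names and toy reads.

```python
def overlap_length(left, right, min_length):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no such overlap of at least min_length exists.
    """
    longest = min(len(left), len(right))
    for length in range(longest, min_length - 1, -1):
        if left[-length:] == right[:length]:
            return length
    return 0


def build_overlap_graph(reads, min_length):
    """Map each ordered read pair (a, b), a != b, to its overlap length."""
    graph = {}
    for a in reads:
        for b in reads:
            if a != b:
                length = overlap_length(a, b, min_length)
                if length:
                    graph[(a, b)] = length
    return graph


# Toy reads that tile a short sequence with 4-base overlaps.
toy_reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
graph = build_overlap_graph(toy_reads, min_length=3)
```

Here `min_length` plays the role of an overlap cutoff, the parameter the abstract says is chosen automatically from the contextual coverage.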


Subjects
Chromosome Mapping/methods , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Staphylococcus aureus/genetics , Algorithms , Base Sequence , Molecular Sequence Data , DNA Sequence Analysis/methods , Software
8.
Bioinformatics ; 30(9): 1280-6, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24443381

ABSTRACT

MOTIVATION: Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry has been broadly adopted by routine clinical microbiology laboratories for bacterial species identification. An isolated colony of the targeted microorganism is the single prerequisite. Currently, MS-based microbial identification directly from clinical specimens can not be routinely performed, as it raises two main challenges: (i) the nature of the sample itself may increase the level of technical variability and bring heterogeneity with respect to the reference database and (ii) the possibility of encountering polymicrobial samples that will yield a 'mixed' MS fingerprint. In this article, we introduce a new method to infer the composition of polymicrobial samples on the basis of a single mass spectrum. Our approach relies on a penalized non-negative linear regression framework making use of species-specific prototypes, which can be derived directly from the routine reference database of pure spectra. RESULTS: A large spectral dataset obtained from in vitro mono- and bi-microbial samples allowed us to evaluate the performance of the method in a comprehensive way. Provided that the reference matrix-assisted laser desorption/ionization time-of-flight mass spectrometry fingerprints were sufficiently distinct for the individual species, the method automatically predicted which bacterial species were present in the sample. Only few samples (5.3%) were misidentified, and bi-microbial samples were correctly identified in up to 61.2% of the cases. This method could be used in routine clinical microbiology practice.
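
A penalized non-negative linear regression of a mixed spectrum on species prototypes can be sketched as below. The abstract does not specify the penalty or the solver, so the L1 penalty, the cyclic coordinate descent, the function name, and the toy prototypes are all assumptions made for illustration.

```python
def nonnegative_lasso(prototypes, spectrum, penalty=0.0, n_sweeps=200):
    """Minimise 0.5 * ||spectrum - sum_j b_j * prototypes[j]||^2 + penalty * sum(b)
    subject to b_j >= 0, by cyclic coordinate descent.

    prototypes: list of species prototype vectors (all the same length).
    spectrum:   observed intensity vector of that length.
    """
    coefs = [0.0] * len(prototypes)
    residual = list(spectrum)  # spectrum minus the current fitted mixture
    for _ in range(n_sweeps):
        for j, proto in enumerate(prototypes):
            norm_sq = sum(x * x for x in proto)
            if norm_sq == 0.0:
                continue
            # Correlation of prototype j with the residual that excludes it.
            rho = sum(x * (r + coefs[j] * x) for x, r in zip(proto, residual))
            new_coef = max(0.0, (rho - penalty) / norm_sq)
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - delta * x for r, x in zip(residual, proto)]
                coefs[j] = new_coef
    return coefs


# Hypothetical prototypes over four mass bins, and a 2:1 mixture of A and B.
species_a = [1.0, 0.0, 1.0, 0.0]
species_b = [0.0, 1.0, 1.0, 0.0]
species_c = [0.0, 0.0, 0.0, 1.0]
mixed = [2.0, 1.0, 3.0, 0.0]
estimate = nonnegative_lasso([species_a, species_b, species_c], mixed)
```

Species with a coefficient above some small threshold would be reported as present. In this toy case the weights for A and B converge to about 2 and 1, and the weight for C stays at 0.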


Subjects
Gram-Negative Bacteria/chemistry , Gram-Positive Bacteria/isolation & purification , Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry/methods , Automation , Genetic Databases , Gram-Negative Bacteria/isolation & purification , Linear Models
9.
PLoS Genet ; 8(9): e1002958, 2012 Sep.
Article in English | MEDLINE | ID: mdl-23028365

ABSTRACT

Natural populations are known to differ not only in DNA but also in their chromatin-associated epigenetic marks. When such inter-individual epigenomic differences (or "epi-polymorphisms") are observed, their stability is usually not known: they may or may not be reprogrammed over time or upon environmental changes. In addition, their origin may be purely epigenetic, or they may result from regulatory variation encoded in the DNA. Studying epi-polymorphisms requires, therefore, an assessment of their nature and stability. Here we estimate the stability of yeast epi-polymorphisms of chromatin acetylation, and we provide a genome-by-epigenome map of their genetic control. A transient epi-drug treatment was able to reprogram acetylation variation at more than one thousand nucleosomes, whereas a similar amount of variation persisted, distinguishing "labile" from "persistent" epi-polymorphisms. Hundreds of genetic loci underlied acetylation variation at 2,418 nucleosomes either locally (in cis) or distantly (in trans), and this genetic control overlapped only partially with the genetic control of gene expression. Trans-acting regulators were not necessarily associated with genes coding for chromatin modifying enzymes. Strikingly, "labile" and "persistent" epi-polymorphisms were associated with poor and strong genetic control, respectively, showing that genetic modifiers contribute to persistence. These results estimate the amount of natural epigenomic variation that can be lost after transient environmental exposures, and they reveal the complex genetic architecture of the DNA-encoded determinism of chromatin epi-polymorphisms. Our observations provide a basis for the development of population epigenetics.


Subjects
Chromatin/genetics , Genetic Epigenesis/genetics , Histone-Lysine N-Methyltransferase , Genetic Polymorphism , Saccharomyces cerevisiae , Acetylation , Fungal Gene Expression Regulation , Population Genetics , Histone-Lysine N-Methyltransferase/genetics , Histone-Lysine N-Methyltransferase/metabolism , Histones/genetics , Histones/metabolism , Nucleosomes/metabolism , Single Nucleotide Polymorphism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
10.
PLoS Genet ; 8(10): e1003000, 2012.
Article in English | MEDLINE | ID: mdl-23071454

ABSTRACT

Recent gene expression QTL (eQTL) mapping studies have provided considerable insight into the genetic basis for inter-individual regulatory variation. However, a limitation of all eQTL studies to date, which have used measurements of steady-state gene expression levels, is the inability to directly distinguish between variation in transcription and decay rates. To address this gap, we performed a genome-wide study of variation in gene-specific mRNA decay rates across individuals. Using a time-course study design, we estimated mRNA decay rates for over 16,000 genes in 70 Yoruban HapMap lymphoblastoid cell lines (LCLs), for which extensive genotyping data are available. Considering mRNA decay rates across genes, we found that: (i) as expected, highly expressed genes are generally associated with lower mRNA decay rates, (ii) genes with rapid mRNA decay rates are enriched with putative binding sites for miRNA and RNA binding proteins, and (iii) genes with similar functional roles tend to exhibit correlated rates of mRNA decay. Focusing on variation in mRNA decay across individuals, we estimate that steady-state expression levels are significantly correlated with variation in decay rates in 10% of genes. Somewhat counter-intuitively, for about half of these genes, higher expression is associated with faster decay rates, possibly due to a coupling of mRNA decay with transcriptional processes in genes involved in rapid cellular responses. Finally, we used these data to map genetic variation that is specifically associated with variation in mRNA decay rates across individuals. We found 195 such loci, which we named RNA decay quantitative trait loci ("rdQTLs"). All the observed rdQTLs are located near the regulated genes and therefore are assumed to act in cis. By analyzing our data within the context of known steady-state eQTLs, we estimate that a substantial fraction of eQTLs are associated with inter-individual variation in mRNA decay rates.
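
Estimating an mRNA decay rate from a time course can be sketched with a log-linear fit. The abstract does not describe the model used, so first-order (exponential) decay is an assumption here, and the function names and measurements are hypothetical.

```python
import math


def decay_rate(times, abundances):
    """Least-squares slope of ln(abundance) on time, returned as a positive rate.

    Assumes first-order decay, N(t) = N0 * exp(-k * t), so ln N(t) is linear
    in t with slope -k. All abundances must be strictly positive.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return -sxy / sxx


def half_life(rate):
    """Time for abundance to halve under first-order decay."""
    return math.log(2) / rate


# Hypothetical time course (hours) in which abundance halves every 2 hours.
hours = [0.0, 2.0, 4.0, 8.0]
levels = [100.0, 50.0, 25.0, 6.25]
k = decay_rate(hours, levels)
```

Calling `half_life(k)` on this toy series gives approximately 2 hours. Per-gene rates estimated this way in each individual are the kind of quantitative trait that could then be tested against genotype.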


Subjects
Gene Expression , Genetic Variation , Quantitative Trait Loci , RNA Stability , Cell Line , Chromosome Mapping , Gene Expression Profiling , Gene Expression Regulation , Genome-Wide Association Study , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Interference
11.
PLoS Genet ; 6(4): e1000913, 2010 Apr 22.
Article in English | MEDLINE | ID: mdl-20421933

ABSTRACT

Epigenomes commonly refer to the sequence of presence/absence of specific epigenetic marks along eukaryotic chromatin. Complete histone-borne epigenomes have now been described at single-nucleosome resolution from various organisms, tissues, developmental stages, or diseases, yet their intra-species natural variation has never been investigated. We describe here that the epigenomic sequence of histone H3 acetylation at Lysine 14 (H3K14ac) differs greatly between two unrelated strains of the yeast Saccharomyces cerevisiae. Using single-nucleosome chromatin immunoprecipitation and mapping, we interrogated 58,694 nucleosomes and found that 5,442 of them differed in their level of H3K14 acetylation, at a false discovery rate (FDR) of 0.0001. These Single Nucleosome Epi-Polymorphisms (SNEPs) were enriched at regulatory sites and conserved non-coding DNA sequences. Surprisingly, higher acetylation in one strain did not imply higher expression of the relevant gene. However, SNEPs were enriched in genes of high transcriptional variability and one SNEP was associated with the strength of gene activation upon stimulation. Our observations suggest a high level of inter-individual epigenomic variation in natural populations, with essential questions on the origin of this diversity and its relevance to gene x environment interactions.


Subjects
Genetic Epigenesis , Nucleosomes/metabolism , Single Nucleotide Polymorphism , Saccharomyces cerevisiae/genetics , Acetylation , Conserved Sequence , Fungal Genome , Saccharomyces cerevisiae/metabolism
12.
BMC Plant Biol ; 11: 16, 2011 Jan 19.
Article in English | MEDLINE | ID: mdl-21247437

ABSTRACT

BACKGROUND: Integrating QTL results from independent experiments performed on related species helps to survey the genetic diversity of loci/alleles underlying complex traits, and to highlight potential targets for breeding or QTL cloning. Potato (Solanum tuberosum L.) late blight resistance has been thoroughly studied, generating mapping data for many Rpi-genes (R-genes to Phytophthora infestans) and QTLs (quantitative trait loci). Moreover, late blight resistance was often associated with plant maturity. To get insight into the genomic organization of late blight resistance loci as compared to maturity QTLs, a QTL meta-analysis was performed for both traits. RESULTS: Nineteen QTL publications for late blight resistance were considered, seven of which reported maturity QTLs. Twenty-one QTL maps and eight reference maps were compiled to construct a 2,141-marker consensus map on which QTLs were projected and clustered into meta-QTLs. The whole-genome QTL meta-analysis reduced the number of late blight resistance QTLs six-fold (by clustering 144 QTLs into 24 meta-QTLs), the number of maturity QTLs ca. five-fold (by clustering 42 QTLs into eight meta-QTLs), and the mean QTL confidence interval ca. two-fold. Late blight resistance meta-QTLs were observed on every chromosome and maturity meta-QTLs on only six chromosomes. CONCLUSIONS: Meta-analysis helped to refine the genomic regions of interest frequently described, and provided the closest flanking markers. Meta-QTLs of late blight resistance and maturity juxtaposed along chromosomes IV, V and VIII, and overlapped on chromosomes VI and XI. The distribution of late blight resistance meta-QTLs is significantly independent from those of Rpi-genes, resistance gene analogs and defence-related loci. The anchorage of meta-QTLs to the potato genome sequence, recently publicly released, will especially improve the candidate gene selection to determine the genes underlying meta-QTLs. All mapping data are available from the Sol Genomics Network (SGN) database.


Subjects
Innate Immunity/genetics , Phytophthora infestans/physiology , Plant Diseases/immunology , Quantitative Trait Loci/genetics , Heritable Quantitative Trait , Solanum tuberosum/genetics , Solanum tuberosum/microbiology , Chromosome Mapping , Plant Genes/genetics , Genetic Association Studies , Genetic Markers , Plant Diseases/genetics , Plant Diseases/microbiology
13.
Theor Appl Genet ; 123(6): 907-26, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21761163

ABSTRACT

Earliness is very important for the adaptation of wheat to environmental conditions and the achievement of high grain yield. A detailed knowledge of key genetic components of the life cycle would enable an easier control by the breeders. The objective of the study was to investigate the effect of candidate genes on flowering time. Using a collection of hexaploid wheat composed of 235 lines from diverse geographical origins, we conducted an association study for six candidate genes for flowering time and its components (vernalization sensitivity and earliness per se). The effect on the variation of earliness components of polymorphisms within the copies of each gene was tested in ANOVA models accounting for the underlying genetic structure. The collection was structured in five groups that minimized the residual covariance. Vernalization requirement and lateness tend to increase according to the mean latitude of each group. Heading date for an autumnal sowing was mainly determined by the earliness per se. Except for the Constans (CO) gene orthologous of the barley HvCO3, all gene polymorphisms had a significant impact on earliness components. The three traits used to quantify vernalization requirement were primarily associated with polymorphisms at Vrn-1 and then at Vrn-3 and Luminidependens (LD) genes. We found a good correspondence between spring/winter types and genotypes at the three homeologous copies of Vrn-1. Earliness per se was mainly explained by polymorphisms at Vrn-3 and to a lesser extent at Vrn-1, Hd-1 and Gigantea (GI) genes. Vernalization requirement and earliness as a function of geographical origin, as well as the possible role of the breeding practices in the geographical distribution of the alleles and the hypothetical adaptive value of the candidate genes, are discussed.


Subjects
Flowers/genetics , Flowers/physiology , Triticum/genetics , Triticum/physiology , Alleles , Base Sequence , Chromosome Mapping , Plant Gene Expression Regulation , Plant Genes , Genetic Association Studies , Genetic Variation , Genotype , Haplotypes , Linkage Disequilibrium , Multigene Family , Phenotype , Plant Proteins/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci , Sequence Alignment , DNA Sequence Analysis
14.
PLoS Genet ; 4(10): e1000214, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18846210

ABSTRACT

Recent studies of the HapMap lymphoblastoid cell lines have identified large numbers of quantitative trait loci for gene expression (eQTLs). Reanalyzing these data using a novel Bayesian hierarchical model, we were able to create a surprisingly high-resolution map of the typical locations of sites that affect mRNA levels in cis. Strikingly, we found a strong enrichment of eQTLs in the 250 bp just upstream of the transcription end site (TES), in addition to an enrichment around the transcription start site (TSS). Most eQTLs lie either within genes or close to genes; for example, we estimate that only 5% of eQTLs lie more than 20 kb upstream of the TSS. After controlling for position effects, SNPs in exons are approximately 2-fold more likely than SNPs in introns to be eQTLs. Our results suggest an important role for mRNA stability in determining steady-state mRNA levels, and highlight the potential of eQTL mapping as a high-resolution tool for studying the determinants of gene regulation.


Subjects
Chromosome Mapping/methods , Gene Expression Regulation , Quantitative Trait Loci , Bayes Theorem , Chromosome Mapping/statistics & numerical data , Genetic Databases , Genetic Complementation Test , Human Genome , Humans , Introns , Genetic Models , Single Nucleotide Polymorphism , RNA Stability , Messenger RNA/genetics , Messenger RNA/metabolism , Genetic Terminator Regions , Transcription Initiation Site
15.
Mol Biol Evol ; 26(3): 649-58, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19091723

ABSTRACT

Changes in gene expression may represent an important mode of human adaptation. However, to date, there are relatively few known examples in which selection has been shown to act directly on levels or patterns of gene expression. In order to test whether single nucleotide polymorphisms (SNPs) that affect gene expression in cis are frequently targets of positive natural selection in humans, we analyzed genome-wide SNP and expression data from cell lines associated with the International HapMap Project. Using a haplotype-based test for selection that was designed to detect incomplete selective sweeps, we found that SNPs showing signals of selection are more likely than random SNPs to be associated with gene expression levels in cis. This signal is significant in the Yoruba (which is the population that shows the strongest signals of selection overall) and shows a trend in the same direction in the other HapMap populations. Our results argue that selection on gene expression levels is an important type of human adaptation. Finally, our work provides an analytical framework for tackling a more general problem that will become increasingly important: namely, testing whether selection signals overlap significantly with SNPs that are associated with phenotypes of interest.


Subjects
Gene Expression Regulation , Human Genome , Single Nucleotide Polymorphism , Genetic Selection , Cell Line , Gene Expression Profiling , Haplotypes , Humans
16.
Genetics ; 178(4): 2433-7, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18430961

ABSTRACT

An association study conducted on 375 maize inbred lines indicates a strong relationship between Vgt1 polymorphisms and flowering time, extending former quantitative trait loci (QTL) mapping results. Analysis of allele frequencies in a landrace collection supports a key role of Vgt1 in maize altilatitudinal adaptation.


Subjects
Physiological Adaptation , Chromosome Mapping , Flowers/genetics , Flowers/physiology , Plant Proteins/genetics , Quantitative Trait Loci/genetics , Zea mays/genetics , Zea mays/physiology , Ecosystem , Gene Frequency , Plant Genes/genetics , Genotype , Geography , Linkage Disequilibrium , Genetic Polymorphism
17.
Front Microbiol ; 9: 511, 2018.
Article in English | MEDLINE | ID: mdl-29616014

ABSTRACT

The French National Reference Center for Staphylococci currently uses DNA arrays and spa typing for the initial epidemiological characterization of Staphylococcus aureus strains. We here describe the use of whole-genome sequencing (WGS) to investigate retrospectively four distinct and virulent S. aureus lineages [clonal complexes (CCs): CC1, CC5, CC8, CC30] involved in hospital and community outbreaks or sporadic infections in France. We used a WGS bioinformatics pipeline based on de novo assembly (reference-free approach), single nucleotide polymorphism analysis, and on the inclusion of epidemiological markers. We examined the phylogeographic diversity of the French dominant hospital-acquired CC8-MRSA (methicillin-resistant S. aureus) Lyon clone through WGS analysis which did not demonstrate evidence of large-scale geographic clustering. We analyzed sporadic cases along with two outbreaks of a CC1-MSSA (methicillin-susceptible S. aureus) clone containing the Panton-Valentine leukocidin (PVL) and results showed that two sporadic cases were closely related. We investigated an outbreak of PVL-positive CC30-MSSA in a school environment and were able to reconstruct the transmission history between eight families. We explored different outbreaks among newborns due to the CC5-MRSA Geraldine clone and we found evidence of an unsuspected link between two otherwise distinct outbreaks. Here, WGS provides the resolving power to disprove transmission events indicated by conventional methods (same sequence type, spa type, toxin profile, and antibiotic resistance profile) and, most importantly, WGS can reveal unsuspected transmission events. Therefore, WGS allows a better description and understanding of outbreaks and (inter-)national dissemination of S. aureus lineages. Our findings underscore the importance of adding WGS for (inter-)national surveillance of infections caused by virulent clones of S. aureus but also substantiate the fact that technological optimization at the bioinformatics level is still urgently needed for routine use. However, the greatest limitation of WGS analysis is the completeness and the correctness of the reference database being used and the conversion of floods of data into actionable results. The WGS bioinformatics pipeline (EpiSeq™) we used here can easily generate a uniform database and associated metadata for epidemiological applications.

18.
BMC Bioinformatics ; 8: 49, 2007 Feb 08.
Article in English | MEDLINE | ID: mdl-17288608

ABSTRACT

BACKGROUND: Integration of multiple results from Quantitative Trait Loci (QTL) studies is key to understanding the genetic determinism of complex traits. Up to now many efforts have been made by public database developers to facilitate the storage, compilation and visualization of multiple QTL mapping experiment results. However, studying the congruency between these results still remains a complex task. Presently, the few computational and statistical frameworks to do so are mainly based on empirical methods (e.g. consensus genetic maps are generally built by iterative projection). RESULTS: In this article, we present a new computational and statistical package, called MetaQTL, for carrying out whole-genome meta-analysis of QTL mapping experiments. Contrary to existing methods, MetaQTL offers a complete statistical process to establish a consensus model for both the marker and the QTL positions on the whole genome. First, MetaQTL implements a new statistical approach to merge multiple distinct genetic maps into a single consensus map which is optimal in terms of weighted least squares and can be used to investigate recombination rate heterogeneity between studies. Secondly, assuming that QTL can be projected on the consensus map, MetaQTL offers a new clustering approach based on a Gaussian mixture model to decide how many QTL underlie the distribution of the observed QTL. CONCLUSION: We demonstrate using simulations that the usual model choice criteria from mixture model literature perform relatively well in this context. As expected, simulations also show that this new clustering algorithm leads to a reduction in the length of the confidence interval of QTL location provided that across studies there are enough observed QTL for each underlying true QTL location. The usefulness of our approach is illustrated on published QTL detection results of flowering time in maize. Finally, MetaQTL is freely available at http://bioinformatics.org/mqtl.


Subjects
Chromosome Mapping/methods , Meta-Analysis as Topic , Quantitative Trait Loci/genetics , Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , User-Computer Interface , Computer Graphics , Genetic Markers/genetics
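The clustering idea in the MetaQTL abstract above (a Gaussian mixture over projected QTL positions, with the number of components picked by a model choice criterion) can be sketched in a few lines. This is a simplified illustration and not the MetaQTL implementation: it assumes each observed QTL has a known positional standard deviation (for example, derived from its confidence interval), fits only the component means and weights by EM, and uses BIC as the criterion. All function names and data are hypothetical.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def e_step(positions, sds, means, weights):
    """Return responsibilities and the log-likelihood (log-sum-exp form)."""
    resp, log_lik = [], 0.0
    for x, s in zip(positions, sds):
        logs = [
            math.log(w) + log_normal_pdf(x, m, s) if w > 0 else float("-inf")
            for w, m in zip(weights, means)
        ]
        top = max(logs)
        total = sum(math.exp(v - top) for v in logs)
        log_lik += top + math.log(total)
        resp.append([math.exp(v - top) / total for v in logs])
    return resp, log_lik


def fit_qtl_mixture(positions, sds, k, n_iter=200):
    """EM for k 'true QTL' positions; per-observation sds are held fixed."""
    n = len(positions)
    ordered = sorted(positions)
    # Deterministic start: spread the initial means over the quantiles.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp, _ = e_step(positions, sds, means, weights)
        for j in range(k):
            weights[j] = sum(r[j] for r in resp) / n
            # Precision-weighted mean: precise observations count more.
            num = sum(r[j] * x / (s * s) for r, x, s in zip(resp, positions, sds))
            den = sum(r[j] / (s * s) for r, s in zip(resp, sds))
            if den > 0:
                means[j] = num / den
    _, log_lik = e_step(positions, sds, means, weights)
    # Free parameters: k means plus (k - 1) weights.
    bic = -2.0 * log_lik + (2 * k - 1) * math.log(n)
    return means, weights, bic


def choose_n_qtl(positions, sds, max_k=3):
    """Pick the number of components with the lowest BIC."""
    fits = {k: fit_qtl_mixture(positions, sds, k) for k in range(1, max_k + 1)}
    best_k = min(fits, key=lambda k: fits[k][2])
    return best_k, fits[best_k]


if __name__ == "__main__":
    # Hypothetical QTL positions (cM) reported by eight studies.
    positions = [10.0, 12.0, 11.0, 13.0, 50.0, 52.0, 51.0, 49.0]
    sds = [2.0] * len(positions)
    best_k, (means, weights, bic) = choose_n_qtl(positions, sds)
    print(best_k, [round(m, 2) for m in means])
```

Holding the per-observation standard deviations fixed keeps the likelihood bounded, so the sketch avoids the degenerate zero-variance solutions that affect mixtures with free variances.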
19.
Genetics ; 172(4): 2449-63, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16415370

ABSTRACT

To investigate the genetic basis of maize adaptation to temperate climate, collections of 375 inbred lines and 275 landraces, representative of American and European diversity, were evaluated for flowering time under short- and long-day conditions. The inbred line collection was genotyped for 55 genomewide simple sequence repeat (SSR) markers. Comparison of inbred line population structure with that of landraces, as determined with 24 SSR loci, underlined strong effects of both historical and modern selection on population structure and a clear relationship with geographical origins. The late tropical groups and the early "Northern Flint" group from the northern United States and northern Europe exhibited different flowering times. Both collections were genotyped for a 6-bp insertion/deletion in the Dwarf8 (D8idp) gene, previously reported to be potentially involved in flowering time variation in a 102 American inbred panel. Among-group D8idp differentiation was much higher than that for any SSR marker, suggesting diversifying selection. Correcting for population structure, D8idp was associated with flowering time under long-day conditions, the deletion allele showing an average earlier flowering of 29 degree days for inbreds and 145 degree days for landraces. Additionally, the deletion allele occurred at a high frequency (>80%) in Northern Flint while being almost absent (<5%) in tropical materials. Altogether, these results indicate that Dwarf8 could be involved in maize climatic adaptation through diversifying selection for flowering time.


Subjects
Climate , Plant Proteins/genetics , Genetic Polymorphism , Zea mays/genetics , Alleles , Gene Deletion , Plant Genes , Genetic Variation , Population Genetics , Plant Genome , Genotype , Geography , Repetitive Nucleic Acid Sequences , Time Factors
20.
Int J Antimicrob Agents ; 50(2): 210-218, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28554735

ABSTRACT

Genetic determinants of antibiotic resistance (AR) have been extensively investigated. High-throughput sequencing allows for the assessment of the relationship between genotype and phenotype. A panel of 672 Pseudomonas aeruginosa strains was analysed, including representatives of globally disseminated multidrug-resistant and extensively drug-resistant clones; genomes and multiple antibiograms were available. This panel was annotated for AR gene presence and polymorphism, defining a resistome in which integrons were included. Integrons were present in >70 distinct cassettes, with In5 being the most prevalent. Some cassettes closely associated with clonal complexes, whereas others spread across the phylogenetic diversity, highlighting the importance of horizontal transfer. A resistome-wide association study (RWAS) was performed for clinically relevant antibiotics by correlating the variability in minimum inhibitory concentration (MIC) values with resistome data. Resistome annotation identified 147 loci associated with AR. These loci consisted mainly of acquired genomic elements and intrinsic genes. The RWAS allowed for correct identification of resistance mechanisms for meropenem, amikacin, levofloxacin and cefepime, and added 46 novel mutations. Among these, 29 were variants of the oprD gene associated with variation in meropenem MIC. Using genomic and MIC data, phenotypic AR was successfully correlated with molecular determinants at the whole-genome sequence level.


Subjects
Anti-Bacterial Agents/pharmacology , Bacterial Drug Resistance , Bacterial Genes , Genotype , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/genetics , Genetic Loci , Humans , Interspersed Repetitive Sequences , Microbial Sensitivity Tests , Pseudomonas Infections/microbiology , Pseudomonas aeruginosa/isolation & purification