ABSTRACT
Self-renewal and pluripotency of the embryonic stem cell (ESC) state are established and maintained by multiple regulatory networks that comprise transcription factors and epigenetic regulators. While much has been learned regarding transcription factors, the function of epigenetic regulators in these networks is less well defined. We conducted a CRISPR-Cas9-mediated loss-of-function genetic screen that identified two epigenetic regulators of the mouse ESC (mESC) state, TAF5L and TAF6L, which are components or co-activators of the GNAT-HAT complexes. Detailed molecular studies demonstrate that TAF5L/TAF6L transcriptionally activate c-Myc and Oct4 and their corresponding MYC and CORE regulatory networks. Moreover, TAF5L/TAF6L predominantly regulate their target genes through H3K9ac deposition and c-MYC recruitment that eventually activate the MYC regulatory network for self-renewal of mESCs. Thus, our findings uncover a role of TAF5L/TAF6L in directing the MYC regulatory network that orchestrates gene expression programs to control self-renewal for the maintenance of the mESC state.
Subjects
Embryonic Stem Cells/metabolism , Gene Regulatory Networks , Induced Pluripotent Stem Cells/metabolism , Proto-Oncogene Proteins c-myc/genetics , TATA-Binding Protein Associated Factors/genetics , Animals , CRISPR-Cas Systems , Cell Cycle/genetics , Cell Proliferation , Cellular Reprogramming , Embryo, Mammalian , Embryonic Stem Cells/cytology , Epigenesis, Genetic , Fibroblasts/cytology , Fibroblasts/metabolism , Gene Editing , Gene Expression Regulation , HEK293 Cells , Histones/genetics , Histones/metabolism , Humans , Induced Pluripotent Stem Cells/cytology , Mice , Primary Cell Culture , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proto-Oncogene Proteins c-myc/metabolism , Signal Transduction , TATA-Binding Protein Associated Factors/metabolism
ABSTRACT
Keratolytic winter erythema (KWE) is a rare autosomal-dominant skin disorder characterized by recurrent episodes of palmoplantar erythema and epidermal peeling. KWE was previously mapped to 8p23.1-p22 (KWE critical region) in South African families. Using targeted resequencing of the KWE critical region in five South African families and SNP array and whole-genome sequencing in two Norwegian families, we identified two overlapping tandem duplications of 7.67 kb (South Africans) and 15.93 kb (Norwegians). The duplications segregated with the disease and were located upstream of CTSB, a gene encoding cathepsin B, a cysteine protease involved in keratinocyte homeostasis. Included in the 2.62 kb overlapping region of these duplications is an enhancer element that is active in epidermal keratinocytes. The activity of this enhancer correlated with CTSB expression in normal differentiating keratinocytes and other cell lines, but not with FDFT1 or NEIL2 expression. Gene expression (qPCR) analysis and immunohistochemistry of the palmar epidermis demonstrated significantly increased expression of CTSB, as well as stronger staining of cathepsin B in the stratum granulosum of affected individuals than in that of control individuals. Analysis of higher-order chromatin structure data and RNA polymerase II ChIA-PET data from MCF-7 cells did not suggest remote effects of the enhancer. In conclusion, KWE in South African and Norwegian families is caused by tandem duplications in a non-coding genomic region containing an active enhancer element for CTSB, resulting in upregulation of this gene in affected individuals.
Subjects
Cathepsin B/metabolism , Enhancer Elements, Genetic , Erythema/genetics , Gene Duplication , Gene Expression Regulation , Keratosis/genetics , Skin Diseases, Genetic/genetics , Case-Control Studies , Cathepsin B/genetics , Chromosome Mapping , Chromosomes, Human, Pair 8/genetics , DNA Copy Number Variations , DNA Glycosylases/genetics , DNA Glycosylases/metabolism , DNA-(Apurinic or Apyrimidinic Site) Lyase/genetics , DNA-(Apurinic or Apyrimidinic Site) Lyase/metabolism , Epidermis/metabolism , Epigenomics , Erythema/epidemiology , Female , Genetic Markers , Humans , Keratinocytes/metabolism , Keratosis/epidemiology , MCF-7 Cells , Male , Norway/epidemiology , Pedigree , Skin Diseases, Genetic/epidemiology , South Africa/epidemiology
ABSTRACT
The prefrontal cortex (PFC) is one of the latest brain regions to mature, which allows the acquisition of complex cognitive abilities through experience. To unravel the underlying gene expression changes during postnatal development, we performed RNA-sequencing (RNA-seq) in the rat medial PFC (mPFC) at five developmental time points from infancy to adulthood, and analyzed the differential expression of protein-coding genes, long intergenic noncoding RNAs (lincRNAs), and alternative exons. We showed that most expression changes occur in infancy, and that the number of differentially expressed genes reduces toward adulthood. We observed 137 differentially expressed lincRNAs and 796 genes showing alternative exon usage during postnatal development. Importantly, we detected a genetic switch from neuronal network establishment in infancy to maintenance of neural networks in adulthood based on gene expression dynamics, involving changes in protein-coding and lincRNA gene expression as well as alternative exon usage. Our gene expression datasets provide insights into the multifaceted transcriptional regulation of the developing PFC. They can be used to study the basic developmental processes of the mPFC and to understand the mechanisms of neurodevelopmental and neuropsychiatric disorders. Our study provides an important contribution to the ongoing efforts to complete the "brain map", and to the understanding of PFC development.
Subjects
Gene Expression Regulation, Developmental/physiology , Neurons/physiology , Prefrontal Cortex/cytology , Prefrontal Cortex/growth & development , Age Factors , Animals , Animals, Newborn , Gene Expression Profiling , Gene Ontology , Genome-Wide Association Study , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Rats , Rats, Wistar
ABSTRACT
The transcription factor p63 plays a pivotal role in keratinocyte proliferation and differentiation in the epidermis. However, how p63 regulates epidermal genes during differentiation is not yet clear. Using epigenome profiling of differentiating human primary epidermal keratinocytes, we characterized a catalog of dynamically regulated genes and p63-bound regulatory elements that are relevant for epithelial development and related diseases. p63-bound regulatory elements occur as single or clustered enhancers, and remarkably, only a subset is active as defined by the co-presence of the active enhancer mark histone modification H3K27ac in epidermal keratinocytes. We show that the dynamics of gene expression correlates with the activity of p63-bound enhancers rather than with p63 binding itself. The activity of p63-bound enhancers is likely determined by other transcription factors that cooperate with p63. Our data show that inactive p63-bound enhancers in epidermal keratinocytes may be active during the development of other epithelial-related structures such as limbs and suggest that p63 bookmarks genomic loci during the commitment of the epithelial lineage and regulates genes through temporal- and spatial-specific active enhancers.
Subjects
Cell Differentiation , Enhancer Elements, Genetic , Epidermal Cells , Gene Expression Regulation , Keratinocytes/cytology , Transcription Factors/genetics , Tumor Suppressor Proteins/genetics , Cell Lineage , Genetic Loci , Humans , Transcription Factors/metabolism , Tumor Suppressor Proteins/metabolism
ABSTRACT
BACKGROUND: The CCCTC-binding factor (CTCF) protein is involved in genome organization, including mediating three-dimensional chromatin interactions. Human patient lymphocytes with mutations in a single copy of the CTCF gene have reduced expression of enhancer-associated genes involved in response to stimuli. We hypothesize that CTCF interactions stabilize enhancer-promoter chromatin interaction domains, facilitating increased expression of genes in response to stimuli. Here we systematically investigate this model using computational analyses. RESULTS: We use CTCF ChIA-PET data from the ENCODE project to show that CTCF-associated chromatin loops have a tendency to enclose regions of enhancer-regulated stimulus responsive genes, insulating them from neighboring regions of constitutively expressed housekeeping genes. To facilitate cell type-specific CTCF loop identification, we develop an algorithm to predict CTCF loops from ChIP-seq data alone by exploiting the CTCF motif directionality in loop anchors. We apply this algorithm to a hundred ENCODE cell line datasets, confirming the universality of our observations as well as identifying a general distinction between primary and immortal cells in loop-enclosed gene content. Finally, we combine the existing evidence to propose a model for the formation of CTCF loops in which partner sites are brought together by chromatin template reeling through stationary RNA polymerases, consistent with the transcription factory hypothesis. CONCLUSIONS: We provide computational evidence that CTCF-mediated chromatin interactions enclose domains of stimulus responsive enhancer-regulated genes, insulating them from nearby housekeeping genes.
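The idea of predicting loops from motif directionality can be illustrated with a minimal sketch. It is not the authors' algorithm: it simply assumes that loop anchors tend to carry convergently oriented CTCF motifs (forward strand on the left anchor, reverse strand on the right) and pairs each forward-oriented site with the nearest downstream reverse-oriented site. The `Site` type, the `predict_loops` function, the `max_span` limit, and the coordinates are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Site(NamedTuple):
    chrom: str
    pos: int
    strand: str  # orientation of the CTCF motif under the ChIP-seq peak


def predict_loops(sites: List[Site], max_span: int = 1_000_000) -> List[Tuple[Site, Site]]:
    """Pair each '+' motif with the nearest downstream '-' motif (assumed rule)."""
    by_chrom: Dict[str, List[Site]] = {}
    for site in sites:
        by_chrom.setdefault(site.chrom, []).append(site)

    loops: List[Tuple[Site, Site]] = []
    for chrom_sites in by_chrom.values():
        chrom_sites.sort(key=lambda s: s.pos)
        for i, left in enumerate(chrom_sites):
            if left.strand != "+":
                continue  # a left anchor needs a forward-oriented motif
            for right in chrom_sites[i + 1:]:
                if right.pos - left.pos > max_span:
                    break  # too far away to be considered a partner
                if right.strand == "-":
                    loops.append((left, right))  # convergent pair found
                    break
    return loops


# Hypothetical peak positions and motif orientations.
sites = [
    Site("chr1", 1_000, "+"),
    Site("chr1", 5_000, "+"),
    Site("chr1", 9_000, "-"),
    Site("chr1", 20_000, "-"),
]
loops = predict_loops(sites, max_span=10_000)
loop_positions = [(left.pos, right.pos) for left, right in loops]
print(loop_positions)
```

A real predictor would also have to weigh peak strength and motif score, which this sketch ignores.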
Subjects
Chromatin/chemistry , Enhancer Elements, Genetic , Promoter Regions, Genetic , Repressor Proteins/chemistry , CCCTC-Binding Factor , Chromatin Immunoprecipitation , Gene Expression Regulation , Humans , K562 Cells , MCF-7 Cells , Repressor Proteins/genetics
ABSTRACT
Orofacial clefts (OFCs) represent a large fraction of human birth defects and are one of the most common phenotypes affected by large copy number variants (CNVs). Due to the limited number of CNV patients in individual centers, CNV analyses of a large number of OFC patients are challenging. The present study analyzed 249 genomic deletions and 226 duplications from a cohort of 312 OFC patients reported in two publicly accessible databases of chromosome imbalance and phenotype in humans, DECIPHER and ECARUCA. Genomic regions deleted or duplicated in multiple patients were identified, and genes in these overlapping CNVs were prioritized based on the number of genes encompassed by the region and gene expression in embryonic mouse palate. Our analyses of these overlapping CNVs identified two genes known to be causative for human OFCs, SATB2 and MEIS2, and 12 genes (DGCR6, FGF2, FRZB, LETM1, MAPK3, SPRY1, THBS1, TSHZ1, TTC28, TULP4, WHSC1, WHSC2) that are associated with OFC or orofacial development. Additionally, we report 34 deleted and 24 duplicated genes that have not previously been associated with OFCs but are associated with the BMP, MAPK and RAC1 pathways. Statistical analyses show that the high number of overlapping CNVs is not due to random occurrence. The identified genes are not located in highly variable genomic regions in healthy populations and are significantly enriched for genes that are involved in orofacial development. In summary, we report a CNV analysis pipeline of a large cohort of OFC patients and identify novel candidate OFC genes.
Subjects
Cleft Lip/genetics , Cleft Palate/genetics , DNA Copy Number Variations , Face/abnormalities , Genetic Predisposition to Disease , Humans , Phenotype
ABSTRACT
An increasing number of genes involved in chromatin structure and epigenetic regulation have been implicated in a variety of developmental disorders, often including intellectual disability. By trio exome sequencing and subsequent mutational screening, we have now identified two de novo frameshift mutations and one de novo missense mutation in CTCF in individuals with intellectual disability, microcephaly, and growth retardation. Furthermore, an individual with a larger deletion including CTCF was identified. CTCF (CCCTC-binding factor) is one of the most important chromatin organizers in vertebrates and is involved in various chromatin regulation processes such as higher-order chromatin organization, enhancer function, and maintenance of three-dimensional chromatin structure. Transcriptome analyses in all three individuals with point mutations revealed deregulation of genes involved in signal transduction and emphasized the role of CTCF in enhancer-driven expression of genes. Our findings indicate that haploinsufficiency of CTCF affects genomic interaction of enhancers and their regulated gene promoters that drive developmental processes and cognition.
Subjects
Frameshift Mutation , Intellectual Disability/genetics , Mutation, Missense , Repressor Proteins/genetics , Adolescent , CCCTC-Binding Factor , Child , Child, Preschool , Chromatin/genetics , Chromatin/metabolism , DNA Mutational Analysis , Enhancer Elements, Genetic , Exome , Female , Gene Expression Profiling , Gene Expression Regulation , Genome, Human , Haploinsufficiency , Humans , Male , Microcephaly/genetics , Point Mutation , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Signal Transduction
ABSTRACT
Intellectual Disability (ID) disorders, defined by an IQ below 70, are genetically and phenotypically highly heterogeneous. Identification of common molecular pathways underlying these disorders is crucial for understanding the molecular basis of cognition and for the development of therapeutic intervention strategies. To systematically establish their functional connectivity, we used transgenic RNAi to target 270 ID gene orthologs in the Drosophila eye. Assessment of neuronal function in behavioral and electrophysiological assays and multiparametric morphological analysis identified phenotypes associated with knockdown of 180 ID gene orthologs. Most of these genotype-phenotype associations were novel. For example, we uncovered 16 genes that are required for basal neurotransmission and have not previously been implicated in this process in any system or organism. ID gene orthologs with morphological eye phenotypes, in contrast to genes without phenotypes, are relatively highly expressed in the human nervous system and are enriched for neuronal functions, suggesting that eye phenotyping can distinguish different classes of ID genes. Indeed, grouping genes by Drosophila phenotype uncovered 26 connected functional modules. Novel links between ID genes successfully predicted that MYCN, PIGV and UPF3B regulate synapse development. Drosophila phenotype groups show, in addition to ID, significant phenotypic similarity also in humans, indicating that functional modules are conserved. The combined data indicate that ID disorders, despite their extreme genetic diversity, are caused by disruption of a limited number of highly connected functional modules.
Subjects
Eye/metabolism , Intellectual Disability/genetics , Metabolic Networks and Pathways/genetics , Synapses/genetics , Animals , Animals, Genetically Modified , Drosophila/genetics , Eye/growth & development , Gene Knockdown Techniques , Genetic Variation , Humans , Intellectual Disability/metabolism , Intellectual Disability/pathology , Neurons/metabolism , Phenotype , RNA Interference , Synapses/metabolism
ABSTRACT
BACKGROUND: Candidate disease gene prediction is a rapidly developing area of bioinformatics research with the potential to deliver great benefits to human health. As experimental studies detecting associations between genetic intervals and disease proliferate, better bioinformatic techniques that can expand and exploit the data are required. DESCRIPTION: Gentrepid is a web resource which predicts and prioritizes candidate disease genes for both Mendelian and complex diseases. The system can take input from linkage analysis of single genetic intervals or multiple marker loci from genome-wide association studies. The underlying database of the Gentrepid tool sources data from numerous gene and protein resources, taking advantage of the wealth of biological information available. Using known disease gene information from OMIM, the system predicts and prioritizes disease gene candidates that participate in the same protein pathways or share similar protein domains. Alternatively, using an ab initio approach, the system can detect enrichment of these protein annotations without prior knowledge of the phenotype. CONCLUSIONS: The system aims to integrate the wealth of protein information currently available with known and novel phenotype/genotype information to acquire knowledge of biological mechanisms underpinning disease. We have updated the system to facilitate analysis of GWAS data and the study of complex diseases. Application of the system to GWAS data on hypertension using the ICBP data is provided as an example. An interesting prediction is a ZIP transporter additional to the one found by the ICBP analysis. The webserver URL is https://www.gentrepid.org/.
Subjects
Computational Biology/methods , Databases, Genetic , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Internet , Humans , Phenotype
ABSTRACT
Heterozygous mutations in p63 are associated with split hand/foot malformations (SHFM), orofacial clefting, and ectodermal abnormalities. Elucidation of the p63 gene network that includes target genes and regulatory elements may reveal new genes for other malformation disorders. We performed genome-wide DNA-binding profiling by chromatin immunoprecipitation (ChIP), followed by deep sequencing (ChIP-seq) in primary human keratinocytes, and identified potential target genes and regulatory elements controlled by p63. We show that p63 binds to an enhancer element in the SHFM1 locus on chromosome 7q and that this element controls expression of DLX6 and possibly DLX5, both of which are important for limb development. A unique micro-deletion including this enhancer element, but not the DLX5/DLX6 genes, was identified in a patient with SHFM. Our study strongly indicates disruption of a non-coding cis-regulatory element located more than 250 kb from the DLX5/DLX6 genes as a novel disease mechanism in SHFM1. These data provide a proof-of-concept that the catalogue of p63 binding sites identified in this study may be of relevance to the studies of SHFM and other congenital malformations that resemble the p63-associated phenotypes.
Subjects
Chromosomes, Human, Pair 7/genetics , Enhancer Elements, Genetic , Gene Expression Regulation, Developmental , Homeodomain Proteins/genetics , Limb Deformities, Congenital/genetics , Membrane Proteins/metabolism , Proteasome Endopeptidase Complex/genetics , Transcription Factors/genetics , Animals , Base Sequence , Binding Sites , Cells, Cultured , Child, Preschool , Chromatin Immunoprecipitation , Chromosomes, Human, Pair 7/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Female , Genome-Wide Association Study , Homeodomain Proteins/metabolism , Humans , Keratinocytes/metabolism , Limb Deformities, Congenital/metabolism , Male , Membrane Proteins/genetics , Mice , Molecular Sequence Data , Proteasome Endopeptidase Complex/metabolism , Protein Binding , Transcription Factors/metabolism , Zebrafish
ABSTRACT
Enhancers are important cis-regulatory elements controlling cell-type specific expression patterns of genes. Furthermore, combinations of enhancers and minimal promoters are utilized to construct small, artificial promoters for gene delivery vectors. Large-scale functional screening methodology to construct genomic maps of enhancer activities has been successfully established in cultured cell lines but has not yet been applied to terminally differentiated cells and tissues in a living animal. Here, we transposed the Self-Transcribing Active Regulatory Region Sequencing (STARR-seq) technique to the mouse brain using adeno-associated viruses (AAV) for the delivery of a highly complex screening library tiling entire genomic regions and covering in total 3 Mb of the mouse genome. We identified 483 sequences with enhancer activity, including sequences that were not predicted by DNA accessibility or histone marks. Characterizing the expression patterns of fluorescent reporters controlled by nine candidate sequences, we observed differential expression patterns also in sparse cell types. Together, our study provides an entry point for the unbiased study of enhancer activities in organisms during health and disease.
Subjects
Enhancer Elements, Genetic , Genomics , Animals , Mice , Genomics/methods , Chromosome Mapping/methods , Promoter Regions, Genetic , Brain
ABSTRACT
Disease networks are increasingly explored as a complement to networks centered around interactions between genes and proteins. The quality of disease networks is heavily dependent on the amount and quality of phenotype information in phenotype databases of human genetic diseases. We explored which aspects of phenotype database architecture and content best reflect the underlying biology of disease. We used the OMIM-based HPO, Orphanet, and POSSUM phenotype databases for this purpose and devised a biological coherence score based on the sharing of gene ontology annotation to investigate the degree to which phenotype similarity in these databases reflects related pathobiology. Our analyses support the notion that a fine-grained phenotype ontology enhances the accuracy of phenome representation. In addition, we find that the OMIM database that is most used by the human genetics community is heavily underannotated. We show that this problem can easily be overcome by simply adding data available in the POSSUM database to improve OMIM phenotype representations in the HPO. Also, we find that the use of feature frequency estimates--currently implemented only in the Orphanet database--significantly improves the quality of the phenome representation. Our data suggest that there is much to be gained by improving human phenome databases and that some of the measures needed to achieve this are relatively easy to implement. More generally, we propose that curation and more systematic annotation of human phenome databases can greatly improve the power of the phenotype for genetic disease analysis.
Subjects
Chromosome Mapping/methods , Genotype , Phenotype , Algorithms , Databases, Genetic , Genetic Predisposition to Disease , Genome, Human , Humans , Models, Genetic , Models, Statistical , Multigene Family
ABSTRACT
Concurrent mutation of a RAS oncogene and the tumor suppressor p53 is common in tumorigenesis, and inflammation can promote RAS-driven tumorigenesis without the need to mutate p53. Here, we show, using a well-established mutant RAS and an inflammation-driven mouse skin tumor model, that loss of the p53 inhibitor iASPP facilitates tumorigenesis. Specifically, iASPP regulates expression of a subset of p63 and AP1 targets, including genes involved in skin differentiation and inflammation, suggesting that loss of iASPP in keratinocytes supports a tumor-promoting inflammatory microenvironment. Mechanistically, JNK-mediated phosphorylation regulates iASPP function and inhibits iASPP binding with AP1 components, such as JUND, via PXXP/SH3 domain-mediated interaction. Our results uncover a JNK-iASPP-AP1 regulatory axis that is crucial for tissue homeostasis. We show that iASPP is a tumor suppressor and an AP1 coregulator.
Subjects
Repressor Proteins , Tumor Suppressor Protein p53 , Animals , Mice , Cell Transformation, Neoplastic/genetics , Inflammation/genetics , Intracellular Signaling Peptides and Proteins/genetics , Intracellular Signaling Peptides and Proteins/metabolism , Repressor Proteins/metabolism , Tumor Microenvironment , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism , MAP Kinase Kinase 4/metabolism , Transcription Factor AP-1/metabolism
ABSTRACT
Human phenomics is about to come of age with studies that systematically assess the overlap and relationships among all human genetic diseases. A recent study by Andrey Rzhetsky and colleagues illustrates the power of phenomics by revealing links between conditions that were thought to be distinct, suggesting that they share a genetic basis. Their results imply that the human phenome can be viewed as a landscape of interrelated diseases, reflecting overlapping molecular causation.
Subjects
Genetic Variation , Phenotype , Genetic Diseases, Inborn/genetics , Genetic Testing , Genomics , Genotype , Humans
ABSTRACT
The comparison of fully sequenced genomes enables the study of selective constraints that determine genome organisation. We show that, in fungi, adjacent divergently transcribed (<---->) genes are more conserved in orientation than convergent (--><--) or co-oriented (-->-->) gene pairs. Furthermore, the length of time over which the divergent orientation of two genes is conserved correlates with the degree of their co-expression and with the likelihood of their being functionally related. The functional interactions of the proteins encoded by the conserved divergent gene pairs indicate a potential for protein function prediction in eukaryotes.
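The three orientation classes named above can be made concrete with a short sketch. It assumes genes are given as hypothetical (name, start, strand) tuples on one chromosome and classifies each pair of neighbours by strand alone; the function names and the example genes are illustrative and do not come from the study.

```python
from typing import List, Tuple

Gene = Tuple[str, int, str]  # (name, start coordinate, strand "+" or "-")


def classify_pair(upstream_strand: str, downstream_strand: str) -> str:
    """Classify two adjacent genes, given in order of chromosomal position."""
    if upstream_strand == "-" and downstream_strand == "+":
        return "divergent"    # <-- -->
    if upstream_strand == "+" and downstream_strand == "-":
        return "convergent"   # --> <--
    return "co-oriented"      # --> --> or <-- <--


def adjacent_pairs(genes: List[Gene]) -> List[Tuple[str, str, str]]:
    """Return (gene, neighbour, orientation class) for each adjacent pair."""
    ordered = sorted(genes, key=lambda g: g[1])
    return [
        (a[0], b[0], classify_pair(a[2], b[2]))
        for a, b in zip(ordered, ordered[1:])
    ]


# Hypothetical genes on a single chromosome.
genes = [
    ("geneA", 100, "-"),
    ("geneB", 900, "+"),
    ("geneC", 2000, "-"),
    ("geneD", 3100, "-"),
]
pairs = adjacent_pairs(genes)
for pair in pairs:
    print(pair)
```

Comparing these labels for orthologous gene pairs across genomes would then show which orientations are retained, which is the kind of comparison the abstract describes.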
Subjects
Fungi/genetics , Transcription, Genetic/physiology , Animals , Conserved Sequence/genetics , Evolution, Molecular
ABSTRACT
BACKGROUND: Genome-wide association studies (GWAS) aim to identify causal variants and genes for complex disease by independently testing a large number of SNP markers for disease association. Although genes have been implicated in these studies, few utilise the multiple-hit model of complex disease to identify causal candidates. A major benefit of multi-locus comparison is that it compensates for some shortcomings of current statistical analyses that test the frequency of each SNP in isolation for the phenotype population versus control. RESULTS: Here we developed and benchmarked several protocols for GWAS data analysis using different in-silico gene prediction and prioritisation methodologies. We adopted a high sensitivity approach to the data, using less conservative statistical SNP associations. Multiple gene search spaces, either fixed-width or proximity-based, were generated around each SNP marker. We used the candidate disease gene prediction system Gentrepid to identify candidates based on shared biomolecular pathways or domain-based protein homology. Predictions were made either with phenotype-specific known disease genes as input or, without a priori knowledge, by exhaustive comparison of genes in distinct loci. Because Gentrepid uses biomolecular data to find interactions and common features between genes in distinct loci of the search spaces, it takes advantage of the multi-locus aspect of the data. CONCLUSIONS: Results suggest that testing multiple SNP-to-gene search spaces compensates for differences in phenotypes, populations and SNP platforms. Surprisingly, domain-based homology information was more informative when benchmarked against gene candidates reported by GWA studies than when benchmarked against previously determined disease genes, possibly suggesting a larger contribution of gene homologs to complex diseases than to Mendelian diseases.
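The two kinds of SNP-to-gene search space mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the Gentrepid implementation: a fixed-width space is taken to mean all genes overlapping a window centred on the SNP, and a proximity-based space is assumed to mean the n genes nearest to the SNP. The function names, gene coordinates, and parameters are hypothetical.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int]  # (name, start, end) on the same chromosome as the SNP


def fixed_width_space(snp_pos: int, genes: List[Gene], half_width: int) -> List[str]:
    """Genes overlapping the window [snp_pos - half_width, snp_pos + half_width]."""
    lo, hi = snp_pos - half_width, snp_pos + half_width
    return [name for name, start, end in genes if start <= hi and end >= lo]


def proximity_space(snp_pos: int, genes: List[Gene], n_nearest: int) -> List[str]:
    """The n genes closest to the SNP (assumed reading of 'proximity-based')."""
    def distance(gene: Gene) -> int:
        _, start, end = gene
        if start <= snp_pos <= end:
            return 0  # the SNP lies inside the gene
        return min(abs(snp_pos - start), abs(snp_pos - end))

    return [gene[0] for gene in sorted(genes, key=distance)[:n_nearest]]


# Hypothetical gene annotations and SNP position.
genes = [("G1", 1_000, 2_000), ("G2", 5_000, 6_000), ("G3", 20_000, 25_000)]
snp = 5_500

window_genes = fixed_width_space(snp, genes, half_width=4_000)
nearest_genes = proximity_space(snp, genes, n_nearest=2)
print(window_genes, nearest_genes)
```

Generating several such spaces per SNP, with different widths or neighbour counts, gives the multiple candidate sets that a prioritisation tool can then compare across loci.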
Subjects
Disease/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Databases, Genetic , Databases, Protein , Humans , Software
ABSTRACT
BACKGROUND: Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates. METHODOLOGY/PRINCIPAL FINDINGS: We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases. CONCLUSION: Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.
Subjects
Chromosome Mapping/methods , Diagnosis, Computer-Assisted/methods , Gene Expression Profiling/methods , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/genetics , Genetic Predisposition to Disease/genetics , Proteome/genetics , Algorithms , Animals , Biomarkers/analysis , Conserved Sequence/genetics , Humans , Mice , Proteome/analysis
ABSTRACT
Background: The failure to translate preclinical results to the clinical setting is the rule, not the exception. One reason that is frequently overlooked is whether the animal model reproduces distinctive features of human disease. Another is the reproducibility of the method used to measure treatment effects in preclinical studies. Left ventricular (LV) function improvement is the most common endpoint in preclinical cardiovascular disease studies, while echocardiography is the most frequently used method to evaluate LV function. In this work, we conducted a robust echocardiographic evaluation of LV size and function in dogs chronically infected by Trypanosoma cruzi. Methods and Results: Echocardiography was performed blindly by two distinct observers in mongrel dogs before infection and between 6 and 9 months post-infection. Parameters analyzed included end-systolic volume (ESV), end-diastolic volume (EDV), ejection fraction (EF), and fractional shortening (FS). We observed a significant LVEF and FS reduction in infected animals compared to controls, with no significant variation in volumes. However, the effect of chronic infection on systolic function was quite variable, with EF ranging from 17 to 66%. Using the cut-off value of EF ≤ 40%, established for dilated cardiomyopathy (DCM) in dogs, only 28% of the infected dogs were affected by the chronic infection. Conclusions: The canine model of chronic Chagas cardiomyopathy (CCC) mimics human disease, reproducing the percentage of individuals that develop heart failure during the chronic infection. It is thus mandatory to establish inclusion criteria in the experimental design of canine preclinical studies to account for the variable effect that chronic infection has on systolic function.
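The EF cut-off used above can be illustrated with a short sketch. It relies on the standard volumetric definition of ejection fraction, EF = 100 × (EDV − ESV) / EDV, which the abstract does not spell out, together with the EF ≤ 40% threshold it cites. The function names and the volumes are hypothetical and are not data from the study.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


def meets_dcm_cutoff(ef_percent: float, cutoff: float = 40.0) -> bool:
    """True when EF is at or below the cut-off cited for canine DCM."""
    return ef_percent <= cutoff


# Hypothetical volumes in millilitres.
ef_low = ejection_fraction(edv_ml=50.0, esv_ml=30.0)
ef_normal = ejection_fraction(edv_ml=60.0, esv_ml=24.0)
print(ef_low, meets_dcm_cutoff(ef_low))
print(ef_normal, meets_dcm_cutoff(ef_normal))
```

Applying such a rule as an inclusion criterion is one way to act on the abstract's recommendation, since only a minority of infected animals fall below the threshold.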
Subjects
Chagas Cardiomyopathy/diagnostic imaging , Echocardiography/methods , Heart Ventricles/diagnostic imaging , Animals , Disease Models, Animal , Dogs , Reproducibility of Results , Ventricular Function
ABSTRACT
BACKGROUND: Genes that are co-expressed tend to be involved in the same biological process. However, co-expression is not a very reliable predictor of functional links between genes. The evolutionary conservation of co-expression between species can be used to predict protein function more reliably than co-expression in a single species. Here we examine whether co-expression across multiple species is also a better prioritizer of disease genes than is co-expression between human genes alone. RESULTS: We use co-expression data from yeast (S. cerevisiae), nematode worm (C. elegans), fruit fly (D. melanogaster), mouse and human and find that the use of evolutionary conservation can indeed improve the predictive value of co-expression. The tendency of genes causing the same disease to be more highly co-expressed than other genes from their associated disease loci is significantly enhanced when co-expression data are combined across evolutionarily distant species. We also find that performance can vary significantly depending on the co-expression datasets used, and just using more data does not necessarily lead to better prioritization. Instead, we find that dataset quality is more important than quantity, and using a consistent microarray platform per species leads to better performance than using more inclusive datasets pooled from various platforms. CONCLUSION: We find that evolutionarily conserved gene co-expression prioritizes disease candidate genes better than human gene co-expression alone, and provide the integrated data as a new resource for disease gene prioritization tools.
Subjects
Conserved Sequence , Disease/etiology , Gene Expression Profiling/methods , Genetic Predisposition to Disease , Penetrance , Animals , Base Sequence , Caenorhabditis elegans/genetics , Databases, Genetic , Drosophila melanogaster/genetics , Evolution, Molecular , Gene Dosage , Gene Expression , Gene Frequency , Humans , Mice , Oligonucleotide Array Sequence Analysis , Predictive Value of Tests , Saccharomyces cerevisiae/genetics , Sample Size , Species Specificity
ABSTRACT
Genome-wide experimental methods to identify disease genes, such as linkage analysis and association studies, generate increasingly large candidate gene sets for which comprehensive empirical analysis is impractical. Computational methods employ data from a variety of sources to identify the most likely candidate disease genes from these gene sets. Here, we review seven independent computational disease gene prioritization methods, and then apply them in concert to the analysis of 9556 positional candidate genes for type 2 diabetes (T2D) and the related trait obesity. We generate and analyse a list of nine primary candidate genes for T2D and five for obesity. Two genes, LPL and BCKDHA, are common to these two sets. We also present a set of secondary candidates for T2D (94 genes) and for obesity (116 genes) with 58 genes in common to both diseases.