ABSTRACT
Translated small open reading frames (smORFs) can have important regulatory roles and encode microproteins, yet their genome-wide identification has been challenging. We determined the ribosome locations across six primary human cell types and five tissues and detected 7,767 smORFs with translational profiles matching those of known proteins. The human genome was found to contain highly cell-type- and tissue-specific smORFs and a subset that encodes highly conserved amino acid sequences. Changes in the translational efficiency of upstream-encoded smORFs (uORFs) and the corresponding main ORFs predominantly occur in the same direction. Integration with 456 mass-spectrometry datasets confirms the presence of 603 small peptides at the protein level in humans and provides insights into the subcellular localization of these small proteins. This study provides a comprehensive atlas of high-confidence translated smORFs derived from primary human cells and tissues in order to provide a more complete understanding of the translated human genome.
Subject(s)
Gene Expression Regulation , Ribosomes , Genome, Human/genetics , Humans , Open Reading Frames/genetics , Protein Biosynthesis , Proteins/metabolism , RNA/metabolism , Ribosomes/genetics , Ribosomes/metabolism
ABSTRACT
Cells undergo a major epigenome reconfiguration when reprogrammed to human induced pluripotent stem cells (hiPS cells). However, the epigenomes of hiPS cells and human embryonic stem (hES) cells differ significantly, which affects hiPS cell function [1-8]. These differences include epigenetic memory and aberrations that emerge during reprogramming, for which the mechanisms remain unknown. Here we characterized the persistence and emergence of these epigenetic differences by performing genome-wide DNA methylation profiling throughout primed and naive reprogramming of human somatic cells to hiPS cells. We found that reprogramming-induced epigenetic aberrations emerge midway through primed reprogramming, whereas DNA demethylation begins early in naive reprogramming. Using this knowledge, we developed a transient-naive-treatment (TNT) reprogramming strategy that emulates the embryonic epigenetic reset. We show that the epigenetic memory in hiPS cells is concentrated in cell of origin-dependent repressive chromatin marked by H3K9me3, lamin-B1 and aberrant CpH methylation. TNT reprogramming reconfigures these domains to a hES cell-like state and does not disrupt genomic imprinting. Using an isogenic system, we demonstrate that TNT reprogramming can correct the transposable element overexpression and differential gene expression seen in conventional hiPS cells, and that TNT-reprogrammed hiPS and hES cells show similar differentiation efficiencies. Moreover, TNT reprogramming enhances the differentiation of hiPS cells derived from multiple cell types. Thus, TNT reprogramming corrects epigenetic memory and aberrations, producing hiPS cells that are molecularly and functionally more similar to hES cells than conventional hiPS cells. We foresee TNT reprogramming becoming a new standard for biomedical and therapeutic applications and providing a novel system for studying epigenetic memory.
Subject(s)
Cellular Reprogramming , Epigenesis, Genetic , Induced Pluripotent Stem Cells , Humans , Chromatin/genetics , Chromatin/metabolism , DNA Demethylation , DNA Methylation , DNA Transposable Elements , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , Human Embryonic Stem Cells/cytology , Human Embryonic Stem Cells/metabolism , Lamin Type B
ABSTRACT
Human pluripotent and trophoblast stem cells have been essential alternatives to blastocysts for understanding early human development [1-4]. However, these simple culture systems lack the complexity to adequately model the spatiotemporal cellular and molecular dynamics that occur during early embryonic development. Here we describe the reprogramming of fibroblasts into in vitro three-dimensional models of the human blastocyst, termed iBlastoids. Characterization of iBlastoids shows that they model the overall architecture of blastocysts, presenting an inner cell mass-like structure, with epiblast- and primitive endoderm-like cells, a blastocoel-like cavity and a trophectoderm-like outer layer of cells. Single-cell transcriptomics further confirmed the presence of epiblast-, primitive endoderm-, and trophectoderm-like cells. Moreover, iBlastoids can give rise to pluripotent and trophoblast stem cells and are capable of modelling, in vitro, several aspects of the early stage of implantation. In summary, we have developed a scalable and tractable system to model human blastocyst biology; we envision that this will facilitate the study of early human development and the effects of gene mutations and toxins during early embryogenesis, as well as aiding in the development of new therapies associated with in vitro fertilization.
Subject(s)
Blastocyst/cytology , Blastocyst/metabolism , Cell Culture Techniques , Cellular Reprogramming , Fibroblasts/cytology , Models, Biological , Transcriptome , Female , Fibroblasts/metabolism , Humans , In Vitro Techniques , Single-Cell Analysis , Stem Cells/cytology , Stem Cells/metabolism , Trophoblasts/cytology
ABSTRACT
The reprogramming of human somatic cells to primed or naive induced pluripotent stem cells recapitulates the stages of early embryonic development [1-6]. The molecular mechanism that underpins these reprogramming processes remains largely unexplored, which impedes our understanding and limits rational improvements to reprogramming protocols. Here, to address these issues, we reconstruct molecular reprogramming trajectories of human dermal fibroblasts using single-cell transcriptomics. This revealed that reprogramming into primed and naive pluripotency follows diverging and distinct trajectories. Moreover, genome-wide analyses of accessible chromatin showed key changes in the regulatory elements of core pluripotency genes, and orchestrated global changes in chromatin accessibility over time. Integrated analysis of these datasets revealed a role for transcription factors associated with the trophectoderm lineage, and the existence of a subpopulation of cells that enter a trophectoderm-like state during reprogramming. Furthermore, this trophectoderm-like state could be captured, which enabled the derivation of induced trophoblast stem cells. Induced trophoblast stem cells are molecularly and functionally similar to trophoblast stem cells derived from human blastocysts or first-trimester placentas [7]. Our results provide a high-resolution roadmap for the transcription-factor-mediated reprogramming of human somatic cells, indicate a role for the trophectoderm-lineage-specific regulatory program during this process, and facilitate the direct reprogramming of somatic cells into induced trophoblast stem cells.
Subject(s)
Cellular Reprogramming/genetics , Gene Expression Regulation , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , Trophoblasts/cytology , Trophoblasts/metabolism , Adult , Chromatin/genetics , Chromatin/metabolism , Ectoderm/cytology , Ectoderm/metabolism , Female , Fibroblasts/cytology , Fibroblasts/metabolism , Humans , Transcription, Genetic
ABSTRACT
The three striatins (STRN, STRN3, STRN4) form the core of STRiatin-Interacting Phosphatase and Kinase (STRIPAK) complexes. These place protein phosphatase 2A (PP2A) in proximity to protein kinases thereby restraining kinase activity and regulating key cellular processes. Our aim was to establish if striatins play a significant role in cardiac remodelling associated with cardiac hypertrophy and heart failure. All striatins were expressed in control human hearts, with up-regulation of STRN and STRN3 in failing hearts. We used mice with global heterozygote gene deletion to assess the roles of STRN and STRN3 in cardiac remodelling induced by angiotensin II (AngII; 7 days). Using echocardiography, we detected no differences in baseline cardiac function or dimensions in STRN+/- or STRN3+/- male mice (8 weeks) compared with wild-type littermates. Heterozygous gene deletion did not affect cardiac function in mice treated with AngII, but the increase in left ventricle mass induced by AngII was inhibited in STRN+/- (but not STRN3+/-) mice. Histological staining indicated that cardiomyocyte hypertrophy was inhibited. To assess the role of STRN in cardiomyocytes, we converted the STRN knockout line for inducible cardiomyocyte-specific gene deletion. There was no effect of cardiomyocyte STRN knockout on cardiac function or dimensions, but the increase in left ventricle mass induced by AngII was inhibited. This resulted from inhibition of cardiomyocyte hypertrophy and cardiac fibrosis. The data indicate that cardiomyocyte striatin is required for early remodelling of the heart by AngII and identify the striatin-based STRIPAK system as a signalling paradigm in the development of pathological cardiac hypertrophy.
Subject(s)
Angiotensin II , Cardiomegaly , Mice, Knockout , Myocytes, Cardiac , Animals , Angiotensin II/pharmacology , Myocytes, Cardiac/metabolism , Myocytes, Cardiac/pathology , Cardiomegaly/genetics , Cardiomegaly/pathology , Cardiomegaly/metabolism , Cardiomegaly/physiopathology , Male , Humans , Muscle Proteins/metabolism , Muscle Proteins/genetics , Ventricular Remodeling , Membrane Proteins/genetics , Membrane Proteins/metabolism , Mice , Mice, Inbred C57BL , Calmodulin-Binding Proteins , Nerve Tissue Proteins
ABSTRACT
MOTIVATION: The creation and analysis of gene regulatory networks have been the focus of bioinformatics research and underpin much of what is known about gene regulation. However, because of a bias in the types of data that are collected, the vast majority of gene regulatory network resources and tools have focused on either transcriptional regulation or protein-protein interactions. This has left other areas of regulation, such as translational regulation, vastly underrepresented despite their having been shown to play a critical role in both health and disease. RESULTS: To address this, we have developed CLIPreg, a package that integrates RNA, Ribo and CLIP sequencing data to construct translational regulatory networks coordinated by RNA-binding proteins and microRNAs. This is the first tool of its type to be created, allowing for detailed investigation into a previously unseen layer of regulation. AVAILABILITY AND IMPLEMENTATION: CLIPreg is available at https://github.com/SGDDNB/CLIPreg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Gene Regulatory Networks , MicroRNAs , RNA-Seq , RNA-Binding Proteins , Software
ABSTRACT
Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.
Subject(s)
Databases, Genetic , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/genetics , Transcriptome/genetics , Cells, Cultured , Conserved Sequence/genetics , Datasets as Topic , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Gene Expression Profiling , Gene Expression Regulation , Genome, Human/genetics , Genome-Wide Association Study , Genomics , Humans , Internet , Molecular Sequence Annotation , Organ Specificity/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Quantitative Trait Loci/genetics , RNA Stability , RNA, Messenger/genetics
ABSTRACT
Fibrosis is a common pathology in cardiovascular disease. In the heart, fibrosis causes mechanical and electrical dysfunction and in the kidney, it predicts the onset of renal failure. Transforming growth factor β1 (TGFβ1) is the principal pro-fibrotic factor, but its inhibition is associated with side effects due to its pleiotropic roles. We hypothesized that downstream effectors of TGFβ1 in fibroblasts could be attractive therapeutic targets and lack upstream toxicity. Here we show, using integrated imaging-genomics analyses of primary human fibroblasts, that upregulation of interleukin-11 (IL-11) is the dominant transcriptional response to TGFβ1 exposure and required for its pro-fibrotic effect. IL-11 and its receptor (IL11RA) are expressed specifically in fibroblasts, in which they drive non-canonical, ERK-dependent autocrine signalling that is required for fibrogenic protein synthesis. In mice, fibroblast-specific Il11 transgene expression or Il-11 injection causes heart and kidney fibrosis and organ failure, whereas genetic deletion of Il11ra1 protects against disease. Therefore, inhibition of IL-11 prevents fibroblast activation across organs and species in response to a range of important pro-fibrotic stimuli. These results reveal a central role of IL-11 in fibrosis and we propose that inhibition of IL-11 is a potential therapeutic strategy to treat fibrotic diseases.
Subject(s)
Cardiovascular System/metabolism , Cardiovascular System/pathology , Fibrosis/metabolism , Fibrosis/pathology , Interleukin-11/metabolism , Animals , Autocrine Communication , Cells, Cultured , Female , Fibroblasts/drug effects , Fibroblasts/metabolism , Fibroblasts/pathology , Fibrosis/chemically induced , Heart , Humans , Interleukin-11/antagonists & inhibitors , Interleukin-11/genetics , Interleukin-11 Receptor alpha Subunit/deficiency , Interleukin-11 Receptor alpha Subunit/genetics , Kidney/pathology , Male , Mice , Mice, Knockout , Middle Aged , Myocardium/metabolism , Myocardium/pathology , Organ Dysfunction Scores , Protein Biosynthesis , Transforming Growth Factor beta1/metabolism , Transforming Growth Factor beta1/pharmacology , Transgenes/genetics
ABSTRACT
The protein kinase PKN2 is required for embryonic development and PKN2 knockout mice die as a result of failure in the expansion of mesoderm, cardiac development and neural tube closure. In the adult, cardiomyocyte PKN2 and PKN1 (in combination) are required for cardiac adaptation to pressure-overload. The specific role of PKN2 in contractile cardiomyocytes during development and its role in the adult heart remain to be fully established. We used mice with cardiomyocyte-directed knockout of PKN2 or global PKN2 haploinsufficiency to assess cardiac development and function using high resolution episcopic microscopy, MRI, micro-CT and echocardiography. Biochemical and histological changes were also assessed. Cardiomyocyte-directed PKN2 knockout embryos displayed striking abnormalities in the compact myocardium, with frequent myocardial clefts and diverticula, ventricular septal defects and abnormal heart shape. The sub-Mendelian homozygous knockout survivors developed cardiac failure. RNASeq data showed up-regulation of PKN2 in patients with dilated cardiomyopathy, suggesting an involvement in adult heart disease. Given the rarity of homozygous survivors with cardiomyocyte-specific deletion of PKN2, the requirement for PKN2 in adult mice was explored using the constitutive heterozygous PKN2 knockout. Cardiac hypertrophy resulting from hypertension induced by angiotensin II was reduced in these haploinsufficient PKN2 mice relative to wild-type littermates, with suppression of cardiomyocyte hypertrophy and cardiac fibrosis. It is concluded that cardiomyocyte PKN2 is essential for heart development and the formation of compact myocardium and is also required for cardiac hypertrophy in hypertension. Thus, PKN signalling may offer therapeutic options for managing congenital and adult heart diseases.
Subject(s)
Cardiomyopathies , Hypertension , Protein Kinase C/metabolism , Angiotensin II/metabolism , Angiotensin II/pharmacology , Animals , Cardiomegaly/metabolism , Cardiomyopathies/metabolism , Cardiomyopathies/pathology , Female , Hypertension/metabolism , Hypertension/pathology , Mice , Mice, Knockout , Myocardium/metabolism , Myocytes, Cardiac/metabolism , Pregnancy
ABSTRACT
The extracellular signal-regulated kinase 1/2 (ERK1/2) cascade promotes cardiomyocyte hypertrophy and is cardioprotective, with the three RAF kinases forming a node for signal integration. Our aims were to determine if BRAF is relevant for human heart failure, whether BRAF promotes cardiomyocyte hypertrophy, and if Type 1 RAF inhibitors developed for cancer (that paradoxically activate ERK1/2 at low concentrations: the 'RAF paradox') may have the same effect. BRAF was up-regulated in heart samples from patients with heart failure compared with normal controls. We assessed the effects of activated BRAF in the heart using mice with tamoxifen-activated Cre for cardiomyocyte-specific knock-in of the activating V600E mutation into the endogenous gene. We used echocardiography to measure cardiac dimensions/function. Cardiomyocyte BRAFV600E induced cardiac hypertrophy within 10 d, resulting in increased ejection fraction and fractional shortening over 6 weeks. This was associated with increased cardiomyocyte size without significant fibrosis, consistent with compensated hypertrophy. The experimental Type 1 RAF inhibitor, SB590885, and/or encorafenib (a RAF inhibitor used clinically) increased ERK1/2 phosphorylation in cardiomyocytes, and promoted hypertrophy, consistent with a 'RAF paradox' effect. Both promoted cardiac hypertrophy in mouse hearts in vivo, with increased cardiomyocyte size and no overt fibrosis. In conclusion, BRAF potentially plays an important role in human failing hearts, activation of BRAF is sufficient to induce hypertrophy, and Type 1 RAF inhibitors promote hypertrophy via the 'RAF paradox'. Cardiac hypertrophy resulting from these interventions was not associated with pathological features, suggesting that Type 1 RAF inhibitors may be useful to boost cardiomyocyte function.
Subject(s)
Cardiomegaly/pathology , MAP Kinase Signaling System/physiology , Myocytes, Cardiac/pathology , Proto-Oncogene Proteins B-raf/physiology , Animals , Carbamates/pharmacology , Carbamates/toxicity , Cardiomegaly/metabolism , Cell Size/drug effects , Cells, Cultured , Dimerization , Gene Knock-In Techniques , Heart Failure/pathology , Humans , MAP Kinase Signaling System/drug effects , Male , Mice , Mice, Inbred C57BL , Mutation, Missense , Myocytes, Cardiac/drug effects , Myocytes, Cardiac/metabolism , Point Mutation , Protein Conformation/drug effects , Protein Interaction Mapping , Proto-Oncogene Proteins B-raf/genetics , Proto-Oncogene Proteins c-raf/antagonists & inhibitors , Proto-Oncogene Proteins c-raf/biosynthesis , Rats , Rats, Sprague-Dawley , Sulfonamides/pharmacology , Sulfonamides/toxicity
ABSTRACT
MOTIVATION: As the generation of complex single-cell RNA sequencing datasets becomes more commonplace, it is the responsibility of researchers to provide access to these data in a way that can be easily explored and shared. Whilst data are often deposited for future bioinformatic analysis, many studies do not release their data in a way that is easy for non-computational researchers to explore. RESULTS: To help address this, we have developed ShinyCell, an R package that converts single-cell RNA sequencing datasets into explorable and shareable interactive interfaces. These interfaces can be easily customized to maximize their usability and can be easily uploaded to online platforms to facilitate wider access to published data. AVAILABILITY AND IMPLEMENTATION: ShinyCell is available at https://github.com/SGDDNB/ShinyCell and https://figshare.com/projects/ShinyCell/100439. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
SUMMARY: Emerging single-cell RNA-sequencing technologies have made it possible to capture and assess the gene expression of individual cells. Based on the similarity of gene expression profiles, many tools have been developed to generate an in silico ordering of cells in the form of pseudo-time trajectories. However, these tools do not provide a means to find the ordering of critical gene expression changes over pseudo-time. We present GeneSwitches, a tool that takes any single-cell pseudo-time trajectory and determines the precise order of gene expression and functional-event changes over time. GeneSwitches uses a statistical framework based on logistic regression to identify the order in which genes are either switched on or off along pseudo-time. With this information, users can identify the order in which surface markers appear, investigate how functional ontologies are gained or lost over time and compare the ordering of switching genes from two related pseudo-temporal processes. AVAILABILITY: GeneSwitches is available at https://geneswitches.ddnetbio.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
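The logistic-regression idea in this abstract can be illustrated with a minimal sketch. It is not the GeneSwitches implementation (which is an R tool): the function names, the toy data, the assumption that expression has already been binarized into on/off states, and the choice to read the switching point as the pseudo-time where the fitted probability crosses 0.5 are all assumptions made here for illustration.

```python
import math

def fit_logistic(times, states, iters=25):
    """Fit P(on | t) = sigmoid(b0 + b1 * t) by Newton-Raphson.

    times  : pseudo-time value of each cell
    states : 1 if the gene is "on" in that cell, else 0 (assumed pre-binarized)
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for t, y in zip(times, states):
            z = max(-30.0, min(30.0, b0 + b1 * t))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * t
            h00 += w               # entries of X^T W X
            h01 += w * t
            h11 += w * t * t
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        # Newton step: beta += (X^T W X)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def switch_time(b0, b1):
    """Pseudo-time at which the fitted probability equals 0.5."""
    return -b0 / b1

# Hypothetical toy data: ten cells ordered along pseudo-time 0.0 .. 0.9.
TIMES = [i / 10 for i in range(10)]
GENE_A = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]  # switches on early
GENE_B = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]  # switches on late
GENE_C = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # switches off (mirror of GENE_A)

for name, states in [("A", GENE_A), ("B", GENE_B), ("C", GENE_C)]:
    b0, b1 = fit_logistic(TIMES, states)
    direction = "on" if b1 > 0 else "off"
    print(f"gene {name}: switches {direction} near t = {switch_time(b0, b1):.2f}")
```

Sorting genes by the estimated switching point gives an ordering of on/off events along the trajectory; the sign of the slope distinguishes genes that switch on from genes that switch off.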
Subject(s)
Single-Cell Analysis , Software , Gene Expression Profiling , RNA , Sequence Analysis, RNA
ABSTRACT
BACKGROUND: Several genetic susceptibility loci associated with diabetic nephropathy have been documented, but no causative variants implying novel pathogenetic mechanisms have been elucidated. METHODS: We carried out whole-genome sequencing of a discovery cohort of Finnish siblings with type 1 diabetes who were discordant for the presence (case) or absence (control) of diabetic nephropathy. Controls had diabetes without complications for 15-37 years. We analyzed and annotated variants at genome, gene, and single-nucleotide variant levels. We then replicated the associated variants, genes, and regions in a replication cohort from the Finnish Diabetic Nephropathy study that included 3531 unrelated Finns with type 1 diabetes. RESULTS: We observed protein-altering variants and an enrichment of variants in regions associated with the presence or absence of diabetic nephropathy. The replication cohort confirmed variants in both regulatory and protein-coding regions. We also observed that diabetic nephropathy-associated variants, when clustered at the gene level, are enriched in a core protein-interaction network representing proteins essential for podocyte function. These genes include protein kinases (protein kinase C isoforms ε and ι) and protein tyrosine kinase 2. CONCLUSIONS: Our comprehensive analysis of a diabetic nephropathy cohort of siblings with type 1 diabetes who were discordant for kidney disease points to variants and genes that are potentially causative or protective for diabetic nephropathy. This includes variants in two isoforms of the protein kinase C family not previously linked to diabetic nephropathy, adding support to previous hypotheses that the protein kinase C family members play a role in diabetic nephropathy and might be attractive therapeutic targets.
Subject(s)
Diabetes Mellitus, Type 1/complications , Diabetic Nephropathies/genetics , Whole Genome Sequencing/methods , Adolescent , Adult , Animals , Child , Child, Preschool , Diabetes Mellitus, Type 1/genetics , Female , HEK293 Cells , Humans , Male , Polymorphism, Single Nucleotide , Protein Kinase C/physiology , Siblings , Young Adult , Zebrafish
ABSTRACT
BACKGROUND: Fibrosis is a common pathology in many cardiac disorders and is driven by the activation of resident fibroblasts. The global posttranscriptional mechanisms underlying fibroblast-to-myofibroblast conversion in the heart have not been explored. METHODS: Genome-wide changes of RNA transcription and translation during human cardiac fibroblast activation were monitored with RNA sequencing and ribosome profiling. We then used RNA-binding protein-based analyses to identify translational regulators of fibrogenic genes. The integration with cardiac ribosome occupancy levels of 30 dilated cardiomyopathy patients demonstrates that these posttranscriptional mechanisms are also active in the diseased fibrotic human heart. RESULTS: We generated nucleotide-resolution translatome data during the transforming growth factor β1-driven cellular transition of human cardiac fibroblasts to myofibroblasts. This identified dynamic changes of RNA transcription and translation at several time points during the fibrotic response, revealing transient and early-responder genes. Remarkably, about one-third of all changes in gene expression in activated fibroblasts are subject to translational regulation, and dynamic variation in ribosome occupancy affects protein abundance independent of RNA levels. Targets of RNA-binding proteins were strongly enriched in posttranscriptionally regulated genes, suggesting genes such as MBNL2 can act as translational activators or repressors. Ribosome occupancy in the hearts of patients with dilated cardiomyopathy suggested the same posttranscriptional regulatory network was underlying cardiac fibrosis. Key network hubs include RNA-binding proteins such as Pumilio RNA binding family member 2 (PUM2) and Quaking (QKI) that work in concert to regulate the translation of target transcripts in human diseased hearts. Furthermore, silencing of both PUM2 and QKI inhibits the transition of fibroblasts toward profibrotic myofibroblasts in response to transforming growth factor β1. CONCLUSIONS: We reveal widespread translational effects of transforming growth factor β1 and define novel posttranscriptional regulatory networks that control the fibroblast-to-myofibroblast transition. These networks are active in human heart disease, and silencing of hub genes limits fibroblast activation. Our findings show the central importance of translational control in fibrosis and highlight novel pathogenic mechanisms in heart failure.
Subject(s)
Heart Diseases/genetics , Heart Diseases/metabolism , Myocytes, Cardiac/metabolism , Myocytes, Cardiac/pathology , Protein Biosynthesis/genetics , RNA-Binding Proteins/genetics , Cells, Cultured , Fibroblasts/metabolism , Fibroblasts/pathology , Fibrosis/genetics , Fibrosis/metabolism , Fibrosis/pathology , Gene Expression Profiling/methods , Heart Diseases/pathology , Humans , Sequence Analysis, RNA/methods , Transforming Growth Factor beta1/genetics , Transforming Growth Factor beta1/metabolism
ABSTRACT
As the volume of patient-specific genome sequences increases, the focus of biomedical research is switching from the detection of disease mutations to their interpretation. To this end, a number of techniques have been developed that use mutation data collected within a population to predict whether individual genes are likely to be disease-causing or not. As both sequence data and associated analysis tools proliferate, it becomes increasingly difficult for the community to make sense of these data and their implications. Moreover, no single analysis tool is likely to capture all relevant genomic features that contribute to a gene's pathogenicity. Here, we introduce Web-based Gene Pathogenicity Analysis (WGPA), a web-based tool to analyze genes impacted by mutations and rank them through the integration of existing prioritization tools, which assess different aspects of gene pathogenicity using population-level sequence data. Additionally, to explore the polygenic contribution of mutations to disease, WGPA implements gene set enrichment analysis to prioritize disease-causing genes and gene interaction networks, thereby providing a comprehensive annotation of personal genome data in disease. AVAILABILITY AND IMPLEMENTATION: wgpa.systems-genetics.net.
Subject(s)
Disease/genetics , Genome, Human , Mutation , Software , Virulence Factors/genetics , Genomics , Humans , Internet
ABSTRACT
Methods to interpret personal genome sequences are increasingly required. Here, we report a novel framework (EvoTol) to identify disease-causing genes using patient sequence data from within protein coding-regions. EvoTol quantifies a gene's intolerance to mutation using evolutionary conservation of protein sequences and can incorporate tissue-specific gene expression data. We apply this framework to the analysis of whole-exome sequence data in epilepsy and congenital heart disease, and demonstrate EvoTol's ability to identify known disease-causing genes is unmatched by competing methods. Application of EvoTol to the human interactome revealed networks enriched for genes intolerant to protein sequence variation, informing novel polygenic contributions to human disease.
Subject(s)
Computational Biology/methods , Evolution, Molecular , Genetic Predisposition to Disease/genetics , Proteins/genetics , Amino Acid Sequence/genetics , Exome/genetics , Heart Defects, Congenital/genetics , Humans , Mutation , Phylogeny , Polymorphism, Single Nucleotide , Protein Interaction Maps/genetics , Proteins/classification , Proteins/metabolism , Reproducibility of Results , Sequence Analysis, DNA/methods
ABSTRACT
We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from 1400 proteomes reported previously in 2010 up to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as: UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded and daily updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but constantly updated when new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence, and--in the case of whole genomes--with enrichment analysis against a taxonomically defined background.
Subject(s)
Databases, Protein , Protein Structure, Tertiary , Gene Ontology , Molecular Sequence Annotation , Phylogeny , Proteins/classification , Proteins/genetics , Proteome/chemistry , Sequence Analysis, Protein
ABSTRACT
Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.
Subject(s)
Protein Databases , Molecular Sequence Annotation , Tertiary Protein Structure , Algorithms , Genomics , Internet , Molecular Models , Tertiary Protein Structure/genetics , Protein Sequence Analysis
ABSTRACT
MOTIVATION: As the number of studies looking at differences in DNA methylation increases, there is a growing demand to develop and benchmark statistical methods to analyse these data. To date, no objective approach for the comparison of these methods has been developed, and as such it remains difficult to assess which analysis tool is most appropriate for a given experiment. As a result, there is an unmet need for a DNA methylation data simulator that can accurately reproduce a wide range of experimental setups and can be routinely used to compare the performance of different statistical models. RESULTS: We have developed WGBSSuite, a flexible stochastic simulation tool that generates single-base resolution DNA methylation data genome-wide. Several simulator parameters can be derived directly from real datasets provided by the user in order to mimic real case scenarios. Thus, it is possible to choose the most appropriate statistical analysis tool for a given simulated design. To show the usefulness of our simulator, we also report a benchmark of commonly used methods for differential methylation analysis. AVAILABILITY AND IMPLEMENTATION: WGBS code and documentation are available under GNU licence at http://www.wgbssuite.org.uk/ CONTACT: owen.rackham@imperial.ac.uk or l.bottolo@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
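To give a feel for what a stochastic simulator of single-base methylation data does, here is a minimal sketch that is not WGBSSuite and makes no claim about its model. It assumes a two-state Markov chain along the sites, uniformly distributed coverage and binomially sampled methylated reads; the function name, parameters and defaults are all hypothetical.

```python
import random


def simulate_methylation(n_sites, p_stay=0.95, p_meth=(0.1, 0.9),
                         coverage=(5, 30), delta=0.0, seed=0):
    """Simulate per-site (methylated_reads, total_reads) counts.

    A two-state Markov chain (0 = low, 1 = high methylation) walks along
    the sites; `delta` shifts the methylation probability to mimic a
    group effect. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    state = rng.randint(0, 1)
    sites = []
    for _ in range(n_sites):
        # Stay in the current state with probability p_stay, else switch.
        if rng.random() > p_stay:
            state = 1 - state
        # Clamp the shifted methylation probability to [0, 1].
        p = min(1.0, max(0.0, p_meth[state] + delta))
        # Read depth drawn uniformly from the inclusive coverage range.
        total = rng.randint(*coverage)
        # Binomial draw: each read is methylated with probability p.
        meth = sum(rng.random() < p for _ in range(total))
        sites.append((meth, total))
    return sites


# Two simulated groups that differ by a methylation shift of 0.2.
control = simulate_methylation(1000, seed=1)
treated = simulate_methylation(1000, delta=0.2, seed=2)
```

Because the true shift is known by construction, a pair of simulated groups like this can be passed to a differential methylation method to check how often it recovers the difference.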
Subject(s)
Benchmarking , Computer Simulation , DNA Methylation , Statistical Models , DNA Sequence Analysis/methods , Software , Sulfites/chemistry , Human Genome , Humans , Stochastic Processes
ABSTRACT
Humans are composed of hundreds of cell types. As the genomic DNA of each somatic cell is identical, cell type is determined by what is expressed and when. Until recently, little had been reported about the determinants of human cell identity, particularly from the joint perspective of gene evolution and expression. Here, we chart the evolutionary past of all documented human cell types via the collective histories of proteins, the principal product of gene expression. FANTOM5 data provide cell-type-specific digital expression of human protein-coding genes, and the SUPERFAMILY resource is used to provide protein domain annotation. The evolutionary epoch in which each protein was created is inferred by comparison with domain annotation of all other completely sequenced genomes. Studying the distribution across epochs of genes expressed in each cell type reveals insights into human cellular evolution in terms of protein innovation. For each cell type, its history of protein innovation is charted based on the genes it expresses. Combining the histories of all cell types enables us to create a timeline of cell evolution. This timeline identifies the possibility that our common ancestor Coelomata (cavity-forming animals) provided the innovation required for the innate immune system, whereas cells that now form the human brain have followed a trajectory of continually accumulating novel proteins since Opisthokonta (boundary of animals and fungi). We conclude that exaptation of existing domain architectures into new contexts is the dominant source of cell-type-specific domain architectures.
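The epoch inference described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: a domain architecture is assigned to the oldest point on the human lineage at which any genome containing it diverges from human, with no handling of gene loss or horizontal transfer. The shortened lineage, the toy genomes, the architectures and all function names are hypothetical and are not taken from SUPERFAMILY or FANTOM5.

```python
from collections import Counter

# Shortened, illustrative human lineage, ordered from oldest to youngest epoch.
HUMAN_LINEAGE = ["cellular organisms", "Eukaryota", "Opisthokonta",
                 "Metazoa", "Bilateria", "Chordata", "Mammalia", "Homo sapiens"]


def infer_epoch(architecture, genome_architectures, divergence_point):
    """Return the oldest epoch at which `architecture` is observed, or None."""
    depths = [HUMAN_LINEAGE.index(divergence_point[genome])
              for genome, archs in genome_architectures.items()
              if architecture in archs]
    return HUMAN_LINEAGE[min(depths)] if depths else None


def epoch_profile(expressed_architectures, genome_architectures, divergence_point):
    """Count, per epoch, the architectures expressed in one cell type."""
    return Counter(infer_epoch(a, genome_architectures, divergence_point)
                   for a in expressed_architectures)


# Toy data: each genome maps to the set of domain architectures it contains.
genomes = {
    "human": {("kinase",), ("SH3", "kinase"), ("novel",)},
    "mouse": {("kinase",), ("SH3", "kinase")},
    "yeast": {("kinase",)},
    "E. coli": set(),
}
# Point on the human lineage where each genome's lineage branches off.
divergence = {"human": "Homo sapiens", "mouse": "Mammalia",
              "yeast": "Opisthokonta", "E. coli": "cellular organisms"}

profile = epoch_profile([("kinase",), ("SH3", "kinase"), ("novel",)],
                        genomes, divergence)
```

Comparing such per-epoch counts across cell types is one simple way to picture the distribution across epochs that the abstract refers to.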