1.
Genome Res ; 34(10): 1674-1686, 2024 Oct 29.
Article in English | MEDLINE | ID: mdl-39406500

ABSTRACT

PWAS (proteome-wide association study) is an innovative genetic association approach that complements widely used methods such as GWAS (genome-wide association study). The PWAS approach involves three consecutive phases. First, machine learning modeling and probabilistic considerations quantify the impact of genetic variants on the biochemical functions of protein-coding genes. Second, for each individual, the variants are aggregated per gene to determine a gene-damaging score. Finally, standard statistical tests are applied in the case-control setting to yield statistically significant genes per phenotype. The PWAS Hub offers a user-friendly interface for an in-depth exploration of gene-disease associations from the UK Biobank (UKB). Results from PWAS cover 99 common diseases and conditions, each with over 10,000 diagnosed individuals per phenotype. Users can explore genes associated with these diseases, with separate analyses conducted for males and females. For each phenotype, the analyses account for sex-based genetic effects, inheritance modes (dominant and recessive), and the pleiotropic nature of associated genes. The PWAS Hub showcases its usefulness for asthma by navigating through proteomic-genetic analyses. Inspecting the PWAS asthma-listed genes (a total of 27) provides insights into the underlying cellular and molecular mechanisms. Comparison of PWAS-statistically significant genes for common diseases to the Open Targets benchmark shows partial but significant overlap in gene associations for most phenotypes. Graphical tools facilitate comparing genetic effects between PWAS and coding GWAS results, aiding in understanding the sex-specific genetic impact on common diseases. This adaptable platform is attractive to clinicians, researchers, and individuals interested in delving into gene-disease associations and sex-specific genetic effects.
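
The aggregation step described in the abstract above (per-variant effects combined into a per-gene score for each individual) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy data, and the independence assumption used to combine variants are all hypothetical, and the final comparison is a plain difference of means rather than a formal statistical test.

```python
from math import prod


def gene_score(variant_effects):
    """Aggregate per-variant effect scores into one gene-level score.

    Assumption (hypothetical): each score is the probability that the
    protein keeps its function given that variant (1.0 = harmless,
    0.0 = fully damaging), and variants act independently, so the gene
    stays functional only if every carried variant leaves it intact.
    """
    return prod(variant_effects)  # an empty list gives 1.0 (intact gene)


def mean(values):
    return sum(values) / len(values)


def score_difference(case_scores, control_scores):
    """Mean gene score in controls minus mean in cases.

    A positive value means the gene is, on average, more damaged in cases.
    """
    return mean(control_scores) - mean(case_scores)


# Toy data: one inner list per person, holding the effect scores of the
# variants that person carries in a single gene.
cases = [[0.2, 0.9], [0.5], [0.4, 0.5]]
controls = [[1.0], [], [0.9]]

case_scores = [gene_score(v) for v in cases]
control_scores = [gene_score(v) for v in controls]
diff = score_difference(case_scores, control_scores)
```

In a real analysis, the per-gene scores of cases and controls would then be passed to a standard statistical test, with correction for multiple testing across genes.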


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Female , Male , Asthma/genetics , Phenotype , Software , Proteome
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39038933

ABSTRACT

Breast cancer (BC) is the most common malignancy affecting Western women today. It is estimated that as many as 10% of BC cases can be attributed to germline variants. However, the genetic basis of the majority of familial BC cases has yet to be identified. Discovering predisposing genes contributing to familial BC is challenging due to their presumed rarity, low penetrance, and complex biological mechanisms. Here, we focused on an analysis of rare missense variants in a cohort of 12 families of Middle Eastern origins characterized by a high incidence of BC cases. We devised a novel, high-throughput variant analysis pipeline adapted for family studies, which aims to analyze variants at the protein level by employing state-of-the-art machine learning models and three-dimensional protein structural analysis. Using our pipeline, we analyzed 1218 rare missense variants that are shared between affected family members and classified 80 genes as candidate pathogenic. Among these genes, we found significant functional enrichment in peroxisomal and mitochondrial biological pathways, which segregated across seven families in the study and covered diverse ethnic groups. We present multiple lines of evidence that peroxisomal and mitochondrial pathways play an important, yet underappreciated, role in both germline BC predisposition and BC survival.


Subject(s)
Breast Neoplasms , Deep Learning , Genetic Predisposition to Disease , Humans , Breast Neoplasms/genetics , Female , Mutation, Missense , Pedigree , Germ-Line Mutation
3.
Int J Obes (Lond) ; 48(7): 954-963, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38472354

ABSTRACT

BACKGROUND/OBJECTIVES: The effects of early life exposures on offspring life-course health are well established. This study assessed whether adding early socio-demographic and perinatal variables to a model based on polygenic risk score (PRS) improves prediction of obesity risk. METHODS: We used the Jerusalem Perinatal Study (JPS) with data at birth and body mass index (BMI) and waist circumference (WC) measured at age 32. The PRS was constructed using over 2.1M common SNPs identified in a genome-wide association study (GWAS) for BMI. Linear and logistic models were applied in a stepwise approach. We first examined the associations between genetic variables and obesity-related phenotypes (e.g., BMI and WC). Secondly, socio-demographic variables were added, and finally perinatal exposures, such as maternal pre-pregnancy BMI (mppBMI) and gestational weight gain (GWG), were added to the model. Improvement in prediction at each step was assessed using measures of model discrimination (area under the curve, AUC), net reclassification improvement (NRI) and integrated discrimination improvement (IDI). RESULTS: One standard deviation (SD) change in PRS was associated with a significant increase in BMI (β = 1.40) and WC (β = 2.45). These associations were slightly attenuated (13.7-14.2%) with the addition of early life exposures to the model. Also, higher mppBMI was associated with increased offspring BMI (β = 0.39) and WC (β = 0.79) (p < 0.001). For obesity (BMI ≥ 30) prediction, the addition of early socio-demographic and perinatal exposures to the PRS model significantly increased AUC from 0.69 to 0.73. At an obesity risk threshold of 15%, the addition of early socio-demographic and perinatal exposures to the PRS model provided a significant improvement in reclassification of obesity (NRI, 0.147; 95% CI 0.068-0.225). CONCLUSIONS: Inclusion of early life exposures, such as mppBMI and maternal smoking, in a model based on PRS improves obesity risk prediction in an Israeli population sample.
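
The net reclassification improvement (NRI) reported above at a single risk threshold can be computed as follows. This is a minimal sketch of the standard two-category NRI definition, not the study's analysis code; the function name and the toy risk values are hypothetical, and only the 15% threshold is taken from the abstract.

```python
def nri_at_threshold(old_risk, new_risk, outcome, threshold=0.15):
    """Two-category net reclassification improvement at one risk threshold.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    where "up" means crossing from below the threshold (old model) to at or
    above it (new model), and "down" is the opposite move.
    """
    up_e = down_e = up_n = down_n = 0
    n_events = sum(outcome)
    n_nonevents = len(outcome) - n_events
    for old, new, y in zip(old_risk, new_risk, outcome):
        moved_up = old < threshold <= new
        moved_down = new < threshold <= old
        if y:
            up_e += moved_up
            down_e += moved_down
        else:
            up_n += moved_up
            down_n += moved_down
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


# Toy predicted risks from a baseline model and an extended model.
old = [0.10, 0.20, 0.12, 0.30, 0.16, 0.05]
new = [0.18, 0.25, 0.10, 0.10, 0.14, 0.04]
obese = [1, 1, 1, 0, 0, 0]  # 1 = event (obesity), 0 = non-event

nri = nri_at_threshold(old, new, obese)
```

A positive NRI means the extended model moves events above the threshold and non-events below it more often than the reverse.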


Subject(s)
Body Mass Index , Obesity , Humans , Female , Obesity/epidemiology , Obesity/genetics , Israel/epidemiology , Adult , Pregnancy , Male , Risk Factors , Genome-Wide Association Study , Multifactorial Inheritance , Young Adult , Genetic Predisposition to Disease
4.
J Biomed Inform ; 154: 104650, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38701887

ABSTRACT

BACKGROUND: Distinguishing diseases into distinct subtypes is crucial for study and effective treatment strategies. The Open Targets Platform (OT) integrates biomedical, genetic, and biochemical datasets to empower disease ontologies, classifications, and potential gene targets. Nevertheless, many disease annotations are incomplete, requiring laborious expert medical input. This challenge is especially pronounced for rare and orphan diseases, where resources are scarce. METHODS: We present a machine learning approach to identifying diseases with potential subtypes, using the approximately 23,000 diseases documented in OT. We derive novel features for predicting diseases with subtypes using direct evidence. Machine learning models were applied to analyze feature importance and evaluate predictive performance for discovering both known and novel disease subtypes. RESULTS: Our model achieves a high (89.4%) ROC AUC (Area Under the Receiver Operating Characteristic Curve) in identifying known disease subtypes. We integrated pre-trained deep-learning language models and showed their benefits. Moreover, we identify 515 disease candidates predicted to possess previously unannotated subtypes. CONCLUSIONS: Our models can partition diseases into distinct subtypes. This methodology enables a robust, scalable approach for improving knowledge-based annotations and a comprehensive assessment of disease ontology tiers. Our candidates are attractive targets for further study and personalized medicine, potentially aiding in the unveiling of new therapeutic indications for sought-after targets.


Subject(s)
Machine Learning , Humans , Disease/classification , ROC Curve , Computational Biology/methods , Algorithms , Deep Learning
5.
Proc Natl Acad Sci U S A ; 118(34)2021 08 24.
Article in English | MEDLINE | ID: mdl-34373319

ABSTRACT

Atomic structures of several proteins from the coronavirus family are still partial or unavailable. A possible reason for this gap is the instability of these proteins outside of the cellular context, thereby prompting the use of in-cell approaches. In situ cross-linking and mass spectrometry (in situ CLMS) can provide information on the structures of such proteins as they occur in the intact cell. Here, we applied targeted in situ CLMS to structurally probe Nsp1, Nsp2, and nucleocapsid (N) proteins from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and obtained cross-link sets with an average density of one cross-link per 20 residues. We then employed integrative modeling that computationally combined the cross-linking data with domain structures to determine full-length atomic models. For the Nsp2, the cross-links report on a complex topology with long-range interactions. Integrative modeling with structural prediction of individual domains by the AlphaFold2 system allowed us to generate a single consistent all-atom model of the full-length Nsp2. The model reveals three putative metal binding sites and suggests a role for Nsp2 in zinc regulation within the replication-transcription complex. For the N protein, we identified multiple intra- and interdomain cross-links. Our integrative model of the N dimer demonstrates that it can accommodate three single RNA strands simultaneously, both stereochemically and electrostatically. For the Nsp1, cross-links with the 40S ribosome were highly consistent with recent cryogenic electron microscopy structures. These results highlight the importance of cellular context for the structural probing of recalcitrant proteins and demonstrate the effectiveness of targeted in situ CLMS and integrative modeling.


Subject(s)
Models, Molecular , SARS-CoV-2/chemistry , Viral Proteins/chemistry , Cross-Linking Reagents/chemistry , HEK293 Cells , Humans , Mass Spectrometry , Protein Domains
6.
Int J Mol Sci ; 25(14)2024 Jul 10.
Article in English | MEDLINE | ID: mdl-39062793

ABSTRACT

PARK7, also known as DJ-1, plays a critical role in protecting cells by functioning as a sensitive oxidation sensor and modulator of antioxidants. DJ-1 acts to maintain mitochondrial function and regulate transcription in response to different stressors. In this study, we showed that cell lines vary in their antioxidation potential under basal conditions. The transcriptome of HEK293 cells was tested following knockdown (KD) of DJ-1 using siRNAs, which reduced the DJ-1 transcripts to only 12% of the original level. We compared the expression levels of 14k protein-coding transcripts and 4.2k non-coding RNAs relative to cells treated with non-specific siRNAs. Among the coding genes, approximately 200 upregulated differentially expressed genes (DEGs) signified a coordinated antiviral innate immune response. Most genes were associated with the regulation of type I interferons (IFNs) and the induction of inflammatory cytokines. About a quarter of these genes were also induced in cells treated with non-specific siRNAs that were used as a negative control. Beyond the antiviral-like response, 114 genes were specific to the KD of DJ-1, with enrichment in RNA metabolism and mitochondrial functions. A smaller set of downregulated genes (58 genes) was associated with dysregulation in membrane structure, cell viability, and mitophagy. We propose that the KD DJ-1 perturbation diminishes the protective potency against oxidative stress. Thus, it renders the cells labile and responsive to the dsRNA signal by activating a large number of genes, many of which drive apoptosis, cell death, and inflammatory signatures. The KD of DJ-1 highlights its potency in regulating genes associated with antiviral responses, RNA metabolism, and mitochondrial functions, apparently through alteration in STAT activity and downstream signaling. Given that DJ-1 also acts as an oncogene in metastatic cancers, targeting DJ-1 could be a promising therapeutic strategy where manipulation of the DJ-1 level may reduce cancer cell viability and enhance the efficacy of cancer treatments.


Subject(s)
Gene Knockdown Techniques , Immunity, Innate , Protein Deglycase DJ-1 , Humans , Protein Deglycase DJ-1/genetics , Protein Deglycase DJ-1/metabolism , Immunity, Innate/genetics , HEK293 Cells , Mitochondria/metabolism , Mitochondria/genetics , RNA, Small Interfering/genetics , Transcriptome , Gene Expression Regulation , Gene Expression Profiling
7.
Hum Genet ; 142(7): 863-878, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37133573

ABSTRACT

Hypertension is a polygenic disease that affects over 1.2 billion adults aged 30-79 worldwide. It is a major risk factor for renal, cerebrovascular, and cardiovascular diseases. The heritability of hypertension is estimated to be high; nevertheless, our understanding of its underlying mechanisms remains incomplete. This study covered participants of European ancestry from the UK Biobank (UKB), with 74,090 cases diagnosed with essential (primary) hypertension and 200,734 controls. We compared the findings from large-scale genome-wide association studies (GWAS) to the gene-based method of proteome-wide association studies (PWAS). We focused on 70 statistically significant associated genes, most of which failed to reach significance in variant-based GWAS. A total of 30% of the PWAS-associated genes were validated against independent cohorts, including the Finnish Biobank. Furthermore, gene-based analyses that were performed on both sexes revealed sex-dependent genetics with a stronger genetic component associated with females. Analysis of systolic and diastolic blood pressure measurements confirms a strong genetic effect associated with females. We demonstrated that gene-based approaches provide insight into the underlying biology of hypertension. Specifically, the expression profiles of the identified genes exposed the enrichment of endothelial cells from multiple organs. Furthermore, females' top-ranked significant genes are involved in cellular immunity. We conclude that studying hypertension and blood pressure via gene-based association methods improves interpretability and exposes sex-dependent genetic effects, which enhances clinical utility.


Subject(s)
Genome-Wide Association Study , Hypertension , Male , Adult , Humans , Female , Genetic Predisposition to Disease , Endothelial Cells , Hypertension/genetics , Proteome/genetics , Polymorphism, Single Nucleotide , Essential Hypertension
8.
Bioinformatics ; 38(8): 2102-2110, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35020807

ABSTRACT

SUMMARY: Self-supervised deep language modeling has shown unprecedented success across natural language tasks, and has recently been repurposed to biological sequences. However, existing models and pretraining methods are designed and optimized for text analysis. We introduce ProteinBERT, a deep language model specifically designed for proteins. Our pretraining scheme combines language modeling with a novel task of Gene Ontology (GO) annotation prediction. We introduce novel architectural elements that make the model highly efficient and flexible to long sequences. The architecture of ProteinBERT consists of both local and global representations, allowing end-to-end processing of these types of inputs and outputs. ProteinBERT obtains near state-of-the-art performance, and sometimes exceeds it, on multiple benchmarks covering diverse protein properties (including protein structure, post-translational modifications and biophysical attributes), despite using a far smaller and faster model than competing deep-learning methods. Overall, ProteinBERT provides an efficient framework for rapidly training protein predictors, even with limited labeled data. AVAILABILITY AND IMPLEMENTATION: Code and pretrained model weights are available at https://github.com/nadavbra/protein_bert. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Deep Learning , Amino Acid Sequence , Proteins/chemistry , Language , Natural Language Processing
9.
Int J Mol Sci ; 24(13)2023 Jun 30.
Article in English | MEDLINE | ID: mdl-37446105

ABSTRACT

The primary role of microglia is to maintain homeostasis by effectively responding to various disturbances. Activation of transcriptional programs determines the microglia's response to external stimuli. In this study, we stimulated murine neonatal microglial cells with benzoyl ATP (bzATP) and lipopolysaccharide (LPS), and monitored their ability to release pro-inflammatory cytokines. When cells are exposed to bzATP, a purinergic receptor agonist, a short-lived wave of transcriptional changes occurs. However, only combining bzATP and LPS led to a sustainable and robust response. The transcriptional profile is dominated by induced cytokines (e.g., IL-1α and IL-1β), chemokines, and their membrane receptors. Several abundant long noncoding RNAs (lncRNAs) are induced by bzATP/LPS, including Ptgs2os2, Bc1, and Morrbid, which function in inflammation and cytokine production. Analyzing the observed changes through TNF (tumor necrosis factor) and NF-κB (nuclear factor kappa light chain enhancer of activated B cells) pathways confirmed that neonatal glial cells exhibit a distinctive expression program in which inflammatory-related genes are upregulated by orders of magnitude. The observed capacity of the microglial culture to activate a robust inflammatory response is useful for studying neurons under stress, brain injury, and aging. We propose the use of a primary neonatal microglia culture as a responsive in vitro model for testing drugs that may interact with inflammatory signaling and the lncRNA regulatory network.


Subject(s)
Lipopolysaccharides , Microglia , Mice , Animals , Microglia/metabolism , Lipopolysaccharides/pharmacology , Lipopolysaccharides/metabolism , NF-kappa B/metabolism , Cytokines/metabolism , Neuroglia/metabolism , Inflammation/metabolism , Cells, Cultured
10.
Adv Exp Med Biol ; 1385: 133-160, 2022.
Article in English | MEDLINE | ID: mdl-36352213

ABSTRACT

MicroRNAs (miRNAs) provide a fundamental layer of regulation in cells. miRNAs act posttranscriptionally through complementary base-pairing with the 3'-UTR of a target mRNA, leading to mRNA degradation and translation arrest. The likelihood of forming a valid miRNA-target duplex within cells was computationally predicted and experimentally monitored. In human cells, the miRNA profiles determine their identity and physiology. Therefore, alterations in the composition of miRNAs signify many cancer types and chronic diseases. In this chapter, we introduce online functional tools and resources to facilitate miRNA research. We start by introducing currently available miRNA catalogs and miRNA-gateway portals for navigating among different miRNA-centric online resources. We then sketch several realistic challenges that may occur while investigating miRNA regulation in living cells. As a showcase, we demonstrate the utility of miRNAs and mRNAs expression databases that cover diverse human cells and tissues, including resources that report on genetic alterations affecting miRNA expression levels and alteration in binding capacity. Introducing tools linking miRNAs with transcription factor (TF) networks reveals miRNA regulation complexity within living cells. Finally, we concentrate on online resources that analyze miRNAs in human diseases and specifically in cancer. Altogether, we introduce contemporary, selected resources and online tools for studying miRNA regulation in cells and tissues and their utility in health and disease.


Subject(s)
MicroRNAs , Humans , Gene Expression Regulation , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcription Factors/metabolism , Databases, Factual
11.
Int J Mol Sci ; 23(24)2022 Dec 18.
Article in English | MEDLINE | ID: mdl-36555797

ABSTRACT

Mature microRNAs (miRNAs) are single-stranded non-coding RNA (ncRNA) molecules that act in post-transcriptional regulation in animals and plants. A mature miRNA is the end product of consecutive, highly regulated processing steps of the primary miRNA transcript. Following base-pairing of the mature miRNA with its mRNA target, translation is inhibited, and the targeted mRNA is degraded. There are hundreds of miRNAs in each cell that work together to regulate key cellular processes, including development, differentiation, cell cycle, apoptosis, inflammation, viral infection, and more. In this review, we present an overlooked layer of cellular regulation that addresses cell dynamics affecting miRNA accessibility. We discuss the regulation of miRNA local storage and translocation among cell compartments. The local amounts of the miRNAs and their targets dictate their actual availability, which determines the ability to fine-tune cell responses to abrupt or chronic changes. We emphasize that changes in miRNA storage and compactization occur under induced stress and changing conditions. Furthermore, we demonstrate shared principles of cell physiology, governed by miRNA under oxidative stress, tumorigenesis, viral infection, or synaptic plasticity. The evidence presented in this review article highlights the importance of spatial and temporal miRNA regulation for cell physiology. We argue that limiting the research to mature miRNAs within the cytosol undermines our understanding of the efficacy of miRNAs to regulate cell fate under stress conditions.


Subject(s)
MicroRNAs , Animals , MicroRNAs/genetics , MicroRNAs/metabolism , Gene Expression Regulation , RNA, Messenger/genetics , Cell Differentiation , Homeostasis/genetics
12.
Bioinformatics ; 36(Suppl_1): i251-i257, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657402

ABSTRACT

SUMMARY: Current technologies for single-cell transcriptomics allow thousands of cells to be analyzed in a single experiment. The increased scale of these methods raises the risk of cell doublet contamination. Available tools and algorithms for identifying doublets and estimating their occurrence in single-cell experimental data focus on doublets of different species, cell types or individuals. In this study, we analyze transcriptomic data from single cells having an identical genetic background. We claim that the ratio of monoallelic to biallelic expression provides discriminating power for doublet identification. We present a pipeline called BIallelic Ratio for Doublets (BIRD) that relies on heterologous genetic variations from single-cell RNA sequencing. For each dataset, doublets were artificially created from the actual data and used to train a predictive model. BIRD was applied to Smart-seq data from 163 primary fibroblast single cells. The model achieved 100% accuracy in annotating the randomly simulated doublets. Bona fide doublets were verified based on a biallelic expression signal on the X chromosome of female fibroblasts. Data from 10X Genomics microfluidics of human peripheral blood cells achieved on average 83% (±3.7%) accuracy, and an area under the curve of 0.88 (±0.04) for a collection of ∼13 300 single cells. BIRD addresses instances of doublets that were formed from cell mixtures of identical genetic background and cell identity. Maximal performance is achieved for high-coverage data from Smart-seq. Success in identifying doublets is data specific and varies according to the experimental methodology, genomic diversity between haplotypes, sequence coverage and depth. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
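
The idea of building artificial doublets from real cells and measuring their biallelic signal can be illustrated with a short Python sketch. This is a toy illustration only, assuming each cell is represented as a list of (reference reads, alternative reads) pairs at heterozygous sites; the function names and data are hypothetical and this is not the BIRD code, which additionally trains a predictive model on such simulated doublets.

```python
def biallelic_ratio(cell, min_reads=1):
    """Fraction of expressed heterozygous sites where both alleles are seen.

    cell: list of (ref_reads, alt_reads) tuples, one per site.
    Sites with fewer than min_reads total reads are ignored.
    """
    expressed = [(r, a) for r, a in cell if r + a >= min_reads]
    if not expressed:
        return 0.0
    biallelic = sum(1 for r, a in expressed if r > 0 and a > 0)
    return biallelic / len(expressed)


def simulate_doublet(cell_a, cell_b):
    """Create an artificial doublet by summing allele counts site by site."""
    return [(ra + rb, aa + ab) for (ra, aa), (rb, ab) in zip(cell_a, cell_b)]


# Two toy single cells, each expressing only one allele per site.
cell_1 = [(5, 0), (0, 3), (4, 0), (0, 0)]
cell_2 = [(0, 6), (0, 2), (3, 0), (2, 0)]
doublet = simulate_doublet(cell_1, cell_2)

ratios = {
    "cell_1": biallelic_ratio(cell_1),
    "cell_2": biallelic_ratio(cell_2),
    "doublet": biallelic_ratio(doublet),
}
```

In this toy case the two singlets show no biallelic sites, while their simulated doublet does, which is the kind of contrast a classifier could exploit.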


Subject(s)
Single-Cell Analysis , Software , Algorithms , Computational Biology , Humans , Sequence Analysis, RNA
13.
Nucleic Acids Res ; 47(13): 6642-6655, 2019 07 26.
Article in English | MEDLINE | ID: mdl-31334812

ABSTRACT

Compiling the catalogue of genes actively involved in cancer is an ongoing endeavor, with profound implications for the understanding and treatment of the disease. An abundance of computational methods have been developed to screen the genome for candidate driver genes based on genomic data of somatic mutations in tumors. Existing methods make many implicit and explicit assumptions about the distribution of random mutations. We present FABRIC, a new framework for quantifying the selection of genes in cancer by assessing the effects of de novo somatic mutations on protein-coding genes. Using a machine-learning model, we quantified the functional effects of ∼3M somatic mutations extracted from over 10 000 human cancerous samples, and compared them against the effects of all possible single-nucleotide mutations in the coding human genome. We detected 593 protein-coding genes showing statistically significant bias towards harmful mutations. These genes, discovered without any prior knowledge, show an overwhelming overlap with known cancer genes, but also include many overlooked genes. FABRIC is designed to avoid false discoveries by comparing each gene to its own background model using rigorous statistics, making minimal assumptions about the distribution of random somatic mutations. The framework is an open-source project with a simple command-line interface.
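
The principle of comparing a gene's observed mutations to that gene's own background can be sketched as a simple z-score. This is a hedged illustration, not FABRIC's statistics: the score scale (0 = fully damaging, 1 = harmless), the function name, the toy numbers, and the use of a z-score on the mean are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, pvariance


def gene_z_score(observed, background):
    """z-score of the mean observed effect score against the gene's background.

    observed:   effect scores of the somatic mutations seen in tumors.
    background: effect scores of all possible single-nucleotide mutations
                in the same gene (treated here as the full population).
    A strongly negative value suggests a bias towards damaging mutations.
    """
    mu = mean(background)
    sigma2 = pvariance(background)
    standard_error = sqrt(sigma2 / len(observed))
    return (mean(observed) - mu) / standard_error


# Toy scores on a hypothetical 0 (damaging) to 1 (harmless) scale.
background = [0.1, 0.3, 0.5, 0.7, 0.9]
observed = [0.1, 0.1, 0.3, 0.1]

z = gene_z_score(observed, background)
```

Because each gene is compared only to its own background, genes that are simply long or mutation-prone are not flagged unless the observed mutations are also unusually damaging.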


Subject(s)
Computational Biology/methods , Genes, Neoplasm , Mutation , Neoplasm Proteins/genetics , Neoplasms/genetics , Datasets as Topic , Humans , Models, Genetic , Mutation, Missense , Neoplasm Proteins/chemistry , Neoplasm Proteins/physiology
14.
PLoS Comput Biol ; 15(12): e1007204, 2019 12.
Article in English | MEDLINE | ID: mdl-31790387

ABSTRACT

Mature microRNAs (miRNAs) regulate most human genes through direct base-pairing with mRNAs. We investigate the underlying principles of miRNA regulation in living cells. To this end, we overexpressed miRNAs in different cell types and measured the mRNA decay rate under a paradigm of a transcriptional arrest. Based on an exhaustive matrix of mRNA-miRNA binding probabilities, and parameters extracted from our experiments, we developed a computational framework that captures the cooperative action of miRNAs in living cells. The framework, called COMICS, simulates the stochastic binding events between miRNAs and mRNAs in cells. The input of COMICS is cell-specific profiles of mRNAs and miRNAs, and the outcome is the retention level of each mRNA at the end of 100,000 iterations. The results of COMICS from thousands of miRNA manipulations reveal gene sets that exhibit coordinated behavior with respect to all miRNAs (total of 248 families). We identified a small set of genes that are highly responsive to changes in the expression of almost any of the miRNAs. In contrast, about 20% of the tested genes remain insensitive to a broad range of miRNA manipulations. The set of insensitive genes is strongly enriched with genes that belong to the translation machinery. These trends are shared by different cell types. We conclude that the stochastic nature of miRNAs reveals unexpected robustness of gene expression in living cells. By applying a systematic probabilistic approach some key design principles of cell states are revealed, emphasizing in particular, the immunity of the translational machinery vis-a-vis miRNA manipulations across cell types. We propose COMICS as a valuable platform for assessing the outcome of miRNA regulation of cells in health and disease.
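
An iterative stochastic simulation of miRNA-mRNA binding, of the general kind the abstract describes, can be sketched as below. This is a toy model under stated assumptions, not the COMICS implementation: the sampling scheme, the rule that every successful binding removes one mRNA copy, and all names and numbers (`miR-x`, `geneA`, `geneB`) are hypothetical, and far fewer iterations are used than the 100,000 mentioned in the abstract.

```python
import random


def simulate_retention(mrna_counts, mirna_counts, bind_prob,
                       iterations=10_000, seed=0):
    """Return the fraction of each mRNA retained after stochastic binding.

    mrna_counts:  dict gene -> initial copy number (must be positive).
    mirna_counts: dict miRNA -> abundance, used as a sampling weight.
    bind_prob:    dict (miRNA, gene) -> binding probability; missing = 0.
    """
    rng = random.Random(seed)
    remaining = dict(mrna_counts)
    mirnas = list(mirna_counts)
    mirna_weights = [mirna_counts[m] for m in mirnas]
    for _ in range(iterations):
        genes = [g for g, c in remaining.items() if c > 0]
        if not genes:
            break
        # Sample one miRNA and one mRNA in proportion to their abundance.
        mirna = rng.choices(mirnas, weights=mirna_weights)[0]
        gene = rng.choices(genes, weights=[remaining[g] for g in genes])[0]
        # Toy rule: a successful binding event removes one mRNA copy.
        if rng.random() < bind_prob.get((mirna, gene), 0.0):
            remaining[gene] -= 1
    return {g: remaining[g] / mrna_counts[g] for g in mrna_counts}


retention = simulate_retention(
    mrna_counts={"geneA": 100, "geneB": 100},
    mirna_counts={"miR-x": 50},
    bind_prob={("miR-x", "geneA"): 0.5},
    iterations=500,
)
```

Here `geneB` has no binding probability for the only miRNA, so it is fully retained, while `geneA` is depleted; repeating such runs across many miRNA profiles is how sensitive and insensitive gene sets could be compared.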


Subject(s)
MicroRNAs/genetics , MicroRNAs/metabolism , Models, Genetic , RNA, Messenger/genetics , RNA, Messenger/metabolism , Computational Biology , Computer Simulation , Gene Expression Profiling , Gene Expression Regulation , HEK293 Cells , HeLa Cells , Humans , MCF-7 Cells , RNA Stability/genetics , Stochastic Processes
15.
Nucleic Acids Res ; 46(20): 11014-11029, 2018 11 16.
Article in English | MEDLINE | ID: mdl-30203035

ABSTRACT

MicroRNAs (miRNAs) are short non-coding RNAs that negatively regulate the expression and translation of genes in healthy and diseased tissues. Herein, we characterize short RNAs from human HeLa cells found in the supraspliceosome, a nuclear dynamic machine in which pre-mRNA processing occurs. We sequenced small RNAs (<200 nt) extracted from the supraspliceosome, and identified sequences that are derived from 200 miRNA genes. About three quarters of them are mature miRNAs, whereas the rest account for various defined regions of the pre-miRNA, and its hairpin-loop precursor. Out of these aligned sequences, 53 were undetected in cellular extract, and the abundance of an additional 48 strongly differed from that in cellular extract. Notably, we describe seven abundant miRNA-derived sequences that overlap non-coding exons of their host gene. The rich collection of sequences identical to pre-miRNAs at the supraspliceosome suggests overlooked nuclear functions. Specifically, the abundant hsa-mir-99b may affect splicing of the LINC01129 primary transcript through base-pairing with its exon-intron junction. Using suppression and overexpression experiments, we show that hsa-mir-7704 negatively regulates the level of the lncRNA HAGLR. We claim that in cases of extended base-pairing complementarity, such supraspliceosomal pre-miRNA sequences might have a role in transcription attenuation, maturation and processing.


Subject(s)
MicroRNAs/genetics , RNA Precursors/genetics , Spliceosomes/genetics , Base Sequence , Cell Line , Gene Expression Regulation , HeLa Cells , High-Throughput Nucleotide Sequencing , Humans , MicroRNAs/metabolism , RNA Processing, Post-Transcriptional , RNA Splicing , Spliceosomes/metabolism
16.
Int J Mol Sci ; 21(21)2020 Oct 30.
Article in English | MEDLINE | ID: mdl-33143250

ABSTRACT

MicroRNAs (miRNAs) act as negative regulators of gene expression in the cytoplasm. Previous studies have identified the presence of miRNAs in the nucleus. Here we study human breast cancer-derived cell lines (MCF-7 and MDA-MB-231) and a non-tumorigenic cell line (MCF-10A) and compare their miRNA sequences at the spliceosome fraction (SF). We report that the levels of miRNAs found in the spliceosome, their identity, and pre-miRNA segmental composition are cell-line specific. One such miRNA is miR-7704, whose genomic position overlaps HAGLR, a cancer-related lncRNA. We detected an inverse expression of miR-7704 and HAGLR in the tested cell lines. Specifically, inhibition of miR-7704 caused an increase in HAGLR expression. Furthermore, elevated levels of miR-7704 slightly altered the cell cycle in MDA-MB-231. Altogether, we show that SF-miR-7704 acts as a tumor-suppressor gene with HAGLR being its nuclear target. The relative levels of miRNAs found in the spliceosome fractions (e.g., miR-100, miR-30a, and let-7 family) in non-tumorigenic relative to cancer-derived cell lines were monitored. We found that the expression trend of the abundant miRNAs in SF was different from that reported in the literature and from the observation of large cohorts of breast cancer patients, suggesting that many SF-miRNAs act on targets that are different from the cytoplasmic ones. Altogether, we report on the potential of SF-miRNAs as an unexplored route for cancerous cell state.


Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/pathology , Gene Expression Regulation, Neoplastic , MicroRNAs/genetics , RNA, Long Noncoding/genetics , Spliceosomes/genetics , Apoptosis , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Cell Proliferation , Female , Gene Expression Profiling , Humans , Tumor Cells, Cultured
17.
Am J Med Genet B Neuropsychiatr Genet ; 183(7): 412-422, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32815282

ABSTRACT

STXBP1, also known as Munc-18, is a master regulator of neurotransmitter release and synaptic function in the human brain through its direct interaction with syntaxin 1A. STXBP1 binds syntaxin 1A in an inactive conformational state. STXBP1 decreases its binding affinity to syntaxin upon phosphorylation, enabling syntaxin 1A to engage in the SNARE complex, leading to neurotransmitter release. STXBP1-related disorders are well characterized by encephalopathy with epilepsy, and a diverse range of neurological and neurodevelopmental conditions. Through exome sequencing of a child with developmental delay, hypotonia, and spasticity, we found a novel de novo insertion mutation of three nucleotides in the STXBP1 coding region, resulting in an additional arginine after position 39 (R39dup). Inconclusive results from state-of-the-art variant prediction tools mandated a structure-based approach using molecular dynamics (MD) simulations of the STXBP1-syntaxin 1A complex. Comparison of the interaction interfaces of the wild-type and the R39dup complexes revealed a reduced interaction surface area in the mutant, leading to destabilization of the protein complex. Moreover, the decrease in affinity toward syntaxin 1A is similar for the phosphorylated STXBP1 and the R39dup. We applied the same MD methodology to seven additional previously reported STXBP1 mutations and found that the stability of the STXBP1-syntaxin 1A interface correlates with the reported clinical phenotypes. This study provides a direct link between the outcome of a novel variant in STXBP1 and protein structure and dynamics. The structural change upon mutation drives an alteration in synaptic function.


Subject(s)
Developmental Disabilities/genetics , Munc18 Proteins/genetics , Syntaxin 1/metabolism , Brain/metabolism , Brain Diseases/genetics , Child, Preschool , Developmental Disabilities/physiopathology , Electroencephalography/methods , Epilepsy/genetics , Female , Humans , Munc18 Proteins/metabolism , Mutagenesis, Insertional/genetics , Syntaxin 1/genetics , Exome Sequencing/methods
18.
BMC Genomics ; 20(1): 201, 2019 Mar 12.
Article in English | MEDLINE | ID: mdl-30871455

ABSTRACT

BACKGROUND: In mammals, sex chromosomes pose an inherent imbalance of gene expression between sexes. In each female somatic cell, random inactivation of one of the X-chromosomes restores this balance. While most genes from the inactivated X-chromosome are silenced, 15-25% are known to escape X-inactivation (termed escapees). The expression levels of these genes are attributed to sex-dependent phenotypic variability. RESULTS: We used single-cell RNA-Seq to detect escapees in somatic cells. As only one X-chromosome is inactivated in each cell, the origin of expression from the active or inactive chromosome can be determined from the variation of sequenced RNAs. We analyzed primary, healthy fibroblasts (n = 104), and clonal lymphoblasts with sequenced parental genomes (n = 25) by measuring the degree of allelic-specific expression (ASE) from heterozygous sites. We identified 24 and 49 candidate escapees, with varying degrees of confidence, from the fibroblast and lymphoblast transcriptomes, respectively. We critically test the validity of escapee annotations by comparing our findings with a large collection of independent studies. We find that most genes (66%) from the unified set were previously reported as escapees. Furthermore, out of the overlooked escapees, 11 are long noncoding RNAs (lncRNAs). CONCLUSIONS: X-chromosome inactivation and escaping from it are robust, permanent phenomena that are best studied at single-cell resolution. The cumulative information from individual cells increases the potential of identifying escapees. Moreover, despite the use of a limited number of cells, clonal cells (i.e., the same X-chromosomes are coordinately inhibited) with genomic phasing are valuable for detecting escapees at high confidence. Generalizing the method to uncharacterized genomic loci resulted in lncRNA escapees, which account for 20% of the listed candidates. By confirming genes as escapees and proposing others as candidates from two different cell types, we contribute to the cumulative knowledge and reliability of human escapees.


Subject(s)
Chromosomes, Human, X , High-Throughput Nucleotide Sequencing/methods , Single-Cell Analysis/methods , Transcriptome , X Chromosome Inactivation , Alleles , Chromosome Mapping , Female , Fibroblasts/cytology , Fibroblasts/metabolism , Humans , Infant, Newborn , Lymphocytes/cytology , Lymphocytes/metabolism
19.
BMC Cancer ; 19(1): 783, 2019 Aug 07.
Article in English | MEDLINE | ID: mdl-31391007

ABSTRACT

BACKGROUND: In recent years, research on cancer predisposition germline variants has emerged as a prominent field. The identity of somatic mutations is based on a reliable mapping of the patient germline variants. In addition, the statistics of germline variant frequencies in healthy individuals and cancer patients are the basis for seeking candidates for cancer predisposition genes. The Cancer Genome Atlas (TCGA) is one of the main sources of such data, providing a diverse collection of molecular data including deep sequencing for more than 30 types of cancer from >10,000 patients. METHODS: Our hypothesis in this study is that whole exome sequences from blood samples of cancer patients are not expected to show systematic differences among cancer types. To test this hypothesis, we analyzed common and rare germline variants across six cancer types, covering 2241 samples from TCGA. In our analysis we accounted for inherent variables in the data including the different variant calling protocols, sequencing platforms, and ethnicity. RESULTS: We report on substantial batch effects in germline variants associated with cancer types. We attribute the effect to the specific sequencing centers that produced the data. Specifically, we measured 30% variability in the number of reported germline variants per sample across sequencing centers. The batch effect is further expressed in nucleotide composition and variant frequencies. Importantly, the batch effect causes substantial differences in germline variant distribution patterns across numerous genes, including prominent cancer predisposition genes such as BRCA1, RET, MAX, and KRAS. For most known cancer predisposition genes, we found a distinct batch-dependent difference in germline variants. CONCLUSION: TCGA germline data are subject to strong batch effects, with substantial variability among TCGA sequencing centers. We claim that those batch effects are consequential for numerous TCGA pan-cancer studies. In particular, these effects may compromise the reliability of such studies and their ability to detect new cancer predisposition genes. Furthermore, interpretation of pan-cancer analyses should be revisited in view of the source of the genomic data after accounting for the reported batch effects.


Subject(s)
Exome , Genome, Human , Genomics , Germ-Line Mutation , Neoplasms/genetics , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Kaplan-Meier Estimate , Neoplasms/diagnosis , Neoplasms/mortality , Neoplasms/therapy , Precision Medicine/methods
20.
Am J Hematol ; 94(1): 62-73, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30295334

ABSTRACT

Myeloproliferative neoplasm (MPN) driver mutations are usually found in the JAK2, MPL, and CALR genes; however, 10%-15% of cases are triple negative (TN). A previous study showed a lower rate of JAK2 V617F in primary myelofibrosis patients exposed to low doses of ionizing radiation (IR) from the Chernobyl accident. To examine distinct driver mutations, we enrolled 281 Ukrainian IR-exposed and unexposed MPN patients. Genomic DNA was obtained from peripheral blood leukocytes. JAK2 V617F, MPL W515, and type 1-like and type 2-like CALR mutations were identified by Sanger sequencing and real-time polymerase chain reaction. Chromosomal alterations were assessed by an oligo-SNP microarray platform. Additional genetic variants were identified by whole exome and targeted sequencing. Statistical significance was evaluated by Fisher's exact test and Wilcoxon's rank sum test (R, version 3.4.2). IR-exposed MPN patients exhibited a different genetic profile vs unexposed patients: a lower rate of JAK2 V617F (58.4% vs 75.4%, P = .0077), a higher rate of type 1-like CALR mutation (12.2% vs 3.1%, P = .0056), a higher rate of TN cases (27.8% vs 16.2%, P = .0366), and a higher number of potentially pathogenic sequence variants (mean numbers: 4.8 vs 3.1, P = .0242). Furthermore, we identified several potential drivers specific to IR-exposed TN MPN patients: ATM p.S1691R with copy-neutral loss of heterozygosity at 11q; EZH2 p.D659G at 7q and SUZ12 p.V71M at 17q with copy number loss. Thus, IR-exposed MPN patients represent a group with distinct genomic characteristics worthy of further study.


Subject(s)
Chernobyl Nuclear Accident , Myeloproliferative Disorders/etiology , Neoplasms, Radiation-Induced/etiology , Radioactive Pollutants/adverse effects , Adult , Aged , Calreticulin/genetics , Chromosome Aberrations , DNA/genetics , Female , Gene Dosage , Humans , Janus Kinase 2/genetics , Loss of Heterozygosity , Male , Middle Aged , Mutation, Missense , Myeloproliferative Disorders/epidemiology , Myeloproliferative Disorders/genetics , Neoplasms, Radiation-Induced/epidemiology , Neoplasms, Radiation-Induced/genetics , Receptors, Thrombopoietin/genetics , Ukraine/epidemiology , Exome Sequencing , Young Adult