Results 1 - 20 of 39
1.
Cell ; 186(22): 4834-4850.e23, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37794589

ABSTRACT

Regulation of viral RNA biogenesis is fundamental to productive SARS-CoV-2 infection. To characterize host RNA-binding proteins (RBPs) involved in this process, we biochemically identified proteins bound to genomic and subgenomic SARS-CoV-2 RNAs. We find that the host protein SND1 binds the 5' end of negative-sense viral RNA and is required for SARS-CoV-2 RNA synthesis. SND1-depleted cells form smaller replication organelles and display diminished virus growth kinetics. We discover that NSP9, a viral RBP and direct SND1 interaction partner, is covalently linked to the 5' ends of positive- and negative-sense RNAs produced during infection. These linkages occur at replication-transcription initiation sites, consistent with NSP9 priming viral RNA synthesis. Mechanistically, SND1 remodels NSP9 occupancy and alters the covalent linkage of NSP9 to initiating nucleotides in viral RNA. Our findings implicate NSP9 in the initiation of SARS-CoV-2 RNA synthesis and unravel an unsuspected role of a cellular protein in orchestrating viral RNA production.


Subjects
COVID-19 , Viral RNA , Humans , COVID-19/metabolism , Endonucleases/metabolism , Viral RNA/metabolism , SARS-CoV-2/genetics , Virus Replication
2.
Nat Immunol ; 23(8): 1208-1221, 2022 08.
Article in English | MEDLINE | ID: mdl-35879451

ABSTRACT

T cell antigen-receptor (TCR) signaling controls the development, activation and survival of T cells by involving several layers and numerous mechanisms of gene regulation. N6-methyladenosine (m6A) is the most prevalent messenger RNA modification affecting splicing, translation and stability of transcripts. In the present study, we describe the Wtap protein as essential for m6A methyltransferase complex function and reveal its crucial role in TCR signaling in mouse T cells. Wtap and m6A methyltransferase functions were required for the differentiation of thymocytes, control of activation-induced death of peripheral T cells and prevention of colitis by enabling gut RORγt+ regulatory T cell function. Transcriptome and epitranscriptomic analyses reveal that m6A modification destabilizes Orai1 and Ripk1 mRNAs. Lack of post-transcriptional repression of the encoded proteins correlated with increased store-operated calcium entry activity and diminished survival of T cells with conditional genetic inactivation of Wtap. These findings uncover how m6A modification impacts on TCR signal transduction and determines activation and survival of T cells.


Subjects
Cell Cycle Proteins , Methyltransferases , Adenosine/analogs & derivatives , Animals , Cell Cycle Proteins/metabolism , Methylation , Methyltransferases/genetics , Mice , RNA Splicing Factors/metabolism , Messenger RNA/genetics , Messenger RNA/metabolism , Signal Transduction
3.
Mol Cell ; 82(1): 190-208.e17, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34932975

ABSTRACT

Developmental genes such as Xist, which initiates X chromosome inactivation, are controlled by complex cis-regulatory landscapes, which decode multiple signals to establish specific spatiotemporal expression patterns. Xist integrates information on X chromosome dosage and developmental stage to trigger X inactivation in the epiblast specifically in female embryos. Through a pooled CRISPR screen in differentiating mouse embryonic stem cells, we identify functional enhancer elements of Xist at the onset of random X inactivation. Chromatin profiling reveals that X-dosage controls the promoter-proximal region, while differentiation cues activate several distal enhancers. The strongest distal element lies in an enhancer cluster associated with a previously unannotated Xist-enhancing regulatory transcript, which we named Xert. Developmental cues and X-dosage are thus decoded by distinct regulatory regions, which cooperate to ensure female-specific Xist upregulation at the correct developmental time. With this study, we start to disentangle how multiple, functionally distinct regulatory elements interact to generate complex expression patterns in mammals.


Subjects
Genetic Enhancer Elements , Genetic Loci , Mouse Embryonic Stem Cells/metabolism , Genetic Promoter Regions , Long Noncoding RNA/genetics , X Chromosome Inactivation , X Chromosome , Animals , Cell Differentiation , Cell Line , Female , Developmental Gene Expression Regulation , Mice , Inbred C57BL Mice , Transgenic Mice , Up-Regulation
4.
Genome Res ; 34(4): 572-589, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38719471

ABSTRACT

Dormancy is a key feature of stem cell function in adult tissues as well as in embryonic cells in the context of diapause. The establishment of dormancy is an active process that involves extensive transcriptional, epigenetic, and metabolic rewiring. How these processes are coordinated to successfully transition cells to the resting dormant state remains unclear. Here we show that microRNA activity, which is otherwise dispensable for preimplantation development, is essential for the adaptation of early mouse embryos to the dormant state of diapause. In particular, the pluripotent epiblast depends on miRNA activity, the absence of which results in the loss of pluripotent cells. Through the integration of high-sensitivity small RNA expression profiling of individual embryos and protein expression of miRNA targets with public data of protein-protein interactions, we constructed the miRNA-mediated regulatory network of mouse early embryos specific to diapause. We find that individual miRNAs contribute to the combinatorial regulation by the network, and the perturbation of the network compromises embryo survival in diapause. We further identified the nutrient-sensitive transcription factor TFE3 as an upstream regulator of diapause-specific miRNAs, linking cytoplasmic MTOR activity to nuclear miRNA biogenesis. Our results place miRNAs as a critical regulatory layer for the molecular rewiring of early embryos to establish dormancy.


Subjects
Cell Proliferation , MicroRNAs , Pluripotent Stem Cells , Animals , MicroRNAs/genetics , MicroRNAs/metabolism , Mice , Pluripotent Stem Cells/metabolism , Pluripotent Stem Cells/cytology , Developmental Gene Expression Regulation , Gene Regulatory Networks , Embryonic Development/genetics , Germ Layers/metabolism , Germ Layers/cytology , Blastocyst/metabolism , Blastocyst/cytology , Female
5.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37635383

ABSTRACT

RNA-binding proteins (RBPs) are central actors of RNA post-transcriptional regulation. Experiments to profile binding sites of RBPs in vivo are limited to transcripts expressed in the experimental cell type, creating the need for computational methods to infer missing binding information. While numerous machine-learning-based methods have been developed for this task, their use of heterogeneous training and evaluation datasets across different sets of RBPs and CLIP-seq protocols makes a direct comparison of their performance difficult. Here, we compile a set of 37 machine learning (primarily deep learning) methods for in vivo RBP-RNA interaction prediction and systematically benchmark a subset of 11 representative methods across hundreds of CLIP-seq datasets and RBPs. Using homogenized sample pre-processing and two negative-class sample generation strategies, we evaluate methods in terms of predictive performance and assess the impact of neural network architectures and input modalities on model performance. We believe that this study will not only enable researchers to choose the optimal prediction method for their tasks at hand, but also aid method developers in developing novel, high-performing methods by introducing a standardized framework for their evaluation.


Subjects
Benchmarking , Chromatin Immunoprecipitation Sequencing , Binding Sites , Machine Learning , RNA/genetics
6.
Proc Natl Acad Sci U S A ; 119(36): e2120680119, 2022 09 06.
Article in English | MEDLINE | ID: mdl-35998224

ABSTRACT

The systemic immune response to viral infection is shaped by master transcription factors, such as NF-κB, STAT1, or PU.1. Although long noncoding RNAs (lncRNAs) have been suggested as important regulators of transcription factor activity, their contributions to the systemic immunopathologies observed during SARS-CoV-2 infection have remained unknown. Here, we employed a targeted single-cell RNA sequencing approach to reveal lncRNAs differentially expressed in blood leukocytes during severe COVID-19. Our results uncover the lncRNA PIRAT (PU.1-induced regulator of alarmin transcription) as a major PU.1 feedback-regulator in monocytes, governing the production of the alarmins S100A8/A9, key drivers of COVID-19 pathogenesis. Knockout and transgene expression, combined with chromatin-occupancy profiling, characterized PIRAT as a nuclear decoy RNA, keeping PU.1 from binding to alarmin promoters and promoting its binding to pseudogenes in naïve monocytes. NF-κB-dependent PIRAT down-regulation during COVID-19 consequently releases a transcriptional brake, fueling alarmin production. Alarmin expression is additionally enhanced by the up-regulation of the lncRNA LUCAT1, which promotes NF-κB-dependent gene expression at the expense of targets of the JAK-STAT pathway. Our results suggest a major role of nuclear noncoding RNA networks in systemic antiviral responses to SARS-CoV-2 in humans.


Subjects
COVID-19 , Gene Expression Regulation , Monocytes , Long Noncoding RNA , SARS-CoV-2 , Alarmins/genetics , COVID-19/genetics , COVID-19/immunology , Humans , Janus Kinases/genetics , Monocytes/immunology , NF-kappa B/genetics , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , RNA-Seq , SARS-CoV-2/immunology , STAT Transcription Factors/genetics , Signal Transduction/genetics , Single-Cell Analysis
7.
Int J Mol Sci ; 24(6)2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36982191

ABSTRACT

The nuclear factor NF-kB is the master transcription factor in the inflammatory process, modulating the expression of pro-inflammatory genes. However, an additional level of complexity is its ability to promote the transcriptional activation of post-transcriptional modulators of gene expression, such as non-coding RNAs (e.g., miRNAs). While NF-kB's role in inflammation-associated gene expression has been extensively investigated, the interplay between NF-kB and genes coding for miRNAs still deserves investigation. To identify miRNAs with potential NF-kB binding sites in their transcription start site, we predicted miRNA promoters by an in silico analysis using the PROmiRNA software, which allowed us to score the genomic region's propensity to be miRNA cis-regulatory elements. A list of 722 human miRNAs was generated, of which 399 were expressed in at least one tissue involved in the inflammatory processes. The selection of "high-confidence" hairpins in miRBase identified 68 mature miRNAs, most of them previously identified as inflammamiRs. The identification of targeted pathways/diseases highlighted their involvement in the most common age-related diseases. Overall, our results reinforce the hypothesis that persistent activation of NF-kB could unbalance the transcription of specific inflammamiRNAs. The identification of such miRNAs could be of diagnostic/prognostic/therapeutic relevance for the most common inflammatory-related and age-related diseases.


Subjects
MicroRNAs , NF-kappa B , Humans , NF-kappa B/genetics , NF-kappa B/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , Transcription Factors/metabolism , Data Mining , Aging/genetics
8.
Genome Res ; 29(7): 1087-1099, 2019 07.
Article in English | MEDLINE | ID: mdl-31175153

ABSTRACT

To initiate X-Chromosome inactivation (XCI), the long noncoding RNA Xist mediates chromosome-wide gene silencing of one X Chromosome in female mammals to equalize gene dosage between the sexes. The efficiency of gene silencing is highly variable across genes, with some genes even escaping XCI in somatic cells. A gene's susceptibility to Xist-mediated silencing appears to be determined by a complex interplay of epigenetic and genomic features; however, the underlying rules remain poorly understood. We have quantified chromosome-wide gene silencing kinetics at the level of the nascent transcriptome using allele-specific Precision nuclear Run-On sequencing (PRO-seq). We have developed a Random Forest machine-learning model that can predict the measured silencing dynamics based on a large set of epigenetic and genomic features and tested its predictive power experimentally. The genomic distance to the Xist locus, followed by gene density and distance to LINE elements, are the prime determinants of the speed of gene silencing. Moreover, we find two distinct gene classes associated with different silencing pathways: a class that requires Xist-repeat A for silencing, which is known to activate the SPEN pathway, and a second class in which genes are premarked by Polycomb complexes and tend to rely on the B repeat in Xist for silencing, known to recruit Polycomb complexes during XCI. Moreover, a series of features associated with active transcriptional elongation and chromatin 3D structure are enriched at rapidly silenced genes. Our machine-learning approach can thus uncover the complex combinatorial rules underlying gene silencing during X inactivation.


Subjects
Genetic Epigenesis , Gene Silencing , Machine Learning , Long Noncoding RNA/physiology , X Chromosome Inactivation/genetics , Animals , Cell Line , Embryonic Stem Cells , Female , X-Linked Genes , Genome , Kinetics , Mice , Genetic Models
9.
Pediatr Allergy Immunol ; 33(6): e13802, 2022 06.
Article in English | MEDLINE | ID: mdl-35754128

ABSTRACT

BACKGROUND: Asthma exacerbations are a serious public health concern due to high healthcare resource utilization, work/school productivity loss, impact on quality of life, and risk of mortality. The genetic basis of asthma exacerbations has been studied in several populations, but no prior study has performed a multi-ancestry meta-analysis of genome-wide association studies (meta-GWAS) for this trait. We aimed to identify common genetic loci associated with asthma exacerbations across diverse populations and to assess their functional role in regulating DNA methylation and gene expression. METHODS: A meta-GWAS of asthma exacerbations in 4989 Europeans, 2181 Hispanics/Latinos, 1250 Singaporean Chinese, and 972 African Americans analyzed 9.6 million genetic variants. Suggestively associated variants (p ≤ 5 × 10⁻⁵) were assessed for replication in 36,477 European and 1078 non-European asthma patients. Functional effects on DNA methylation were assessed in 595 Hispanic/Latino and African American asthma patients and in publicly available databases. The effect on gene expression was evaluated in silico. RESULTS: One hundred and twenty-six independent variants were suggestively associated with asthma exacerbations in the discovery phase. Two variants independently replicated: rs12091010 located at vascular cell adhesion molecule-1/exostosin like glycosyltransferase-2 (VCAM1/EXTL2) (discovery: odds ratio for the T allele (OR_T) = 0.82, p = 9.05 × 10⁻⁶ and replication: OR_T = 0.89, p = 5.35 × 10⁻³) and rs943126 from pantothenate kinase 1 (PANK1) (discovery: odds ratio for the C allele (OR_C) = 0.85, p = 3.10 × 10⁻⁵ and replication: OR_C = 0.89, p = 1.30 × 10⁻²). Both variants regulate gene expression of genes where they locate and DNA methylation levels of nearby genes in whole blood. CONCLUSIONS: This multi-ancestry study revealed novel suggestive regulatory loci for asthma exacerbations located in genomic regions participating in inflammation and host defense.


Subjects
Asthma , Genome-Wide Association Study , Asthma/genetics , Genetic Predisposition to Disease , Hispanic or Latino/genetics , Humans , Single Nucleotide Polymorphism , Quality of Life
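
Pooling per-cohort effect estimates is the core arithmetic of a GWAS meta-analysis. The sketch below shows inverse-variance fixed-effect pooling of log odds ratios, which is a standard approach. The abstract does not say which model the authors used, so treating it as fixed-effect is an assumption, and the function name and inputs are hypothetical.

```python
import math

def fixed_effect_meta(log_odds_ratios, standard_errors):
    """Pool per-cohort log odds ratios with inverse-variance weights.

    Returns (pooled_log_or, pooled_se). Inputs are illustrative only;
    they are not taken from the study.
    """
    if len(log_odds_ratios) != len(standard_errors) or not log_odds_ratios:
        raise ValueError("need one standard error per log odds ratio")
    # Each cohort is weighted by the inverse of its variance (1 / SE^2).
    weights = [1.0 / (se ** 2) for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_odds_ratios)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se
```

Exponentiating the pooled value with `math.exp` gives the pooled odds ratio on the scale reported in the abstract.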
10.
Nucleic Acids Res ; 47(9): 4406-4417, 2019 05 21.
Article in English | MEDLINE | ID: mdl-30923827

ABSTRACT

In recent years, hundreds of novel RNA-binding proteins (RBPs) have been identified, leading to the discovery of novel RNA-binding domains. Furthermore, unstructured or disordered low-complexity regions of RBPs have been identified to play an important role in interactions with nucleic acids. However, these advances in understanding RBPs are limited mainly to eukaryotic species and we only have limited tools to faithfully predict RNA-binders in bacteria. Here, we describe a support vector machine-based method, called TriPepSVM, for the prediction of RNA-binding proteins. TriPepSVM applies string kernels to directly handle protein sequences using tri-peptide frequencies. Testing the method in human and bacteria, we find that several RBP-enriched tri-peptides occur more often in structurally disordered regions of RBPs. TriPepSVM outperforms existing applications, which consider classical structural features of RNA-binding or homology, in the task of RBP prediction in both human and bacteria. Finally, we predict 66 novel RBPs in Salmonella Typhimurium and validate the bacterial proteins ClpX, DnaJ and UbiG to associate with RNA in vivo.


Subjects
Amino Acid Motifs/genetics , Computational Biology , RNA-Binding Motifs/genetics , RNA-Binding Proteins/chemistry , Algorithms , Amino Acid Sequence/genetics , Binding Sites/genetics , Humans , Nucleic Acid Conformation , Protein Binding , RNA-Binding Proteins/genetics
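
The tri-peptide representation mentioned in the abstract can be illustrated with a minimal sketch. The code counts overlapping 3-mers in a protein sequence and computes a simple spectrum-kernel value between two sequences. The function names and example sequences are hypothetical. This is not the TriPepSVM implementation, only an illustration of the kind of feature a tri-peptide string kernel operates on.

```python
from collections import Counter

def tripeptide_counts(sequence: str) -> Counter:
    """Count all overlapping tri-peptides (3-mers) in a protein sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))

def tripeptide_frequencies(sequence: str) -> dict:
    """Relative frequency of each tri-peptide; empty dict if sequence is too short."""
    counts = tripeptide_counts(sequence)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()} if total else {}

def spectrum_kernel(a: str, b: str) -> float:
    """Dot product of the two tri-peptide count vectors (k = 3 spectrum kernel)."""
    ca, cb = tripeptide_counts(a), tripeptide_counts(b)
    # Counter returns 0 for tri-peptides absent from the other sequence.
    return float(sum(count * cb[kmer] for kmer, count in ca.items()))
```

A kernel matrix built from such values could be passed to any SVM implementation that accepts precomputed kernels.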
11.
BMC Bioinformatics ; 20(1): 292, 2019 May 29.
Article in English | MEDLINE | ID: mdl-31142264

ABSTRACT

BACKGROUND: Although several studies have provided insights into the role of long non-coding RNAs (lncRNAs), the majority of them have unknown function. Recent evidence has shown the importance of both lncRNAs and chromatin interactions in transcriptional regulation. Although network-based methods, mainly exploiting gene-lncRNA co-expression, have been applied to characterize lncRNA of unknown function by means of 'guilt-by-association', no strategy exists so far which identifies mRNA-lncRNA functional modules based on the 3D chromatin interaction graph. RESULTS: To better understand the function of chromatin interactions in the context of lncRNA-mediated gene regulation, we have developed a multi-step graph analysis approach to examine the RNA polymerase II ChIA-PET chromatin interaction network in the K562 human cell line. We have annotated the network with gene and lncRNA coordinates, and chromatin states from the ENCODE project. We used centrality measures, as well as an adaptation of our previously developed Markov State Models (MSM) clustering method, to gain a better understanding of lncRNAs in transcriptional regulation. The novelty of our approach resides in the detection of fuzzy regulatory modules based on network properties and their optimization based on co-expression analysis between genes and gene-lncRNA pairs. This results in our method returning more bona fide regulatory modules than other state-of-the-art approaches for clustering on graphs. CONCLUSIONS: Interestingly, we find that lncRNA network hubs tend to be significantly enriched in evolutionarily conserved lncRNAs and enhancer-like functions. We validated regulatory functions for well-known lncRNAs, such as MALAT1 and the enhancer-like lncRNA FALEC. In addition, by investigating the modular structure of bigger components we mine putative regulatory functions for uncharacterized lncRNAs.


Subjects
Chromatin/metabolism , Gene Regulatory Networks , Long Noncoding RNA/genetics , RNA Sequence Analysis/methods , Algorithms , Gene Expression Regulation , Humans , K562 Cells , Messenger RNA/genetics
12.
Bioinformatics ; 34(17): 3035-3037, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29659719

ABSTRACT

Summary: Convolutional neural networks (CNNs) have been shown to perform exceptionally well in a variety of tasks, including biological sequence classification. Available implementations, however, are usually optimized for a particular task and difficult to reuse. To enable researchers to utilize these networks more easily, we implemented pysster, a Python package for training CNNs on biological sequence data. Sequences are classified by learning sequence and structure motifs and the package offers an automated hyper-parameter optimization procedure and options to visualize learned motifs along with information about their positional and class enrichment. The package runs seamlessly on CPU and GPU and provides a simple interface to train and evaluate a network with a handful of lines of code. Using an RNA A-to-I editing dataset and cross-linking immunoprecipitation (CLIP)-seq binding site sequences, we demonstrate that pysster classifies sequences with higher accuracy than previous methods, such as GraphProt or ssHMM, and is able to recover known sequence and structure motifs. Availability and implementation: pysster is freely available at https://github.com/budach/pysster. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computer Neural Networks , Binding Sites , Machine Learning , Sequence Analysis , Software
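
CNNs that classify biological sequences typically take a one-hot encoded matrix as input. The sketch below shows that encoding for an RNA sequence using only the standard library. It deliberately does not call the pysster API; the function name and alphabet constant are hypothetical, and the repository linked in the abstract documents the package's actual interface.

```python
RNA_ALPHABET = "ACGU"  # assumed alphabet for this illustration

def one_hot(sequence: str, alphabet: str = RNA_ALPHABET) -> list:
    """Encode a sequence as a list of rows, one row per position.

    Each row has one entry per alphabet symbol. Characters outside the
    alphabet (for example N) are encoded as an all-zero row.
    """
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    encoded = []
    for character in sequence.upper():
        row = [0] * len(alphabet)
        if character in index:
            row[index[character]] = 1
        encoded.append(row)
    return encoded
```

The resulting position-by-symbol matrix is the shape a 1D convolution scans along, which is why learned filters can be read back as sequence motifs.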
13.
Nucleic Acids Res ; 45(19): 11004-11018, 2017 Nov 02.
Article in English | MEDLINE | ID: mdl-28977546

ABSTRACT

RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM's model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, recovers known RBP motifs from CLIP-Seq data, and scales linearly with the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on GitHub and as a Docker image.


Subjects
Algorithms , Computational Biology/methods , RNA-Binding Proteins/metabolism , RNA Sequence Analysis/methods , Base Sequence , Molecular Models , Nucleic Acid Conformation , Nucleotide Motifs/genetics , Protein Binding , Protein Domains , RNA/chemistry , RNA/genetics , RNA/metabolism , RNA-Binding Proteins/chemistry , Reproducibility of Results
14.
J Infect Dis ; 214(3): 454-63, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27130431

ABSTRACT

BACKGROUND: Legionella pneumophila is a causative agent of severe pneumonia. Infection leads to a broad host cell response, as evident, for example, on the transcriptional level. Chromatin modifications, which control gene expression, play a central role in the transcriptional response to L. pneumophila. METHODS: We infected human-blood-derived macrophages (BDMs) with L. pneumophila and used chromatin immunoprecipitation followed by sequencing to screen for gene promoters with the activating histone 4 acetylation mark. RESULTS: We found the promoter of tumor necrosis factor α-induced protein 2 (TNFAIP2) to be acetylated at histone H4. This factor has not been characterized in the pathology of L. pneumophila. TNFAIP2 messenger RNA and protein were upregulated in response to L. pneumophila infection of human-BDMs and human alveolar epithelial (A549) cells. We showed that L. pneumophila-induced TNFAIP2 expression is dependent on the NF-κB transcription factor. Importantly, knock down of TNFAIP2 led to reduced intracellular replication of L. pneumophila Corby in A549 cells. CONCLUSIONS: Taken together, genome-wide chromatin analysis of L. pneumophila-infected macrophages demonstrated induction of TNFAIP2, an NF-κB-dependent factor relevant for bacterial replication.


Subjects
Cytokines/analysis , Host-Pathogen Interactions , Legionella pneumophila/pathogenicity , Macrophages/chemistry , Macrophages/microbiology , Acetylation , Cell Line , Chromatin/chemistry , Chromatin Immunoprecipitation , Cytokines/genetics , Epithelial Cells/chemistry , Epithelial Cells/microbiology , Histones/analysis , Humans
15.
Cell Syst ; 14(10): 906-922.e6, 2023 10 18.
Article in English | MEDLINE | ID: mdl-37857083

ABSTRACT

Long non-coding RNAs (lncRNAs) are involved in gene expression regulation in cis. Although enriched in the cell chromatin fraction, to what degree this defines their regulatory potential remains unclear. Furthermore, the factors underlying lncRNA chromatin tethering, as well as the molecular basis of efficient lncRNA chromatin dissociation and its impact on enhancer activity and target gene expression, remain to be resolved. Here, we developed chrTT-seq, which combines the pulse-chase metabolic labeling of nascent RNA with chromatin fractionation and transient transcriptome sequencing to follow nascent RNA transcripts from their transcription on chromatin to release and allows the quantification of dissociation dynamics. By incorporating genomic, transcriptomic, and epigenetic metrics, as well as RNA-binding protein propensities, in machine learning models, we identify features that define transcript groups of different chromatin dissociation dynamics. Notably, lncRNAs transcribed from enhancers display reduced chromatin retention, suggesting that, in addition to splicing, their chromatin dissociation may shape enhancer activity.


Subjects
Chromatin , Long Noncoding RNA , Chromatin/genetics , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , Gene Expression Regulation/genetics , Nucleic Acid Regulatory Sequences , Transcriptome
16.
NAR Genom Bioinform ; 5(2): lqad026, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37007588

ABSTRACT

Dysfunction of regulatory elements through genetic variants is a central mechanism in the pathogenesis of disease. To better understand disease etiology, there is consequently a need to understand how DNA encodes regulatory activity. Deep learning methods show great promise for modeling of biomolecular data from DNA sequence but are limited to large input data for training. Here, we develop ChromTransfer, a transfer learning method that uses a pre-trained, cell-type agnostic model of open chromatin regions as a basis for fine-tuning on regulatory sequences. We demonstrate superior performances with ChromTransfer for learning cell-type specific chromatin accessibility from sequence compared to models not informed by a pre-trained model. Importantly, ChromTransfer enables fine-tuning on small input data with minimal decrease in accuracy. We show that ChromTransfer uses sequence features matching binding site sequences of key transcription factors for prediction. Together, these results demonstrate ChromTransfer as a promising tool for learning the regulatory code.

17.
Genome Biol ; 24(1): 180, 2023 08 04.
Article in English | MEDLINE | ID: mdl-37542318

ABSTRACT

We present RBPNet, a novel deep learning method, which predicts CLIP-seq crosslink count distribution from RNA sequence at single-nucleotide resolution. By training on up to a million regions, RBPNet achieves high generalization on eCLIP, iCLIP and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of the protein-specific and background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive sub-sequences that correspond to known and novel binding motifs and enables variant-impact scoring via in silico mutagenesis. Together, RBPNet improves imputation of protein-RNA interactions, as well as mechanistic interpretation of predictions.


Subjects
Base Sequence , Computer Simulation , Deep Learning , RNA-Binding Proteins , RNA , Humans , Alleles , Bias , Binding Sites , Consensus Sequence , Datasets as Topic , Internet , Mutation , Nucleotide Motifs , Nucleotides/metabolism , RNA/chemistry , RNA/genetics , RNA/metabolism , RNA Splice Sites , Messenger RNA/chemistry , Messenger RNA/genetics , Messenger RNA/metabolism , Viral RNA/chemistry , Viral RNA/genetics , Viral RNA/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism
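
In silico mutagenesis, as mentioned in the abstract, scores every possible single-nucleotide substitution by comparing a model's output on the mutant with its output on the reference. The sketch below shows that loop with a toy scoring function standing in for a trained model. `toy_score` and its motif are hypothetical and unrelated to RBPNet's actual predictions or interface.

```python
def in_silico_mutagenesis(sequence, score_fn, alphabet="ACGU"):
    """Return {(position, alt_base): score(mutant) - score(reference)}.

    `score_fn` is any callable mapping a sequence string to a number.
    """
    reference_score = score_fn(sequence)
    effects = {}
    for position, reference_base in enumerate(sequence):
        for alt_base in alphabet:
            if alt_base == reference_base:
                continue  # skip the unmutated base
            mutant = sequence[:position] + alt_base + sequence[position + 1:]
            effects[(position, alt_base)] = score_fn(mutant) - reference_score
    return effects

def toy_score(sequence):
    """Hypothetical stand-in for a model: counts occurrences of 'GCAUG'."""
    return float(sequence.count("GCAUG"))
```

Negative entries mark substitutions that the scoring function treats as disrupting binding, and positive entries mark predicted gains.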
18.
NAR Genom Bioinform ; 5(1): lqad010, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36814457

ABSTRACT

RNA-binding proteins (RBPs) are critical host factors for viral infection; however, large-scale experimental investigation of the binding landscape of human RBPs to viral RNAs is costly and further complicated by sequence variation between viral strains. To fill this gap, we investigated the role of RBPs in the context of SARS-CoV-2 by constructing the first in silico map of human RBP-viral RNA interactions at nucleotide-resolution using two deep learning methods (pysster and DeepRiPe) trained on data from CLIP-seq experiments on more than 100 human RBPs. We evaluated conservation of RBP binding between six other human pathogenic coronaviruses and identified sites of conserved and differential binding in the UTRs of SARS-CoV-1, SARS-CoV-2 and MERS. We scored the impact of mutations from 11 variants of concern on protein-RNA interaction, identifying a set of gain- and loss-of-binding events, as well as predicted the regulatory impact of putative future mutations. Lastly, we linked RBPs to functional, OMICs and COVID-19 patient data from other studies, and identified MBNL1, FTO and FXR2 RBPs as potential clinical biomarkers. Our results contribute towards a deeper understanding of how viruses hijack host cellular pathways and open new avenues for therapeutic intervention.

19.
Nucleic Acids Res ; 38(Database issue): D181-9, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19910368

ABSTRACT

Membrane proteins are important for many processes in the cell and are major drug targets. The increasing number of high-resolution structures available makes it possible, for the first time, to characterize local structural and functional motifs in alpha-helical transmembrane proteins. MeMotif (http://projects.biotec.tu-dresden.de/memotif) is a database and wiki which collects more than 2000 known and novel computationally predicted linear motifs in alpha-helical transmembrane proteins. Motifs are fully described in terms of several structural and functional features and are editable. Motifs contained in MeMotif can be used in different biological applications, from the identification of biochemically important functional residues which are candidates for mutagenesis experiments to the improvement of tools for transmembrane protein modeling.


Subjects
Amino Acid Motifs , Computational Biology/methods , Genetic Databases , Nucleic Acid Databases , Membrane Proteins/chemistry , Bacterial Proteins/chemistry , Computational Biology/trends , Protein Databases , Information Storage and Retrieval/methods , Internet , Secondary Protein Structure , Tertiary Protein Structure , Software
20.
Front Genet ; 13: 909714, 2022.
Article in English | MEDLINE | ID: mdl-35903362

ABSTRACT

COVID-19 is a heterogeneous disease caused by SARS-CoV-2. Aside from infections of the lungs, the disease can spread throughout the body and damage many other tissues, leading to multiorgan failure in severe cases. The highly variable symptom severity is influenced by genetic predispositions and preexisting diseases which have not been investigated in a large-scale multimodal manner. We present a holistic analysis framework, setting previously reported COVID-19 genes in context with prepandemic data, such as gene expression patterns across multiple tissues, polygenetic predispositions, and patient diseases, which are putative comorbidities of COVID-19. First, we generate a multimodal network using the prior-based network inference method KiMONo. We then embed the network to generate a meaningful lower-dimensional representation of the data. The input data are obtained via the Genotype-Tissue Expression project (GTEx), containing expression data from a range of tissues with genomic and phenotypic information of over 900 patients and 50 tissues. The generated network consists of nodes, that is, genes and polygenic risk scores (PRS) for several diseases/phenotypes, as well as for COVID-19 severity and hospitalization, and links between them if they are statistically associated in a regularized linear model by feature selection. Applying network embedding on the generated multimodal network allows us to perform efficient network analysis by identifying nodes close by in a lower-dimensional space that correspond to entities which are statistically linked. By determining the similarity between COVID-19 genes and other nodes through embedding, we identify disease associations to tissues, like the brain and gut. We also find strong associations between COVID-19 genes and various diseases such as ischemic heart disease, cerebrovascular disease, and hypertension. Moreover, we find evidence linking PTPN6 to a range of comorbidities along with the genetic predisposition of COVID-19, suggesting that this kinase is a central player in severe cases of COVID-19. In conclusion, our holistic network inference coupled with network embedding of multimodal data enables the contextualization of COVID-19-associated genes with respect to tissues, disease states, and genetic risk factors. Such contextualization can be exploited to further elucidate the biological importance of known and novel genes for severity of the disease in patients.
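
Finding nodes that lie close together in an embedding space usually reduces to a similarity ranking. The sketch below ranks nodes by cosine similarity to a query node. The abstract does not name the similarity measure used, so cosine similarity is an assumption, and the function names and any node labels passed in are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest_nodes(query, embeddings, k=3):
    """Return the k nodes most similar to `query` as (name, similarity) pairs."""
    query_vector = embeddings[query]
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in embeddings.items()
        if name != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Here `embeddings` is a plain dictionary mapping node names (genes, tissues, or risk scores) to their embedding vectors.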
