1.
Brief Bioinform ; 24(4), 2023 07 20.
Article in English | MEDLINE | ID: mdl-37280185

ABSTRACT

The three-dimensional structure of RNA molecules plays a critical role in a wide range of cellular processes encompassing functions from riboswitches to epigenetic regulation. These RNA structures are incredibly dynamic and can indeed be described aptly as an ensemble of structures that shifts in distribution depending on different cellular conditions. Thus, the computational prediction of RNA structure poses a unique challenge, even as computational protein folding has seen great advances. In this review, we focus on a variety of machine learning-based methods that have been developed to predict RNA molecules' secondary structure, as well as more complex tertiary structures. We survey commonly used modeling strategies, and how many are inspired by or incorporate thermodynamic principles. We discuss the shortcomings that various design decisions entail and propose future directions that could build off these methods to yield more robust, accurate RNA structure predictions.


Subjects
Genetic Epigenesis , RNA , RNA/metabolism , Machine Learning , Protein Secondary Structure , Computational Biology/methods
2.
Proc Natl Acad Sci U S A ; 118(15), 2021 04 13.
Article in English | MEDLINE | ID: mdl-33827925

ABSTRACT

Simultaneous profiling of multiomic modalities within a single cell is a grand challenge for single-cell biology. While there have been impressive technical innovations demonstrating feasibility (for example, generating paired measurements of single-cell transcriptome (single-cell RNA sequencing [scRNA-seq]) and chromatin accessibility (single-cell assay for transposase-accessible chromatin using sequencing [scATAC-seq])), widespread application of joint profiling is challenging due to its experimental complexity, noise, and cost. Here, we introduce BABEL, a deep learning method that translates between the transcriptome and chromatin profiles of a single cell. Leveraging an interoperable neural network model, BABEL can predict single-cell expression directly from a cell's scATAC-seq and vice versa after training on relevant data. This makes it possible to computationally synthesize paired multiomic measurements when only one modality is experimentally available. Across several paired single-cell ATAC and gene expression datasets in human and mouse, we validate that BABEL accurately translates between these modalities for individual cells. BABEL also generalizes well to cell types within new biological contexts not seen during training. Starting from scATAC-seq of patient-derived basal cell carcinoma (BCC), BABEL generated single-cell expression that enabled fine-grained classification of complex cell states, despite having never seen BCC data. These predictions are comparable to analyses of experimental BCC scRNA-seq data for diverse cell types related to BABEL's training data. We further show that BABEL can incorporate additional single-cell data modalities, such as protein epitope profiling, thus enabling translation across chromatin, RNA, and protein. BABEL offers a powerful approach for data exploration and hypothesis generation.


Subjects
Carcinoma/genetics , Genomics/methods , Single-Cell Analysis/methods , Software , Animals , Carcinoma/metabolism , Deep Learning , Humans , Mice , Proteome/genetics , Proteome/metabolism , Transcriptome
3.
RNA ; 26(7): 851-865, 2020 07.
Article in English | MEDLINE | ID: mdl-32220894

ABSTRACT

Subcellular localization is essential to RNA biogenesis, processing, and function across the gene expression life cycle. However, the specific nucleotide sequence motifs that direct RNA localization are incompletely understood. Fortunately, new sequencing technologies have provided transcriptome-wide atlases of RNA localization, creating an opportunity to leverage computational modeling. Here we present RNA-GPS, a new machine learning model that uses nucleotide-level features to predict RNA localization across eight different subcellular locations, the first to provide such a wide range of predictions. RNA-GPS's design enables high-throughput sequence ablation and feature importance analyses to probe the sequence motifs that drive localization prediction. We find localization-informative motifs to be concentrated on 3'-UTRs and scattered along the coding sequence, and motifs related to splicing to be important drivers of predicted localization, even for cytotopic distinctions for membraneless bodies within the nucleus or for organelles within the cytoplasm. Overall, our results suggest transcript splicing is one of many elements influencing RNA subcellular localization.


Subjects
Alternative Splicing/genetics , RNA/genetics , 3' Untranslated Regions/genetics , Tumor Cell Line , Cell Nucleus/genetics , Computational Biology/methods , Cytoplasm/genetics , HeLa Cells , Humans , K562 Cells , RNA Sequence Analysis/methods , Transcriptome/genetics
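
The abstract of entry 3 mentions nucleotide-level features and high-throughput sequence ablation. As a rough illustration only, the sketch below computes k-mer frequencies for an RNA sequence and runs a sliding-window ablation scan against a scoring function. Everything named here (`kmer_frequencies`, `ablation_scan`, the toy `score_fn`, the choice of `N` as a masking character) is a hypothetical stand-in and is not taken from the RNA-GPS code or paper; a real analysis would plug in a trained localization model as `score_fn`.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGU"):
    """Return the relative frequency of every k-mer over `alphabet`.

    DNA-style input is accepted: T is converted to U. K-mers containing
    other characters (for example the N mask used below) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def ablation_scan(seq, score_fn, window=3):
    """Mask each window with N and record how much the score drops.

    `score_fn` is an assumption: any callable mapping a sequence to a
    number, such as a trained classifier's predicted probability.
    Returns a list of (window_start, score_drop) pairs.
    """
    base = score_fn(seq)
    deltas = []
    for start in range(len(seq) - window + 1):
        ablated = seq[:start] + "N" * window + seq[start + window:]
        deltas.append((start, base - score_fn(ablated)))
    return deltas


if __name__ == "__main__":
    # Toy scorer (illustrative only): counts one fixed motif.
    toy_score = lambda s: s.count("GGACU")
    rna = "AAAAAAGGACUAAAAAA"
    for start, drop in ablation_scan(rna, toy_score):
        if drop:
            print(f"masking window at {start} lowers the score by {drop}")
```

Windows whose masking lowers the score overlap the positions the scorer depends on, which is the basic idea behind using ablation to locate informative motifs.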
4.
Nat Commun ; 15(1): 1059, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38316764

ABSTRACT

The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a diffusion-based generative model that generates protein backbone structures via a procedure inspired by the natural folding process. We describe a protein backbone structure as a sequence of angles capturing the relative orientation of the constituent backbone atoms, and generate structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins natively twist into energetically favorable conformations, but the inherent shift and rotational invariance of this representation also crucially alleviates the need for more complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally-occurring proteins. As a useful resource, we release an open-source codebase and trained models for protein structure diffusion.


Subjects
Protein Folding , Proteins , Proteins/metabolism , Computer Neural Networks , Protein Conformation
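
The abstract of entry 4 describes a backbone as a sequence of angles and generation as denoising from a random state. As a minimal sketch of the forward (noising) half of a denoising diffusion probabilistic model applied to angles, the code below adds Gaussian noise according to a linear variance schedule and wraps the result back into [-pi, pi). The schedule values, the wrapping step, and all function names are assumptions made for illustration; they are not taken from the authors' released codebase, and the learned denoising network is omitted entirely.

```python
import math
import random


def wrap_angle(x):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (values are illustrative assumptions)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), often written alpha-bar_t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def noise_angles(angles, t, alpha_bars, rng):
    """Sample noised angles at step t directly from the clean angles.

    Uses the standard closed form x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps
    with eps ~ N(0, 1), then wraps each value back into [-pi, pi).
    """
    a = alpha_bars[t]
    return [
        wrap_angle(math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0))
        for x in angles
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    clean = [-1.0, 2.3, 0.5]  # made-up angles in radians
    print(noise_angles(clean, 999, alpha_bars, rng))
```

Because the representation is a list of relative angles rather than absolute coordinates, translating or rotating the whole structure leaves the inputs unchanged, which is the invariance property the abstract refers to.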
5.
Cell Syst ; 11(1): 102-108.e3, 2020 07 22.
Article in English | MEDLINE | ID: mdl-32673562

ABSTRACT

SARS-CoV-2 genomic and subgenomic RNA (sgRNA) transcripts hijack the host cell's machinery. Subcellular localization of its viral RNA could, thus, play important roles in viral replication and host antiviral immune response. We perform computational modeling of SARS-CoV-2 viral RNA subcellular residency across eight subcellular neighborhoods. We compare hundreds of SARS-CoV-2 genomes with the human transcriptome and other coronaviruses. We predict the SARS-CoV-2 RNA genome and sgRNAs to be enriched toward the host mitochondrial matrix and nucleolus, and that the 5' and 3' viral untranslated regions contain the strongest, most distinct localization signals. We interpret the mitochondrial residency signal as an indicator of intracellular RNA trafficking with respect to double-membrane vesicles, a critical stage in the coronavirus life cycle. Our computational analysis serves as a hypothesis generation tool to suggest models for SARS-CoV-2 biology and inform experimental efforts to combat the virus. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.


Subjects
Betacoronavirus/genetics , Cell Nucleolus/virology , Coronavirus Infections/virology , Mitochondria/virology , Viral Pneumonia/virology , Viral RNA/metabolism , Betacoronavirus/metabolism , COVID-19 , Cell Nucleolus/metabolism , Genetic Databases , Viral Genome , Humans , Machine Learning , Mitochondria/metabolism , Genetic Models , Pandemics , Viral RNA/genetics , SARS-CoV-2