Results 1 - 4 of 4
1.
PLoS Comput Biol ; 19(3): e1010948, 2023 03.
Article in English | MEDLINE | ID: mdl-36897885

ABSTRACT

G-quadruplexes are non-B-DNA structures that form in the genome facilitated by Hoogsteen bonds between guanines in single or multiple strands of DNA. The functions of G-quadruplexes are linked to various molecular and disease phenotypes, and thus researchers are interested in measuring G-quadruplex formation genome-wide. Experimentally measuring G-quadruplexes is a long and laborious process. Computational prediction of G-quadruplex propensity from a given DNA sequence is thus a long-standing challenge. Unfortunately, despite the availability of high-throughput datasets measuring G-quadruplex propensity in the form of mismatch scores, extant methods to predict G-quadruplex formation either rely on small datasets or are based on domain-knowledge rules. We developed G4mismatch, a novel algorithm to accurately and efficiently predict G-quadruplex propensity for any genomic sequence. G4mismatch is based on a convolutional neural network trained on almost 400 million human genomic loci measured in a single G4-seq experiment. When tested on sequences from a held-out chromosome, G4mismatch, the first method to predict mismatch scores genome-wide, achieved a Pearson correlation of over 0.8. When benchmarked on independent datasets derived from various animal species, G4mismatch trained on human data predicted G-quadruplex propensity genome-wide with high accuracy (Pearson correlations greater than 0.7). Moreover, when tested in detecting G-quadruplexes genome-wide using the predicted mismatch scores, G4mismatch achieved superior performance compared to extant methods. Last, we demonstrate the ability to deduce the mechanism behind G-quadruplex formation by unique visualization of the principles learned by the model.


Subjects
G-Quadruplexes , Animals , Humans , DNA/genetics , DNA/chemistry , Genome, Human , Genomics , Neural Networks, Computer
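
The evaluation described in the abstract above (testing on a held-out chromosome and reporting a Pearson correlation between measured and predicted mismatch scores) can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout, the function names `pearson` and `split_by_chromosome`, and all numeric values are assumptions made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def split_by_chromosome(records, held_out):
    """Split (chromosome, measured, predicted) records by chromosome name."""
    train = [r for r in records if r[0] != held_out]
    test = [r for r in records if r[0] == held_out]
    return train, test


# Toy, made-up values for illustration only; not data from the paper.
records = [
    ("chr1", 10.0, 12.0),
    ("chr1", 30.0, 28.0),
    ("chr2", 5.0, 6.0),
    ("chr2", 20.0, 18.0),
    ("chr2", 40.0, 43.0),
]

# Hold out one whole chromosome so no test locus is seen during training.
train, test = split_by_chromosome(records, held_out="chr2")
measured = [m for _, m, _ in test]
predicted = [p for _, _, p in test]
r = pearson(measured, predicted)
print(f"Pearson r on held-out chromosome: {r:.3f}")
```

Splitting by whole chromosome rather than by random loci is a common way to limit leakage between overlapping or neighboring genomic windows.
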
2.
Nucleic Acids Res ; 50(20): 11426-11441, 2022 11 11.
Article in English | MEDLINE | ID: mdl-36350614

ABSTRACT

RNA G-quadruplexes (rG4s) are RNA secondary structures that are formed by guanine-rich sequences and have important cellular functions. Existing computational tools for rG4 prediction rely on specific sequence features and/or were trained on small datasets, without considering rG4 stability information, and are therefore sub-optimal. Here, we developed rG4detector, a convolutional neural network to identify potential rG4s in transcriptomics data. rG4detector outperforms existing methods both in predicting rG4 stability and in detecting rG4-forming sequences. To demonstrate the biological relevance of rG4detector, we employed it to study RNAs that are bound by the RNA-binding protein G3BP1. G3BP1 is central to the induction of stress granules (SGs), which are cytoplasmic biomolecular condensates that form in response to a variety of cellular stresses. Unexpectedly, rG4detector revealed a dynamic enrichment of rG4s bound by G3BP1 in response to cellular stress. In addition, we experimentally characterized G3BP1 cross-talk with rG4s, demonstrating that G3BP1 is a bona fide rG4-binding protein and that endogenous rG4s are enriched within SGs. Furthermore, we found that reduced rG4 availability impairs SG formation. Hence, we conclude that rG4s play a direct role in SG biology via their interactions with RNA-binding proteins and that rG4detector is a novel and useful tool for rG4 transcriptomics data analyses.


Subjects
G-Quadruplexes , RNA-Binding Proteins , Stress Granules , DNA Helicases/genetics , DNA Helicases/metabolism , Poly-ADP-Ribose Binding Proteins/genetics , Poly-ADP-Ribose Binding Proteins/metabolism , RNA/chemistry , RNA Helicases/genetics , RNA Helicases/metabolism , RNA Recognition Motif Proteins/genetics , RNA Recognition Motif Proteins/metabolism , RNA-Binding Proteins/metabolism
3.
IEEE/ACM Trans Comput Biol Bioinform ; 19(4): 1946-1955, 2022.
Article in English | MEDLINE | ID: mdl-33872156

ABSTRACT

G-quadruplexes (G4s) are nucleic acid secondary structures that form within guanine-rich DNA or RNA sequences. G4 formation can affect chromatin architecture and gene regulation, and has been associated with genomic instability, genetic diseases, and cancer progression. The experimental data produced by the G4-seq experiment provide unprecedented detail on G4 formation in the genome. Still, running the experimental protocol on a whole genome is an expensive and time-consuming process. Thus, it is highly desirable to have a computational method to predict G4 formation in new DNA sequences or whole genomes. Here, we present G4detector, a new method based on a convolutional neural network to predict G4s from DNA sequences. On top of the sequence information, we improved prediction accuracy by the addition of RNA secondary structure information. To train and test G4detector, we compiled novel high-throughput benchmarks over the genomes of multiple species measured by the G4-seq protocol. We show that G4detector outperforms extant methods for the same task on all benchmark datasets, can detect G4s genome-wide with high accuracy, and can generalize from human training data to various non-human species. The code and benchmarks are publicly available on github.com/OrensteinLab/G4detector.


Subjects
G-Quadruplexes , DNA/chemistry , DNA/genetics , Genome , Neural Networks, Computer , RNA/chemistry
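
A convolutional neural network that predicts G4s from DNA sequence, as described in the abstract above, needs the sequence in numeric form. A common choice is one-hot encoding, sketched below. This is an illustrative assumption about the input representation, not code taken from the G4detector repository; the function name `one_hot_encode`, the A, C, G, T channel order, and the all-zero row for unknown bases are choices made for this example.

```python
BASES = "ACGT"  # channel order is an assumption made for this sketch


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Bases outside A, C, G, T (for example N) become an all-zero row.
    """
    encoded = []
    for base in sequence.upper():
        row = [0.0, 0.0, 0.0, 0.0]
        index = BASES.find(base)
        if index >= 0:
            row[index] = 1.0
        encoded.append(row)
    return encoded


# Each row has exactly one 1.0 in the column of its base.
for base, row in zip("GGGA", one_hot_encode("GGGA")):
    print(base, row)
```

The resulting length-by-4 matrix is the kind of input over which one-dimensional convolution filters can slide, so each filter can act as a learned sequence motif detector.
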
4.
ACS Chem Biol ; 15(4): 925-935, 2020 04 17.
Article in English | MEDLINE | ID: mdl-32216326

ABSTRACT

Single-stranded DNA (ssDNA) containing four guanine repeats can form G-quadruplex (G4) structures. While cellular proteins and small molecules can bind G4s, it has been difficult to broadly assess their DNA-binding specificity. Here, we use custom DNA microarrays to examine the binding specificities of proteins, small molecules, and antibodies across ∼15,000 potential G4 structures. Molecules used include fluorescently labeled pyridostatin (Cy5-PDS, a small molecule), BG4 (Cy5-BG4, a G4-specific antibody), and eight proteins (GST-tagged nucleolin, IGF2, CNBP, FANCJ, PIF1, BLM, DHX36, and WRN). Cy5-PDS and Cy5-BG4 selectively bind sequences known to form G4s, confirming their formation on the microarrays. Cy5-PDS binding decreased when G4 formation was inhibited using lithium or when ssDNA features on the microarray were made double-stranded. Similar conditions inhibited the binding of all other molecules except for CNBP and PIF1. We report that proteins have different G4-binding preferences, suggesting unique cellular functions. Finally, competition experiments are used to assess the binding specificity of an unlabeled small molecule, revealing the structural features in the G4 required to achieve selectivity. These data demonstrate that the microarray platform can be used to assess the binding preferences of molecules to G4s on a broad scale, helping to understand the properties that govern molecular recognition.


Subjects
DNA, Single-Stranded/metabolism , DNA-Binding Proteins/metabolism , G-Quadruplexes , DNA, Single-Stranded/genetics , Humans , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide , Protein Binding
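
The abstract above notes that ssDNA containing four guanine repeats can form a G4. A widely used rule-of-thumb for spotting such candidates is a pattern of four runs of at least three guanines separated by short loops. The sketch below assumes loops of 1 to 7 bases, which is a common convention and not a parameter stated in this record; the name `find_putative_g4` is hypothetical, and the scan covers only the given strand.

```python
import re

# Four G-runs (>= 3 guanines each) separated by three loops of 1 to 7 bases.
# The loop-length range is an assumed convention, not taken from the record.
G4_MOTIF = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_putative_g4(sequence):
    """Return (start, end, match) for non-overlapping motif hits on this strand."""
    return [
        (m.start(), m.end(), m.group())
        for m in G4_MOTIF.finditer(sequence.upper())
    ]


# Made-up example sequence with four GGG runs and single-base loops.
for start, end, hit in find_putative_g4("TTGGGAGGGTGGGAGGGTT"):
    print(start, end, hit)
```

A pattern match only flags a candidate sequence. Whether a structure actually forms, and what binds it, is what experiments such as the microarray assays described above measure.
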