1.
bioRxiv ; 2024 Feb 17.
Article in English | MEDLINE | ID: mdl-38405704

ABSTRACT

Neural networks have emerged as immensely powerful tools in predicting functional genomic regions, notably evidenced by recent successes in deciphering gene regulatory logic. However, a systematic evaluation of how model architectures and training strategies impact genomics model performance is lacking. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast, to best capture the relationship between regulatory DNA and gene expression. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. While some benchmarks produced similar results across the top-performing models, others differed substantially. All top-performing models used neural networks, but diverged in architectures and novel training strategies, tailored to genomics sequence data. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide any given model into logically equivalent building blocks. We tested all possible combinations for the top three models and observed performance improvements for each. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets. Overall, we demonstrate that high-quality gold-standard genomics datasets can drive significant progress in model development.
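The "all possible combinations" step described in the abstract above can be pictured as a Cartesian product over interchangeable building blocks. The following is only a minimal sketch under stated assumptions: the slot names and model labels are hypothetical placeholders, not the Prix Fixe framework's actual module names or API, and the abstract does not state how many building blocks each model is divided into.

```python
from itertools import product

# Hypothetical slot names and model labels, for illustration only.
# The abstract does not specify the real building blocks.
SLOTS = ("data_processing", "first_layers", "core_layers", "final_layers")
MODELS = ("model_A", "model_B", "model_C")


def enumerate_combinations(slots=SLOTS, models=MODELS):
    """Yield one dict per combination, mapping each slot to its source model."""
    for choice in product(models, repeat=len(slots)):
        yield dict(zip(slots, choice))


combos = list(enumerate_combinations())
print(len(combos))   # number of hybrid models to train and evaluate
print(combos[0])     # the combination that takes every block from model_A
```

With three source models and four assumed slots, the search space is 3^4 = 81 hybrid models; the count grows exponentially with the number of slots, which is why restricting the search to a few top models keeps it tractable.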

2.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37490428

ABSTRACT

MOTIVATION: The increasing volume of data from high-throughput experiments including parallel reporter assays facilitates the development of complex deep-learning approaches for modeling DNA regulatory grammar. RESULTS: Here, we introduce LegNet, an EfficientNetV2-inspired convolutional network for modeling short gene regulatory regions. By approaching the sequence-to-expression regression problem as a soft classification task, LegNet secured first place for the autosome.org team in the DREAM 2022 challenge of predicting gene expression from gigantic parallel reporter assays. Using published data, here, we demonstrate that LegNet outperforms existing models and accurately predicts gene expression per se as well as the effects of single-nucleotide variants. Furthermore, we show how LegNet can be used in a diffusion network manner for the rational design of promoter sequences yielding the desired expression level. AVAILABILITY AND IMPLEMENTATION: https://github.com/autosome-ru/LegNet. The GitHub repository includes Jupyter Notebook tutorials and Python scripts under the MIT license to reproduce the results presented in the study.
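Treating regression as soft classification, as the abstract above describes, generally means replacing a single continuous target with a probability distribution over expression bins and recovering a point prediction as the expected bin value. The sketch below illustrates that general idea only; it is not LegNet's implementation. The bin count (18), the Gaussian spreading, and the `sigma` value are assumptions chosen for illustration.

```python
import math


def soft_labels(value, n_bins=18, sigma=0.5):
    """Spread a continuous value over integer-centered bins 0..n_bins-1.

    Bin k covers [k - 0.5, k + 0.5); the first and last bins absorb the
    tails of the Gaussian. n_bins and sigma are illustrative assumptions.
    """
    def cdf(x):
        # Gaussian CDF with mean `value` and standard deviation `sigma`.
        return 0.5 * (1.0 + math.erf((x - value) / (sigma * math.sqrt(2.0))))

    probs = []
    for k in range(n_bins):
        lo = -math.inf if k == 0 else k - 0.5
        hi = math.inf if k == n_bins - 1 else k + 0.5
        probs.append(cdf(hi) - cdf(lo))
    return probs


def expected_value(probs):
    """Collapse a distribution over bins back into a single prediction."""
    return sum(k * p for k, p in enumerate(probs))


target = soft_labels(7.3)
print(max(range(len(target)), key=target.__getitem__))  # most probable bin
print(expected_value(target))  # close to, but not exactly, the input value
```

A model trained this way would output one probability per bin (for example through a softmax layer) and be fit against `target` with a cross-entropy or KL-divergence loss; `expected_value` then turns the predicted distribution into an expression estimate.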


Subject(s)
Deep Learning; Regulatory Sequences, Nucleic Acid; DNA; Promoter Regions, Genetic; Software
3.
Data Brief ; 23: 103701, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30815525

ABSTRACT

TMA20 (MCT-1), TMA22 (DENR) and TMA64 (eIF2D) are eukaryotic translation factors involved in ribosome recycling and re-initiation. They operate with P-site bound tRNA in post-termination or (re-)initiation translation complexes, thus participating in the removal of the 40S ribosomal subunit from mRNA stop codons after termination and controlling translation re-initiation on mRNAs with upstream open reading frames (uORFs), as well as de novo initiation on some specific mRNAs. Here we report ribosome profiling data for S. cerevisiae strains with individual deletions of TMA20, TMA64 or both TMA20 and TMA64 genes. We provide RNA-Seq and Ribo-Seq data from yeast strains grown in rich YPD or minimal SD medium. We illustrate our data by plotting the differential distribution of ribosome-bound mRNA fragments across uORFs in the 5'-untranslated region (5' UTR) of GCN4 mRNA and on transcripts encoded in the MAT locus in the mutant and wild-type strains, thus providing a basis for investigating the role of these factors in the stress response, mating and sporulation. We also document a shift of the transcription start site of the APC4 gene, which occurs when the neighboring TMA64 gene is replaced by the standard G418-resistance cassette used for the creation of the Yeast Deletion Library. This shift results in dramatic deregulation of APC4 gene expression, as revealed by our Ribo-Seq data, which may explain the strong genetic interactions of TMA64 with genes involved in the cell cycle and mitotic checkpoints. Raw RNA-Seq and Ribo-Seq data as well as all gene counts are available in the NCBI Gene Expression Omnibus (GEO) repository under GEO accession GSE122039 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122039).

4.
BMC Evol Biol ; 17(Suppl 2): 258, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29297306

ABSTRACT

BACKGROUND: The gray whale, Eschrichtius robustus (E. robustus), is the sole member of the family Eschrichtiidae, which is considered to be the most primitive family within Cetacea. The gray whale is often described as a "living fossil". It is adapted to extreme marine conditions and has a high life expectancy (77 years). The assembly of a gray whale genome and transcriptome will enable further studies of whale evolution, longevity, and resistance to extreme environments. RESULTS: In this work, we report the first de novo assembly and primary analysis of the E. robustus genome and transcriptome based on kidney and liver samples. The presented draft genome assembly is 55% complete in terms of total genome length, but only 24% complete in terms of BUSCO complete gene groups, although 10,895 genes were identified. Transcriptome annotation and comparison with other whale species revealed robust expression of DNA repair and hypoxia-response genes, which is expected for whales. CONCLUSIONS: This preliminary study of the gray whale genome and transcriptome provides new data to better understand whale evolution and the mechanisms of whale adaptation to hypoxic conditions.


Subject(s)
Genome; Transcriptome/genetics; Whales/genetics; Animals; Gene Expression Regulation; Gene Library; Molecular Sequence Annotation; Phylogeny
5.
Nucleic Acids Res ; 44(15): 7228-41, 2016 09 06.
Article in English | MEDLINE | ID: mdl-27137890

ABSTRACT

According to recent models, as yet poorly studied architectural proteins appear to be required for local regulation of enhancer-promoter interactions, as well as for global chromosome organization. The transcription factors ZIPIC, Pita and Zw5 belong to the class of chromatin insulator proteins; they preferentially bind to promoters near the TSS and extensively colocalize with cohesin and condensin complexes. ZIPIC, Pita and Zw5 are structurally similar in containing the N-terminal zinc finger-associated domain (ZAD) and different numbers of C2H2-type zinc fingers at the C-terminus. Here we show that the ZAD domains of ZIPIC, Pita and Zw5 form homodimers. In Drosophila transgenic lines, these proteins are able to support long-distance interaction between the GAL4 activator and the reporter gene promoter. However, no functional interaction between binding sites for different proteins has been revealed, suggesting that such interactions are highly specific. ZIPIC facilitates long-distance stimulation of the reporter gene by the GAL4 activator in a yeast model system. Many of the genomic binding sites of ZIPIC, Pita and Zw5 are located at the boundaries of topologically associated domains (TADs). Thus, ZAD-containing zinc-finger proteins can be attributed to the class of architectural proteins.


Subject(s)
DNA-Binding Proteins/chemistry; DNA-Binding Proteins/metabolism; Drosophila Proteins/chemistry; Drosophila Proteins/metabolism; Drosophila melanogaster/metabolism; Protein Multimerization; Transcription Factors/chemistry; Transcription Factors/metabolism; Animals; Animals, Genetically Modified; Binding Sites; Cell Line; Drosophila Proteins/genetics; Drosophila melanogaster/chemistry; Drosophila melanogaster/cytology; Drosophila melanogaster/embryology; Female; Genes, Reporter/genetics; Male; Promoter Regions, Genetic; Protein Binding; Protein Domains; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/metabolism; Substrate Specificity; Transcription Factors/genetics; Transgenes/genetics; Zinc Fingers