GANSamples-ac4C: Enhancing ac4C site prediction via generative adversarial networks and transfer learning.
Anal Biochem; 689: 115495, 2024 Jun.
Article | En | MEDLINE | ID: mdl-38431142
ABSTRACT
N4-acetylcytidine (ac4C) is an RNA modification catalyzed by N-acetyltransferase 10 (NAT10) that plays an essential role in tRNA, rRNA, and mRNA. It influences various cellular functions, including mRNA stability and rRNA biosynthesis. Wet-lab detection of ac4C modification sites is resource-intensive and costly, so various machine learning and deep learning techniques have been employed to detect these sites computationally. However, the known ac4C modification sites are too few to train an accurate and stable prediction model. This study introduces GANSamples-ac4C, a novel framework that combines transfer learning with a generative adversarial network (GAN) to generate synthetic RNA sequences for training a better ac4C modification site prediction model. Comparative analysis shows that GANSamples-ac4C outperforms existing state-of-the-art methods in identifying ac4C sites. Moreover, our results underscore the potential of synthetic data to mitigate data scarcity in biological sequence prediction tasks. Another major advantage of GANSamples-ac4C is its interpretable decision logic. Multi-faceted interpretability analyses identify key regions in the ac4C sequences that influence the discrimination between positive and negative samples, a pronounced enrichment of G in these regions, and ac4C-associated motifs. These findings may offer novel insights for ac4C research. The GANSamples-ac4C framework and its source code are publicly accessible at http://www.healthinformaticslab.org/supp/.
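The G enrichment mentioned in the abstract can be illustrated with a simple per-position comparison of nucleotide frequencies between positive and negative samples. The sketch below is only an illustration under stated assumptions: the function names `base_frequency` and `enrichment` are hypothetical, the toy sequences are invented for the example, and the abstract does not say that the authors computed enrichment this way.

```python
def base_frequency(seqs, base="G"):
    """Return, for each position, the fraction of sequences carrying `base`."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    return [sum(s[i] == base for s in seqs) / len(seqs) for i in range(length)]


def enrichment(pos_seqs, neg_seqs, base="G"):
    """Per-position frequency difference (positive minus negative) for `base`."""
    pos = base_frequency(pos_seqs, base)
    neg = base_frequency(neg_seqs, base)
    return [p - n for p, n in zip(pos, neg)]


# Toy, made-up equal-length RNA windows (not real ac4C data).
positives = ["GGCAU", "AGCGU", "CGCAA"]
negatives = ["AUCAU", "AACGU", "CUCAA"]

diff = enrichment(positives, negatives)
for position, value in enumerate(diff):
    print(position, round(value, 3))
```

Positions with a large positive difference are those where G is more frequent in the positive set. On real data, such a difference would still need a statistical test before being reported as enrichment.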
Full text: 1
Database: MEDLINE
Main subject: Cytidine / Machine Learning
Language: En
Journal: Anal Biochem
Year: 2024
Document type: Article
Country of affiliation: China