1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38622357

ABSTRACT

Pseudouridine is an RNA modification that is widely distributed in both prokaryotes and eukaryotes and plays a critical role in numerous biological activities. Despite its importance, precisely identifying pseudouridine sites through experimental approaches is challenging and requires substantial time and resources. Therefore, there is a growing need for computational techniques that can reliably and quickly identify pseudouridine sites from vast amounts of RNA sequencing data. In this study, we propose fuzzy kernel evidence Random Forest (FKeERF) to identify pseudouridine sites. The resulting method, PseU-FKeERF, demonstrates high accuracy in identifying pseudouridine sites from RNA sequencing data. The PseU-FKeERF model selects four RNA feature coding schemes with relatively good performance, combines their features, and feeds them into the newly proposed FKeERF method for category prediction. FKeERF not only uses fuzzy logic to expand the original feature space, but also incorporates kernel methods, which are generally easy to interpret, for category prediction. Both cross-validation tests and independent tests on benchmark datasets show that PseU-FKeERF has better predictive performance than several state-of-the-art methods. This new method not only improves the accuracy of pseudouridine site identification, but also provides a reference for future disease control and related drug development.
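The abstract does not say how FKeERF fuzzifies its inputs, so the following is only a minimal sketch of one common way fuzzy logic can expand a feature space. It assumes features scaled to [0, 1] and three triangular membership functions ("low", "medium", "high"). The names `triangular` and `fuzzy_expand` and the breakpoints are hypothetical and are not taken from PseU-FKeERF.

```python
from typing import List


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular fuzzy membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_expand(features: List[float]) -> List[float]:
    """Replace each feature in [0, 1] by three membership degrees.

    Assumption: evenly spaced 'low', 'medium', 'high' fuzzy sets.
    A real method would likely learn or tune these breakpoints.
    """
    expanded: List[float] = []
    for x in features:
        low = triangular(x, -0.5, 0.0, 0.5)
        medium = triangular(x, 0.0, 0.5, 1.0)
        high = triangular(x, 0.5, 1.0, 1.5)
        expanded.extend([low, medium, high])
    return expanded


# Three input features become nine fuzzy features.
expanded = fuzzy_expand([0.0, 0.25, 1.0])
```

With this choice each original feature contributes three columns, so a downstream classifier sees a space three times larger than the original encoding.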


Subjects
Pseudouridine, Random Forest, Pseudouridine/genetics, RNA/genetics, Base Sequence
2.
Front Biosci (Landmark Ed) ; 27(6): 177, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35748253

ABSTRACT

BACKGROUND: Channel proteins are proteins that transport molecules across the plasma membrane by free diffusion. Because experimental methods are costly and labor-intensive, a computational tool for identifying channel proteins is needed for biological research on them. METHODS: Seventeen feature coding methods and four machine learning classifiers were used to generate 68-dimensional probability features. A two-step feature selection strategy was then used to optimize the features, and the final prediction model, M16-LGBM (light gradient boosting machine), was obtained on the 16-dimensional optimal feature vector. RESULTS: A new predictor, CAPs-LGBM, was proposed to identify channel proteins effectively. CONCLUSIONS: CAPs-LGBM is the first machine learning predictor of channel proteins whose final prediction model is built on protein primary sequences. The classifier performed well on both the training and test sets.
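The 68 dimensions in METHODS follow from pairing every feature coding method with every classifier (17 × 4 = 68), each pair contributing one predicted probability. The sketch below only illustrates that bookkeeping. The encoders and classifiers are toy stand-ins invented for this example, not the ones used by CAPs-LGBM, and the two-step feature selection down to 16 dimensions is not shown.

```python
import math
from typing import Callable, List, Sequence

Encoder = Callable[[str], List[float]]
Classifier = Callable[[Sequence[float]], float]


def make_toy_encoder(residue: str) -> Encoder:
    """Hypothetical encoder: frequency of one residue in the sequence."""
    def encode(seq: str) -> List[float]:
        return [seq.count(residue) / len(seq)]
    return encode


def make_toy_classifier(slope: float) -> Classifier:
    """Hypothetical classifier: logistic squash of the mean feature value."""
    def predict(vec: Sequence[float]) -> float:
        mean = sum(vec) / len(vec)
        return 1.0 / (1.0 + math.exp(-slope * (mean - 0.1)))
    return predict


def probability_features(seq: str,
                         encoders: List[Encoder],
                         classifiers: List[Classifier]) -> List[float]:
    """One probability per (encoder, classifier) pair, in a fixed order."""
    features: List[float] = []
    for encode in encoders:
        vec = encode(seq)
        for classify in classifiers:
            features.append(classify(vec))
    return features


# 17 toy encoders and 4 toy classifiers give a 68-dimensional vector.
encoders = [make_toy_encoder(r) for r in "ACDEFGHIKLMNPQRST"]
classifiers = [make_toy_classifier(s) for s in (1.0, 2.0, 4.0, 8.0)]
features = probability_features("MKTLLVAGAL", encoders, classifiers)
```

In a real pipeline each classifier would be trained per encoding, and the probabilities would come from out-of-fold predictions so that the second-stage model does not see leaked labels.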


Subjects
Computational Biology, Proteins, Algorithms, Amino Acid Sequence, Computational Biology/methods, Machine Learning, Support Vector Machine
3.
Front Cell Dev Biol ; 8: 624766, 2020.
Article in English | MEDLINE | ID: mdl-34026747

ABSTRACT

Periodontal diseases are among the most frequent inflammatory diseases affecting children and adolescents; they affect the supporting structures of the teeth, lead to tooth loss, and contribute to systemic inflammation. Gingivitis is the most common periodontal infection. It is mainly caused by substances produced by microbial plaque, systemic disorders, and genetic abnormalities in the host. Identifying gingivitis-related genes across human tissues is significant not only for understanding disease mechanisms but also for disease development and clinical diagnosis. The genome-wide association study (GWAS) is a commonly used method to mine disease-related genetic variants. However, due to factors such as linkage disequilibrium, it is difficult for GWAS to identify genes directly related to the disease. Hence, we constructed a data integration method that uses Summary Mendelian randomization (SMR) to combine GWAS data with expression quantitative trait locus (eQTL) data to identify gingivitis-related genes. Five eQTL studies from different human tissues and one GWAS study were referenced in this paper. This study identified several candidate SNPs and genes related to gingivitis in a tissue-specific or cross-tissue manner. Further, we analyzed and explained the functions of these genes. The R program for the SMR method has been uploaded to GitHub (https://github.com/hxdde/SMR).
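To make the GWAS and eQTL combination concrete, here is a minimal Python sketch of the commonly used approximate SMR test for a single SNP-gene pair. It is not the authors' R program. The function and field names are hypothetical, and the inputs are assumed to be the eQTL effect of the SNP on expression (`b_zx`, `se_zx`) and the GWAS effect of the same SNP on the trait (`b_zy`, `se_zy`).

```python
import math
from typing import NamedTuple


class SmrResult(NamedTuple):
    b_xy: float     # estimated effect of gene expression on the trait
    t_smr: float    # approximate chi-square (1 df) test statistic
    p_value: float  # upper-tail p-value of t_smr


def smr_test(b_zx: float, se_zx: float,
             b_zy: float, se_zy: float) -> SmrResult:
    """Approximate SMR test for one SNP used as the instrument.

    b_zx, se_zx: SNP effect on expression (eQTL summary statistics).
    b_zy, se_zy: SNP effect on the trait (GWAS summary statistics).
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    b_xy = b_zy / b_zx  # ratio estimate of the expression-to-trait effect
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)
    # For a chi-square variable with 1 df, P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return SmrResult(b_xy, t_smr, p_value)


# Illustrative numbers only, not taken from the study.
result = smr_test(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02)
```

Because the statistic is bounded by the smaller of the two squared z-scores, a gene can only reach significance when both the eQTL and the GWAS signal at the SNP are strong. A single-SNP test like this cannot by itself separate a shared causal variant from two variants in linkage disequilibrium.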
