Results 1 - 11 of 11
1.
Genomics ; 116(1): 110749, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38008265

ABSTRACT

MOTIVATION: N4-acetylcytidine (ac4C) is a highly conserved RNA modification that plays a crucial role in various biological processes. Accurately identifying ac4C sites is essential for understanding their regulatory mechanisms. However, existing experimental techniques for ac4C site identification are costly, and the accuracy of current computational methods needs further improvement. RESULTS: In this paper, we present MetaAc4C, a deep learning model that leverages pre-trained bidirectional encoder representations from transformers (BERT). The model is based on a bidirectional long short-term memory network (BLSTM) architecture that incorporates an attention mechanism and residual connections. To address data imbalance, we adapt generative adversarial networks to generate synthetic feature samples. On the independent test set, MetaAc4C surpasses the current state-of-the-art ac4C prediction model, improving ACC, MCC, and AUROC by 2.36%, 4.76%, and 3.11%, respectively, on the unbalanced dataset. On the balanced dataset, MetaAc4C improves ACC, MCC, and AUROC by 2.6%, 5.11%, and 1.01%, respectively. Notably, training on RNA samples augmented with WGAN-GP outperforms the SMOTE oversampling method.
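The ACC and MCC metrics reported above can be computed from a binary confusion matrix. The following is a minimal sketch using the standard definitions; the function names and the counts in the usage lines are hypothetical and are not taken from the MetaAc4C paper.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for an imbalanced test set (illustration only).
acc = accuracy(tp=40, tn=900, fp=50, fn=10)
mcc = matthews_corrcoef(tp=40, tn=900, fp=50, fn=10)
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside ACC when the classes are imbalanced.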


Subjects
Deep Learning; Cytidine; RNA
2.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35189635

ABSTRACT

Protein lysine crotonylation (Kcr) is an important type of posttranslational modification that is associated with a wide range of biological processes. The identification of Kcr sites is critical to better understanding their functional mechanisms. However, the existing experimental techniques for detecting Kcr sites are cost-ineffective, creating a great need for new computational methods to address this problem. We here describe Adapt-Kcr, an advanced deep learning model that utilizes adaptive embedding and is based on a convolutional neural network together with a bidirectional long short-term memory network and attention architecture. On the independent testing set, Adapt-Kcr outperformed the current state-of-the-art Kcr prediction model, with an improvement of 3.2% in accuracy and 1.9% in the area under the receiver operating characteristic curve. Compared to other Kcr models, Adapt-Kcr additionally had a more robust ability to distinguish between crotonylation and other lysine modifications. Another model (Adapt-ST) was trained to predict phosphorylation sites in SARS-CoV-2, and outperformed the equivalent state-of-the-art phosphorylation site prediction model. These results indicate that self-adaptive embedding features perform better than handcrafted features in capturing discriminative information; when used in an attention architecture, this could be an effective way of identifying protein Kcr sites. Together, our Adapt framework (including learning embedding features and attention architecture) has strong potential for the prediction of other protein posttranslational modification sites.


Subjects
Computational Biology; Deep Learning; Lysine/metabolism; Protein Processing, Post-Translational; Software; Algorithms; Benchmarking; Computational Biology/methods; Computational Biology/standards; Databases, Factual; Neural Networks, Computer; Phosphorylation; ROC Curve; Reproducibility of Results; User-Computer Interface
3.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36525367

ABSTRACT

SUMMARY: Non-coding RNAs, in particular miRNAs and lncRNAs, play important roles in transcriptional processes and participate in the regulation of various biological functions. Despite their importance for several biological functions, the existing signaling pathway databases do not include information on miRNA and lncRNA. Here, we redesigned a novel pathway database named NcPath by integrating and visualizing a total of 178 308 human experimentally validated miRNA-target interactions (MTIs), 32 282 experimentally verified lncRNA-target interactions (LTIs) and 4837 experimentally validated human ceRNA networks across 222 KEGG pathways (including 27 sub-categories). To expand the application potential of the redesigned NcPath database, we identified 556 798 reliable lncRNA-protein-coding gene (PCG) interaction pairs by integrating co-expression relations, ceRNA relations, co-TF-binding interactions, co-histone-modification interactions, cis-regulation relations and lncPro Tool predictions between lncRNAs and PCGs. In addition, to determine the pathways in which miRNA/lncRNA targets are involved, we performed a KEGG enrichment analysis using a hypergeometric test. The NcPath database also provides information on MTIs/LTIs/ceRNA networks, PubMed IDs, gene annotations and the experimental verification method used. In summary, the NcPath database will serve as an important and continually updated platform that provides annotation and visualization of the pathways in which non-coding RNAs (miRNA and lncRNA) are involved, and provides support for multimodal non-coding RNA enrichment analysis. The NcPath database is freely accessible at http://ncpath.pianlab.cn/. AVAILABILITY AND IMPLEMENTATION: NcPath database is freely available at http://ncpath.pianlab.cn/. The code and manual to use NcPath can be found at https://github.com/Marscolono/NcPath/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
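A hypergeometric enrichment test of the kind mentioned above asks how likely it is to see at least the observed number of target genes in a pathway by chance. The following is a minimal sketch of the standard one-sided test, not NcPath's own code; the function name and the gene counts in the usage line are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, pathway_size, n_targets, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population   : number of background genes
    pathway_size : background genes annotated to the pathway
    n_targets    : genes in the query list (e.g. targets of one miRNA)
    overlap      : query genes that fall in the pathway
    """
    upper = min(pathway_size, n_targets)
    total = comb(population, n_targets)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, n_targets - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 background genes, a 150-gene pathway,
# 300 target genes, 12 of which are in the pathway.
p_value = hypergeom_enrichment_pvalue(20000, 150, 300, 12)
```

When many pathways are tested at once, the resulting p-values would normally be adjusted for multiple testing; the abstract does not say which correction, if any, NcPath applies.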


Subjects
MicroRNAs; RNA, Long Noncoding; Humans; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism; RNA, Messenger/metabolism; Gene Regulatory Networks; MicroRNAs/genetics; MicroRNAs/metabolism; Signal Transduction
4.
BMC Bioinformatics ; 22(1): 27, 2021 Jan 22.
Article in English | MEDLINE | ID: mdl-33482718

ABSTRACT

BACKGROUND: Large-scale gene expression profiling has been successfully applied to the discovery of functional connections among diseases, genetic perturbation, and drug action. To address the cost of ever-expanding gene expression profiling, a new, low-cost, high-throughput reduced representation expression profiling method called L1000 was proposed, with which one million profiles were produced. Although a set of ~1000 carefully chosen landmark genes that can capture ~80% of the information from the whole genome has been identified for use in L1000, the robustness of using these landmark genes to infer target genes is not satisfactory. Therefore, more efficient computational methods are still needed to mine the influential genes in the genome. RESULTS: Here, we propose a computational framework based on deep learning to mine a subset of genes that can cover more genomic information. Specifically, an AutoEncoder framework is first constructed to learn the non-linear relationship between genes, and then DeepLIFT is applied to calculate gene importance scores. Using this data-driven approach, we have re-obtained a landmark gene set. The results show that our landmark genes predict target genes more accurately and robustly than those of L1000 based on two metrics [mean absolute error (MAE) and Pearson correlation coefficient (PCC)]. This reveals that the landmark genes detected by our method contain more genomic information. CONCLUSIONS: We believe that our proposed framework is well suited to the analysis of biological big data. Furthermore, the landmark genes inferred from this study can be used for the large-scale amplification of gene expression profiles to facilitate research into functional connections.


Subjects
Deep Learning; Gene Expression Profiling; Genomics; Genome; Transcriptome
5.
J Cancer Res Clin Oncol ; 149(11): 9151-9165, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37178426

ABSTRACT

PURPOSE: High-grade serous ovarian cancer (HGSOC) is a gynecological cancer with high mortality and strong heterogeneity. The study used multi-omics and multiple algorithms to identify novel molecular subtypes, which can help patients obtain more personalized treatments. METHODS: First, the consensus clustering result was obtained using a consensus ensemble of ten classical clustering algorithms, based on mRNA, lncRNA, DNA methylation, and mutation data. The difference in signaling pathways was evaluated using single-sample gene set enrichment analysis (ssGSEA). The relationships between the subtypes and genetic alteration, response to immunotherapy, drug sensitivity, and prognosis were further analyzed. Finally, the reliability of the new subtypes was verified in three external datasets. RESULTS: Three molecular subtypes were identified. The immune desert subtype (CS1) had little enrichment in the immune microenvironment and metabolic pathways. The immune/non-stromal subtype (CS2) was enriched in the immune microenvironment and metabolism of polyamines. The immune/stromal subtype (CS3) was enriched not only in anti-tumor immune microenvironment characteristics but also in pro-tumor stroma characteristics, glycosaminoglycan metabolism, and sphingolipid metabolism. CS2 had the best overall survival and the highest response rate to immunotherapy. CS3 had the worst prognosis and the lowest response rate to immunotherapy but was more sensitive to PARP and VEGFR molecular-targeted therapy. Similar differences among the three subtypes were successfully validated in three external cohorts. CONCLUSION: We used ten clustering algorithms to comprehensively analyze four types of omics data, identified three biologically significant subtypes of HGSOC patients, and provided personalized treatment recommendations for each subtype. Our findings provide novel insights into HGSOC subtypes and could inform potential clinical treatment strategies.


Subjects
Ovarian Neoplasms; Humans; Female; Ovarian Neoplasms/therapy; Ovarian Neoplasms/drug therapy; Multiomics; Precision Medicine; Reproducibility of Results; Prognosis; Data Analysis; Tumor Microenvironment
6.
J Cancer Res Clin Oncol ; 149(15): 13823-13839, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37535162

ABSTRACT

PURPOSE: Cancer stem cells are associated with unfavorable prognosis in hepatocellular carcinoma (HCC). However, existing stemness-related biomarkers and prognostic models are limited. METHODS: The stemness-related signatures were derived by taking the union of the results of WGCNA and CytoTRACE analysis at the bulk RNA-seq and scRNA-seq levels, respectively. Univariate Cox regression and the LASSO were applied to filter prognosis-related signatures and select variables. Finally, ten gene signatures were identified to construct the prognostic model. We evaluated the differences in survival, genomic alteration, biological processes, and degree of immune cell infiltration between the high- and low-risk groups. The pRRophetic and Tumor Immune Dysfunction and Exclusion (TIDE) algorithms were used to predict chemosensitivity and immunotherapy response. The Human Protein Atlas (HPA) database was used to evaluate protein expression. RESULTS: A stemness-related prognostic model was constructed with ten genes: YBX1, CYB5R3, CDC20, RAMP3, LDHA, MTHFS, PTRH2, SRPRB, GNA14, and CLEC3B. Kaplan-Meier and ROC curve analyses showed that the high-risk group had a worse prognosis and that the AUC of the model in four datasets was greater than 0.64. Multivariate Cox regression analyses verified that the model was an independent prognostic indicator of overall survival, and a nomogram was then built for clinical use in predicting the prognosis of HCC. Additionally, chemotherapy drug sensitivity and immunotherapy response analyses revealed that the high-risk group was more likely to benefit from these treatments. CONCLUSION: The novel stemness-related prognostic model is a promising biomarker for estimating overall survival in HCC.

7.
Genes (Basel) ; 13(7)2022 06 23.
Article in English | MEDLINE | ID: mdl-35885905

ABSTRACT

Small molecular networks within complex pathways are defined as subpathways. The identification of patient-specific subpathways can reveal the etiology of cancer and guide the development of personalized therapeutic strategies. The dysfunction of subpathways has been associated with the occurrence and development of cancer. Here, we propose a strategy to identify aberrant subpathways at the individual level by calculating the edge score and using the Gene Set Enrichment Analysis (GSEA) method. This provides a novel approach to subpathway analysis. We applied this method to the expression data of a lung adenocarcinoma (LUAD) dataset from The Cancer Genome Atlas (TCGA) database. We validated the effectiveness of this method in identifying LUAD-relevant subpathways and demonstrated its reliability using an independent Gene Expression Omnibus dataset (GEO). Additionally, survival analysis was applied to illustrate the clinical application value of the genes and edges in subpathways that were associated with the prognosis of patients and cancer immunity, which could be potential biomarkers. With these analyses, we show that our method could help uncover subpathways underlying lung adenocarcinoma.


Subjects
Adenocarcinoma of Lung; Lung Neoplasms; Adenocarcinoma of Lung/genetics; Humans; Lung Neoplasms/genetics; Lung Neoplasms/pathology; Prognosis; Reproducibility of Results
8.
PeerJ ; 9: e11426, 2021.
Article in English | MEDLINE | ID: mdl-34055486

ABSTRACT

Long non-coding RNA (lncRNA)-microRNA (miRNA) interactions are quickly emerging as important mechanisms underlying the functions of non-coding RNAs. Accordingly, predicting lncRNA-miRNA interactions provides an important basis for understanding the mechanisms of action of ncRNAs. However, the accuracy of the established prediction methods is still limited. In this study, we used structural consistency to measure the predictability of interactive links based on a bilayer network by integrating information for known lncRNA-miRNA interactions, an lncRNA similarity network, and an miRNA similarity network. In particular, by using the structural perturbation method, we proposed a framework called SPMLMI to predict potential lncRNA-miRNA interactions based on the bilayer network. We found that the structural consistency of the bilayer network was higher than that of any single network, supporting the utility of bilayer network construction for the prediction of lncRNA-miRNA interactions. Applying SPMLMI to three real datasets, we obtained areas under the curves of 0.9512 ± 0.0034, 0.8767 ± 0.0033, and 0.8653 ± 0.0021 based on 5-fold cross-validation, suggesting good model performance. In addition, the generalizability of SPMLMI was better than that of the previously established methods. Case studies of two lncRNAs (i.e., SNHG14 and MALAT1) further demonstrated the feasibility and effectiveness of the method. Therefore, SPMLMI is a feasible approach to identify novel lncRNA-miRNA interactions underlying complex biological processes.

9.
Genes (Basel) ; 12(2)2021 01 28.
Article in English | MEDLINE | ID: mdl-33525573

ABSTRACT

In genome-wide association studies, detecting high-order epistasis is important for analyzing the occurrence of complex human diseases and explaining missing heritability. However, high-order epistasis detection faces various challenges in practice, including the large amount of data, the "small sample size problem", and the diversity of disease models. This paper proposes a multi-objective genetic algorithm (EpiMOGA) for single nucleotide polymorphism (SNP) epistasis detection. The K2 score based on the Bayesian network criterion and the Gini index of the diversity of the binary classification problem were used to guide the search process of the genetic algorithm. Experiments were performed on 26 simulated datasets of different models and a real Alzheimer's disease dataset. The results indicated that EpiMOGA was clearly superior to other related and competitive methods in both detection efficiency and accuracy, especially for small-sample-size datasets, and that the performance of EpiMOGA remained stable across datasets of different disease models. In addition, a number of SNP loci and second-order epistatic interactions associated with Alzheimer's disease were identified by the EpiMOGA method, indicating that this method is capable of identifying high-order epistasis from genome-wide data and can be applied in the study of complex diseases.
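One of the two objectives named above, the Gini index, scores how well a combination of SNPs separates cases from controls. The following is a minimal sketch using the standard weighted Gini impurity over genotype combinations; it is an assumption that EpiMOGA uses this exact form, and the function name and toy data are hypothetical.

```python
from collections import Counter


def gini_score(genotypes, labels):
    """Weighted Gini impurity of case/control status across genotype cells.

    genotypes : one tuple per sample, e.g. (0, 2) for a two-SNP combination
                with genotypes coded 0, 1, 2
    labels    : 1 for case, 0 for control, in the same sample order
    Lower values mean the SNP combination separates the classes better.
    """
    n = len(labels)
    cell_totals = Counter(genotypes)
    cell_cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    score = 0.0
    for cell, total in cell_totals.items():
        p_case = cell_cases[cell] / total
        impurity = 1.0 - p_case ** 2 - (1.0 - p_case) ** 2
        score += (total / n) * impurity
    return score


# Toy data: the two-SNP genotype perfectly separates cases from controls.
toy_genotypes = [(0, 0), (0, 0), (1, 2), (1, 2)]
toy_labels = [1, 1, 0, 0]
toy_score = gini_score(toy_genotypes, toy_labels)
```

In a multi-objective search, a score like this would be minimized jointly with the K2 score, so that a candidate SNP combination has to do well under both criteria rather than just one.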


Subjects
Epistasis, Genetic/genetics; Genome-Wide Association Study/statistics & numerical data; Genome/genetics; Algorithms; Bayes Theorem; Humans; Models, Genetic; Polymorphism, Single Nucleotide/genetics
10.
Genes (Basel) ; 11(11)2020 10 29.
Article in English | MEDLINE | ID: mdl-33138076

ABSTRACT

Identifying perturbed pathways at an individual level is important to discover the causes of cancer and develop individualized custom therapeutic strategies. Though prognostic gene lists have had success in prognosis prediction, using single genes that are related to the relevant system or specific network cannot fully reveal the process of tumorigenesis. We hypothesize that in individual samples, the disruption of transcription homeostasis can influence the occurrence, development, and metastasis of tumors and has implications for patient survival outcomes. Here, we introduced the individual-level pathway score, which can measure the correlation perturbation of the pathways in a single sample well. We applied this method to the expression data of 16 different cancer types from The Cancer Genome Atlas (TCGA) database. Our results indicate that different cancer types as well as their tumor-adjacent tissues can be clearly distinguished by the individual-level pathway score. Additionally, we found that there was strong heterogeneity among different cancer types and the percentage of perturbed pathways as well as the perturbation proportions of tumor samples in each pathway were significantly different. Finally, the prognosis-related pathways of different cancer types were obtained by survival analysis. We demonstrated that the individual-level pathway score (iPS) is capable of classifying cancer types and identifying some key prognosis-related pathways.


Subjects
Neoplasms/genetics; Case-Control Studies; Databases, Nucleic Acid; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Kaplan-Meier Estimate; Male; Neoplasms/classification; Neoplasms/mortality; Prognosis; RNA-Seq
11.
J Hazard Mater ; 351: 240-249, 2018 06 05.
Article in English | MEDLINE | ID: mdl-29550558

ABSTRACT

Although bioaugmentation of pollutant-contaminated sites is a great concern, there are few reports on the relationships among indigenous microbial consortia, exogenous inocula, and pollutants in a bioaugmentation process. In this study, bioaugmentation with Pseudochrobactrum sp. BSQ1 and Massilia sp. BLM18, which can hydrolytically and reductively dehalogenate chlorothalonil (TPN), respectively, was studied for its ability to remove TPN from soil; the alteration of the soil microbial community during the bioaugmentation process was also investigated. The results showed that TPN (50 mg/kg) was completely removed in both bioaugmentation treatments within 35 days, with half-lives of 6.8 and 9.8 days for strains BSQ1 and BLM18, respectively. In soils treated with a high concentration of TPN (100 mg/kg), bioaugmentation with strains BSQ1 and BLM18 removed 76.7% and 62.0% of TPN, respectively, within 35 days. The TPN treatment significantly decreased bacterial richness and diversity and promoted the growth of bacteria related to the elimination of chlorinated organic pollutants. However, each inoculation treatment alone (without TPN treatment) had little influence on the soil microbial community, showing that TPN treatment is the main force behind the shift in indigenous consortia. This study provides insights into the effects of halogenated fungicide application and bioaugmentation on indigenous soil microbiomes.
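The half-lives reported above can be related to a degradation rate constant if one assumes first-order kinetics, which the abstract does not state explicitly. The following is a minimal sketch under that assumption; the function names are hypothetical, and the model is an illustration rather than the authors' fitting procedure.

```python
from math import exp, log


def rate_constant(half_life_days):
    """First-order rate constant k (per day) from a half-life: k = ln(2) / t_half."""
    return log(2) / half_life_days


def remaining_fraction(half_life_days, elapsed_days):
    """Fraction of the initial concentration left after elapsed_days,
    assuming first-order decay: C(t) / C0 = exp(-k * t)."""
    return exp(-rate_constant(half_life_days) * elapsed_days)


# Half-lives from the abstract (strains BSQ1 and BLM18), evaluated at 35 days.
left_bsq1 = remaining_fraction(6.8, 35)
left_blm18 = remaining_fraction(9.8, 35)
```

Under this simple model a small residual always remains, so "completely removed" in the abstract is best read as a residue below the reported detection or quantification threshold.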


Subjects
Brucellaceae/metabolism; Fungicides, Industrial/metabolism; Nitriles/metabolism; Oxalobacteraceae/metabolism; Soil Microbiology; Soil Pollutants/metabolism; Biodegradation, Environmental; Hydrolysis; Oxidation-Reduction