Results 1 - 13 of 13
1.
Bioinformatics ; 40(5), 2024 May 02.
Article in English | MEDLINE | ID: mdl-38696758

ABSTRACT

MOTIVATION: Peptides are promising agents for the treatment of a variety of diseases due to their specificity and efficacy. However, the development of peptide-based drugs is often hindered by the potential toxicity of peptides, which poses a significant barrier to their clinical application. Traditional experimental methods for evaluating peptide toxicity are time-consuming and costly, making the development process inefficient. Therefore, there is an urgent need for computational tools specifically designed to predict peptide toxicity accurately and rapidly, facilitating the identification of safe peptide candidates for drug development. RESULTS: We present a novel computational approach, CAPTP, which leverages convolution and self-attention to enhance the prediction of peptide toxicity from amino acid sequences. CAPTP demonstrates outstanding performance, achieving a Matthews correlation coefficient of approximately 0.82 both in cross-validation and on independent test datasets. This performance surpasses that of existing state-of-the-art peptide toxicity predictors. Importantly, CAPTP maintains its robustness and generalizability even when dealing with data imbalances. Further analysis by CAPTP reveals that certain sequential patterns, particularly in the head and central regions of peptides, are crucial in determining their toxicity. This insight can inform and guide the design of safer peptide drugs. AVAILABILITY AND IMPLEMENTATION: The source code for CAPTP is freely available at https://github.com/jiaoshihu/CAPTP.
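
The abstract above reports performance as a Matthews correlation coefficient (MCC). As a reminder of what that number measures, here is a minimal sketch of the standard binary MCC formula. The function name, the label encoding (1 = toxic, 0 = non-toxic) and the toy labels are assumptions made for illustration; none of this is taken from the CAPTP code base.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC; labels are assumed to be 1 (toxic) or 0 (non-toxic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is reported as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy labels (hypothetical): 3 TP, 3 TN, 1 FP, 1 FN.
print(matthews_corrcoef([1, 1, 1, 1, 0, 0, 0, 0],
                        [1, 1, 1, 0, 0, 0, 0, 1]))  # 0.5
```

Because the MCC uses all four cells of the confusion matrix, it stays informative on imbalanced data, which is presumably why it is the headline metric here.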


Subjects
Computational Biology , Peptides , Peptides/chemistry , Computational Biology/methods , Humans , Amino Acid Sequence , Algorithms , Software
2.
Adv Sci (Weinh) ; 11(22): e2400009, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38602457

ABSTRACT

Recent studies have revealed that numerous lncRNAs can translate proteins under specific conditions, performing diverse biological functions, thus termed coding lncRNAs. Their comprehensive landscape, however, remains elusive due to this field's preliminary and dispersed nature. This study introduces codLncScape, a framework for coding lncRNA exploration consisting of codLncDB, codLncFlow, codLncWeb, and codLncNLP. Specifically, it contains a manually compiled knowledge base, codLncDB, encompassing 353 coding lncRNA entries validated by experiments. Building upon codLncDB, codLncFlow investigates the expression characteristics of these lncRNAs and their diagnostic potential in the pan-cancer context, alongside their association with spermatogenesis. Furthermore, codLncWeb emerges as a platform for storing, browsing, and accessing knowledge concerning coding lncRNAs within various programming environments. Finally, codLncNLP serves as a knowledge-mining tool to enhance the timely content inclusion and updates within codLncDB. In summary, this study offers a well-functioning, content-rich ecosystem for coding lncRNA research, aiming to accelerate systematic studies in this field.


Subjects
RNA, Long Noncoding , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Humans , Computational Biology/methods , Software , Neoplasms/genetics
3.
PLoS Comput Biol ; 19(12): e1011450, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38096269

ABSTRACT

Cancer is a heterogeneous disease. Cancer driver genes (CDGs) need to be inferred to understand tumor heterogeneity. However, existing computational methods have identified many common CDGs. A key challenge in exploring cancer progression is to infer cancer subtype-specific driver genes (CSDGs), which provide guidance for the diagnosis, treatment and prognosis of cancer. The significant advancements in single-cell RNA-sequencing (scRNA-seq) technologies have opened up new possibilities for studying human cancers at the individual cell level. In this study, we develop a novel unsupervised method, CSDGI (Cancer Subtype-specific Driver Gene Inference), which applies an encoder-decoder framework consisting of low-rank residual neural networks to infer driver genes corresponding to potential cancer subtypes at the single-cell level. To infer CSDGs, we apply CSDGI to tumor single-cell transcriptomics data. To filter out redundant genes before driver gene inference, we first select differentially expressed genes (DEGs). The experimental results demonstrate that CSDGI is effective at inferring driver genes that are cancer subtype-specific. Functional and disease enrichment analyses show that the inferred CSDGs indicate key biological processes and disease pathways. CSDGI is the first method to explore cancer driver genes at the cancer subtype level. We believe that it can be a useful method for understanding the mechanisms of cell transformation driving tumours.


Subjects
Neoplasms , Oncogenes , Humans , Gene Expression Profiling , Neoplasms/genetics , Neoplasms/pathology , Cell Transformation, Neoplastic/genetics , Single-Cell Analysis/methods
4.
Comput Biol Med ; 164: 107223, 2023 09.
Article in English | MEDLINE | ID: mdl-37490833

ABSTRACT

The increased availability of high-throughput technologies has enabled biomedical researchers to learn about disease etiology across multiple omics layers, which shows promise for improving cancer subtype identification. Many computational methods have been developed to perform clustering on multi-omics data; however, only a few of them are applicable to partial multi-omics data, in which some samples lack data for some omics types. In this study, we propose a novel multi-omics clustering method based on latent subspace learning (MCLS), which can handle missing multi-omics data for clustering. We utilize the data with complete omics to construct a latent subspace using PCA-based feature extraction and singular value decomposition (SVD). The data with incomplete multi-omics are then projected onto the latent subspace, and spectral clustering is performed to find the clusters. The proposed MCLS method is evaluated on seven different cancer datasets with three omics levels, in both full and partial cases, and compared with several state-of-the-art methods. The experimental results show that the proposed MCLS method is more efficient and effective than the compared methods for cancer subtype identification in multi-omics data analysis, which provides important references for a comprehensive understanding of cancer and biological mechanisms. AVAILABILITY: The proposed method is freely accessible at https://github.com/ShangCS/MCLS.


Subjects
Algorithms , Neoplasms , Humans , Multiomics , Cluster Analysis , Neoplasms/genetics , Data Analysis
5.
Methods ; 211: 61-67, 2023 03.
Article in English | MEDLINE | ID: mdl-36804215

ABSTRACT

Recent advances in multi-omics databases offer the opportunity to explore complex systems of cancers across hierarchical biological levels. Some methods have been proposed to identify the genes that play a vital role in disease development by integrating multi-omics data. However, the existing methods identify the related genes separately, neglecting the gene interactions that are related to multigenic disease. In this study, we develop a learning framework to identify the interactive genes based on multi-omics data including gene expression. First, we integrate different omics based on their similarities and apply spectral clustering for cancer subtype identification. Then, a gene co-expression network is constructed for each cancer subtype. Finally, we detect the interactive genes in the co-expression network by learning the dense subgraphs based on the L1 properties of eigenvectors of the modularity matrix. We apply the proposed learning framework to a multi-omics cancer dataset to identify the interactive genes for each cancer subtype. The detected genes are examined with the DAVID and KEGG tools for systematic gene ontology enrichment analysis. The analysis results show that the detected genes are related to cancer development and that the genes in different cancer subtypes are related to different biological processes and pathways, which are expected to yield important references for understanding tumor heterogeneity and improving patient survival.
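
The final step of the framework above works on the modularity matrix of a co-expression network. The sketch below only shows how that matrix is usually defined, B_ij = A_ij - k_i k_j / (2m), and how it scores a given grouping of nodes. It does not reproduce the eigenvector-based dense-subgraph learning of the paper. The helper names and the six-node toy graph are hypothetical.

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj


def modularity_matrix(adj):
    """B[i][j] = A[i][j] - k_i * k_j / (2m), with k the node degrees."""
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    if two_m == 0:
        raise ValueError("graph has no edges")
    n = len(adj)
    return [[adj[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
            for i in range(n)]


def modularity(adj, communities):
    """Newman modularity Q of a partition given as one label per node."""
    b = modularity_matrix(adj)
    two_m = sum(sum(row) for row in adj)
    n = len(adj)
    total = sum(b[i][j] for i in range(n) for j in range(n)
                if communities[i] == communities[j])
    return total / two_m


# Toy network (hypothetical): two gene triangles joined by a single edge.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
toy_adj = adjacency_from_edges(6, toy_edges)
print(modularity(toy_adj, [0, 0, 0, 1, 1, 1]))
```

A densely connected gene group scores higher than a random grouping, which is the intuition behind searching this matrix for dense subgraphs.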


Subjects
Multiomics , Neoplasms , Humans , Neoplasms/genetics , Cluster Analysis , Databases, Factual
6.
Sensors (Basel) ; 22(20), 2022 Oct 18.
Article in English | MEDLINE | ID: mdl-36298289

ABSTRACT

The Tactile Internet enables physical touch to be transmitted over the Internet. In the context of electronic medicine, an authenticated key agreement for the Tactile Internet allows surgeons to perform operations via robotic systems and receive tactile feedback from remote patients. The fifth generation of networks has completely changed the network space and has increased the efficiency of the Tactile Internet with its ultra-low latency, high data rates, and reliable connectivity. However, inappropriate and insecure authenticated key agreements for the Tactile Internet may cause misjudgment and improper operation by medical staff, endangering the lives of patients. In 2021, Kamil et al. developed a novel and lightweight authenticated key agreement scheme that is suitable for remote surgery applications in the Tactile Internet environment. However, their scheme directly encrypts communication messages with constant secret keys and directly stores secret keys in the verifier table, making the scheme vulnerable to possible attacks. Therefore, in this investigation, we discuss the limitations of the scheme proposed by Kamil et al. and present an enhanced scheme. The enhanced scheme uses a one-time key to protect communication messages, and the verifier table is protected with a secret gateway key to mitigate the mentioned limitations. The enhanced scheme is proven secure against possible attacks, providing more security functionalities than similar schemes while retaining a lightweight computational cost.
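
To make the contrast between a constant secret key and a one-time key concrete, here is a minimal sketch that derives a fresh key per session from a long-term secret and a random nonce, then uses it to authenticate one message. This is not the scheme of Kamil et al. or the enhanced scheme from the paper. The function names, the `b"session-key"` label and the use of HMAC-SHA256 are all assumptions chosen for illustration, and the sketch authenticates messages rather than encrypting them.

```python
import hashlib
import hmac
import secrets


def derive_one_time_key(long_term_key: bytes, nonce: bytes) -> bytes:
    """Derive a per-session key; a fresh nonce yields a fresh key."""
    return hmac.new(long_term_key, b"session-key" + nonce, hashlib.sha256).digest()


def tag_message(one_time_key: bytes, message: bytes) -> bytes:
    """Authenticate a message under the one-time key."""
    return hmac.new(one_time_key, message, hashlib.sha256).digest()


def verify_message(one_time_key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(tag_message(one_time_key, message), tag)


# Hypothetical session: both ends share long_term_key and exchange the nonce.
long_term_key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
session_key = derive_one_time_key(long_term_key, nonce)
tag = tag_message(session_key, b"move arm 2 mm left")
print(verify_message(session_key, b"move arm 2 mm left", tag))  # True
```

If one session key leaks, only the messages of that session are exposed, which is the general motivation for avoiding constant keys.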


Subjects
Computer Security , Telemedicine , Humans , Confidentiality , Touch , Internet
7.
Sci Rep ; 12(1): 13550, 2022 08 08.
Article in English | MEDLINE | ID: mdl-35941273

ABSTRACT

Triple negative breast cancer (TNBC) is associated with worse outcomes and results in high mortality; therefore, great efforts are required to find effective treatments. In the present study, we propose a novel strategy to treat TNBC using mesenchymal stem cell (MSC)-derived extracellular vesicles (EV) to transform the behaviors and cellular communication of TNBC cells (BCC) with other non-cancer cells related to tumorigenesis and metastasis. Our data showed that BCC, after internalizing EV derived from Wharton's Jelly MSC (WJ-EV), showed impaired proliferation, stemness properties, tumorigenesis and metastasis under hypoxic conditions. Moreover, these inhibitory effects may involve the transfer of miRNA-125b from WJ-EV to BCC, which downregulated the expression of HIF1α and target genes related to proliferation, epithelial-mesenchymal transition, and angiogenesis. Of note, WJ-EV-internalized BCC (wBCC) showed transformed behaviors that attenuated the in vivo development and metastatic ability of TNBC, the angiogenic abilities of endothelial cells and endothelial progenitor cells, and the generation of cancer-associated fibroblasts from MSC. Furthermore, wBCC generated new EV with modified functions that contributed to the inhibitory effects on tumorigenesis and metastasis of TNBC. Taken together, our findings suggest that WJ-EV treatment is a promising therapy that results in the generation of wBCC to interrupt the cellular crosstalk in the tumor environment and inhibit tumor progression in TNBC.


Subjects
Extracellular Vesicles , Mesenchymal Stem Cells , MicroRNAs , Triple Negative Breast Neoplasms , Wharton Jelly , Carcinogenesis/genetics , Carcinogenesis/metabolism , Cell Differentiation , Cell Proliferation , Cells, Cultured , Endothelial Cells , Humans , Mesenchymal Stem Cells/metabolism , MicroRNAs/metabolism , Signal Transduction , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/metabolism , Triple Negative Breast Neoplasms/therapy , Wharton Jelly/metabolism
8.
Front Genet ; 13: 952649, 2022.
Article in English | MEDLINE | ID: mdl-35910201

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) technologies enable the measurement of gene expression in individual cells, which is helpful for exploring cancer heterogeneity and precision medicine. However, various sources of technical noise lead to false zero values (missing gene expression values) in scRNA-seq data, termed dropout events. These zero values complicate the analysis of cell patterns, which affects the high-precision analysis of intra-tumor heterogeneity. Recovering missing gene expression values is still a major obstacle in scRNA-seq data analysis. In this study, taking cell heterogeneity into consideration, we develop a novel method, called single cell Gauss-Newton Gene expression Imputation (scGNGI), to impute scRNA-seq expression matrices using low-rank matrix completion. The experimental results on simulated datasets and real scRNA-seq datasets show that scGNGI can impute the missing values for scRNA-seq gene expression more effectively and improve the downstream analysis compared to other state-of-the-art methods. Moreover, we show that the proposed method can better preserve gene expression variability among cells. Overall, this study helps explore complex biological systems and precision medicine with scRNA-seq data.

9.
J Biomed Inform ; 128: 104049, 2022 04.
Article in English | MEDLINE | ID: mdl-35283266

ABSTRACT

Renal cell carcinoma (RCC) is one of the deadliest cancers and mainly consists of three subtypes: kidney clear cell carcinoma (KIRC), kidney papillary cell carcinoma (KIRP), and kidney chromophobe (KICH). Gene signature identification plays an important role in the precise classification of RCC subtypes and personalized treatment. However, most of the existing gene selection methods focus on statically selecting the same informative genes for each subtype, and fail to consider the heterogeneity of patients, which causes pattern differences in each subtype. In this work, to explore different informative gene subsets for each subtype, we propose a novel gene selection method, named sequential reinforcement active feature learning (SRAFL), which dynamically acquires different genes in each sample to identify the different gene signatures for each subtype. The proposed SRAFL method combines the cancer subtype classifier with a reinforcement learning (RL) agent, which sequentially selects the active genes in each sample from three mixed RCC subtypes in a cost-sensitive manner. Moreover, module-based gene filtering is run before gene selection to filter out redundant genes. We mainly evaluate the proposed SRAFL method on mRNA and long non-coding RNA (lncRNA) expression profiles of RCC datasets from The Cancer Genome Atlas (TCGA). The experimental results demonstrate that the proposed method can automatically identify different gene signatures for different subtypes to accurately classify RCC subtypes. More importantly, we show for the first time that the proposed SRAFL method can consider the heterogeneity of samples to select different gene signatures for different RCC subtypes, which shows more potential for precision-based RCC care in the future.


Subjects
Carcinoma, Renal Cell , Kidney Neoplasms , Carcinoma, Renal Cell/diagnosis , Carcinoma, Renal Cell/genetics , Carcinoma, Renal Cell/metabolism , Genome , Humans , Kidney Neoplasms/diagnosis , Kidney Neoplasms/genetics , Kidney Neoplasms/metabolism , RNA, Messenger
10.
Bioinformatics ; 38(6): 1514-1524, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34999757

ABSTRACT

MOTIVATION: Recently, peptides have emerged as a promising class of pharmaceuticals for the treatment of various diseases, poised between traditional small molecule drugs and therapeutic proteins. However, one of the key bottlenecks preventing their use as therapeutics is their toxicity toward human cells, and few of the available algorithms for predicting toxicity are specially designed for short peptides. RESULTS: We present ToxIBTL, a novel deep learning framework that utilizes the information bottleneck principle and transfer learning to predict the toxicity of peptides as well as proteins. Specifically, we use evolutionary information and physicochemical properties of peptide sequences and integrate the information bottleneck principle into a feature representation learning scheme, by which relevant information is retained and redundant information is minimized in the obtained features. Moreover, transfer learning is introduced to transfer the common knowledge contained in proteins to peptides, which aims to improve the feature representation capability. Extensive experimental results demonstrate that ToxIBTL not only achieves higher prediction performance than state-of-the-art methods on the peptide dataset, but also has competitive performance on the protein dataset. Furthermore, a user-friendly online web server is established as the implementation of the proposed ToxIBTL. AVAILABILITY AND IMPLEMENTATION: The proposed ToxIBTL and data are freely accessible at http://server.wei-group.net/ToxIBTL. Our source code is available at https://github.com/WLYLab/ToxIBTL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning , Peptides , Humans , Proteins , Software , Algorithms
11.
Brief Bioinform ; 22(5), 2021 09 02.
Article in English | MEDLINE | ID: mdl-33822870

ABSTRACT

MOTIVATION: Peptides have recently emerged as promising therapeutic agents against various diseases. For both research and safety regulation purposes, it is of high importance to develop computational methods to accurately predict the potential toxicity of peptides within the vast number of candidate peptides. RESULTS: In this study, we propose ATSE, a peptide toxicity predictor that exploits structural and evolutionary information based on graph neural networks and an attention mechanism. More specifically, it consists of four modules: (i) a sequence processing module for converting peptide sequences to molecular graphs and evolutionary profiles, (ii) a feature extraction module designed to learn discriminative features from graph structural information and evolutionary information, (iii) an attention module employed to optimize the features and (iv) an output module classifying a peptide as toxic or non-toxic, using optimized features from the attention module. CONCLUSION: Comparative studies demonstrate that the proposed ATSE significantly outperforms all other competing methods. We found that structural information is complementary to the evolutionary information, effectively improving the predictive performance. Importantly, the data-driven features learned by ATSE can be interpreted and visualized, providing additional information for further analysis. Moreover, we present a user-friendly online computational platform that implements the proposed ATSE, which is now available at http://server.malab.cn/ATSE. We expect that it can be a powerful and useful tool for interested researchers.


Subjects
Computational Biology/methods , Machine Learning , Neural Networks, Computer , Peptides/toxicity , Software , Databases, Protein , Datasets as Topic , Evolution, Molecular , Humans , Peptides/chemistry
12.
Cells ; 9(9), 2020 08 21.
Article in English | MEDLINE | ID: mdl-32825786

ABSTRACT

High-throughput sequencing technologies have enabled the generation of single-cell RNA-seq (scRNA-seq) data, which capture both genetic heterogeneity and phenotypic variation between cells. Some methods have been proposed to detect the related genes causing cell-to-cell variability for understanding tumor heterogeneity. However, most existing methods detect the related genes separately, without considering gene interactions. In this paper, we propose a novel learning framework to detect the interactive gene groups in scRNA-seq data based on co-expression network analysis and subgraph learning. We first use spectral clustering to identify the subpopulations of cells. For each cell subpopulation, the differentially expressed genes are then selected to construct a gene co-expression network. Finally, the interactive gene groups are detected by learning the dense subgraphs embedded in the gene co-expression networks. We applied the proposed learning framework to a real cancer scRNA-seq dataset to detect interactive gene groups of different cancer subtypes. Systematic gene ontology enrichment analysis was performed to examine the detected gene groups by summarizing the key biological processes and pathways. Our analysis shows that different subtypes exhibit distinct gene co-expression networks and interactive gene groups with different functional enrichment. The interactive genes are expected to yield important references for understanding tumor heterogeneity.
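
The middle step of this framework builds a gene co-expression network per cell subpopulation. The abstract does not say how edges are defined, so the sketch below assumes one common choice: connect two genes when the absolute Pearson correlation of their expression across cells reaches a threshold. The function names, the 0.8 threshold and the toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant gene carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(expr, threshold=0.8):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold.

    `expr` maps a gene name to its expression values across the cells of
    one subpopulation; the threshold value is an assumption.
    """
    genes = sorted(expr)
    edges = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(expr[gene_a], expr[gene_b])
            if abs(r) >= threshold:
                edges.append((gene_a, gene_b, r))
    return edges


# Toy expression values (hypothetical) for three genes across four cells.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
print(coexpression_edges(toy_expr))
```

In the toy data only geneA and geneB move together, so they form the single edge. Dense groups of such edges are what the subgraph-learning step would then look for.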


Subjects
Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Machine Learning/standards , RNA-Seq/methods , Single-Cell Analysis/methods , Humans
13.
Int J Med Sci ; 16(7): 949-959, 2019.
Article in English | MEDLINE | ID: mdl-31341408

ABSTRACT

Background: In recent years, the development and diagnosis of secondary cancer have become the primary concern of cancer survivors. A number of studies have developed strategies to extract knowledge from clinical data, aiming to identify important risk factors that can be used to prevent the recurrence of diseases. However, these studies do not focus on secondary cancer. Secondary cancer lacks strategies for clinical treatment as well as for risk factor identification to prevent its occurrence. Methods: We propose an effective ensemble feature learning method to identify the risk factors for predicting secondary cancer by considering class imbalance and patient heterogeneity. We first divide the patients into heterogeneous groups based on spectral clustering. In each group, we apply an oversampling method to balance the number of samples in each class and use them as training data for ensemble feature learning. The purpose of ensemble feature learning is to identify the risk factors and construct a diagnosis model for each group. The importance of risk factors is measured based on the properties of patients in each group separately. We predict secondary cancer by assigning the patient to a corresponding group and applying the diagnosis model of that group. Results: Analysis of the results shows that the decision tree obtains the best results for predicting secondary cancer among the three classifiers. The best results of the decision tree are 0.72 in terms of AUC when dividing the patients into 15 groups, and 0.38 in terms of F1 score when dividing the patients into 20 groups. In terms of AUC, the decision tree achieves a 67.4% improvement compared to using all 20 predictor variables and a 28.6% improvement compared to no group division. In terms of F1 score, the decision tree achieves a 216.7% improvement compared to using all 20 predictor variables and an 80.9% improvement compared to no group division. Different groups provide different ranking results for the predictor variables. Conclusion: The accuracies of predicting secondary cancer using k-nearest neighbor, decision tree, and support vector machine indeed increased after using the selected important risk factors as predictors. Dividing patients into groups and predicting secondary cancer with separate models can further improve the prediction accuracies. The information discovered in the experiments can provide important references to the personality and clinical symptom representations on all phases of guide interventions, with the complexities of multiple symptoms associated with secondary cancer in all phases of the recurrent trajectory.
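
The Methods above balance the classes inside each patient group by oversampling before training. The abstract does not name the oversampling technique, so the sketch below assumes the simplest variant, random duplication of minority-class samples until every class matches the largest one. The function name, the seed parameter and the toy records are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        # Pool of original samples belonging to this class.
        pool = [s for s, lab in zip(samples, labels) if lab == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy group (hypothetical): 8 patients without and 2 with secondary cancer.
toy_samples = [[float(i)] for i in range(10)]
toy_labels = [0] * 8 + [1] * 2
balanced_samples, balanced_labels = random_oversample(toy_samples, toy_labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training data of each group only, so that duplicated patients do not leak into the evaluation set.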


Subjects
Cancer Survivors/statistics & numerical data , Data Analysis , Models, Biological , Neoplasms, Second Primary/diagnosis , Datasets as Topic , Decision Trees , Feasibility Studies , Humans , Neoplasms, Second Primary/epidemiology , Prognosis , Risk Assessment/methods , Risk Factors , Support Vector Machine