Results 1 - 20 of 37
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38725156

ABSTRACT

Protein acetylation is one of the most extensively studied post-translational modifications (PTMs) because of its significant roles across a myriad of biological processes. Although many computational tools for acetylation site identification have been developed, benchmark datasets and bespoke predictors for non-histone acetylation site prediction are lacking. To address these problems, we contribute both a dataset and a predictor benchmark in this study. First, we construct a non-histone acetylation site benchmark dataset, NHAC, which includes 11 subsets according to sequence length, ranging from 11 to 61 amino acids. Each sequence length has a total of 886 positive samples and 4707 negative samples. Second, we propose TransPTM, a transformer-based neural network model for non-histone acetylation site prediction. During the data representation phase, per-residue contextualized embeddings are extracted using ProtT5 (an existing pre-trained protein language model). This is followed by a graph neural network framework, which consists of three TransformerConv layers for feature extraction and a multilayer perceptron module for classification. The benchmark results show that TransPTM achieves competitive performance for non-histone acetylation site prediction compared with three state-of-the-art tools. It improves our understanding of the PTM mechanism and provides a theoretical basis for developing drug targets for diseases. Moreover, the created PTM datasets fill the gap in non-histone acetylation site datasets and benefit the related communities. The related source code and data utilized by TransPTM are accessible at https://www.github.com/TransPTM/TransPTM.


Subjects
Neural Networks, Computer; Protein Processing, Post-Translational; Acetylation; Computational Biology/methods; Databases, Protein; Software; Algorithms; Humans; Proteins/chemistry; Proteins/metabolism
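
The sequence-length subsets described in the abstract of record 1 can be pictured as fixed-length windows cut around candidate lysine residues. The following is only a minimal sketch under stated assumptions: the abstract does not say how windows are built, so centring on the lysine, padding with `X`, and the function name `extract_windows` are all hypothetical. Only the two endpoint lengths named in the abstract (11 and 61) are used.

```python
def extract_windows(sequence, window_length, target="K", pad="X"):
    """Return (1-based position, window) pairs centred on each target residue.

    Assumption: windows are centred on the candidate site and padded with
    `pad` at the sequence ends; the abstract does not specify this.
    """
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd to centre on a residue")
    half = window_length // 2
    # Pad both ends so residues near the termini still get full windows.
    padded = pad * half + sequence + pad * half
    windows = []
    for index, residue in enumerate(sequence):
        if residue == target:
            # In the padded string, the residue sits at index + half,
            # so the window starts at index.
            windows.append((index + 1, padded[index:index + window_length]))
    return windows


if __name__ == "__main__":
    toy_protein = "MSGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKR"
    for length in (11, 61):  # the shortest and longest lengths in the abstract
        for position, window in extract_windows(toy_protein, length)[:2]:
            print(length, position, window)
```

Each window would then be passed to an embedding model; that step is not sketched here because it depends on the released TransPTM code.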
2.
Article in English | MEDLINE | ID: mdl-38739518

ABSTRACT

The employment of surface electromyographic (sEMG) signals in the estimation of hand kinematics represents a promising non-invasive methodology for the advancement of human-machine interfaces. However, the limitations of existing subject-specific methods are obvious as they confine the application to individual models that are custom-tailored for specific subjects, thereby reducing the potential for broader applicability. In addition, current cross-subject methods are challenged in their ability to simultaneously cater to the needs of both new and existing users effectively. To overcome these challenges, we propose the Cross-Subject Lifelong Network (CSLN). CSLN incorporates a novel lifelong learning approach, maintaining the patterns of sEMG signals across a varied user population and across different temporal scales. Our method enhances the generalization of acquired patterns, making it applicable to various individuals and temporal contexts. Our experimental investigations, encompassing both joint and sequential training approaches, demonstrate that the CSLN model not only attains enhanced performance in cross-subject scenarios but also effectively addresses the issue of catastrophic forgetting, thereby augmenting training efficacy.


Subjects
Algorithms; Electromyography; Hand; Humans; Electromyography/methods; Hand/physiology; Biomechanical Phenomena; Male; Adult; Learning/physiology; Female; Man-Machine Systems; Machine Learning; Young Adult; Neural Networks, Computer; Muscle, Skeletal/physiology
3.
Comput Biol Med ; 175: 108487, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38653064

ABSTRACT

Drug repurposing is promising in multiple scenarios, such as emerging viral outbreak controls and cost reductions of drug discovery. Traditional graph-based drug repurposing methods are limited to fast, large-scale virtual screens, as they constrain the counts for drugs and targets and fail to predict novel viruses or drugs. Moreover, though deep learning has been proposed for drug repurposing, only a few methods have been used, including a group of pre-trained deep learning models for embedding generation and transfer learning. Hence, we propose DeepSeq2Drug to tackle the shortcomings of previous methods. We leverage multi-modal embeddings and an ensemble strategy to complement the numbers of drugs and viruses and to guarantee the novel prediction. This framework (including the expanded version) involves four modal types: six NLP models, four CV models, four graph models, and two sequence models. In detail, we first make a pipeline and calculate the predictive performance of each pair of viral and drug embeddings. Then, we select the best embedding pairs and apply an ensemble strategy to conduct anti-viral drug repurposing. To validate the effect of the proposed ensemble model, a monkeypox virus (MPV) case study is conducted to reflect the potential predictive capability. This framework could be a benchmark method for further pre-trained deep learning optimization and anti-viral drug repurposing tasks. We also build software further to make the proposed model easier to reuse. The code and software are freely available at http://deepseq2drug.cs.cityu.edu.hk.


Subjects
Antiviral Agents; Deep Learning; Drug Repositioning; Drug Repositioning/methods; Antiviral Agents/pharmacology; Antiviral Agents/therapeutic use; Humans; Software; Benchmarking
4.
Nat Commun ; 15(1): 2657, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38531837

ABSTRACT

Structure-based generative chemistry is essential in computer-aided drug discovery by exploring a vast chemical space to design ligands with high binding affinity for targets. However, traditional in silico methods are limited by computational inefficiency, while machine learning approaches face bottlenecks due to auto-regressive sampling. To address these concerns, we have developed a conditional deep generative model, PMDM, for 3D molecule generation fitting specified targets. PMDM consists of a conditional equivariant diffusion model with both local and global molecular dynamics, enabling PMDM to consider the conditioned protein information to generate molecules efficiently. The comprehensive experiments indicate that PMDM outperforms baseline models across multiple evaluation metrics. To evaluate the applications of PMDM under real drug design scenarios, we conduct lead compound optimization for SARS-CoV-2 main protease (Mpro) and Cyclin-dependent Kinase 2 (CDK2), respectively. The selected lead optimization molecules are synthesized and evaluated for their in-vitro activities against CDK2, displaying improved CDK2 activity.


Subjects
Anti-HIV Agents; Methacrylates; Benchmarking; Benzoates; Chemistry, Physical; Drug Design
5.
iScience ; 27(4): 109352, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38510148

ABSTRACT

Gene regulatory networks (GRNs) involve complex and multi-layer regulatory interactions between regulators and their target genes. Precise knowledge of GRNs is important in understanding cellular processes and molecular functions. Recent breakthroughs in single-cell sequencing technology made it possible to infer GRNs at single-cell level. Existing methods, however, are limited by expensive computations, and sometimes simplistic assumptions. To overcome these obstacles, we propose scGREAT, a framework to infer GRN using gene embeddings and transformer from single-cell transcriptomics. scGREAT starts by constructing gene expression and gene biotext dictionaries from scRNA-seq data and gene text information. The representation of TF gene pairs is learned through optimizing embedding space by transformer-based engine. Results illustrated scGREAT outperformed other contemporary methods on benchmarks. Besides, gene representations from scGREAT provide valuable gene regulation insights, and external validation on spatial transcriptomics illuminated the mechanism behind scGREAT annotation. Moreover, scGREAT identified several TF target regulations corroborated in studies.

6.
Langmuir ; 40(11): 5701-5714, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38501266

ABSTRACT

A series of WS₄²⁻-intercalated NiZnAl ternary-layered double-hydroxides (LDHs) with various Ni/Zn ratios were synthesized by an ion-exchange method and used as adsorbents to remove Cu²⁺ from water. The introduction of Zn produced ZnS on the surface of LDHs. The LDH with the Ni/Zn/Al molar ratio of 0.1/1.9/1 showed the best adsorption ability. Cu²⁺ ions are removed via three routes: forming [Cu-WS₄]ⁿ⁻ complexes via soft acid-soft base interaction between WS₄²⁻ and Cu²⁺, isomorphic substitution of Zn²⁺ in sheets by Cu²⁺, and cation exchange of Cu²⁺ with ZnS on the surface of LDHs. With the increased Cu²⁺ concentration, the complexes dominated the adsorption because polynuclear [Cu-WS₄]ⁿ⁻ complexes with high Cu/W ratios (2-6) may be formed. Cu⁺ is present in such complexes, which is produced by the internal redox. Even at Cu²⁺ concentration up to 600 mg·L⁻¹, neither amorphous CuWS₄ nor decreased interlayer distance was observed. Contrarily, the interlayer distance was slightly enlarged due to forming bigger [Cu-WS₄]ⁿ⁻ complexes. The adsorption followed the pseudo-second-order kinetics and Langmuir isotherm model. The experimental maximum adsorption capacity reached 555.4 mg·g⁻¹.
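
For reference, the two models named at the end of the abstract of record 6 are usually written in the textbook forms below. The abstract gives no equations, so the symbols are assumptions introduced here: $q_t$ and $q_e$ are the adsorbed amounts at time $t$ and at equilibrium, $k_2$ is the pseudo-second-order rate constant, $C_e$ is the equilibrium concentration, $K_L$ is the Langmuir constant, and $q_{\max}$ is the maximum adsorption capacity.

```latex
% Pseudo-second-order kinetics: rate law and its linearised integrated form
\frac{\mathrm{d}q_t}{\mathrm{d}t} = k_2\,(q_e - q_t)^2
\quad\Longrightarrow\quad
\frac{t}{q_t} = \frac{1}{k_2\,q_e^{2}} + \frac{t}{q_e}

% Langmuir isotherm
q_e = \frac{q_{\max}\,K_L\,C_e}{1 + K_L\,C_e}
```

A straight line in a plot of $t/q_t$ against $t$ is the usual evidence cited for pseudo-second-order behaviour.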

7.
Comput Biol Med ; 168: 107753, 2024 01.
Article in English | MEDLINE | ID: mdl-38039889

ABSTRACT

BACKGROUND: Trans-acting factors, a group of proteins that can directly or indirectly recognize or bind to the 8-12 bp core sequence of cis-acting elements and regulate the transcription efficiency of target genes, are of special importance in transcription regulation. The progressive development of high-throughput chromatin capture technology (e.g., Hi-C) enables the identification of chromatin-interacting sequence groups in which trans-acting DNA motif groups can be discovered. The difficulty of the problem lies in the combinatorial nature of DNA sequence pattern matching and its underlying sequence pattern search space. METHOD: Here, we propose MotifHub for trans-acting DNA motif group discovery on grouped sequences. Specifically, the main approach is to develop probabilistic modeling that accommodates the stochastic nature of DNA motif patterns. RESULTS: Based on the modeling, we develop global sampling techniques based on EM and Gibbs sampling to address the global optimization challenge of model fitting with latent variables. The results show that our proposed approaches achieve promising performance with linear time complexity. CONCLUSION: MotifHub is a novel algorithm that considers the identification of both DNA co-binding motif groups and trans-acting TFs. Our study paves the way for identifying hub TFs of stem cell development (OCT4 and SOX2) and determining potential therapeutic targets of prostate cancer (FOXA1 and MYC). To ensure scientific reproducibility and long-term impact, its matrix-algebra-optimized source code is released at http://bioinfo.cs.cityu.edu.hk/MotifHub.


Subjects
Algorithms; Software; Nucleotide Motifs/genetics; Reproducibility of Results; Chromatin/genetics
8.
J Am Chem Soc ; 145(50): 27788-27799, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-37987648

ABSTRACT

Poly(disulfide)s are an emerging class of sulfur-containing polymers with applications in medicine, energy, and functional materials. However, the constituent dynamic covalent S-S bond is highly reactive in the presence of the sulfide (RS⁻) anion, imposing a persistent challenge to control the polymerization. Here, we report an anion-binding approach to arrest the high reactivity of the RS⁻ chain end to control the synthesis of linear poly(disulfide)s, realizing a rapid, living ring-opening polymerization of 1,2-dithiolanes with narrow dispersity and high regioselectivity (Mw/Mn ∼ 1.1, Ps ∼ 0.85). Mechanistic studies support the formation of a thiourea-base-sulfide ternary complex as the catalytically active species during the chain propagation. Theoretical analyses reveal a synergistic catalytic model where the catalyst preorganizes the protonated base and anionic chain end to establish spatial confinement over the bound monomer, effecting the observed regioselectivity. The catalytic system is amenable to monomers with various functional groups, and semicrystalline polymers are also obtained from lipoic acid derivatives by enhancing the regioselectivity.

9.
iScience ; 26(11): 108197, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-37965148

ABSTRACT

By soaking up microRNAs (miRNAs), long non-coding RNAs (lncRNAs) have the potential to regulate gene expression. Few methods based on this mechanism have been created to predict lncRNA-gene relationships. Hence, we present lncRNA-Top to forecast potential lncRNA-gene regulation relationships. Specifically, we constructed controlled deep-learning methods using 12417 lncRNAs and 16127 genes. We provide retrospective and innovative views on negative sampling, random seeds, cross-validation, metrics, and independent datasets. The AUC, AUPR, and our defined precision@k were used to evaluate performance. In-depth case studies demonstrate that 47 of the top 100 predicted unknown pairs are recorded in publications, supporting the predictive power. Our additional software can annotate the scores with target candidates. lncRNA-Top will be a helpful tool for uncovering prospective lncRNA targets and better understanding the regulatory processes of lncRNAs.
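
The abstract of record 9 mentions "our defined precision@k" without defining it. The sketch below therefore shows only the conventional precision@k (the fraction of the k highest-scoring pairs that are known positives); the function name, the toy pairs, and the scores are hypothetical, and the authors' own definition may differ.

```python
def precision_at_k(scored_pairs, known_positives, k):
    """Conventional precision@k over (pair, score) tuples.

    Assumption: this is the standard definition, not necessarily the
    variant defined by the lncRNA-Top authors.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank pairs from highest to lowest predicted score.
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    hits = sum(1 for pair, _score in ranked[:k] if pair in known_positives)
    # Divides by k even if fewer than k pairs were scored.
    return hits / k


if __name__ == "__main__":
    # Toy, made-up predictions purely for illustration.
    predictions = [
        (("lncA", "gene1"), 0.9),
        (("lncA", "gene2"), 0.8),
        (("lncB", "gene1"), 0.3),
        (("lncB", "gene3"), 0.7),
    ]
    validated = {("lncA", "gene1"), ("lncB", "gene3")}
    print(precision_at_k(predictions, validated, k=2))
```

Under this conventional reading, the case-study figure of 47 supported pairs among the top 100 would correspond to a precision@100 of 0.47.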

10.
Adv Sci (Weinh) ; 10(33): e2303502, 2023 11.
Article in English | MEDLINE | ID: mdl-37816141

ABSTRACT

Single-cell Hi-C (scHi-C) has made it possible to analyze chromatin organization at the single-cell level. However, scHi-C experiments generate inherently sparse data, which poses a challenge for loop calling methods. The existing approach performs significance tests across the imputed dense contact maps, leading to substantial computational overhead and loss of information at the single-cell level. To overcome this limitation, a lightweight framework called scGSLoop is proposed, which sets a new paradigm for scHi-C loop calling by adapting the training and inferencing strategies of graph-based deep learning to leverage the sequence features and 1D positional information of genomic loci. With this framework, sparsity is no longer a challenge, but rather an advantage that the model leverages to achieve unprecedented computational efficiency. Compared to existing methods, scGSLoop makes more accurate predictions and is able to identify more loops that have the potential to play regulatory roles in genome functioning. Moreover, scGSLoop preserves single-cell information by identifying a distinct group of loops for each individual cell, which not only enables an understanding of the variability of chromatin looping states between cells, but also allows scGSLoop to be extended for the investigation of multi-connected hubs and their underlying mechanisms.


Subjects
Chromatin; Genomics; Chromatin/genetics; Genome
11.
Article in Chinese | MEDLINE | ID: mdl-37905488

ABSTRACT

Extranodal NK/T cell lymphoma, nasal type (ENKTL) is a highly aggressive malignant tumor derived from NK cells. This article reports a case of ENKTL invading the larynx and digestive tract. The clinical manifestations include hoarseness and intranasal masses.


Subjects
Larynx; Lymphoma, Extranodal NK-T-Cell; Nose Neoplasms; Humans; Lymphoma, Extranodal NK-T-Cell/pathology; Nose/pathology; Nose Neoplasms/pathology; Larynx/pathology; Gastrointestinal Tract/pathology
12.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37249547

ABSTRACT

Pathogen detection from biological and environmental samples is important for global disease control. Despite advances in pathogen detection using deep learning, current algorithms have limitations in processing long genomic sequences. Through the deep cross-fusion of cross, residual and deep neural networks, we developed DCiPatho for accurate pathogen detection based on the integrated frequency features of 3-to-7 k-mers. Compared with the existing state-of-the-art algorithms, DCiPatho can be used to accurately identify distinct pathogenic bacteria infecting humans, animals and plants. We evaluated DCiPatho on both learned and unlearned pathogen species using both genomics and metagenomics datasets. DCiPatho is an effective tool for the genomic-scale identification of pathogens by integrating the frequency of k-mers into deep cross-fusion networks. The source code is publicly available at https://github.com/LorMeBioAI/DCiPatho.


Subjects
Algorithms; Software; Humans; Neural Networks, Computer; Genome; Genomics
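
The "integrated frequency features of 3-to-7 k-mers" in the abstract of record 12 can be illustrated with a small sketch. This is an assumption-laden illustration, not the DCiPatho implementation: the function names are hypothetical, k-mers containing non-ACGT characters are simply skipped, and no reverse-complement merging is applied (the abstract does not say whether the authors do either).

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k, alphabet="ACGT"):
    """Relative frequency of every possible k-mer, in lexicographic order."""
    sequence = sequence.upper()
    counts = Counter()
    total = 0
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Skip k-mers with ambiguous bases such as N (an assumption).
        if all(base in alphabet for base in kmer):
            counts[kmer] += 1
            total += 1
    vocabulary = ("".join(chars) for chars in product(alphabet, repeat=k))
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]


def integrated_kmer_features(sequence, ks=range(3, 8)):
    """Concatenate the k-mer frequency vectors for k = 3..7."""
    features = []
    for k in ks:
        features.extend(kmer_frequencies(sequence, k))
    return features


if __name__ == "__main__":
    vector = integrated_kmer_features("ACGTACGTACGGTTAACCGGATCGATCG")
    print(len(vector))  # 4**3 + 4**4 + 4**5 + 4**6 + 4**7 entries
```

A vector of this kind has a fixed length regardless of genome size, which is what makes it usable as input to a fixed-width neural network.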
13.
Adv Sci (Weinh) ; 10(11): e2204113, 2023 04.
Article in English | MEDLINE | ID: mdl-36762572

ABSTRACT

The single-cell RNA sequencing (scRNA-seq) quantifies the gene expression of individual cells, while the bulk RNA sequencing (bulk RNA-seq) characterizes the mixed transcriptome of cells. The inference of drug sensitivities for individual cells can provide new insights to understand the mechanism of anti-cancer response heterogeneity and drug resistance at the cellular resolution. However, pharmacogenomic information related to their corresponding scRNA-Seq is often limited. Therefore, a transfer learning model is proposed to infer the drug sensitivities at single-cell level. This framework learns bulk transcriptome profiles and pharmacogenomics information from population cell lines in a large public dataset and transfers the knowledge to infer drug efficacy of individual cells. The results suggest that it is suitable to learn knowledge from pre-clinical cell lines to infer pre-existing cell subpopulations with different drug sensitivities prior to drug exposure. In addition, the model offers a new perspective on drug combinations. It is observed that drug-resistant subpopulation can be sensitive to other drugs (e.g., a subset of JHU006 is Vorinostat-resistant while Gefitinib-sensitive); such finding corroborates the previously reported drug combination (Gefitinib + Vorinostat) strategy in several cancer types. The identified drug sensitivity biomarkers reveal insights into the tumor heterogeneity and treatment at cellular resolution.


Subjects
Transcriptome; RNA-Seq/methods; Gefitinib; Vorinostat; Transcriptome/genetics; Sequence Analysis, RNA/methods
14.
Article in English | MEDLINE | ID: mdl-36269909

ABSTRACT

Estimation of hand kinematics from surface electromyographic (sEMG) signals provides a non-invasive human-machine interface. This approach is usually subject-specific, so that training on one individual does not generalise to different subjects. In this paper, we propose a method based on the Bidirectional Encoder Representation from Transformers (BERT) structure to predict the movement of hands from the root mean square (RMS) feature of the sEMG signal following µ-law normalization. The method was tested under within-subject and cross-subject conditions. We trained the model with two hard sample mining methods, Gradient Harmonizing Mechanism (GHM) and Online Hard Sample Mining (OHEM). The proposed method was compared with classic approaches, including long short-term memory (LSTM) and Temporal Convolutional Network (TCN), as well as a recent method called Long Exposure Convolutional Memory Network (LE-ConvMN). Correlation coefficient (CC), normalized root mean square error (NRMSE) and time costs were used as performance metrics. Our method (sBERT-OHEM) achieved state-of-the-art performance in cross-subject evaluation as well as high performance in subject-specific tests on the Ninapro dataset. The above tests are based on the same 10 randomly selected subjects. Generally, in the cross-subject situation, performance unavoidably declines as the number of subjects increases. Nevertheless, the performance of our method on 38 subjects was significantly higher than that of the other methods on 10 subjects in cross-subject conditions, which further verifies the advantage of our method.


Subjects
Algorithms; Hand; Humans; Biomechanical Phenomena; Electromyography/methods; Movement
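
The preprocessing and metrics named in the abstract of record 14 (RMS feature, µ-law normalization, CC, NRMSE) can be sketched as follows. The abstract gives no parameters, so several choices here are assumptions: µ = 255, the standard µ-law companding formula, Pearson correlation for CC, and NRMSE normalised by the range of the true signal. All function names are hypothetical.

```python
import math


def rms(window):
    """Root mean square of one window of sEMG samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def mu_law(x, mu=255.0):
    """Standard mu-law companding of a value in [-1, 1]; mu=255 is an assumption."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def correlation_coefficient(predicted, actual):
    """Pearson correlation between predicted and measured joint angles."""
    mean_p = sum(predicted) / len(predicted)
    mean_a = sum(actual) / len(actual)
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


def nrmse(predicted, actual):
    """RMSE divided by the range of the true signal (one common convention)."""
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return math.sqrt(mse) / (max(actual) - min(actual))


if __name__ == "__main__":
    feature = mu_law(rms([0.02, -0.05, 0.04, -0.01]))
    print(feature)
    print(correlation_coefficient([1.0, 2.0, 2.9], [1.1, 2.0, 3.0]))
    print(nrmse([1.0, 2.0, 2.9], [1.1, 2.0, 3.0]))
```

µ-law companding stretches small amplitudes and compresses large ones, which is one plausible reason to apply it to RMS features whose values cluster near zero.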
15.
Front Cell Infect Microbiol ; 13: 1325103, 2023.
Article in English | MEDLINE | ID: mdl-38173793

ABSTRACT

Polymethyl methacrylate (PMMA) frequently features in dental restorative materials due to its favorable properties. However, its surface exhibits a propensity for bacterial colonization, and the material can fracture under masticatory pressure. This study incorporated commercially available RHA-1F-II nano-silver loaded zirconium phosphate (Ag-ZrP) into room-temperature cured PMMA at varying mass fractions. Various methods were employed to characterize Ag-ZrP. Subsequently, an examination of the effects of Ag-ZrP on the antimicrobial properties, biosafety, and mechanical properties of PMMA materials was conducted. The results indicated that the antibacterial rate against Streptococcus mutans was enhanced at Ag-ZrP additions of 0%wt, 0.5%wt, 1.0%wt, 1.5%wt, 2.0%wt, 2.5%wt, and 3.0%wt, achieving respective rates of 53.53%, 67.08%, 83.23%, 93.38%, 95.85%, and 98.00%. Similarly, the antibacterial rate against Escherichia coli registered at 31.62%, 50.14%, 64.00%, 75.09%, 86.30%, 92.98%. When Ag-ZrP was introduced at amounts ranging from 1.0% to 1.5%, PMMA materials exhibited peak mechanical properties. However, mechanical strength diminished beyond additions of 2.5%wt to 3.0%wt, relative to the 0%wt group, while PMMA demonstrated no notable cytotoxicity below a 3.0%wt dosage. Thus, it is inferred that optimal antimicrobial and mechanical properties of PMMA materials are achieved with nano-Ag-ZrP (RHA-1F-II) additions of 1.5%wt to 2.0%wt, without eliciting cytotoxicity.


Subjects
Anti-Infective Agents; Polymethyl Methacrylate; Polymethyl Methacrylate/pharmacology; Containment of Biohazards; Temperature; Anti-Bacterial Agents/pharmacology
16.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36274236

ABSTRACT

MOTIVATION: The identification of drug-target interactions (DTIs) plays a vital role in in silico drug discovery, in which the drug is the chemical molecule and the target is the protein residues in the binding pocket. Manual DTI annotation approaches remain reliable; however, it is notoriously laborious and time-consuming to test each drug-target pair exhaustively. Recently, the rapid growth of labelled DTI data has catalysed interest in high-throughput DTI prediction. Unfortunately, those methods rely heavily on features manually defined by humans, leading to errors. RESULTS: Here, we developed an end-to-end deep learning framework called CoaDTI to significantly improve the efficiency and interpretability of drug target annotation. CoaDTI incorporates the co-attention mechanism to model the interaction information from the drug modality and protein modality. In particular, CoaDTI incorporates a transformer to learn the protein representations from raw amino acid sequences, and GraphSage to extract the molecule graph features from SMILES. Furthermore, we propose a transfer learning strategy that encodes protein features with a pre-trained transformer to address the issue of scarce labelled data. The experimental results demonstrate that CoaDTI achieves competitive performance on three public datasets compared with state-of-the-art models. In addition, the transfer learning strategy further boosts the performance to an unprecedented level. The extended study reveals that CoaDTI can identify novel DTIs such as reactions between candidate drugs and severe acute respiratory syndrome coronavirus 2-associated proteins. The visualization of co-attention scores can illustrate the interpretability of our model for mechanistic insights. AVAILABILITY: Source code is publicly available at https://github.com/Layne-Huang/CoaDTI.


Subjects
COVID-19; Humans; Computer Simulation; Proteins/chemistry; Amino Acid Sequence; Drug Discovery/methods
17.
Article in Chinese | MEDLINE | ID: mdl-35822387

ABSTRACT

This paper reports a case of superficial angiomyxoma in the region of the nasal vestibule. The clinical manifestation was swelling of the left nasal vestibular skin, while paranasal sinus CT showed swollen soft tissue in the region anterior and superior to the left maxilla. Under general anesthesia, the left nasal vestibular mass was resected under nasal endoscopy. The postoperative pathological diagnosis was superficial angiomyxoma. The patient underwent a CT scan of the paranasal sinuses 4 months after the operation, and there was no recurrence of the tumor.


Subjects
Myxoma; Paranasal Sinuses; Endoscopy; Humans; Nasal Cavity/pathology; Paranasal Sinuses/pathology; Tomography, X-Ray Computed
18.
Comput Struct Biotechnol J ; 20: 3522-3532, 2022.
Article in English | MEDLINE | ID: mdl-35860402

ABSTRACT

Post-translational modifications (PTMs) are closely linked to numerous diseases and play a significant role in regulating protein structures, activities, and functions. Therefore, the identification of PTMs is crucial for understanding the mechanisms of cell biology and disease therapy. Compared to traditional machine learning methods, deep learning approaches for PTM prediction provide accurate and rapid screening, guiding downstream wet experiments to use the screened information for focused studies. In this paper, we review recent work in deep learning to identify phosphorylation, acetylation, ubiquitination, and other PTM types. In addition, we summarize PTM databases and discuss future directions with critical insights.

19.
IEEE J Biomed Health Inform ; 26(8): 4303-4313, 2022 08.
Article in English | MEDLINE | ID: mdl-35439152

ABSTRACT

Exploring the prognostic classification and biomarkers in Head and Neck Squamous Carcinoma (HNSC) is of great clinical significance. We hybridized three prominent strategies to comprehensively characterize the molecular features of HNSC. We constructed a 15-gene signature to predict patients' death risk with an average AUC of 0.744 for 1-, 3-, and 5-year on the TCGA-HNSC training set, and average AUCs of 0.636, 0.584, 0.755 in the GSE65858, GSE-112026, CPTAC-HNSCC datasets, respectively. By combining NMF clustering with consensus clustering of the fraction of tumor immune cell infiltration (ICI) in the tumor microenvironment (TME), we captured more refined biological characteristics of HNSC and observed prognostic heterogeneity in patients with high tumor immunity. By matching tumor subset-specific expression signatures to drug-induced cell line expression profiles from large-scale pharmacogenomic databases in the OCTAD workspace, we identified a group of HNSC patients with poor prognosis and demonstrated that the individuals in this group are likely to show increased drug sensitivity to reverse differentially expressed disease signature genes. This trend is especially pronounced among those with higher death risk and tumour immunity.


Subjects
Gene Expression Profiling; Head and Neck Neoplasms; Biomarkers, Tumor/genetics; Head and Neck Neoplasms/drug therapy; Head and Neck Neoplasms/genetics; Humans; Prognosis; Squamous Cell Carcinoma of Head and Neck/genetics; Transcriptome; Treatment Outcome; Tumor Microenvironment/genetics
20.
iScience ; 25(4): 104081, 2022 Apr 15.
Article in English | MEDLINE | ID: mdl-35372808

ABSTRACT

Human disease prediction from microbiome data has broad implications in metagenomics. It is rare for the existing methods to consider abundance profiles from both known and unknown microbial organisms, or capture the taxonomic relationships among microbial taxa, leading to significant information loss. On the other hand, deep learning has shown unprecedented advantages in classification tasks for its feature-learning ability. However, it encounters the opposite situation in metagenome-based disease prediction since high-dimensional low-sample-size metagenomic datasets can lead to severe overfitting; and black-box model fails in providing biological explanations. To circumvent the related problems, we developed MetaDR, a comprehensive machine learning-based framework that integrates various information and deep learning to predict human diseases. Experimental results indicate that MetaDR achieves competitive prediction performance with a reduction in running time, and effectively discovers the informative features with biological insights.
