Search | VHL Regional Portal (BVS)

Automated evaluation of masseter muscle volume: deep learning prognostic approach in oral cancer.

Sakamoto, Katsuya; Hiraoka, Shin-Ichiro; Kawamura, Kohei; Ruan, Peiying; Uchida, Shuji; Akiyama, Ryo; Lee, Chonho; Ide, Kazuki; Tanaka, Susumu.

BMC Cancer ; 24(1): 128, 2024 Jan 24.

Article in English | MEDLINE | ID: mdl-38267924

ABSTRACT

BACKGROUND: Sarcopenia has been identified as a potential negative prognostic factor in cancer patients. In this study, our objective was to investigate the relationship between a sarcopenia assessment method based on masseter muscle volume measured on computed tomography (CT) images and the life expectancy of patients with oral cancer. We also developed a deep learning model to automatically extract the masseter muscle volume and investigated its association with the life expectancy of oral cancer patients. METHODS: To develop the learning model for masseter muscle volume, we used manually extracted data from CT images of 277 patients. We examined the association between manually extracted masseter muscle volume and the life expectancy of oral cancer patients. Additionally, we assessed the correlation between the manually and automatically extracted masseter muscle volumes. RESULTS: Our findings revealed a significant association between manually extracted masseter muscle volume on CT images and the life expectancy of patients with oral cancer. Notably, the manually and automatically extracted volumes showed a high correlation. Furthermore, the masseter muscle volume automatically extracted using the developed learning model exhibited a strong association with life expectancy. CONCLUSIONS: The sarcopenia assessment method is useful for predicting the life expectancy of patients with oral cancer. In the future, it is crucial to validate and analyze various factors within the oral surgery field, extending beyond cancer patients.

Subjects

Deep Learning; Mouth Neoplasms; Sarcopenia; Humans; Prognosis; Masseter Muscle/diagnostic imaging; Sarcopenia/diagnostic imaging; Mouth Neoplasms/diagnostic imaging

Federated learning for predicting clinical outcomes in patients with COVID-19.

Dayan, Ittai; Roth, Holger R; Zhong, Aoxiao; Harouni, Ahmed; Gentili, Amilcare; Abidin, Anas Z; Liu, Andrew; Costa, Anthony Beardsworth; Wood, Bradford J; Tsai, Chien-Sung; Wang, Chih-Hung; Hsu, Chun-Nan; Lee, C K; Ruan, Peiying; Xu, Daguang; Wu, Dufan; Huang, Eddie; Kitamura, Felipe Campos; Lacey, Griffin; de Antônio Corradi, Gustavo César; Nino, Gustavo; Shin, Hao-Hsin; Obinata, Hirofumi; Ren, Hui; Crane, Jason C; Tetreault, Jesse; Guan, Jiahui; Garrett, John W; Kaggie, Joshua D; Park, Jung Gil; Dreyer, Keith; Juluru, Krishna; Kersten, Kristopher; Rockenbach, Marcio Aloisio Bezerra Cavalcanti; Linguraru, Marius George; Haider, Masoom A; AbdelMaseeh, Meena; Rieke, Nicola; Damasceno, Pablo F; E Silva, Pedro Mario Cruz; Wang, Pochuan; Xu, Sheng; Kawano, Shuichi; Sriswasdi, Sira; Park, Soo Young; Grist, Thomas M; Buch, Varun; Jantarabenjakul, Watsamon; Wang, Weichung; Tak, Won Young.

Nat Med ; 27(10): 1735-1743, 2021 Oct.

Article in English | MEDLINE | ID: mdl-34526699

ABSTRACT

Federated learning (FL) is a method used for training artificial intelligence models with data from multiple sources while maintaining data anonymity, thus removing many barriers to data sharing. Here we used data from 20 institutes across the globe to train a FL model, called EXAM (electronic medical record (EMR) chest X-ray AI model), that predicts the future oxygen requirements of symptomatic patients with COVID-19 using inputs of vital signs, laboratory data and chest X-rays. EXAM achieved an average area under the curve (AUC) >0.92 for predicting outcomes at 24 and 72 h from the time of initial presentation to the emergency room, and it provided 16% improvement in average AUC measured across all participating sites and an average increase in generalizability of 38% when compared with models trained at a single site using that site's data. For prediction of mechanical ventilation treatment or death at 24 h at the largest independent test site, EXAM achieved a sensitivity of 0.950 and specificity of 0.882. In this study, FL facilitated rapid data science collaboration without data exchange and generated a model that generalized across heterogeneous, unharmonized datasets for prediction of clinical outcomes in patients with COVID-19, setting the stage for the broader use of FL in healthcare.

Subjects

COVID-19/physiopathology; Machine Learning; Outcome Assessment, Health Care; COVID-19/therapy; COVID-19/virology; Electronic Health Records; Humans; Prognosis; SARS-CoV-2/isolation & purification
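
The abstract above describes federated learning as training one model across many sites without exchanging the underlying data. The following is a minimal sketch of that general idea using plain federated averaging on a toy one-variable linear model. It is not the EXAM implementation; the function names (`local_update`, `federated_average`, `make_site`), the synthetic site data, and all hyperparameters are assumptions chosen for illustration.

```python
import random


def local_update(w, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one site's private (x, y) pairs.

    The model is y ~ a * x + b; only the updated (a, b) leaves the site.
    """
    a, b = w
    n = len(data)
    for _ in range(epochs):
        grad_a = sum(2 * (a * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in data) / n
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b


def federated_average(sites, rounds=50):
    """Each round, every site trains locally and the server averages the
    returned parameters, weighted by each site's sample count."""
    total = sum(len(d) for d in sites)
    w = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(w, d) for d in sites]
        a = sum(len(d) / total * u[0] for u, d in zip(updates, sites))
        b = sum(len(d) / total * u[1] for u, d in zip(updates, sites))
        w = (a, b)
    return w


def make_site(n, slope, intercept, seed):
    """Hypothetical synthetic data for one site (illustration only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, slope * x + intercept + rng.gauss(0, 0.01)) for x in xs]


if __name__ == "__main__":
    sites = [make_site(40, 2.0, -1.0, 1),
             make_site(60, 2.0, -1.0, 2),
             make_site(100, 2.0, -1.0, 3)]
    print(federated_average(sites))
```

Only model parameters cross site boundaries in this sketch; the raw `(x, y)` pairs stay inside `local_update`, which is the property the abstract emphasizes.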

Deep learning with evolutionary and genomic profiles for identifying cancer subtypes.

Lin, Chun-Yu; Ruan, Peiying; Li, Ruiming; Yang, Jinn-Moon; See, Simon; Song, Jiangning; Akutsu, Tatsuya.

J Bioinform Comput Biol ; 17(3): 1940005, 2019 Jun.

Article in English | MEDLINE | ID: mdl-31288637

ABSTRACT

Cancer subtype identification is an unmet need in precision diagnosis. Recently, evolutionary conservation has been indicated to contain informative signatures for functional significance in cancers. However, the importance of evolutionary conservation in distinguishing cancer subtypes remains largely unclear. Here, we identified the evolutionarily conserved genes (i.e. core genes) and observed that they are primarily involved in cellular pathways relevant to cell growth and metabolism. By using these core genes, we developed two novel strategies, namely a feature-based strategy (FES) and an image-based strategy (IMS), by integrating their evolutionary and genomic profiles with a deep learning algorithm. In comparison with the FES using a random gene set and the strategy using the PAM50 classifier, the core gene set-based FES achieved a higher accuracy for identifying breast cancer subtypes. The IMS and FES using the core gene set yielded better performance than the other strategies in classifying both breast cancer subtypes and multiple cancer types. Moreover, the IMS is reproducible even when using different gene expression data (i.e. RNA-seq and microarray). Comprehensive analysis of eight cancer types demonstrates that our evolutionary conservation-based models represent a valid and helpful approach for identifying cancer subtypes and that the core gene set offers distinguishable clues of cancer subtypes.

Subjects

Computational Biology/methods; Neoplasms/genetics; Neoplasms/pathology; Base Sequence; Biological Evolution; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Conserved Sequence; Databases, Factual; Deep Learning; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Neural Networks, Computer; Protein Interaction Maps

On the number of driver nodes for controlling a Boolean network when the targets are restricted to attractors.

Hou, Wenpin; Ruan, Peiying; Ching, Wai-Ki; Akutsu, Tatsuya.

J Theor Biol ; 463: 1-11, 2019 Feb 21.

Article in English | MEDLINE | ID: mdl-30543810

ABSTRACT

It is known that many driver nodes are required to control complex biological networks. Previous studies imply that O(N) driver nodes are required in both linear complex network and Boolean network models with N nodes if an arbitrary state is specified as the target. In order to cope with this intrinsic difficulty, we consider a special case of the control problem in which the targets are restricted to attractors. For this special case, we mathematically prove under the uniform distribution of states in basins that the expected number of driver nodes is only O(log₂ N + log₂ M) for controlling Boolean networks, where M is the number of attractors. Since it is expected that M is not very large in many practical networks, the new model requires a much smaller number of driver nodes. This result is based on discovery of novel relationships between control problems on Boolean networks and the coupon collector's problem, a well-known concept in combinatorics. We also provide lower bounds of the number of driver nodes as well as simulation results using artificial and realistic network data, which support our theoretical findings.

Subjects

Models, Biological; Models, Theoretical; Algorithms; Systems Biology/methods
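
Two ingredients named in the abstract above, attractors of a Boolean network and the coupon collector's problem, can be illustrated with a short sketch. It does not reproduce the paper's proof or its driver-node selection; it only enumerates the attractors of a small synchronous Boolean network by exhaustive simulation and evaluates the classical coupon-collector expectation m·H_m. The three-node network `toy_update` and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def find_attractors(update, n):
    """Enumerate attractors of a synchronous Boolean network with n nodes.

    Every one of the 2**n states is followed until a state repeats; the
    repeating part of the trajectory is an attractor (fixed point or cycle).
    """
    attractors = set()
    for state in product((0, 1), repeat=n):
        seen = {}
        path = []
        while state not in seen:
            seen[state] = len(path)
            path.append(state)
            state = update(state)
        cycle = path[seen[state]:]
        # Rotate the cycle so it starts at its smallest state; this gives
        # each attractor one canonical representation.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


def coupon_collector_expectation(m):
    """Expected number of uniform draws needed to see all m coupons: m * H_m."""
    return m * sum(Fraction(1, k) for k in range(1, m + 1))


def toy_update(state):
    """Hypothetical 3-node network: x0' = x1, x1' = x0, x2' = x0 AND x1."""
    x0, x1, x2 = state
    return (x1, x0, x0 & x1)


if __name__ == "__main__":
    for attractor in sorted(find_attractors(toy_update, 3)):
        print(attractor)
    print(coupon_collector_expectation(3))
```

Exhaustive enumeration visits all 2^N states, so it is only practical for small N; it serves here to make the quantity M (the number of attractors) concrete.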

Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.

Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe.

BMC Bioinformatics ; 19(Suppl 1): 39, 2018 Feb 19.

Article in English | MEDLINE | ID: mdl-29504897

ABSTRACT

BACKGROUND: Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is, however, an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. RESULTS: In this paper, we use three promising kernel functions: the Min kernel and two pairwise kernels, namely the Metric Learning Pairwise Kernel (MLPK) and the Tensor Product Pairwise Kernel (TPPK). We also consider the normalized forms of the Min kernel. Then, we combine the Min kernel or its normalized form with one of the pairwise kernels by plugging. We apply kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of the normalized Min kernel and MLPK leads to the best F-measure and improves on the performance of our previous work, which had been the best existing method so far. CONCLUSIONS: We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state of the art.

Subjects

Algorithms; Multiprotein Complexes/chemistry; Dimerization; Multiprotein Complexes/classification; Phylogeny; Protein Domains; Protein Interaction Maps; Protein Multimerization; Support Vector Machine
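
The kernels named in the abstract above have standard textbook forms, sketched below for protein pairs represented as plain feature vectors. Two points are assumptions rather than details taken from the paper: the normalization shown is the common cosine-style form K(x, y) / sqrt(K(x, x) K(y, y)), and "plugging" is read as using the (normalized) Min kernel as the base kernel inside a pairwise kernel. The helper names and example vectors are hypothetical.

```python
import math


def min_kernel(x, y):
    """Min (histogram intersection) kernel on non-negative feature vectors."""
    return sum(min(a, b) for a, b in zip(x, y))


def normalized(kernel):
    """Cosine-style normalization (assumed form): K(x,y) / sqrt(K(x,x) K(y,y))."""
    def k(x, y):
        denom = math.sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom > 0 else 0.0
    return k


def tppk(kernel):
    """Tensor Product Pairwise Kernel built on a base kernel between proteins."""
    def k(p, q):
        (a, b), (c, d) = p, q
        return kernel(a, c) * kernel(b, d) + kernel(a, d) * kernel(b, c)
    return k


def mlpk(kernel):
    """Metric Learning Pairwise Kernel built on a base kernel between proteins."""
    def k(p, q):
        (a, b), (c, d) = p, q
        return (kernel(a, c) - kernel(a, d) - kernel(b, c) + kernel(b, d)) ** 2
    return k


if __name__ == "__main__":
    # Hypothetical feature vectors for four proteins.
    a, b = (1, 0, 2), (0, 3, 1)
    c, d = (2, 1, 1), (0, 2, 0)
    print(tppk(min_kernel)((a, b), (c, d)))
    print(mlpk(normalized(min_kernel))((a, b), (c, d)))
```

Both pairwise kernels are symmetric in the order of the two proteins within a pair, which matters because a candidate heterodimer has no natural ordering. A pairwise kernel matrix computed this way could then be passed to an SVM that accepts precomputed kernels.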

Feature weight estimation for gene selection: a local hyperlinear learning approach.

Cai, Hongmin; Ruan, Peiying; Ng, Michael; Akutsu, Tatsuya.

BMC Bioinformatics ; 15: 70, 2014 Mar 14.

Article in English | MEDLINE | ID: mdl-24625071

ABSTRACT

BACKGROUND: Modeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments; nevertheless, it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes buried in high-dimensional irrelevant noise. RELIEF is a popular and widely used approach for feature selection owing to its low computational cost and high accuracy. However, RELIEF-based methods suffer from instability, especially in the presence of noisy and/or high-dimensional outliers. RESULTS: We propose an innovative feature weighting algorithm, called LHR, to select informative genes from highly noisy data. LHR is based on RELIEF for feature weighting using classical margin maximization. The key idea of LHR is to estimate the feature weights through local approximation rather than global measurement, which is typically used in existing methods. The weights obtained by our method are very robust in terms of degradation of noisy features, even those with vast dimensions. To demonstrate the performance of our method, extensive experiments involving classification tests have been carried out on both synthetic and real microarray benchmark datasets by combining the proposed technique with standard classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), linear discriminant analysis (LDA) and naive Bayes (NB). CONCLUSION: Experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed feature selection method combined with supervised learning in three aspects: 1) high classification accuracy, 2) excellent robustness to noise and 3) good stability across various classification algorithms.

Subjects

Computational Biology/methods; Gene Expression Profiling/methods; Support Vector Machine; Algorithms; Bayes Theorem; Cluster Analysis; Databases, Genetic; Discriminant Analysis; Humans; Neoplasms/genetics; Neoplasms/metabolism; Oligonucleotide Array Sequence Analysis
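
For context on the baseline that the abstract above builds on, here is a minimal sketch of classic RELIEF feature weighting (nearest hit and nearest miss under an L1 distance). This is the textbook baseline, not the LHR algorithm proposed in the paper, whose local-approximation step is not specified in the abstract. The function name and the toy dataset are hypothetical.

```python
def relief_weights(X, y):
    """Classic RELIEF: reward features that differ on the nearest sample of
    another class (miss) and penalize those that differ on the nearest
    sample of the same class (hit).

    Assumes every class has at least two samples and there are at least
    two classes.
    """
    n, d = len(X), len(X[0])

    def dist(u, v):
        # L1 distance over all features
        return sum(abs(a - b) for a, b in zip(u, v))

    w = [0.0] * d
    for i in range(n):
        hits = [X[j] for j in range(n) if j != i and y[j] == y[i]]
        misses = [X[j] for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda v: dist(X[i], v))
        miss = min(misses, key=lambda v: dist(X[i], v))
        for k in range(d):
            w[k] += (abs(X[i][k] - miss[k]) - abs(X[i][k] - hit[k])) / n
    return w


if __name__ == "__main__":
    # Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
    X = [(0.0, 0.3), (0.1, 0.9), (0.2, 0.1),
         (1.0, 0.2), (0.9, 0.8), (1.1, 0.5)]
    y = [0, 0, 0, 1, 1, 1]
    print(relief_weights(X, y))
```

Because the nearest hit and miss are found in the full feature space, many noisy dimensions can distort which neighbors are selected; this is the instability the abstract attributes to RELIEF-based methods.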

Prediction of heterotrimeric protein complexes by two-phase learning using neighboring kernels.

Ruan, Peiying; Hayashida, Morihiro; Maruyama, Osamu; Akutsu, Tatsuya.

BMC Bioinformatics ; 15 Suppl 2: S6, 2014.

Article in English | MEDLINE | ID: mdl-24564744

ABSTRACT

BACKGROUND: Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find protein complexes with sizes of more than three. It is known, however, that protein complexes with smaller sizes account for a large share of all complexes in several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for prediction of heterodimeric protein complexes, which outperforms existing methods. RESULTS: We propose methods for prediction of heterotrimeric protein complexes by extending the techniques of the previous work, on the basis of the idea that most heterotrimeric protein complexes are not likely to share the same protein with each other. We make use of the discriminant function in support vector machines (SVMs), and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform 10-fold cross-validation computational experiments. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes. CONCLUSIONS: We propose two-phase prediction methods with the extended features, the domain composition kernel, SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel using SVM as the second classifier is particularly useful for prediction of heterotrimeric protein complexes.

Subjects

Multiprotein Complexes/analysis; Support Vector Machine; Discriminant Analysis; Protein Multimerization

Proteome compression via protein domain compositions.

Hayashida, Morihiro; Ruan, Peiying; Akutsu, Tatsuya.

Methods ; 67(3): 380-5, 2014 Jun 01.

Article in English | MEDLINE | ID: mdl-24486717

ABSTRACT

In this paper, we study domain compositions of proteins via compression of whole proteins in an organism in order to obtain the entropy that the individual contains. We suppose that a protein is a multiset of domains. Since gene duplication and fusion have occurred through evolutionary processes, the same domains and the same compositions of domains appear in multiple proteins, which enables us to compress a proteome by using references to proteins for duplicated and fused proteins. Such a network with references to at most two proteins is modeled as a directed hypergraph. We propose a heuristic approach combining the Edmonds algorithm and integer linear programming, and apply our procedure to 14 proteomes of Dictyostelium discoideum, Escherichia coli, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Caenorhabditis elegans, Drosophila melanogaster, Arabidopsis thaliana, Oryza sativa, Danio rerio, Xenopus laevis, Gallus gallus, Mus musculus, Pan troglodytes, and Homo sapiens. The compressed size using both duplication and fusion was smaller than that using only duplication, which suggests the importance of fusion events in the evolution of a proteome.

Subjects

Protein Structure, Tertiary; Proteome; Proteomics/methods; Algorithms; Animals; Databases, Protein; Humans; Sequence Analysis, Protein

Prediction of heterodimeric protein complexes from weighted protein-protein interaction networks using novel features and kernel functions.

Ruan, Peiying; Hayashida, Morihiro; Maruyama, Osamu; Akutsu, Tatsuya.

PLoS One ; 8(6): e65265, 2013.

Article in English | MEDLINE | ID: mdl-23776458

ABSTRACT

Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many methods for predicting protein complexes from protein-protein interactions have been developed, such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt only with complexes with sizes of more than three because they are often based on some density of subgraphs. However, heterodimeric protein complexes, which consist of two distinct proteins, account for a large share of known complexes according to several comprehensive databases. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, the domain composition kernel, and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. The results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.

Subjects

Computational Biology/methods; Protein Interaction Mapping; Proteins/chemistry; Algorithms; Software
