Results 1 - 9 of 9
1.
BMC Cancer ; 24(1): 128, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38267924

ABSTRACT

BACKGROUND: Sarcopenia has been identified as a potential negative prognostic factor in cancer patients. In this study, our objective was to investigate the relationship between the assessment method for sarcopenia using the masseter muscle volume measured on computed tomography (CT) images and the life expectancy of patients with oral cancer. We also developed a learning model using deep learning to automatically extract the masseter muscle volume and investigated its association with the life expectancy of oral cancer patients. METHODS: To develop the learning model for masseter muscle volume, we used manually extracted data from CT images of 277 patients. We established the association between manually extracted masseter muscle volume and the life expectancy of oral cancer patients. Additionally, we compared the correlation between the groups of manual and automatic extraction in the masseter muscle volume learning model. RESULTS: Our findings revealed a significant association between manually extracted masseter muscle volume on CT images and the life expectancy of patients with oral cancer. Notably, the manual and automatic extraction groups in the masseter muscle volume learning model showed a high correlation. Furthermore, the masseter muscle volume automatically extracted using the developed learning model exhibited a strong association with life expectancy. CONCLUSIONS: The sarcopenia assessment method is useful for predicting the life expectancy of patients with oral cancer. In the future, it is crucial to validate and analyze various factors within the oral surgery field, extending beyond cancer patients.


Subject(s)
Deep Learning , Mouth Neoplasms , Sarcopenia , Humans , Prognosis , Masseter Muscle/diagnostic imaging , Sarcopenia/diagnostic imaging , Mouth Neoplasms/diagnostic imaging
2.
Nat Med ; 27(10): 1735-1743, 2021 10.
Article in English | MEDLINE | ID: mdl-34526699

ABSTRACT

Federated learning (FL) is a method used for training artificial intelligence models with data from multiple sources while maintaining data anonymity, thus removing many barriers to data sharing. Here we used data from 20 institutes across the globe to train a FL model, called EXAM (electronic medical record (EMR) chest X-ray AI model), that predicts the future oxygen requirements of symptomatic patients with COVID-19 using inputs of vital signs, laboratory data and chest X-rays. EXAM achieved an average area under the curve (AUC) >0.92 for predicting outcomes at 24 and 72 h from the time of initial presentation to the emergency room, and it provided 16% improvement in average AUC measured across all participating sites and an average increase in generalizability of 38% when compared with models trained at a single site using that site's data. For prediction of mechanical ventilation treatment or death at 24 h at the largest independent test site, EXAM achieved a sensitivity of 0.950 and specificity of 0.882. In this study, FL facilitated rapid data science collaboration without data exchange and generated a model that generalized across heterogeneous, unharmonized datasets for prediction of clinical outcomes in patients with COVID-19, setting the stage for the broader use of FL in healthcare.
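
The abstract does not state which aggregation rule was used to combine the per-site models. As an illustration of how federated learning can merge site-level results without exchanging patient data, the following minimal sketch assumes federated averaging (FedAvg), a common choice. The function name, site names, and toy weights are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def federated_average(
    site_updates: Dict[str, Tuple[List[float], int]]
) -> List[float]:
    """Combine per-site model weights into one global weight vector.

    Each entry maps a (hypothetical) site name to a pair:
    (locally trained weights, number of local training samples).
    Only weights and counts leave a site; raw records never do.
    """
    total_samples = sum(n for _, n in site_updates.values())
    if total_samples == 0:
        raise ValueError("at least one site must contribute samples")

    dim = len(next(iter(site_updates.values()))[0])
    global_weights = [0.0] * dim
    for weights, n in site_updates.values():
        if len(weights) != dim:
            raise ValueError("all sites must share the same model shape")
        # Sites with more data contribute proportionally more.
        for i, w in enumerate(weights):
            global_weights[i] += w * n / total_samples
    return global_weights


if __name__ == "__main__":
    updates = {
        "site_a": ([1.0, 2.0], 1),  # toy values, assumed for illustration
        "site_b": ([3.0, 4.0], 3),
    }
    print(federated_average(updates))
```

In a real deployment this step would be repeated over many rounds, with each site retraining from the latest global weights before the next aggregation.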


Subject(s)
COVID-19/physiopathology , Machine Learning , Outcome Assessment, Health Care , COVID-19/therapy , COVID-19/virology , Electronic Health Records , Humans , Prognosis , SARS-CoV-2/isolation & purification
3.
J Bioinform Comput Biol ; 17(3): 1940005, 2019 06.
Article in English | MEDLINE | ID: mdl-31288637

ABSTRACT

Cancer subtype identification is an unmet need in precision diagnosis. Recently, evolutionary conservation has been indicated to contain informative signatures for functional significance in cancers. However, the importance of evolutionary conservation in distinguishing cancer subtypes remains largely unclear. Here, we identified the evolutionarily conserved genes (i.e. core genes) and observed that they are primarily involved in cellular pathways relevant to cell growth and metabolism. By using these core genes, we developed two novel strategies, namely a feature-based strategy (FES) and an image-based strategy (IMS), by integrating their evolutionary and genomic profiles with a deep learning algorithm. In comparison with the FES using the random set and the strategy using the PAM50 classifier, the core gene set-based FES achieved a higher accuracy for identifying breast cancer subtypes. The IMS and FES using the core gene set yielded better performance than the other strategies, in terms of classifying both breast cancer subtypes and multiple cancer types. Moreover, the IMS is reproducible even when using different gene expression data (i.e. RNA-seq and microarray). Comprehensive analysis of eight cancer types demonstrates that our evolutionary conservation-based models represent a valid and helpful approach for identifying cancer subtypes and that the core gene set offers distinguishable clues of cancer subtypes.


Subject(s)
Computational Biology/methods , Neoplasms/genetics , Neoplasms/pathology , Base Sequence , Biological Evolution , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Conserved Sequence , Databases, Factual , Deep Learning , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Neural Networks, Computer , Protein Interaction Maps
4.
J Theor Biol ; 463: 1-11, 2019 02 21.
Article in English | MEDLINE | ID: mdl-30543810

ABSTRACT

It is known that many driver nodes are required to control complex biological networks. Previous studies imply that O(N) driver nodes are required in both linear complex network and Boolean network models with N nodes if an arbitrary state is specified as the target. In order to cope with this intrinsic difficulty, we consider a special case of the control problem in which the targets are restricted to attractors. For this special case, we mathematically prove under the uniform distribution of states in basins that the expected number of driver nodes is only O(log₂ N + log₂ M) for controlling Boolean networks, where M is the number of attractors. Since it is expected that M is not very large in many practical networks, the new model requires a much smaller number of driver nodes. This result is based on the discovery of novel relationships between control problems on Boolean networks and the coupon collector's problem, a well-known concept in combinatorics. We also provide lower bounds on the number of driver nodes as well as simulation results using artificial and realistic network data, which support our theoretical findings.
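
For readers unfamiliar with the coupon collector's problem mentioned above: when coupons are drawn uniformly at random from n distinct types, the expected number of draws needed to see every type at least once is n·H_n, where H_n is the n-th harmonic number. The sketch below computes that expectation and runs a small simulation. It illustrates only the classical problem; how the paper maps it onto driver-node selection is not described in the abstract, and the function names here are hypothetical.

```python
import random


def expected_draws(n: int) -> float:
    """Closed-form expectation n * H_n for collecting all n coupon types."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n * sum(1.0 / k for k in range(1, n + 1))


def simulate_draws(n: int, rng: random.Random) -> int:
    """Draw uniformly with replacement until every type has been seen."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n = 20
    trials = 2000
    mean = sum(simulate_draws(n, rng) for _ in range(trials)) / trials
    print(f"closed form: {expected_draws(n):.2f}, simulated mean: {mean:.2f}")
```

The closed form grows as n·ln n, which is the kind of logarithmic factor that appears in coupon-collector arguments.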


Subject(s)
Models, Biological , Models, Theoretical , Algorithms , Systems Biology/methods
5.
BMC Bioinformatics ; 19(Suppl 1): 39, 2018 02 19.
Article in English | MEDLINE | ID: mdl-29504897

ABSTRACT

BACKGROUND: Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is, however, an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. RESULTS: In this paper, we use three promising kernel functions: the Min kernel and two pairwise kernels, namely the Metric Learning Pairwise Kernel (MLPK) and the Tensor Product Pairwise Kernel (TPPK). We also consider the normalized forms of the Min kernel. Then, we combine the Min kernel or its normalized form with one of the pairwise kernels by plugging. We apply kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of the normalized Min kernel and MLPK leads to the best F-measure and improves on the performance of our previous work, which had been the best existing method so far. CONCLUSIONS: We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.
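
The kernels named above can be sketched as follows, using the forms in which they are commonly given: the Min kernel sums element-wise minima of two non-negative feature vectors, TPPK is K(a,c)K(b,d) + K(a,d)K(b,c), and MLPK is (K(a,c) - K(a,d) - K(b,c) + K(b,d))². "Plugging" is read here as using the Min kernel (or its normalized form) as the base kernel K inside a pairwise kernel. Whether the paper uses exactly these conventions is an assumption, and the function names and toy vectors are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]
Pair = Tuple[Vector, Vector]


def min_kernel(x: Vector, y: Vector) -> float:
    """Sum of element-wise minima; assumes non-negative features."""
    return sum(min(a, b) for a, b in zip(x, y))


def normalized(kernel: Kernel) -> Kernel:
    """Cosine-style normalization: K(x, y) / sqrt(K(x, x) * K(y, y))."""
    def k(x: Vector, y: Vector) -> float:
        denom = math.sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom > 0 else 0.0
    return k


def tppk(base: Kernel) -> Callable[[Pair, Pair], float]:
    """Tensor Product Pairwise Kernel built on a base kernel."""
    def k(p: Pair, q: Pair) -> float:
        (a, b), (c, d) = p, q
        return base(a, c) * base(b, d) + base(a, d) * base(b, c)
    return k


def mlpk(base: Kernel) -> Callable[[Pair, Pair], float]:
    """Metric Learning Pairwise Kernel built on a base kernel."""
    def k(p: Pair, q: Pair) -> float:
        (a, b), (c, d) = p, q
        return (base(a, c) - base(a, d) - base(b, c) + base(b, d)) ** 2
    return k


if __name__ == "__main__":
    # Toy feature vectors for four hypothetical proteins.
    a, b, c, d = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]
    pair_kernel = mlpk(normalized(min_kernel))
    print(pair_kernel((a, b), (c, d)))
```

Both pairwise kernels are symmetric in the order of the two proteins within a pair, which matters because a candidate heterodimer (a, b) is the same complex as (b, a).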


Subject(s)
Algorithms , Multiprotein Complexes/chemistry , Dimerization , Multiprotein Complexes/classification , Phylogeny , Protein Domains , Protein Interaction Maps , Protein Multimerization , Support Vector Machine
6.
BMC Bioinformatics ; 15: 70, 2014 Mar 14.
Article in English | MEDLINE | ID: mdl-24625071

ABSTRACT

BACKGROUND: Modeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments; nevertheless, it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes buried in high-dimensional irrelevant noise. RELIEF is a popular and widely used approach for feature selection owing to its low computational cost and high accuracy. However, RELIEF-based methods suffer from instability, especially in the presence of noisy and/or high-dimensional outliers. RESULTS: We propose an innovative feature weighting algorithm, called LHR, to select informative genes from highly noisy data. LHR is based on RELIEF for feature weighting using classical margin maximization. The key idea of LHR is to estimate the feature weights through local approximation rather than global measurement, which is typically used in existing methods. The weights obtained by our method are very robust in terms of degradation of noisy features, even those with vast dimensions. To demonstrate the performance of our method, extensive experiments involving classification tests have been carried out on both synthetic and real microarray benchmark datasets by combining the proposed technique with standard classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), linear discriminant analysis (LDA) and naive Bayes (NB). CONCLUSION: Experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed feature selection method combined with supervised learning in three aspects: 1) high classification accuracy, 2) excellent robustness to noise and 3) good stability with respect to various classification algorithms.
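
To make the starting point concrete, here is a minimal sketch of the classic RELIEF weighting scheme that LHR builds on. It is not the LHR algorithm itself, whose local approximation step is not detailed in the abstract. For each sample, the nearest same-class neighbor (hit) and nearest other-class neighbor (miss) are found, and a feature's weight rises when it separates the sample from the miss more than from the hit. The function name and toy data are hypothetical, and features are assumed to be scaled to [0, 1].

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Classic RELIEF feature weights, using every sample once (no sampling)."""
    n, d = len(X), len(X[0])

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        # Manhattan distance over all features.
        return sum(abs(u - v) for u, v in zip(a, b))

    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # cannot form a margin for this sample
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward separation from the miss, penalize distance to the hit.
            weights[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return weights


if __name__ == "__main__":
    # Toy data: feature 0 tracks the class label, feature 1 is noise.
    X = [[0.0, 0.3], [0.1, 0.9], [1.0, 0.2], [0.9, 0.8]]
    y = [0, 0, 1, 1]
    print(relief_weights(X, y))
```

Because each update depends on a single nearest hit and miss, one noisy neighbor can swing the weights, which is the instability the abstract refers to.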


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Support Vector Machine , Algorithms , Bayes Theorem , Cluster Analysis , Databases, Genetic , Discriminant Analysis , Humans , Neoplasms/genetics , Neoplasms/metabolism , Oligonucleotide Array Sequence Analysis
7.
BMC Bioinformatics ; 15 Suppl 2: S6, 2014.
Article in English | MEDLINE | ID: mdl-24564744

ABSTRACT

BACKGROUND: Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find protein complexes with a size of more than three. It is known, however, that smaller protein complexes make up a large proportion of all complexes in several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for prediction of heterodimeric protein complexes, which outperforms existing methods. RESULTS: We propose methods for prediction of heterotrimeric protein complexes by extending techniques in the previous work on the basis of the idea that most heterotrimeric protein complexes are not likely to share the same protein with each other. We make use of the discriminant function in support vector machines (SVMs), and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform 10-fold cross-validation computational experiments. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes. CONCLUSIONS: We propose two-phase prediction methods with the extended features, the domain composition kernel, SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel using SVM as the second classifier is particularly useful for prediction of heterotrimeric protein complexes.


Subject(s)
Multiprotein Complexes/analysis , Support Vector Machine , Discriminant Analysis , Protein Multimerization
8.
Methods ; 67(3): 380-5, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24486717

ABSTRACT

In this paper, we study domain compositions of proteins via compression of whole proteins in an organism for the sake of obtaining the entropy that the individual contains. We suppose that a protein is a multiset of domains. Since gene duplication and fusion have occurred through evolutionary processes, the same domains and the same compositions of domains appear in multiple proteins, which enables us to compress a proteome by using references to proteins for duplicated and fused proteins. Such a network with references to at most two proteins is modeled as a directed hypergraph. We propose a heuristic approach that combines the Edmonds algorithm and integer linear programming, and apply our procedure to 14 proteomes: Dictyostelium discoideum, Escherichia coli, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Caenorhabditis elegans, Drosophila melanogaster, Arabidopsis thaliana, Oryza sativa, Danio rerio, Xenopus laevis, Gallus gallus, Mus musculus, Pan troglodytes, and Homo sapiens. The compressed size using both duplication and fusion was smaller than that using only duplication, which suggests the importance of fusion events in the evolution of a proteome.


Subject(s)
Protein Structure, Tertiary , Proteome , Proteomics/methods , Algorithms , Animals , Databases, Protein , Humans , Sequence Analysis, Protein
9.
PLoS One ; 8(6): e65265, 2013.
Article in English | MEDLINE | ID: mdl-23776458

ABSTRACT

Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many prediction methods for protein complexes from protein-protein interactions have been developed such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt with only complexes with size of more than three because the methods often are based on some density of subgraphs. However, heterodimeric protein complexes that consist of two distinct proteins occupy a large part according to several comprehensive databases of known complexes. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. These results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping , Proteins/chemistry , Algorithms , Software