Results 1 - 6 of 6
1.
J Biomed Inform ; 58: 49-59, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26423562

ABSTRACT

Liver cancer is the sixth most frequently diagnosed cancer, and Hepatocellular Carcinoma (HCC) in particular accounts for more than 90% of primary liver cancers. Clinicians decide each patient's treatment on the basis of evidence-based medicine, which may not always apply to a specific patient given the biological variability among individuals. Over the years, several studies on Hepatocellular Carcinoma have developed strategies to assist clinicians in decision making, using computational methods (e.g. machine learning techniques) to extract knowledge from clinical data. However, these studies have limitations that have not yet been addressed: some do not focus entirely on Hepatocellular Carcinoma patients, others have strict application boundaries, and none considers either the heterogeneity among patients or the presence of missing data, a common drawback in healthcare contexts. In this work, a real, complex Hepatocellular Carcinoma database composed of heterogeneous clinical features is studied. We propose a new cluster-based oversampling approach, robust to small and imbalanced datasets, which accounts for the heterogeneity of patients with Hepatocellular Carcinoma. The preprocessing procedures are based on data imputation using a distance metric suited to both heterogeneous and missing data (HEOM), and on clustering studies (K-means) to assess the underlying patient groups in the dataset. The final approach is applied to reduce the impact of small underlying patient profiles on survival prediction. It combines K-means clustering and the SMOTE algorithm to build a representative dataset, which is then used as the training set for different machine learning procedures (logistic regression and neural networks). The results are evaluated in terms of survival prediction and compared, using the Friedman rank test, against baseline approaches that do not consider clustering and/or oversampling. Our proposed methodology coupled with neural networks outperformed all others, suggesting an improvement over the classical approaches currently used in Hepatocellular Carcinoma prediction models.
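
To make the HEOM distance named in this abstract more concrete, here is a minimal Python sketch of the Heterogeneous Euclidean-Overlap Metric as it is commonly defined: a per-attribute distance of 1 when either value is missing, an overlap (equal / not equal) distance for nominal attributes, and a range-normalised absolute difference for numeric attributes. The function name `heom`, the use of `None` as the missing-value marker, and the example records are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def heom(x, y, is_nominal, ranges):
    """Heterogeneous Euclidean-Overlap Metric between two records.

    x, y       -- sequences of attribute values; None marks a missing value
                  (an assumption of this sketch)
    is_nominal -- is_nominal[a] is True when attribute a is categorical
    ranges     -- ranges[a] is (max - min) of numeric attribute a over the
                  dataset; ignored for nominal attributes
    """
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0                          # missing value: maximal distance
        elif is_nominal[a]:
            d = 0.0 if xa == ya else 1.0     # overlap distance
        elif ranges[a]:
            d = abs(xa - ya) / ranges[a]     # range-normalised difference
        else:
            d = 0.0                          # constant attribute
        total += d * d
    return math.sqrt(total)


# Hypothetical records: (sex, age, a lab value that is missing in the first)
patient_a = ["male", 50.0, None]
patient_b = ["male", 60.0, 3.0]
distance = heom(patient_a, patient_b,
                is_nominal=[True, False, False],
                ranges=[None, 40.0, 10.0])
```

Because every per-attribute distance lies in [0, 1], nominal, numeric and missing attributes contribute on a comparable scale, which is what lets a nearest-neighbour imputation step work on mixed clinical data.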


Subject(s)
Carcinoma, Hepatocellular/physiopathology , Liver Neoplasms/physiopathology , Cluster Analysis , Humans
2.
Comput Biol Med ; 59: 125-133, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25725446

ABSTRACT

Breast cancer is the most frequently diagnosed cancer in women. Using historical patient information stored in clinical datasets, data mining and machine learning approaches can be applied to predict the survival of breast cancer patients. A common drawback is the absence of information, i.e., missing data, in certain clinical trials. Most standard prediction methods cannot handle incomplete samples, so missing data imputation is widely applied to work around this limitation. Therefore, taking into account the characteristics of each breast cancer dataset, a detailed analysis is required to determine the most appropriate imputation and prediction methods in each clinical environment. This research work analyzes a real breast cancer dataset from the Portuguese Institute of Oncology of Porto with a high percentage of unknown categorical information (most clinical data of the patients are incomplete), which makes the problem challenging. Four scenarios are evaluated: (I) 5-year survival prediction without imputation, and 5-year survival prediction from the cleaned dataset with (II) Mode imputation, (III) Expectation-Maximization imputation and (IV) K-Nearest Neighbors imputation. Prediction models for breast cancer survivability are constructed using four different methods: K-Nearest Neighbors, Classification Trees, Logistic Regression and Support Vector Machines. Experiments are performed in a nested ten-fold cross-validation procedure and, according to the obtained results, the best performance is provided by the K-Nearest Neighbors algorithm: an accuracy above 81% and an area under the Receiver Operating Characteristic curve above 0.78, which are very good results in this complex scenario.
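
The K-Nearest Neighbors imputation scenario can be sketched for purely categorical data as follows. This is a simplified illustration under stated assumptions, not the procedure used in the paper: `None` marks a missing value, the distance between two records is the fraction of mismatching attributes among those observed in both, and a missing entry is filled with the most frequent value among the `k` nearest records that have that attribute observed. The function names and the toy records are hypothetical.

```python
from collections import Counter


def overlap_distance(a, b):
    """Fraction of mismatches over attributes observed in both records."""
    shared = [(u, v) for u, v in zip(a, b) if u is not None and v is not None]
    if not shared:
        return 1.0  # nothing to compare: treat as maximally distant
    return sum(u != v for u, v in shared) / len(shared)


def knn_impute(records, k=3):
    """Return a copy of `records` with None entries filled by a k-NN vote."""
    imputed = []
    for i, rec in enumerate(records):
        new = list(rec)
        for j, value in enumerate(rec):
            if value is not None:
                continue
            # Candidate donors: other records with attribute j observed.
            donors = [r for idx, r in enumerate(records)
                      if idx != i and r[j] is not None]
            donors.sort(key=lambda r: overlap_distance(rec, r))
            votes = Counter(r[j] for r in donors[:k])
            if votes:
                new[j] = votes.most_common(1)[0][0]
        imputed.append(new)
    return imputed


# Toy categorical records (hypothetical attribute values).
records = [
    ["pos", "high", "yes"],
    ["pos", "high", "yes"],
    ["neg", "low", "no"],
    ["neg", "low", "no"],
    ["pos", "high", None],
]
completed = knn_impute(records, k=2)
```

Mode imputation (scenario II) is the limiting case in which every record with the attribute observed is treated as a neighbour.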


Subject(s)
Breast Neoplasms/mortality , Models, Statistical , Survival Analysis , Adult , Aged , Aged, 80 and over , Algorithms , Female , Humans , Middle Aged , Young Adult
3.
Int J Neural Syst ; 23(4): 1350015, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23746288

ABSTRACT

Discriminative features have to be properly extracted and selected from the electroencephalographic (EEG) signals of each specific subject in order to achieve an adaptive brain-computer interface (BCI) system. This work presents an efficient wrapper-based methodology for feature selection and least squares discrimination of high-dimensional EEG data with low computational complexity. Features are computed in different time segments using three widely used methods for motor imagery tasks and are then concatenated or averaged in order to take into account the time course variability of the EEG signals. Once the EEG features have been extracted, the proposed framework comprises two stages. The first stage entails feature ranking; in this work, two different procedures, least angle regression (LARS) and the Wilcoxon rank sum test, are considered and their performance compared. The second stage selects the most relevant features using an efficient leave-one-out (LOO) estimation based on Allen's PRESS statistic. Experimental comparisons with state-of-the-art BCI methods show that this approach gives better results in terms of recognition rates and computational requirements; regarding the first ranking stage, they also confirm that the LARS algorithm provides better results than the Wilcoxon rank sum test in these experiments.
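
The reason a PRESS-based leave-one-out estimate is efficient is that, for an ordinary least squares fit, each leave-one-out residual equals the ordinary residual divided by (1 - h_ii), where h_ii is the leverage of sample i, so no refitting is needed. The sketch below checks this identity on a one-feature linear model with an intercept. It illustrates the general identity only and is not the multi-feature discriminant used in the paper; the function names and data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def press_closed_form(xs, ys):
    """PRESS from a single fit: sum of (residual / (1 - leverage))^2."""
    n = len(xs)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope, intercept = fit_line(xs, ys)
    total = 0.0
    for x, y in zip(xs, ys):
        residual = y - (slope * x + intercept)
        leverage = 1.0 / n + (x - mx) ** 2 / sxx  # h_ii for this model
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_brute_force(xs, ys):
    """PRESS by refitting the model n times, leaving one sample out each time."""
    total = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return total


# Hypothetical data; both routes give the same PRESS value.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.3, 2.8, 4.2]
fast = press_closed_form(xs, ys)
slow = press_brute_force(xs, ys)
```

In a wrapper-style selection, a quantity like `fast` can be recomputed cheaply for each candidate feature subset, which is where the computational saving comes from.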


Subject(s)
Brain-Computer Interfaces , Brain/physiology , Electroencephalography , Imagination/physiology , User-Computer Interface , Algorithms , Automation , Humans , Image Processing, Computer-Assisted , Least-Squares Analysis , Regression Analysis , Time Factors
4.
Neural Netw ; 48: 19-24, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23892908

ABSTRACT

Selecting the optimal neural architecture to solve a pattern classification problem entails choosing the relevant input units, the number of hidden neurons and their corresponding interconnection weights. This problem has been widely studied, but the existing solutions usually involve excessive computational cost and do not provide a unique solution. This paper proposes a new technique to efficiently design the MultiLayer Perceptron (MLP) architecture for classification using the Extreme Learning Machine (ELM) algorithm. The proposed method provides a high generalization capability and a unique solution for the architecture design. Moreover, the selected final network only retains those input connections that are relevant for the classification task. Experimental results show these advantages.


Subject(s)
Artificial Intelligence , Computer Systems , Neural Networks, Computer , Algorithms , Data Interpretation, Statistical , Neurons/physiology , Reproducibility of Results
5.
J Med Syst ; 36 Suppl 1: S51-63, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23117792

ABSTRACT

Extracting knowledge from electroencephalographic (EEG) signals has become an increasingly important research area in biomedical engineering. Beyond clinical diagnosis, in recent years there have been many efforts to develop brain-computer interface (BCI) systems, which allow users to control external devices using only their brain activity. Once the EEG signals have been acquired, it is necessary to use appropriate feature extraction and classification methods adapted to the user in order to improve the performance of the BCI system and to make its design stage easier. This work introduces a novel fast adaptive BCI system for automatic feature extraction and classification of EEG signals. The proposed system efficiently combines several well-known feature extraction procedures and automatically chooses the most useful features for performing the classification task. Three different feature extraction techniques are applied: power spectral density, Hjorth parameters and autoregressive modelling. The most relevant features for linear discrimination are selected using a fast and robust wrapper methodology. The proposed method is evaluated using EEG signals from nine subjects during motor imagery tasks. The experimental results show its advantages over state-of-the-art methods, especially in terms of classification accuracy and computational cost.
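
Of the three feature extraction techniques listed, the Hjorth parameters are the simplest to write down. In their standard definition, activity is the variance of the signal, mobility is the square root of the ratio between the variance of the first difference and the variance of the signal, and complexity is the mobility of the first difference divided by the mobility of the signal. The sketch below follows that definition, using discrete differences in place of derivatives; the function names and the sample segment are assumptions, and it is not the system described in the paper.

```python
import math
from statistics import pvariance


def diff(signal):
    """First-order discrete difference of a sequence."""
    return [b - a for a, b in zip(signal, signal[1:])]


def hjorth(signal):
    """Return (activity, mobility, complexity) of a 1-D signal.

    Assumes at least three samples and a non-constant signal, so that
    the variances used as denominators are non-zero.
    """
    d1 = diff(signal)
    d2 = diff(d1)
    activity = pvariance(signal)
    mobility = math.sqrt(pvariance(d1) / activity)
    complexity = math.sqrt(pvariance(d2) / pvariance(d1)) / mobility
    return activity, mobility, complexity


# Hypothetical short EEG-like segment (arbitrary units).
segment = [0.0, 1.0, 0.5, -1.0, -0.2, 0.8, 0.1, -0.6]
activity, mobility, complexity = hjorth(segment)
```

Doubling the amplitude of the signal multiplies activity by four but leaves mobility and complexity unchanged, which makes the last two descriptors insensitive to gain differences between recordings.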


Subject(s)
Brain-Computer Interfaces , Electroencephalography/classification , Electroencephalography/instrumentation , Algorithms , Biomedical Engineering , Humans , Software Design
6.
Artif Intell Med ; 50(2): 105-15, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20638252

ABSTRACT

OBJECTIVES: Missing data imputation is an important task in cases where it is crucial to use all available data and not discard records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods that were used to predict recurrence in patients in an extensive real breast cancer data set. MATERIALS AND METHODS: Imputation methods based on statistical techniques, e.g., mean, hot-deck and multiple imputation, and machine learning techniques, e.g., multi-layer perceptron (MLP), self-organising maps (SOM) and k-nearest neighbour (KNN), were applied to data collected through the "El Álamo-I" project, and the results were then compared to those obtained with the listwise deletion (LD) method. The database includes demographic, therapeutic and recurrence-survival information from 3679 women with operable invasive breast cancer diagnosed in 32 different hospitals belonging to the Spanish Breast Cancer Research Group (GEICAM). The accuracies of predictions on early cancer relapse were measured using artificial neural networks (ANNs), in which different ANNs were estimated using the data sets with imputed missing values. RESULTS: The imputation methods based on machine learning algorithms outperformed statistical imputation methods in the prediction of patient outcome. Friedman's test revealed a significant difference (p=0.0091) in the observed area under the ROC curve (AUC) values, and the pairwise comparison test showed that the AUCs for MLP, KNN and SOM were significantly higher (p=0.0053, p=0.0048 and p=0.0071, respectively) than the AUC from the LD-based prognosis model. CONCLUSION: The methods based on machine learning techniques were the most suited for the imputation of missing values and led to a significant enhancement of prognosis accuracy compared to imputation methods based on statistical procedures.
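
For readers unfamiliar with Friedman's test, the statistic can be computed by ranking the competing methods within each evaluation block (for example, each cross-validation fold), summing the ranks per method, and applying the standard formula 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1), where n is the number of blocks and k the number of methods. The sketch below is a minimal illustration with made-up AUC values, not the study's data. It assigns average ranks to ties but applies no tie correction to the statistic, and it leaves the p-value lookup (chi-square with k - 1 degrees of freedom) to a statistics package.

```python
def average_ranks(values):
    """Ranks starting at 1 (smallest value); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        shared = (pos + end) / 2.0 + 1.0  # mean of positions pos..end, 1-based
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square statistic (no tie correction).

    scores[i][j] is the score of method j on block i.
    """
    n = len(scores)        # number of blocks
    k = len(scores[0])     # number of methods
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return (12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
            - 3.0 * n * (k + 1))


# Made-up AUC values for three methods on three blocks (illustration only).
auc = [
    [0.90, 0.80, 0.70],
    [0.85, 0.75, 0.65],
    [0.95, 0.90, 0.60],
]
statistic = friedman_statistic(auc)
```

When every block orders the methods identically, as in this toy table, the statistic reaches its maximum value n (k - 1).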


Subject(s)
Artificial Intelligence , Breast Neoplasms/diagnosis , Models, Statistical , Adult , Aged , Aged, 80 and over , Algorithms , Databases, Factual/statistics & numerical data , Demography , Female , Humans , Middle Aged , Neoplasm Recurrence, Local , Prognosis , ROC Curve , Survival Analysis