Results 1 - 18 of 18
1.
Appl Intell (Dordr); 52(6): 6413-6431, 2022.
Article in English | MEDLINE | ID: mdl-34764619

ABSTRACT

In this study, we analyze the capability of several state-of-the-art machine learning methods to predict whether patients diagnosed with COVID-19 (coronavirus disease 2019) will need different levels of hospital care (regular hospital admission or intensive care unit admission) during the course of their illness, using only demographic and clinical data. For this research, a data set of 10,454 patients from 14 hospitals in Galicia (Spain) was used. Each patient is characterized by 833 variables: age, gender, and 831 indicators of diseases or conditions recorded in their medical history. In addition, each patient's history of hospital or intensive care unit (ICU) admissions due to COVID-19 is available; this clinical history serves to label each patient and thus makes it possible to assess the predictions of the models. Our aim is to identify which model delivers the best accuracy for both hospital and ICU admission using only demographic variables and some structured clinical data, and to identify which of those variables are most relevant in each case. The experimental results show that the best models are those that use oversampling as a preprocessing step to balance the class distribution. Using these models and all the available features, we achieved an area under the curve (AUC) of 76.1% and 80.4% for predicting the need for hospital and ICU admission, respectively. Furthermore, feature selection and oversampling techniques were applied, and we verified experimentally that the most relevant variables for classification are age and gender: using only these two features, the performance of the models is not degraded on either prediction problem.
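
As a rough illustration of the pipeline the abstract describes (oversampling to balance the classes, then evaluating with AUC), here is a minimal sketch in Python. It assumes scikit-learn and imbalanced-learn are available; the cohort is replaced by a synthetic stand-in of the same shape (age, gender, and 831 history flags):

    import numpy as np
    from imblearn.over_sampling import SMOTE
    from imblearn.pipeline import make_pipeline
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Synthetic stand-in for the cohort: age, gender, binary history flags.
    rng = np.random.default_rng(0)
    X = np.hstack([rng.integers(18, 95, (1000, 1)),    # age
                   rng.integers(0, 2, (1000, 1)),      # gender
                   rng.integers(0, 2, (1000, 831))])   # medical-history flags
    y = rng.binomial(1, 0.1, 1000)                     # imbalanced admissions

    # Oversampling happens only inside each training fold; evaluate with AUC.
    model = make_pipeline(SMOTE(random_state=0),
                          RandomForestClassifier(n_estimators=200,
                                                 random_state=0))
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"mean AUC: {auc:.3f}")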

2.
Adv Exp Med Biol; 1065: 607-626, 2018.
Article in English | MEDLINE | ID: mdl-30051410

ABSTRACT

Medicine will experience many changes in the coming years, because the so-called "medicine of the future" will be increasingly proactive, featuring four basic elements: predictive, personalized, preventive, and participatory. Drivers for these changes include the digitization of medical data and the availability of computational tools that deal with massive volumes of data. Thus, the need to apply machine-learning methods to medicine has increased dramatically in recent years, while facing challenges related to an unprecedentedly large number of clinically relevant features and highly specific diagnostic tests. Advances in data-storage technology and progress in genome studies have enabled the collection of vast amounts of patient clinical details, permitting the extraction of valuable information. As a consequence, big-data analytics is becoming a mandatory technology in the clinical domain. Machine learning and big-data analytics can be used in the field of cardiology, for example, for the prediction of individual risk factors for cardiovascular disease, for clinical decision support, and for practicing precision medicine using genomic information. Several projects employ machine-learning techniques to address the classification and prediction of heart failure (HF) subtypes, and unbiased clustering analysis using dense phenomapping to identify phenotypically distinct HF categories. In this chapter, these ideas are further developed, and a computerized model that distinguishes between two major HF phenotypes on the basis of ventricular-volume data analysis is discussed in detail.


Subjects
Big Data, Cardiology/methods, Data Mining/methods, Factual Databases, Heart Failure, Machine Learning, Cluster Analysis, Heart Failure/classification, Heart Failure/diagnosis, Heart Failure/physiopathology, Heart Failure/therapy, Humans, Prognosis, Terminology as Topic
3.
Comput Biol Med; 180: 108999, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39137672

ABSTRACT

Dietary Restriction (DR) is one of the most popular anti-ageing interventions; recently, Machine Learning (ML) has been explored to identify potential DR-related genes among ageing-related genes, aiming to minimize the costly wet-lab experiments needed to expand our knowledge of DR. However, to train a model from positive (DR-related) and negative (non-DR-related) examples, the existing ML approach naively labels genes without a known DR relation as negative examples, assuming that the lack of a DR-related annotation for a gene represents evidence of absence of DR-relatedness, rather than absence of evidence. This hinders the reliability of the negative examples (non-DR-related genes) and the method's ability to identify novel DR-related genes. This work introduces a novel gene prioritization method based on the two-step Positive-Unlabelled (PU) Learning paradigm: using a similarity-based, KNN-inspired approach, our method first selects reliable negative examples among the genes without known DR associations. Then, these reliable negatives and all known positives are used to train a classifier that effectively differentiates DR-related and non-DR-related genes, which is finally employed to generate a more reliable ranking of promising genes for novel DR-relatedness. Our method significantly outperforms (p < 0.05) the existing state-of-the-art approach in three predictive accuracy metrics, with up to ~40% lower computational cost, and we identify four new promising DR-related genes (PRKAB1, PRKAB2, IRS2, PRKAG1), all with evidence from the existing literature supporting their potential DR-related role.
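
A minimal sketch of the two-step PU idea: a KNN-style selection of reliable negatives followed by a standard classifier. This is an illustration under our own assumptions (distance-based negative selection, a random-forest second step), not the paper's exact procedure:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import NearestNeighbors

    def two_step_pu(X_pos, X_unlabelled, k=10, neg_fraction=0.5):
        """Step 1: rank unlabelled genes by mean distance to their k nearest
        positives; the farthest ones become reliable negatives.
        Step 2: train on positives vs. reliable negatives and score all
        unlabelled genes to produce a ranking."""
        nn = NearestNeighbors(n_neighbors=k).fit(X_pos)
        dist, _ = nn.kneighbors(X_unlabelled)
        order = np.argsort(dist.mean(axis=1))          # near -> far
        reliable_neg = X_unlabelled[order[-int(neg_fraction * len(order)):]]

        X = np.vstack([X_pos, reliable_neg])
        y = np.hstack([np.ones(len(X_pos)), np.zeros(len(reliable_neg))])
        clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
        return clf.predict_proba(X_unlabelled)[:, 1]   # prioritization scores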


Subjects
Aging, Machine Learning, Humans, Aging/genetics, Aging/physiology, Caloric Restriction, Computational Biology/methods
4.
IEEE Trans Pattern Anal Mach Intell; 45(7): 8311-8323, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37015369

ABSTRACT

Classic embedded feature selection algorithms are often divided into two large groups: tree-based algorithms and LASSO variants. The two approaches focus on different aspects: while tree-based algorithms provide a clear explanation of which variables are used to trigger a certain output, LASSO-like approaches sacrifice detailed explanations in favor of accuracy. In this paper, we present a novel embedded feature selection algorithm, called End-to-End Feature Selection (E2E-FS), that aims to provide both accuracy and explainability. Despite having non-convex regularization terms, our algorithm, like the LASSO approach, is solved with gradient descent techniques, introducing restrictions that force the model to select at most a prescribed maximum number of features, which are then used by the classifier. Although these are hard restrictions, the experimental results show that this algorithm can be used with any learning model that is trained with gradient descent.
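
A simplified sketch in the spirit of E2E-FS: a learnable gate vector trained by gradient descent alongside any classifier, with a penalty that discourages opening more than k gates. The gating form and the penalty are our assumptions for illustration, not the published formulation:

    import torch
    import torch.nn as nn

    class GatedFeatureSelector(nn.Module):
        """Soft per-feature gates trained jointly with a gradient-based
        classifier; a hinge penalty pushes the gate budget toward k."""
        def __init__(self, n_features, k):
            super().__init__()
            self.logits = nn.Parameter(torch.zeros(n_features))
            self.k = k

        def forward(self, x):
            return x * torch.sigmoid(self.logits)      # soft feature mask

        def penalty(self):
            # Penalize opening more than k gates in total.
            return torch.relu(torch.sigmoid(self.logits).sum() - self.k)

    # Usage inside a training loop (criterion and model are any classifier):
    #   loss = criterion(model(selector(x)), y) + lam * selector.penalty()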

5.
Med Biol Eng Comput; 60(5): 1333-1345, 2022 May.
Article in English | MEDLINE | ID: mdl-35316469

ABSTRACT

The number of interconnected devices surrounding us every day, such as personal wearables, cars, and smart homes, has increased in recent years. Internet of Things devices monitor many processes and can use machine learning models for pattern recognition, and even decision making, with the added advantage of diminishing network congestion by allowing computation close to the data sources. The main restriction is the low computational capacity of these devices. Thus, machine learning algorithms are needed that maintain accuracy while exploiting certain characteristics, such as low-precision versions. In this paper, low-precision mutual-information-based feature selection algorithms are applied to DNA microarray datasets, showing that 16-bit, and sometimes even 8-bit, representations of these algorithms can be used without significant variation in the final classification results.
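
To make the low-precision idea concrete, here is a small sketch that quantizes each feature to a fixed number of bits and ranks features by mutual information computed from the resulting contingency tables. The quantization scheme is an assumption for illustration, not the paper's implementation:

    import numpy as np

    def mi_discrete(x_q, y):
        """Mutual information between a quantized feature and integer class
        labels (0..K-1), computed from a contingency table."""
        joint = np.histogram2d(x_q, y, bins=(x_q.max() + 1, y.max() + 1))[0]
        pxy = joint / joint.sum()
        px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
        nz = pxy > 0
        return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

    def rank_features(X, y, bits=8):
        """Quantize each feature to 2**bits levels, then rank features by
        mutual information with the class."""
        levels = 2 ** bits
        Xn = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)  # to [0, 1]
        dtype = np.uint8 if bits <= 8 else np.uint16
        Xq = (Xn * (levels - 1)).astype(dtype)
        scores = np.array([mi_discrete(Xq[:, j], y) for j in range(X.shape[1])])
        return np.argsort(scores)[::-1]                # best features first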


Subjects
Algorithms, Machine Learning, Information Storage and Retrieval, Microarray Analysis
6.
IEEE Trans Pattern Anal Mach Intell; 43(11): 4177-4188, 2021 Nov.
Article in English | MEDLINE | ID: mdl-32396072

ABSTRACT

This paper presents a unified propagation method for dealing with both the classic Eikonal equation, where the motion direction does not affect the propagation, and the more general static Hamilton-Jacobi equations, where it does. While classic Fast Marching Method (FMM) techniques solve the Eikonal equation in O(M log M) (or O(M) with some modifications), solving the more general static Hamilton-Jacobi equation requires higher complexity. The proposed framework maintains the O(M log M) complexity for both problems while achieving higher accuracy than the available state of the art. The key idea behind the proposed method is the creation of 'mini wave-fronts', where the solution is interpolated to minimize the discretization error. Experimental results show that our algorithm can outperform the state of the art in both precision and computational cost.
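
For reference, here is a compact implementation of the classic first-order FMM for the Eikonal case, which the paper generalizes; the proposed 'mini wave-front' interpolation is not reproduced in this sketch:

    import heapq
    import numpy as np

    def fast_marching(speed, sources):
        """First-order Fast Marching solver for |grad T| = 1/speed on a 2D
        grid with unit spacing; O(M log M) via a binary heap."""
        T = np.full(speed.shape, np.inf)
        frozen = np.zeros(speed.shape, dtype=bool)
        heap = [(0.0, s) for s in sources]
        for _, s in heap:
            T[s] = 0.0
        heapq.heapify(heap)
        while heap:
            t, (i, j) = heapq.heappop(heap)
            if frozen[i, j]:
                continue
            frozen[i, j] = True
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if not (0 <= ni < T.shape[0] and 0 <= nj < T.shape[1]) \
                        or frozen[ni, nj]:
                    continue
                # Upwind neighbours along each axis.
                tx = min(T[ni - 1, nj] if ni > 0 else np.inf,
                         T[ni + 1, nj] if ni < T.shape[0] - 1 else np.inf)
                ty = min(T[ni, nj - 1] if nj > 0 else np.inf,
                         T[ni, nj + 1] if nj < T.shape[1] - 1 else np.inf)
                f = 1.0 / speed[ni, nj]
                a, b = sorted((tx, ty))
                # Solve the discretized Eikonal quadratic at (ni, nj).
                new_t = a + f if b - a >= f else \
                    0.5 * (a + b + np.sqrt(2 * f * f - (a - b) ** 2))
                if new_t < T[ni, nj]:
                    T[ni, nj] = new_t
                    heapq.heappush(heap, (new_t, (ni, nj)))
        return T

    # T = fast_marching(np.ones((64, 64)), [(0, 0)])  # distance from a corner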

7.
Eur Heart J Cardiovasc Imaging; 22(10): 1208-1217, 2021 Sep 20.
Article in English | MEDLINE | ID: mdl-32588036

ABSTRACT

AIMS: Both left ventricular (LV) diastolic dysfunction (LVDD) and hypertrophy (LVH) as assessed by echocardiography are independent prognostic markers of future cardiovascular events in the community. However, selective screening strategies to identify individuals at risk who would benefit most from cardiac phenotyping are lacking. We therefore assessed the utility of several machine learning (ML) classifiers built on routinely measured clinical, biochemical, and electrocardiographic features for detecting subclinical LV abnormalities. METHODS AND RESULTS: We included 1407 participants (mean age, 51 years; 51% women) randomly recruited from the general population. We used echocardiographic parameters reflecting LV diastolic function and structure to define LV abnormalities (LVDD, n = 252; LVH, n = 272). Next, five supervised ML algorithms (XGBoost, AdaBoost, Random Forest (RF), Support Vector Machines, and logistic regression) were used to build classifiers based on clinical data (67 features) to categorize LVDD and LVH. We applied a nested 10-fold cross-validation set-up. The XGBoost and RF classifiers exhibited a high area under the receiver operating characteristic curve, with values between 86.2% and 88.1% for predicting LVDD and between 77.7% and 78.5% for predicting LVH. Age, body mass index, different components of blood pressure, history of hypertension, antihypertensive treatment, and various electrocardiographic variables were the top selected features for predicting LVDD and LVH. CONCLUSION: XGBoost and RF classifiers combining routinely measured clinical, laboratory, and electrocardiographic data predicted LVDD and LVH with high accuracy. These ML classifiers might be useful to pre-select individuals in whom further echocardiographic examination, monitoring, and preventive measures are warranted.
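
A sketch of the nested cross-validation scaffolding described in the methods, using scikit-learn with a random forest and synthetic data shaped like the study (1407 samples, 67 features, roughly 18% positives); the hyperparameter grid and model are placeholders:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                         cross_val_score)

    # Synthetic stand-in shaped like the study cohort (e.g., LVDD labels).
    X, y = make_classification(n_samples=1407, n_features=67,
                               n_informative=10, weights=[0.82],
                               random_state=0)

    inner = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)  # tuning
    outer = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)  # testing
    tuned = GridSearchCV(RandomForestClassifier(random_state=0),
                         param_grid={"n_estimators": [200, 500],
                                     "max_depth": [None, 5, 10]},
                         scoring="roc_auc", cv=inner)
    auc = cross_val_score(tuned, X, y, cv=outer, scoring="roc_auc")
    print(f"nested-CV AUC: {auc.mean():.3f} +/- {auc.std():.3f}")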


Subjects
Hypertension, Left Ventricular Dysfunction, Female, Humans, Left Ventricular Hypertrophy, Machine Learning, Male, Middle Aged, Risk Factors, Left Ventricular Dysfunction/diagnostic imaging, Ventricular Remodeling
8.
Methods Mol Biol; 1986: 65-85, 2019.
Article in English | MEDLINE | ID: mdl-31115885

ABSTRACT

The advent of DNA microarray datasets has stimulated a new line of research in both bioinformatics and machine learning. This type of data is used to collect information from tissue and cell samples regarding gene-expression differences that could be useful for disease diagnosis or for distinguishing specific types of tumor. Microarray data classification is a difficult challenge for machine learning researchers due to the high number of features and small sample sizes. This chapter is devoted to reviewing the microarray databases most frequently used in the literature. We also make the interested reader aware of problematic data characteristics in this domain, such as class imbalance, data complexity, and the so-called dataset shift.


Subjects
Genetic Databases, Oligonucleotide Array Sequence Analysis, Humans, Neoplasms/genetics, Sample Size
9.
Methods Mol Biol; 1986: 123-152, 2019.
Article in English | MEDLINE | ID: mdl-31115887

ABSTRACT

A typical characteristic of microarray data is a very high number of features (on the order of thousands) while the number of examples is usually less than 100. In the context of microarray classification, this poses a challenge for machine learning methods, which can overfit and thus suffer degraded performance. A common solution is to apply a dimensionality reduction technique before classification to reduce the number of features. This chapter focuses on one of the best-known dimensionality reduction techniques: feature selection. We will see how feature selection can help improve classification accuracy in several microarray data scenarios.
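
A minimal example of the chapter's advice, with feature selection placed inside a cross-validated pipeline so that each fold selects its own features (avoiding selection bias); the data, selector, and classifier are illustrative choices:

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Synthetic microarray-like data: ~100 samples, thousands of features.
    X, y = make_classification(n_samples=100, n_features=5000,
                               n_informative=30, random_state=0)

    # Selection happens inside each CV fold, never on the full data set.
    clf = make_pipeline(SelectKBest(mutual_info_classif, k=50), LinearSVC())
    print(cross_val_score(clf, X, y, cv=5).mean())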


Subjects
Algorithms, Oligonucleotide Array Sequence Analysis/methods, Bayes Theorem, Genetic Databases, Support Vector Machine
10.
Methods Mol Biol; 1986: 283-293, 2019.
Article in English | MEDLINE | ID: mdl-31115895

ABSTRACT

The current situation in microarray data analysis and prospects for the future are briefly discussed in this chapter, in which the competition between microarray technologies and high-throughput technologies is considered from a data-analysis point of view. The present limitations of DNA microarrays are important for forecasting challenges and future trends in microarray data analysis; these include data-analysis techniques suited to increasing sample sizes, new feature selection methods, deep learning techniques, and covariate significance testing as well as false discovery rate methods, among other procedures for better interpretability of the results.


Subjects
Microarray Analysis/methods, Microarray Analysis/trends, Algorithms, Deep Learning, Humans
11.
Artif Intell Med; 34(1): 65-76, 2005 May.
Article in English | MEDLINE | ID: mdl-15885567

ABSTRACT

OBJECTIVES: This paper presents a novel approach to sleep apnea classification. The goal is to classify each apnea into one of three basic types: obstructive, central, and mixed. MATERIALS AND METHODS: Three different supervised learning methods using a neural network were tested. The inputs of the neural network are the first level-5 detail coefficients obtained from a discrete wavelet transform of the samples (previously detected as apnea) in the thoracic effort signal. To train and test the systems, 120 events from six different patients were used. The true error rate was estimated using 10-fold cross-validation. The results presented in this work were averaged over 100 different simulations, and a multiple comparison procedure was used for model selection. RESULTS: The method finally selected is based on a feedforward neural network trained within the Bayesian framework using a cross-entropy error function. The mean classification accuracy obtained over the test set was 83.78 ± 1.90%. CONCLUSION: To the authors' knowledge, the proposed classifier surpasses previously published results. Finally, a scheme to maintain and improve this system during its clinical use is also proposed.
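
A sketch of the signal-to-classifier path described above, assuming PyWavelets and scikit-learn; the wavelet family, segment length, and network size are our assumptions, and the data are random placeholders for detected apnea events:

    import numpy as np
    import pywt
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    def level5_detail(signal, wavelet="db4", n_coeffs=16):
        """Level-5 detail coefficients of a thoracic-effort segment; the
        wavelet family and coefficient count are illustrative choices."""
        coeffs = pywt.wavedec(signal, wavelet, level=5)  # [cA5, cD5, ..., cD1]
        return coeffs[1][:n_coeffs]                      # cD5, truncated

    # Placeholder apnea segments (rows) and labels 0/1/2 for
    # obstructive/central/mixed; replace with real detected events.
    rng = np.random.default_rng(0)
    segments = rng.standard_normal((120, 512))
    labels = rng.integers(0, 3, 120)
    X = np.array([level5_detail(s) for s in segments])
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    print(cross_val_score(clf, X, labels, cv=10).mean())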


Subjects
Neural Networks (Computer), Sleep Apnea Syndromes/classification, Algorithms, Bayes Theorem, Humans, Polysomnography
12.
Clin Med Insights Cardiol; 9(Suppl 1): 57-71, 2015.
Article in English | MEDLINE | ID: mdl-26052231

ABSTRACT

BACKGROUND: Heart failure (HF) manifests as at least two subtypes. The current paradigm distinguishes between them using both the ejection fraction (EF) metric and a constraint on end-diastolic volume. About half of all HF patients exhibit preserved EF, whereas the classical type of HF shows a reduced EF. Common practice often sets the cut-off point at or near EF = 50%, thus defining a linear divider. However, a rationale for this safe choice is lacking, and the assumed applicability of strict linearity has not been justified. Additionally, some studies opt to eliminate patients from consideration for HF if 40% < EF < 50% (the gray zone). Thus, there is a need for documented classification guidelines that resolve the gray-zone ambiguity and crisply delineate the transitions between phenotypes. METHODS: Machine learning (ML) models are applied to classify HF subtypes within the ventricular-volume domain, rather than by the single use of EF. Various ML models, both unsupervised and supervised, are employed to establish a foundation for classification. Data on 48 HF patients are employed as the training set for the subsequent classification of Monte Carlo-generated surrogate HF patients (n = 403). Next, we map the consequences when the EF cut-off differs from 50% (as proposed for women) and analyze HF candidates not covered by the current rules. RESULTS: On the training set, the Support Vector Machine method yields the best results (test error 4.06%), covering the gray zone and other clinically relevant HF candidates. End-systolic volume (ESV) emerges as a logical discriminator, rather than EF as in the prevailing paradigm. CONCLUSIONS: Selected ML models offer promise for classifying HF patients (including the gray zone) when driven by ventricular-volume data. The ML analysis indicates that ESV has a role in the development of guidelines to parse HF subtypes. The documented curvilinear relationship between EF and ESV suggests that the assumption of a linear EF divider may not be of general utility over the complete clinically relevant range.
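
A minimal sketch of classifying HF subtypes in the ventricular-volume domain with an SVM rather than thresholding EF directly; the synthetic (EDV, ESV) values and the EF-derived labels are illustrative stand-ins for the study's patient and surrogate data:

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Synthetic (EDV, ESV) pairs in mL for 48 "patients", half of them
    # generated with reduced EF and half with preserved EF.
    rng = np.random.default_rng(0)
    edv = rng.normal(160, 40, 48).clip(min=60)
    esv = np.where(rng.random(48) < 0.5,
                   edv * rng.uniform(0.55, 0.75, 48),   # reduced EF
                   edv * rng.uniform(0.30, 0.50, 48))   # preserved EF
    y = (esv / edv > 0.5).astype(int)                   # EF-based label
    X = np.column_stack([edv, esv])                     # volume domain

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    print(cross_val_score(clf, X, y, cv=5).mean())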

13.
Artif Intell Med; 24(1): 71-96, 2002 Jan.
Article in English | MEDLINE | ID: mdl-11779686

ABSTRACT

The validation of a software product is a fundamental part of its development and focuses on analyzing whether the software correctly resolves the problems it was designed to tackle. Traditional approaches to validation are based on a comparison of results with what is called a gold standard. Nevertheless, in certain domains it is not always easy, or even possible, to establish such a standard. This is the case for intelligent systems that endeavour to simulate or emulate a model of expert behaviour. This article describes the validation of the Computer-Aided Foetal Evaluator (CAFE), an intelligent system developed for monitoring the antenatal condition based on data from the non-stress test (NST), and how this validation was accomplished through a methodology designed to resolve the problem of validating intelligent systems. System performance was compared to that of three obstetricians using 3450 min of cardiotocographic (CTG) records corresponding to 53 different patients. From these records, different parameters were extracted and interpreted, and the validation was carried out on a parameter-by-parameter basis using measurement techniques such as percentage agreement, the kappa statistic, and cluster analysis. Results showed that the system's agreement with the experts is, in general, similar to the agreement between the experts themselves, which in turn permits our system to be considered at least as skillful as our experts. Throughout the article, the results obtained are discussed with a view to demonstrating how different measures of the agreement between system and experts can assist not only in assessing the aptness of a system but also in highlighting its weaknesses. This kind of assessment means that the system can be fine-tuned repeatedly until the expected results are obtained.
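
The agreement measures mentioned are straightforward to compute; a small sketch using scikit-learn's Cohen's kappa, with hypothetical system and expert labels:

    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    def agreement(rater_a, rater_b):
        """Percentage agreement and Cohen's kappa between two raters'
        categorical interpretations of the same parameter."""
        rater_a, rater_b = np.asarray(rater_a), np.asarray(rater_b)
        percent = (rater_a == rater_b).mean() * 100.0
        kappa = cohen_kappa_score(rater_a, rater_b)
        return percent, kappa

    # Hypothetical labels: system vs. one obstetrician on five records.
    system = ["normal", "suspect", "normal", "pathologic", "normal"]
    expert = ["normal", "normal", "normal", "pathologic", "suspect"]
    print(agreement(system, expert))    # (60.0, kappa)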


Subjects
Artificial Intelligence, Computer-Assisted Diagnosis/methods, Prenatal Diagnosis/methods, Software, Statistical Data Interpretation, Female, Heart Rate, Humans, Pregnancy
14.
IEEE J Biomed Health Inform; 18(4): 1485-93, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25014945

ABSTRACT

Dry eye is a symptomatic disease that affects a wide range of the population and has a negative impact on daily activities. It can be diagnosed by analyzing the interference patterns of the tear film lipid layer and classifying them into one of the Guillon categories. The manual process performed by experts is not only affected by subjective factors but is also very time-consuming. In this paper, we propose a general methodology for the automatic classification of the tear film lipid layer, using color and texture information to characterize the images and feature selection methods to reduce the processing time. The adequacy of the proposed methodology was demonstrated: it achieves classification rates over 97% while maintaining robustness and providing unbiased results. It can also be applied in real time, allowing important time savings for the experts.
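
A rough sketch of a color-and-texture characterization plus feature selection pipeline, assuming scikit-image and scikit-learn; the specific descriptors (RGB histogram, GLCM statistics) and the selector are assumptions rather than the paper's exact feature set:

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    def describe(image_rgb):
        """Concatenate a color histogram with a few GLCM texture statistics."""
        color_hist, _ = np.histogramdd(image_rgb.reshape(-1, 3),
                                       bins=(8, 8, 8), range=((0, 256),) * 3)
        gray = image_rgb.mean(axis=2).astype(np.uint8)
        glcm = graycomatrix(gray, distances=[1, 2], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        texture = np.hstack([graycoprops(glcm, p).ravel()
                             for p in ("contrast", "homogeneity", "energy")])
        return np.hstack([color_hist.ravel() / color_hist.sum(), texture])

    # Hypothetical training data: images with Guillon category labels.
    # X = np.stack([describe(img) for img in images])
    # clf = make_pipeline(SelectKBest(f_classif, k=50), SVC(kernel="rbf"))
    # clf.fit(X, y)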


Subjects
Computer-Assisted Image Processing/methods, Lipids/chemistry, Tears/chemistry, Adult, Humans, Video Microscopy, Support Vector Machine, Young Adult
15.
IEEE Trans Pattern Anal Mach Intell; 35(12): 2997-3009, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24136436

ABSTRACT

A technique for adjusting a minimum-volume set of covering ellipsoids is elaborated. Solutions to this problem have potential application in one-class classification and clustering problems. Its main original features are: 1) it avoids the direct evaluation of determinants by using diagonalization properties of the involved matrices; 2) it identifies and removes outliers from the estimation process; 3) it avoids the binary variables arising from the combinatorial character of the assignment problem, which are replaced by continuous variables in the range [0, 1]; 4) the problem can be solved by a bilevel algorithm whose first level determines the ellipsoids and whose second level reassigns the data points to ellipsoids and identifies outliers, based on an algorithm that forces the Karush-Kuhn-Tucker conditions to be satisfied. Two theorems provide rigorous bases for the proposed methods. Finally, a set of examples of application in different fields is given to illustrate the power of the method and its practical performance.
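
As context for the geometric core, here is the classic Khachiyan algorithm for a single minimum-volume enclosing ellipsoid; the paper's actual contributions (multiple covering ellipsoids, outlier removal, and the bilevel KKT-based reassignment) are not reproduced in this sketch:

    import numpy as np

    def mvee(points, tol=1e-4):
        """Khachiyan's algorithm: minimum-volume ellipsoid enclosing all rows
        of `points` (n x d). Returns center c and matrix A such that
        (x - c)^T A (x - c) <= 1 for every point."""
        n, d = points.shape
        Q = np.vstack([points.T, np.ones(n)])   # lifted (d+1) x n points
        u = np.full(n, 1.0 / n)                 # weights on the points
        err = tol + 1.0
        while err > tol:
            X = Q @ np.diag(u) @ Q.T
            M = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(X), Q)  # leverages
            j = int(np.argmax(M))
            step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
            new_u = (1.0 - step) * u
            new_u[j] += step
            err = float(np.linalg.norm(new_u - u))
            u = new_u
        c = points.T @ u
        A = np.linalg.inv(points.T @ np.diag(u) @ points - np.outer(c, c)) / d
        return c, A

    # c, A = mvee(np.random.default_rng(0).standard_normal((200, 2)))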

16.
Neural Netw; 24(8): 888-96, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21703822

ABSTRACT

The gene-expression microarray is a novel technology that allows the examination of tens of thousands of genes at a time. For this reason, manual observation is not feasible, and machine learning methods are evolving to handle these new data. Specifically, since the number of genes is very high, feature selection methods have proven valuable for dealing with these unbalanced data sets of high dimensionality and low cardinality. In this work, the FVQIT (Frontier Vector Quantization using Information Theory) classifier is employed to classify twelve DNA gene-expression microarray data sets for different kinds of cancer. A comparative study with other well-known classifiers is performed. The proposed approach shows competitive results, outperforming all the other classifiers.


Subjects
Genetic Databases, Information Theory, Microarray Analysis/classification, Algorithms, Artificial Intelligence, Neoplasm DNA/genetics, Entropy, Fuzzy Logic, Humans, Microarray Analysis/methods, Genetic Models, Statistical Models, Reproducibility of Results, Software
17.
Neural Comput; 19(1): 231-57, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17134324

ABSTRACT

A new methodology for learning the topology of a functional network from data, based on the ANOVA decomposition technique, is presented. The method determines sensitivity (importance) indices that allow a decision to be made as to which set of interactions among variables is relevant and which is irrelevant to the problem under study. This immediately suggests the network topology to be used in a given problem. Moreover, local sensitivities to small changes in the data can be easily calculated. In this way, the dual optimization problem gives the local sensitivities. The methods are illustrated by their application to artificial and real examples.
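
Sensitivity (importance) indices of this kind can be estimated by Monte Carlo variance decomposition. Below is a sketch of first-order Sobol indices using the Saltelli estimator, one common realization of the idea rather than the paper's ANOVA-based procedure:

    import numpy as np

    def first_order_sobol(f, d, n=10000, seed=0):
        """Monte Carlo estimate of first-order sensitivity indices
        (Saltelli 2010 estimator) for f: R^d -> R with inputs ~ U(0, 1)."""
        rng = np.random.default_rng(seed)
        A, B = rng.random((n, d)), rng.random((n, d))
        fA, fB = f(A), f(B)
        var_y = np.var(np.concatenate([fA, fB]))
        S = np.empty(d)
        for i in range(d):
            ABi = A.copy()
            ABi[:, i] = B[:, i]             # resample only variable i
            S[i] = np.mean(fB * (f(ABi) - fA)) / var_y
        return S

    # Additive test function: expected indices are 0.2 and 0.8
    # (no interactions, so the indices roughly sum to one).
    print(first_order_sobol(lambda X: X[:, 0] + 2.0 * X[:, 1], d=2))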


Subjects
Artificial Intelligence, Neural Networks (Computer), Analysis of Variance, Computer-Assisted Decision Making, Humans
18.
Neural Comput; 14(6): 1429-49, 2002 Jun.
Article in English | MEDLINE | ID: mdl-12020453

ABSTRACT

The article presents a method for learning the weights in one-layer feedforward neural networks by minimizing either the sum of squared errors or the maximum absolute error, measured in the input scale. This leads to the existence of a global optimum that can be easily obtained by solving linear systems of equations or linear programming problems, using much less computational power than standard methods. Another version of the method allows computing a large set of estimates for the weights, providing robust (mean or median) estimates for them and the associated standard errors, which give a good measure of the quality of the fit. Later, the standard one-layer neural network algorithms are improved by learning the neural functions instead of assuming them to be known. A set of example applications is used to illustrate the methods. Finally, a comparison with other high-performance learning algorithms shows that the proposed methods are at least 10 times faster than the fastest standard algorithm used in the comparison.
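
Both formulations reduce to standard solvers: least squares for the sum of squared errors, and a linear program for the maximum absolute (Chebyshev) error. A minimal sketch with NumPy and SciPy on synthetic data:

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    X = np.hstack([rng.standard_normal((200, 3)), np.ones((200, 1))])  # + bias
    y = X @ np.array([1.5, -2.0, 0.5, 0.3]) + 0.05 * rng.standard_normal(200)
    n, d = X.shape

    # Sum of squared errors: the global optimum solves a linear system.
    w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Maximum absolute error (Chebyshev fit): minimize t subject to
    # -t <= Xw - y <= t, a linear program in the variables (w, t).
    c = np.r_[np.zeros(d), 1.0]                     # objective: t
    A_ub = np.block([[X, -np.ones((n, 1))],         #  (Xw - y) <= t
                     [-X, -np.ones((n, 1))]])       # -(Xw - y) <= t
    b_ub = np.r_[y, -y]
    w_minimax = linprog(c, A_ub=A_ub, b_ub=b_ub,
                        bounds=[(None, None)] * d + [(0, None)]).x[:d]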
