1.
Appl Soft Comput ; 123: 108983, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35573166

ABSTRACT

In the context of the global coronavirus pandemic, different deep learning solutions for infected subject detection using chest X-ray images have been proposed. However, deep learning models usually need large labelled datasets to be effective. Semi-supervised deep learning is an attractive alternative, where unlabelled data is leveraged to improve the overall model's accuracy. However, in real-world usage settings, an unlabelled dataset might present a different distribution than the labelled dataset (e.g. the labelled dataset was sampled from a target clinic and the unlabelled dataset from a source clinic). This results in a distribution mismatch between the unlabelled and labelled datasets. In this work, we assess the impact of the distribution mismatch between the labelled and the unlabelled datasets, for a semi-supervised model trained with chest X-ray images, for COVID-19 detection. Under strong distribution mismatch conditions, we found an accuracy hit of almost 30%, suggesting that the unlabelled dataset distribution has a strong influence on the behaviour of the model. Therefore, we propose a straightforward approach to diminish the impact of such distribution mismatch. Our proposed method uses a density approximation of the feature space. It is built upon the target dataset to filter out the observations in the source unlabelled dataset that might harm the accuracy of the semi-supervised model. It assumes that a small labelled source dataset is available together with a larger source unlabelled dataset. Our proposed method does not require any model training; it is simple and computationally cheap. We compare our proposed method against two popular state-of-the-art out-of-distribution data detectors, which are also cheap and simple to implement. In our tests, our method yielded accuracy gains of up to 32% when compared to the previous state-of-the-art methods. The good results yielded by our method lead us to argue in favour of a more data-centric approach to improving a model's accuracy. Furthermore, the developed method can be used to measure data effectiveness for semi-supervised deep learning model training.
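
Editorial illustration (not taken from the paper): the abstract does not say which density estimator is used, so the following is only a minimal sketch of the general idea of density-based filtering. It assumes per-dimension histograms fitted on target-dataset features, treats the dimensions as independent, and keeps the source unlabelled observations with the highest approximate log-density. All function names and parameters (`fit_histograms`, `keep_fraction`, `bins`, `floor`) are hypothetical.

```python
import math


def bin_index(value, lo, width, bins):
    """Return the histogram bin for value, or None if it lies outside the fitted range."""
    position = (value - lo) / width
    if position < 0 or position > bins + 1e-9:
        return None
    return min(int(position), bins - 1)


def fit_histograms(features, bins=10):
    """Fit one normalised histogram per feature dimension on the target features."""
    models = []
    for d in range(len(features[0])):
        column = [row[d] for row in features]
        lo, hi = min(column), max(column)
        width = (hi - lo) / bins or 1.0  # guard against a constant feature
        probs = [0.0] * bins
        for value in column:
            probs[bin_index(value, lo, width, bins)] += 1.0 / len(column)
        models.append((lo, width, probs))
    return models


def log_density(row, models, floor=1e-6):
    """Approximate log-density of one observation (dimensions assumed independent)."""
    total = 0.0
    for value, (lo, width, probs) in zip(row, models):
        index = bin_index(value, lo, width, len(probs))
        density = probs[index] if index is not None else 0.0
        total += math.log(max(density, floor))  # floor avoids log(0)
    return total


def filter_unlabelled(target_features, source_features, keep_fraction=0.5, bins=10):
    """Keep the fraction of source observations that look most like the target data."""
    models = fit_histograms(target_features, bins)
    scores = [log_density(row, models) for row in source_features]
    ranked = sorted(range(len(source_features)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[: max(1, int(keep_fraction * len(ranked)))])
    return [source_features[i] for i in kept]


target = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3]]
source = [[0.1, 0.1], [5.0, 5.0], [0.25, 0.2], [-4.0, 9.0]]
kept = filter_unlabelled(target, source, keep_fraction=0.5)
```

In this toy usage, the two source observations that fall inside the range of the target features are kept and the two distant ones are discarded. No model training is involved, which matches the "cheap and simple" spirit described in the abstract, though the estimator itself is an assumption.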

2.
Appl Soft Comput ; 111: 107692, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34276263

ABSTRACT

A key factor in the fight against viral diseases such as the coronavirus (COVID-19) is the identification of virus carriers as early and quickly as possible, in a cheap and efficient manner. The application of deep learning for image classification of chest X-ray images of COVID-19 patients could become a useful pre-diagnostic detection methodology. However, deep learning architectures require large labelled datasets. This is often a limitation when the subject of research is relatively new, as in the case of the virus outbreak, where dealing with small labelled datasets is a challenge. Moreover, in such a context, the datasets are also highly imbalanced, with few observations from positive cases of the new disease. In this work we evaluate the performance of the semi-supervised deep learning architecture known as MixMatch with a very limited number of labelled observations and highly imbalanced labelled datasets. We demonstrate the critical impact of data imbalance on the model's accuracy. Therefore, we propose a simple approach for correcting data imbalance, by re-weighting each observation in the loss function, giving a higher weight to the observations corresponding to the under-represented class. For unlabelled observations, we use the pseudo and augmented labels calculated by MixMatch to choose the appropriate weight. The proposed method improved classification accuracy by up to 18% with respect to the non-balanced MixMatch algorithm. We tested our proposed approach with several available datasets using 10, 15 and 20 labelled observations, for binary classification (COVID-19 positive and normal cases). For multi-class classification (COVID-19 positive, pneumonia and normal cases), we tested 30, 50, 70 and 90 labelled observations. Additionally, a new dataset is included among the tested datasets, composed of chest X-ray images of Costa Rican adult patients.
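
Editorial illustration (not taken from the paper): the re-weighting idea described above can be sketched as a class-weighted cross-entropy. The abstract does not give the exact weights, so normalised inverse class frequencies are assumed here; for soft pseudo-labels, the weight of the most probable class is used, which is likewise an assumption. The names `class_weights` and `weighted_cross_entropy` are hypothetical.

```python
import math
from collections import Counter


def class_weights(labels):
    """Inverse-frequency class weights, normalised to sum to 1 (an assumed scheme)."""
    counts = Counter(labels)
    inverse = {c: 1.0 / n for c, n in counts.items()}
    total = sum(inverse.values())
    return {c: w / total for c, w in inverse.items()}


def weighted_cross_entropy(probabilities, targets, weights):
    """Mean cross-entropy where each observation is scaled by its class weight.

    probabilities: per-class model outputs, one list per observation
    targets: one-hot labels or soft pseudo-labels, one list per observation
    """
    total = 0.0
    for probs, target in zip(probabilities, targets):
        # The dominant class of the (pseudo-)label selects the weight.
        dominant = max(range(len(target)), key=lambda c: target[c])
        loss = -sum(t * math.log(max(p, 1e-12)) for p, t in zip(probs, target))
        total += weights[dominant] * loss
    return total / len(probabilities)


labels = [0] * 8 + [1] * 2          # class 1 is under-represented
weights = class_weights(labels)     # class 1 receives the larger weight
loss = weighted_cross_entropy([[0.5, 0.5], [0.5, 0.5]], [[1, 0], [0, 1]], weights)
```

With 8 observations of class 0 and 2 of class 1, an error on a class-1 observation contributes four times as much to the loss as the same error on a class-0 observation.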

3.
Metabolites ; 13(10)2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37887404

ABSTRACT

In this investigation, we outline the applications of a data mining technique known as Subgroup Discovery (SD) to the analysis of a sample size-limited metabolomics-based dataset. The SD technique utilized a supervised learning strategy, which lies midway between classificational and descriptive criteria, in which given the descriptive property of a dataset (i.e., the response target variable of interest), the primary objective was to discover subgroups with behaviours that are distinguishable from those of the complete set (albeit with a differential statistical distribution). These approaches have, for the first time, been successfully employed for the analysis of aromatic metabolite patterns within an NMR-based urinary dataset collected from a small cohort of patients with the lysosomal storage disorder Niemann-Pick class 1 (NPC1) disease (n = 12) and utilized to distinguish these from a larger number of heterozygous (parental) control participants. These subgroup discovery strategies discovered two different NPC1 disease-specific metabolically sequential rules which permitted the reliable identification of NPC1 patients; the first of these involved 'normal' (intermediate) urinary concentrations of xanthurenate, 4-aminobenzoate, hippurate and quinaldate, and disease-downregulated levels of nicotinate and trigonelline, whereas the second comprised 'normal' 4-aminobenzoate, indoxyl sulphate, hippurate, 3-methylhistidine and quinaldate concentrations, and again downregulated nicotinate and trigonelline levels. Correspondingly, a series of five subgroup rules were generated for the heterozygous carrier control group, and 'biomarkers' featured in these included low histidine, 1-methylnicotinamide and 4-aminobenzoate concentrations, together with 'normal' levels of hippurate, hypoxanthine, quinolinate and hypoxanthine. 
These significant disease group-specific rules were consistent with imbalances in the combined tryptophan-nicotinamide, tryptophan, kynurenine and tyrosine metabolic pathways, along with dysregulations in those featuring histidine, 3-methylhistidine and 4-hydroxybenzoate. In principle, the novel subgroup discovery approach employed here should also be readily applicable to solving metabolomics-type problems of this nature which feature rare disease classification groupings with only limited patient participant and sample sizes available.

4.
Comput Biol Med ; 148: 105916, 2022 09.
Article in English | MEDLINE | ID: mdl-35961091

ABSTRACT

Niemann-Pick Class 1 (NPC1) disease is a rare and debilitating neurodegenerative lysosomal storage disease (LSD). Metabolomics datasets of NPC1 patients available to perform this type of analysis are often limited in the number of samples and severely unbalanced. In order to improve the predictive capability and identify new biomarkers in an NPC1 disease urinary dataset, data augmentation (DA) techniques based on computational intelligence have been employed to create synthetic samples, i.e. the addition of noise, oversampling techniques and conditional generative adversarial networks. These techniques have been used to evaluate their predictive capacities on a set of urine samples donated by 13 untreated NPC1 disease and 47 heterozygous (parental) carrier control participants. Results on the prediction have also been obtained using different machine learning classification models and the partial least squares techniques. These results provide strong evidence for the ability of DA techniques to generate good quality synthetic data. Results acquired show increases in sensitivity of 20%-50%, an F1 score of 6%-30%, and a predictive capacity of 0.3 (out of 1). Additionally, more conventional forms of multivariate data analysis have been employed. These have allowed the detection of unusual urinary metabolite profiles, and the identification of biomarkers through the use of synthetically augmented datasets. Results indicate that urinary branched-chain amino acids such as valine, 3-aminoisobutyrate and quinolinate, may be employable as valuable biomarkers for the diagnosis and prognostic monitoring of NPC1 disease.
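
Editorial illustration (not taken from the paper): of the augmentation techniques listed above, noise addition is the simplest to sketch. The abstract gives no noise parameters, so the example below assumes zero-mean Gaussian noise scaled by a fraction of each feature's standard deviation. The function name `augment_with_noise` and the `scale` and `seed` parameters are hypothetical, and the toy values are not real metabolite data.

```python
import random


def augment_with_noise(samples, n_new, scale=0.05, seed=0):
    """Create n_new synthetic samples by adding Gaussian noise to randomly chosen originals."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / len(samples) for d in range(dims)]
    # Per-feature standard deviation sets the noise level for each feature.
    stds = [
        (sum((s[d] - means[d]) ** 2 for s in samples) / len(samples)) ** 0.5
        for d in range(dims)
    ]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        synthetic.append([x + rng.gauss(0.0, scale * sd) for x, sd in zip(base, stds)])
    return synthetic


minority = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.2]]  # toy feature vectors, illustrative only
new_samples = augment_with_noise(minority, n_new=5)
```

Synthetic samples produced this way should only be added to the training split, so that evaluation is still carried out on real observations.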


Subjects
Niemann-Pick Disease, Type C; Biomarkers; Humans; Metabolomics
5.
Med Biol Eng Comput ; 60(4): 1159-1175, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35239108

ABSTRACT

The implementation of deep learning-based computer-aided diagnosis systems for the classification of mammogram images can help in improving the accuracy, reliability, and cost of diagnosing patients. However, training a deep learning model requires a considerable amount of labelled images, which can be expensive to obtain as time and effort from clinical practitioners are required. To address this, a number of publicly available datasets have been built with data from different hospitals and clinics, which can be used to pre-train the model. However, using models trained on these datasets for later transfer learning and model fine-tuning with images sampled from a different hospital or clinic might result in lower performance. This is due to the distribution mismatch of the datasets, which include different patient populations and image acquisition protocols. In this work, a real-world scenario is evaluated where a novel target dataset sampled from a private Costa Rican clinic is used, with few labels and heavily imbalanced data. The use of two popular and publicly available datasets (INbreast and CBIS-DDSM) as source data, to train and test the models on the novel target dataset, is evaluated. A common approach to further improve the model's performance under such a small labelled target dataset setting is data augmentation. However, cheaper unlabelled data is often available from the target clinic. Therefore, semi-supervised deep learning, which leverages both labelled and unlabelled data, can be used in such conditions. In this work, we evaluate the semi-supervised deep learning approach known as MixMatch, to take advantage of unlabelled data from the target dataset, for whole mammogram image classification. We compare the usage of semi-supervised learning on its own, and combined with transfer learning (from a source mammogram dataset) with data augmentation, and also against regular supervised learning with transfer learning and data augmentation from source datasets. It is shown that the use of semi-supervised deep learning combined with transfer learning and data augmentation can provide a meaningful advantage when using scarce labelled observations. Also, we found a strong influence of the source dataset, which suggests that a more data-centric approach is needed to tackle the challenge of scarcely labelled data. We used several different metrics to assess the performance gain of using semi-supervised learning when dealing with very imbalanced test datasets (such as the G-mean and the F2-score), as mammogram datasets are often very imbalanced.

Graphical abstract: Description of the test-bed implemented in this work. Two different source data distributions were used to fine-tune the different models tested in this work. The target dataset is the in-house CR-Chavarria-2020 dataset.
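
Editorial illustration (not taken from the paper): the two imbalance-aware metrics named above have standard definitions. The G-mean is the geometric mean of sensitivity and specificity, and the F2-score is the F-beta score with beta = 2, which weights recall more heavily than precision. A minimal sketch for the binary case follows; the function names are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (0.0 if a class is absent)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        return 0.0
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))


def f_beta(y_true, y_pred, beta=2.0):
    """F-beta score; beta = 2 gives the F2-score."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denominator if denominator else 0.0


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
scores = (g_mean(y_true, y_pred), f_beta(y_true, y_pred))
```

Unlike plain accuracy, both metrics drop sharply when the minority (positive) class is missed, which is why they suit heavily imbalanced test sets.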


Subjects
Diagnosis, Computer-Assisted; Supervised Machine Learning; Costa Rica; Diagnosis, Computer-Assisted/methods; Humans; Mammography; Reproducibility of Results
6.
IEEE Access ; 9: 85442-85454, 2021.
Article in English | MEDLINE | ID: mdl-34812397

ABSTRACT

In this work we implement a COVID-19 infection detection system based on chest X-ray images with uncertainty estimation. Uncertainty estimation is vital for safe usage of computer-aided diagnosis tools in medical applications. Model estimations with high uncertainty should be carefully analyzed by a trained radiologist. We aim to improve uncertainty estimations using unlabelled data through the MixMatch semi-supervised framework. We test popular uncertainty estimation approaches, comprising Softmax scores, Monte Carlo dropout and deterministic uncertainty quantification. To compare the reliability of the uncertainty estimates, we propose the usage of the Jensen-Shannon distance between the uncertainty distributions of correct and incorrect estimations. This metric is statistically relevant, unlike most previously used metrics, which often ignore the distribution of the uncertainty estimations. Our test results show a significant improvement in uncertainty estimates when using unlabelled data. The best results are obtained with the use of the Monte Carlo dropout method.
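
Editorial illustration (not taken from the paper): the Jensen-Shannon distance is the square root of the Jensen-Shannon divergence and, with base-2 logarithms, lies between 0 (identical distributions) and 1 (disjoint distributions). The sketch below assumes uncertainty values normalised to [0, 1] and approximates each distribution with a fixed-bin histogram; the binning and the function names are assumptions rather than the authors' procedure.

```python
import math


def histogram(values, bins=10, lo=0.0, hi=1.0):
    """Normalised histogram of values over [lo, hi]; out-of-range values are clamped."""
    counts = [0] * bins
    for value in values:
        index = int((value - lo) / (hi - lo) * bins)
        counts[min(max(index, 0), bins - 1)] += 1
    return [c / len(values) for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
    return math.sqrt(max(divergence, 0.0))  # clamp tiny negative rounding errors


# Toy uncertainty scores: low for correct estimations, high for incorrect ones.
correct = [0.05, 0.10, 0.12]
incorrect = [0.80, 0.90, 0.95]
distance = js_distance(histogram(correct), histogram(incorrect))
```

A larger distance means the uncertainty scores separate correct from incorrect estimations more cleanly, which is the property a useful uncertainty estimator should have.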

7.
Neural Netw ; 20(10): 1095-108, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17904333

ABSTRACT

The Recursive Deterministic Perceptron (RDP) feed-forward multilayer neural network is a generalisation of the single layer perceptron topology. This model is capable of solving any two-class classification problem as opposed to the single layer perceptron which can only solve classification problems dealing with linearly separable sets. For all classification problems, the construction of an RDP is done automatically and convergence is always guaranteed. Three methods for constructing RDP neural networks exist: Batch, Incremental, and Modular. The Batch method has been extensively tested and it has been shown to produce results comparable with those obtained with other neural network methods such as Back Propagation, Cascade Correlation, Rulex, and Ruleneg. However, no testing has been done before on the Incremental and Modular methods. Contrary to the Batch method, the complexity of these two methods is not NP-Complete. For the first time, a study on the three methods is presented. This study will allow the highlighting of the main advantages and disadvantages of each of these methods by comparing the results obtained while building RDP neural networks with the three methods in terms of the convergence time, the level of generalisation, and the topology size. The networks were trained and tested using the following standard benchmark classification datasets: IRIS, SOYBEAN, and Wisconsin Breast Cancer. The results obtained show the effectiveness of the Incremental and the Modular methods which are as good as that of the NP-Complete Batch method but with a much lower complexity level. The results obtained with the RDP are comparable to those obtained with the backpropagation and the Cascade Correlation algorithms.


Subjects
Computer Simulation; Neural Networks, Computer; Research Design; Humans; Learning; Reproducibility of Results
8.
Acta Ophthalmol ; 95(2): e138-e143, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27775228

ABSTRACT

PURPOSE: To describe the dynamic changes in intraocular pressure (IOP) and intracranial pressure (ICP), with normal or pathological values (intracranial hypertension), in nonglaucomatous neurological patients during lumbar puncture (LP). METHODS: Case-control study with prospective tonometry measurements in both groups referred for LP. Intraocular pressure, ICP and translaminar pressure difference (TPD) were compared pre- and post-LP. RESULTS: Thirty-six patients (72 eyes) with a mean age of 38.5 (16-64) years and BMI of 26.81 kg/m² were analysed. The initial mean ICP was 12.81 (± 6.6) mmHg. The mean TPD before and after the LP was 1.48 mmHg and 0.65 mmHg, respectively. The mean IOP of both eyes decreased to 0.8 mmHg post-LP in patients with pathological ICP (p = 0.0193) and normal ICP (p = 0.006). CONCLUSIONS: We found a statistically significant decrease in IOP post-LP compared with pre-LP in both groups, the decrease being greater in patients with pathological ICP. There were no significant differences in IOP between patients with normal and pathological ICP, either pre-LP or post-LP, nor was a correlation found between ICP and IOP.


Subjects
Intracranial Hypertension/physiopathology; Intracranial Pressure/physiology; Intraocular Pressure/physiology; Adolescent; Adult; Case-Control Studies; Female; Follow-Up Studies; Humans; Male; Middle Aged; Prospective Studies; Time Factors; Young Adult