1.
Sci Rep ; 13(1): 19598, 2023 11 10.
Article in English | MEDLINE | ID: mdl-37950041

ABSTRACT

Thyroid cancer is a life-threatening condition that arises from the cells of the thyroid gland, located at the front of the neck just below the Adam's apple. Although less prevalent than other types of cancer, it ranks among the most commonly observed cancers of the endocrine system. Machine learning has emerged as a valuable medical diagnostic tool, particularly for detecting thyroid abnormalities. Feature selection is vital in machine learning because it reduces data dimensionality and concentrates on the most pertinent features. This process improves model performance, reduces training time, and enhances interpretability. This study examined binary variants of FOX-optimization algorithms for feature selection. The study employed eight transfer functions (S- and V-shaped) to convert the FOX-optimization algorithms into their binary versions. Pre-trained vision transformer models (DeiT and Swin Transformer) are used for feature extraction. The extracted features are transformed using locally linear embedding, and binary FOX-optimization algorithms are applied for feature selection in conjunction with the Naïve Bayes classifier. The study utilized two thyroid cancer image datasets (ultrasound and histopathological). Benchmarking is performed using the half-quadratic theory-based ensemble ranking technique. Two TOPSIS-based methods (H-TOPSIS and A-TOPSIS) are employed for initial model ranking, followed by an ensemble technique for final ranking. The problem is treated as a multi-objective optimization task with accuracy, F2-score, AUC-ROC, and feature space size as optimization goals. The binary FOX-optimization algorithm based on the [Formula: see text] transfer function achieved superior performance compared to other variants on both datasets and with both feature extraction techniques.
The proposed framework, comprising a Swin Transformer for feature extraction, a FOX-optimization algorithm with a V1 transfer function for feature selection, and a Naïve Bayes classifier, obtained the best performance on both datasets. For the ultrasound dataset, the best model achieved an accuracy of 94.75%, an AUC-ROC value of 0.9848, an F2-score of 0.9365, and an inference time of 0.0353 seconds, with 5 selected features. For the histopathological dataset, the diagnosis model achieved an overall accuracy of 89.71%, an AUC-ROC score of 0.9329, an F2-score of 0.8760, and an inference time of 0.05141 seconds, with 12 selected features. The proposed model achieved results comparable to existing research with a small feature space.
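
The abstract above describes turning a continuous optimizer into a binary feature selector through S- and V-shaped transfer functions. The following is a minimal sketch of that general idea, not the authors' implementation: the abstract does not list the eight functions used, so the logistic sigmoid and |tanh| here are stand-ins commonly seen in the binary-metaheuristics literature, and all function names and the example position vector are hypothetical.

```python
import math
import random

def s_shaped(x):
    """Numerically stable logistic sigmoid: maps any real value into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def v_shaped(x):
    """|tanh(x)|: maps any real value into [0, 1], symmetric around zero."""
    return abs(math.tanh(x))

def binarize_s(position, rng):
    """S-shaped rule: set each bit to 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]

def binarize_v(position, current_bits, rng):
    """V-shaped rule: flip each current bit with probability v_shaped(x)."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]

rng = random.Random(0)                 # fixed seed for repeatability
position = [-2.0, 0.0, 3.5, 0.1]       # hypothetical continuous search position
mask = binarize_s(position, rng)       # 1 = keep the feature, 0 = drop it
```

In a wrapper setting, each mask would be scored by training a classifier on the selected columns only; the S-shaped rule assigns bits directly, whereas the V-shaped rule changes a bit only when the continuous value is far from zero.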


Subjects
Algorithms; Thyroid Neoplasms; Humans; Bayes Theorem; Thyroid Neoplasms/diagnostic imaging; Machine Learning
2.
J Imaging ; 9(9)2023 Aug 27.
Article in English | MEDLINE | ID: mdl-37754937

ABSTRACT

Computer-assisted diagnostic systems have been developed to aid doctors in diagnosing thyroid-related abnormalities. The aim of this research is to improve the diagnostic accuracy of thyroid abnormality detection models, which can help alleviate undue pressure on healthcare professionals. In this research, we propose a framework based on deep learning, metaheuristics, and MCDM algorithms to detect thyroid-related abnormalities from ultrasound and histopathological images. The proposed method uses three recently developed deep learning techniques (DeiT, Swin Transformer, and Mixer-MLP) to extract features from the thyroid image datasets. The feature extraction techniques are based on the Image Transformer and MLP models. The extracted features include many redundant ones, which can cause the classifiers to overfit and reduce their generalization capabilities. To avoid overfitting, six feature transformation techniques (PCA, TSVD, FastICA, ISOMAP, LLE, and UMP) are analyzed to reduce the dimensionality of the data. Five classifiers (LR, NB, SVC, KNN, and RF) are evaluated on the transformed dataset using 5-fold stratified cross-validation. Both datasets exhibit large class imbalances; hence, stratified cross-validation is used to evaluate performance. The MEREC-TOPSIS MCDM technique is used to rank the evaluated models at different analysis stages. In the first stage, the best feature extraction and classification techniques are chosen, whereas in the second stage, the best dimensionality reduction method is evaluated in wrapper feature selection mode. The two best-ranked models are then selected for weighted average ensemble learning and feature selection using the recently proposed metaheuristic FOX-optimization algorithm.
The PCA + FOX-optimization-based feature selection + random forest model achieved the highest TOPSIS score, with an accuracy of 99.13%, an F2-score of 98.82%, and an AUC-ROC score of 99.13% on the ultrasound dataset. Similarly, the model achieved an accuracy of 90.65%, an F2-score of 92.01%, and an AUC-ROC score of 95.48% on the histopathological dataset. This study exploits a novel combination of algorithms to improve thyroid cancer diagnosis capabilities. The proposed framework outperforms the current state-of-the-art diagnostic methods for thyroid-related abnormalities on ultrasound and histopathological datasets and can significantly aid medical professionals by reducing the excessive burden on the medical fraternity.
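
The abstract above ranks candidate models with a TOPSIS score. As a rough illustration of how classic TOPSIS turns several criteria into one score, here is a minimal sketch. It assumes vector normalisation and user-supplied weights (the MEREC weighting step from the abstract is not reproduced), and the decision matrix is made-up illustrative data, not the paper's results.

```python
import math

def topsis(matrix, weights, benefit):
    """Return one closeness score per row (alternative); higher is better.

    matrix  : rows are alternatives, columns are criteria
    weights : one non-negative weight per criterion
    benefit : True where larger is better, False where smaller is better
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    total_w = sum(weights)
    w = [x / total_w for x in weights]
    # Vector-normalise each criterion column, then apply its weight.
    col_norm = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[w[j] * row[j] / col_norm[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal solution
        d_neg = math.dist(row, anti)    # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical models scored on accuracy, F2-score, and number of features.
models = [[0.99, 0.98, 10], [0.90, 0.92, 12], [0.85, 0.80, 50]]
scores = topsis(models, weights=[1, 1, 1], benefit=[True, True, False])
best = max(range(len(scores)), key=scores.__getitem__)
```

Treating the feature count as a cost criterion is what lets a compact model outrank a slightly more accurate but much larger one.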

3.
Curr Med Imaging ; 2023 Apr 05.
Article in English | MEDLINE | ID: mdl-37038671

ABSTRACT

BACKGROUND: Thyroid disorders are prevalent worldwide and affect many people. Abnormal growth of cells in the thyroid gland region is very common and is found even in healthy people. These abnormal cells can be cancerous or non-cancerous, so early detection is the only way to minimize the death rate and maximize a patient's survival rate. Traditional techniques for detecting cancerous nodules are complex and time-consuming; hence, several imaging algorithms are used to determine the malignant status of thyroid nodules in a timely manner.
AIM: This research aims to develop computer-aided diagnosis tools for malignant thyroid nodule detection using ultrasound images. Such a tool will help doctors and radiologists detect thyroid cancer rapidly at its early stages. Individual machine learning models perform poorly on medical datasets because medical image datasets are very small and suffer from severe class imbalance. These problems lead to overfitting; hence, accuracy on the test dataset is very poor.
OBJECTIVE: This research proposes ensemble learning models that achieve higher accuracy than individual models. The objective is to design different ensemble models and then use benchmarking techniques to select the best model among all trained models.
METHODS: This research investigates four recently developed image transformer and mixer models for thyroid detection. Weighted average ensemble models are introduced, and the model weights are optimized using the hunger games search (HGS) optimization algorithm. The recently developed distance correlation CRITIC (D-CRITIC) based TOPSIS method is used to rank the models.
RESULTS: Based on the TOPSIS score, the best model for an 80:20 split is the gMLP+ViT model, which achieved an accuracy of 89.70%, whereas for a 70:30 data split, the gMLP+FNet+Mixer-MLP model achieved the highest accuracy of 82.18% on the publicly available thyroid dataset.
CONCLUSION: This study shows that the proposed ensemble models have better thyroid detection capabilities than individual base models for the imbalanced thyroid ultrasound dataset.
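
The weighted average ensemble mentioned in the abstract above can be pictured as a weighted sum of each model's class probabilities. The sketch below assumes the weights are already given; the hunger games search that the abstract uses to tune them is not implemented, and the probability values and function names are hypothetical.

```python
def weighted_average_ensemble(prob_sets, weights):
    """Fuse per-model class probabilities with normalised weights.

    prob_sets : one entry per model; each entry is a list of per-sample
                class-probability lists
    weights   : one non-negative weight per model
    """
    total = sum(weights)
    w = [x / total for x in weights]
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [[sum(w[m] * prob_sets[m][i][c] for m in range(n_models))
             for c in range(n_classes)]
            for i in range(n_samples)]

def predict(fused):
    """Pick the class with the largest fused probability for each sample."""
    return [max(range(len(p)), key=p.__getitem__) for p in fused]

# Hypothetical outputs of two base models on two samples (benign, malignant).
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.2, 0.8], [0.3, 0.7]]
fused = weighted_average_ensemble([model_a, model_b], weights=[3, 1])
labels = predict(fused)
```

An optimizer such as HGS would search over the `weights` vector, scoring each candidate by validation accuracy of the fused predictions.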

4.
Front Big Data ; 5: 1021518, 2022.
Article in English | MEDLINE | ID: mdl-36299660

ABSTRACT

Machine learning (ML)-based classification models are widely used for the automated detection of heart diseases (HDs) from various physiological signals such as electrocardiogram (ECG), magnetocardiography (MCG), heart sound (HS), and impedance cardiography (ICG) signals. Of these, ECG-based HD identification is the one most commonly used by clinicians. In the current investigation, the ECG records or subjects have been sampled and are used as inputs to the classification model to distinguish between normal and abnormal patients. The study employed an imbalanced number of ECG samples for training the various classification models. A few ML methods that have rarely been used for HD detection, namely support vector machine (SVM), logistic regression (LR), and adaptive boosting (AdaBoost), have been selected. The performance of the developed models has been evaluated in terms of accuracy, F1-score, and area under curve (AUC) values using ECG signals of subjects from the publicly available PTB-ECG and MIT-BIH datasets. The models have been ranked on these performance metrics, and the AdaBoost and LR classifiers are found to take first and second positions. These two models have been ensembled using the majority voting principle, and the performance of this ensemble model has also been determined. In general, the proposed ensemble model demonstrates the best HD detection performance, with 0.946, 0.949, and 0.951 for the PTB-ECG dataset and 0.921, 0.926, and 0.950 for the MIT-BIH dataset in terms of accuracy, F1-score, and AUC, respectively. The proposed methodology can also be employed for the classification of HD using ICG, MCG, and HS signals as inputs. Further, it can be applied to the detection of other diseases.

5.
IEEE J Biomed Health Inform ; 26(11): 5364-5371, 2022 11.
Article in English | MEDLINE | ID: mdl-35947565

ABSTRACT

In recent times, speech-based automatic disease detection systems have shown several promising results in biomedical and life science applications, especially for respiratory diseases. Such systems offer a quick, cost-effective, reliable, and non-invasive potential alternative for COVID-19 detection in the ongoing pandemic, since the subject's voice can be recorded remotely and sent for further analysis. Existing COVID-19 detection methods, including RT-PCR and chest X-ray tests, are not only more costly but also require the involvement of a trained technician. The present paper proposes a novel speech-based respiratory disease detection scheme for COVID-19 and asthma using a Gradient Boosting Machine-based classifier. From the recorded speech samples, the spectral, cepstral, and periodicity features, as well as spectral descriptors, are computed and then homogeneously fused to obtain relevant statistical features. These features are subsequently used as inputs to the Gradient Boosting Machine. The various performance metrics of the proposed model have been obtained using speech data from thirteen sound categories, collected from more than 50 countries across five standard datasets, for accurate diagnosis of respiratory diseases including COVID-19. The overall average accuracy achieved by the proposed model under stratified k-fold cross-validation is above 97%. The analysis of the various performance metrics demonstrates that, under the current pandemic scenario, the proposed COVID-19 detection scheme can be gainfully employed by physicians.


Subjects
COVID-19; Humans; Speech; Pandemics
6.
Open Med (Wars) ; 17(1): 1100-1113, 2022.
Article in English | MEDLINE | ID: mdl-35799599

ABSTRACT

Cardiovascular disease (CVD) impairs the function of the heart and blood vessels and often leads to death or physical paralysis. Early and automatic detection of CVD can therefore save many lives. Multiple investigations have been carried out to achieve this objective, but there is still room for improvement in performance and reliability. This study is another step in this direction. Two reliable machine learning techniques, the multi-layer perceptron (MLP) and K-nearest neighbour (K-NN), have been employed for CVD detection using publicly available data from the University of California Irvine repository. The performance of the models is improved by removing outliers and attributes with null values. Experimental results show that the MLP model achieves a higher detection accuracy (82.47%) and area-under-the-curve value (86.41%) than the K-NN model. The proposed MLP model is therefore recommended for automatic CVD detection. The proposed methodology can also be employed in detecting other diseases. In addition, the performance of the proposed model can be assessed on other standard datasets.

7.
Article in English | MEDLINE | ID: mdl-33621179

ABSTRACT

Knowledge of protein-protein interactions (PPI) is essential for understanding the behavioral processes of life and disease-causing mechanisms. In this paper, a novel hybrid approach combining a deep neural network (DNN) and an extreme gradient boosting classifier (XGB) is employed for predicting PPI. The hybrid classifier (DNN-XGB) uses a fusion of three sequence-based features as inputs: amino acid composition (AAC), conjoint triad composition (CT), and local descriptor (LD). The DNN extracts hidden information from the raw features through layer-wise abstraction, and its output is passed to the XGB classifier. The 5-fold cross-validation accuracies for the intraspecies interaction datasets of Saccharomyces cerevisiae (core subset), Helicobacter pylori, Saccharomyces cerevisiae, and Human are 98.35, 96.19, 97.37, and 99.74 percent, respectively. Similarly, accuracies of 98.50 and 97.25 percent are achieved for the interspecies interaction datasets of Human-Bacillus anthracis and Human-Yersinia pestis, respectively. The improved prediction accuracies obtained on the independent test sets and network datasets indicate that DNN-XGB can be used to predict cross-species interactions. It can also provide new insights into signaling pathway analysis, drug target prediction, and disease pathogenesis. The improved performance of the proposed method suggests that the hybrid classifier can serve as a useful tool for PPI prediction. The datasets and source codes are available at: https://github.com/SatyajitECE/DNN-XGB-for-PPI-Prediction.
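
Of the three sequence-based features named in the abstract above, amino acid composition (AAC) is the simplest: the relative frequency of each of the 20 standard residues. The sketch below shows one plausible way to compute it; it is not taken from the linked repository, the function name is hypothetical, and the choice to skip non-standard symbols is an assumption.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues, one-letter codes

def amino_acid_composition(sequence):
    """Return a 20-element list of residue frequencies that sums to 1.

    Symbols outside the 20 standard residues (for example 'X') are ignored,
    which is an assumption made for this sketch.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]

# For a protein pair, the two composition vectors can simply be concatenated.
pair_features = amino_acid_composition("ACDA") + amino_acid_composition("MKVLA")
```

The resulting fixed-length vector is what makes variable-length sequences usable as input to a classifier such as a DNN or gradient-boosted trees.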


Subjects
Neural Networks, Computer; Protein Interaction Mapping; Amino Acids; Humans; Saccharomyces cerevisiae/genetics; Software
8.
Front Artif Intell ; 5: 1035805, 2022.
Article in English | MEDLINE | ID: mdl-36686850

ABSTRACT

COVID-19 is a deadly viral infection that mainly affects the nasopharyngeal and oropharyngeal cavities before reaching the lungs. Early detection followed by immediate treatment can potentially reduce lung invasion and decrease fatality. Recently, several COVID-19 detection methods have been proposed using cough and breath sounds. However, very few studies have examined the use of phoneme analysis and the smearing of the audio signal in COVID-19 detection. This paper addresses that gap by classifying speech samples into COVID-19-positive and healthy audio samples. Additionally, a grouping of the phonemes based on reference classification accuracies is proposed for effective and faster detection of the disease at a primary stage. The Mel and Gammatone cepstral coefficients and their derivatives are used as features for five standard machine learning-based classifiers. The generalized additive model is observed to provide the highest accuracy, 97.22%, for the phoneme grouping "/t//r//n//g//l/." This smearing-based phoneme classification technique can also be used in the future to detect other speech-related diseases.

9.
Pattern Recognit ; 117: 107999, 2021 Sep.
Article in English | MEDLINE | ID: mdl-33967346

ABSTRACT

The early detection of COVID-19 is a challenging task because of its rapid, deadly spread and the fear it has created in people's minds. Speech-based detection can be one of the safest tools for this purpose, as the voice of a suspected patient can be easily recorded. Mel Frequency Cepstral Coefficient (MFCC) analysis of the speech signal is one of the oldest yet still effective analysis tools. Its performance depends mainly on the conversion from the normal frequency scale to the perceptual frequency scale and on the frequency range of the filters used. Traditionally, in speech recognition, these values are fixed. However, the characteristics of speech signals vary from disease to disease. COVID-19 detection mainly uses coughing sounds, whose bandwidth and properties are quite different from those of the complete speech signal. Exploiting these properties can improve the efficiency of COVID-19 detection. To achieve this objective, the frequency range and the frequency conversion scale have been suitably optimized. Further, to enhance detection accuracy, speech enhancement has been carried out before feature extraction. By implementing these two concepts, a new feature called the COVID-19 Coefficient (C-19CC) is developed in this paper. Finally, the performance of these features has been compared.

10.
Healthc Technol Lett ; 5(1): 31-37, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29515814

ABSTRACT

Accurate optic disc (OD) segmentation is an important step in obtaining cup-to-disc ratio-based glaucoma screening using fundus imaging. It is a challenging task because of the subtle OD boundary, blood vessel occlusion and intensity inhomogeneity. In this Letter, the authors propose an improved version of the random walk algorithm for OD segmentation to tackle such challenges. The algorithm incorporates the mean curvature and Gabor texture energy features to define the new composite weight function to compute the edge weights. Unlike the deformable model-based OD segmentation techniques, the proposed algorithm remains unaffected by curve initialisation and local energy minima problem. The effectiveness of the proposed method is verified with DRIVE, DIARETDB1, DRISHTI-GS and MESSIDOR database images using the performance measures such as mean absolute distance, overlapping ratio, dice coefficient, sensitivity, specificity and precision. The obtained OD segmentation results and quantitative performance measures show robustness and superiority of the proposed algorithm in handling the complex challenges in OD segmentation.
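
Two of the overlap measures listed in the abstract above are easy to state in code. The sketch below computes the Dice coefficient and an overlap ratio on flattened binary masks; it assumes "overlapping ratio" means intersection over union, which the abstract does not define, and the tiny masks are illustrative only.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two equal-length 0/1 masks."""
    pred, truth = list(pred), list(truth)
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * inter / total if total else 1.0   # two empty masks agree fully

def overlap_ratio(pred, truth):
    """Intersection over union (assumed meaning of 'overlapping ratio')."""
    pred, truth = list(pred), list(truth)
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0

# Flattened 2x2 masks: 1 marks a pixel labelled as optic disc.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
dice = dice_coefficient(predicted, reference)
iou = overlap_ratio(predicted, reference)
```

Dice weights the shared pixels twice, so it is never lower than the intersection-over-union value for the same pair of masks.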

11.
Comput Med Imaging Graph ; 66: 56-65, 2018 06.
Article in English | MEDLINE | ID: mdl-29544118

ABSTRACT

Retinal nerve fiber layer defect (RNFLD) provides early objective evidence of structural changes in glaucoma. RNFLD detection is currently carried out using imaging modalities such as OCT and GDx, which are expensive for routine practice. In this regard, we propose a novel automatic method for RNFLD detection and angular width quantification using cost-effective red-free fundus images, intended to be practically useful for computer-assisted glaucoma risk assessment. After blood vessel inpainting and CLAHE-based contrast enhancement, the initial boundary pixels are identified by local minima analysis of the 1-D intensity profiles on concentric circles. The true boundary pixels are classified using a random forest trained on the newly proposed cumulative zero count local binary pattern (CZC-LBP) and directional differential energy (DDE) features, along with Shannon entropy, Tsallis entropy, and intensity features. Finally, the RNFLD angular width is obtained by random sample consensus (RANSAC) line fitting on the detected set of boundary pixels. The proposed method achieves high RNFLD detection performance on a newly created dataset, with a sensitivity (SN) of 0.7821 at 0.2727 false positives per image (FPI) and an area under curve (AUC) value of 0.8733.


Subjects
Fundus Oculi; Glaucoma/diagnostic imaging; Glaucoma/physiopathology; Image Processing, Computer-Assisted/methods; Nerve Fibers/physiology; Retina/diagnostic imaging; Adult; Algorithms; Female; Humans; Male; Tomography, Optical Coherence/methods
12.
J Med Imaging (Bellingham) ; 5(4): 044003, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30840736

ABSTRACT

Glaucoma is a progressive optic neuropathy characterized by peripheral visual field loss, which is caused by degeneration of retinal nerve fibers. The peripheral vision loss due to glaucoma is asymptomatic. If not detected and treated at an early stage, it leads to complete, irreversible blindness. The retinal nerve fiber layer defect (RNFLD) provides the earliest objective evidence of glaucoma. In this regard, we explore cost-effective red-free fundus imaging for RNFLD detection, intended to be practically useful for computer-assisted early glaucoma risk assessment. RNFLD appears as a wedge-shaped arcuate structure radiating from the optic disc. The very low contrast between RNFLD and background makes its visual detection quite challenging, even for medical experts. In our study, we formulate a deep convolutional neural network (CNN) based patch classification strategy for RNFLD boundary localization. The deep CNN model is trained on a large number of RNFLD and background image patches, from which it extracts sufficient discriminative information to classify RNFLD boundary pixels accurately. The proposed approach achieves enhanced RNFLD detection performance, with a sensitivity of 0.8205 at 0.2000 false positives per image on a newly created early glaucomatic fundus image database.

13.
Article in English | MEDLINE | ID: mdl-21778522

ABSTRACT

Protein-protein interactions govern almost all biological processes and the underlying functions of proteins. The interaction sites of a protein depend on its 3D structure, which in turn depends on the amino acid sequence. Hence, prediction of protein function from its primary sequence is an important and challenging task in bioinformatics. Identifying the amino acids (hot spots) that lead to the characteristic frequency signifying a particular biological function is a tedious task in proteomic signal processing. In this paper, we propose a promising new technique for identifying hot spots in proteins using an efficient time-frequency filtering approach known as S-transform filtering. The S-transform is a powerful linear time-frequency representation and is especially useful for filtering in the time-frequency domain. The potential of the new technique in identifying hot spots in proteins is analyzed, and the results obtained are compared with those of existing methods. The results demonstrate that the proposed method is superior to its counterparts and is consistent with results from biological methods for identifying hot spots. The proposed method also reveals some new hot spots that need further investigation and validation by the biological community.


Subjects
Computational Biology/methods; Protein Interaction Mapping/methods; Proteins/chemistry; Sequence Analysis, Protein/methods; Signal Processing, Computer-Assisted; Amino Acids/chemistry; Animals; Bacterial Proteins/chemistry; Bacterial Proteins/metabolism; Cattle; Fibroblast Growth Factors/chemistry; Fibroblast Growth Factors/metabolism; Human Growth Hormone/chemistry; Human Growth Hormone/metabolism; Humans; Membrane Proteins/chemistry; Membrane Proteins/metabolism; Models, Molecular; Proteins/metabolism
14.
Genomics Proteomics Bioinformatics ; 9(1-2): 45-55, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21641562

ABSTRACT

Accurate identification of protein-coding regions (exons) in DNA sequences has been a challenging task in bioinformatics. In particular, coding regions exhibit a 3-base periodicity, which forms the basis of all exon identification methods. Many signal processing tools and techniques have been applied successfully to the identification task, but further improvement is still needed. In this paper, we introduce a promising new model-independent time-frequency filtering technique based on the S-transform for accurate identification of coding regions. The S-transform is a powerful linear time-frequency representation useful for filtering in the time-frequency domain. The potential of the proposed technique has been assessed through a simulation study, and the results obtained have been compared with those of existing methods using standard datasets. The comparative study demonstrates that the proposed method outperforms its counterparts in identifying coding regions.
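
The 3-base periodicity mentioned in the abstract above can be made concrete with a plain windowed Fourier measure. The sketch below is not the S-transform filter the paper proposes; it substitutes the simpler, classic approach of evaluating the discrete Fourier transform of the four binary base-indicator sequences at frequency 1/3. Function names, window length, and the toy sequence are hypothetical.

```python
import cmath

def period3_power(window):
    """Spectral power at frequency 1/3, summed over the A, C, G, T indicators.

    For each base, the indicator sequence is 1 where that base occurs and 0
    elsewhere; its DFT at k = N/3 reduces to a sum of exp(-2*pi*i*n/3).
    The result is divided by the window length as a simple normalisation.
    """
    power = 0.0
    for base in "ACGT":
        coeff = sum(cmath.exp(-2j * cmath.pi * n / 3)
                    for n, ch in enumerate(window) if ch == base)
        power += abs(coeff) ** 2
    return power / len(window)

def period3_profile(sequence, window, step=3):
    """Slide a window along the sequence and record the period-3 power."""
    seq = sequence.upper()
    return [period3_power(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]

# Toy sequence: a non-periodic stretch followed by a codon-like repeat.
toy = "A" * 30 + "ATG" * 10
profile = period3_profile(toy, window=30, step=30)
```

Peaks in such a profile are the raw signal that time-frequency filters, including the S-transform approach in the abstract, aim to isolate more sharply.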


Subjects
Exons; Sequence Analysis, DNA/methods; Animals; Humans; Software
15.
Comput Biol Chem ; 34(5-6): 320-7, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21106461

ABSTRACT

Over the last few decades, accurate determination of protein structural class using a fast and suitable computational method has been a challenging problem in protein science. In this context, a meaningful representation of a protein sample plays a key role in achieving higher prediction accuracy. In this paper, based on the concept of Chou's pseudo amino acid composition (Chou, K.C., 2001. Proteins 43, 246-255), a new feature representation method is introduced that combines the amino acid composition information, the amphiphilic correlation factors, and the spectral characteristics of the protein. A protein sample is thus represented by a set of discrete components that incorporate both the sequence order and the length effect. On the basis of this statistical framework, a simple radial basis function network-based classifier is introduced to predict protein structural class. A set of exhaustive simulation studies demonstrates a high classification success rate under the self-consistency and jackknife tests on the benchmark datasets.


Subjects
Amino Acids/chemistry; Proteins/chemistry; Sequence Analysis, Protein/methods; Algorithms; Amino Acids/classification; Amino Acids/genetics; Databases, Protein; Proteins/classification; Proteins/genetics