ABSTRACT
In Italian universities, bioinformatics courses are increasingly being incorporated into different study paths. However, course content is usually selected by the professor teaching the course, in the absence of national guidelines identifying the minimum indispensable bioinformatics knowledge that undergraduate students from different scientific fields should achieve. The Training&Teaching group of the Bioinformatics Italian Society (BITS) submitted to university professors a survey aimed at portraying the current situation of bioinformatics courses within undergraduate curricula in Italy (i.e., bioinformatics courses activated within both bachelor's and master's degrees). Furthermore, the Training&Teaching group took a cue from the survey outcomes to develop recommendations for the design and inclusion of bioinformatics courses in academic curricula. Here, we present the outcomes of the survey, as well as the BITS recommendations, in the hope that they may support BITS members in identifying learning outcomes and selecting content for their bioinformatics courses. As we share our effort with the broader international community involved in teaching bioinformatics at the academic level, we seek feedback and thoughts on our proposal and hope to start a fruitful debate on the topic, including how to better fulfill the real bioinformatics knowledge needs of research and the labor market at both the national and international level.
Subjects
Curriculum, Students, Humans, Italy, Surveys and Questionnaires, Learning
ABSTRACT
Frameworks for human activity recognition (HAR) can be applied in the clinical environment to monitor patients' motor and functional abilities, either remotely or within a rehabilitation program. Deep Learning (DL) models can perform HAR directly on raw data, thus avoiding time-demanding feature engineering. Most works targeting HAR with DL-based architectures have tested workflow performance on data from separately executed tasks; the literature therefore offers few frameworks aimed at recognizing continuously executed motor actions. In this article, the authors present the design, development, and testing of a DL-based workflow targeting continuous human activity recognition (CHAR). The model was trained on data recorded from ten healthy subjects and tested on eight different subjects. Despite the limited sample size, the results suggest that the proposed framework can accurately classify motor actions within a feasible time, making it potentially useful in a clinical scenario.
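The abstract does not detail how the continuous recording is segmented before classification; a common preprocessing step in continuous HAR is to split the raw stream into fixed-length, overlapping windows that are then fed to the DL model. A minimal sketch (the function name and the 50 Hz / 2 s / 50% overlap parameters are illustrative assumptions, not taken from the paper):

```python
def sliding_windows(signal, window_size, step):
    """Split a continuously recorded signal (a list of samples) into
    fixed-length, possibly overlapping windows for classification."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    return [signal[i:i + window_size]
            for i in range(0, len(signal) - window_size + 1, step)]

# Example: 10 s of a single-channel stream sampled at 50 Hz,
# segmented into 2 s windows with 50% overlap.
stream = list(range(500))  # 500 samples stand in for raw sensor data
windows = sliding_windows(stream, window_size=100, step=50)
```

Each window would then be classified independently, yielding a sequence of activity labels over time.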
Subjects
Deep Learning, Humans, Human Activities, Activities of Daily Living, Engineering, Healthy Volunteers
ABSTRACT
In the field of neuroscience, brain-computer interfaces (BCIs) are used to connect the human brain with external devices, providing insights into the neural mechanisms underlying cognitive processes, including aesthetic perception. Non-invasive BCIs, such as EEG and fNIRS, are critical for studying central nervous system activity and understanding how individuals with cognitive deficits process and respond to aesthetic stimuli. This study assessed twenty participants who were divided into control and impaired aging (AI) groups based on MMSE scores. EEG and fNIRS were used to measure their neurophysiological responses to aesthetic stimuli that varied in pleasantness and dynamism. Significant differences were identified between the groups in P300 amplitude and late positive potential (LPP), with controls showing greater reactivity. AI subjects showed an increase in oxyhemoglobin in response to pleasurable stimuli, suggesting hemodynamic compensation. This study highlights the effectiveness of multimodal BCIs in identifying the neural basis of aesthetic appreciation and impaired aging. Despite its limitations, such as sample size and the subjective nature of aesthetic appreciation, this research lays the groundwork for cognitive rehabilitation tailored to aesthetic perception, improving the comprehension of cognitive disorders through integrated BCI methodologies.
Subjects
Brain-Computer Interfaces, Humans, Aging, Brain, Esthetics, Perception
ABSTRACT
The study of visuomotor adaptation (VMA) capabilities has featured in various experimental protocols aimed at investigating human motor control strategies and/or cognitive functions. VMA-oriented frameworks can have clinical applications, primarily in the investigation and assessment of neuromotor impairments caused by conditions such as Parkinson's disease or stroke, which affect the lives of tens of thousands of people worldwide. They can therefore enhance the understanding of the specific mechanisms of such neuromotor disorders and provide potential biomarkers for recovery, with the aim of being integrated into conventional rehabilitative programs. Virtual Reality (VR) can be employed in a framework targeting VMA, since it allows visual perturbations to be developed in a more customizable and realistic way. Moreover, as demonstrated in previous works, a serious game (SG) can further increase engagement thanks to the use of full-body embodied avatars. Most studies implementing VMA frameworks have focused on upper-limb tasks and have used a cursor as visual feedback for the user; the literature on VMA-oriented frameworks targeting locomotion tasks is therefore scarce. In this article, the authors present the design, development, and testing of an SG-based framework that addresses VMA in a locomotion activity by controlling a full-body moving avatar in a custom VR environment. The workflow includes a set of metrics to quantitatively assess participants' performance. Thirteen healthy children were recruited to evaluate the framework. Several quantitative comparisons and analyses were run to validate the different types of introduced visuomotor perturbations and to evaluate the ability of the proposed metrics to describe the difficulty caused by such perturbations. During the experimental sessions, the system proved safe, easy to use, and practical in a clinical setting.
Despite the limited sample size, which represents the main limitation of the study and can be addressed by future recruitment, the authors see the potential of this framework as a useful instrument for quantitatively assessing motor or cognitive impairments. The proposed feature-based approach yields several objective parameters as additional biomarkers that can complement conventional clinical scores. Future studies might investigate the relation between the proposed biomarkers and the clinical scores for specific disorders such as Parkinson's disease and cerebral palsy.
Subjects
Parkinson Disease, Stroke, Virtual Reality, Child, Humans, Parkinson Disease/diagnosis, User-Computer Interface, Locomotion
ABSTRACT
BACKGROUND: Computer-aided diagnosis (CAD) systems based on medical images can support physicians in the decision-making process. During the last decades, researchers have proposed CAD systems in several medical domains, achieving promising results. CAD systems play an important role in digital pathology, supporting pathologists in analyzing biopsy slides by means of standardized and objective workflows. In the proposed work, we designed and tested a novel CAD system module based on image processing techniques and machine learning, whose objective was to classify renal corpuscles (glomeruli) as sclerotic or non-sclerotic. Such discrimination is useful for the biopsy slide evaluation performed by pathologists. RESULTS: We collected 26 digital slides taken from the kidneys of 19 donors with Periodic Acid-Schiff staining. Expert pathologists conducted the slide preparation, digital acquisition, and glomeruli annotation. Before setting up the classifiers, we evaluated several feature extraction techniques on the annotated regions. A feature reduction procedure followed by a shallow artificial neural network then allowed discrimination between the glomeruli classes. We evaluated the workflow on an independent dataset (i.e., images not used in the training procedure). Ten independent runs of training and evaluation achieved an MCC of 0.95 (± 0.01) and an accuracy of 0.99 (standard deviation ≈ 0.00), respectively. We also obtained good precision (0.9844 ± 0.0111) and recall (0.9310 ± 0.0153). CONCLUSIONS: Results on the test set confirm that the proposed workflow is consistent and reliable for the investigated domain and can support the clinical practice of discriminating the two classes of glomeruli. Analyses of the misclassifications show that the images involved are usually affected by staining artefacts or present partial sections due to slice preparation and staining processes.
In clinical practice, however, pathologists discard images showing such artefacts.
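The reported MCC of 0.95 summarizes the whole confusion matrix in a single score. For reference, this is how the Matthews Correlation Coefficient is computed from raw counts (the counts below are toy values, not the paper's):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.
    Ranges from -1 (total disagreement) to +1 (perfect prediction)."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Toy binary classification: 93 TP, 98 TN, 2 FP, 7 FN.
score = mcc(tp=93, tn=98, fp=2, fn=7)
```

Unlike plain accuracy, MCC stays informative when the two classes (here sclerotic vs. non-sclerotic) are imbalanced.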
Subjects
Computer-Assisted Diagnosis, Computer Neural Networks, Algorithms, Biopsy, Humans, Kidney/diagnostic imaging
ABSTRACT
The coronavirus disease 2019 (COVID-19) pandemic has affected hundreds of millions of individuals and caused millions of deaths worldwide. Predicting the clinical course of the disease is of pivotal importance for managing patients. Several studies have found hematochemical alterations in COVID-19 patients, such as inflammatory markers. We retrospectively analyzed the anamnestic data and laboratory parameters of 303 patients diagnosed with COVID-19 who were admitted to the Polyclinic Hospital of Bari during the first phase of the COVID-19 global pandemic. After the pre-processing phase, we performed a survival analysis with Kaplan-Meier curves and Cox regression, with the aim of discovering the most unfavorable predictors. The target outcomes were mortality and admission to the intensive care unit (ICU). Different machine learning models were also compared to build a robust classifier relying on a low number of strongly significant factors to estimate the risk of death or ICU admission. From the survival analysis, it emerged that the most significant laboratory parameter for both outcomes was minimum C-reactive protein: HR = 17.963 (95% CI 6.548-49.277, p < 0.001) for death, HR = 1.789 (95% CI 1.000-3.200, p = 0.050) for ICU admission. The second most important parameter was maximum erythrocytes: HR = 1.765 (95% CI 1.141-2.729, p < 0.05) for death, HR = 1.481 (95% CI 0.895-2.452, p = 0.127) for ICU admission. The best model for predicting the risk of death was the decision tree, with a ROC-AUC of 89.66%, whereas the best model for predicting ICU admission was the support vector machine, with a ROC-AUC of 95.07%. The hematochemical predictors identified in this study can be used as a strong prognostic signature to characterize the severity of the disease in COVID-19 patients.
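The Kaplan-Meier analysis mentioned above estimates the survival curve while correctly handling censored patients (those still alive at the end of follow-up). A minimal pure-Python sketch of the estimator, on a three-patient toy cohort rather than the study data:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.
    times:  follow-up time of each patient
    events: 1 if the event (e.g. death) occurred, 0 if censored
    Returns a list of (time, survival_probability) at event times."""
    data = sorted(zip(times, events))
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        n_at_risk = sum(1 for tt, _ in data if tt >= t)
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        i += sum(1 for tt, _ in data if tt == t)  # skip past this time point
    return curve

# Toy cohort: deaths at days 5 and 10, one patient censored at day 7.
curve = kaplan_meier(times=[5, 7, 10], events=[1, 0, 1])
```

The censored patient at day 7 leaves the risk set without triggering a drop in the curve, which is exactly what distinguishes this estimator from a naive fraction of survivors.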
Subjects
COVID-19, Hospital Mortality, Humans, Machine Learning, Prognosis, Retrospective Studies, SARS-CoV-2, Survival Analysis
ABSTRACT
BACKGROUND: The automatic segmentation of kidneys in medical images is not a trivial task when the subjects undergoing the examination are affected by Autosomal Dominant Polycystic Kidney Disease (ADPKD). Several works dealing with the segmentation of Computed Tomography images from pathological subjects have been proposed, but they involve highly invasive examinations or require user interaction to perform the segmentation. In this work, we propose a fully automated approach for the segmentation of Magnetic Resonance images, both reducing the invasiveness of the acquisition and removing the need for any user interaction. METHODS: Two approaches based on Deep Learning architectures using Convolutional Neural Networks (CNNs) are proposed for the semantic segmentation of images, without the need to extract any hand-crafted features. In detail, the first approach performs the automatic segmentation of images without any pre-processing of the input. Conversely, the second approach applies a two-step classification strategy: a first CNN automatically detects Regions Of Interest (ROIs), and a subsequent classifier performs the semantic segmentation on the extracted ROIs. RESULTS: Results show that, even though ROI detection produces an overall high number of false positives, the subsequent semantic segmentation on the extracted ROIs achieves high performance in terms of mean accuracy. However, segmenting the entire image fed to the network remains the more accurate and reliable approach. CONCLUSION: The obtained results show that both investigated approaches are reliable for the semantic segmentation of polycystic kidneys, since both strategies reach an accuracy higher than 85%. Moreover, both methodologies show performance comparable and consistent with other approaches in the literature working on images from different sources, while reducing both the invasiveness of the analyses and the user interaction needed to perform the segmentation task.
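The abstract reports mean accuracy; in semantic segmentation this is commonly complemented by an overlap metric such as the Dice coefficient, which is insensitive to the large background area. A sketch of both metrics on flattened binary masks (toy values, not the study's masks):

```python
def dice_and_accuracy(pred, truth):
    """Dice coefficient and pixel accuracy for two flat binary masks."""
    assert len(pred) == len(truth)
    inter = sum(p & t for p, t in zip(pred, truth))
    total_fg = sum(pred) + sum(truth)
    dice = 2 * inter / total_fg if total_fg else 1.0
    acc = sum(p == t for p, t in zip(pred, truth)) / len(truth)
    return dice, acc

# Toy 6-pixel masks: prediction overlaps ground truth on 2 foreground pixels.
pred  = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
dice, acc = dice_and_accuracy(pred, truth)
```

On kidney MRI, where most pixels are background, accuracy alone can look high even for a poor mask, which is why overlap metrics are worth reporting alongside it.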
Subjects
Deep Learning, Magnetic Resonance Imaging/methods, Autosomal Dominant Polycystic Kidney, Semantics, Humans, Computer-Assisted Image Processing/methods, Magnetic Resonance Spectroscopy, Computer Neural Networks, X-Ray Computed Tomography
ABSTRACT
BACKGROUND: Assessment and rating of Parkinson's Disease (PD) are commonly based on the medical observation of several clinical manifestations, including the analysis of motor activities. In particular, medical specialists refer to the MDS-UPDRS (Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale), the most widely used clinical scale for PD rating. However, clinical scales rely on the observation of subtle motor phenomena that are either difficult to capture with the human eye or can be misclassified. This limitation has motivated several researchers to develop intelligent systems based on machine learning algorithms able to automatically recognize PD. Nevertheless, most previous studies investigated the classification between healthy subjects and PD patients without considering the automatic rating of different levels of severity. METHODS: In this context, we implemented a simple and low-cost clinical tool that extracts postural and kinematic features with the Microsoft Kinect v2 sensor in order to classify and rate PD. Thirty participants were enrolled for the purposes of the present study: sixteen PD patients rated according to the MDS-UPDRS and fourteen matched healthy subjects. In order to investigate the motor abilities of the upper and lower body, we acquired and analyzed three main motor tasks: (1) gait, (2) finger tapping, and (3) foot tapping. After preliminary feature selection, different classifiers based on Support Vector Machines (SVM) and Artificial Neural Networks (ANN) were trained and evaluated to find the best solution. RESULTS: Concerning the gait analysis, the ANN classifier performed best, reaching 89.4% accuracy with only nine features in diagnosing PD and 95.0% accuracy with only six features in rating PD severity. Regarding the finger and foot tapping analysis, an SVM using the extracted features was able to classify healthy subjects versus PD patients with high performance, reaching 87.1% accuracy. The results of the classification between mild and moderate PD patients indicated that the foot tapping features were the most representative for discriminating between them (81.0% accuracy). CONCLUSIONS: The results of this study show how a low-cost vision-based system can automatically detect subtle phenomena characteristic of PD. Our findings suggest that the proposed tool can support medical specialists in the assessment and rating of PD patients in a real clinical scenario.
Subjects
Cost-Benefit Analysis, Motor Activity/physiology, Parkinson Disease/physiopathology, Severity of Illness Index, Aged, Aged 80 and over, Algorithms, Female, Gait Analysis, Humans, Machine Learning, Male, Middle Aged, Support Vector Machine
ABSTRACT
BACKGROUND: Handwriting impairment represents one of the major symptoms in Parkinson's Disease (PD) patients. Computer-aided analysis of handwriting allows the identification of promising patterns that might be useful in PD detection and rating. In this study, we propose an innovative set of features extracted from geometrical, dynamical, and muscle activation signals acquired during handwriting tasks, and evaluate the contribution of such features to detecting and rating PD by means of artificial neural networks. METHODS: Eleven healthy subjects and twenty-one PD patients were enrolled in this study. Each subject was asked to write three different patterns on a graphics tablet while wearing the Myo Armband, used to collect the muscle activation signals of the main forearm muscles. We then extracted several features related to the written pattern, the movement of the pen, the pressure exerted with the pen, and the muscle activations. The computed features were used to classify healthy subjects versus PD patients and to discriminate mild from moderate PD patients using an artificial neural network (ANN). RESULTS: After training and evaluating different ANN topologies, the results showed that the proposed features are highly relevant to PD detection and rating. In particular, we found that our approach both detects and rates PD (mild versus moderate) with a classification accuracy higher than 90%. CONCLUSIONS: In this paper we investigated the representativeness of a set of proposed handwriting-related features for PD detection and rating. In particular, we used an ANN to classify healthy subjects and PD patients (PD detection) and to classify mild and moderate PD patients (PD rating). The implemented and tested methods showed promising results, as proven by the high levels of accuracy, sensitivity, and specificity. These results suggest the usability of the proposed setup in clinical settings to support medical decisions about Parkinson's Disease.
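Dynamical handwriting features of the kind described are typically derived from the sampled pen trajectory. A minimal sketch of speed-based features (the feature set, names, and sampling interval are illustrative assumptions, not the paper's exact features):

```python
import math

def kinematic_features(xs, ys, dt):
    """Simple kinematic features from a pen trajectory sampled at a
    constant interval dt: mean speed, peak speed, total path length."""
    points = list(zip(xs, ys))
    speeds = [math.hypot(x1 - x0, y1 - y0) / dt
              for (x0, y0), (x1, y1) in zip(points, points[1:])]
    return {"mean_speed": sum(speeds) / len(speeds),
            "peak_speed": max(speeds),
            "path_length": sum(s * dt for s in speeds)}

# Two straight 3-4-5 segments "written" at 100 Hz (dt = 0.01 s).
feats = kinematic_features(xs=[0, 3, 6], ys=[0, 4, 8], dt=0.01)
```

Analogous statistics computed on pen pressure and on the EMG envelope would complete a feature vector like the one fed to the ANN.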
Subjects
Biometry, Handwriting, Parkinson Disease/diagnosis, Parkinson Disease/pathology, Aged, Aged 80 and over, Female, Humans, Male, Middle Aged, Computer Neural Networks
ABSTRACT
As is common routine in tumor resections, surgeons rely on local examination of the removed tissues and on the pathologist's swiftly prepared microscopy findings, based on tissue probes taken intraoperatively. This approach may imply an extended duration of the operation, increased effort for the medical staff, and longer occupancy of the operating room (OR). Mixed reality technologies, and particularly augmented reality, have already been applied in surgical scenarios with positive initial outcomes; nonetheless, these methods have used manual or marker-based registration. In this work, we design an application for marker-less registration of a patient's PET-CT information. The algorithm combines facial landmarks extracted from an RGB video stream with the so-called Spatial Mapping API provided by the Microsoft HoloLens HMD. The accuracy of the system is compared with a marker-based approach, and the opinions of field specialists were collected during a demonstration through a survey designed for this purpose based on the ISO 9241-110 standard. The measurements show an average positioning error along the three axes of (x, y, z) = (3.3 ± 2.3, -4.5 ± 2.9, -9.3 ± 6.1) mm. Compared with the marker-based approach, this represents an increase in positioning error of approximately 3 mm along two dimensions (x, y), which might be due to the absence of explicit markers. The application was positively evaluated by the specialists; they showed interest in continued further work and contributed to the development process with constructive criticism.
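Landmark-based registration, whose residual yields positioning errors like those reported above, amounts to finding the rigid transform that best aligns two point sets. The paper's pipeline is 3D and marker-less; as a simplified analogue, the 2D case has a closed-form least-squares solution:

```python
import math

def rigid_register_2d(src, dst):
    """Closed-form 2D rigid registration (rotation + translation)
    that best aligns point set src onto dst in the least-squares sense."""
    n = len(src)
    cx_s = sum(x for x, _ in src) / n; cy_s = sum(y for _, y in src) / n
    cx_d = sum(x for x, _ in dst) / n; cy_d = sum(y for _, y in dst) / n
    num = den = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cx_s, ys - cy_s   # centered source point
        bx, by = xd - cx_d, yd - cy_d   # centered destination point
        num += ax * by - ay * bx        # accumulated cross products
        den += ax * bx + ay * by        # accumulated dot products
    theta = math.atan2(num, den)        # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)

# Destination landmarks = source landmarks rotated by 90 degrees.
src = [(1, 0), (0, 1), (-1, 0), (0, -1)]
dst = [(0, 1), (-1, 0), (0, -1), (1, 0)]
theta, t = rigid_register_2d(src, dst)
```

In 3D the same idea is usually solved with an SVD (Kabsch algorithm); the per-axis residuals after alignment are what the paper reports as positioning error.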
Subjects
Augmented Reality, Three-Dimensional Imaging/methods, Positron Emission Tomography-Computed Tomography/methods, Computer-Assisted Surgery/methods, Oral Surgery/methods, Algorithms, Humans, Pilot Projects, Reproducibility of Results
ABSTRACT
An interlaboratory comparison (ILC) was organized with the aim of establishing quality control indicators suitable for multicomponent quantitative analysis by nuclear magnetic resonance (NMR) spectroscopy. A total of 36 NMR data sets (corresponding to 1260 NMR spectra) were produced by 30 participants using 34 NMR spectrometers. The calibration line method was chosen for the quantification of a five-component model mixture. Results show that quantitative NMR is a robust quantification tool: 26 out of 36 data sets resulted in statistically equivalent calibration lines for all considered NMR signals. The performance of each laboratory was assessed by means of a new performance index (named Qp-score), which is related to the difference between the experimental and the consensus values of the slope of the calibration lines. Laboratories with a Qp-score falling within the acceptability range are qualified to produce NMR spectra that can be considered statistically equivalent in terms of relative signal intensities. In addition, the specific response of nuclei to the experimental excitation/relaxation conditions was addressed by means of a parameter named NR, which is related to the difference between the theoretical and the consensus slopes of the calibration lines and is specific to each signal produced by a well-defined set of acquisition parameters.
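The calibration line method fits signal intensity against concentration by ordinary least squares, and both the Qp-score and NR are described as functions of the difference between a fitted slope and a reference slope. The abstract does not give the exact formulas, so the relative-deviation index below is only an illustrative stand-in; the slope fit itself is standard:

```python
def slope_intercept(conc, intensity):
    """Ordinary least-squares calibration line: intensity = a*conc + b."""
    n = len(conc)
    mx = sum(conc) / n
    my = sum(intensity) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(conc, intensity))
         / sum((x - mx) ** 2 for x in conc))
    return a, my - a * mx

# One laboratory's four-point calibration vs. a consensus slope of 2.0.
lab_slope, _ = slope_intercept([1, 2, 3, 4], [2.1, 3.9, 6.0, 8.0])
consensus_slope = 2.0
relative_deviation = abs(lab_slope - consensus_slope) / consensus_slope
```

A laboratory whose relative slope deviation stays within an agreed tolerance would, in the spirit of the Qp-score, be considered to produce statistically equivalent relative intensities.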
ABSTRACT
BACKGROUND: Expressed sequences (e.g., ESTs) are a strong source of evidence for improving gene structures and predicting reliable alternative splicing events. When a genome assembly is available, ESTs are suitable for generating gene-oriented clusters through the well-established EasyCluster software. Nowadays, EST-like sequences can be massively produced using Next Generation Sequencing (NGS) technologies. In order to handle genome-scale transcriptome data, we present here EasyCluster2, a reimplementation of EasyCluster able to speed up the creation of gene-oriented clusters and to facilitate downstream analyses such as the assembly of full-length transcripts and the detection of splicing isoforms. RESULTS: EasyCluster2 has been developed to facilitate the genome-based clustering of EST-like sequences generated by the NGS 454 technology. Reads mapped onto the reference genome can be uploaded using the standard GFF3 file format. Alignment parsing is initially performed to produce a first collection of pseudo-clusters by grouping reads according to the overlap of their genomic coordinates on the same strand. EasyCluster2 then refines read grouping by including in each cluster only reads sharing at least one splice site, and optionally performs a Smith-Waterman alignment in the region surrounding splice sites in order to correct potential alignment errors. In addition, EasyCluster2 can include unspliced reads, which generally account for >50% of 454 datasets, and collapses overlapping clusters. Finally, EasyCluster2 can assemble full-length transcripts using a Directed-Acyclic-Graph-based strategy, simplifying the identification of alternative splicing isoforms, thanks also to the implementation of the widespread AStalavista methodology. Accuracy and performance have been tested on real as well as simulated datasets. CONCLUSIONS: EasyCluster2 represents a unique tool for clustering and assembling transcriptome reads produced with the 454 technology, as well as ESTs and full-length transcripts. The clustering procedure is enhanced by the use of genome annotations and unspliced reads. Overall, EasyCluster2 performs an effective detection of splicing isoforms, since it can refine exon-exon junctions and explore alternative splicing without known reference transcripts. Results in GFF3 format can be browsed in the UCSC Genome Browser. EasyCluster2 is therefore a powerful tool for generating reliable clusters for gene expression studies, making the analysis accessible also to researchers not skilled in bioinformatics.
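EasyCluster2 optionally runs a Smith-Waterman alignment around splice sites. For reference, the textbook local-alignment recurrence it relies on looks like this (the scoring parameters are illustrative, not EasyCluster2's defaults):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment: returns the best local score.
    Cells are clamped at 0, so the alignment can start and end anywhere."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

# A shared "ACGTG" core embedded in two otherwise different reads.
score = smith_waterman("TTACGTGG", "CCACGTGA")
```

Because the matrix is clamped at zero, the alignment is local: only the well-matching region around the putative splice junction contributes to the score, which is exactly the behavior needed to verify or correct a junction placement.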
Subjects
Alternative Splicing, Expressed Sequence Tags, Gene Expression Profiling/methods, High-Throughput Nucleotide Sequencing/methods, Software, Algorithms, Cluster Analysis, Exons, Genomics/methods, Humans
ABSTRACT
BACKGROUND AND OBJECTIVE: In Pancreatic Ductal Adenocarcinoma (PDA), multi-omic models are emerging to answer the unmet clinical need for novel quantitative prognostic factors. We developed a pipeline that relies on survival machine-learning (SML) classifiers and on explainability based on patients' follow-up (FU) to stratify prognosis from the publicly available multi-omic datasets of the CPTAC-PDA project. MATERIALS AND METHODS: The analyzed datasets included tumor-annotated radiologic images and clinical and mutational data. Feature selection was based on univariate (UV) and multivariate (MV) survival analyses according to Overall Survival (OS) and recurrence (REC). In this study, we considered seven multi-omic datasets and compared four SML classifiers: Cox, survival random forest, generalized boosted, and support vector machines (SVM). For each classifier, we assessed the concordance (C) index on the validation set. The best classifiers on the validation set for both OS and REC underwent explainability analyses using SurvSHAP(t), which extends SHapley Additive exPlanations (SHAP). RESULTS: According to OS, after UV and MV analyses we selected 18/37 and 10/37 multi-omic features, respectively. According to REC, based on UV and MV analyses we selected 10/35 and 5/35 determinants, respectively. In general, SML classifiers including radiomics outperformed those modelled on clinical or mutational predictors. For OS, the Cox model encompassing radiomic, clinical, and mutational features reached a C index of 75%, outperforming the other classifiers. For REC, the SVM model including only radiomics emerged as the best performing, with a C index of 68%. For OS, SurvSHAP(t) identified the first-order median Gray Level (GL) intensities, gender, tumor grade, the Joint Energy of the GL Co-occurrence Matrix (GLCM), and the GLCM Informational Measure of Correlation of type 1 as the most important features. For REC, the first-order median GL intensities, the GL Size Zone Matrix Small Area Low GL Emphasis, and the first-order variance of GL intensities emerged as the most discriminative. CONCLUSIONS: In this work, radiomics showed the potential to improve patients' risk stratification in PDA. Furthermore, a deeper understanding of how radiomics can contribute to prognosis in PDA was achieved through time-dependent explainability of the top multi-omic predictors.
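The concordance (C) index used to compare the SML classifiers measures how often, among comparable patient pairs, the patient with the higher predicted risk experiences the event first. A pure-Python sketch of Harrell's estimator on toy data:

```python
def concordance_index(times, events, risk):
    """Harrell's C index.
    times:  follow-up time of each patient
    events: 1 = event observed, 0 = censored
    risk:   model-predicted risk score (higher = worse prognosis)"""
    concordant = ties = comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # pair comparable if i had the event before j's follow-up ended
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / comparable

c = concordance_index(times=[2, 4, 6, 8],
                      events=[1, 1, 0, 1],
                      risk=[0.9, 0.6, 0.5, 0.7])
```

A C index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the paper's 75% (OS) and 68% (REC) sit between the two.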
ABSTRACT
BACKGROUND: In Diffuse Large B-Cell Lymphoma (DLBCL), several methodologies are emerging to derive novel biomarkers to be incorporated into risk assessment. We developed a pipeline that relies on autoencoders (AE) and Explainable Artificial Intelligence (XAI) to stratify prognosis and derive a gene-based signature. METHODS: An AE was exploited to learn an unsupervised representation of the gene expression (GE) from three publicly available datasets, each with its own technology. A multi-layer perceptron (MLP) was used to classify prognosis from the latent representation. GE data were preprocessed as normalized, scaled, and standardized. Four different AE architectures (Large, Medium, Small, and Extra Small) were compared to find the most suitable for GE data. The joint AE-MLP classified patients on six different outcomes: overall survival at 12, 36, and 60 months, and progression-free survival (PFS) at 12, 36, and 60 months. XAI techniques were used to derive a gene-based signature aimed at refining the Revised International Prognostic Index (R-IPI) risk, which was validated in a fourth independent publicly available dataset. We named our tool SurvIAE: Survival prediction with Interpretable AE. RESULTS: From the latent space of the AEs, we observed that scaled and standardized data reduced the batch effect. SurvIAE models outperformed R-IPI, with a Matthews Correlation Coefficient of up to 0.42 vs. 0.18 on the validation set (PFS36) and 0.30 vs. 0.19 on the test set (PFS60). We selected SurvIAE-Small-PFS36 as the best model and, from its gene signature, stratified patients into three risk groups: R-IPI Poor patients with high levels of GAB1; R-IPI Poor patients with low levels of GAB1 or R-IPI Good/Very Good patients with low levels of GPR132; and R-IPI Good/Very Good patients with high levels of GPR132. CONCLUSIONS: SurvIAE showed the potential to derive a gene signature with translational purpose in DLBCL. The pipeline has been made publicly available and can be reused for other pathologies.
Subjects
Artificial Intelligence, Diffuse Large B-Cell Lymphoma, Humans, Antineoplastic Combined Chemotherapy Protocols, Diffuse Large B-Cell Lymphoma/genetics, Diffuse Large B-Cell Lymphoma/drug therapy, Prognosis, Gene Expression, Retrospective Studies
ABSTRACT
The identification of EEG biomarkers to discriminate Subjective Cognitive Decline (SCD) from Mild Cognitive Impairment (MCI) is a complex task that requires great clinical effort and expertise. We exploit the self-attention component of the Transformer architecture to obtain physiological explanations of the model's decisions in the discrimination of 56 SCD and 45 MCI patients using resting-state EEG. Specifically, we propose an interpretability workflow leveraging attention scores and time-frequency analysis of EEG epochs through the Continuous Wavelet Transform. In the classification framework, models are trained and validated with 5-fold cross-validation and evaluated on a test set obtained by selecting 20% of the total subjects. Ablation studies and hyperparameter tuning tests were conducted to identify the optimal model configuration. Results show that the best-performing model, which achieves acceptable results in both epoch- and patient-level classification, is capable of finding specific EEG patterns that highlight changes in brain activity between the two conditions. We demonstrate the potential of attention weights as tools to guide experts in understanding which disease-relevant EEG features could discriminate SCD from MCI.
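The attention scores exploited in the interpretability workflow come from the Transformer's scaled dot-product attention. A minimal single-head sketch showing how the inspected weight matrix is obtained (toy 2-dimensional embeddings; the actual model applies learned query/key projections and multiple heads):

```python
import math

def attention_weights(queries, keys):
    """Scaled dot-product attention weights: softmax(Q K^T / sqrt(d)).
    Each output row is a probability distribution over input positions."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)                      # shift for numerical stability
        exp = [math.exp(s - m) for s in scores]
        z = sum(exp)
        weights.append([e / z for e in exp])
    return weights

# Three toy token embeddings; inspect where token 0 "attends".
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = attention_weights(tokens, tokens)
```

Rows of `W` are exactly the kind of per-position scores that, averaged over epochs and mapped back to the time-frequency plane, let an expert see which EEG segments drove a decision.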
Subjects
Cognitive Dysfunction, Electroencephalography, Humans, Electroencephalography/methods, Cognitive Dysfunction/physiopathology, Cognitive Dysfunction/diagnosis, Male, Female, Aged, Computer-Assisted Signal Processing, Middle Aged, Brain/physiopathology, Brain/physiology, Wavelet Analysis, Attention/physiology, Algorithms
ABSTRACT
BACKGROUND: Discovering the molecular targets of compounds, or the cause of physiological conditions, among the multitude of known genes is one of the major challenges of bioinformatics. One of the most common approaches to this problem is finding sets of differentially expressed and, more recently, differentially co-expressed genes. Other approaches require libraries of genetic mutants or the execution of a large number of assays. Another elegant approach is the filtering of mRNA expression profiles using reverse-engineered gene network models of the target cell, which has the advantage of not needing control samples, libraries, or numerous assays. Nevertheless, the implementations of this strategy proposed so far are computationally demanding, and the user has to arbitrarily choose a threshold on the number of potentially relevant genes in the algorithm output. RESULTS: Our solution, while performing comparably to state-of-the-art algorithms in terms of discovered targets, is more efficient in terms of memory and time consumption. The proposed algorithm computes the likelihood associated with each gene and outputs only the list of likely perturbed genes. CONCLUSIONS: The proposed algorithm is a valid alternative to existing algorithms and is particularly suited to contemporary gene expression microarrays, given the number of probe sets in each chip, also when executed on common desktop computers.
Assuntos
Algoritmos , Perfilação da Expressão Gênica/métodos , Software , Expressão Gênica/efeitos dos fármacos , Terapia de Alvo Molecular , Análise de Sequência com Séries de Oligonucleotídeos , Saccharomyces cerevisiae/genética
RESUMO
Background: Artificial neural networks (ANNs) and logistic regression (LR) are the models of choice in many medical data classification tasks. Several published articles have summarized the differences and similarities of these models from a technical point of view and critically assessed their quality. The aim of this study was to compare the ANN and LR statistical techniques in predicting gastrointestinal cancer in an elderly cohort in Southern Italy (ONCONUT study). Method: In 1992, ONCONUT was started with the aim of evaluating the relationship between diet and cancer development in a Southern Italian elderly population. Patients with gastrointestinal cancer (ICD-10 from 150.0 to 159.9) were included in the study (n = 3,545). Results: This cohort was used to train and test the ANN and LR. LR was evaluated separately for macro- and micronutrients, and its accuracy was computed as true positives plus true negatives over the total (97.15%). Then, the ANN was trained and its accuracy evaluated (96.61% for macronutrients and 97.06% for micronutrients). To further investigate the classification capabilities of the ANN, k-fold cross-validation and a genetic algorithm (GA) were used after balancing the dataset among classes. Conclusions: Both LR and the ANN had high accuracy and similar performance. Both models have the potential to be used as clinical decision support tools integrated into clinical practice: in many circumstances the use of a simple LR model is likely to be adequate for real-world needs, but where large amounts of data are available the application of advanced analytic tools such as ANNs may be indicated, with the GA optimizer used to maximize the accuracy of the ANN.
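The evaluation described above rests on two standard ingredients: accuracy computed as true positives plus true negatives over the total, and k-fold cross-validation. A minimal sketch of both follows; this is illustrative code, not the study's implementation.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Samples are shuffled once, then partitioned into k disjoint folds;
    each fold serves as the test set exactly once.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def accuracy(true_pos, true_neg, total):
    """Accuracy as true positives plus true negatives over all cases."""
    return (true_pos + true_neg) / total
```

With counts in the same spirit as the abstract, `accuracy(tp, tn, n)` reproduces the reported proportion-correct metric.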
RESUMO
Objective. This study aims to design and implement the first deep learning (DL) model to classify subjects in the prodromal stages of Alzheimer's disease (AD) based on resting-state electroencephalographic (EEG) signals. Approach. EEG recordings of 17 healthy controls (HCs), 56 subjective cognitive decline (SCD) and 45 mild cognitive impairment (MCI) subjects were acquired at resting state. After preprocessing, we selected sections corresponding to the eyes-closed condition. Five different datasets were created by extracting the delta, theta, alpha, beta and delta-to-theta frequency bands using bandpass filters. To classify SCD vs MCI and HC vs SCD vs MCI, we propose a framework based on the transformer architecture, which uses multi-head attention to focus on the most relevant parts of the input signals. We trained and validated the model on each dataset with a leave-one-subject-out cross-validation approach, splitting the signals into 10 s epochs. Subjects were assigned to the same class as the majority of their epochs. Classification performances of the transformer were assessed for both epochs and subjects and compared with other DL models. Main results. Results showed that the delta dataset allowed our model to achieve the best performances for the discrimination of SCD and MCI, reaching an Area Under the ROC Curve (AUC) of 0.807, while the highest results for the HC vs SCD vs MCI classification were obtained on alpha and theta, with a micro-AUC higher than 0.74. Significance. We demonstrated that DL approaches can support the adoption of non-invasive and economical techniques such as EEG to stratify patients in the clinical population at risk for AD. This result was achieved because the attention mechanism was able to learn temporal dependencies of the signal, focusing on the most discriminative patterns and achieving state-of-the-art results with a deep model of reduced complexity.
Our results were consistent with clinical evidence that changes in brain activity are progressive when considering the early stages of AD.
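Two steps of the pipeline above lend themselves to a short sketch: splitting a recording into fixed-length 10 s epochs, and assigning each subject the class predicted for the majority of its epochs. The code below is an illustrative reconstruction, not the authors' implementation; the sampling rate and labels are assumptions.

```python
from collections import Counter

def split_into_epochs(signal, fs, epoch_seconds=10):
    """Split a 1-D signal sampled at fs Hz into fixed-length epochs,
    discarding any trailing partial epoch."""
    step = fs * epoch_seconds
    return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]

def subject_label(epoch_predictions):
    """Assign a subject the class predicted for the majority of its epochs.

    epoch_predictions: list of per-epoch class labels (e.g. 'SCD', 'MCI').
    """
    return Counter(epoch_predictions).most_common(1)[0][0]
```

Majority voting over epochs is what turns the per-epoch classifier outputs into the subject-level decisions on which the reported AUCs are computed.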
Assuntos
Doença de Alzheimer , Disfunção Cognitiva , Aprendizado Profundo , Humanos , Eletroencefalografia/métodos , Doença de Alzheimer/diagnóstico , Disfunção Cognitiva/diagnóstico
RESUMO
The segmentation and classification of cell nuclei are pivotal steps in pipelines for the analysis of bioimages. Deep learning (DL) approaches are leading the digital pathology field in the context of nuclei detection and classification. Nevertheless, the features that DL models exploit to make their predictions are difficult to interpret, hindering the deployment of such methods in clinical practice. On the other hand, pathomic features can be linked to an easier description of the characteristics exploited by the classifiers for making the final predictions. Thus, in this work, we developed an explainable computer-aided diagnosis (CAD) system that can be used to support pathologists in the evaluation of tumor cellularity in breast histopathological slides. In particular, we compared an end-to-end DL approach that exploits the Mask R-CNN instance segmentation architecture with a two-stage pipeline, where features are extracted from the morphological and textural characteristics of the cell nuclei. Classifiers based on support vector machines and artificial neural networks are trained on top of these features to discriminate between tumor and non-tumor nuclei. Afterwards, the SHAP (Shapley additive explanations) explainable artificial intelligence technique was employed to perform a feature importance analysis, leading to an understanding of the features the machine learning models process when making their decisions. An expert pathologist validated the employed feature set, corroborating the clinical usability of the model. Even though the models resulting from the two-stage pipeline are slightly less accurate than those of the end-to-end approach, the interpretability of their features is clearer and may help pathologists build the trust needed to adopt artificial intelligence-based CAD systems in their clinical workflow.
To further show the validity of the proposed approach, it has been tested on an external validation dataset, which was collected from IRCCS Istituto Tumori "Giovanni Paolo II" and made publicly available to ease research concerning the quantification of tumor cellularity.
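For linear models, SHAP values have a closed form under the assumption of independent features: phi_i = w_i * (x_i - E[x_i]). The sketch below shows how such values could be computed and used to rank features by importance, in the spirit of the feature importance analysis described above; it is illustrative only, and the feature names in the usage example are invented.

```python
def linear_shap(weights, x, background_means):
    """Exact SHAP values for a linear model f(x) = b + sum_i w_i * x_i,
    assuming independent features: phi_i = w_i * (x_i - E[x_i])."""
    return [w * (xi - mu) for w, xi, mu in zip(weights, x, background_means)]

def rank_features(names, shap_values):
    """Order feature names by absolute SHAP contribution, largest first."""
    pairs = sorted(zip(names, shap_values), key=lambda p: -abs(p[1]))
    return [name for name, _ in pairs]
```

For non-linear classifiers such as the SVMs and ANNs used in the paper, SHAP values are instead approximated by model-agnostic estimators, but the interpretation, each feature's additive contribution relative to a background expectation, is the same.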
RESUMO
The complex pathobiology of lung cancer, and its spread worldwide, has prompted research studies that combine radiomic and genomic approaches. Indeed, the early identification of genetic alterations and driver mutations affecting the tumor is fundamental for correctly formulating the prognosis and therapeutic response. In this work, we propose a radiogenomic workflow to detect the presence of KRAS and EGFR mutations using radiomic features extracted from computed tomography images of patients affected by lung adenocarcinoma. To this end, we investigated several feature selection algorithms to identify the most significant and uncorrelated sets of radiomic features, and different classification models to reveal the mutational status. Then, we employed the SHAP (SHapley Additive exPlanations) technique to clarify the contribution of specific radiomic features to the identification of the investigated mutations. Two cohorts of patients with lung adenocarcinoma were used for the study. The first, obtained from The Cancer Imaging Archive (TCIA), consisted of 60 cases (25% EGFR, 23% KRAS); the second, provided by the Azienda Ospedaliero-Universitaria 'Ospedali Riuniti' of Foggia, comprised 55 cases (16% EGFR, 28% KRAS). The best-performing models proposed in our study achieved an AUC of 0.69 and 0.82 on the validation set for predicting the mutational status of EGFR and KRAS, respectively. The Multi-layer Perceptron emerged as the top-performing model for both oncogenes, in some cases outperforming the state of the art. This study showed that radiomic features can be associated with EGFR and KRAS mutational status in patients with lung adenocarcinoma.
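The AUC values reported above can be computed directly from ranks via the Mann-Whitney statistic: the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counting one half. A minimal sketch:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic.

    labels: 1 for positive (mutated), 0 for negative (wild-type).
    scores: classifier scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0    # positive ranked above negative
            elif p == n:
                wins += 0.5    # ties count one half
    return wins / (len(pos) * len(neg))
```

This quadratic-time form is fine for small validation sets like the cohorts above; for large datasets a rank-sum implementation is preferable.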