ABSTRACT
Operating devices with control panels and touch screens assisted by haptic feedback is important in mobile environments such as automobiles and electric power wheelchairs. Giving accurate haptic feedback requires careful consideration; in particular, presenting clear touch feedback to the elderly and to people with reduced sensation is a critical issue from healthcare and safety perspectives. In this study, we aimed to identify the perceptual characteristics for the frequency and direction of haptic vibration on a touch screen under vehicle-driving vibration and to propose an efficient haptic system based on these characteristics. We demonstrated that the detection threshold shift decreased at frequencies above 210 Hz due to the contact pressure during active touch, but increased below 210 Hz. We found that the detection thresholds were 0.30-0.45 g (peak), with similar sensitivity across the 80-270 Hz range. The haptic system implemented to reflect the experimental results achieved characteristics suitable for use scenarios in automobiles. Ultimately, these findings could provide practical guidelines for developing touch screens that give accurate touch feedback in real-world environments.
Subjects
Sensory Feedback , Feedback , User-Computer Interface , Aged , Equipment Design , Humans , Physical Stimulation , Vibration
ABSTRACT
Functional connectivity networks provide novel insights into how distributed brain regions are functionally integrated, and their deviations from the healthy brain have recently been employed to identify biomarkers for neuropsychiatric disorders. However, most brain network analysis methods use features extracted from only one functional connectivity network for brain disease detection and cannot provide a comprehensive representation of the subtle disruptions of brain functional organization induced by neuropsychiatric disorders. Inspired by the principles of multi-view learning, which uses information from multiple views to enhance object representation, we propose a novel multiple-network-based framework to enhance the representation of functional connectivity networks by fusing the common and complementary information conveyed in multiple networks. Specifically, four functional connectivity networks corresponding to four adjacent values of the regularization parameter are generated via a sparse regression model with a group constraint (l2,1-norm), to enhance the common intrinsic topological structure and limit the error rate caused by different views. To obtain a set of more meaningful and discriminative features, we propose using a modified version of weighted clustering coefficients to quantify the subtle differences of each group-sparse network at the local level. We then linearly fuse the selected features from each individual network via a multi-kernel support vector machine for autism spectrum disorder (ASD) diagnosis. The proposed framework achieves an accuracy of 79.35%, outperforming all the compared single-network methods by at least 7%. Moreover, compared with other multiple-network methods, our method also achieves the best performance, with at least an 11% improvement in accuracy.
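As a small illustration of the group constraint mentioned in this abstract, the l2,1-norm of a matrix sums the Euclidean norms of its rows, which pushes whole rows toward zero together. The sketch below only computes that norm. The function name and the toy matrix are illustrative assumptions; how rows correspond to connections or subjects in the actual framework is not stated in the abstract.

```python
import math


def l21_norm(matrix):
    """Return the l2,1-norm: the sum of the Euclidean norms of the rows."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


# Toy matrix with hypothetical values; the middle row is entirely zero,
# so it contributes nothing to the norm.
W = [[3.0, 4.0], [0.0, 0.0], [0.0, 5.0]]
print(l21_norm(W))
```

Because the penalty is applied per row rather than per entry, a row is either kept as a group or suppressed as a group, which is the property a group-sparse model relies on.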
Subjects
Autism Spectrum Disorder/diagnostic imaging , Brain Mapping/methods , Brain/diagnostic imaging , Computer-Assisted Image Interpretation/methods , Neural Pathways/diagnostic imaging , Autism Spectrum Disorder/physiopathology , Brain/physiopathology , Child , Female , Humans , Magnetic Resonance Imaging/methods , Male , Neural Pathways/physiopathology , Support Vector Machine
ABSTRACT
Sparse representation-based brain functional network modeling often results in large inter-subject variability in the network structure. This could reduce the statistical power in group comparison, or even deteriorate the generalization capability of the individualized diagnosis of brain diseases. Although group sparse representation (GSR) can alleviate such a limitation by increasing network similarity across subjects, it could, in turn, fail in providing satisfactory separability between the subjects from different groups (e.g., patients vs. controls). In this study, we propose to integrate individual functional connectivity (FC) information into the GSR-based network construction framework to achieve higher between-group separability while maintaining the merit of within-group consistency. Our method was based on an observation that the subjects from the same group have generally more similar FC patterns than those from different groups. To this end, we propose our new method, namely "strength and similarity guided GSR (SSGSR)", which exploits both BOLD signal temporal correlation-based "low-order" FC (LOFC) and inter-subject LOFC-profile similarity-based "high-order" FC (HOFC) as two priors to jointly guide the GSR-based network modeling. Extensive experimental comparisons are carried out, with the rs-fMRI data from mild cognitive impairment (MCI) subjects and healthy controls, between the proposed algorithm and other state-of-the-art brain network modeling approaches. Individualized MCI identification results show that our method could achieve a balance between the individually consistent brain functional network construction and the adequately maintained inter-group brain functional network distinctions, thus leading to a more accurate classification result. Our method also provides a promising and generalized solution for the future connectome-based individualized diagnosis of brain disease.
ABSTRACT
Brain functional networks (BFNs) constructed from resting-state functional magnetic resonance imaging (rs-fMRI) have been widely applied to the analysis and diagnosis of brain diseases, such as Alzheimer's disease and its prodrome, namely mild cognitive impairment (MCI). Constructing a meaningful brain network based on, for example, sparse representation (SR) is the most essential step prior to the subsequent analysis or disease identification. However, the independent coding process of SR fails to capture the intrinsic locality and similarity characteristics in the data. To address this problem, we propose a novel weighted graph (Laplacian) regularized SR framework, based on which BFN can be optimized by considering both intrinsic correlation similarity and local manifold structure in the data, as well as sparsity prior of the brain connectivity. Additionally, the non-convergence of the graph Laplacian in the self-representation model has been solved properly. Combined with a pipeline of sparse feature selection and classification, the effectiveness of our proposed method is demonstrated by identifying MCI based on the constructed BFNs.
ABSTRACT
Despite countless studies on autism spectrum disorder (ASD), diagnosis relies on specific behavioral criteria and neuroimaging biomarkers for the disorder are still relatively scarce and irrelevant for diagnostic workup. Many researchers have focused on functional networks of brain activities using resting-state functional magnetic resonance imaging (rsfMRI) to diagnose brain diseases, including ASD. Although some existing methods are able to reveal the abnormalities in functional networks, they are either highly dependent on prior assumptions for modeling these networks or do not focus on latent functional connectivities (FCs) by considering discriminative relations among FCs in a nonlinear way. In this article, we propose a novel framework to model multiple networks of rsfMRI with data-driven approaches. Specifically, we construct large-scale functional networks with hierarchical clustering and find discriminative connectivity patterns between ASD and normal controls (NC). We then learn features and classifiers for each cluster through discriminative restricted Boltzmann machines (DRBMs). In the testing phase, each DRBM determines whether a test sample is ASD or NC, based on which we make a final decision with a majority voting strategy. We assess the diagnostic performance of the proposed method using public datasets and describe the effectiveness of our method by comparing it to competing methods. We also rigorously analyze FCs learned by DRBMs on each cluster and discover dominant FCs that play a major role in discriminating between ASD and NC. Hum Brain Mapp 38:5804-5821, 2017. © 2017 Wiley Periodicals, Inc.
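The final decision step described in this abstract, where each per-cluster classifier casts a vote and the majority wins, can be sketched as follows. This is a minimal illustration only: the function name and the label strings are assumptions, and the DRBM classifiers themselves are not reproduced.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers.

    On a tie, the label that appeared first among the tied ones is returned,
    so an odd number of voters is the safer choice for two classes.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical outputs of three per-cluster classifiers for one test sample
print(majority_vote(["ASD", "NC", "ASD"]))
```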
Subjects
Autism Spectrum Disorder/diagnostic imaging , Autism Spectrum Disorder/physiopathology , Brain Mapping/methods , Brain/diagnostic imaging , Brain/physiopathology , Magnetic Resonance Imaging/methods , Autism Spectrum Disorder/classification , Cluster Analysis , Humans , Machine Learning , Neural Pathways/diagnostic imaging , Neural Pathways/physiopathology , Rest , Sensitivity and Specificity , Young Adult
ABSTRACT
Brain functional connectivity (FC) extracted from resting-state fMRI (RS-fMRI) has become a popular approach for diagnosing various neurodegenerative diseases, including Alzheimer's disease (AD) and its prodromal stage, mild cognitive impairment (MCI). Current studies mainly construct the FC networks between grey matter (GM) regions of the brain based on temporal co-variations of the blood oxygenation level-dependent (BOLD) signals, which reflects the synchronized neural activities. However, it was rarely investigated whether the FC detected within the white matter (WM) could provide useful information for diagnosis. Motivated by the recently proposed functional correlation tensors (FCT) computed from RS-fMRI and used to characterize the structured pattern of local FC in the WM, we propose in this article a novel MCI classification method based on the information conveyed by both the FC between the GM regions and that within the WM regions. Specifically, in the WM, the tensor-based metrics (e.g., fractional anisotropy [FA], similar to the metric calculated based on diffusion tensor imaging [DTI]) are first calculated based on the FCT and then summarized along each of the major WM fiber tracts connecting each pair of the brain GM regions. This could capture the functional information in the WM, in a similar network structure as the FC network constructed for the GM, based only on the same RS-fMRI data. Moreover, a sliding window approach is further used to partition the voxel-wise BOLD signal into multiple short overlapping segments. Then, both the FC and FCT between each pair of the brain regions can be calculated based on the BOLD signal segments in the GM and WM, respectively. In such a way, our method can generate dynamic FC and dynamic FCT to better capture functional information in both GM and WM and further integrate them together by using our developed feature extraction, selection, and ensemble learning algorithms. 
The experimental results verify that the dynamic FCT can provide valuable functional information in the WM; by combining it with the dynamic FC in the GM, the diagnosis accuracy for MCI subjects can be significantly improved even using RS-fMRI data alone. Hum Brain Mapp 38:5019-5034, 2017. © 2017 Wiley Periodicals, Inc.
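The sliding window step in this abstract can be sketched for the grey matter part alone: split each regional signal into short overlapping segments and compute one correlation matrix per segment. This is a hedged illustration that assumes Pearson correlation as the FC measure; the function names, window width, and step are illustrative, and the tensor-based FCT computation for white matter is not covered.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)


def sliding_windows(n_timepoints, width, step):
    """Return (start, stop) index pairs of overlapping windows."""
    return [(s, s + width) for s in range(0, n_timepoints - width + 1, step)]


def dynamic_fc(signals, width, step):
    """Return one region-by-region correlation matrix per window."""
    matrices = []
    for start, stop in sliding_windows(len(signals[0]), width, step):
        segments = [s[start:stop] for s in signals]
        matrices.append([[pearson(a, b) for b in segments] for a in segments])
    return matrices


# Two hypothetical regional signals; the second is a scaled copy of the first
signals = [[1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12]]
print(len(dynamic_fc(signals, width=3, step=1)))
```

A step smaller than the width gives overlapping segments, matching the "short overlapping segments" wording above.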
Subjects
Brain/diagnostic imaging , Cognitive Dysfunction/classification , Cognitive Dysfunction/diagnostic imaging , Computer-Assisted Diagnosis/methods , Gray Matter/diagnostic imaging , White Matter/diagnostic imaging , Brain/physiopathology , Brain Mapping , Cerebrovascular Circulation/physiology , Cognitive Dysfunction/physiopathology , Diffusion Tensor Imaging , Gray Matter/physiopathology , Humans , Machine Learning , Magnetic Resonance Imaging/methods , Neural Pathways/diagnostic imaging , Neural Pathways/physiopathology , Oxygen/blood , Sensitivity and Specificity , White Matter/physiopathology
ABSTRACT
Emotions can be aroused by various kinds of stimulus modalities. Recent neuroimaging studies indicate that several brain regions represent emotions at an abstract level, i.e., independently from the sensory cues from which they are perceived (e.g., face, body, or voice stimuli). If emotions are indeed represented at such an abstract level, then these abstract representations should also be activated by the memory of an emotional event. We tested this hypothesis by asking human participants to learn associations between emotional stimuli (videos of faces or bodies) and non-emotional stimuli (fractals). After successful learning, fMRI signals were recorded during the presentations of emotional stimuli and emotion-associated fractals. We tested whether emotions could be decoded from fMRI signals evoked by the fractal stimuli using a classifier trained on the responses to the emotional stimuli (and vice versa). This was implemented as a whole-brain searchlight, multivoxel activation pattern analysis, which revealed successful emotion decoding in four brain regions: posterior cingulate cortex (PCC), precuneus, MPFC, and angular gyrus. The same analysis run only on responses to emotional stimuli revealed clusters in PCC, precuneus, and MPFC. Multidimensional scaling analysis of the activation patterns revealed clear clustering of responses by emotion across stimulus types. Our results suggest that PCC, precuneus, and MPFC contain representations of emotions that can be evoked by stimuli that carry emotional information themselves or by stimuli that evoke memories of emotional stimuli, while angular gyrus is more likely to take part in emotional memory retrieval.
Subjects
Association Learning , Brain Mapping , Brain/physiology , Concept Formation/physiology , Emotions/physiology , Adult , Analysis of Variance , Brain/blood supply , Facial Expression , Female , Humans , Three-Dimensional Imaging , Magnetic Resonance Imaging , Male , Movement/physiology , Oxygen/blood , Photic Stimulation , Young Adult
ABSTRACT
Studies on resting-state functional Magnetic Resonance Imaging (rs-fMRI) have shown that different brain regions still actively interact with each other while a subject is at rest, and such functional interaction is not stationary but changes over time. In terms of a large-scale brain network, in this paper, we focus on time-varying patterns of functional networks, i.e., functional dynamics, inherent in rs-fMRI, which is one of the emerging issues along with the network modelling. Specifically, we propose a novel methodological architecture that combines deep learning and state-space modelling, and apply it to rs-fMRI based Mild Cognitive Impairment (MCI) diagnosis. We first devise a Deep Auto-Encoder (DAE) to discover hierarchical non-linear functional relations among regions, by which we transform the regional features into an embedding space, whose bases are complex functional networks. Given the embedded functional features, we then use a Hidden Markov Model (HMM) to estimate dynamic characteristics of functional networks inherent in rs-fMRI via internal states, which are unobservable but can be inferred from observations statistically. By building a generative model with an HMM, we estimate the likelihood of the input features of rs-fMRI as belonging to the corresponding status, i.e., MCI or normal healthy control, based on which we identify the clinical label of a testing subject. In order to validate the effectiveness of the proposed method, we performed experiments on two different datasets and compared with state-of-the-art methods in the literature. We also analyzed the functional networks learned by DAE, estimated the functional connectivities by decoding hidden states in HMM, and investigated the estimated functional connectivities by means of a graph-theoretic approach.
Subjects
Brain/physiology , Cognitive Dysfunction/diagnostic imaging , Machine Learning , Magnetic Resonance Imaging/methods , Neurological Models , Aged , Female , Humans , Computer-Assisted Image Interpretation , Male , Nerve Net , Rest/physiology
ABSTRACT
We propose a nonlinear dynamic model for invasive electroencephalogram analysis that learns the optimal parameters of the neural population model via the Levenberg-Marquardt algorithm. We introduce crucial windows in which the estimated parameters present patterns before seizure onset. The optimal parameters minimize the error between the observed signal and the signal generated by the model. The proposed approach effectively discriminates between healthy signals and epileptic seizure signals. We evaluate the proposed method using an electroencephalogram dataset with normal and epileptic seizure sequences. The empirical results show that the parameters exhibit patterns as a seizure approaches and that the method is efficient in analyzing nonlinear epilepsy electroencephalogram data. The accuracy of estimating the optimal parameters is improved by using the nonlinear dynamic model.
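To make the parameter-fitting idea concrete, the sketch below runs a scalar Levenberg-Marquardt loop that minimizes the squared error between an observed signal and a model-generated one. It is only an illustration: the one-parameter decay model y = exp(-a t), the function name, the starting value, and the damping schedule are all assumptions standing in for the neural population model, which the abstract does not specify.

```python
import math


def fit_decay_rate(times, observed, a0=2.0, damping=1e-2, n_iter=100):
    """Fit y = exp(-a * t) to data with a scalar Levenberg-Marquardt loop."""

    def cost(a):
        return sum((y - math.exp(-a * t)) ** 2 for t, y in zip(times, observed))

    a = a0
    current = cost(a)
    for _ in range(n_iter):
        # Residuals and the derivative of the model output with respect to a
        residuals = [y - math.exp(-a * t) for t, y in zip(times, observed)]
        jac = [-t * math.exp(-a * t) for t in times]
        jtj = sum(j * j for j in jac)
        jtr = sum(j * r for j, r in zip(jac, residuals))
        if jtj == 0.0:
            break
        # Damped Gauss-Newton step (Marquardt scaling of the curvature term)
        candidate = a + jtr / (jtj * (1.0 + damping))
        new_cost = cost(candidate)
        if new_cost < current:
            # Step accepted: trust the local quadratic model more
            a, current = candidate, new_cost
            damping /= 10.0
        else:
            # Step rejected: fall back toward small gradient-like steps
            damping *= 10.0
    return a


# Noise-free synthetic signal generated with a hypothetical true rate of 0.5
times = [0.5 * i for i in range(9)]
observed = [math.exp(-0.5 * t) for t in times]
print(round(fit_decay_rate(times, observed), 6))
```

The accept/reject rule on the damping factor is what distinguishes this from a plain Gauss-Newton update: a poor starting value causes a few rejected steps instead of divergence.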
Subjects
Brain/diagnostic imaging , Electroencephalography/methods , Epilepsy/diagnostic imaging , Nonlinear Dynamics , Automated Pattern Recognition/methods , Computer-Assisted Signal Processing , Algorithms , Brain/physiopathology , Brain/surgery , Datasets as Topic , Implanted Electrodes , Epilepsy/physiopathology , Epilepsy/surgery , Humans , Seizures/diagnostic imaging , Seizures/physiopathology , Seizures/surgery
ABSTRACT
Despite intensive efforts for decades, deformable image registration is still a challenging problem due to the potential large anatomical differences across individual images, which limits the registration performance. Fortunately, this issue could be alleviated if a good initial deformation can be provided for the two images under registration, which are often termed as the moving subject and the fixed template, respectively. In this work, we present a novel patch-based initial deformation prediction framework for improving the performance of existing registration algorithms. Our main idea is to estimate the initial deformation between subject and template in a patch-wise fashion by using the sparse representation technique. We argue that two image patches should follow the same deformation toward the template image if their patch-wise appearance patterns are similar. To this end, our framework consists of two stages, i.e., the training stage and the application stage. In the training stage, we register all training images to the pre-selected template, such that the deformation of each training image with respect to the template is known. In the application stage, we apply the following four steps to efficiently calculate the initial deformation field for the new test subject: (1) We pick a small number of key points in the distinctive regions of the test subject; (2) for each key point, we extract a local patch and form a coupled appearance-deformation dictionary from training images where each dictionary atom consists of the image intensity patch as well as their respective local deformations; (3) a small set of training image patches in the coupled dictionary are selected to represent the image patch of each subject key point by sparse representation. 
Then, we can predict the initial deformation for each subject key point by propagating the pre-estimated deformations on the selected training patches with the same sparse representation coefficients; and (4) we employ thin-plate splines (TPS) to interpolate a dense initial deformation field by considering all key points as the control points. Thus, the conventional image registration problem becomes much easier in the sense that we only need to compute the remaining small deformation for completing the registration of the subject to the template. Experimental results on both simulated and real data show that the registration performance can be significantly improved after integrating our patch-based deformation prediction framework into the existing registration algorithms.
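Step (3) above, propagating the known training deformations with the same sparse coefficients that represent the test patch, amounts to a weighted sum. The sketch below shows only that propagation; the function name and the toy values are illustrative assumptions, and solving the sparse coding problem that produces the coefficients is not shown.

```python
def propagate_deformation(coefficients, training_deformations):
    """Predict a key point's deformation as a coefficient-weighted sum.

    coefficients: one sparse-representation weight per dictionary atom.
    training_deformations: one displacement vector per dictionary atom.
    """
    dim = len(training_deformations[0])
    predicted = [0.0] * dim
    for c, d in zip(coefficients, training_deformations):
        if c == 0.0:
            continue  # atoms not selected by the sparse code contribute nothing
        for k in range(dim):
            predicted[k] += c * d[k]
    return predicted


# Hypothetical sparse code selecting atoms 0 and 2 out of three
coeffs = [0.75, 0.0, 0.25]
deformations = [[2.0, 0.0, 0.0], [9.0, 9.0, 9.0], [0.0, 4.0, 0.0]]
print(propagate_deformation(coeffs, deformations))
```

Only the few atoms with nonzero coefficients influence the prediction, which is why the deformation of an unselected training patch (the second atom here) has no effect.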
Subjects
Algorithms , Brain/anatomy & histology , Computer-Assisted Image Interpretation/methods , Neuroimaging/methods , Computer Simulation , Humans
ABSTRACT
CpG islands are GC-rich regions often located in the 5' end of genes and normally protected from cytosine methylation in mammals. The important role of CpG islands in gene transcription strongly suggests evolutionary conservation in the mammalian genome. However, as CpG dinucleotides are over-represented in CpG islands, comparative CpG island analysis using conventional sequence analysis techniques remains a major challenge in the epigenetics field. In this study, we conducted a comparative analysis of all CpG island sequences in 10 mammalian genomes. As sequence similarity methods and character composition techniques such as information theory are particularly difficult to conduct, we used exact patterns in CpG island sequences and single character discrepancies to identify differences in CpG island sequences. First, by calculating genome distance based on rank correlation tests, we show that k-mer and k-flank patterns around CpG sites can be used to correctly reconstruct the phylogeny of 10 mammalian genomes. Further, we used various machine learning algorithms to demonstrate that CpG islands sequences can be characterized using k-mers. In addition, by testing a human model on the nine different mammalian genomes, we provide the first evidence that k-mer signatures are consistent with evolutionary history.
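The k-mer and rank-correlation idea in this abstract can be sketched as follows: count exact k-mers in each sequence set, rank the counts, and turn a rank correlation into a distance. This is a hedged illustration that assumes Spearman correlation, since the abstract says only "rank correlation tests"; the function names and toy sequences are likewise illustrative, and k-flank patterns are not covered.

```python
import math
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_distance(counts_a, counts_b):
    """Return 1 - Spearman correlation over the union of observed k-mers."""
    keys = sorted(set(counts_a) | set(counts_b))
    ra = average_ranks([counts_a.get(key, 0) for key in keys])
    rb = average_ranks([counts_b.get(key, 0) for key in keys])
    n = len(keys)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = math.sqrt(sum((x - ma) ** 2 for x in ra))
    sb = math.sqrt(sum((y - mb) ** 2 for y in rb))
    if sa == 0.0 or sb == 0.0:
        return 1.0  # correlation undefined; treated here as uncorrelated
    return 1.0 - cov / (sa * sb)


a = kmer_counts("ACGCG", 2)
b = kmer_counts("ACGCGCG", 2)
print(a, spearman_distance(a, b))
```

A matrix of such pairwise distances between genomes is the kind of input a standard tree-building method would take.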
Subjects
CpG Islands , Molecular Evolution , Mammals/genetics , Algorithms , Animals , Artificial Intelligence , Genomics/methods , Humans , Mammals/classification , Phylogeny , DNA Sequence Analysis
ABSTRACT
Over the last decade, it has been shown that neuroimaging can be a potential tool for the diagnosis of Alzheimer's Disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), and that fusion of different modalities can further provide complementary information to enhance diagnostic accuracy. Here, we focus on the problems of both feature representation and fusion of multimodal information from Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET). To the best of our knowledge, previous methods in the literature mostly used hand-crafted features such as cortical thickness, gray matter densities from MRI, or voxel intensities from PET, and then combined these multimodal features by simply concatenating them into a long vector or transforming them into a higher-dimensional kernel space. In this paper, we propose a novel method for a high-level latent and shared feature representation from neuroimaging modalities via deep learning. Specifically, we use a Deep Boltzmann Machine (DBM), a deep network with a restricted Boltzmann machine as a building block, to find a latent hierarchical feature representation from a 3D patch, and then devise a systematic method for a joint feature representation from the paired patches of MRI and PET with a multimodal DBM. To validate the effectiveness of the proposed method, we performed experiments on the ADNI dataset and compared it with state-of-the-art methods. In three binary classification problems of AD vs. healthy Normal Control (NC), MCI vs. NC, and MCI converter vs. MCI non-converter, we obtained maximal accuracies of 95.35%, 85.67%, and 74.58%, respectively, outperforming the competing methods. By visual inspection of the trained model, we observed that the proposed method could hierarchically discover the complex latent patterns inherent in both MRI and PET.
Subjects
Alzheimer Disease/diagnosis , Artificial Intelligence , Cognitive Dysfunction/diagnosis , Computer-Assisted Image Processing/methods , Magnetic Resonance Imaging/methods , Multimodal Imaging/methods , Positron-Emission Tomography/methods , Aged , Aged 80 and over , Female , Humans , Male , Middle Aged , Computer Neural Networks
ABSTRACT
With the remarkable success of deep neural networks, there is a growing interest in research aimed at providing clear interpretations of their decision-making processes. In this paper, we introduce Attribution Equilibrium, a novel method to decompose output predictions into fine-grained attributions, balancing positive and negative relevance for clearer visualization of the evidence behind a network decision. We carefully analyze conventional approaches to decision explanation and present a different perspective on the conservation of evidence. We define the evidence as a gap between positive and negative influences among gradient-derived initial contribution maps. Then, we incorporate antagonistic elements and a user-defined criterion for the degree of positive attribution during propagation. Additionally, we consider the role of inactivated neurons in the propagation rule, thereby enhancing the discernment of less relevant elements such as the background. We conduct various assessments in a verified experimental environment with PASCAL VOC 2007, MS COCO 2014, and ImageNet datasets. The results demonstrate that our method outperforms existing attribution methods both qualitatively and quantitatively in identifying the key input features that influence model decisions.
ABSTRACT
Recent temporal action detection models have focused on end-to-end trainable approaches to utilize the representational power of backbone networks. Despite the advantages of end-to-end trainable methods, these models still employ a small spatial resolution (e.g., 96 × 96) due to the inefficient trade-off between computational cost and spatial resolution. In this study, we argue that a simple pooling method (e.g., adaptive average pooling) acts as a bottleneck at the spatial aggregation part, restricting representational power. To address this issue, we propose a temporal-wise spatial attentive pooling (TSAP), which alleviates the bottleneck between the backbone and the detection head using a temporal-wise attention mechanism. Our approach mitigates the inefficient trade-off between spatial resolution and computational cost, thereby enhancing spatial scalability in temporal action detection. Moreover, TSAP is adaptable to previous end-to-end approaches by simply replacing the spatial pooling part. Our experiments demonstrated the essential role of spatial aggregation, and consistent improvements are observed by incorporating TSAP into previous end-to-end methods.
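The contrast drawn in this abstract between simple average pooling and attention-based spatial pooling can be illustrated for a single frame. The sketch below is not the TSAP module itself, whose design the abstract does not detail; it assumes one scalar attention score per spatial position, and all names and values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool(features):
    """Uniform mean over spatial positions, one value per channel."""
    n = len(features)
    return [sum(f[c] for f in features) / n for c in range(len(features[0]))]


def attentive_pool(features, scores):
    """Attention-weighted mean over spatial positions, one value per channel."""
    weights = softmax(scores)
    channels = len(features[0])
    return [sum(w * f[c] for w, f in zip(weights, features)) for c in range(channels)]


# One frame with two spatial positions and two channels (hypothetical values)
frame = [[1.0, 2.0], [3.0, 4.0]]
print(average_pool(frame))
print(attentive_pool(frame, [0.0, 0.0]))  # equal scores reduce to the average
print(attentive_pool(frame, [0.0, 5.0]))  # a high score favors the second position
```

Applying such pooling separately at each time step gives a per-frame spatial summary, which is one plausible reading of "temporal-wise" here.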
Subjects
Attention , Attention/physiology , Humans , Computer Neural Networks , Time Factors , Space Perception/physiology
ABSTRACT
Although 3D human pose estimation has recently made strides, it is still difficult to precisely recreate a 3D human posture from a single image without the aid of 3D annotation for the following reasons. Firstly, the process of reconstruction inherently suffers from ambiguity, as multiple 3D poses can be projected onto the same 2D pose. Secondly, accurately measuring camera rotation without laborious camera calibration is a difficult task. While some approaches attempt to address these issues using traditional computer vision algorithms, they are not differentiable and cannot be optimized through training. This paper introduces two modules that explicitly leverage geometry to overcome these challenges, without requiring any 3D ground-truth or camera parameters. The first module, known as the relative depth estimation module, effectively mitigates depth ambiguity by narrowing down the possible depths for each joint to only two candidates. The second module, referred to as the differentiable pose alignment module, calculates camera rotation by aligning poses from different views. The use of these geometrically interpretable modules reduces the complexity of training and yields superior performance. By adopting our proposed method, we achieve state-of-the-art results on standard benchmark datasets, surpassing other self-supervised methods and even outperforming several fully-supervised approaches that heavily rely on 3D annotations.
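One geometric way a joint's depth can be narrowed to two candidates is sketched below. It rests on assumptions that are not taken from the abstract: an orthographic projection and a known bone length. Under them, the 2D offset between a parent and child joint fixes the magnitude of their depth difference but not its sign. The function name and values are hypothetical, and the paper's relative depth estimation module may work differently.

```python
import math


def relative_depth_candidates(parent_xy, child_xy, bone_length):
    """Return the two possible depth offsets of the child relative to the parent.

    Assumes orthographic projection, so that
    bone_length**2 = dx**2 + dy**2 + dz**2.
    """
    dx = child_xy[0] - parent_xy[0]
    dy = child_xy[1] - parent_xy[1]
    # Clamp at zero in case noise makes the 2D length exceed the bone length
    squared = max(bone_length ** 2 - (dx * dx + dy * dy), 0.0)
    depth = math.sqrt(squared)
    return (depth, -depth)


# A bone of length 5 whose 2D projection has length 3
print(relative_depth_candidates((0.0, 0.0), (3.0, 0.0), 5.0))
```

The remaining ambiguity is a binary choice per joint (toward or away from the camera), which is far easier to resolve than an unconstrained depth value.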
Subjects
Algorithms , Three-Dimensional Imaging , Humans , Three-Dimensional Imaging/methods , Posture , Rotation , Calibration
ABSTRACT
Recently, denoising diffusion models have demonstrated remarkable performance among generative models in various domains. However, in the speech domain, there are limitations in complexity and controllability to apply diffusion models for time-varying audio synthesis. Particularly, a singing voice synthesis (SVS) task, which has begun to emerge as a practical application in the game and entertainment industries, requires high-dimensional samples with long-term acoustic features. To alleviate the challenges posed by model complexity in the SVS task, we propose HiddenSinger, a high-quality SVS system using a neural audio codec and latent diffusion models. To ensure high-fidelity audio, we introduce an audio autoencoder that can encode audio into an audio codec as a compressed representation and reconstruct the high-fidelity audio from the low-dimensional compressed latent vector. Subsequently, we use the latent diffusion models to sample a latent representation from a musical score. In addition, our proposed model is extended to an unsupervised singing voice learning framework, HiddenSinger-U, to train the model using an unlabeled singing voice dataset. Experimental results demonstrate that our model outperforms previous models regarding audio quality. Furthermore, the HiddenSinger-U can synthesize high-quality singing voices of speakers trained solely on unlabeled data.
ABSTRACT
In multi-label recognition, effectively addressing the challenge of partial labels is crucial for reducing annotation costs and enhancing model generalization. Existing methods exhibit limitations by relying on unrealistic simulations with uniformly dropped labels, overlooking how ambiguous instances and instance-level factors impact label ambiguity in real-world datasets. To address this deficiency, our paper introduces a realistic partial label setting grounded in instance ambiguity, complemented by Reliable Ambiguity-Aware Instance Weighting (R-AAIW), a strategy that utilizes importance weighting to adapt dynamically to the inherent ambiguity of multi-label instances. The strategy leverages an ambiguity score to prioritize learning from clearer instances. As the proficiency of the model improves, the weights are dynamically modulated to gradually shift focus towards more ambiguous instances. By employing an adaptive re-weighting method that adjusts to the complexity of each instance, our approach not only enhances the model's capability to detect subtle variations among labels but also ensures comprehensive learning without excluding difficult instances. Extensive experimentation across various benchmarks highlights our approach's superiority over existing methods, showcasing its ability to provide a more accurate and adaptable framework for multi-label recognition tasks.
ABSTRACT
To evaluate sleep quality, it is necessary to monitor overnight sleep duration. However, sleep monitoring typically requires more than 7 hours, which can be inefficient in terms of data size and analysis. Therefore, we propose a deep learning-based model that uses a 30 sec sleep electroencephalogram (EEG) recorded early in the sleep cycle to predict the sleep onset latency (SOL) distribution and to explore associations with sleep quality (SQ). The model is composed of a structure that decomposes and restores the signal in epoch units and a structure that predicts the SOL distribution. We used the Sleep Heart Health Study public dataset, which includes a large number of study subjects, to estimate and evaluate the proposed model. The proposed model estimated the SOL distribution and divided it into four clusters. The advantage of the proposed model is that it shows the process of falling asleep for individual participants as a probability graph over time. Furthermore, we compared the baseline of good SQ and SOL and showed that an SOL of less than 10 minutes correlated better with good SQ. Moreover, SOL was the most suitable sleep feature to predict from early EEG, compared with total sleep time, sleep efficiency, and actual sleep time. Our study showed the feasibility of estimating the SOL distribution using deep learning with an early EEG and showed that an SOL distribution within 10 minutes was associated with good SQ.
Subjects
Deep Learning , Electroencephalography , Sleep Quality , Humans , Male , Female , Adult , Sleep Latency/physiology , Middle Aged , Algorithms , Aged , Polysomnography , Sleep/physiology
ABSTRACT
Sleep onset latency (SOL) is an important factor in the sleep quality of a subject. Accurate prediction of SOL is therefore useful for identifying individuals at risk of sleep disorders and for improving sleep quality. In this study, we estimate the SOL distribution and the falling-asleep function using an electroencephalogram (EEG), which measures the electric field of brain activity. We propose a Multi Ensemble Distribution model for estimating Sleep Onset Latency (MEDi-SOL), consisting of a temporal encoder and a time distribution decoder. We evaluated the performance of the proposed model using a public dataset from the Sleep Heart Health Study. We considered four distributions (Normal, log-Normal, Weibull, and log-Logistic) and compared them with a survival model and a regression model. The temporal encoder with the ensemble log-Logistic and log-Normal distributions showed the best and second-best scores in the concordance index (C-index) and mean absolute error (MAE). MEDi-SOL, a multi-ensemble distribution model combining the log-Logistic and log-Normal distributions, achieves the best C-index and MAE with a fast training time. Furthermore, our model can visualize the process of falling asleep for individual subjects. These results suggest that a distribution-based ensemble approach with appropriately chosen distributions is more useful than point estimation.
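To make the distribution-based view concrete, the sketch below evaluates a falling-asleep curve as a weighted average of a log-Normal and a log-Logistic cumulative distribution function. The abstract does not say how MEDi-SOL combines the two distributions or how their parameters are produced, so the equal-weight mixture, the function names, and the parameter values are all assumptions for illustration rather than the authors' method.

```python
import math


def lognormal_cdf(t, mu, sigma):
    """P(T <= t) for a log-Normal time-to-sleep T with log-mean mu, log-sd sigma."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def loglogistic_cdf(t, alpha, beta):
    """P(T <= t) for a log-Logistic T with scale alpha (its median) and shape beta."""
    if t <= 0:
        return 0.0
    return 1.0 / (1.0 + (t / alpha) ** (-beta))


def ensemble_cdf(t, mu, sigma, alpha, beta, w=0.5):
    """Assumed ensemble: convex combination of the two CDFs with weight w."""
    return w * lognormal_cdf(t, mu, sigma) + (1.0 - w) * loglogistic_cdf(t, alpha, beta)


if __name__ == "__main__":
    # Hypothetical parameters: both components have a 10-minute median latency.
    mu, sigma = math.log(10.0), 0.6
    alpha, beta = 10.0, 2.5
    for minutes in (2, 5, 10, 20, 40):
        p = ensemble_cdf(minutes, mu, sigma, alpha, beta)
        print(f"P(asleep within {minutes:>2} min) = {p:.3f}")
```

Plotting `ensemble_cdf` over time for one subject gives a probability-of-being-asleep curve, which conveys more than a single predicted latency: two subjects with the same median can have very different spreads.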
Subjects
Electroencephalography , Signal Processing, Computer-Assisted , Humans , Electroencephalography/methods , Male , Female , Sleep Latency/physiology , Middle Aged , Adult , Models, Statistical , Algorithms , Polysomnography/methods , Aged
ABSTRACT
Electroencephalography (EEG) signals are brain signals acquired non-invasively. Owing to their high portability and practicality, EEG signals have found extensive application in monitoring human physiological states across various domains. In recent years, deep learning methodologies have been explored to decode the intricate information embedded in EEG signals. However, because EEG signals are acquired from humans, it is difficult to collect the large amounts of data needed to train deep learning models. Therefore, previous research has attempted to develop pre-trained models that can show significant performance improvement through fine-tuning when data are scarce. Nonetheless, existing pre-trained models often face constraints, such as the need to operate on datasets of identical configuration or to distort the original data in order to apply the pre-trained model. In this paper, we propose a domain-free transformer, called DFformer, for generalizing EEG pre-trained models. In addition, we present a pre-trained model based on DFformer that can be applied across diverse datasets without architectural modification or data distortion. The proposed model achieved competitive performance across motor imagery and sleep stage classification datasets. Notably, even when fine-tuned on datasets distinct from those used in the pre-training phase, DFformer demonstrated marked performance improvements. Hence, we demonstrate the potential of DFformer to overcome the conventional limitations of pre-trained model development, offering robust applicability across a spectrum of domains.