ABSTRACT
As semiconductor chip manufacturing technology advances, chip structures are becoming more complex, leading to an increased likelihood of void defects in the solder layer during packaging. However, identifying void defects in packaged chips remains a significant challenge due to the complex chip background, varying defect sizes and shapes, and blurred boundaries between voids and their surroundings. To address these challenges, we present a deep-learning-based framework for void defect segmentation in chip packaging. The framework consists of two main components: a solder region extraction method and a void defect segmentation network. The solder region extraction method includes a lightweight segmentation network and a rotation correction algorithm that eliminates background noise and accurately captures the solder region of the chip. The void defect segmentation network is designed for efficient and accurate defect segmentation. To cope with the variability of void defect shapes and sizes, we propose a Mamba model-based encoder that uses a visual state space module for multi-scale information extraction. In addition, we propose an interactive dual-stream decoder that uses a feature correlation cross gate module to fuse the streams' features to improve their correlation and produce more accurate void defect segmentation maps. The effectiveness of the framework is evaluated through quantitative and qualitative experiments on our custom X-ray chip dataset. Furthermore, the proposed void defect segmentation framework for chip packaging has been applied to a real factory inspection line, achieving an accuracy of 93.3% in chip qualification.
ABSTRACT
Pedestrian detection is a critical perception task for autonomous driving and intelligent vehicles, and it is challenging due to the potential variation in the appearance and pose of human beings as well as partial occlusion. In this paper, we present a novel pedestrian detection method based on a four-layer laser scanner. The proposed approach deals with the occlusion problem by fusing the segment classification results with past knowledge integrated from the tracking process. First, the raw point cloud is segmented into clusters of independent objects. Then, three types of features are proposed to capture comprehensive cues, and 18 effective features are extracted by combining a univariate feature selection algorithm with a feature correlation analysis process. Next, based on the segment classification at individual frames, track classification is conducted over consecutive frames using a particle filter and a probabilistic data association filter. Experimental results demonstrate that both back-propagation neural network and AdaBoost classifiers based on the 18 selected features have their own advantages at the segment classification stage in terms of pedestrian detection performance and computation time, and that the track classification procedure improves detection performance, particularly for partially occluded pedestrians, compared with segment classification alone.
ABSTRACT
BACKGROUND: Virtual reality motion sickness (VRMS) is a key issue hindering the development of virtual reality technology, and accurate detection of its occurrence is the first prerequisite for addressing it. OBJECTIVE: In this paper, a convolutional neural network (CNN) EEG detection model based on multi-scale feature correlation is proposed for detecting VRMS. METHODS: The model uses multi-scale 1D convolutional layers to extract multi-scale temporal features from multi-lead EEG data and then computes the correlations of the extracted multi-scale features across all leads to form feature adjacency matrices, converting the time-domain features into correlation-based brain network features and thus strengthening the feature representation. The correlation features of each scale are then fused, fed into a channel attention module to filter the channels, and classified with a fully connected network. Finally, we recruited subjects to experience six different virtual roller coaster scenes and collected resting-state EEG data before and after the task to validate the model. RESULTS: The results show that the accuracy, precision, recall, and F1-score of this model for the recognition of VRMS are 98.66%, 98.65%, 98.68%, and 98.66%, respectively. The proposed model outperforms current classic and advanced EEG recognition models. SIGNIFICANCE: This shows that the model can be used for the recognition of VRMS from resting-state EEG.
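The core step of converting per-lead temporal features into a correlation-based brain network can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name, the array shapes, and the use of absolute Pearson correlation are assumptions.

```python
import numpy as np

def correlation_adjacency(features: np.ndarray) -> np.ndarray:
    """Build a lead-by-lead correlation adjacency matrix.

    features: array of shape (n_leads, n_features) holding the temporal
    features extracted for each EEG lead at one convolutional scale.
    Returns an (n_leads, n_leads) matrix of absolute Pearson correlations.
    """
    # np.corrcoef treats each row as one variable (one lead here).
    adj = np.abs(np.corrcoef(features))
    np.fill_diagonal(adj, 0.0)  # drop self-correlations
    return adj

# Toy example: 4 leads, 16 features per lead at one scale.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 16))
adj = correlation_adjacency(feats)
print(adj.shape)  # (4, 4)
```

In the model described above, one such adjacency matrix would be built per convolutional scale and the resulting matrices fused before the channel attention module.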
Subjects
Electroencephalography, Motion Sickness, Neural Networks (Computer), Virtual Reality, Humans, Electroencephalography/methods, Motion Sickness/physiopathology, Algorithms, Male, Adult, Female
ABSTRACT
Biometric authentication prevents losses from identity misuse in the artificial intelligence (AI) era. Fusion methods integrate palmprint and palm vein features, leveraging their stability and security and enhancing counterfeiting prevention and overall system efficiency through multimodal correlations. However, most existing multi-modal palmprint and palm vein feature extraction methods extract feature information independently from each modality, ignoring the importance of the correlation between samples of different modalities within a class for improving recognition performance. In this study, we address these issues by proposing a feature-level joint learning fusion approach for palmprint and palm vein recognition based on modal correlations. The method employs a sparse unsupervised projection algorithm with a "purification matrix" constraint to enhance consistency in intra-modal features, minimizing data reconstruction errors, eliminating noise, and extracting compact and discriminative representations. Subsequently, the partial least squares algorithm extracts subspaces with high grayscale variance and high category correlation from each modality. A weighted sum is then used to dynamically optimize the contribution of each modality for effective classification. Experimental evaluations on five multimodal databases, composed of six unimodal databases including the Chinese Academy of Sciences multispectral palmprint and palm vein databases, yielded equal error rates (EER) of 0.0173%, 0.0192%, 0.0059%, 0.0010%, and 0.0008%. Compared with classical methods for palmprint and palm vein fusion recognition, the algorithm significantly improves recognition performance. The algorithm is suitable for identity recognition in scenarios with high security requirements and holds practical value.
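The weighted-sum fusion step can be sketched minimally as below. The abstract states the weights are optimized dynamically; here a fixed weight is used purely for illustration, and the function and variable names are hypothetical.

```python
import numpy as np

def fused_score(score_palmprint, score_vein, w=0.6):
    """Weighted-sum fusion of two per-modality matching scores.

    w is the palmprint weight; (1 - w) goes to the palm vein score.
    In a dynamic scheme, w would be tuned per database or per sample;
    it is fixed here for illustration.
    """
    return w * np.asarray(score_palmprint) + (1 - w) * np.asarray(score_vein)

# Similarity scores of one probe against three gallery identities.
s_print = np.array([0.90, 0.40, 0.10])
s_vein = np.array([0.80, 0.70, 0.20])
scores = fused_score(s_print, s_vein, w=0.6)
print(int(np.argmax(scores)))  # 0: identity 0 wins after fusion
```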
Subjects
Artificial Intelligence, Biometric Identification, Biometric Identification/methods, Algorithms, Hand/anatomy & histology, Learning
ABSTRACT
To exploit the rich information contained in multi-modal data while maintaining efficiency, deep cross-modal hash retrieval (DCMHR) is a sensible solution. However, most current DCMHR methods have two key limitations: first, the classification recommended by DCMHR models is conditioned only on the objects in individual regions; second, these methods either do not learn unified hash codes during training or cannot design an efficient training process. To solve these two problems, this paper proposes Large-Scale Cross-Modal Hashing with Unified Learning and Multi-Object Regional Correlation Reasoning (HUMOR). For the related labels predicted by ImgNet, HUMOR uses Multiple Instance Learning (MIL) to reason about the correlation of these labels. When regional correlation reasoning is low, the labels are rectified through a "reduce-add" operation, from max to min (global precedence) or from min to max (regional precedence). HUMOR then performs unified learning on the hash loss and the classification loss, adopts a four-step iterative algorithm to optimize the unified hash codes, and reduces model bias. Experiments on two benchmark datasets show that the average performance of this method is higher than that of most DCMHR methods. The results demonstrate the effectiveness and novelty of our method.
Subjects
Learning, Problem Solving, Algorithms
ABSTRACT
Automatic vertebra recognition from magnetic resonance imaging (MRI) is of significance in disease diagnosis and surgical treatment of spinal patients. Although modern methods have achieved remarkable progress, vertebra recognition still faces two challenges in practice: (1) Vertebral appearance challenge: The vertebral repetitive nature causes similar appearance among different vertebrae, while pathological variation causes different appearance among the same vertebrae; (2) Field of view (FOV) challenge: The FOVs of the input MRI images are unpredictable, which exacerbates the appearance challenge because there may be no specific-appearing vertebrae to assist recognition. In this paper, we propose a Feature-cOrrelation-aware history-pReserving-sparse-Coding framEwork (FORCE) to extract highly discriminative features and alleviate these challenges. FORCE is a recognition framework with two elaborated modules: (1) A feature similarity regularization (FSR) module to constrain the features of the vertebrae with the same label (but potentially with different appearances) to be closer in the latent feature space in an Eigenmap-based regularization manner. (2) A cumulative sparse representation (CSR) module to achieve feed-forward sparse coding while preventing historical features from being erased, which leverages both the intrinsic advantages of sparse codes and the historical features for obtaining more discriminative sparse codes encoding each vertebra. These two modules are embedded into the vertebra recognition framework in a plug-and-play manner to improve feature discrimination. FORCE is trained and evaluated on a challenging dataset containing 600 MRI images. The evaluation results show that FORCE achieves high performance in vertebra recognition and outperforms other state-of-the-art methods.
Subjects
Algorithms, Spine, Humans, Spine/diagnostic imaging, Magnetic Resonance Imaging/methods
ABSTRACT
Feature selection for multiple types of data has been widely applied in mild cognitive impairment (MCI) and Alzheimer's disease (AD) classification research. Combining multi-modal data for classification can better exploit the complementarity of valuable information. To improve the classification performance of feature selection on multi-modal data, we propose a multi-modal feature selection algorithm using feature correlation and feature structure fusion (FC2FS). First, we construct a feature correlation regularization by fusing a similarity matrix between multi-modal feature nodes. Then, based on manifold learning, we employ feature matrix fusion to construct a feature structure regularization and learn the local geometric structure of the feature nodes. Finally, the two regularizations are embedded in a multi-task learning model with a low-rank constraint, the multi-modal features are selected, and the final features are linearly fused and fed into a support vector machine (SVM) for classification. Several controlled experiments were set up to verify the validity of the proposed method, which was applied to MCI and AD classification. The accuracies for normal controls versus Alzheimer's disease, normal controls versus late mild cognitive impairment, normal controls versus early mild cognitive impairment, and early versus late mild cognitive impairment reached 91.85 ± 1.42%, 85.33 ± 2.22%, 78.29 ± 2.20%, and 77.67 ± 1.65%, respectively. This method overcomes the shortcomings of traditional subject-based multi-modal feature selection and fully considers the relationships between feature nodes and the local geometric structure of the feature space. Our study not only enhances the interpretability of feature selection but also improves classification performance, providing a useful reference for the identification of MCI and AD.
ABSTRACT
Although Convolutional Neural Network (CNN)-based approaches have been successful in object detection, they predominantly focus on locating discriminative regions while overlooking the holistic part-whole associations within objects. This ultimately leads to the neglect of feature relationships between an object and its parts as well as among those parts, both of which are significantly helpful for detecting discriminative parts. In this paper, we propose to "look inside the objects" by digging into part-whole feature correlations and attempt to leverage the correlations endowed by the Capsule Network (CapsNet) for robust object detection. Highly correlated capsules across adjacent layers share high familiarity and are more likely to be routed together. In light of this, we use the correlations between capsules of preceding training samples to constrain the candidate voting scope during the subsequent routing procedure, and propose a Feature Correlation-Steered CapsNet (FCS-CapsNet) with Locally-Constrained Expectation-Maximization (EM) Routing Agreement (LCEMRA). Unlike conventional EM routing, LCEMRA stipulates that only those relevant low-level capsules (parts) meeting a quantified intra-object cohesiveness requirement can be clustered to form high-level capsules (objects). In doing so, part-object associations can be uncovered through the transformation weighting matrices between capsule layers during this "part backtracking" procedure. LCEMRA enables low-level capsules to selectively gather projections from a non-spatially-fixed set of high-level capsules. Experiments on VOC2007, VOC2012, HKU-IS, DUTS, and COCO show that FCS-CapsNet achieves promising object detection results across multiple evaluation metrics, on par with the state of the art.
Subjects
Neural Networks (Computer)
ABSTRACT
The COVID-19 pandemic affected the whole world, but not all countries were impacted equally. This opens the question of what factors can explain the initial faster spread in some countries compared to others. Many such factors are overshadowed by the effect of the countermeasures, so we studied the early phases of the infection when countermeasures had not yet taken place. We collected the most diverse dataset of potentially relevant factors and infection metrics to date for this task. Using it, we show the importance of different factors and factor categories as determined by both statistical methods and machine learning (ML) feature selection (FS) approaches. Factors related to culture (e.g., individualism, openness), development, and travel proved the most important. A more thorough factor analysis was then made using a novel rule discovery algorithm. We also show how interconnected these factors are and caution against relying on ML analysis in isolation. Importantly, we explore potential pitfalls found in the methodology of similar work and demonstrate their impact on COVID-19 data analysis. Our best models using the decision tree classifier can predict the infection class with roughly 80% accuracy.
Subjects
COVID-19, Algorithms, Humans, Machine Learning, Pandemics, SARS-CoV-2
ABSTRACT
The Alzheimer's Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression, etc.) from Alzheimer's disease (AD) cases and controls that have recently been used by machine learning algorithms to evaluate AD onset and progression. While using a variety of biomarkers is essential to AD research, highly correlated input features can significantly decrease machine learning model generalizability and performance. Additionally, redundant features unnecessarily increase the computational time and resources needed to train predictive models. Therefore, we used 49,288 biomarkers and 793,600 extracted MRI features to assess feature correlation within the ADNI dataset and determine the extent to which this issue might impact large-scale analyses of these data. We found that 93.457% of biomarkers, 92.549% of the gene expression values, and 100% of MRI features were strongly correlated with at least one other feature in ADNI based on our Bonferroni-corrected α (p-value ≤ 1.40754 × 10⁻¹³). We provide a comprehensive mapping of all ADNI biomarkers to highly correlated features within the dataset. Additionally, we show that significant correlation within the ADNI dataset should be resolved before performing bulk data analyses, and we provide recommendations to address these issues. We anticipate that these recommendations and resources will help guide researchers utilizing the ADNI dataset to increase model performance and reduce the cost and complexity of their analyses.
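The screening criterion described above (flagging features whose correlation with at least one other feature survives a Bonferroni-corrected threshold) can be sketched as follows. This is a simplified stand-in for the study's pipeline: the function name is hypothetical, and the Pearson p-value is approximated with a Fisher z-transform rather than the exact test.

```python
import math
import numpy as np

def strongly_correlated(X: np.ndarray, n_tests: int, alpha: float = 0.05):
    """Flag features correlated with at least one other feature.

    X: (n_samples, n_features) data matrix.
    Uses a Fisher z approximation for the Pearson p-value and a
    Bonferroni-corrected threshold alpha / n_tests.
    Returns a boolean mask over the features.
    """
    n = X.shape[0]
    r = np.corrcoef(X, rowvar=False)
    # Fisher z-transform: under H0, atanh(r) * sqrt(n - 3) ~ N(0, 1).
    z = np.arctanh(np.clip(np.abs(r), 0.0, 0.999999)) * math.sqrt(n - 3)
    p = np.array([[math.erfc(v / math.sqrt(2)) for v in row] for row in z])
    np.fill_diagonal(p, 1.0)  # ignore self-correlation
    return (p <= alpha / n_tests).any(axis=1)

rng = np.random.default_rng(1)
a = rng.standard_normal(200)
X = np.column_stack([a, a + 0.01 * rng.standard_normal(200),
                     rng.standard_normal(200)])
mask = strongly_correlated(X, n_tests=3)
print(mask)  # the first two (near-duplicate) features flag each other
```

At ADNI scale the full correlation matrix would not fit in memory at once, so a real analysis would process feature blocks incrementally.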
Subjects
Alzheimer Disease/diagnosis, Alzheimer Disease/genetics, Genetic Association Studies, Neuroimaging, Transcriptome, Alzheimer Disease/epidemiology, Alzheimer Disease/therapy, Biomarkers/analysis, Datasets as Topic/statistics & numerical data, Genetic Association Studies/statistics & numerical data, Humans, Machine Learning, Magnetic Resonance Imaging/methods, Neuroimaging/methods, Neuroimaging/statistics & numerical data
ABSTRACT
The current study investigated how people summarize and represent objects with multiple features to cope with the complexity due to the number of objects and feature dimensions. We presented a set of circles whose color and size were either correlated perfectly (r = 1) or not correlated at all (r = 0). Using a membership identification task, we found that participants formed a statistical representation that included information about conjunctions as well as each color and size dimensions. In addition, we found that participants represented different set boundaries depending on the correlation between features of a set. Lastly, a pair-matching task revealed that participants predicted one feature value from the other feature value based on the correlation between features of a set. Our findings suggest that people represent a multi-feature ensemble statistically as a multivariate feature distribution, which is an efficient strategy to cope with scene complexity.
ABSTRACT
Alzheimer's disease (AD) is a gradually progressive neurodegenerative disease affecting cognitive function. Predicting cognitive scores from neuroimaging measures and identifying relevant imaging biomarkers are important research topics in the study of AD. Despite the many successful applications of machine learning algorithms, prediction models suffer from the so-called curse of dimensionality. Multi-task feature learning (MTFL) has helped tackle this problem by incorporating the correlations among multiple clinical cognitive scores. However, MTFL neglects the inherent correlation among brain imaging measures. In order to better predict cognitive scores and identify stable biomarkers, we first propose a generalized multi-task formulation framework that incorporates the task and feature correlation structures simultaneously. Second, we present a novel feature-aware sparsity-inducing norm (FAS-norm) penalty that incorporates a useful correlation between brain regions by exploiting correlations among features. Three multi-task learning models that incorporate the FAS-norm penalty are proposed following our framework. Finally, an algorithm based on the alternating direction method of multipliers (ADMM) is developed to optimize the non-smooth problems. We comprehensively evaluate the proposed models on the cross-sectional and longitudinal Alzheimer's Disease Neuroimaging Initiative datasets. The inputs are the thickness and volume measures of the cortical regions of interest. Compared with MTFL, our methods achieve an average decrease of 4.28% in overall error in the cross-sectional analysis and an average decrease of 7.97% in the longitudinal analysis of the Alzheimer's Disease Assessment Scale cognitive total score. Moreover, our methods identify biomarkers that are sensitive and stable for physicians, such as the hippocampus, lateral ventricle, and corpus callosum.
ABSTRACT
Molecular interactions at identical transcriptomic locations or at proximal but non-overlapping sites can mediate RNA modification and regulation, necessitating tools to uncover these spatial relationships. We present nearBynding, a flexible algorithm and software pipeline that models spatial correlation between transcriptome-wide tracks from diverse data types. nearBynding can process and correlate interval as well as continuous data and incorporate experimentally derived or in silico predicted transcriptomic tracks. nearBynding offers visualization functions for its statistics to identify colocalizations and adjacent features. We demonstrate the application of nearBynding to correlate RNA-binding protein (RBP) binding preferences with other RBPs, RNA structure, or RNA modification. By cross-correlating RBP binding and RNA structure data, we demonstrate that nearBynding recapitulates known RBP binding to structural motifs and provides biological insights into RBP binding preference of G-quadruplexes. nearBynding is available as an R/Bioconductor package and can run on a personal computer, making correlation of transcriptomic features broadly accessible.
Subjects
RNA-Binding Proteins, Transcriptome, Transcriptome/genetics, RNA-Binding Proteins/genetics, Binding Sites/genetics, RNA/genetics, Protein Binding
ABSTRACT
Doxorubicin (DOX) is an anticancer drug widely used to treat human and nonhuman tumors, but its late and persistent cardiotoxicity reduces its therapeutic utility. The full mechanism(s) of DOX-induced acute, subchronic, and delayed toxicity, which has a preponderant mitochondrial component, remains unclear; it is therefore clinically relevant to identify early markers that flag patients predisposed to DOX-related cardiovascular toxicity. To address this, Wistar rats (16 weeks old) were treated with a single DOX dose (20 mg/kg, i.p.); then, mRNA, protein levels, and functional analyses of mitochondrial endpoints were assessed 24 h later in the heart, liver, and kidney. Using an exploratory data analysis, we observed cardiac-specific alterations after DOX treatment for mitochondrial complexes III, IV, and preferentially complex I. Conversely, the same analysis revealed that complex II alterations are associated with the DOX response in the liver and kidney. Interestingly, H2O2 production by the mitochondrial respiratory chain and loss of calcium-loading capacity, markers of subchronic toxicity, were not reliable indicators of acute DOX cardiotoxicity in this animal model. By using sequential principal component analysis and feature correlation analysis, we demonstrate for the first time alterations in sets of transcripts and proteins, but not functional measurements, that might serve as potential early acute markers of cardiac-specific mitochondrial toxicity, contributing to explaining the trajectory of DOX cardiac toxicity and to developing novel interventions to minimize DOX cardiac liabilities.
Subjects
Antibiotics, Antineoplastic/toxicity, Doxorubicin/toxicity, Heart Diseases/chemically induced, Mitochondria, Heart/drug effects, Myocytes, Cardiac/drug effects, Animals, Calcium/metabolism, Cardiotoxicity, Cell Respiration/drug effects, Electron Transport Chain Complex Proteins/genetics, Electron Transport Chain Complex Proteins/metabolism, Heart Diseases/genetics, Heart Diseases/metabolism, Heart Diseases/pathology, Hydrogen Peroxide/metabolism, Male, Mitochondria, Heart/genetics, Mitochondria, Heart/metabolism, Mitochondria, Heart/pathology, Myocytes, Cardiac/metabolism, Myocytes, Cardiac/pathology, Rats, Wistar, Time Factors
ABSTRACT
This paper presents an efficient technique to reduce the inference cost of deep and/or wide convolutional neural network models by pruning redundant features (or filters). Previous studies have shown that over-sized deep neural network models tend to produce many redundant features that are either shifted versions of one another or are very similar with little or no variation, resulting in filtering redundancy. We propose to prune these redundant features, along with their related feature maps, according to their relative cosine distances in the feature space, leading to smaller networks with reduced post-training inference computational costs and competitive performance. We empirically show on several models (VGG-16, ResNet-56, ResNet-110, and ResNet-34) and datasets (MNIST Handwritten Digits, CIFAR-10, and ImageNet) that inference costs (in FLOPS) can be significantly reduced while overall performance remains competitive with the state of the art.
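The pruning criterion described above can be sketched as a greedy pass over the filters: keep a filter only if its cosine distance to every already-kept filter exceeds a threshold. This is an illustrative sketch, not the paper's implementation; the function name, the greedy ordering, and the threshold value are assumptions.

```python
import numpy as np

def prune_redundant_filters(weights: np.ndarray, min_dist: float = 0.1):
    """Greedily drop filters whose cosine distance to a kept filter
    falls below min_dist.

    weights: conv filters of shape (n_filters, ...); each filter is
    flattened before comparison. Returns indices of filters to keep.
    """
    flat = weights.reshape(weights.shape[0], -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    keep = []
    for i in range(unit.shape[0]):
        # cosine distance = 1 - cosine similarity
        if all(1.0 - float(unit[i] @ unit[j]) >= min_dist for j in keep):
            keep.append(i)
    return keep

rng = np.random.default_rng(2)
w = rng.standard_normal((6, 3, 3, 3))  # 6 filters of shape 3x3x3
w[3] = w[0] * 2.0  # filter 3 points in the same direction as filter 0
kept = prune_redundant_filters(w, min_dist=0.1)
print(kept)  # filter 3 is pruned as redundant with filter 0
```

After pruning, the corresponding output feature maps (and the matching input channels of the next layer) would be removed, which is where the FLOPS savings come from.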
Subjects
Deep Learning, Neural Networks (Computer), Deep Learning/trends, Humans
ABSTRACT
Unlike most excitable cells, certain syncytial smooth muscle cells are known to exhibit spontaneous action potentials of varying shapes and sizes. These differences in shape are observed even in electrophysiological recordings obtained from a single cell. The origin and physiological relevance of this phenomenon are currently unclear. The study presented here aims to test the hypothesis that the syncytial nature of the detrusor smooth muscle tissue contributes to the variations in the action potential profile by influencing the superposition of the passive and active signals. Data extracted from experimental recordings have been compared with those obtained through simulations. The feature correlation studies on action potentials obtained from the experimental recordings suggest the underlying presence of passive signals, called spontaneous excitatory junction potentials (sEJPs). Through simulations, we are able to demonstrate that the syncytial organization of the cells, and the variable superposition of the sEJPs with the "native action potential", contribute to the diversity in the action potential profiles exhibited. It could also be inferred that the fraction of the propagated action potentials is very low in the detrusor. It is proposed that objective measurements of spontaneous action potential profiles can lead to a better understanding of bladder physiology and pathology.
ABSTRACT
A non-invasive and portable bioimpedance method and a device for detecting superior to inferior closure of the pharynx during swallowing have been developed. The 2-channel device measures electric impedance across the neck at two levels of the pharynx via injected currents at 40 and 70 kHz. The device has been trialled on both healthy and dysphagic subjects. Results from these trials revealed a relationship (r = 0.59) between the temporal separation of the second peaks in the bioimpedance waveforms and descending pressure sequence in the pharynx as measured by pharyngeal manometry. However, these features were only clearly visible in the bioimpedance waveforms for 64% of swallows. Further research is underway to improve the bioimpedance measurement reliability and validate waveform feature correlation to swallowing to maximise the device's efficacy in dysphagia rehabilitation.