1.
IEEE Trans Cybern ; PP, 2024 Feb 21.
Article En | MEDLINE | ID: mdl-38381633

Predicting the trajectories of pedestrians in crowd scenarios is indispensable in the self-driving and autonomous mobile robot fields, because estimating the future locations of nearby pedestrians helps policy decisions avoid collisions. It is a challenging problem because humans have different walking motions, and the interactions between humans and objects in the current environment, especially between humans themselves, are complex. Previous researchers focused on how to model human-human interactions but neglected the relative importance of interactions. To address this issue, a novel mechanism based on correntropy is introduced. The proposed mechanism not only measures the relative importance of human-human interactions but also builds a personal space for each pedestrian. An interaction module, including this data-driven mechanism, is further proposed. In the proposed module, the data-driven mechanism effectively extracts the feature representations of dynamic human-human interactions in the scene and calculates the corresponding weights to represent the importance of different interactions. To share such social messages among pedestrians, an interaction-aware architecture based on a long short-term memory network for trajectory prediction is designed. Experiments are conducted on two public datasets. Experimental results demonstrate that our model achieves better performance than several recent methods.
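The abstract does not spell out the correntropy-based mechanism, so the following is only an illustrative sketch of how correntropy with a Gaussian kernel could turn pairwise trajectory differences into interaction weights. The function names `pairwise_correntropy` and `interaction_weights`, the `(N, T, 2)` trajectory layout, and the row normalization are assumptions made for this example, not the authors' implementation. The kernel width `sigma` is read here, loosely, as a personal-space radius.

```python
import numpy as np

def pairwise_correntropy(trajs, sigma=1.0):
    """Sample correntropy between every pair of trajectories.

    trajs: array of shape (N, T, 2), N pedestrians observed for T steps.
    Returns an (N, N) matrix whose entry (i, j) is the time average of a
    Gaussian kernel evaluated on the distance between pedestrians i and j.
    """
    trajs = np.asarray(trajs, dtype=float)
    diff = trajs[:, None, :, :] - trajs[None, :, :, :]   # (N, N, T, 2)
    sq_dist = np.sum(diff ** 2, axis=-1)                  # (N, N, T)
    return np.mean(np.exp(-sq_dist / (2.0 * sigma ** 2)), axis=-1)

def interaction_weights(trajs, sigma=1.0):
    """Hypothetical weighting: row-normalized correntropy, self-pairs removed."""
    v = pairwise_correntropy(trajs, sigma)
    np.fill_diagonal(v, 0.0)                              # ignore self-interaction
    row_sum = v.sum(axis=1, keepdims=True)
    return np.divide(v, row_sum, out=np.zeros_like(v), where=row_sum > 0)

# Three pedestrians walking in parallel: B stays 0.5 m from A, C stays 10 m away.
steps = np.arange(5, dtype=float)
a = np.stack([steps, np.zeros(5)], axis=1)
b = np.stack([steps, np.full(5, 0.5)], axis=1)
c = np.stack([steps, np.full(5, 10.0)], axis=1)
weights = interaction_weights(np.stack([a, b, c]), sigma=1.0)
```

In this toy scene the weight that A assigns to B is much larger than the weight it assigns to C, because the Gaussian kernel decays quickly once the distance exceeds `sigma`.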

2.
bioRxiv ; 2023 Jul 31.
Article En | MEDLINE | ID: mdl-37577637

Distinct dynamics in different cortical layers are apparent in neuronal and local field potential (LFP) patterns, yet their associations in the context of laminar processing have been sparingly analyzed. Here, we study the laminar organization of spike-field causal flow within and across visual (V4) and frontal areas (PFC) of monkeys performing a visual task. Using an event-based quantification of LFPs and a directed information estimator, we found area and frequency specificity in the laminar organization of spike-field causal connectivity. Gamma bursts (40-80 Hz) in the superficial layers of V4 largely drove intralaminar spiking. These gamma influences also fed forward up the cortical hierarchy to modulate laminar spiking in PFC. In PFC, the direction of intralaminar information flow was from spikes → fields where these influences dually controlled top-down and bottom-up processing. Our results, enabled by innovative methodologies, emphasize the complexities of spike-field causal interactions amongst multiple brain areas and behavior.

3.
Entropy (Basel) ; 25(6), 2023 Jun 03.
Article En | MEDLINE | ID: mdl-37372243

Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs' generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi's entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.

4.
IEEE Trans Neural Netw Learn Syst ; 34(8): 5156-5170, 2023 Aug.
Article En | MEDLINE | ID: mdl-34714752

Deep-predictive-coding networks (DPCNs) are hierarchical, generative models. They rely on feed-forward and feedback connections to modulate latent feature representations of stimuli in a dynamic and context-sensitive manner. A crucial element of DPCNs is a forward-backward inference procedure to uncover sparse, invariant features. However, this inference is a major computational bottleneck. It severely limits the network depth due to learning stagnation. Here, we prove why this bottleneck occurs. We then propose a new forward-inference strategy based on accelerated proximal gradients. This strategy has faster theoretical convergence guarantees than the one used for DPCNs. It overcomes learning stagnation. We also demonstrate that it permits constructing deep and wide predictive-coding networks. Such convolutional networks implement receptive fields that capture well the entire classes of objects on which the networks are trained. This improves the feature representations compared with our lab's previous nonconvolutional and convolutional DPCNs. It yields unsupervised object recognition that surpasses convolutional autoencoders and is on par with convolutional networks trained in a supervised manner.
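To make "accelerated proximal gradients" concrete, here is a minimal sketch of the generic FISTA iteration applied to a single-layer sparse-coding problem, min over x of 0.5·||Dx − y||² + λ·||x||₁. This is a textbook instance of the technique, not the inference procedure of the paper. The dictionary `D`, the penalty `lam`, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_sparse_code(D, y, lam=0.1, n_iter=200):
    """Accelerated proximal gradient (FISTA) for 0.5*||D x - y||^2 + lam*||x||_1."""
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    lipschitz = np.linalg.norm(D, 2) ** 2      # largest singular value squared
    x = np.zeros(D.shape[1])
    z = x.copy()                               # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        grad = D.T @ (D @ z - y)               # gradient of the smooth term at z
        x_new = soft_threshold(z - grad / lipschitz, lam / lipschitz)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x

# With an identity dictionary the minimizer is the soft-thresholded input.
code = fista_sparse_code(np.eye(3), np.array([3.0, -0.5, 0.2]), lam=1.0)
```

The momentum step is what separates the accelerated scheme from plain proximal gradient descent, improving the worst-case convergence rate from O(1/k) to O(1/k²).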

5.
JMIR Perioper Med ; 5(1): e37104, 2022 Sep 14.
Article En | MEDLINE | ID: mdl-36103231

BACKGROUND: Long-term postoperative pain (POP) and patient responses to pain relief medications are not yet fully understood. Although recent studies have developed an index for the nociception level of patients under general anesthesia based on multiple physiological parameters, it remains unclear whether these parameters correlate with long-term POP outcomes. OBJECTIVE: This study aims to extract unbiased and interpretable descriptions of how the dynamics of physiological parameters change over time and across patients in response to surgical procedures and intraoperative medications using a multivariate-temporal analysis. We demonstrated that there is an association (correlation) between the main features of intraoperative physiological responses and long-term POP, which has a predictive value, even without claiming causality. METHODS: We proposed a complex higher-order singular value decomposition method to accurately decompose patients' physiological responses into multivariate structures evolving over time. We used intraoperative vital signs of 175 patients from a mixed surgical cohort to extract three interconnected, low-dimensional, complex-valued descriptions of patients' physiological responses: multivariate factors, reflecting subphysiological parameters; temporal factors, reflecting common intrasurgery temporal dynamics; and patients' factors, describing interpatient changes in physiological responses. RESULTS: Adoption of the complex higher-order singular value decomposition method allowed us to clarify the dynamic correlation structure included in the intraoperative physiological responses. Instantaneous phases of the complex-valued physiological responses of 242 patients within the subspace of principal descriptors enabled us to discriminate between mild and not-mild (moderate-severe) levels of pain at postoperative days 30 and 90. 
Following rotation of physiological responses before projection to align with the common multivariate-temporal dynamic, the method achieved an area under curve for postoperative day 30 and 90 outcomes of 0.81 and 0.89 for thoracic surgery, 0.87 and 0.83 for orthopedic surgery, 0.87 and 0.88 for urological surgery, 0.86 and 1 for colorectal surgery, 1 and 1 for transplant surgery, and 0.83 and 0.92 for pancreatic surgery, respectively. CONCLUSIONS: By categorizing patients into different surgical groups, we identified significant surgery-related principal descriptors. Each of them potentially encodes different surgical stimulation. The dynamics of patients' physiological responses to these surgical events were linked to long-term POP development.

6.
J Neural Eng ; 19(4), 2022 08 19.
Article En | MEDLINE | ID: mdl-35921802

Objective. Brain-machine interfaces (BMIs) translate neural activity into motor commands to restore motor functions for people with paralysis. Local field potentials (LFPs) are promising for long-term BMIs, since the quality of the recording lasts longer than single neuronal spikes. Inferring neuronal spike activity from population activities such as LFPs is challenging, because LFPs stem from synaptic currents flowing in the neural tissue produced by various neuronal ensembles and reflect neural synchronization. Existing studies that combine LFPs with spikes leverage the spectrogram of the former, which can neither detect the transient characteristics of LFP features (here, neuromodulation in a specific frequency band) with high accuracy, nor correlate them with relevant neuronal activity with a sufficient time resolution. Approach. We propose a feature extraction and validation framework to directly extract LFP neuromodulations related to synchronized spike activity using recordings from the primary motor cortex of six Sprague Dawley rats during a lever-press task. We first select important LFP frequency bands relevant to behavior, and then implement a marked point process (MPP) methodology to extract transient LFP neuromodulations. We validate the LFP feature extraction by examining the correlation with the pairwise synchronized firing probability of important neurons, which are selected according to their contribution to behavioral decoding. The highly correlated synchronized firings identified by the LFP neuromodulations are fed into a decoder to check whether they can serve as a reliable neural data source for movement decoding. Main results. We find that the gamma band (30-80 Hz) LFP neuromodulations demonstrate significant correlation with synchronized firings. 
Compared with the traditional spectrogram-based method, the higher-temporal-resolution MPP method captures the synchronized firing patterns with fewer false alarms, and demonstrates significantly higher correlation than single neuron spikes. The decoding performance using the synchronized neuronal firings identified by the LFP neuromodulations can reach 90% compared to the full recorded neuronal ensembles. Significance. Our proposed framework successfully extracts the sparse LFP neuromodulations that can identify temporal synchronized neuronal spikes with high correlation. The identified neuronal spike pattern demonstrates high decoding performance, which suggests LFP can be used as an effective modality for long-term BMI decoding.


Motor Cortex , Action Potentials/physiology , Animals , Humans , Macaca mulatta , Motor Cortex/physiology , Neurons , Rats , Rats, Sprague-Dawley
7.
Neural Netw ; 150: 274-292, 2022 Jun.
Article En | MEDLINE | ID: mdl-35339009

Inspired by the human vision system and learning, we propose a novel cognitive architecture that understands the content of raw videos in terms of objects without using labels. The architecture achieves four objectives: (1) Decomposing raw frames into objects by exploiting foveal vision and memory. (2) Describing the world by projecting objects on an internal canvas. (3) Extracting relevant objects from the canvas by analyzing the causal relation between objects and rewards. (4) Exploiting the information of relevant objects to facilitate the reinforcement learning (RL) process. In order to speed up learning, and better identify objects that produce rewards, the architecture implements learning by causality from the perspective of Wiener and Granger using object trajectories stored in working memory and the time series of external rewards. A novel non-parametric estimator of directed information using Rényi's entropy is designed and tested. Experiments on three environments show that our architecture extracts most of the relevant objects. It can be thought of as 'understanding' the world in an object-oriented way. As a consequence, our architecture outperforms state-of-the-art deep reinforcement learning in terms of training speed and transfer learning.


Learning , Reward , Causality , Cognition , Humans , Reinforcement, Psychology
8.
IEEE Trans Neural Netw Learn Syst ; 33(4): 1441-1451, 2022 04.
Article En | MEDLINE | ID: mdl-33400656

By redefining the conventional notions of layers, we present an alternative view on finitely wide, fully trainable deep neural networks as stacked linear models in feature spaces, leading to a kernel machine interpretation. Based on this construction, we then propose a provably optimal modular learning framework for classification that does not require between-module backpropagation. This modular approach brings new insights into the label requirement of deep learning (DL). It leverages only implicit pairwise labels (weak supervision) when learning the hidden modules. When training the output module, on the other hand, it requires full supervision but achieves high label efficiency, needing as few as ten randomly selected labeled examples (one from each class) to achieve 94.88% accuracy on CIFAR-10 using a ResNet-18 backbone. Moreover, modular training enables fully modularized DL workflows, which then simplify the design and implementation of pipelines and improve the maintainability and reusability of models. To showcase the advantages of such a modularized workflow, we describe a simple yet reliable method for estimating reusability of pretrained modules as well as task transferability in a transfer learning setting. At practically no computation overhead, it precisely described the task space structure of 15 binary classification tasks from CIFAR-10.


Deep Learning , Neural Networks, Computer
9.
Front Artif Intell ; 4: 568384, 2021.
Article En | MEDLINE | ID: mdl-34568811

There is an ever-growing mismatch between the proliferation of data-intensive, power-hungry deep learning solutions in the machine learning (ML) community and the need for agile, portable solutions in resource-constrained devices, particularly for intelligence at the edge. In this paper, we present a fundamentally novel approach that leverages data-driven intelligence with biologically-inspired efficiency. The proposed Sparse Embodiment Neural-Statistical Architecture (SENSA) decomposes the learning task into two distinct phases: a training phase and a hardware embedment phase where prototypes are extracted from the trained network and used to construct fast, sparse embodiment for hardware deployment at the edge. Specifically, we propose the Sparse Pulse Automata via Reproducing Kernel (SPARK) method, which first constructs a learning machine in the form of a dynamical system using energy-efficient spike or pulse trains, commonly used in neuroscience and neuromorphic engineering, then extracts a rule-based solution in the form of automata or lookup tables for rapid deployment in edge computing platforms. We propose to use the theoretically-grounded unifying framework of the Reproducing Kernel Hilbert Space (RKHS) to provide interpretable, nonlinear, and nonparametric solutions, compared to the typical neural network approach. In kernel methods, the explicit representation of the data is of secondary nature, allowing the same algorithm to be used for different data types without altering the learning rules. To showcase SPARK's capabilities, we carried out the first proof-of-concept demonstration on the task of isolated-word automatic speech recognition (ASR) or keyword spotting, benchmarked on the TI-46 digit corpus. Together, these energy-efficient and resource-conscious techniques will bring advanced machine learning solutions closer to the edge.

10.
Neural Netw ; 141: 145-159, 2021 Sep.
Article En | MEDLINE | ID: mdl-33901879

Deep learning architectures are an extremely powerful tool for recognizing and classifying images. However, they require supervised learning and normally work on vectors of the size of image pixels and produce the best results when trained on millions of object images. To help mitigate these issues, we propose an end-to-end architecture that fuses bottom-up saliency and top-down attention with an object recognition module to focus on relevant data and learn important features that can later be fine-tuned for a specific task, employing only unsupervised learning. In addition, by utilizing a virtual fovea that focuses on relevant portions of the data, the training speed can be greatly improved. We test the performance of the proposed Gamma saliency technique on the Toronto and CAT 2000 databases, and the foveated vision in the large Street View House Numbers (SVHN) database. The results with foveated vision show that Gamma saliency performs at the same level as the best alternative algorithms while being computationally faster. The results in SVHN show that our unsupervised cognitive architecture is comparable to fully supervised methods and that saliency also improves CNN performance if desired. Finally, we develop and test a top-down attention mechanism based on the Gamma saliency applied to the top layer of CNNs to facilitate scene understanding in multi-object cluttered images. We show that the extra information from top-down saliency is capable of speeding up the extraction of digits in the cluttered multidigit MNIST data set, corroborating the important role of top down attention.


Deep Learning , Unsupervised Machine Learning , Databases, Factual , Humans , Vision, Ocular
11.
J Med Eng Technol ; 45(3): 187-196, 2021 Apr.
Article En | MEDLINE | ID: mdl-33729074

Activation of peripheral nervous system (PNS) fibres to produce variable tactile and proprioceptive sensations in advanced bidirectional prosthetic limbs relies on neural stimulators with high spatial selectivity, dynamic range and resolution. A multi-channel application-specific integrated circuit (ASIC) is developed for PNS fibre activation using a wide dynamic range (10 nA-5 mA), high-resolution (30 nA step, 100 ns pulse accuracy) current stimulator, dissipating 0.73-2.75 mW at 3 V. The ASIC also enables encoding of external pressure signals via an integrate-and-fire methodology. Electrophysiological data of compound nerve action potentials were recorded for a range of stimulus amplitudes and pulse widths. This data was used to benchmark the performance of the ASIC with a known neural stimulator.


Peripheral Nerves
12.
Anesth Analg ; 132(5): 1465-1474, 2021 05 01.
Article En | MEDLINE | ID: mdl-33591118

BACKGROUND: Evidence suggests that increased early postoperative pain (POP) intensities are associated with increased pain in the weeks following surgery. However, it remains unclear which temporal aspects of this early POP relate to later pain experience. In this prospective cohort study, we used wavelet analysis of clinically captured POP intensity data on postoperative days 1 and 2 to characterize slow/fast dynamics of POP intensities and predict pain outcomes on postoperative day 30. METHODS: The study used clinical POP time series from the first 48 hours following surgery from 218 patients to predict their mean POP on postoperative day 30. We first used wavelet analysis to approximate the POP series and to represent the series at different time scales to characterize the early temporal profile of acute POP in the first 2 postoperative days. We then used the wavelet coefficients alongside demographic parameters as inputs to a neural network to predict the risk of severe pain 30 days after surgery. RESULTS: Slow dynamic approximation components, but not fast dynamic detailed components, were linked to pain intensity on postoperative day 30. Despite imbalanced outcome rates, using wavelet decomposition along with a neural network for classification, the model achieved an F score of 0.79 and area under the receiver operating characteristic curve of 0.74 on test-set data for classifying pain intensities on postoperative day 30. The wavelet-based approach outperformed logistic regression (F score of 0.31) and neural network (F score of 0.22) classifiers that were restricted to sociodemographic variables and linear trajectories of pain intensities. CONCLUSIONS: These findings identify latent mechanistic information within the temporal domain of clinically documented acute POP intensity ratings, which are accessible via wavelet analysis, and demonstrate that such temporal patterns inform pain outcomes at postoperative day 30.
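The split between slow "approximation" components and fast "detail" components can be illustrated with a multilevel discrete wavelet transform. The abstract does not name the wavelet family or the software used, so the Haar wavelet and the function `haar_dwt` below are assumptions made only to keep the sketch self-contained. The pain scores in the example are made-up values, not study data.

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multilevel Haar wavelet decomposition.

    Returns (approximation, details), where `details` lists the detail
    coefficients from the finest scale to the coarsest. The signal length
    must be divisible by 2**levels.
    """
    approx = np.asarray(signal, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # fast dynamics at this scale
        approx = (even + odd) / np.sqrt(2.0)          # slow dynamics passed upward
    return approx, details

# Hypothetical pain-intensity ratings (0-10 scale) sampled at regular intervals.
scores = np.array([7.0, 6.0, 8.0, 7.0, 5.0, 6.0, 4.0, 3.0])
slow, fast = haar_dwt(scores, levels=2)

# Feature vector of the kind that could be passed to a downstream classifier.
features = np.concatenate([slow] + fast[::-1])
```

In a pipeline like the one described, coefficients such as `slow` and `fast` would be combined with demographic variables before classification. The Haar transform is orthonormal, so the feature vector preserves the energy of the original series.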


Pain Measurement , Pain Perception , Pain Threshold , Pain, Postoperative/diagnosis , Wavelet Analysis , Aged , Female , Humans , Male , Middle Aged , Neural Networks, Computer , Pain, Postoperative/etiology , Pain, Postoperative/physiopathology , Pain, Postoperative/psychology , Predictive Value of Tests , Prospective Studies , Recovery of Function , Severity of Illness Index , Time Factors
13.
Neural Comput ; 33(5): 1164-1198, 2021 Apr 13.
Article En | MEDLINE | ID: mdl-33617742

This letter introduces a new framework for quantifying predictive uncertainty for both data and models that relies on projecting the data into a gaussian reproducing kernel Hilbert space (RKHS) and transforming the data probability density function (PDF) in a way that quantifies the flow of its gradient as a topological potential field (quantified at all points in the sample space). This enables the decomposition of the PDF gradient flow by formulating it as a moment decomposition problem using operators from quantum physics, specifically Schrödinger's formulation. We experimentally show that the higher-order moments systematically cluster the different tail regions of the PDF, thereby providing unprecedented discriminative resolution of data regions having high epistemic uncertainty. In essence, this approach decomposes local realizations of the data PDF in terms of uncertainty moments. We apply this framework as a surrogate tool for predictive uncertainty quantification of point-prediction neural network models, overcoming various limitations of conventional Bayesian-based uncertainty quantification methods. Experimental comparisons with some established methods illustrate performance advantages that our framework exhibits.

14.
IEEE Trans Neural Netw Learn Syst ; 32(1): 435-442, 2021 Jan.
Article En | MEDLINE | ID: mdl-32071010

A novel functional estimator for Rényi's α-entropy and its multivariate extension was recently proposed in terms of the normalized eigenspectrum of a Hermitian matrix of the projected data in a reproducing kernel Hilbert space (RKHS). However, the utility and possible applications of these new estimators are rather new and mostly unknown to practitioners. In this brief, we first show that this estimator enables straightforward measurement of information flow in realistic convolutional neural networks (CNNs) without any approximation. Then, we introduce the partial information decomposition (PID) framework and develop three quantities to analyze the synergy and redundancy in convolutional layer representations. Our results validate two fundamental data processing inequalities and reveal more inner properties concerning CNN training.

15.
J Neural Eng ; 18(2), 2021 03 01.
Article En | MEDLINE | ID: mdl-33348332

Objective. Computational models of neural activity at the meso-scale suggest the involvement of discrete oscillatory bursts as constructs of cognitive processing during behavioral tasks. Classical signal processing techniques that attempt to infer neural correlates of behavior from meso-scale activity employ spectral representations of the signal, exploiting power spectral density techniques and time-frequency (T-F) energy distributions to capture band power features. However, such analyses demand more specialized methods that incorporate explicitly the concepts of neurophysiological signal generation and time resolution in the tens of milliseconds. This paper focuses on working memory (WM), a complex cognitive process involved in encoding, storing and retrieving sensory information, which has been shown to be characterized by oscillatory bursts in the beta and gamma bands. Employing a generative model for oscillatory dynamics, we present a marked point process (MPP) representation of bursts during memory creation and readout. We show that the markers of the point process quantify specific neural correlates of WM. Approach. We demonstrate our results on field potentials recorded from the prelimbic and secondary motor cortices of three rats while performing a WM task. The generative model for single-channel, band-passed traces of field potentials characterizes, with high resolution, the timings and amplitudes of transient neuromodulations in the high gamma (80-150 Hz, γ) and beta (10-30 Hz, β) bands as an MPP. We use standard hypothesis testing methods on the MPP features to check for significance in encoding of task variables, sensory stimulus and executive control while comparing encoding capabilities of our model with other T-F methods. Main Results. First, the advantages of an MPP approach in deciphering encoding mechanisms at the meso-scale are demonstrated. Second, the nature of state encoding by neuromodulatory events is determined. 
Third, we demonstrate the necessity of a higher-time-resolution alternative to conventionally employed T-F methods. Finally, our results underscore the novelty in interpreting oscillatory dynamics encompassed by the marked features of the point process. Significance. An MPP representation of meso-scale activity not only enables a rich, high-resolution parameter space for analysis but also presents a novel tool for diverse neural applications.


Executive Function , Memory, Short-Term , Animals , Memory, Short-Term/physiology , Rats
16.
Sci Rep ; 10(1): 7961, 2020 05 14.
Article En | MEDLINE | ID: mdl-32409665

In aquatic and terrestrial environments, odorants are dispersed by currents that create concentration distributions that are spatially and temporally complex. Animals navigating in a plume must therefore rely upon intermittent, and time-varying information to find the source. Navigation has typically been studied as a spatial information problem, with the aim of movement towards higher mean concentrations. However, this spatial information alone, without information of the temporal dynamics of the plume, is insufficient to explain the accuracy and speed of many animals tracking odors. Recent studies have identified a subpopulation of olfactory receptor neurons (ORNs) that consist of intrinsically rhythmically active 'bursting' ORNs (bORNs) in the lobster, Panulirus argus. As a population, bORNs provide a neural mechanism dedicated to encoding the time between odor encounters. Using a numerical simulation of a large-scale plume, the lobster is used as a framework to construct a computer model to examine the utility of intermittency for orienting within a plume. Results show that plume intermittency is reliably detectable when sampling simulated odorants on the order of seconds, and provides the most information when animals search along the plume edge. Both the temporal and spatial variation in intermittency is predictably structured on scales relevant for a searching animal that encodes olfactory information utilizing bORNs, and therefore is suitable and useful as a navigational cue.


Aquatic Organisms , Odorants/analysis , Palinuridae , Spatio-Temporal Analysis , Algorithms , Animals , Computer Simulation
17.
IEEE Trans Neural Netw Learn Syst ; 31(11): 4990-4998, 2020 Nov.
Article En | MEDLINE | ID: mdl-31902772

Radial basis function (RBF) networks are traditionally defined for sets of vector-based observations. In this brief, we reformulate such networks so that they can be applied to adjacency-matrix representations of weighted, directed graphs that represent the relationships between object pairs. We restate the sum-of-squares objective function so that it is purely dependent on entries from the adjacency matrix. From this objective function, we derive a gradient descent update for the network weights. We also derive a gradient update that simulates the repositioning of the radial basis prototypes and changes in the radial basis prototype parameters. An important property of our radial basis function networks is that they are guaranteed to yield the same responses as conventional radial basis networks trained on a corresponding vector realization of the relationships encoded by the adjacency matrix. Such a vector realization only needs to provably exist for this property to hold, which occurs whenever the relationships correspond to distances from some arbitrary metric applied to a latent set of vectors. We, therefore, completely avoid needing to actually construct vectorial realizations via multidimensional scaling, which ensures that the underlying relationships are totally preserved.

18.
IEEE Trans Neural Netw Learn Syst ; 31(8): 3100-3113, 2020 Aug.
Article En | MEDLINE | ID: mdl-31536021

The distributions of input data are very important for learning machines, such as the convex universal learning machines (CULMs). The CULMs are a family of universal learning machines with convex optimization. However, the computational complexity is a crucial problem in CULMs, because the dimension of the nonlinear mapping layer (the hidden layer) of the CULMs is usually rather large in complex system modeling. In this article, we propose an efficient quantization method called Probability density Rank-based Quantization (PRQ) to decrease the computational complexity of CULMs. The PRQ ranks the data according to the estimated probability densities and then selects a subset whose elements are equally spaced in the ranked data sequence. We apply the PRQ to kernel ridge regression (KRR) and random Fourier feature recursive least squares (RFF-RLS), which are two typical algorithms of CULMs. The proposed method not only keeps the similarity of data distribution between the code book and data set but also reduces the computational cost by using the kd-tree. Meanwhile, for a given data set, the method yields deterministic quantization results, and it can also exclude the outliers and avoid too many borders in the code book. This brings great convenience to practical applications of the CULMs. The proposed PRQ is evaluated on several real-world benchmark data sets. Experimental results show satisfactory performance of PRQ compared with some state-of-the-art methods.
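The ranking-and-selection step described above can be sketched in a few lines. This is a simplified reading of the abstract: it estimates densities with a brute-force Gaussian kernel density estimate rather than the kd-tree acceleration mentioned in the text, and the function name `prq_codebook`, the `bandwidth` parameter, and the choice of taking the midpoint of each equal-sized rank bin are assumptions made for illustration.

```python
import numpy as np

def prq_codebook(data, m, bandwidth=1.0):
    """Pick m code-book indices equally spaced in the density-ranked data.

    data: array of shape (n,) or (n, d). Returns m indices into `data`.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    n = data.shape[0]
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    # Brute-force Gaussian KDE evaluated at every sample (O(n^2) cost).
    sq_norm = np.sum(data ** 2, axis=1)
    sq_dist = np.maximum(sq_norm[:, None] + sq_norm[None, :] - 2.0 * data @ data.T, 0.0)
    density = np.mean(np.exp(-sq_dist / (2.0 * bandwidth ** 2)), axis=1)
    # Rank samples from lowest to highest estimated density.
    order = np.argsort(density, kind="stable")
    # Take the middle rank of each of m equal-sized bins (equal spacing).
    picks = ((np.arange(m) + 0.5) * n / m).astype(int)
    return order[picks]

# 99 points on [-1, 1] plus one far outlier at index 99.
samples = np.concatenate([np.linspace(-1.0, 1.0, 99), [50.0]])
codebook_idx = prq_codebook(samples, m=10, bandwidth=0.5)
```

Because the selection involves no random initialization, repeated calls on the same data return the same code book. In this example the outlier has the lowest estimated density, sits at rank 0, and is skipped because the first pick falls in the middle of the first rank bin.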

19.
IEEE Trans Neural Netw Learn Syst ; 31(6): 1780-1793, 2020 06.
Article En | MEDLINE | ID: mdl-31443054

An increasing number of neural memory networks have been developed, leading to the need for a systematic approach to analyze and compare their underlying memory structures. Thus, in this paper, we first create a framework for memory organization and then compare four popular dynamic models: vanilla recurrent neural network, long short-term memory, neural stack, and neural RAM. This analysis helps to open the dynamic neural networks' black box from the memory usage perspective. Accordingly, a taxonomy for these networks and their variants is proposed and proved using a unifying architecture. With the taxonomy, both network architectures and learning tasks are classified into four classes, and a one-to-one mapping is built between them to help practitioners select the appropriate architecture. To exemplify each task type, four synthetic tasks with different memory requirements are selected. Moreover, we use some signal processing applications and two natural language processing applications to evaluate the methodology in a realistic setting.

20.
IEEE Trans Pattern Anal Mach Intell ; 42(11): 2960-2966, 2020 Nov.
Article En | MEDLINE | ID: mdl-31395536

The matrix-based Rényi's α-order entropy functional was recently introduced using the normalized eigenspectrum of a Hermitian matrix of the projected data in a reproducing kernel Hilbert space (RKHS). However, the current theory in the matrix-based Rényi's α-order entropy functional only defines the entropy of a single variable or mutual information between two random variables. In information theory and machine learning communities, one is also frequently interested in multivariate information quantities, such as the multivariate joint entropy and different interactive quantities among multiple variables. In this paper, we first define the matrix-based Rényi's α-order joint entropy among multiple variables. We then show how this definition can ease the estimation of various information quantities that measure the interactions among multiple variables, such as interactive information and total correlation. We finally present an application to feature selection to show how our definition provides a simple yet powerful way to estimate a widely-acknowledged intractable quantity from data. A real example on hyperspectral image (HSI) band selection is also provided.
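As a rough illustration of the quantities named above, the sketch below computes the matrix-based Rényi's α-order entropy from the eigenvalues of a trace-normalized Gram matrix, forms the joint entropy from the Hadamard (elementwise) product of per-variable Gram matrices, and derives total correlation from the two. The Gaussian kernel, its width `sigma`, and all function names are assumptions made for the example.

```python
import numpy as np

def gram_matrix(x, sigma=1.0):
    """Gaussian Gram matrix of n samples; x has shape (n,) or (n, d)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq_norm = np.sum(x ** 2, axis=1)
    sq_dist = np.maximum(sq_norm[:, None] + sq_norm[None, :] - 2.0 * x @ x.T, 0.0)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def renyi_entropy(k, alpha=2.0):
    """Matrix-based Renyi entropy (in bits) of a Gram matrix, alpha != 1."""
    a = k / np.trace(k)                                  # unit-trace normalization
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)      # guard against round-off
    return float(np.log2(np.sum(eig ** alpha)) / (1.0 - alpha))

def joint_entropy(kernels, alpha=2.0):
    """Joint entropy from the Hadamard product of the Gram matrices."""
    product = np.ones_like(kernels[0])
    for k in kernels:
        product = product * k
    return renyi_entropy(product, alpha)

def total_correlation(kernels, alpha=2.0):
    """Sum of marginal entropies minus the joint entropy."""
    marginals = sum(renyi_entropy(k, alpha) for k in kernels)
    return marginals - joint_entropy(kernels, alpha)

# Two deterministic toy variables observed on the same 20 samples.
grid = np.linspace(0.0, 1.0, 20)
k_x = gram_matrix(grid, sigma=0.2)
k_y = gram_matrix(np.sin(6.0 * grid), sigma=0.2)
tc_xy = total_correlation([k_x, k_y])
```

No density estimation is required: every quantity is read off the eigenspectrum of an n × n matrix, which is what makes multivariate measures such as total correlation practical to estimate from samples.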

...