ABSTRACT
This study proposes an extendable modelling framework for Digital Twin-Oriented Complex Networked Systems (DT-CNSs), with the goal of generating networks that faithfully represent real-world social networked systems. The modelling process focuses on (i) features of nodes and (ii) interaction rules for creating connections, which are built on individual nodes' preferences. We conduct experiments on simulation-based DT-CNSs that incorporate various features and rules for network growth, and different transmissibilities for an epidemic spreading on these networks. We present a case study on the disaster resilience of social networks during an epidemic outbreak by investigating the occurrence of infection within a specific time and social distance. The experimental results show how different levels of structural and dynamic complexity, concerned with feature diversity and flexibility of interaction rules respectively, influence network growth and epidemic spread. The analysis reveals that, to achieve maximum disaster resilience, mitigation policies should target nodes with preferred features, as they have higher infection risks and should be the focus of epidemic control.
Subjects
Disasters, Epidemics, Humans, Computer Simulation, Disease Susceptibility
ABSTRACT
Accurate modelling of complex social systems, where people interact with each other and those interactions change over time, has been a research challenge for many years. This study proposes an evolutionary Digital Twin-Oriented Complex Networked System (DT-CNS) framework that considers heterogeneous node features and changeable connection preferences. We create heterogeneous preference mutation mechanisms to characterise nodes' adaptive decisions on preference mutation in response to interaction patterns and epidemic risks. In this setting, we use nodes' interaction utilities to characterise the positive feedback from interactions and the negative impact of epidemic risks. We also introduce a social capital constraint to better control the density of social connections. The nodes' heterogeneous preference mutation styles include (i) the inactive style, which keeps the initial social preferences; (ii) the ignorant style, which randomly mutates preferences; (iii) the egocentric style, which optimises individual interaction utility; (iv) the cooperative style, which optimises the total interaction utility through group decisions; and (v) the collaborative style, which further allows cooperative nodes to transfer social capital. Our simulation experiments on evolutionary DT-CNSs reveal that heterogeneous preference mutation styles lead to various interaction and infection patterns. The results also show that (i) increasing social capital enables more interactions but brings higher infection risks and greater uncertainty in decision-making; (ii) group decisions outperform individual decisions by eliminating the unawareness of other nodes' decisions; and (iii) collaborative nodes under a strict social capital limit can promote interactions, reduce infection risks and achieve higher overall interaction utilities.
Subjects
Mutation, Humans, Computer Simulation, Theoretical Models, Biological Evolution, Social Networking
ABSTRACT
Network disruption is pivotal in understanding the robustness and vulnerability of complex networks, which is instrumental in devising strategies for infrastructure protection, epidemic control, cybersecurity, and combating crime. In this paper, with a particular focus on disrupting criminal networks, we propose imposing a within-the-largest-connected-component constraint in a continuous batch removal disruption process. Through a series of experiments on a recently released Sicilian Mafia network, we show that the constraint enhances degree-based methods while weakening betweenness-based approaches. Moreover, based on the findings from experiments with various disruption strategies, we propose a structurally-filtered greedy disruption strategy that combines the effectiveness of greedy-like methods with the efficiency of structural-metric-based approaches. The proposed strategy significantly outperforms the longstanding state-of-the-art method, betweenness centrality, while maintaining the same time complexity.
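As an illustration of the constraint described above, the following minimal sketch performs degree-based batch removal in which candidates are drawn only from the current largest connected component (LCC). It is one possible reading of the abstract, not the authors' implementation: the helper names (`build_graph`, `largest_component`, `disrupt`), the tie-breaking rule and the toy edge list are all assumptions made for this example.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def largest_component(adj):
    """Return the node set of the largest connected component (BFS)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def remove_nodes(adj, nodes):
    """Return a copy of the graph with the given nodes deleted."""
    nodes = set(nodes)
    return {u: nbrs - nodes for u, nbrs in adj.items() if u not in nodes}


def disrupt(adj, batch_size, rounds):
    """Degree-based batch removal restricted to the current LCC.

    Returns the LCC size measured after each round.
    """
    sizes = []
    for _ in range(rounds):
        lcc = largest_component(adj)
        # Rank only nodes inside the LCC by current degree (ties by name).
        ranked = sorted(lcc, key=lambda u: (-len(adj[u]), str(u)))
        adj = remove_nodes(adj, ranked[:batch_size])
        sizes.append(len(largest_component(adj)))
    return sizes


# Hypothetical toy network: a hub "h", two small groups and a separate pair.
TOY_EDGES = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"),
             ("a", "b"), ("d", "e"), ("e", "f"), ("x", "y")]
lcc_sizes = disrupt(build_graph(TOY_EDGES), batch_size=1, rounds=2)
```

Recomputing degrees and the LCC after every batch is what makes the process "continuous" in this sketch; a betweenness-based variant would only change the ranking key.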
ABSTRACT
The well-being and functioning of individuals with chronic pain (CP) vary significantly. Social factors, such as social integration, may help explain this differential impact. Specifically, structural (network size, density) as well as functional (perceived social support, conflict) social network characteristics may play a role. However, it is not yet clear whether and how these variables are associated with each other. The objectives were to examine 1) both types of social network characteristics in individuals with primary and secondary CP, 2) the association between structural network characteristics and mental distress and functioning/participation in daily life, and 3) whether the network's functionality mediated the association between structural network characteristics and mental distress or functioning/participation in daily life. Using an online ego-centered social network tool, cross-sectional data were collected from 303 individuals with CP (81.85% women). No significant differences between individuals with fibromyalgia versus secondary CP were found regarding network size and density. In contrast, ANCOVA models showed lower levels of perceived social support and higher levels of conflict in primary (vs secondary) CP. Structural equation models showed that 1) larger network size indirectly predicted lower mental distress via lower levels of conflict, and 2) higher network density increased mental distress via increased conflict levels. Neither network size nor density predicted functioning/participation in daily life, directly or indirectly. The findings highlight that the role of conflict, in addition to support, should not be underestimated as a mediator for mental well-being. Research on explanatory mechanisms for associations between the network's structure, functionality, and well-being is warranted.
PERSPECTIVE: This paper presents results on associations between structural (network size, density) and functional (social support, conflict) social network characteristics and well-being in the context of CP, using an ego-centered network design. Results suggest an indirect association between structural network characteristics and the mental well-being of individuals with CP.
Subjects
Chronic Pain, Social Support, Humans, Female, Male, Middle Aged, Chronic Pain/physiopathology, Chronic Pain/psychology, Adult, Cross-Sectional Studies, Social Networking, Ego, Psychological Distress, Fibromyalgia/physiopathology, Fibromyalgia/psychology, Aged
ABSTRACT
This article proposes a simple yet powerful ensemble classifier, called Random Hyperboxes, constructed from individual hyperbox-based classifiers trained on random subsets of the sample and feature spaces of the training set. We also show a generalization error bound for the proposed classifier based on the strength of the individual hyperbox-based classifiers as well as the correlation among them. The effectiveness of the proposed classifier is analyzed using a carefully selected illustrative example and compared empirically with other popular single and ensemble classifiers on 20 datasets using statistical testing methods. The experimental results confirm that the proposed method outperforms other fuzzy min-max neural networks (FMNNs) and popular learning algorithms, and is competitive with other ensemble methods. Finally, we identify existing issues related to the generalization error bounds on real datasets and outline potential research directions.
ABSTRACT
Neural architecture search (NAS) has attracted a lot of attention and has been shown to bring tangible benefits in a large number of applications in the past few years. Architecture topology and architecture size are regarded as two of the most important aspects for the performance of deep learning models, and the community has produced many search algorithms for both aspects. However, the performance gains from these search algorithms are achieved under different search spaces and training setups. This makes the overall performance of the algorithms incomparable and the improvement from any sub-module of a search model unclear. In this paper, we propose NATS-Bench, a unified benchmark on searching for both topology and size, for (almost) any up-to-date NAS algorithm. NATS-Bench includes a search space of 15,625 neural cell candidates for architecture topology and 32,768 candidates for architecture size on three datasets. We analyze the validity of our benchmark in terms of various criteria and a performance comparison of all candidates in the search space. We also show the versatility of NATS-Bench by benchmarking 13 recent state-of-the-art NAS algorithms on it. We provide all logs and diagnostic information for each candidate, with every candidate trained using the same setup. This allows a much larger community of researchers to focus on developing better NAS algorithms in a more comparable and computationally effective environment. All code is publicly available at: https://xuanyidong.com/assets/projects/NATS-Bench.
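The two search-space sizes quoted above can be reproduced with a short calculation. The decomposition used here is not stated in the abstract and is an assumption taken from the NATS-Bench paper's description: a cell with 6 edges, each choosing 1 of 5 operations, for topology, and 5 layers, each choosing 1 of 8 channel widths, for size.

```python
from itertools import product

# Assumed structure (from the NATS-Bench paper, not from the abstract):
NUM_EDGES, NUM_OPS = 6, 5        # topology: one operation per cell edge
NUM_LAYERS, NUM_WIDTHS = 5, 8    # size: one channel width per layer

topology_space = NUM_OPS ** NUM_EDGES
size_space = NUM_WIDTHS ** NUM_LAYERS

# Cross-check the closed form by explicitly enumerating every topology.
enumerated = sum(1 for _ in product(range(NUM_OPS), repeat=NUM_EDGES))
```

Both spaces are small enough to train exhaustively, which is what makes a tabular benchmark of this kind feasible.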
Subjects
Algorithms, Neural Networks (Computer), Benchmarking, Neurons, Plant Extracts
ABSTRACT
Graph embedding approaches have been attracting increasing attention in recent years, mainly due to their universal applicability. They convert network data into a vector space in which the graph's structural information and properties are maximally preserved. Most existing approaches, however, ignore the rich information about interactions between nodes, i.e., edge attributes or edge types. Moreover, the learned embeddings suffer from a lack of explainability and cannot be used to study the effects of typed structures in edge-attributed networks. In this paper, we introduce a framework to embed edge type information in graphlets and generate a Typed-Edge Graphlets Degree Vector (TyE-GDV). Additionally, we extend two combinatorial approaches, i.e., the colored graphlets and heterogeneous graphlets approaches, to edge-attributed networks. By applying the proposed method to a case study of chronic pain patients, we find not only that the network structure of a patient can indicate his/her perceived pain grade, but also that certain social ties, such as those with friends, colleagues, and healthcare professionals, are more crucial in understanding the impact of chronic pain. Further, we demonstrate that in a node classification task, the edge-type encoded graphlet approaches outperform the traditional graphlet degree vector approach by a significant margin, and that TyE-GDV achieves performance competitive with the combinatorial approaches while being far more efficient in space requirements.
Subjects
Algorithms, Chronic Pain, Female, Humans, Male, Biological Models
ABSTRACT
The triangle structure, being a fundamental and significant element, underlies many theories and techniques in studying complex networks. The formation of triangles is typically measured by the clustering coefficient, in which the focal node is the centre-node in an open triad. In contrast, the recently proposed closure coefficient measures triangle formation from an end-node perspective and has been proven to be a useful feature in network analysis. Here, we extend it by proposing the directed closure coefficient, which measures the formation of directed triangles. By distinguishing the direction of the closing edge in building triangles, we further introduce the source closure coefficient and the target closure coefficient. Then, by categorising particular types of directed triangles (e.g., head-of-path), we propose four closure patterns. Through multiple experiments on 24 directed networks from six domains, we demonstrate that at the network level, the four closure patterns are distinctive features in classifying network types, while at the node level, adding the source and target closure coefficients leads to significant improvement in the link prediction task in most types of directed networks.
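To make the centre-node versus end-node distinction concrete, the sketch below computes the two undirected baselines on a toy graph: the local clustering coefficient (closed wedges centred at a node) and the local closure coefficient (closed wedges that start at a node). It does not implement the directed, source or target variants proposed in the abstract; the function names and the example graph are assumptions made for illustration.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_at(adj, u):
    """Number of triangles that contain node u."""
    return sum(1 for v, w in combinations(adj[u], 2) if w in adj[v])


def clustering_coefficient(adj, u):
    """Fraction of wedges centred at u that are closed."""
    d = len(adj[u])
    wedges = d * (d - 1) // 2
    return triangles_at(adj, u) / wedges if wedges else 0.0


def closure_coefficient(adj, u):
    """Fraction of wedges with u as an end-node that are closed."""
    # A wedge headed at u is a path u - v - w with w != u.
    wedges = sum(len(adj[v]) - 1 for v in adj[u])
    # Each triangle at u closes two such wedges (one per direction).
    return 2 * triangles_at(adj, u) / wedges if wedges else 0.0


# Hypothetical toy graph: triangle a-b-c with a pendant node d attached to c.
g = build_graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
scores = {u: (clustering_coefficient(g, u), closure_coefficient(g, u)) for u in g}
```

On this graph the two measures disagree for the same node: `a` has clustering 1 but closure 2/3, while `c` has clustering 1/3 but closure 1, which shows that the end-node view carries different information.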
Subjects
Algorithms, Theoretical Models
ABSTRACT
Metalearning has attracted considerable interest in the machine learning community in recent years. Yet, some disagreement remains on what does or does not constitute a metalearning problem and in which contexts the term is used. This survey aims to give an all-encompassing overview of the research directions pursued under the umbrella of metalearning, reconciling different definitions given in the scientific literature, listing the choices involved when designing a metalearning system and identifying some of the future research challenges in this domain.
ABSTRACT
Community detection in complex networks is a fundamental data analysis task in various domains, and how to effectively find overlapping communities in real applications is still a challenge. In this work, we propose a new unified model and method for finding the best overlapping communities on the basis of the associated node and link partitions derived from the same framework. Specifically, we first describe a unified model that accommodates node and link communities (partitions) together, and then present a nonnegative matrix factorization method to learn the parameters of the model. Thereafter, we infer the overlapping communities based on the derived node and link communities, i.e., we determine each overlapped community between the corresponding node and link community with a greedy optimization of a local community function, conductance. Finally, we introduce a model selection method based on consensus clustering to determine the number of communities. We have evaluated our method on both synthetic and real-world networks with ground truth, and compared it with seven state-of-the-art methods. The experimental results demonstrate the superior performance of our method over the competing ones in detecting overlapping communities for all analysed data sets. The improvement is particularly pronounced for networks with more complicated community structures.
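The conductance function mentioned above has a standard definition: the number of edges leaving a community divided by the smaller of the two edge volumes on either side of the cut. The sketch below computes it and pairs it with a generic greedy expansion that adds a neighbouring node only while conductance decreases. This greedy loop is a textbook-style stand-in, not the authors' inference procedure, and the function names and toy graph are assumptions.

```python
def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, community):
    """cut(S, rest) / min(vol(S), vol(rest)); inf when a side has no volume."""
    community = set(community)
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    vol_in = sum(len(adj[u]) for u in community)
    vol_out = sum(len(adj[u]) for u in adj if u not in community)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


def greedy_expand(adj, seed):
    """Grow a community from seed while conductance strictly improves."""
    community = set(seed)
    current = conductance(adj, community)
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        if not frontier:
            return community
        # Try every neighbouring node and keep the one with lowest conductance.
        best = min(sorted(frontier),
                   key=lambda v: conductance(adj, community | {v}))
        best_score = conductance(adj, community | {best})
        if best_score >= current:
            return community
        community.add(best)
        current = best_score


# Hypothetical toy graph: two triangles joined by the bridge 3-4.
g = build_graph([(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)])
found = greedy_expand(g, {1})
```

Returning infinity for the degenerate case (a community covering the whole graph, or one with no edges) keeps the greedy loop from absorbing every node.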
ABSTRACT
Estimation of the generalization ability of a classification or regression model is an important issue, as it indicates the expected performance on previously unseen data and is also used for model selection. Currently used generalization error estimation procedures, such as cross-validation (CV) or bootstrap, are stochastic and thus require multiple repetitions in order to produce reliable results, which can be computationally expensive, if not prohibitive. The correntropy-inspired density-preserving sampling (DPS) procedure proposed in this paper eliminates the need for repeating the error estimation procedure by dividing the available data into subsets that are guaranteed to be representative of the input dataset. This allows the production of low-variance error estimates with an accuracy comparable to that of CV repeated 10 times, at a fraction of the computations required by CV. This method can also be used for model ranking and selection. This paper derives the DPS procedure and investigates its usability and performance using a set of public benchmark datasets and standard classifiers.
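The motivation above, that a single cross-validation run depends on a random split and therefore has to be repeated, can be illustrated with a small sketch. It runs 10-fold CV ten times with different shuffles and collects the error estimates so their spread can be inspected. It does not implement DPS; the 1-nearest-neighbour classifier, the synthetic one-dimensional data and all names are assumptions chosen to keep the example self-contained.

```python
import random
import statistics


def make_toy_data(n_per_class=30, seed=0):
    """Two overlapping 1-D Gaussian classes as (feature, label) pairs."""
    rng = random.Random(seed)
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(1.5, 1.0), 1) for _ in range(n_per_class)]
    return data


def kfold_indices(n, k, rng):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def one_nn_predict(train, x):
    """Label of the training point closest to x."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def cv_error(data, k, seed):
    """Misclassification rate of 1-NN under one random k-fold split."""
    folds = kfold_indices(len(data), k, random.Random(seed))
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [p for i, p in enumerate(data) if i not in held_out]
        errors += sum(one_nn_predict(train, data[i][0]) != data[i][1]
                      for i in fold)
    return errors / len(data)


data = make_toy_data()
# Ten repetitions of 10-fold CV, each with a different shuffle of the data.
estimates = [cv_error(data, k=10, seed=s) for s in range(10)]
spread = statistics.pstdev(estimates)
```

Each repetition retrains the classifier `k` times, so ten repetitions cost ten times as much as a single run; that repetition cost is what a deterministic, representative split aims to avoid.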
ABSTRACT
This paper describes the methodology of building a predictive model for marine pollution monitoring, based on low-quality biomarker data. A step-by-step, systematic data analysis approach is presented, resulting in the design of a purely data-driven model able to accurately discriminate between various coastal water pollution levels. Environmental scientists often try to apply various machine learning techniques to their data without much success, mostly because of a lack of experience with different methods and of the required 'under the hood' knowledge. This paper is therefore the result of a collaboration between the machine learning and environmental science communities, presenting a predictive model development workflow, as well as discussing and addressing potential pitfalls and difficulties. The novelty of the modelling approach lies in the successful application of machine learning techniques to high-dimensional, incomplete biomarker data, which to our knowledge has not been done before.