Search | Biblioteca Virtual em Saúde (Virtual Health Library)

1.

Egoism, utilitarianism and egalitarianism in multi-agent reinforcement learning.

Dong, Shaokang; Li, Chao; Yang, Shangdong; An, Bo; Li, Wenbin; Gao, Yang.

Neural Netw ; 178: 106544, 2024 Oct.

Article in English | MEDLINE | ID: mdl-39053197

ABSTRACT

In multi-agent partially observable sequential decision problems with general-sum rewards, it is necessary to account for the egoism (individual rewards), utilitarianism (social welfare), and egalitarianism (fairness) criteria simultaneously. However, balancing these criteria is a challenge for current multi-agent reinforcement learning methods. Specifically, fully decentralized methods, which lack global information about all agents' rewards, observations, and actions, fail to learn a balanced policy, while agents in centralized training (with decentralized execution) methods are reluctant to share private information for fear of exploitation by others. To address these issues, this paper proposes a Decentralized and Federated (D&F) paradigm, in which decentralized agents train egoistic policies using only local information to pursue self-interest, and the federation controller primarily considers utilitarianism and egalitarianism. Meanwhile, the parameters of the decentralized and federated policies are mutually optimized under discrepancy constraints, akin to a server-client pattern, which ensures a balance between egoism, utilitarianism, and egalitarianism. Furthermore, theoretical analysis shows that the federated model, as well as the discrepancy between the decentralized egoistic policies and the federated utilitarian policies, attains an O(1/T) convergence rate. Extensive experiments show that our D&F approach outperforms multiple baselines in terms of both utilitarianism and egalitarianism.

Subjects

Reinforcement, Psychology ; Humans ; Reward ; Decision Making/physiology
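
The three criteria named in the abstract of entry 1 can be made concrete with a small sketch. It is illustrative only and not taken from the paper: the function name `welfare_and_fairness` is hypothetical, and using the minimum return and the Gini coefficient as fairness measures is an assumption. Egoism is represented simply by each agent's own return.

```python
from typing import Dict, List


def welfare_and_fairness(returns: List[float]) -> Dict[str, float]:
    """Summarise per-agent returns (the egoistic quantities) for the group.

    Assumes non-negative returns so that the Gini coefficient is well defined.
    """
    if not returns:
        raise ValueError("need at least one agent return")
    n = len(returns)
    total = sum(returns)  # utilitarian view: social welfare
    if total > 0:
        # Gini coefficient: 0 means all agents receive the same return.
        diff_sum = sum(abs(a - b) for a in returns for b in returns)
        gini = diff_sum / (2 * n * total)
    else:
        gini = 0.0
    return {
        "welfare": total,
        "min_return": min(returns),  # egalitarian view: worst-off agent
        "gini": gini,
    }


# Two hypothetical outcomes with equal welfare but different fairness.
balanced = welfare_and_fairness([4.0, 4.0, 4.0])
skewed = welfare_and_fairness([10.0, 1.0, 1.0])
```

Both outcomes have the same social welfare, but the skewed one leaves two agents far worse off, which is the kind of trade-off a purely utilitarian objective cannot see.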

2.

Learning Rich Feature Representation and State Change Monitoring for Accurate Animal Target Tracking.

Yin, Kuan; Feng, Jiangfan; Dong, Shaokang.

Animals (Basel) ; 14(6), 2024 Mar 14.

Article in English | MEDLINE | ID: mdl-38539999

ABSTRACT

Animal tracking is crucial for understanding migration, habitat selection, and behavior patterns. However, challenges in video data acquisition and the unpredictability of animal movements have hindered progress in this field. To address these challenges, we present a novel animal tracking method based on correlation filters. Our approach integrates hand-crafted features, deep features, and temporal context information to learn a rich feature representation of the target animal, enabling effective monitoring and updating of its state. Specifically, we extract hand-crafted histogram of oriented gradient features and deep features from different layers of the animal, creating tailored fusion features that encapsulate both appearance and motion characteristics. By analyzing the response map, we select optimal fusion features based on the oscillation degree. When the target animal's state changes significantly, we adaptively update the target model using temporal context information and robust feature data from the current frame. This updated model is then used for re-tracking, leading to improved results compared to recent mainstream algorithms, as demonstrated in extensive experiments conducted on our self-constructed animal datasets. By addressing specific challenges in animal tracking, our method offers a promising approach for more effective and accurate animal behavior research.
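
The abstract of entry 2 does not say how the "oscillation degree" of a response map is measured. As a hedged illustration, the sketch below uses the average peak-to-correlation energy (APCE), a measure commonly used with correlation-filter trackers. Both the choice of APCE and the helper names `apce` and `pick_feature` are assumptions, not the authors' method.

```python
from typing import Dict, List

ResponseMap = List[List[float]]


def apce(response: ResponseMap) -> float:
    """Average peak-to-correlation energy of a 2D response map.

    A sharp single peak gives a high value; a map with many
    competing peaks (strong oscillation) gives a low value.
    """
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    if energy == 0:
        return 0.0  # flat map: no usable peak
    return (f_max - f_min) ** 2 / energy


def pick_feature(responses: Dict[str, ResponseMap]) -> str:
    """Return the name of the feature whose response map oscillates least."""
    return max(responses, key=lambda name: apce(responses[name]))


# Hypothetical response maps for two candidate fusion features.
sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
noisy = [[0.0, 0.8, 0.0], [0.9, 1.0, 0.7], [0.0, 0.8, 0.0]]
best = pick_feature({"hog_plus_deep": sharp, "deep_only": noisy})
```

A drop in such a score between frames could also serve as the signal that the target's state has changed, although the abstract does not specify the exact criterion.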

3.

WToE: Learning When to Explore in Multiagent Reinforcement Learning.

Dong, Shaokang; Mao, Hangyu; Yang, Shangdong; Zhu, Shengyu; Li, Wenbin; Hao, Jianye; Gao, Yang.

IEEE Trans Cybern ; PP, 2023 Nov 21.

Article in English | MEDLINE | ID: mdl-37988210

ABSTRACT

Existing multiagent exploration works focus on how to explore in fully cooperative tasks, which is insufficient in environments with nonstationarity induced by agent interactions. To tackle this issue, we propose When to Explore (WToE), a simple yet effective variational exploration method that learns when to explore in nonstationary environments. WToE employs an interaction-oriented adaptive exploration mechanism to adapt to environmental changes. We first propose a novel graphical model that uses a latent random variable to model the step-level environmental change resulting from interaction effects. Leveraging this graphical model, we employ the supervised variational auto-encoder (VAE) framework to derive a short-term inferred policy from historical trajectories to deal with the nonstationarity. Finally, agents engage in exploration when the short-term inferred policy diverges from the current actor policy. The proposed approach theoretically guarantees the convergence of the Q-value function. In our experiments, we validate our exploration mechanism in grid examples, multiagent particle environments, and the battle game of MAgent environments. The results demonstrate the superiority of WToE over multiple baselines and existing exploration methods, such as MAEXQ, NoisyNets, EITI, and PR2.
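
The exploration trigger described in the abstract of entry 3 (explore when the short-term inferred policy diverges from the current actor policy) can be sketched for discrete actions as follows. The use of KL divergence, the threshold value, and the function names are assumptions for illustration; the abstract does not state which divergence test WToE applies.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete action distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of actions")
    # eps guards against log(0) when q assigns zero probability to an action.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def should_explore(
    inferred_policy: Sequence[float],
    actor_policy: Sequence[float],
    threshold: float = 0.1,  # hypothetical value, would need tuning
) -> bool:
    """Explore only when the two policies disagree by more than the threshold."""
    return kl_divergence(inferred_policy, actor_policy) > threshold


# Policies that agree: keep exploiting. Policies that disagree: explore.
agree = should_explore([0.5, 0.5], [0.5, 0.5])
disagree = should_explore([0.9, 0.1], [0.1, 0.9])
```

In this sketch the inferred policy would come from the VAE over recent trajectories, so a large divergence indicates that the environment has shifted since the actor was last updated.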

4.

Electrochemical Behavior of Al(III) and Formation of Different Phases Al-Ni Alloys Deposits from LiCl-KCl-AlCl3 Molten Salt.

Peng, Yaru; Chen, Zeng; Bai, Ying; Pei, Qingqing; Li, Wei; Diao, Chunli; Li, Xijin; Li, Shengjun; Dong, Shaokang.

Materials (Basel) ; 11(11), 2018 Oct 27.

Article in English | MEDLINE | ID: mdl-30373246

ABSTRACT

The electrochemical behavior of Al(III) deposition on Ni substrates was investigated in LiCl-KCl-AlCl3 (2 wt.%) molten salts. Various electrochemical methods, including cyclic voltammetry (CV), square wave voltammetry (SWV), and open circuit chronopotentiometry (OCP), were used to explore the deposition processes of Al(III) on Ni substrates. Five Al-Ni alloy phases were electrodeposited for the first time by regulating the deposition potential from LiCl-KCl-AlCl3 (2 wt.%) molten salts at 753 K. The formation of the Al-Ni alloys AlNi3, Ni5Al3, AlNi, Al3Ni2, and Al3Ni was confirmed by X-ray diffraction (XRD), and the cross-section morphologies were investigated by scanning electron microscopy (SEM). It was also found that the temperature of the molten salt is another key parameter for controlling the alloy phase: no Al-Ni alloy phases other than AlNi3 and Ni5Al3 could be deposited at 703 K.

5.

Effect of Polyethylene Glycol on the NiO Photocathode.

Li, Shengjun; Chen, Zeng; Kong, Wenping; Jia, Xiyang; Cai, Junhao; Dong, Shaokang.

Nanoscale Res Lett ; 12(1): 501, 2017 Aug 17.

Article in English | MEDLINE | ID: mdl-28819900

ABSTRACT

In this study, a uniform nanoporous NiO film, with a thickness of up to 2.6 µm, was prepared using polyethylene glycol (PEG). The addition of PEG significantly reduced cracking in the NiO film and prevented the film from peeling off the fluorine-doped tin oxide substrate. The NiO cathode was prepared using CdSeS quantum dots (QDs) as the sensitizer, with an optimized photoelectric conversion of 0.80%. The optimized QD-sensitized NiO films were first assembled with the TiO2 anode to prepare QD-sensitized p-n-type tandem solar cells. The open circuit voltage was greater than that obtained using the separated NiO cathode or TiO2 anode.
