Pesquisa | Secretaria de Estado da Saúde

Deception abilities emerged in large language models.

Hagendorff, Thilo.

Proc Natl Acad Sci U S A ; 121(24): e2317967121, 2024 Jun 11.

Artigo em Inglês | MEDLINE | ID: mdl-38833474

RESUMO

Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Thus, aligning them with human values is of great importance. However, given the steady increase in reasoning abilities, future LLMs are under suspicion of becoming able to deceive human operators and utilizing this ability to bypass monitoring efforts. As a prerequisite to this, LLMs need to possess a conceptual understanding of deception strategies. This study reveals that such strategies emerged in state-of-the-art LLMs, but were nonexistent in earlier LLMs. We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents, that their performance in complex deception scenarios can be amplified utilizing chain-of-thought reasoning, and that eliciting Machiavellianism in LLMs can trigger misaligned deceptive behavior. GPT-4, for instance, exhibits deceptive behavior in simple test scenarios 99.16% of the time (P < 0.001). In complex second-order deception test scenarios where the aim is to mislead someone who expects to be deceived, GPT-4 resorts to deceptive behavior 71.46% of the time (P < 0.001) when augmented with chain-of-thought reasoning. In sum, revealing hitherto unknown machine behavior in LLMs, our study contributes to the nascent field of machine psychology.

Assuntos

Enganação , Idioma , Humanos , Inteligência Artificial

Forecasting emergent risks in advanced AI systems: an analysis of a future road transport management system.

McLean, S; King, B J; Thompson, J; Carden, T; Stanton, N A; Baber, C; Read, G J M; Salmon, P M.

Ergonomics ; 66(11): 1750-1767, 2023 Nov.

Artigo em Inglês | MEDLINE | ID: mdl-38009364

RESUMO

Artificial Intelligence (AI) is being increasingly implemented within road transport systems worldwide. Next generation of AI, Artificial General Intelligence (AGI) is imminent, and is anticipated to be more powerful than current AI. AGI systems will have a broad range of abilities and be able to perform multiple cognitive tasks akin to humans that will likely produce many expected benefits, but also potential risks. This study applied the EAST Broken Links approach to forecast the functioning of an AGI system tasked with managing a road transport system and identify potential risks. In total, 363 risks were identified that could have adverse impacts on the stated goals of safety, efficiency, environmental sustainability, and economic performance of the road system. Further, risks beyond the stated goals were identified; removal from human control, mismanaging public relations, and self-preservation. A diverse set of systemic controls will be required when designing, implementing, and operating future advanced technologies.Practitioner summary: This study demonstrated the utility of HFE methods for formally considering risks associated with the design, implementation, and operation of future technologies. This study has implications for AGI research, design, and development to ensure safe and ethical AGI implementation.

Assuntos

Inteligência Artificial , Tecnologia , Humanos , Previsões

Comparing human text classification performance and explainability with large language and machine learning models using eye-tracking.

Divya Venkatesh, Jeevithashree; Jaiswal, Aparajita; Nanda, Gaurav.

Sci Rep ; 14(1): 14295, 2024 06 21.

Artigo em Inglês | MEDLINE | ID: mdl-38906943

RESUMO

To understand the alignment between reasonings of humans and artificial intelligence (AI) models, this empirical study compared the human text classification performance and explainability with a traditional machine learning (ML) model and large language model (LLM). A domain-specific noisy textual dataset of 204 injury narratives had to be classified into 6 cause-of-injury codes. The narratives varied in terms of complexity and ease of categorization based on the distinctive nature of cause-of-injury code. The user study involved 51 participants whose eye-tracking data was recorded while they performed the text classification task. While the ML model was trained on 120,000 pre-labelled injury narratives, LLM and humans did not receive any specialized training. The explainability of different approaches was compared based on the top words they used for making classification decision. These words were identified using eye-tracking for humans, explainable AI approach LIME for ML model, and prompts for LLM. The classification performance of ML model was observed to be relatively better than zero-shot LLM and non-expert humans, overall, and particularly for narratives with high complexity and difficult categorization. The top-3 predictive words used by ML and LLM for classification agreed with humans to a greater extent as compared to later predictive words.

Assuntos

Tecnologia de Rastreamento Ocular , Aprendizado de Máquina , Humanos , Idioma , Feminino , Masculino , Inteligência Artificial , Adulto , Movimentos Oculares/fisiologia

Editorial: Neural computations for brain machine interface applications.

Kang, Young Ho; Khorasani, Abed; Flint, Robert D; Farrokhi, Behraz; Lee, Sang Wan.

Front Hum Neurosci ; 17: 1334636, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-38094144

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

Detalhe da pesquisa