
Using the Veil of Ignorance to align AI systems with principles of justice.

Weidinger, Laura; McKee, Kevin R; Everett, Richard; Huang, Saffron; Zhu, Tina O; Chadwick, Martin J; Summerfield, Christopher; Gabriel, Iason.

Proc Natl Acad Sci U S A ; 120(18): e2213709120, 2023 05 02.

Article in English | MEDLINE | ID: mdl-37094137

ABSTRACT

The philosopher John Rawls proposed the Veil of Ignorance (VoI) as a thought experiment to identify fair principles for governing a society. Here, we apply the VoI to an important governance domain: artificial intelligence (AI). In five incentive-compatible studies (N = 2,508), including two preregistered protocols, participants choose principles to govern an Artificial Intelligence (AI) assistant from behind the veil: that is, without knowledge of their own relative position in the group. Compared to participants who have this information, we find a consistent preference for a principle that instructs the AI assistant to prioritize the worst-off. Neither risk attitudes nor political preferences adequately explain these choices. Instead, they appear to be driven by elevated concerns about fairness: Without prompting, participants who reason behind the VoI more frequently explain their choice in terms of fairness, compared to those in the Control condition. Moreover, we find initial support for the ability of the VoI to elicit more robust preferences: In the studies presented here, the VoI increases the likelihood of participants continuing to endorse their initial choice in a subsequent round where they know how they will be affected by the AI intervention and have a self-interested motivation to change their mind. These results emerge in both a descriptive and an immersive game. Our findings suggest that the VoI may be a suitable mechanism for selecting distributive principles to govern AI.

Subjects

Artificial Intelligence, Societies, Humans, Social Justice

STELA: a community-centred approach to norm elicitation for AI alignment.

Bergman, Stevie; Marchal, Nahema; Mellor, John; Mohamed, Shakir; Gabriel, Iason; Isaac, William.

Sci Rep ; 14(1): 6616, 2024 03 19.

Article in English | MEDLINE | ID: mdl-38503818

ABSTRACT

Value alignment, the process of ensuring that artificial intelligence (AI) systems are aligned with human values and goals, is a critical issue in AI research. Existing scholarship has mainly studied how to encode moral values into agents to guide their behaviour. Less attention has been given to the normative questions of whose values and norms AI systems should be aligned with, and how these choices should be made. To tackle these questions, this paper presents the STELA process (SocioTEchnical Language agent Alignment), a methodology resting on sociotechnical traditions of participatory, inclusive, and community-centred processes. For STELA, we conduct a series of deliberative discussions with four historically underrepresented groups in the United States in order to understand their diverse priorities and concerns when interacting with AI systems. The results of our research suggest that community-centred deliberation on the outputs of large language models is a valuable tool for eliciting latent normative perspectives directly from differently situated groups. In addition to having the potential to engender an inclusive process that is robust to the needs of communities, this methodology can provide rich contextual insights for AI alignment.

Subjects

Artificial Intelligence, Language, Humans, Morals, Rest
