Data hazards in synthetic biology.

Zelenka, Natalie R; Di Cara, Nina; Sharma, Kieren; Sarvaharman, Seeralan; Ghataora, Jasdeep S; Parmeggiani, Fabio; Nivala, Jeff; Abdallah, Zahraa S; Marucci, Lucia; Gorochowski, Thomas E

Affiliation

Zelenka NR; Jean Golding Institute, University of Bristol, Bristol, UK.
Di Cara N; BrisEngBio, University of Bristol, Bristol, UK.
Sharma K; School of Psychological Science, University of Bristol, Bristol, UK.
Sarvaharman S; School of Engineering Mathematics and Technology, University of Bristol, Bristol, UK.
Ghataora JS; School of Biological Sciences, University of Bristol, Bristol, UK.
Parmeggiani F; BrisEngBio, University of Bristol, Bristol, UK.
Nivala J; School of Biological Sciences, University of Bristol, Bristol, UK.
Abdallah ZS; BrisEngBio, University of Bristol, Bristol, UK.
Marucci L; School of Biochemistry, University of Bristol, Bristol, UK.
Gorochowski TE; School of Pharmacy and Pharmaceutical Sciences, Cardiff University, Cardiff, UK.

Synth Biol (Oxf) ; 9(1): ysae010, 2024.

Article in En | MEDLINE | ID: mdl-38973982

ABSTRACT

Data science is playing an increasingly important role in the design and analysis of engineered biology. This has been fueled by the development of high-throughput methods like massively parallel reporter assays, data-rich microscopy techniques, computational protein structure prediction and design, and the development of whole-cell models able to generate huge volumes of data. Although the ability to apply data-centric analyses in these contexts is appealing and increasingly simple to do, it comes with potential risks. For example, how might biases in the underlying data affect the validity of a result and what might the environmental impact of large-scale data analyses be? Here, we present a community-developed framework for assessing data hazards to help address these concerns and demonstrate its application to two synthetic biology case studies. We show the diversity of considerations that arise in common types of bioengineering projects and provide some guidelines and mitigating steps. Understanding potential issues and dangers when working with data and proactively addressing them will be essential for ensuring the appropriate use of emerging data-intensive AI methods and help increase the trustworthiness of their applications in synthetic biology.

Keywords

AI; data hazards; data science; ethics; synthetic biology

Full text: 1 Collections: 01-internacional Database: MEDLINE Language: En Journal: Synth Biol (Oxf) Publication year: 2024 Document type: Article