Novel machine learning method allerStat identifies statistically significant allergen-specific patterns in protein sequences.
Goto, Kento; Tamehiro, Norimasa; Yoshida, Takumi; Hanada, Hiroyuki; Sakuma, Takuto; Adachi, Reiko; Kondo, Kazunari; Takeuchi, Ichiro.
Affiliation
  • Goto K; Department of Computer Science, Nagoya Institute of Technology, Nagoya, Aichi, Japan.
  • Tamehiro N; Division of Biochemistry, National Institute of Health Sciences, Kawasaki, Kanagawa, Japan.
  • Yoshida T; Department of Computer Science, Nagoya Institute of Technology, Nagoya, Aichi, Japan.
  • Hanada H; Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan.
  • Sakuma T; Department of Computer Science, Nagoya Institute of Technology, Nagoya, Aichi, Japan.
  • Adachi R; Division of Biochemistry, National Institute of Health Sciences, Kawasaki, Kanagawa, Japan.
  • Kondo K; Division of Biochemistry, National Institute of Health Sciences, Kawasaki, Kanagawa, Japan. Electronic address: kondo@nihs.go.jp.
  • Takeuchi I; Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan; Graduate School of Engineering, Nagoya University, Furo-cho, Nagoya, Japan. Electronic address: ichiro.takeuchi@mae.nagoya-u.ac.jp.
J Biol Chem; 299(6): 104733, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37086787
Cutting-edge technologies such as genome editing and synthetic biology allow us to produce novel foods and functional proteins. However, their toxicity and allergenicity must be accurately evaluated. Specific amino acid sequences are known to make some proteins allergenic, but many of these sequences remain uncharacterized. In this study, we introduce a data-driven approach and a machine-learning method to find undiscovered allergen-specific patterns (ASPs) among amino acid sequences. The proposed method enables an exhaustive search for amino acid subsequences whose frequencies are statistically significantly higher in allergenic proteins. As a proof of concept, we created a database of 21,154 proteins for which the presence or absence of allergic reactions is already known, and applied the proposed method to it. The detected ASPs in this proof-of-concept study were consistent with known biological findings, and the allergenicity prediction performance using the detected ASPs was higher than that of existing approaches, indicating that this method may be useful in evaluating the utility of synthetic foods and proteins.
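The core idea in the abstract, searching exhaustively for subsequences that occur significantly more often in allergenic proteins, can be illustrated with a minimal sketch. This is not the authors' allerStat implementation, whose algorithm is not described in this record. As a stand-in it enumerates fixed-length subsequences (k-mers), applies a one-sided Fisher's exact test to each, and uses a Bonferroni correction for multiple testing. The function names, the parameters `k` and `alpha`, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from math import comb


def kmers(seq, k):
    """Return the set of distinct length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fisher_one_sided(a, n_pos, b, n_neg):
    """One-sided Fisher's exact test p-value.

    a / b are the numbers of allergenic / non-allergenic proteins that
    contain the pattern. Returns P(X >= a) under the hypergeometric null.
    """
    total = n_pos + n_neg
    hits = a + b
    upper = min(hits, n_pos)
    tail = sum(
        comb(hits, x) * comb(total - hits, n_pos - x)
        for x in range(a, upper + 1)
    )
    return tail / comb(total, n_pos)


def find_enriched_patterns(allergens, non_allergens, k=3, alpha=0.05):
    """Hypothetical helper: list k-mers enriched in the allergen set.

    Assumption: Bonferroni correction over every k-mer observed at least
    once, which is a simple substitute for the paper's own procedure.
    """
    pos_counts, neg_counts = {}, {}
    for seq in allergens:
        for p in kmers(seq, k):
            pos_counts[p] = pos_counts.get(p, 0) + 1
    for seq in non_allergens:
        for p in kmers(seq, k):
            neg_counts[p] = neg_counts.get(p, 0) + 1

    candidates = set(pos_counts) | set(neg_counts)
    threshold = alpha / len(candidates) if candidates else alpha

    results = []
    for p in candidates:
        a = pos_counts.get(p, 0)
        b = neg_counts.get(p, 0)
        pval = fisher_one_sided(a, len(allergens), b, len(non_allergens))
        if pval <= threshold:
            results.append((p, a, b, pval))
    # Most significant patterns first
    return sorted(results, key=lambda r: r[3])


# Toy data (invented for illustration, not real allergen sequences)
toy_allergens = ["ACPCA", "GCPCG", "TCPCT", "LCPCL", "MCPCM", "SCPCS"]
toy_non_allergens = ["AGTLM", "GTLMS", "TLMSA", "LMSAG", "MSAGT", "SAGTL"]

significant = find_enriched_patterns(toy_allergens, toy_non_allergens, k=3)
for pattern, a, b, pval in significant:
    print(pattern, a, b, pval)
```

Bonferroni correction is conservative when the number of candidate patterns is large, so a sketch like this would likely lose power on a database of the size described above; it is meant only to make the notion of a "statistically significantly higher" frequency concrete.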

Full text: 1 | Database: MEDLINE | Main subject: Allergens / Proteins / Machine Learning | Language: En | Publication year: 2023 | Document type: Article
