CLADE 2.0: Evolution-Driven Cluster Learning-Assisted Directed Evolution.
Qiu, Yuchi; Wei, Guo-Wei.
Affiliation
  • Qiu Y; Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States.
  • Wei GW; Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States.
J Chem Inf Model; 62(19): 4629-4641, 2022 Oct 10.
Article in English | MEDLINE | ID: mdl-36154171
ABSTRACT
Directed evolution, a revolutionary biotechnology in protein engineering, optimizes protein fitness by searching an astronomical mutational space via expensive experiments. Cluster learning-assisted directed evolution (CLADE) efficiently explores the mutational space by combining unsupervised hierarchical clustering with supervised learning. However, the initial-stage sampling in CLADE treats all clusters equally, even though many clusters contain a large proportion of non-functional mutations. Recent statistical and deep learning tools enable evolutionary density modeling to assess protein fitness in an unsupervised manner. In this work, we construct an ensemble of multiple evolutionary scores to guide the initial sampling in CLADE. The resulting evolutionary score-enhanced CLADE, called CLADE 2.0, efficiently selects a training set within a small informative space using evolution-driven clustering sampling. CLADE 2.0 is validated on two benchmark libraries, each containing 160,000 sequences from four-site mutational combinations. Extensive computational experiments and comparisons with existing cutting-edge methods indicate that CLADE 2.0 is a new state-of-the-art tool for machine learning-assisted directed evolution.
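The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of score-guided cluster sampling, not the authors' method. It assumes the 20 standard amino acids at each of the four sites (which yields the 160,000-variant library size), combines several hypothetical evolutionary scores by averaging their z-scores, and draws clusters with probability proportional to the exponential of their mean ensemble score. The function names, the z-score averaging, and the exponential weighting are all illustrative assumptions.

```python
import math
import random

# Assumption: the 20 standard amino acids are allowed at every mutated site.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(n_sites=4):
    """Number of variants in a full combinatorial library over n_sites."""
    return len(AMINO_ACIDS) ** n_sites


def zscore(values):
    """Standardize a list of scores to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid division by zero for constant scores
    return [(v - mean) / std for v in values]


def ensemble_score(score_lists):
    """Average several per-variant scores after z-normalizing each one.

    score_lists holds one list per (hypothetical) evolutionary score,
    all indexed by the same variants.
    """
    normalized = [zscore(scores) for scores in score_lists]
    return [sum(column) / len(column) for column in zip(*normalized)]


def sample_by_cluster(cluster_ids, scores, n_samples, rng):
    """Pick variant indices, favoring clusters with a high mean score.

    Assumption: a cluster's weight is exp(mean ensemble score); within
    the chosen cluster a not-yet-selected variant is drawn uniformly.
    """
    members = {}
    for index, cluster in enumerate(cluster_ids):
        members.setdefault(cluster, []).append(index)
    weights = {
        cluster: math.exp(sum(scores[i] for i in idxs) / len(idxs))
        for cluster, idxs in members.items()
    }
    chosen = []
    while len(chosen) < n_samples and members:
        clusters = list(members)
        cluster = rng.choices(
            clusters, weights=[weights[c] for c in clusters]
        )[0]
        pool = members[cluster]
        chosen.append(pool.pop(rng.randrange(len(pool))))
        if not pool:  # drop clusters that have been fully sampled
            del members[cluster]
    return chosen


print(library_size())  # 20**4 = 160000
```

In this sketch a cluster dominated by low-scoring (likely non-functional) variants is sampled rarely but never excluded outright, which is one simple way to bias the initial training set without discarding any region of the library.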
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteins / Machine Learning Type of study: Prognostic_studies Language: En Journal: J Chem Inf Model Journal subject: MEDICAL INFORMATICS / CHEMISTRY Year of publication: 2022 Document type: Article Country of affiliation: United States
