A Generative Neural Network for Maximizing Fitness and Diversity of Synthetic DNA and Protein Sequences.
Linder, Johannes; Bogard, Nicholas; Rosenberg, Alexander B; Seelig, Georg.
Affiliation
  • Linder J; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA. Electronic address: jlinder2@cs.washington.edu.
  • Bogard N; Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA.
  • Rosenberg AB; Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA.
  • Seelig G; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA; Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA.
Cell Syst; 11(1): 49-62.e16, 2020 07 22.
Article in En | MEDLINE | ID: mdl-32711843
ABSTRACT
Engineering gene and protein sequences with defined functional properties is a major goal of synthetic biology. Deep neural network models, together with gradient ascent-style optimization, show promise for sequence design. The generated sequences can however get stuck in local minima and often have low diversity. Here, we develop deep exploration networks (DENs), a class of activation-maximizing generative models, which minimize the cost of a neural network fitness predictor by gradient descent. By penalizing any two generated patterns on the basis of a similarity metric, DENs explicitly maximize sequence diversity. To avoid drifting into low-confidence regions of the predictor, we incorporate variational autoencoders to maintain the likelihood ratio of generated sequences. Using DENs, we engineered polyadenylation signals with more than 10-fold higher selection odds than the best gradient ascent-generated patterns, identified splice regulatory sequences predicted to result in highly differential splicing between cell lines, and improved on state-of-the-art results for protein design tasks.
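The trade-off described in the abstract, a fitness cost combined with a penalty on similar pairs of generated patterns, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the overlap-based `similarity`, the hinge `margin`, the `weight`, and `toy_fitness_cost` (a hypothetical stand-in for a trained neural network predictor) are not taken from the paper, and no generator or gradient step is shown.

```python
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of per-position probability vectors."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def similarity(p, q):
    """Mean per-position overlap of two equal-length patterns.

    Identical one-hot sequences score 1.0; sequences sharing no position score 0.0.
    """
    return sum(sum(a * b for a, b in zip(pi, qi)) for pi, qi in zip(p, q)) / len(p)


def toy_fitness_cost(p):
    """Hypothetical stand-in for a trained predictor: cost falls as 'A' content rises."""
    return 1.0 - sum(pos[0] for pos in p) / len(p)


def pair_loss(p, q, margin=0.5, weight=2.0):
    """Fitness cost of both patterns plus a hinge penalty on similarity above the margin."""
    penalty = max(0.0, similarity(p, q) - margin)
    return toy_fitness_cost(p) + toy_fitness_cost(q) + weight * penalty


a, b, c = one_hot("AAAA"), one_hot("AAAA"), one_hot("AACC")
print(pair_loss(a, b))  # identical pair: optimal toy fitness, but penalized for similarity
print(pair_loss(a, c))  # more diverse pair: no similarity penalty, some fitness cost
```

With these toy settings the more diverse pair receives the lower loss even though one of its members is less fit, which is the behavior a similarity penalty is meant to encourage.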

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: DNA / Neural Networks, Computer / Sequence Analysis, Protein | Type of study: Prognostic_studies | Limit: Humans | Language: En | Journal: Cell Syst | Publication year: 2020 | Document type: Article
