Reconstruction of evolving gene variants and fitness from short sequencing reads.
Shen, Max W; Zhao, Kevin T; Liu, David R.
Affiliation
  • Shen MW; Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
  • Zhao KT; Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA.
  • Liu DR; Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA.
Nat Chem Biol; 17(11): 1188-1198, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34635842
ABSTRACT
Directed evolution can generate proteins with tailor-made activities. However, full-length genotypes, their frequencies and fitnesses are difficult to measure for evolving gene-length biomolecules using most high-throughput DNA sequencing methods, as short read lengths can lose mutation linkages in haplotypes. Here we present Evoracle, a machine learning method that accurately reconstructs full-length genotypes (R² = 0.94) and fitness using short-read data from directed evolution experiments, with substantial improvements over related methods. We validate Evoracle on phage-assisted continuous evolution (PACE) and phage-assisted non-continuous evolution (PANCE) of adenine base editors and OrthoRep evolution of drug-resistant enzymes. Evoracle retains strong performance (R² = 0.86) on data with complete linkage loss between neighboring nucleotides and large measurement noise, such as pooled Sanger sequencing data (~US$10 per timepoint), and broadens the accessibility of training machine learning models on gene variant fitnesses. Evoracle can also identify high-fitness variants, including low-frequency 'rising stars', well before they are identifiable from consensus mutations.
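The linkage-loss problem named in the abstract can be illustrated with a toy example. The sketch below is not Evoracle and does not reflect its implementation. It only shows, under assumed data, why per-position allele frequencies alone cannot identify full-length genotypes. The two-site haplotypes, their frequencies, and the function name `marginal_frequencies` are all hypothetical.

```python
from collections import Counter

def marginal_frequencies(population):
    """Return per-position allele frequencies for a haplotype population.

    `population` maps full-length haplotype strings to frequencies.
    The result is what sequencing with complete linkage loss would
    observe: one allele distribution per position, with no record of
    which alleles occurred together on the same molecule.
    """
    length = len(next(iter(population)))
    marginals = []
    for pos in range(length):
        counts = Counter()
        for haplotype, freq in population.items():
            counts[haplotype[pos]] += freq
        marginals.append(dict(counts))
    return marginals

# Hypothetical two-site gene: 'A' is wild type, 'G' is a mutation.
pop_linked = {"GG": 0.5, "AA": 0.5}    # mutations always co-occur
pop_unlinked = {"GA": 0.5, "AG": 0.5}  # mutations never co-occur

# Different full-length genotypes, identical per-position observations.
print(marginal_frequencies(pop_linked))
print(marginal_frequencies(pop_unlinked))
```

Both populations yield the same per-position frequencies, so a single timepoint of such data is ambiguous. Resolving the ambiguity requires additional information, such as how the frequencies change across timepoints.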
Subject(s)

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Adenosine Deaminase / Escherichia coli Proteins / High-Throughput Nucleotide Sequencing | Language: English | Journal: Nat Chem Biol | Journal subject: BIOLOGY / CHEMISTRY | Year: 2021 | Document type: Article | Country of affiliation: United States

...