Modeling 0.6 million genes for the rational design of functional cis-regulatory variants and de novo design of cis-regulatory sequences.
Li, Tianyi; Xu, Hui; Teng, Shouzhen; Suo, Mingrui; Bahitwa, Revocatus; Xu, Mingchi; Qian, Yiheng; Ramstein, Guillaume P; Song, Baoxing; Buckler, Edward S; Wang, Hai.
Affiliation
  • Li T; State Key Laboratory of Maize Bio-breeding, National Maize Improvement Center, Frontiers Science Center for Molecular Design Breeding, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, People's Republic of China.
  • Xu H; State Key Laboratory of Maize Bio-breeding, National Maize Improvement Center, Frontiers Science Center for Molecular Design Breeding, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, People's Republic of China.
  • Teng S; State Key Laboratory of Maize Bio-breeding, National Maize Improvement Center, Frontiers Science Center for Molecular Design Breeding, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, People's Republic of China.
  • Suo M; State Key Laboratory of Maize Bio-breeding, National Maize Improvement Center, Frontiers Science Center for Molecular Design Breeding, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, People's Republic of China.
  • Bahitwa R; State Key Laboratory of Maize Bio-breeding, National Maize Improvement Center, Frontiers Science Center for Molecular Design Breeding, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, People's Republic of China.
  • Xu M; Legumes Research Program, Research and Innovation Division, Tanzania Agricultural Research Institute, Ilonga, Kilosa, Morogoro 67410, Tanzania.
  • Qian Y; State Key Laboratory of Maize Bio-breeding, National Maize Improvement Center, Frontiers Science Center for Molecular Design Breeding, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, People's Republic of China.
  • Ramstein GP; State Key Laboratory of Maize Bio-breeding, National Maize Improvement Center, Frontiers Science Center for Molecular Design Breeding, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, People's Republic of China.
  • Song B; Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus 8000, Denmark.
  • Buckler ES; National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agriculture Sciences in Weifang, Weifang, Shandong 261325, People's Republic of China.
  • Wang H; Key Laboratory of Maize Biology and Genetic Breeding in Arid Area of Northwest Region of the Ministry of Agriculture, College of Agronomy, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China.
Proc Natl Acad Sci U S A ; 121(26): e2319811121, 2024 Jun 25.
Article in En | MEDLINE | ID: mdl-38889146
ABSTRACT
Rational design of plant cis-regulatory DNA sequences without expert intervention or prior domain knowledge is still a daunting task. Here, we developed PhytoExpr, a deep learning framework capable of predicting both mRNA abundance and plant species using the proximal regulatory sequence as the sole input. PhytoExpr was trained over 17 species representative of major clades of the plant kingdom to enhance its generalizability. Via input perturbation, quantitative functional annotation of the input sequence was achieved at single-nucleotide resolution, revealing an abundance of predicted high-impact nucleotides in conserved noncoding sequences and transcription factor binding sites. Evaluation of maize HapMap3 single-nucleotide polymorphisms (SNPs) by PhytoExpr demonstrates an enrichment of predicted high-impact SNPs in cis-eQTL. Additionally, we provided two algorithms that harnessed the power of PhytoExpr in designing functional cis-regulatory variants, and de novo creation of species-specific cis-regulatory sequences through in silico evolution of random DNA sequences. Our model represents a general and robust approach for functional variant discovery in population genetics and rational design of regulatory sequences for genome editing and synthetic biology.
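The two ideas named in the abstract, input perturbation at single-nucleotide resolution and in silico evolution of a random sequence, can be illustrated with a minimal sketch. This is not PhytoExpr and not the authors' algorithms: `toy_score` is a hypothetical stand-in for a trained model (it simply rewards `TATAAA` motifs and GC content), and the greedy search in `evolve` is an assumed, simplified strategy chosen only to make the example runnable.

```python
import random

BASES = "ACGT"


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained expression model (an assumption).

    Rewards occurrences of a TATA-box-like motif and higher GC content.
    """
    gc_fraction = sum(base in "GC" for base in seq) / len(seq)
    return 2.0 * seq.count("TATAAA") + gc_fraction


def perturbation_impact(seq: str, score=toy_score) -> list[float]:
    """Input perturbation: for each position, substitute the three alternative
    bases and record the largest absolute change in the predicted score."""
    reference = score(seq)
    impacts = []
    for i, base in enumerate(seq):
        deltas = [
            abs(score(seq[:i] + alt + seq[i + 1:]) - reference)
            for alt in BASES
            if alt != base
        ]
        impacts.append(max(deltas))
    return impacts


def evolve(seq: str, score=toy_score, rounds: int = 20) -> str:
    """Greedy in silico evolution: in each round, apply the single point
    mutation that most increases the score; stop when none improves it."""
    for _ in range(rounds):
        best, best_score = seq, score(seq)
        for i, base in enumerate(seq):
            for alt in BASES:
                if alt == base:
                    continue
                candidate = seq[:i] + alt + seq[i + 1:]
                candidate_score = score(candidate)
                if candidate_score > best_score:
                    best, best_score = candidate, candidate_score
        if best == seq:  # no single mutation improves the score
            break
        seq = best
    return seq


# Per-nucleotide impact on a short sequence containing one motif.
impacts = perturbation_impact("GGTATAAAGG")

# Evolve a random 30-nt sequence (seeded for reproducibility).
rng = random.Random(0)
start = "".join(rng.choice(BASES) for _ in range(30))
evolved = evolve(start)
```

In this toy setting, the positions inside the motif receive far larger impact values than the flanking bases, which mirrors the kind of enrichment of high-impact nucleotides in binding sites that the abstract reports. With a real model, `score` would be replaced by the model's prediction function.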

Full text: 1 | Database: MEDLINE | Main subject: Regulatory Sequences, Nucleic Acid / Zea mays / Polymorphism, Single Nucleotide | Language: En | Journal: Proc Natl Acad Sci U S A | Year: 2024 | Type: Article
