Machine learning enables identification of an alternative yeast galactose utilization pathway.
Harrison, Marie-Claire; Ubbelohde, Emily J; LaBella, Abigail L; Opulente, Dana A; Wolters, John F; Zhou, Xiaofan; Shen, Xing-Xing; Groenewald, Marizeth; Hittinger, Chris Todd; Rokas, Antonis.
Affiliations
  • Harrison MC; Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235.
  • Ubbelohde EJ; Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726.
  • LaBella AL; Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235.
  • Opulente DA; Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28262.
  • Wolters JF; Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726.
  • Zhou X; Department of Biology, Villanova University, Villanova, PA 19085.
  • Shen XX; Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726.
  • Groenewald M; Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China.
  • Hittinger CT; Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China.
  • Rokas A; Westerdijk Fungal Biodiversity Institute, Utrecht 3584, The Netherlands.
Proc Natl Acad Sci U S A ; 121(18): e2315314121, 2024 Apr 30.
Article in En | MEDLINE | ID: mdl-38669185
ABSTRACT
How genomic differences contribute to phenotypic differences is a major question in biology. The recently characterized genomes, isolation environments, and qualitative patterns of growth on 122 sources and conditions of 1,154 strains from 1,049 fungal species (nearly all known) in the yeast subphylum Saccharomycotina provide a powerful, yet complex, dataset for addressing this question. We used a random forest algorithm trained on these genomic, metabolic, and environmental data to predict growth on several carbon sources with high accuracy. Known structural genes involved in assimilation of these sources and presence/absence patterns of growth in other sources were important features contributing to prediction accuracy. By further examining growth on galactose, we found that it can be predicted with high accuracy from either genomic (92.2%) or growth data (82.6%) but not from isolation environment data (65.6%). Prediction accuracy was even higher (93.3%) when we combined genomic and growth data. After the GALactose utilization genes, the most important feature for predicting growth on galactose was growth on galactitol, raising the hypothesis that several species in two orders, Serinales and Pichiales (containing the emerging pathogen Candida auris and the genus Ogataea, respectively), have an alternative galactose utilization pathway because they lack the GAL genes. Growth and biochemical assays confirmed that several of these species utilize galactose through an alternative oxidoreductive D-galactose pathway, rather than the canonical GAL pathway. Machine learning approaches are powerful for investigating the evolution of the yeast genotype-phenotype map, and their application will uncover novel biology, even in well-studied traits.
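As a rough illustration of the approach the abstract describes (a random forest trained on presence/absence features, then inspected for feature importance), here is a minimal sketch. Everything in it is an assumption for illustration only: the data are synthetic, the feature names (`GAL_genes`, `growth_galactitol`, `other_trait_*`) are hypothetical, and scikit-learn is a stand-in, since the abstract does not say which implementation or settings the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_strains = 400

# Hypothetical binary (presence/absence) features; not the study's feature set.
feature_names = ["GAL_genes", "growth_galactitol"] + [
    f"other_trait_{i}" for i in range(8)
]
X = rng.integers(0, 2, size=(n_strains, len(feature_names)))

# Synthetic rule (an assumption): a strain grows on galactose if it has the
# GAL genes OR grows on galactitol, standing in for an alternative pathway.
y = (X[:, 0] | X[:, 1]).astype(int)

# Estimate prediction accuracy with 5-fold cross-validation.
model = RandomForestClassifier(n_estimators=200, random_state=0)
accuracy = cross_val_score(model, X, y, cv=5).mean()

# Refit on all data and rank features by impurity-based importance.
model.fit(X, y)
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
top_two = {name for name, _ in ranked[:2]}

print(f"cross-validated accuracy: {accuracy:.3f}")
for name, importance in ranked[:3]:
    print(f"{name}: {importance:.3f}")
```

In this toy setup the two features that define the label rank far above the noise features, which mirrors in miniature how an unexpectedly important feature (growth on galactitol, in the study) can point toward a hypothesis worth testing experimentally.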

Full text: 1 | Database: MEDLINE | Main subject: Machine Learning / Galactose | Language: En | Publication year: 2024 | Document type: Article
