Identifying essential genes across eukaryotes by machine learning.
Beder, Thomas; Aromolaran, Olufemi; Dönitz, Jürgen; Tapanelli, Sofia; Adedeji, Eunice O; Adebiyi, Ezekiel; Bucher, Gregor; Koenig, Rainer.
Affiliation
  • Beder T; Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Am Klinikum 1, 07747 Jena, Germany.
  • Aromolaran O; Department of Computer & Information Sciences, Covenant University, Ota, Ogun State, Nigeria.
  • Dönitz J; Department of Evolutionary Developmental Genetics, GZMB, University of Göttingen, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany.
  • Tapanelli S; Department of Life Sciences, Imperial College London, London SW7 2AZ, UK.
  • Adedeji EO; Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, Nigeria.
  • Adebiyi E; Department of Computer & Information Sciences, Covenant University, Ota, Ogun State, Nigeria.
  • Bucher G; Department of Evolutionary Developmental Genetics, GZMB, University of Göttingen, Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany.
  • Koenig R; Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Am Klinikum 1, 07747 Jena, Germany.
NAR Genom Bioinform; 3(4): lqab110, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34859210
ABSTRACT
Identifying essential genes on a genome scale is resource intensive and has been performed for only a few eukaryotes. For less-studied organisms, essentiality might be predicted by gene homology; however, this approach cannot be applied to non-conserved genes. Additionally, divergent essentiality information is obtained depending on whether single cells or whole multicellular organisms are studied, particularly when it is derived from human cell line screens and human population studies. We employed machine learning across six model eukaryotes and 60 381 genes, using 41 635 features derived from the sequence, gene function information and network topology. Within a leave-one-organism-out cross-validation, the classifiers showed high generalizability, with an average accuracy close to 80% in the left-out species. As a case study, we applied the method to Tribolium castaneum and Bombyx mori and validated the predictions experimentally, obtaining similar performance. Finally, the classifier trained on the studied model organisms made it possible to link the essentiality information from human cell line screens and population studies.
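
The leave-one-organism-out cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the single numeric feature, the midpoint-threshold classifier, the organism labels "A", "B", "C" and all function names are hypothetical stand-ins chosen only to show how each organism is held out in turn while a model is fitted on the remaining ones.

```python
from collections import defaultdict
from statistics import mean


def leave_one_organism_out(organisms):
    """Yield (held_out, train_idx, test_idx) once per distinct organism label."""
    by_org = defaultdict(list)
    for i, org in enumerate(organisms):
        by_org[org].append(i)
    for held_out, test_idx in by_org.items():
        train_idx = [i for i, org in enumerate(organisms) if org != held_out]
        yield held_out, train_idx, test_idx


def fit_threshold(xs, ys):
    """Toy stand-in classifier (assumption): midpoint between the class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def evaluate(features, labels, organisms):
    """Return the accuracy on each held-out organism."""
    accuracies = {}
    for held_out, train_idx, test_idx in leave_one_organism_out(organisms):
        threshold = fit_threshold(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        # Predict "essential" (1) when the feature exceeds the threshold.
        correct = sum(
            (features[i] > threshold) == bool(labels[i]) for i in test_idx
        )
        accuracies[held_out] = correct / len(test_idx)
    return accuracies


# Hypothetical toy data: one feature per gene, 1 = essential, 0 = non-essential.
organisms = ["A", "A", "B", "B", "C", "C"]
features = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
labels = [1, 0, 1, 0, 1, 0]

accuracies = evaluate(features, labels, organisms)
mean_accuracy = mean(accuracies.values())
```

The key design point is that the split is by organism rather than by gene, so the reported accuracy reflects performance on a species the model has never seen.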

Full text: 1 Databases: MEDLINE Study type: Prognostic_studies Language: En Journal: NAR Genom Bioinform Year: 2021 Document type: Article Affiliation country: Germany
