On knowing a gene: A distributional hypothesis of gene function.
Cell Syst; 15(6): 488-496, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38810640
ABSTRACT
Just as words can have multiple meanings that depend on sentence context, genes can have various functions that depend on the surrounding biological system. Ontologies, which annotate gene functions without considering biological context, capture this pleiotropic nature of gene function only in a limited way. We contend that the gene function problem in genetics may be informed by recent technological leaps in natural language processing, in which representations of word semantics can be automatically learned from diverse language contexts. In contrast to efforts in the 1990s to model semantics as "is-a" relationships, modern distributional semantics represents words as vectors in a learned semantic space and fuels current advances in transformer-based models such as large language models and generative pre-trained transformers. A comparable shift toward thinking of gene functions as distributions over cellular contexts may enable a similar breakthrough in data-driven learning of gene function from large biological datasets.
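To make the idea of "gene functions as distributions over cellular contexts" concrete, the following is a minimal sketch and not a method from the article. It assumes a toy table of invented activity counts per context; the gene names, context labels, counts, and the helper names `to_distribution`, `cosine_similarity`, and `most_similar` are all hypothetical. Each gene is turned into a distribution over contexts, and genes are compared by the cosine similarity of those distributions, in the same spirit as comparing word vectors.

```python
import math

# Hypothetical toy data: how often each gene is observed as active in each
# cellular context. Gene names, contexts, and counts are invented for
# illustration only and do not come from the article.
CONTEXTS = ["neuron", "hepatocyte", "T cell"]
COUNTS = {
    "geneA": [8, 1, 1],
    "geneB": [7, 2, 1],
    "geneC": [0, 1, 9],
}


def to_distribution(counts):
    """Normalize raw counts so they sum to 1 (a distribution over contexts)."""
    total = sum(counts)
    return [c / total for c in counts]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# One context distribution per gene: the gene's "meaning" in this toy setting.
PROFILES = {gene: to_distribution(c) for gene, c in COUNTS.items()}


def most_similar(gene):
    """Return the other gene whose context distribution is closest to `gene`."""
    others = [g for g in PROFILES if g != gene]
    return max(others, key=lambda g: cosine_similarity(PROFILES[gene], PROFILES[g]))


if __name__ == "__main__":
    for gene in PROFILES:
        print(gene, "is closest to", most_similar(gene))
```

In this toy setting, genes that are active in the same contexts end up close together, without any hand-written "is-a" annotation. Real analyses would learn such representations from large datasets rather than from a hand-built count table.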
Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Semantics / Natural Language Processing
Limits: Animals / Humans
Language: English
Journal: Cell Syst
Publication year: 2024
Document type: Article
Country of affiliation: United States