On knowing a gene: A distributional hypothesis of gene function.
Kwon, Jason J; Pan, Joshua; Gonzalez, Guadalupe; Hahn, William C; Zitnik, Marinka.
Affiliation
  • Kwon JJ; Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
  • Pan J; Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
  • Gonzalez G; Department of Computing, Faculty of Engineering, Imperial College, London SW7 2AZ, UK.
  • Hahn WC; Dana-Farber Cancer Institute and Harvard Medical School, Department of Medical Oncology, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA. Electronic address: william_hahn@dfci.harvard.edu.
  • Zitnik M; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Harvard Medical School, Department of Biomedical Informatics, Boston, MA 02115, USA; Harvard Data Science Initiative, Harvard University, Cambridge, MA 02138, USA; Kempner Institute for the Study of Natural and Artificial Intelligence, Ha
Cell Syst; 15(6): 488-496, 2024 Jun 19.
Article in En | MEDLINE | ID: mdl-38810640
ABSTRACT
As words can have multiple meanings that depend on sentence context, genes can have various functions that depend on the surrounding biological system. This pleiotropic nature of gene function is limited by ontologies, which annotate gene functions without considering biological contexts. We contend that the gene function problem in genetics may be informed by recent technological leaps in natural language processing, in which representations of word semantics can be automatically learned from diverse language contexts. In contrast to efforts to model semantics as "is-a" relationships in the 1990s, modern distributional semantics represents words as vectors in a learned semantic space and fuels current advances in transformer-based models such as large language models and generative pre-trained transformers. A similar shift in thinking of gene functions as distributions over cellular contexts may enable a similar breakthrough in data-driven learning from large biological datasets to inform gene function.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Semantics / Natural Language Processing Limits: Animals / Humans Language: En Journal: Cell Syst Publication year: 2024 Document type: Article Affiliation country: United States
