Search | BVS Bolivia

1.

Large language models predict human sensory judgments across six modalities.

Marjieh, Raja; Sucholutsky, Ilia; van Rijn, Pol; Jacoby, Nori; Griffiths, Thomas L.

Sci Rep ; 14(1): 21445, 2024 Sep 13.

Article in English | MEDLINE | ID: mdl-39271909

ABSTRACT

Determining the extent to which the perceptual world can be recovered from language is a longstanding problem in philosophy and cognitive science. We show that state-of-the-art large language models can unlock new insights into this problem by providing a lower bound on the amount of perceptual information that can be extracted from language. Specifically, we elicit pairwise similarity judgments from GPT models across six psychophysical datasets. We show that the judgments are significantly correlated with human data across all domains, recovering well-known representations like the color wheel and pitch spiral. Surprisingly, we find that a model (GPT-4) co-trained on vision and language does not necessarily lead to improvements specific to the visual modality, and provides highly correlated predictions with human data irrespective of whether direct visual input is provided or purely textual descriptors. To study the impact of specific languages, we also apply the models to a multilingual color-naming task. We find that GPT-4 replicates cross-linguistic variation in English and Russian, illuminating the interaction of language and perception.

Subject(s)

Judgment, Language, Humans, Judgment/physiology, Visual Perception/physiology
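
The comparison described in the abstract above, model similarity judgments set against human ones, can be illustrated with a minimal sketch. It assumes a Pearson correlation over all unordered item pairs; the abstract does not state which statistic was used. The item names and ratings below are invented for illustration and are not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def pairwise_ratings(similarity, items):
    """Flatten a pair -> rating mapping into a list, one entry per unordered pair."""
    return [similarity[pair] for pair in combinations(items, 2)]


# Hypothetical ratings on a 0-1 scale (assumed values, not from the paper).
items = ["red", "orange", "blue"]
human = {("red", "orange"): 0.9, ("red", "blue"): 0.2, ("orange", "blue"): 0.1}
model = {("red", "orange"): 0.8, ("red", "blue"): 0.3, ("orange", "blue"): 0.2}

r = pearson(pairwise_ratings(human, items), pairwise_ratings(model, items))
print(f"correlation between model and human judgments: {r:.3f}")
```

With real data, each modality would contribute its own set of item pairs, and the correlation would be computed per dataset.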

2.

GPT is an effective tool for multilingual psychological text analysis.

Rathje, Steve; Mirea, Dan-Mircea; Sucholutsky, Ilia; Marjieh, Raja; Robertson, Claire E; Van Bavel, Jay J.

Proc Natl Acad Sci U S A ; 121(34): e2308950121, 2024 Aug 20.

Article in English | MEDLINE | ID: mdl-39133853

ABSTRACT

The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large-language model (LLM) underlying the AI chatbot ChatGPT, can be used as a tool for automated psychological text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59 to 0.77) performed much better than English-language dictionary analysis (r = 0.20 to 0.30) at detecting psychological constructs as judged by manual annotators. GPT performed nearly as well as, and sometimes better than, several top-performing fine-tuned machine learning models. Moreover, GPT's performance improved across successive versions of the model, particularly for lesser-spoken languages, and became less expensive. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy across many languages, requires no training data, and is easy to use with simple prompts (e.g., "is this text negative?") and little coding experience. We provide sample code and a video tutorial for analyzing text with the GPT application programming interface. We argue that GPT and other LLMs help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may help facilitate more cross-linguistic research with understudied languages.

Subject(s)

Multilingualism, Humans, Language, Machine Learning, Natural Language Processing, Emotions, Social Media
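
For readers unfamiliar with the dictionary-analysis baseline mentioned in the abstract above, the following is a minimal sketch of the general idea: count words from fixed positive and negative lists. The word lists and the scoring rule are assumptions made for illustration; they are not the dictionaries or the procedure used in the study.

```python
import re

# Hypothetical, tiny English word lists (assumed for illustration only).
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "sad", "terrible", "hate"}


def dictionary_sentiment(text):
    """Return (positive hits - negative hits) / token count, or 0.0 if empty."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    return (positive_hits - negative_hits) / len(tokens)


english_score = dictionary_sentiment("I love this great day")
spanish_score = dictionary_sentiment("Me encanta este día")
print(english_score, spanish_score)
```

In this toy case the English sentence receives a positive score, while the Spanish sentence with the same sentiment scores zero because none of its words appear in the English lists. That is one plausible reason an English-only dictionary would transfer poorly across languages.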

3.

exKidneyBERT: a language model for kidney transplant pathology reports and the crucial role of extended vocabularies.

Yang, Tiancheng; Sucholutsky, Ilia; Jen, Kuang-Yu; Schonlau, Matthias.

PeerJ Comput Sci ; 10: e1888, 2024.

Article in English | MEDLINE | ID: mdl-38435545

ABSTRACT

Background: Pathology reports contain key information about the patient's diagnosis as well as important gross and microscopic findings. These information-rich clinical reports offer an invaluable resource for clinical studies, but data extraction and analysis from such unstructured texts is often manual and tedious. While neural information retrieval systems (typically implemented as deep learning methods for natural language processing) are automatic and flexible, they typically require a large domain-specific text corpus for training, making them infeasible for many medical subdomains. Thus, an automated data extraction method for pathology reports that does not require a large training corpus would be of significant value and utility. Objective: To develop a language model-based neural information retrieval system that can be trained on small datasets and validate it by training it on renal transplant-pathology reports to extract relevant information for two predefined questions: (1) "What kind of rejection does the patient show?"; (2) "What is the grade of interstitial fibrosis and tubular atrophy (IFTA)?" Methods: Kidney BERT was developed by pre-training Clinical BERT on 3.4K renal transplant pathology reports and 1.5M words. Then, exKidneyBERT was developed by extending Clinical BERT's tokenizer with six technical keywords and repeating the pre-training procedure. This extended the model's vocabulary. All three models were fine-tuned with information retrieval heads. Results: The model with extended vocabulary, exKidneyBERT, outperformed Clinical BERT and Kidney BERT in both questions. For rejection, exKidneyBERT achieved an 83.3% overlap ratio for antibody-mediated rejection (ABMR) and 79.2% for T-cell mediated rejection (TCMR). For IFTA, exKidneyBERT had a 95.8% exact match rate. Conclusion: ExKidneyBERT is a high-performing model for extracting information from renal pathology reports. Additional pre-training of BERT language models on specialized small domains does not necessarily improve performance. Extending the BERT tokenizer's vocabulary library is essential for specialized domains to improve performance, especially when pre-training on small corpora.
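
To show why extending a tokenizer's vocabulary can matter, here is a toy sketch of greedy longest-match subword splitting in the WordPiece style used by BERT tokenizers. The vocabularies and the keyword "tubulitis" are hypothetical; the abstract does not list the six keywords that were actually added, and this is not the authors' code.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split a word greedily into the longest matching vocabulary pieces.

    Pieces that continue a word carry the "##" prefix, as in BERT vocabularies.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical base vocabulary that lacks the technical term.
base_vocab = {"tub", "##ul", "##itis"}
# Extended vocabulary: the same pieces plus the whole keyword (assumed example).
extended_vocab = base_vocab | {"tubulitis"}

print(wordpiece("tubulitis", base_vocab))      # split into fragments
print(wordpiece("tubulitis", extended_vocab))  # kept as a single token
```

With the base vocabulary the term is broken into three fragments, so the model has to assemble its meaning from generic pieces. With the extended vocabulary it becomes one token that can receive its own embedding during pre-training.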

4.

Optimal 1-NN prototypes for pathological geometries.

Sucholutsky, Ilia; Schonlau, Matthias.

PeerJ Comput Sci ; 7: e464, 2021.

Article in English | MEDLINE | ID: mdl-33954242

ABSTRACT

Using prototype methods to reduce the size of training datasets can drastically reduce the computational cost of classification with instance-based learning algorithms like the k-Nearest Neighbour classifier. The number and distribution of prototypes required for the classifier to match its original performance is intimately related to the geometry of the training data. As a result, it is often difficult to find the optimal prototypes for a given dataset, and heuristic algorithms are used instead. However, we consider a particularly challenging setting where commonly used heuristic algorithms fail to find suitable prototypes and show that the optimal number of prototypes can instead be found analytically. We also propose an algorithm for finding nearly-optimal prototypes in this setting, and use it to empirically validate the theoretical results. Finally, we show that a parametric prototype generation method that normally cannot solve this pathological setting can actually find optimal prototypes when combined with the results of our theoretical analysis.
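
As background for the abstract above, the sketch below shows the basic prototype idea: a 1-nearest-neighbour classifier that stores a few prototypes instead of the full training set. Class centroids serve as prototypes here, which is a generic heuristic chosen for illustration. It is not the analytical construction or the algorithm proposed in the paper, and the data points are invented.

```python
from math import dist


def centroid_prototypes(points, labels):
    """Build one prototype per class: the mean of that class's points."""
    sums = {}
    for point, label in zip(points, labels):
        total, count = sums.get(label, ([0.0] * len(point), 0))
        sums[label] = ([t + p for t, p in zip(total, point)], count + 1)
    return [
        (tuple(t / count for t in total), label)
        for label, (total, count) in sums.items()
    ]


def nearest_label(x, prototypes):
    """Classify x with 1-NN over the prototypes (Euclidean distance)."""
    return min(prototypes, key=lambda proto: dist(x, proto[0]))[1]


# Hypothetical two-class training set.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

prototypes = centroid_prototypes(points, labels)
print(len(points), "training points reduced to", len(prototypes), "prototypes")
print(nearest_label((0.5, 0.5), prototypes), nearest_label((5.5, 5.5), prototypes))
```

Centroids work for well-separated clusters like these. For geometries where one class surrounds another, class means can coincide, which is the kind of setting where such simple heuristics break down.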

5.

Pay attention and you won't lose it: a deep learning approach to sequence imputation.

Sucholutsky, Ilia; Narayan, Apurva; Schonlau, Matthias; Fischmeister, Sebastian.

PeerJ Comput Sci ; 5: e210, 2019.

Article in English | MEDLINE | ID: mdl-33816863

ABSTRACT

In most areas of machine learning, it is assumed that data quality is fairly consistent between training and inference. Unfortunately, in real systems, data are plagued by noise, loss, and various other quality reducing factors. While a number of deep learning algorithms solve end-stage problems of prediction and classification, very few aim to solve the intermediate problems of data pre-processing, cleaning, and restoration. Long Short-Term Memory (LSTM) networks have previously been proposed as a solution for data restoration, but they suffer from a major bottleneck: a large number of sequential operations. We propose using attention mechanisms to entirely replace the recurrent components of these data-restoration networks. We demonstrate that such an approach leads to reduced model sizes by as many as two orders of magnitude, a 2-fold to 4-fold reduction in training times, and 95% accuracy for automotive data restoration. We also show in a case study that this approach improves the performance of downstream algorithms reliant on clean data.
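
The attention mechanism referred to in the abstract above can be sketched as scaled dot-product attention, the standard formulation. This is a generic illustration in plain Python, not the network architecture from the paper, and the toy inputs are assumed values. Each query is compared against all keys at once, so no step-by-step recurrence over the sequence is needed.

```python
from math import exp, sqrt


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(q . k / sqrt(d)) weighted sum of values."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# Toy example: two identical keys give equal weights, so the output
# for the query is the mean of the two values.
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[2.0], [4.0]]
queries = [[1.0, 0.0]]

out = attention(queries, keys, values)
print(out)
```

In an imputation setting, the query would stand for a position with missing data and the values for the observed positions, so the output is a weighted blend of what was observed.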
