Results 1 - 20 of 682
1.
Proc Natl Acad Sci U S A ; 121(25): e2320066121, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38861605

ABSTRACT

How are the merits of innovative ideas communicated in science? Here, we conduct semantic analyses of grant application success with a focus on scientific promotional language, which may help to convey an innovative idea's originality and significance. Our analysis attempts to surmount the limitations of prior grant studies by examining the full text of tens of thousands of both funded and unfunded grants from three leading public and private funding agencies: the NIH, the NSF, and the Novo Nordisk Foundation, one of the world's largest private science funding foundations. We find a robust association between promotional language and the support and adoption of innovative ideas by funders and other scientists. First, a grant proposal's percentage of promotional language is associated with up to a doubling of the grant's probability of being funded. Second, a grant's promotional language reflects its intrinsic innovativeness. Third, the percentage of promotional language is predictive of the expected citation and productivity impact of publications that are supported by funded grants. Finally, a computer-assisted experiment that manipulates the promotional language in our data demonstrates how promotional language can communicate the merit of ideas through cognitive activation. With the incidence of promotional language in science steeply rising, and the pivotal role of grants in converting promising and aspirational ideas into solutions, our analysis provides empirical evidence that promotional language is associated with effectively communicating the merits of innovative scientific ideas.


Subjects
Language , Humans , Science , Financing, Organized , United States , Research Support as Topic , Creativity
2.
Am J Hum Genet ; 110(10): 1661-1672, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37741276

ABSTRACT

In the effort to treat Mendelian disorders, correcting the underlying molecular imbalance may be more effective than symptomatic treatment. Identifying treatments that might accomplish this goal requires extensive and up-to-date knowledge of molecular pathways, including drug-gene and gene-gene relationships. To address this challenge, we present "parsing modifiers via article annotations" (PARMESAN), a computational tool that searches PubMed and PubMed Central for information to assemble these relationships into a central knowledge base. PARMESAN then predicts putatively novel drug-gene relationships, assigning an evidence-based score to each prediction. We compare PARMESAN's drug-gene predictions to all of the drug-gene relationships displayed by the Drug-Gene Interaction Database (DGIdb) and show that higher-scoring relationship predictions are more likely to match the directionality (up- versus down-regulation) indicated by this database. PARMESAN had more than 200,000 drug predictions scoring above 8 (as one example cutoff), for more than 3,700 genes. Among these predicted relationships, 210 were registered in DGIdb and 201 (96%) had matching directionality. This publicly available tool provides an automated way to prioritize drug screens to target the most-promising drugs to test, thereby saving time and resources in the development of therapeutics for genetic disorders.


Subjects
PubMed , Humans , Databases, Factual
3.
Proc Natl Acad Sci U S A ; 120(39): e2304513120, 2023 09 26.
Article in English | MEDLINE | ID: mdl-37725643

ABSTRACT

Nitrate supply is fundamental to support shoot growth and crop performance, but the associated increase in stem height exacerbates the risks of lodging and yield losses. Despite their significance for agriculture, the mechanisms involved in the promotion of stem growth by nitrate remain poorly understood. Here, we show that the elongation of the hypocotyl of Arabidopsis thaliana, used as a model, responds rapidly and persistently to upshifts in nitrate concentration, rather than to the nitrate level itself. The response occurred even in shoots dissected from their roots and required NITRATE TRANSPORTER 1.1 (NRT1.1) in the phosphorylated state (but not NRT1.1 nitrate transport capacity) and NIN-LIKE PROTEIN 7 (NLP7). Nitrate increased PHYTOCHROME INTERACTING FACTOR 4 (PIF4) nuclear abundance by posttranscriptional mechanisms that depended on NRT1.1 and phytochrome B. In response to nitrate, PIF4 enhanced the expression of numerous SMALL AUXIN-UP RNA (SAUR) genes in the hypocotyl. The growth response to nitrate required PIF4, positive and negative regulators of its activity, including AUXIN RESPONSE FACTORs, and SAURs. PIF4 integrates cues from the soil (nitrate) and aerial (shade) environments adjusting plant stature to facilitate access to light.


Subjects
Arabidopsis Proteins , Arabidopsis , Phytochrome , Nitrates/pharmacology , Phytochrome B , Arabidopsis/genetics , Indoleacetic Acids , Nitrate Transporters , RNA , Arabidopsis Proteins/genetics , Basic Helix-Loop-Helix Transcription Factors/genetics
4.
Proc Natl Acad Sci U S A ; 120(23): e2216162120, 2023 06 06.
Article in English | MEDLINE | ID: mdl-37253013

ABSTRACT

Across the United States, police chiefs, city officials, and community leaders alike have highlighted the need to de-escalate police encounters with the public. This concern about escalation extends from encounters involving use of force to routine car stops, where Black drivers are disproportionately pulled over. Yet, despite the calls for action, we know little about the trajectory of police stops or how escalation unfolds. In study 1, we use methods from computational linguistics to analyze police body-worn camera footage from 577 stops of Black drivers. We find that stops with escalated outcomes (those ending in arrest, handcuffing, or a search) diverge from stops without these outcomes in their earliest moments, even in the first 45 words spoken by the officer. In stops that result in escalation, officers are more likely to issue commands as their opening words to the driver and less likely to tell drivers the reason why they are being stopped. In study 2, we expose Black males to audio clips of the same stops and find differences in how escalated stops are perceived: Participants report more negative emotion, appraise officers more negatively, worry about force being used, and predict worse outcomes after hearing only the officer's initial words in escalated versus non-escalated stops. Our findings show that car stops that end in escalated outcomes sometimes begin in an escalated fashion, with adverse effects for Black male drivers and, in turn, police-community relations.


Subjects
Black or African American , Law Enforcement , Police , Humans , Male , Law Enforcement/methods , United States , Racism , Emotions
5.
Plant J ; 119(1): 266-282, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38605581

ABSTRACT

Brassica crops are susceptible to diseases which can be mitigated by breeding for resistance. MAMPs (microbe-associated molecular patterns) are conserved molecules of pathogens that elicit host defences known as pattern-triggered immunity (PTI). Necrosis and Ethylene-inducing peptide 1-like proteins (NLPs) are MAMPs found in a wide range of phytopathogens. We studied the response to BcNEP2, a representative NLP from Botrytis cinerea, and showed that it contributes to disease resistance in Brassica napus. To map regions conferring NLP response, we used the production of reactive oxygen species (ROS) induced during PTI across a population of diverse B. napus accessions for associative transcriptomics (AT), and bulk segregant analysis (BSA) on DNA pools created from a cross of NLP-responsive and non-responsive lines. In silico mapping with AT identified two peaks for NLP responsiveness on chromosomes A04 and C05 whereas the BSA identified one peak on A04. BSA delimited the region for NLP-responsiveness to 3 Mbp, containing ~245 genes on the Darmor-bzh reference genome and four co-segregating KASP markers were identified. The same pipeline with the ZS11 genome confirmed the highest-associated region on chromosome A04. Comparative BLAST analysis revealed unannotated clusters of receptor-like protein (RLP) homologues on ZS11 chromosome A04. However, no specific RLP homologue conferring NLP response could be identified. Our results also suggest that BR-SIGNALLING KINASE1 may be involved with modulating the NLP response. Overall, we demonstrate that responsiveness to NLP contributes to disease resistance in B. napus and define the associated genomic location. These results can have practical application in crop improvement.


Subjects
Brassica napus , Disease Resistance , Plant Diseases , Plant Proteins , Brassica napus/genetics , Brassica napus/microbiology , Brassica napus/metabolism , Plant Diseases/microbiology , Plant Diseases/genetics , Plant Diseases/immunology , Disease Resistance/genetics , Plant Proteins/genetics , Plant Proteins/metabolism , Botrytis/physiology , Reactive Oxygen Species/metabolism , Peptides/metabolism , Peptides/genetics , Gene Expression Regulation, Plant , Chromosome Mapping , Ethylenes/metabolism
6.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37478371

ABSTRACT

Artificial intelligence (AI) systems utilizing deep neural networks and machine learning (ML) algorithms are widely used for solving critical problems in bioinformatics, biomedical informatics and precision medicine. However, complex ML models that are often perceived as opaque and black-box methods make it difficult to understand the reasoning behind their decisions. This lack of transparency can be a challenge for both end-users and decision-makers, as well as AI developers. In sensitive areas such as healthcare, explainability and accountability are not only desirable properties but also legally required for AI systems that can have a significant impact on human lives. Fairness is another growing concern, as algorithmic decisions should not show bias or discrimination towards certain groups or individuals based on sensitive attributes. Explainable AI (XAI) aims to overcome the opaqueness of black-box models and to provide transparency in how AI systems make decisions. Interpretable ML models can explain how they make predictions and identify factors that influence their outcomes. However, the majority of the state-of-the-art interpretable ML methods are domain-agnostic and have evolved from fields such as computer vision, automated reasoning or statistics, making direct application to bioinformatics problems challenging without customization and domain adaptation. In this paper, we discuss the importance of explainability and algorithmic transparency in the context of bioinformatics. We provide an overview of model-specific and model-agnostic interpretable ML methods and tools and outline their potential limitations. We discuss how existing interpretable ML methods can be customized and fit to bioinformatics research problems. Further, through case studies in bioimaging, cancer genomics and text mining, we demonstrate how XAI methods can improve transparency and decision fairness. 
Our review aims at providing valuable insights and serving as a starting point for researchers wanting to enhance explainability and decision transparency while solving bioinformatics problems. GitHub: https://github.com/rezacsedu/XAI-for-bioinformatics.


Subjects
Artificial Intelligence , Computational Biology , Humans , Machine Learning , Algorithms , Genomics
7.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37165972

ABSTRACT

MicroRNAs are small regulatory RNAs that decrease gene expression after transcription in various biological disciplines. In bioinformatics, identifying microRNAs and predicting their functionalities is critical. Finding motifs is one of the most well-known and important methods for identifying the functionalities of microRNAs. Several motif discovery techniques have been proposed, some of which rely on artificial intelligence-based techniques. However, their accuracy is low when little or no training data is available. In this research, we propose a new computational approach, called DiMo, for identifying motifs in microRNAs and, more generally, in macromolecules of small length. We employ word embedding techniques and deep learning models to improve the accuracy of motif discovery results. We also rely on transfer learning to pre-train a model and use it when training data is lacking or insufficient. We compare our approach with five state-of-the-art works using three real-world datasets. DiMo outperforms the selected related works in terms of precision, recall, accuracy and F1-score.


Subjects
Deep Learning , MicroRNAs , MicroRNAs/genetics , Artificial Intelligence , Algorithms
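The four evaluation metrics named in the abstract above can be computed from binary confusion-matrix counts. The following is a minimal sketch, assuming a motif/non-motif classification setting; the function name and the counts are invented for illustration and are not DiMo results.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return precision, recall, accuracy and F1 from binary confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical counts, chosen only to exercise the function.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Reporting all four together matters when classes are imbalanced, since accuracy alone can look high while recall on the rare class stays low.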
8.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37864295

ABSTRACT

The widespread adoption of high-throughput omics technologies has exponentially increased the amount of protein sequence data involved in many salient disease pathways and their respective therapeutics and diagnostics. Despite the availability of large-scale sequence data, the lack of experimental fitness annotations underpins the need for self-supervised and unsupervised machine learning (ML) methods. These techniques leverage the meaningful features encoded in abundant unlabeled sequences to accomplish complex protein engineering tasks. Proficiency in the rapidly evolving fields of protein engineering and generative AI is required to realize the full potential of ML models as a tool for protein fitness landscape navigation. Here, we support this work by (i) providing an overview of the architecture and mathematical details of the most successful ML models applicable to sequence data (e.g. variational autoencoders, autoregressive models, generative adversarial neural networks, and diffusion models), (ii) guiding how to effectively implement these models on protein sequence data to predict fitness or generate high-fitness sequences and (iii) highlighting several successful studies that implement these techniques in protein engineering (from paratope regions and subcellular localization prediction to high-fitness sequences and protein design rules generation). By providing a comprehensive survey of model details, novel architecture developments, comparisons of model applications, and current challenges, this study intends to provide structured guidance and a robust framework for delivering a prospective outlook in the ML-driven protein engineering field.


Subjects
Neural Networks, Computer , Unsupervised Machine Learning , Amino Acid Sequence , Exercise , Proteins/genetics
9.
BMC Bioinformatics ; 25(1): 84, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38413851

ABSTRACT

BACKGROUND: Thousands of genes have been associated with different Mendelian conditions. One of the valuable sources to track these gene-disease associations (GDAs) is the Online Mendelian Inheritance in Man (OMIM) database. However, most of the information in OMIM is textual and heterogeneous (e.g. summarized by different experts), which complicates automated reading and understanding of the data. Here, we used Natural Language Processing (NLP) to make a tool (Gene-Phenotype Association Discovery (GPAD)) that could syntactically process OMIM text and extract the data of interest. RESULTS: GPAD applies a series of language-based techniques to the text obtained from the OMIM API to extract GDA discovery-related information. GPAD can inform when a particular gene was associated with a specific phenotype, as well as the type of validation (whether through model organisms or cohort-based patient-matching approaches) for such an association. GPAD-extracted data were validated against published reports and compared with a large language model. Utilizing GPAD's extracted data, we analysed trends in GDA discoveries, noting a significant increase in their rate after the introduction of exome sequencing, rising from an average of about 150-250 discoveries each year. Contrary to hopes of resolving most GDAs for Mendelian disorders by now, our data indicate a substantial decline in discovery rates over the past five years (2017-2022). This decline appears to be linked to the increasing necessity for larger cohorts to substantiate GDAs. The rising use of zebrafish and Drosophila as model organisms in providing evidential support for GDAs is also observed. CONCLUSIONS: GPAD's real-time analysis capacity offers an up-to-date view of GDA discovery and could help in planning and managing research strategies. In future, this solution can be extended or modified to capture other information in OMIM and the scientific literature.


Subjects
Natural Language Processing , Zebrafish , Humans , Animals , Phenotype , Databases, Genetic , Forecasting
10.
Plant J ; 115(2): 452-469, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37026387

ABSTRACT

The plasma membrane represents a critical battleground between plants and attacking microbes. Necrosis-and-ethylene-inducing peptide 1 (Nep1)-like proteins (NLPs), cytolytic toxins produced by some bacterial, fungal and oomycete species, target lipid membranes by binding eudicot plant-specific sphingolipids (glycosylinositol phosphorylceramide) and form transient small pores, causing membrane leakage and subsequent cell death. NLP-producing phytopathogens are a big threat to agriculture worldwide. However, whether there are R proteins/enzymes that counteract the toxicity of NLPs in plants remains largely unknown. Here we show that cotton produces a peroxisome-localized lysophospholipase enzyme, GhLPL2. Upon Verticillium dahliae attack, GhLPL2 accumulates on the membrane and binds to the V. dahliae secreted NLP, VdNLP1, to block its contribution to virulence. A higher level of lysophospholipase in cells is required to neutralize VdNLP1 toxicity and induce immunity-related gene expression while maintaining normal growth of cotton plants, revealing the role of GhLPL2 protein in balancing resistance to V. dahliae and growth. Intriguingly, GhLPL2-silenced cotton plants also display high resistance to V. dahliae, but show a severe dwarfing phenotype and developmental defects, suggesting GhLPL2 is an essential gene in cotton. GhLPL2 silencing results in lysophosphatidylinositol over-accumulation and decreased glycometabolism, leading to a lack of carbon sources required for plants and pathogens to survive. Furthermore, lysophospholipases from several other crops also interact with VdNLP1, implying that blocking NLP virulence by lysophospholipase may be a common strategy in plants. Our work demonstrates that overexpressing lysophospholipase-encoding genes has great potential for breeding crops with high resistance against NLP-producing microbial pathogens.


Subjects
Lysophospholipase , Verticillium , Lysophospholipase/genetics , Gossypium/genetics , Peroxisomes , Plant Breeding , Plant Diseases/microbiology , Disease Resistance/genetics , Gene Expression Regulation, Plant
11.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35443027

ABSTRACT

Predicting the binding between peptides and the major histocompatibility complex (MHC) plays a vital role in immunotherapy for cancer. The success of AlphaFold in applying natural language processing (NLP) algorithms to protein secondary structure prediction has inspired us to explore the possibility of NLP methods in predicting peptide-MHC class I binding. Based on the above motivations, we propose MHCRoBERTa, a method based on the RoBERTa pre-training approach, for predicting the binding affinity between type I MHC and peptides. Analysis of the results on a benchmark dataset demonstrates that MHCRoBERTa can outperform other state-of-the-art prediction methods with an increase of the Spearman rank correlation coefficient (SRCC) value. Notably, our model gave a significant improvement on IC50 value. Our method achieved SRCC and AUC values of 0.785 and 0.817, respectively. Our SRCC value is 14.3% higher than NetMHCpan3.0 (the second highest SRCC value on pan-specific) and is 3% higher than MHCflurry (the second highest SRCC value on all methods). The AUC value is also better than that of any other pan-specific method. Moreover, we visualize the multi-head self-attention for the token representation across the layers and heads by this method. Through the analysis of the representation of each layer and head, we can show whether the model has learned the syntax and semantics necessary to perform the prediction task well. All these results demonstrate that our model can accurately predict the peptide-MHC class I binding affinity and that MHCRoBERTa is a powerful tool for screening potential neoantigens for cancer immunotherapy. MHCRoBERTa is available as open-source software on GitHub (https://github.com/FuxuWang/MHCRoBERTa).


Subjects
Histocompatibility Antigens Class I , Peptides , Algorithms , Amino Acid Sequence , Histocompatibility Antigens Class I/metabolism , Machine Learning , Peptides/metabolism , Protein Binding
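The Spearman rank correlation coefficient (SRCC) used as the headline metric in the abstract above is the Pearson correlation of the rank-transformed values. The sketch below is a plain standard-library illustration, not code from the MHCRoBERTa repository; the affinity values are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical measured vs. predicted binding affinities (arbitrary units).
measured = [0.1, 0.4, 0.35, 0.8, 0.9]
predicted = [0.2, 0.3, 0.5, 0.7, 0.95]
rho = spearman(measured, predicted)
```

Because only ranks enter the calculation, the coefficient rewards getting the ordering of binders right even when predicted magnitudes are off. The function divides by zero if either input is constant, so a real evaluation would need to guard that case.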
12.
New Phytol ; 241(5): 2108-2123, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38155438

ABSTRACT

Plants evolved sophisticated machineries to monitor levels of external nitrogen supply, respond to nitrogen demand from different tissues and integrate this information for coordinating its assimilation. Although roles of inorganic nitrogen in orchestrating developments have been studied in model plants and crops, systematic understanding of the origin and evolution of its assimilation and signaling machineries remains largely unknown. We expanded taxon samplings of algae and early-diverging land plants, covering all main lineages of Archaeplastida, and reconstructed the evolutionary history of core components involved in inorganic nitrogen assimilation and signaling. Most components associated with inorganic nitrogen assimilation were derived from the ancestral Archaeplastida. Improvements of assimilation machineries by gene duplications and horizontal gene transfers were evident during plant terrestrialization. Clusterization of genes encoding nitrate assimilation proteins might be an adaptive strategy for algae to cope with changeable nitrate availability in different habitats. Green plants evolved complex nitrate signaling machinery that was stepwise improved by domains shuffling and regulation co-option. Our study highlights innovations in inorganic nitrogen assimilation and signaling machineries, ranging from molecular modifications of proteins to genomic rearrangements, which shaped developmental and metabolic adaptations of plants to changeable nutrient availability in environments.


Subjects
Nitrates , Nitrogen , Nitrates/metabolism , Nitrogen/metabolism , Signal Transduction , Crops, Agricultural/metabolism
13.
EMBO Rep ; 23(8): e53267, 2022 08 03.
Article in English | MEDLINE | ID: mdl-35748387

ABSTRACT

Synaptic connections are essential to build a functional brain. How synapses are formed during development is a fundamental question in neuroscience. Recent studies provided evidence that the gut plays an important role in neuronal development through processing signals derived from gut microbes or nutrients. Defects in gut-brain communication can lead to various neurological disorders. Although the roles of the gut in communicating signals from its internal environment to the brain are well known, it remains unclear whether the gut plays a genetically encoded role in neuronal development. Using C. elegans as a model, we uncover that a Wnt-endocrine signaling pathway in the gut regulates synaptic development in the brain. A canonical Wnt signaling pathway promotes synapse formation through regulating the expression of the neuropeptides encoding gene nlp-40 in the gut, which functions through the neuronally expressed GPCR/AEX-2 receptor during development. Wnt-NLP-40-AEX-2 signaling likely acts to modulate neuronal activity. Our study reveals a genetic role of the gut in synaptic development and identifies a novel contribution of the gut-brain axis.


Subjects
Caenorhabditis elegans Proteins , Neuropeptides , Animals , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Neuropeptides/genetics , Neuropeptides/metabolism , Synapses/physiology , Wnt Signaling Pathway
14.
J Biomed Inform ; 150: 104605, 2024 02.
Article in English | MEDLINE | ID: mdl-38331082

ABSTRACT

OBJECTIVE: Physicians and clinicians rely on data contained in electronic health records (EHRs), as recorded by health information technology (HIT), to make informed decisions about their patients. The reliability of HIT systems in this regard is critical to patient safety. Consequently, better tools are needed to monitor the performance of HIT systems for potential hazards that could compromise the collected EHRs, which in turn could affect patient safety. In this paper, we propose a new framework for detecting anomalies in EHRs using sequences of clinical events. This new framework, EHR-Bidirectional Encoder Representations from Transformers (BERT), is motivated by the gaps in the existing deep-learning-related methods, including high false negatives, sub-optimal accuracy, higher computational cost, and the risk of information loss. EHR-BERT is an innovative framework rooted in the BERT architecture, meticulously tailored to navigate the hurdles in the contemporary BERT method, thus enhancing anomaly detection in EHRs for healthcare applications. METHODS: The EHR-BERT framework was designed using the Sequential Masked Token Prediction (SMTP) method. This approach treats EHRs as natural language sentences and iteratively masks input tokens during both training and prediction stages. This method facilitates the learning of EHR sequence patterns in both directions for each event and identifies anomalies based on deviations from the normal execution models trained on EHR sequences. RESULTS: Extensive experiments on large EHR datasets across various medical domains demonstrate that EHR-BERT markedly improves upon existing models. It significantly reduces the number of false positives and enhances the detection rate, thus bolstering the reliability of anomaly detection in electronic health records. This improvement is attributed to the model's ability to minimize information loss and maximize data utilization effectively.
CONCLUSION: EHR-BERT showcases immense potential in decreasing medical errors related to anomalous clinical events, positioning itself as an indispensable asset for enhancing patient safety and the overall standard of healthcare services. The framework effectively overcomes the drawbacks of earlier models, making it a promising solution for healthcare professionals to ensure the reliability and quality of health data.


Subjects
Electronic Health Records , Health Information Systems , Humans , Reproducibility of Results , Records , Health Personnel
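The masking idea described in the METHODS of the abstract above (hide each event in turn, predict it from context on both sides, flag deviations) can be illustrated with a toy example. This is a sketch only: a count-based lookup over immediate left and right neighbours stands in for the BERT model, the top-k decision rule is an assumption not stated in the abstract, and the event names are hypothetical.

```python
from collections import Counter, defaultdict

PAD = "<pad>"  # hypothetical boundary marker for sequence ends


def contexts(seq):
    """Yield (left, token, right) for every position, padding both ends."""
    padded = [PAD] + list(seq) + [PAD]
    for i in range(1, len(padded) - 1):
        yield padded[i - 1], padded[i], padded[i + 1]


def train(normal_sequences):
    """Count which token fills each (left, right) context in normal data."""
    model = defaultdict(Counter)
    for seq in normal_sequences:
        for left, tok, right in contexts(seq):
            model[(left, right)][tok] += 1
    return model


def anomalous_positions(model, seq, top_k=2):
    """Mask each position in turn; flag it when the observed token is not
    among the top_k tokens seen in that context during training."""
    flagged = []
    for i, (left, tok, right) in enumerate(contexts(seq)):
        seen = model.get((left, right), Counter())
        candidates = [t for t, _ in seen.most_common(top_k)]
        if tok not in candidates:
            flagged.append(i)
    return flagged


# Hypothetical "normal" clinical event sequences.
normal = [
    ["admit", "triage", "lab_order", "lab_result", "discharge"],
    ["admit", "triage", "imaging", "lab_result", "discharge"],
]
model = train(normal)
clean = anomalous_positions(model, normal[0])
suspect = anomalous_positions(
    model, ["admit", "triage", "discharge", "lab_result", "discharge"])
```

In the suspect sequence the misplaced `discharge` at index 2 is flagged, and so are its two neighbours, because their surrounding context now contains the out-of-place event. A learned bidirectional model would use much wider context than the single neighbour on each side used here.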
15.
J Biomed Inform ; 152: 104621, 2024 04.
Article in English | MEDLINE | ID: mdl-38447600

ABSTRACT

OBJECTIVE: The primary objective of this review is to investigate the effectiveness of machine learning and deep learning methodologies in the context of extracting adverse drug events (ADEs) from clinical benchmark datasets. We conduct an in-depth analysis, aiming to compare the merits and drawbacks of both machine learning and deep learning techniques, particularly within the framework of named-entity recognition (NER) and relation classification (RC) tasks related to ADE extraction. Additionally, our focus extends to the examination of specific features and their impact on the overall performance of these methodologies. In a broader perspective, our research extends to ADE extraction from various sources, including biomedical literature, social media data, and drug labels, removing the limitation to exclusively machine learning or deep learning methods. METHODS: We conducted an extensive literature review on PubMed using the query "(((machine learning [Medical Subject Headings (MeSH) Terms]) OR (deep learning [MeSH Terms])) AND (adverse drug event [MeSH Terms])) AND (extraction)", and supplemented this with a snowballing approach to review 275 references sourced from retrieved articles. RESULTS: In our analysis, we included twelve articles for review. For the NER task, deep learning models outperformed machine learning models. In the RC task, gradient boosting, multilayer perceptron and random forest models excelled. The Bidirectional Encoder Representations from Transformers (BERT) model consistently achieved the best performance in the end-to-end task. Future efforts in the end-to-end task should prioritize improving NER accuracy, especially for 'ADE' and 'Reason'. CONCLUSION: These findings hold significant implications for advancing the field of ADE extraction and pharmacovigilance, ultimately contributing to improved drug safety monitoring and healthcare outcomes.


Subjects
Deep Learning , Drug-Related Side Effects and Adverse Reactions , Humans , Artificial Intelligence , Pharmacovigilance , Benchmarking , Natural Language Processing
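A query like the one quoted in the METHODS of the abstract above can be issued programmatically through the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and makes no network call. It assumes the bracketed tag is written in PubMed's usual `[MeSH Terms]` form rather than the expanded form printed in the abstract, and the `retmax` value is arbitrary.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint for querying Entrez databases.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that would return PubMed IDs matching `term`."""
    params = {
        "db": "pubmed",     # search the PubMed database
        "term": term,       # the boolean query string
        "retmax": retmax,   # maximum number of IDs to return
        "retmode": "json",  # ask for a JSON response
    }
    return f"{ESEARCH}?{urlencode(params)}"


# The review's query, with the field tag in assumed PubMed syntax.
QUERY = (
    "(((machine learning[MeSH Terms]) OR (deep learning[MeSH Terms])) "
    "AND (adverse drug event[MeSH Terms])) AND (extraction)"
)
url = build_esearch_url(QUERY)
```

Fetching `url` with any HTTP client would return matching PubMed identifiers, which could then seed the snowballing step the authors describe.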
16.
J Med Internet Res ; 26: e51069, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38289662

ABSTRACT

BACKGROUND: Sentiment analysis is a significant yet difficult task in natural language processing. The linguistic peculiarities of Cantonese, including its high similarity with Standard Chinese, its grammatical and lexical uniqueness, and its colloquialism and multilingualism, make it different from other languages and pose additional challenges to sentiment analysis. Recent advances in models such as ChatGPT offer potential viable solutions. OBJECTIVE: This study investigated the efficacy of GPT-3.5 and GPT-4 in Cantonese sentiment analysis in the context of web-based counseling and compared their performance with other mainstream methods, including lexicon-based methods and machine learning approaches. METHODS: We analyzed transcripts from a web-based, text-based counseling service in Hong Kong, including a total of 131 individual counseling sessions and 6169 messages between counselors and help-seekers. First, a codebook was developed for human annotation. A simple prompt ("Is the sentiment of this Cantonese text positive, neutral, or negative? Respond with the sentiment label only.") was then given to GPT-3.5 and GPT-4 to label each message's sentiment. GPT-3.5 and GPT-4's performance was compared with a lexicon-based method and 3 state-of-the-art models, including linear regression, support vector machines, and long short-term memory neural networks. RESULTS: Our findings revealed ChatGPT's remarkable accuracy in sentiment classification, with GPT-3.5 and GPT-4, respectively, achieving 92.1% (5682/6169) and 95.3% (5880/6169) accuracy in identifying positive, neutral, and negative sentiment, thereby outperforming the traditional lexicon-based method, which had an accuracy of 37.2% (2295/6169), and the 3 machine learning models, which had accuracies ranging from 66% (4072/6169) to 70.9% (4374/6169). CONCLUSIONS: Among many text analysis techniques, ChatGPT demonstrates superior accuracy and emerges as a promising tool for Cantonese sentiment analysis. 
This study also highlights ChatGPT's applicability in real-world scenarios, such as monitoring the quality of text-based counseling services and detecting message-level sentiments in vivo. The insights derived from this study pave the way for further exploration into the capabilities of ChatGPT in the context of underresourced languages and specialized domains like psychotherapy and natural language processing.


Subjects
Artificial Intelligence, Asian People, Communication, Language, Humans, Counselors, Hong Kong, Linear Models
17.
J Med Internet Res ; 26: e50518, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38393293

ABSTRACT

BACKGROUND: In recent years, Korean society has increasingly recognized the importance of nurses in the context of population aging and infectious disease control. However, nurses still face difficulties with regard to policy activities that are aimed at improving the nursing workforce structure and working environment. Media coverage plays an important role in public awareness of a particular issue and can be an important strategy in policy activities. OBJECTIVE: This study analyzed data from 18 years of news coverage on nursing-related issues. The focus of this study was to examine the drivers of the social, local, economic, and political agendas that were emphasized in the media by analyzing the main sources and their quotes. This analysis revealed which nursing media agendas were emphasized (eg, social aspects), neglected (eg, policy aspects), and negotiated. METHODS: Descriptive analysis, natural language processing, and semantic network analysis were applied to analyze data collected from 2005 to 2022. BigKinds was used for the collection of data, automatic multi-categorization of news, named entity recognition of news sources, and extraction and topic modeling of quotes. The main news sources were identified by conducting a 1-mode network analysis with SNAnalyzer. The main agendas of nursing-related news coverage were examined through the qualitative analysis of major sources' quotes by section. The common and individual interests of the top-ranked sources were analyzed through a 2-mode network analysis using UCINET. RESULTS: In total, 128,339 articles from 54 media outlets on nursing-related issues were analyzed. Descriptive analysis showed that nursing-related news was mainly covered in social (99,868/128,339, 77.82%) and local (48,056/128,339, 48.56%) sections, whereas it was rarely covered in economic (9439/128,339, 7.35%) and political (7301/128,339, 5.69%) sections. Furthermore, 445 sources that had made the top 20 list at least once by year and section were analyzed. Other than "nurse," the main sources for each section were "labor union," "local resident," "government," and "Moon Jae-in." "Nursing Bill" emerged as a common interest among nurses and doctors, although the topic did not garner considerable attention from the Ministry of Health and Welfare. Analyzing quotes showed that nurses were portrayed as heroes, laborers, survivors of abuse, and perpetrators. The economic section focused on employment of youth and women in nursing. In the political section, conflicts between nurses and doctors, which may have caused policy confusion, were highlighted. Policy formulation processes were not adequately reported. Media coverage of the enactment of nursing laws tended to relate to confrontations between political parties. CONCLUSIONS: The media plays a crucial role in highlighting various aspects of nursing practice. However, policy formulation processes to solve nursing issues were not adequately reported in South Korea. This study suggests that nurses should secure policy compliance by persuading the public to understand their professional perspectives.
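
The 2-mode network analysis mentioned in the methods links news sources to the topics they are quoted on; projecting that network onto the sources shows which of them share an interest. The study used UCINET for this step. The following is only a minimal pure-Python sketch of the idea: the function name `project_onto_sources` is hypothetical, and the source-topic pairs are invented for illustration (only "Nursing Bill" as a shared interest of nurses and doctors comes from the abstract).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (source, topic) edges of a 2-mode network.
edges = [
    ("nurse", "Nursing Bill"),
    ("doctor", "Nursing Bill"),
    ("nurse", "working environment"),
    ("labor union", "working environment"),
    ("government", "infectious disease control"),
]


def project_onto_sources(edges):
    """Return {frozenset({a, b}): shared topics} for every pair of sources
    that is linked to at least one common topic."""
    sources_by_topic = defaultdict(set)
    for source, topic in edges:
        sources_by_topic[topic].add(source)

    shared = defaultdict(set)
    for topic, sources in sources_by_topic.items():
        for a, b in combinations(sorted(sources), 2):
            shared[frozenset((a, b))].add(topic)
    return dict(shared)


common_interests = project_onto_sources(edges)
```

Sources that appear in no pair of `common_interests` (here, "government") have only individual interests in this toy network.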


Subjects
Communication, Natural Language Processing, Female, Humans, Adolescent, Policy, Government, Republic of Korea, Mass Media
18.
J Med Internet Res ; 26: e47408, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38354044

ABSTRACT

BACKGROUND: Attitudes toward abortion have historically been characterized via dichotomized labels, yet research suggests that these labels do not appropriately encapsulate beliefs on abortion. Rather, contexts, circumstances, and lived experiences often shape views on abortion into more nuanced and complex perspectives. Qualitative data have also been shown to underpin belief systems regarding abortion. Social media, as a form of qualitative data, could reveal how attitudes toward abortion are communicated publicly in web-based spaces. Furthermore, in some cases, social media can also be leveraged to seek health information. OBJECTIVE: This study applies natural language processing and social media mining to analyze Reddit (Reddit, Inc) forums specific to abortion, including r/Abortion (the largest subreddit about abortion) and r/AbortionDebate (a subreddit designed to discuss and debate worldviews on abortion). Our analytical pipeline intends to identify potential themes within the data and the affect from each post. METHODS: We applied a neural network-based topic modeling pipeline (BERTopic) to uncover themes in the r/Abortion (n=2151) and r/AbortionDebate (n=2815) subreddits. After deriving the optimal number of topics per subreddit using an iterative coherence score calculation, we performed a sentiment analysis using the Valence Aware Dictionary and Sentiment Reasoner to assess positive, neutral, and negative affect and an emotion analysis using the Text2Emotion lexicon to identify potential emotionality per post. Differences in affect and emotion by subreddit were compared. RESULTS: The iterative coherence score calculation revealed 10 topics for both r/Abortion (coherence=0.42) and r/AbortionDebate (coherence=0.35). Topics in the r/Abortion subreddit primarily centered on information sharing or offering a source of social support; in contrast, topics in the r/AbortionDebate subreddit centered on contextualizing shifting or evolving views on abortion across various ethical, moral, and legal domains. The average compound Valence Aware Dictionary and Sentiment Reasoner scores for the r/Abortion and r/AbortionDebate subreddits were 0.01 (SD 0.44) and -0.06 (SD 0.41), respectively. Emotionality scores were consistent across the r/Abortion and r/AbortionDebate subreddits; however, r/Abortion had a marginally higher average fear score of 0.36 (SD 0.39). CONCLUSIONS: Our findings suggest that people posting on abortion forums on Reddit are willing to share their beliefs, which manifested in diverse ways, such as sharing abortion stories (including how their worldview changed, which calls into question the value of dichotomized abortion identity labels) and seeking information. Notably, the style of discourse varied significantly by subreddit. r/Abortion was principally leveraged as an information and outreach source; r/AbortionDebate largely centered on debating across various legal, ethical, and moral abortion domains. Collectively, our findings suggest that abortion remains an opaque yet politically charged issue for people and that social media can be leveraged to understand views and circumstances surrounding abortion.
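
The compound score reported in this abstract is a single value between -1 and 1 per post, summarized per subreddit as a mean and SD. The following is a minimal sketch of that summary step, assuming the compound scores have already been computed. The scores are invented, the function names are hypothetical, and the ±0.05 cutoffs are the thresholds conventionally used with this lexicon rather than something the abstract states.

```python
from collections import Counter
from statistics import mean, stdev


def affect_label(compound: float, threshold: float = 0.05) -> str:
    """Bucket a compound score into positive, neutral, or negative affect."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def summarize(scores):
    """Mean and sample SD of compound scores, plus counts per affect label."""
    return {
        "mean": round(mean(scores), 2),
        "sd": round(stdev(scores), 2),
        "counts": dict(Counter(affect_label(s) for s in scores)),
    }


# Invented compound scores for five posts; not data from the study.
toy_scores = [0.62, -0.48, 0.01, -0.30, 0.20]
summary = summarize(toy_scores)
```

A mean close to zero with a large SD, as in both subreddits, is compatible with a mix of clearly positive and clearly negative posts, which is why the label counts are kept alongside the mean.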


Subjects
Induced Abortion, Phobic Disorders, Social Media, Female, Pregnancy, Humans, Data Mining, Information Seeking Behavior, Natural Language Processing
19.
J Med Internet Res ; 26: e54974, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38819896

ABSTRACT

ChatGPT (OpenAI) is an advanced natural language processing tool with growing applications across various disciplines in medical research. Thematic analysis, a qualitative research method to identify and interpret patterns in data, is one application that stands to benefit from this technology. This viewpoint explores the use of ChatGPT in three core phases of thematic analysis within a medical context: (1) direct coding of transcripts, (2) generating themes from a predefined list of codes, and (3) preprocessing quotes for manuscript inclusion. Additionally, we explore the potential of ChatGPT to generate interview transcripts, which may be used for training purposes. We assess the strengths and limitations of using ChatGPT in these roles, highlighting areas where human intervention remains necessary. Overall, we argue that ChatGPT can function as a valuable tool during analysis, enhancing the efficiency of the thematic analysis and offering additional insights into the qualitative data. While ChatGPT may not adequately capture the full context of each participant, it can serve as an additional member of the analysis team, contributing to researcher triangulation through knowledge building and sensemaking.


Subjects
Natural Language Processing, Humans, Qualitative Research
20.
J Med Internet Res ; 26: e56110, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38976865

ABSTRACT

BACKGROUND: OpenAI's ChatGPT is a pioneering artificial intelligence (AI) in the field of natural language processing, and it holds significant potential in medicine for providing treatment advice. Additionally, recent studies have demonstrated promising results using ChatGPT for emergency medicine triage. However, its diagnostic accuracy in the emergency department (ED) has not yet been evaluated. OBJECTIVE: This study compares the diagnostic accuracy of ChatGPT with GPT-3.5 and GPT-4 and primary treating resident physicians in an ED setting. METHODS: Among 100 adults admitted to our ED in January 2023 with internal medicine issues, the diagnostic accuracy was assessed by comparing the diagnoses made by ED resident physicians and those made by ChatGPT with GPT-3.5 or GPT-4 against the final hospital discharge diagnosis, using a point system for grading accuracy. RESULTS: The study enrolled 100 patients with a median age of 72 (IQR 58.5-82.0) years who were admitted to our internal medicine ED primarily for cardiovascular, endocrine, gastrointestinal, or infectious diseases. GPT-4 outperformed both GPT-3.5 (P<.001) and ED resident physicians (P=.01) in diagnostic accuracy for internal medicine emergencies. Furthermore, across various disease subgroups, GPT-4 consistently outperformed GPT-3.5 and resident physicians. It demonstrated significant superiority in cardiovascular (GPT-4 vs ED physicians: P=.03) and endocrine or gastrointestinal diseases (GPT-4 vs GPT-3.5: P=.01). However, in other categories, the differences were not statistically significant. CONCLUSIONS: In this study, which compared the diagnostic accuracy of GPT-3.5, GPT-4, and ED resident physicians against a discharge diagnosis gold standard, GPT-4 outperformed both the resident physicians and its predecessor, GPT-3.5. Despite the retrospective design of the study and its limited sample size, the results underscore the potential of AI as a supportive diagnostic tool in ED settings.


Subjects
Hospital Emergency Service, Humans, Hospital Emergency Service/statistics & numerical data, Retrospective Studies, Aged, Female, Middle Aged, Male, Aged 80 and over, Artificial Intelligence, Physicians/statistics & numerical data, Natural Language Processing, Triage/methods