Results 1 - 11 of 11
1.
Sensors (Basel) ; 23(23)2023 Nov 23.
Article in English | MEDLINE | ID: mdl-38067736

ABSTRACT

The rapid growth of electronic health records (EHRs) has produced an unprecedented volume of biomedical data. Clinician access to the latest patient information can improve the quality of healthcare. However, clinicians have difficulty finding information quickly and easily because of the sheer volume of data to be mined. Biomedical information retrieval (BIR) systems can help clinicians find the information they require by automatically searching EHRs and returning relevant results. However, traditional BIR systems cannot capture the complex relationships between EHR entities. Transformers are a type of neural network that is very effective for natural language processing (NLP) tasks such as machine translation and text summarization. In this paper, we propose a new BIR system for EHRs that uses transformers to predict cancer treatment from EHRs. Our system can capture the complex relationships between the different entities in an EHR, which allows it to return more relevant results to clinicians. We evaluated our system on a dataset of EHRs and found that it outperformed state-of-the-art BIR systems on various tasks, including medical question answering and information extraction. Our results show that transformers are a promising approach for BIR in EHRs, reaching an accuracy of 86.46% and an F1-score of 0.8157. We believe that our system can help clinicians find the information they need more quickly and easily, leading to improved patient care.


Subjects
Electronic Health Records; Neoplasms; Humans; Data Mining/methods; Natural Language Processing; Neural Networks, Computer; Information Systems; Neoplasms/therapy
2.
J Biomed Inform ; 117: 103732, 2021 05.
Article in English | MEDLINE | ID: mdl-33737208

ABSTRACT

BACKGROUND: Understanding the relationships between genes, drugs, and disease states is at the core of pharmacogenomics. Two leading approaches for identifying these relationships in the medical literature are manual curation led by human experts and automated approaches based on modern data mining. The former generates small amounts of high-quality data, and the latter offers large volumes of mixed-quality data. The algorithmically extracted relationships are often accompanied by supporting evidence, such as confidence scores, source articles, and surrounding contexts (excerpts) from the articles, that can be used as data quality indicators. Tools that can leverage these quality indicators to help the user gain access to larger volumes of high-quality data are needed. APPROACH: We introduce GeneDive, a web application for pharmacogenomics researchers and precision medicine practitioners that makes gene, disease, and drug interaction data easily accessible and usable. GeneDive is designed to meet three key objectives: (1) provide functionality to manage the information-overload problem and facilitate easy assimilation of supporting evidence, (2) support longitudinal and exploratory research investigations, and (3) offer integration of user-provided interaction data without requiring data sharing. RESULTS: GeneDive offers multiple search modalities, visualizations, and other features that guide the user efficiently to the information of interest. To facilitate exploratory research, GeneDive makes the supporting evidence and context for each interaction readily available and allows users to control the data quality threshold according to their risk tolerance. The interactive search-visualization loop enables relationship discoveries between diseases, genes, and drugs that might not be explicitly described in the literature but emerge from the source medical corpus and deductive reasoning. The ability to use the user's own data, either in combination with the GeneDive native datasets or in isolation, promotes richer data-driven exploration and discovery. These functionalities, along with GeneDive's applicability to precision medicine (bringing the knowledge contained in the biomedical literature to bear on particular clinical situations and improving patient care), are illustrated through detailed use cases. CONCLUSION: GeneDive is a comprehensive, broad-use biological interactions browser. The GeneDive application and information about its underlying system architecture are available at http://www.genedive.net. A GeneDive Docker image is also available for download at this URL, allowing users to (1) import their own interaction data securely and privately; and (2) generate and test hypotheses across their own and other datasets.


Subjects
Pharmaceutical Preparations; Precision Medicine; Data Mining; Humans; Pharmacogenetics; Software
3.
BMC Bioinformatics ; 20(1): 429, 2019 Aug 16.
Article in English | MEDLINE | ID: mdl-31419935

ABSTRACT

BACKGROUND: Diagnosis and treatment decisions in cancer increasingly depend on a detailed analysis of the mutational status of a patient's genome. This analysis relies on previously published information regarding the association of variations to disease progression and possible interventions. Clinicians to a large degree use biomedical search engines to obtain such information; however, the vast majority of scientific publications focus on basic science and have no direct clinical impact. We develop the Variant-Information Search Tool (VIST), a search engine designed for the targeted search of clinically relevant publications given an oncological mutation profile. RESULTS: VIST indexes all PubMed abstracts and content from ClinicalTrials.gov. It applies advanced text mining to identify mentions of genes, variants and drugs and uses machine learning based scoring to judge the clinical relevance of indexed abstracts. Its functionality is available through a fast and intuitive web interface. We perform several evaluations, showing that VIST's ranking is superior to that of PubMed or a pure vector space model with regard to the clinical relevance of a document's content. CONCLUSION: Different user groups search repositories of scientific publications with different intentions. This diversity is not adequately reflected in the standard search engines, often leading to poor performance in specialized settings. We develop a search engine for the specific case of finding documents that are clinically relevant in the course of cancer treatment. We believe that the architecture of our engine, heavily relying on machine learning algorithms, can also act as a blueprint for search engines in other, equally specific domains. VIST is freely available at https://vist.informatik.hu-berlin.de/.


Subjects
Neoplasms/pathology; Precision Medicine; Search Engine; Algorithms; Databases as Topic; Documentation; Humans; Internet; User-Computer Interface
4.
BMC Bioinformatics ; 20(Suppl 16): 590, 2019 Dec 02.
Article in English | MEDLINE | ID: mdl-31787087

ABSTRACT

BACKGROUND: The number of biomedical research articles has increased exponentially with the advancement of biomedicine in recent years. This growth makes it difficult for researchers to obtain the information they need. Information retrieval technologies seek to tackle the problem. However, information needs cannot be completely satisfied by directly applying existing information retrieval techniques. Therefore, biomedical information retrieval not only focuses on the relevance of search results, but also aims to promote the completeness of the results, which is referred to as diversity-oriented retrieval. RESULTS: We address the diversity-oriented biomedical retrieval task using a supervised term ranking model. The model is learned through a supervised query expansion process for term refinement. Based on the model, the most relevant and diversified terms are selected to enrich the original query. The expanded query is then fed into a second retrieval to improve the relevance and diversity of search results. To this end, we propose three diversity-oriented optimization strategies in our model: a diversified term labeling strategy, biomedical resource-based term features, and a diversity-oriented group sampling learning method. Experimental results on TREC Genomics collections demonstrate the effectiveness of the proposed model in improving the relevance and the diversity of search results. CONCLUSIONS: The three proposed strategies jointly contribute to the improvement of biomedical retrieval performance. Our model yields more relevant and diversified results than the state-of-the-art baseline models. Moreover, our method provides a general framework for improving biomedical retrieval performance, and can be used as the basis for future work.


Subjects
Algorithms; Biomedical Research; Information Storage and Retrieval; Models, Theoretical; Genomics
5.
J Biomed Inform ; 95: 103224, 2019 07.
Article in English | MEDLINE | ID: mdl-31200123

ABSTRACT

BACKGROUND: Information curation and literature surveillance efforts that synthesize the current knowledge about the impact of genetic variability on disease states and drug responses are vitally important for the practice of evidence-based precision medicine. For these efforts, finding a relevant and comprehensive set of articles in the ever-growing scientific literature is a challenge. METHODS: We have designed and developed Article Retrieval for Precision Medicine (ARtPM), an end-to-end article retrieval system that employs a multi-stage architecture to retrieve and rank relevant articles for a given medical case summary (genetic variants, disease, demographics, and other medical conditions). We compared ARtPM with five baselines, including PubMed Best Match, the improved search functionality recently introduced by PubMed. RESULTS: The differences in performance between ARtPM and the five baselines were statistically significant for four metrics that quantify different aspects of search effectiveness (P-values for P@10, R-prec, infNDCG, and Recall@1000 were <.001, <.001, .003, and .009, respectively). Pairwise system comparisons show that ARtPM is comparable to or better than the best-performing baseline on three metrics (R-prec: 0.324 vs 0.299, P-value=.06; infNDCG: 0.556 vs 0.465, P-value=.08; R@1000: 0.665 vs 0.572, P-value=.007), but performance in P@10 (0.603 vs 0.630, P-value=.64) needs to improve. CONCLUSION: The recall-focused phase of ARtPM is effective at retrieving more relevant articles. The precision-focused ranking phase performs well at deeper ranks but needs further work on early ranks (e.g., a richer feature set). Overall, the ARtPM system effectively facilitates evidence-based precision medicine practice, and provides a robust search framework for further work in this direction.
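
Three of the metrics named in this abstract (P@10, R-prec, and Recall@1000) have simple standard definitions, sketched below in Python. This is an illustrative sketch only, not the evaluation code used for ARtPM; the function names and the toy document identifiers are assumptions, and infNDCG is omitted because it depends on a sampled-judgment estimation procedure.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant (P@k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Precision at rank R, where R is the number of relevant documents."""
    if not relevant:
        return 0.0
    return precision_at_k(ranked, relevant, len(relevant))


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


# Toy data (hypothetical identifiers, not taken from the study).
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}

p_at_2 = precision_at_k(ranked, relevant, 2)
r_prec = r_precision(ranked, relevant)
rec_at_5 = recall_at_k(ranked, relevant, 5)
```

P@k rewards early precision, while Recall@k at a deep cutoff rewards coverage, which is why a system can lead on one and trail on the other, as the abstract reports.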


Subjects
Information Storage and Retrieval/methods; Precision Medicine; Biomedical Research; Data Curation; Databases, Factual; Humans; Periodicals as Topic
6.
BMC Bioinformatics ; 17 Suppl 7: 238, 2016 Jul 25.
Article in English | MEDLINE | ID: mdl-27455377

ABSTRACT

BACKGROUND: Biomedical literature retrieval is becoming increasingly complex, and there is a fundamental need for advanced information retrieval systems. Information Retrieval (IR) programs search unstructured materials, such as text documents, in large data collections that are usually stored on computers. IR is concerned with the representation, storage, and organization of information items, as well as access to them. One of the main problems in IR is determining which documents are relevant to the user's needs and which are not. At present, users cannot construct queries precisely enough to retrieve particular pieces of data from large collections, and basic information retrieval systems produce low-quality search results. In this paper we present a new technique that refines IR searches to better represent the user's information need and thereby enhance retrieval performance. It applies different query expansion techniques and combines them linearly, two expansion results at a time. Query expansion extends the search query, for example, by finding synonyms and reweighting the original terms, and provides significantly more focused search results than basic search queries do. RESULTS: Retrieval performance is measured by several variants of MAP (Mean Average Precision). According to our experimental results, combining the best query expansion results improves the retrieved documents and outperforms our baseline by 21.06%; it also outperforms a previous study by 7.12%. CONCLUSIONS: We propose several query expansion techniques and their linear combinations to make user queries more cognizable to search engines and to produce higher-quality search results.
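
The pairwise linear combination and the MAP measure described in this abstract can be sketched roughly as follows. The abstract does not give the mixing weights or the score normalization, so the `alpha` parameter, the function names, and the toy scores below are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def combine_linear(scores_a: Dict[str, float],
                   scores_b: Dict[str, float],
                   alpha: float = 0.5) -> List[str]:
    """Rank documents by alpha * score_a + (1 - alpha) * score_b.

    A document missing from one run contributes 0.0 for that run
    (an assumption; the paper may handle this differently).
    """
    docs = set(scores_a) | set(scores_b)
    fused = {d: alpha * scores_a.get(d, 0.0) + (1 - alpha) * scores_b.get(d, 0.0)
             for d in docs}
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant doc appears."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP over (ranking, relevant set) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy scores from two hypothetical query expansion runs for one query.
expansion_a = {"d1": 0.9, "d2": 0.2, "d3": 0.4}
expansion_b = {"d2": 0.8, "d3": 0.2, "d4": 0.1}

fused_ranking = combine_linear(expansion_a, expansion_b, alpha=0.5)
ap = average_precision(fused_ranking, {"d1", "d3"})
```

In practice the two runs would be normalized to a common score range before mixing, and `alpha` would be tuned on held-out queries.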


Subjects
Algorithms; Information Systems/standards; Semantics
7.
J Biomed Inform ; 63: 379-389, 2016 10.
Article in English | MEDLINE | ID: mdl-27593166

ABSTRACT

In the era of digitalization, information retrieval (IR), which retrieves and ranks documents from large collections according to users' search queries, has been widely applied in the biomedical domain. Building patient cohorts using electronic health records (EHRs) and searching the literature for topics of interest are some IR use cases. Meanwhile, natural language processing (NLP) techniques, such as tokenization and Part-Of-Speech (POS) tagging, have been developed for processing clinical documents and biomedical literature. We hypothesize that NLP can be incorporated into IR to strengthen conventional IR models. In this study, we propose two NLP-empowered IR models, POS-BoW and POS-MRF, which incorporate automatic POS-based term weighting schemes into bag-of-words (BoW) and Markov Random Field (MRF) IR models, respectively. In the proposed models, the POS-based term weights are calculated iteratively using a cyclic coordinate method in which a golden section line search algorithm is applied along each coordinate to optimize the objective function defined by mean average precision (MAP). In the empirical experiments, we used the data sets from the Medical Records track in the Text REtrieval Conference (TREC) 2011 and 2012 and the Genomics track in TREC 2004. The evaluation on the TREC 2011 and 2012 Medical Records tracks shows that, for the POS-BoW models, the mean improvement rates for the IR evaluation metrics MAP, bpref, and P@10 are 10.88%, 4.54%, and 3.82%, compared to the BoW models; and for the POS-MRF models, these rates are 13.59%, 8.20%, and 8.78%, compared to the MRF models. Additionally, we experimentally verify that the proposed weighting approach is superior to simple heuristic and frequency-based weighting approaches, and validate our POS category selection. Using the optimal weights calculated in this experiment, we tested the proposed models on the TREC 2004 Genomics track and obtained average improvement rates of 8.63% and 10.04% for POS-BoW and POS-MRF, respectively. These significant improvements verify the effectiveness of leveraging POS tagging for biomedical IR tasks.
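
The optimization scheme named in this abstract, a cyclic coordinate method with a golden section line search along each coordinate, can be sketched as below. This is a generic sketch under stated assumptions, not the authors' implementation: the objective here is a smooth toy function standing in for MAP (hypothetical), each weight is assumed to lie in [0, 1], and a fixed number of sweeps replaces any convergence test the paper may use.

```python
import math
from typing import Callable, List, Sequence

INV_PHI = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618


def golden_section_max(f: Callable[[float], float],
                       lo: float, hi: float, tol: float = 1e-5) -> float:
    """Approximately maximize a unimodal function f on the interval [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            # The maximum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # The maximum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def cyclic_coordinate_max(objective: Callable[[List[float]], float],
                          weights: Sequence[float],
                          lo: float = 0.0, hi: float = 1.0,
                          sweeps: int = 20) -> List[float]:
    """Optimize one weight at a time, cycling through all coordinates."""
    w = list(weights)
    for _ in range(sweeps):
        for i in range(len(w)):
            def along_axis(x: float, i: int = i) -> float:
                # Evaluate the objective with only coordinate i changed.
                return objective(w[:i] + [x] + w[i + 1:])
            w[i] = golden_section_max(along_axis, lo, hi)
    return w


def toy_objective(w: List[float]) -> float:
    """Hypothetical stand-in for MAP: a concave function peaking at (0.3, 0.7)."""
    a, b = w[0] - 0.3, w[1] - 0.7
    return -(a * a + b * b + 0.5 * a * b)


best = cyclic_coordinate_max(toy_objective, [0.5, 0.5])
```

A real MAP objective is piecewise constant in the weights rather than smooth and unimodal, so in that setting the line search would only be a heuristic and each evaluation would require re-running retrieval.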


Subjects
Electronic Health Records; Information Storage and Retrieval; Natural Language Processing; Algorithms; Humans; Linguistics
8.
Stud Health Technol Inform ; 316: 827-831, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176920

ABSTRACT

Finding relevant information in the biomedical literature increasingly depends on efficient information retrieval (IR) algorithms. Cross-Encoders, Sentence-BERT, and ColBERT are algorithms based on pre-trained language models that use nuanced but computable vector representations of search queries and documents for IR applications. Here we investigate how well these vectorization algorithms estimate relevance labels of biomedical documents for search queries using the OHSUMED dataset. For our evaluation, we compared computed scores to the provided labels using boxplots and Spearman's rank correlations. According to these metrics, we found that Sentence-BERT moderately outperformed the alternative vectorization algorithms and that additional fine-tuning based on a subset of OHSUMED labels yielded little additional benefit. Future research might aim to develop a larger dedicated dataset in order to optimize such methods more systematically, and to evaluate the corresponding functions in IR tools with end-users.
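
Spearman's rank correlation, used in this abstract to compare computed scores with relevance labels, is the Pearson correlation of the two rank vectors, with tied values sharing their average rank. The pure-Python sketch below illustrates the calculation; the scores and graded labels are made-up toy values, not OHSUMED data, and the function names are assumptions.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: model similarity scores vs. hypothetical graded labels
# (0 = not relevant, 1 = possibly relevant, 2 = definitely relevant).
model_scores = [0.91, 0.15, 0.55, 0.40, 0.72]
relevance_labels = [2, 0, 1, 1, 2]

rho = spearman(model_scores, relevance_labels)
```

Because relevance labels take only a few distinct values, ties are unavoidable, which is why the average-rank handling matters here.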


Subjects
Algorithms; Information Storage and Retrieval; Natural Language Processing; Information Storage and Retrieval/methods; Humans
9.
Stud Health Technol Inform ; 316: 1151-1155, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176584

ABSTRACT

In clinical research, the analysis of patient cohorts is a widely employed method for investigating relevant healthcare questions. The ability to automatically extract large-scale patient cohorts from hospital systems is vital in order to unlock the potential of real-world clinical data, and answer pivotal medical questions through retrospective research studies. However, existing medical data is often dispersed across various systems and databases, preventing a systematic approach to access and interoperability. Even when the data are readily accessible, clinical researchers need to sift through Electronic Medical Records, confirm ethical approval, verify status of patient consent, check the availability of imaging data, and filter the data based on disease-specific image biomarkers. We present Cohort Builder, a software pipeline designed to facilitate the creation of patient cohorts with predefined baseline characteristics from real-world ophthalmic imaging data and electronic medical records. The applicability of our approach extends beyond ophthalmology to other medical domains with similar requirements such as neurology, cardiology and orthopedics.


Subjects
Electronic Health Records; Software; Humans; Diagnostic Imaging; Cohort Studies; Eye Diseases/diagnostic imaging
10.
Ann Biomed Eng ; 2023 Oct 19.
Article in English | MEDLINE | ID: mdl-37855948

ABSTRACT

Large language models (LLMs) such as ChatGPT have recently attracted significant attention due to their impressive performance on many real-world tasks. These models have also demonstrated potential in facilitating various biomedical tasks. However, little is known about their potential in biomedical information retrieval, especially in identifying drug-disease associations. This study aims to explore the potential of ChatGPT, a popular LLM, in discerning drug-disease associations. We collected 2694 true drug-disease associations and 5662 false drug-disease pairs. Our approach involved creating various prompts to instruct ChatGPT to identify these associations. Under varying prompt designs, ChatGPT identified drug-disease associations with an accuracy of 74.6-83.5% for the true pairs and 96.2-97.6% for the false pairs. This study shows that ChatGPT has potential in identifying drug-disease associations and may serve as a helpful tool in searching for pharmacy-related information. However, the accuracy of its insights warrants comprehensive examination before its implementation in medical practice.
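
Reporting accuracy separately for true and false pairs, as this abstract does, amounts to computing the fraction of correct answers within each gold class. A minimal sketch follows; the function name and the toy labels are assumptions, and no model is called.

```python
from typing import Sequence, Tuple


def per_class_accuracy(gold: Sequence[bool],
                       predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (accuracy on true pairs, accuracy on false pairs)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    true_total = sum(1 for g in gold if g)
    false_total = len(gold) - true_total
    if true_total == 0 or false_total == 0:
        raise ValueError("both classes must be present")
    # A true pair is correct when the prediction is also True.
    true_hits = sum(1 for g, p in zip(gold, predicted) if g and p)
    # A false pair is correct when the prediction is also False.
    false_hits = sum(1 for g, p in zip(gold, predicted) if not g and not p)
    return true_hits / true_total, false_hits / false_total


# Toy labels: True means the drug-disease pair is a real association.
gold_labels = [True, True, False, False, False]
model_answers = [True, False, False, False, True]

acc_true, acc_false = per_class_accuracy(gold_labels, model_answers)
```

Keeping the two figures separate is useful when the classes are imbalanced, since a single overall accuracy would be dominated by the larger class.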

11.
JMIR Med Inform ; 9(6): e28272, 2021 Jun 29.
Article in English | MEDLINE | ID: mdl-34185006

ABSTRACT

BACKGROUND: With the development of biomedicine, the number of biomedical documents has increased rapidly, bringing a great challenge for researchers trying to retrieve the information they need. Information retrieval aims to meet this challenge by searching large document collections for documents relevant to a given query. However, in specific retrieval tasks the relevance of search results sometimes needs to be evaluated from multiple aspects, which increases the difficulty of biomedical information retrieval. OBJECTIVE: This study aimed to find a more systematic method for retrieving relevant scientific literature for a given patient. METHODS: In the initial retrieval stage, we supplemented query terms through query expansion strategies and applied query boosting to obtain an initial ranking list of relevant documents. In the re-ranking phase, we employed a text classification model and a relevance matching model to evaluate documents from different dimensions and then combined the outputs through logistic regression to re-rank all the documents from the initial ranking list. RESULTS: The proposed ensemble method contributed to the improvement of biomedical retrieval performance. Compared with existing deep learning-based methods, experimental results showed that our method achieved state-of-the-art performance on the data collection provided by the Text Retrieval Conference 2019 Precision Medicine Track. CONCLUSIONS: In this paper, we proposed a novel ensemble method based on deep learning. As shown in the experiments, the strategies we used in the initial retrieval phase, such as query expansion and query boosting, are effective. The application of the text classification model and relevance matching model better captured semantic context information and improved retrieval performance.
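
The re-ranking step described in this abstract, where the outputs of a text classification model and a relevance matching model are combined through logistic regression, can be illustrated with the sketch below. The coefficients, function names, and candidate scores are hypothetical; in the described method the coefficients would be fitted on labeled data rather than set by hand.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def rerank(candidates: Sequence[Tuple[str, float, float]],
           w_cls: float, w_match: float, bias: float) -> List[Tuple[str, float]]:
    """Re-rank (doc_id, classifier_score, matching_score) triples.

    Each document's final score is the logistic regression output
    sigmoid(w_cls * classifier_score + w_match * matching_score + bias).
    """
    scored = [(doc, sigmoid(w_cls * c + w_match * m + bias))
              for doc, c, m in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy candidates from an initial retrieval stage (made-up ids and scores).
candidates = [
    ("pmid_a", 0.2, 0.9),
    ("pmid_b", 0.8, 0.7),
    ("pmid_c", 0.5, 0.1),
]

# Hypothetical coefficients, chosen by hand for illustration only.
ranking = rerank(candidates, w_cls=2.0, w_match=1.0, bias=-1.5)
ranked_ids = [doc for doc, _ in ranking]
```

Because the sigmoid is monotonic, the ordering depends only on the weighted sum; the logistic form matters mainly when fitting the coefficients and when the output is read as a relevance probability.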
