Results 1 - 20 of 39
1.
Front Med (Lausanne) ; 11: 1274688, 2024.
Article in English | MEDLINE | ID: mdl-38515987

ABSTRACT

Patients, life science industry and regulatory authorities are united in their goal to reduce the disease burden of patients by closing remaining unmet needs. Patients have, however, not always been systematically and consistently involved in the drug development process. Recognizing this gap, regulatory bodies worldwide have initiated patient-focused drug development (PFDD) initiatives to foster a more systematic involvement of patients in the drug development process and to ensure that outcomes measured in clinical trials are truly relevant to patients and represent significant improvements to their quality of life. As a source of real-world evidence (RWE), social media has been consistently shown to capture the first-hand, spontaneous and unfiltered disease and treatment experience of patients and is acknowledged as a valid method for generating patient experience data by the Food and Drug Administration (FDA). While social media listening (SML) methods are increasingly applied to many diseases and use cases, a significant piece of uncertainty remains on how evidence derived from social media can be used in the drug development process and how it can impact regulatory decision making, including legal and ethical aspects. In this policy paper, we review the perspectives of three key stakeholder groups on the role of SML in drug development, namely patients, life science companies and regulators. We also carry out a systematic review of current practices and use cases for SML and, in particular, highlight benefits and drawbacks for the use of SML as a way to identify unmet needs of patients. While we find that the stakeholders are strongly aligned regarding the potential of social media for PFDD, we identify key areas in which regulatory guidance is needed to reduce uncertainty regarding the impact of SML as a source of patient experience data that has impact on regulatory decision making.

2.
J Am Med Inform Assoc ; 31(4): 991-996, 2024 04 03.
Article in English | MEDLINE | ID: mdl-38218723

ABSTRACT

OBJECTIVE: The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. In this paper, we present the annotated corpora, a technical summary of participants' systems, and the performance results. METHODS: The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of 5 tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events). RESULTS: In total, 29 teams registered, representing 17 countries. In general, the top-performing systems used deep neural network architectures based on pre-trained transformer models. In particular, the top-performing systems for the classification tasks were based on single models that were pre-trained on social media corpora. CONCLUSION: To facilitate future work, the datasets (a total of 61 353 posts) will remain available by request, and the CodaLab sites will remain active for a post-evaluation phase.


Subject(s)
Social Media , Humans , Data Mining/methods , Machine Learning , Natural Language Processing , Neural Networks, Computer
3.
Expert Opin Drug Discov ; 19(1): 33-42, 2024.
Article in English | MEDLINE | ID: mdl-37887266

ABSTRACT

INTRODUCTION: The concept of Digital Twins (DTs) translated to drug development and clinical trials describes virtual representations of systems of various complexities, ranging from individual cells to entire humans, and enables in silico simulations and experiments. DTs increase the efficiency of drug discovery and development by digitalizing processes associated with high economic, ethical, or social burden. The impact is multifaceted: DT models sharpen disease understanding, support biomarker discovery and accelerate drug development, thus advancing precision medicine. One way to realize DTs is by generative artificial intelligence (AI), a cutting-edge technology that enables the creation of novel, realistic and complex data with desired properties. AREAS COVERED: The authors provide a brief introduction to generative AI and describe how it facilitates the modeling of DTs. In addition, they compare existing implementations of generative AI for DTs in drug discovery and clinical trials. Finally, they discuss technical and regulatory challenges that should be addressed before DTs can transform drug discovery and clinical trials. EXPERT OPINION: The current state of DTs in drug discovery and clinical trials does not exploit the entire power of generative AI yet and is limited to simulation of a small number of characteristics. Nonetheless, generative AI has the potential to transform the field by leveraging recent developments in deep learning and customizing models for the needs of scientists, physicians and patients.


Subject(s)
Artificial Intelligence , Biomedical Research , Humans , Computer Simulation , Drug Development , Drug Discovery , Clinical Trials as Topic
4.
medRxiv ; 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37986776

ABSTRACT

The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of five tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events). In total, 29 teams registered, representing 18 countries. In this paper, we present the annotated corpora, a technical summary of the systems, and the performance results. In general, the top-performing systems used deep neural network architectures based on pre-trained transformer models. In particular, the top-performing systems for the classification tasks were based on single models that were pre-trained on social media corpora. To facilitate future work, the datasets (a total of 61,353 posts) will remain available by request, and the CodaLab sites will remain active for a post-evaluation phase.

5.
Database (Oxford) ; 2023, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36734300

ABSTRACT

This study presents the outcomes of the shared task competition BioCreative VII (Task 3) focusing on the extraction of medication names from a Twitter user's publicly available tweets (the user's 'timeline'). In general, detecting health-related tweets is notoriously challenging for natural language processing tools. The main challenge, aside from the informality of the language used, is that people tweet about any and all topics, and most of their tweets are not related to health. Thus, finding those tweets in a user's timeline that mention specific health-related concepts such as medications requires addressing extreme imbalance. Task 3 called for detecting tweets in a user's timeline that mention a medication name and, for each detected mention, extracting its span. The organizers made available a corpus consisting of 182 049 tweets publicly posted by 212 Twitter users with all medication mentions manually annotated. The corpus exhibits the natural distribution of positive tweets, with only 442 tweets (0.2%) mentioning a medication. This task was an opportunity for participants to evaluate methods that are robust to class imbalance beyond the simple lexical match. A total of 65 teams registered, and 16 teams submitted a system run. This study summarizes the corpus created by the organizers and the approaches taken by the participating teams for this challenge. The corpus is freely available at https://biocreative.bioinformatics.udel.edu/tasks/biocreative-vii/track-3/. The methods and the results of the competing systems are analyzed with a focus on the approaches taken for learning from class-imbalanced data.


Subject(s)
Data Mining , Natural Language Processing , Humans , Data Mining/methods
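The class imbalance described in the abstract above can be made concrete with a little arithmetic. The sketch below uses only the two corpus counts quoted in the abstract (182 049 tweets, 442 positives). The "always negative" baseline and the helper `precision_recall_f1` are illustrative assumptions, not a system from the shared task; they show why plain accuracy is uninformative at this level of imbalance.

```python
# Corpus counts quoted in the abstract.
TOTAL_TWEETS = 182_049
POSITIVE_TWEETS = 442


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1), using 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Share of tweets that mention a medication.
positive_rate = POSITIVE_TWEETS / TOTAL_TWEETS

# Hypothetical baseline: label every tweet as "no medication mentioned".
baseline_accuracy = (TOTAL_TWEETS - POSITIVE_TWEETS) / TOTAL_TWEETS
baseline_prf = precision_recall_f1(tp=0, fp=0, fn=POSITIVE_TWEETS)

print(f"positive rate:      {positive_rate:.2%}")
print(f"baseline accuracy:  {baseline_accuracy:.2%}")
print(f"baseline P/R/F1:    {baseline_prf}")
```

Such a baseline is almost always "right" yet finds no medication mentions at all, which is why evaluations on data like this typically focus on precision, recall, and F1 for the positive class.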
6.
Database (Oxford) ; 2022, 2022 09 02.
Article in English | MEDLINE | ID: mdl-36050787

ABSTRACT

Monitoring drug safety is a central concern throughout the drug life cycle. Information about toxicity and adverse events is generated at every stage of this life cycle, and stakeholders have a strong interest in applying text mining and artificial intelligence (AI) methods to manage the ever-increasing volume of this information. Recognizing the importance of these applications and the role of challenge evaluations to drive progress in text mining, the organizers of BioCreative VII (Critical Assessment of Information Extraction in Biology) convened a panel of experts to explore 'Challenges in Mining Drug Adverse Reactions'. This article is an outgrowth of the panel; each panelist has highlighted specific text mining application(s), based on their research and their experiences in organizing text mining challenge evaluations. While these highlighted applications only sample the complexity of this problem space, they reveal both opportunities and challenges for text mining to aid in the complex process of drug discovery, testing, marketing and post-market surveillance. Stakeholders are eager to embrace natural language processing and AI tools to help in this process, provided that these tools can be demonstrated to add value to stakeholder workflows. This creates an opportunity for the BioCreative community to work in partnership with regulatory agencies, pharma and the text mining community to identify next steps for future challenge evaluations.


Subject(s)
Artificial Intelligence , Computational Biology , Computational Biology/methods , Data Mining/methods , Health Personnel , Humans , Natural Language Processing
7.
Sci Rep ; 12(1): 14476, 2022 08 25.
Article in English | MEDLINE | ID: mdl-36008431

ABSTRACT

Drug resistance caused by mutations is a public health threat for existing and emerging viral diseases. A wealth of evidence about these mutations and their clinically associated phenotypes is scattered across the literature, but a comprehensive perspective is usually lacking. This work aimed to produce a clinically relevant view for the case of Hepatitis B virus (HBV) mutations by combining a chronic HBV clinical study with a compendium of genetic mutations systematically gathered from the scientific literature. We enriched clinical mutation data by systematically mining 2,472,725 scientific articles from PubMed Central in order to gather information about the HBV mutational landscape. By performing this analysis, we were able to identify mutational hotspots for each HBV genotype (A-E) and gene (C, X, P, S), as well as the location of disulfide bonds associated with these mutations. Through a modelling study, we also identified a mutation position common in both the clinical data and the literature that is located at the binding pocket for a known anti-HBV drug, namely entecavir. The results of this novel approach show the potential of integrated analyses to assist in the development of new drugs for viral diseases that are more robust to resistance. Such analyses should be of particular interest due to the increasing importance of viral resistance in established and emerging viruses, such as for newly developed drugs against SARS-CoV-2.


Subject(s)
COVID-19 Drug Treatment , Hepatitis B, Chronic , Antiviral Agents/pharmacology , Antiviral Agents/therapeutic use , DNA, Viral/genetics , Drug Resistance, Viral/genetics , Genotype , Hepatitis B virus/genetics , Humans , Mutation , SARS-CoV-2/genetics
8.
Database (Oxford) ; 2022, 2022 06 03.
Article in English | MEDLINE | ID: mdl-35657112

ABSTRACT

Current biological writing is afflicted by the use of ambiguous names, convoluted sentences, vague statements and narrative-fitted storylines. This represents a challenge for biological research in general and in particular for fields such as biological database curation and text mining, which have been tasked to cope with exponentially growing content. Improving the quality of biological writing by encouraging unambiguity and precision would foster expository discipline and machine reasoning. More specifically, the routine inclusion of formal languages in biological writing would improve our ability to describe, compile and model biology.


Subject(s)
Language , Writing , Data Mining , Databases, Factual , Natural Language Processing
9.
Cytometry B Clin Cytom ; 102(3): 220-227, 2022 05.
Article in English | MEDLINE | ID: mdl-35253974

ABSTRACT

BACKGROUND: A key step in clinical flow cytometry data analysis is gating, which involves the identification of cell populations. The process of gating produces a set of reportable results, which are typically described by gating definitions. The non-standardized, non-interpreted nature of gating definitions represents a hurdle for data interpretation and data sharing across and within organizations. Interpreting and standardizing gating definitions for subsequent analysis of gating results requires a curation effort from experts. Machine learning approaches have the potential to help in this process by predicting expert annotations associated with gating definitions. METHODS: We created a gold-standard dataset by manually annotating thousands of gating definitions with cell type and functional marker annotations. We used this dataset to train and test a machine learning pipeline able to predict standard cell types and functional marker genes associated with gating definitions. RESULTS: The machine learning pipeline predicted annotations with high accuracy for both cell types and functional marker genes. Accuracy was lower for gating definitions from assays belonging to laboratories from which limited or no prior data was available in the training. Manual error review ensured that resulting predicted annotations could be reused subsequently as additional gold-standard training data. CONCLUSIONS: Machine learning methods are able to consistently predict annotations associated with gating definitions from flow cytometry assays. However, a hybrid automatic and manual annotation workflow would be recommended to achieve optimal results.


Subject(s)
Machine Learning , Flow Cytometry , Humans , Workflow
10.
Drug Discov Today ; 27(5): 1523-1530, 2022 05.
Article in English | MEDLINE | ID: mdl-35114364

ABSTRACT

Social media listening has been increasingly acknowledged as a tool with applications in many stages of the drug development process. These applications were created to meet the need for patient-centric therapies that are fit-for-purpose and meaningful to patients. Such applications, however, require the leverage of new quantitative approaches and analytical methods that draw from developments in artificial intelligence and real-world data (RWD) analysis. Here, we review the state-of-the-art in quantitative social media listening (QSML) methods applied to drug discovery from the perspective of the pharmaceutical industry.


Subject(s)
Social Media , Artificial Intelligence , Drug Development , Drug Industry , Humans , Patient-Centered Care
11.
PeerJ ; 10: e12764, 2022.
Article in English | MEDLINE | ID: mdl-35070506

ABSTRACT

Delays in the propagation of scientific discoveries across scientific communities have been an oft-maligned feature of scientific research for introducing a bias towards knowledge that is produced within a scientist's closest community. The vastness of the scientific literature has been commonly blamed for this phenomenon, despite recent improvements in information retrieval and text mining. Its actual negative impact on scientific progress, however, has never been quantified. This analysis attempts to do so by exploring its effects on biomedical discovery, particularly in the discovery of relations between diseases, genes and chemical compounds. Results indicate that the probability that two scientific facts will enable the discovery of a new fact depends on how far apart these two facts were originally within the scientific landscape. In particular, the probability decreases exponentially with the citation distance. Thus, the direction of scientific progress is distorted based on the location in which each scientific fact is published, representing a path-dependent bias in which originally closely-located discoveries drive the sequence of future discoveries. To counter this bias, scientists should open the scope of their scientific work with modern information retrieval and extraction approaches.


Subject(s)
Biomedical Research , Data Mining , Data Mining/methods , Publications , Knowledge
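The abstract above reports that the probability of two facts enabling a new discovery decreases exponentially with citation distance. A minimal sketch of what such a relationship looks like follows. The functional form `p0 * exp(-decay * d)` and the values of `p0` and `decay` are assumptions chosen for illustration only; the abstract does not give fitted parameters.

```python
import math


def discovery_probability(distance, p0=0.5, decay=1.0):
    """Illustrative exponential decay: P(d) = p0 * exp(-decay * d).

    p0 (probability at distance 0) and decay (rate per citation step)
    are hypothetical placeholders, not values from the study.
    """
    if distance < 0:
        raise ValueError("citation distance must be non-negative")
    return p0 * math.exp(-decay * distance)


# Probabilities at citation distances 0..4.
probs = [discovery_probability(d) for d in range(5)]

# Under exponential decay, each extra citation step multiplies the
# probability by the same constant factor, exp(-decay).
ratios = [probs[i + 1] / probs[i] for i in range(len(probs) - 1)]

for d, p in enumerate(probs):
    print(f"distance {d}: P = {p:.4f}")
```

The constant step-to-step ratio is the defining property of exponential decay: moving one citation further away costs the same proportional drop regardless of the starting distance.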
12.
J Parkinsons Dis ; 12(1): 137-151, 2022.
Article in English | MEDLINE | ID: mdl-34657850

ABSTRACT

BACKGROUND: Individuals with Parkinson's disease (PD) develop a significant disease burden over time that contributes to a progressive decline in health-related quality of life. There is a paucity of qualitative research to understand symptoms and impacts in individuals with early-stage PD (i.e., Hoehn and Yahr stage 1-2 and ≤2 years since diagnosis). OBJECTIVE: The collection of qualitative data to inform the selection of clinical outcome assessments for clinical trials is advocated by regulators. This patient-centered, multistage study sought to create a conceptual model of symptoms and their impact for individuals with early-stage PD. METHODS: Symptoms and impacts of PD were gathered from a literature review of qualitative research, a quantitative social media listening analysis, and qualitative patient concept elicitation interviews (n = 35). Clinical experts provided input to validate and finalize the concepts. RESULTS: The final conceptual model consisted of 27 symptoms categorized into 'motor' or 'non-motor' domains, and 39 impacts divided into five domains. Most frequently reported symptoms in early-stage PD were 'tremors' (89%), 'stiffness and rigidity', and 'fatigue' (69%, both). Most frequently reported impacts included 'anxiety' (74%), 'eating and drinking' (71%), followed by 'exercise/sport' and 'relationship with family/family life' (66%, both). CONCLUSION: This study provides initial insights relating to the symptom and impact burden of early-stage PD patients. The conceptual model can be used to help researchers to develop and select optimal patient-centered outcomes to measure treatment benefit in clinical trials. These findings could inform future qualitative research and the development of outcomes specifically for early-stage PD patients.


Subject(s)
Parkinson Disease , Quality of Life , Fatigue , Humans , Parkinson Disease/complications , Parkinson Disease/diagnosis , Patient-Centered Care , Qualitative Research
13.
JMIR Med Inform ; 9(11): e26272, 2021 Nov 11.
Article in English | MEDLINE | ID: mdl-34762056

ABSTRACT

BACKGROUND: The abundance of online content contributed by patients is a rich source of insight about the lived experience of disease. Patients share disease experiences with other members of the patient and caregiver community and do so using their own lexicon of words and phrases. This lexicon and the topics that are communicated using words and phrases belonging to the lexicon help us better understand disease burden. Insights from social media may ultimately guide clinical development in ways that ensure that future treatments are fit for purpose from the patient's perspective. OBJECTIVE: We sought insights into the patient experience of chronic obstructive pulmonary disease (COPD) by analyzing a substantial corpus of social media content. The corpus was sufficiently large to make manual review and manual coding all but impossible to perform in a consistent and systematic fashion. Advanced analytics were applied to the corpus content in the search for associations between symptoms and impacts across the entire text corpus. METHODS: We conducted a retrospective, cross-sectional study of 5663 posts sourced from open blogs and online forum posts published by COPD patients between February 2016 and August 2019. We applied a novel neural network approach to identify a lexicon of community words and phrases used by patients to describe their symptoms. We used this lexicon to explore the relationship between COPD symptoms and disease-related impacts. RESULTS: We identified a diverse lexicon of community words and phrases for COPD symptoms, including gasping, wheezy, mucus-y, and muck. These symptoms were mentioned in association with specific words and phrases for disease impact such as frightening, breathing discomfort, and difficulty exercising. Furthermore, we found an association between mucus hypersecretion and moderate disease severity, which distinguished mucus from the other main COPD symptoms, namely breathlessness and cough. 
CONCLUSIONS: We demonstrated the potential of neural networks and advanced analytics to gain patient-focused insights about how each distinct COPD symptom contributes to the burden of chronic and acute respiratory illness. Using a neural network approach, we identified words and phrases for COPD symptoms that were specific to the patient community. Identifying patterns in the association between symptoms and impacts deepened our understanding of the patient experience of COPD. This approach can be readily applied to other disease areas.

14.
BMC Bioinformatics ; 22(1): 95, 2021 Feb 26.
Article in English | MEDLINE | ID: mdl-33637047

ABSTRACT

BACKGROUND: Numerous efforts have been poured into annotating the wealth of knowledge contained in biomedical articles. Thanks to such efforts, it is now possible to quantitatively explore relations between these annotations and the citation network at large scale. RESULTS: With the aid of several large and small annotation databases, this study shows that articles share annotations with their citation neighborhood to the point that the neighborhood's most common annotations are likely to be those appearing in the article. CONCLUSIONS: These findings posit that an article's citation neighborhood defines to a large extent the article's annotated content. Thus, citations should be considered as a foundation for future knowledge management and annotation of biomedical articles.


Subject(s)
Bibliometrics , Databases, Factual , Publishing
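The finding in the abstract above, that an article's annotations tend to match the most common annotations in its citation neighborhood, suggests a simple prediction heuristic. The sketch below is one possible reading of that idea, not the study's method: the function `predict_annotations`, the toy citation pairs, and the toy annotation sets are all hypothetical.

```python
from collections import Counter


def predict_annotations(article, citations, annotations, top_k=2):
    """Guess an article's annotations from its direct citation neighbors.

    citations:   iterable of (citing, cited) pairs
    annotations: dict mapping article id -> set of annotation terms
    Returns the top_k most frequent neighbor annotations
    (ties broken alphabetically so the result is deterministic).
    """
    neighbors = set()
    for citing, cited in citations:
        if citing == article:
            neighbors.add(cited)
        elif cited == article:
            neighbors.add(citing)

    counts = Counter(
        term for n in neighbors for term in annotations.get(n, ())
    )
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


# Hypothetical toy data: article "A" cites "B" and "C" and is cited by "D".
citations = [("A", "B"), ("A", "C"), ("D", "A")]
annotations = {
    "B": {"TP53", "apoptosis"},
    "C": {"TP53"},
    "D": {"TP53", "apoptosis", "BRCA1"},
}

print(predict_annotations("A", citations, annotations))
```

Both citing and cited articles are treated as neighbors here, since the abstract speaks of the citation neighborhood without restricting direction.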
15.
Bioinformatics ; 36(7): 2224-2228, 2020 04 01.
Article in English | MEDLINE | ID: mdl-31830249

ABSTRACT

MOTIVATION: Name ambiguity has long been a central problem in biomedical text mining. To tackle it, it has been usually assumed that names present only one meaning within a given text. It is not known whether this assumption applies beyond the scope of single documents. RESULTS: Using a new method that leverages large numbers of biomedical annotations and normalized citations, this study shows that ambiguous biomedical names mentioned in scientific articles tend to present the same meaning in articles that cite them or that they cite, and, to a lesser extent, two steps away in the citation network. Citations, therefore, can be regarded as semantic connections between articles and the citation network should be considered for tasks such as automatic name disambiguation, entity linking and biomedical database annotation. A simple experiment shows the applicability of these findings to name disambiguation. AVAILABILITY AND IMPLEMENTATION: The code used for this analysis is available at: https://github.com/raroes/one-sense-per-citation-network.


Subject(s)
Data Mining , Semantics , Databases, Factual
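The abstract above reports that an ambiguous name tends to keep the same meaning in directly linked articles and, to a lesser extent, two steps away in the citation network. A minimal sketch of how that observation could drive name disambiguation follows. It is not the code from the linked repository: `build_graph`, `disambiguate`, the article ids, and the example senses for "CAT" are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_graph(citations):
    """Build an undirected adjacency map from (citing, cited) pairs."""
    graph = defaultdict(set)
    for citing, cited in citations:
        graph[citing].add(cited)
        graph[cited].add(citing)
    return graph


def disambiguate(article, name, graph, known_senses, max_steps=2):
    """Pick a sense for `name` by majority vote among citation neighbors.

    Direct neighbors vote first; if none of them has a known sense for
    the name, articles two steps away vote instead. Returns None when
    no vote is available. Ties are broken alphabetically.
    """
    seen = {article}
    frontier = {article}
    for _ in range(max_steps):
        frontier = {n for a in frontier for n in graph.get(a, ())} - seen
        seen |= frontier
        votes = Counter(
            known_senses[(n, name)] for n in frontier if (n, name) in known_senses
        )
        if votes:
            best = max(votes.values())
            return min(s for s, c in votes.items() if c == best)
    return None


# Hypothetical toy network and sense annotations.
graph = build_graph([("P1", "P2"), ("P2", "P3"), ("P4", "P5")])
known_senses = {
    ("P3", "CAT"): "catalase",
    ("P5", "CAT"): "chloramphenicol acetyltransferase",
}

print(disambiguate("P4", "CAT", graph, known_senses))  # one step away
print(disambiguate("P1", "CAT", graph, known_senses))  # two steps away
```

Voting at distance one before falling back to distance two mirrors the abstract's statement that the effect is weaker at two steps.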
16.
J Am Med Inform Assoc ; 26(10): 1037-1045, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30958542

ABSTRACT

OBJECTIVE: Author-centric analyses of fast-growing biomedical reference databases are challenging due to author ambiguity. This problem has been mainly addressed through author disambiguation using supervised machine-learning algorithms. Such algorithms, however, require adequately designed gold standards that reflect the reference database properly. In this study we used MEDLINE to build the first unbiased gold standard in a reference database and improve over the existing state of the art in author disambiguation. MATERIALS AND METHODS: Following a new corpus design method, publication pairs randomly picked from MEDLINE were evaluated by both crowdsourcing and expert curators. Because the latter showed higher accuracy than crowdsourcing, expert curators were tasked to create a full corpus. The corpus was then used to explore new features that could improve state-of-the-art author disambiguation algorithms that would not have been discoverable with previously existing gold standards. RESULTS: We created a gold standard based on 1900 publication pairs that shows close similarity to MEDLINE in terms of chronological distribution and information completeness. A machine-learning algorithm that includes new features related to the ethnic origin of authors showed significant improvements over the current state of the art and demonstrates the necessity of realistic gold standards to further develop effective author disambiguation algorithms. DISCUSSION AND CONCLUSION: An unbiased gold standard can give a more accurate picture of the status of author disambiguation research and help in the discovery of new features for machine learning. The principles and methods shown here can be applied to other reference databases beyond MEDLINE. The gold standard and code used for this study are available at the following repository: https://github.com/amorgani/AND/.


Subject(s)
Authorship , Data Mining/methods , MEDLINE , Machine Learning , Reference Standards , Algorithms , Crowdsourcing , Databases, Bibliographic/standards , MEDLINE/standards
17.
Mol Cytogenet ; 12: 14, 2019.
Article in English | MEDLINE | ID: mdl-30930962

ABSTRACT

BACKGROUND: We have previously described evidence for a statistically significant, global, supra-chromosomal representation of the human body that appears to stretch over the entire genome. RESULTS: Here, we extend the genome mapping model, zooming down to the typical individual animal cell. Its cellular organization appears to be significantly mapped onto the human genome: Evidence is reported for a "cellunculus" - on the model of a homunculus, on the H. sapiens genome. CONCLUSIONS: Basic cell structure turns out to map similarly onto the total genome, mirrored via genes that express in particular cell organelles (e.g., "nuclear membrane"). Similar cell maps may also appear on individual chromosomes that map topologically on the dorsoventral body axis. This seems to constitute some of the basic structural and functional organization of nucleus and chromosome architecture.

18.
BMC Med Genomics ; 10(1): 59, 2017 10 11.
Article in English | MEDLINE | ID: mdl-29020950

ABSTRACT

BACKGROUND: Differential gene expression is important to understand the biological differences between healthy and diseased states. Two common sources of differential gene expression data are microarray studies and the biomedical literature. METHODS: With the aid of text mining and gene expression analysis we have examined the comparative properties of these two sources of differential gene expression data. RESULTS: The literature shows a preference for reporting genes associated with higher fold changes in microarray data, rather than genes that are simply significantly differentially expressed. Thus, the resemblance between the literature and microarray data increases when the fold-change threshold for microarray data is increased. Moreover, the literature has a reporting preference for differentially expressed genes that (1) are overexpressed rather than underexpressed; (2) are overexpressed in multiple diseases; and (3) are popular in the biomedical literature at large. Additionally, the degree to which diseases are similar depends on whether microarray data or the literature is used to compare them. Finally, vaguely qualified reports of differential expression magnitudes in the literature have only a small correlation with microarray fold-change data. CONCLUSIONS: Reporting biases of differential gene expression in the literature may be affecting our appreciation of disease biology and of the degree of similarity that actually exists between different diseases.


Subject(s)
Disease/genetics , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis , Data Mining , Humans
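The abstract above states that the resemblance between the literature and microarray data increases as the fold-change threshold is raised. The sketch below shows one way such a comparison could be computed. Jaccard similarity is a swapped-in, assumed measure (the abstract does not name the metric used), and the gene ids and fold-change values are toy data constructed to exhibit the described pattern, not results from the study.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_by_threshold(fold_changes, literature, thresholds):
    """For each threshold, compare genes at or above it with the literature set."""
    result = {}
    for t in thresholds:
        selected = {gene for gene, fc in fold_changes.items() if fc >= t}
        result[t] = jaccard(selected, literature)
    return result


# Hypothetical toy data: fold changes from a microarray study, and the
# genes reported as differentially expressed in the literature.
microarray_fold_change = {"G1": 8.0, "G2": 4.5, "G3": 2.1, "G4": 1.6, "G5": 1.5}
literature_genes = {"G1", "G2"}

overlaps = overlap_by_threshold(
    microarray_fold_change, literature_genes, thresholds=[1.5, 2.0, 4.0]
)
for threshold, score in overlaps.items():
    print(f"fold change >= {threshold}: Jaccard = {score:.2f}")
```

In this toy setup the literature covers only the genes with the largest fold changes, so the similarity rises as low-fold-change genes are filtered out.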
19.
Article in English | MEDLINE | ID: mdl-27589961

ABSTRACT

Fully automated text mining (TM) systems promote efficient literature searching, retrieval, and review but are not sufficient to produce ready-to-consume curated documents. These systems are not meant to replace biocurators, but instead to assist them in one or more literature curation steps. To do so, the user interface is an important aspect that needs to be considered for tool adoption. The BioCreative Interactive task (IAT) is a track designed for exploring user-system interactions, promoting development of useful TM tools, and providing a communication channel between the biocuration and the TM communities. In BioCreative V, the IAT track followed a format similar to previous interactive tracks, where the utility and usability of TM tools, as well as the generation of use cases, have been the focal points. The proposed curation tasks are user-centric and formally evaluated by biocurators. In BioCreative V IAT, seven TM systems and 43 biocurators participated. Two levels of user participation were offered to broaden curator involvement and obtain more feedback on usability aspects. The full level participation involved training on the system, curation of a set of documents with and without TM assistance, tracking of time-on-task, and completion of a user survey. The partial level participation was designed to focus on usability aspects of the interface and not the performance per se. In this case, biocurators navigated the system by performing pre-designed tasks and then were asked whether they were able to achieve the task and the level of difficulty in completing the task. In this manuscript, we describe the development of the interactive task, from planning to execution, and discuss major findings for the systems tested. Database URL: http://www.biocreative.org.


Subject(s)
Data Curation/methods , Data Mining/methods , Electronic Data Processing/methods
20.
Drug Discov Today ; 21(6): 997-1002, 2016 06.
Article in English | MEDLINE | ID: mdl-27179985

ABSTRACT

Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery.


Subject(s)
Data Mining , Patents as Topic , Biomedical Research , Drug Discovery