1.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33847357

ABSTRACT

Bridging heterogeneous mutation data fills the gap between data categories and propels the discovery of disease-related genes. Genome-wide association studies (GWAS) infer significant mutation associations that link genotype and phenotype. However, because GWAS differ in size and quality, not all genuinely important variants pass multiple-testing correction. Meanwhile, mutation events widely reported in the literature reveal typical functional biological processes, including mutation types such as gain of function and loss of function. To bring these heterogeneous mutation data together, we propose a 'Gene-Disease Association prediction by Mutation Data Bridging (GDAMDB)' pipeline with a statistical generative model. The model learns the distribution parameters of mutation associations and mutation types and recovers false-negative GWAS mutations that fail the significance test but are supported in the literature as evidence of a functional biological process. We applied GDAMDB to Alzheimer's disease (AD) and predicted 79 AD-associated genes. Of these, 12 come from the original GWAS, 60 are supported as AD-related by other GWAS or literature reports, and the rest are newly predicted. Our model can enhance GWAS-based gene association discovery by combining it with text mining results. The positive result indicates that bridging heterogeneous mutation data contributes to the discovery of novel disease-related genes.
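
The multiple-testing hurdle mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain Bonferroni correction, which the abstract does not name; the function name and the p-values are hypothetical and are not taken from GDAMDB.

```python
def passes_bonferroni(p_value, n_tests, alpha=0.05):
    """Return True if p_value survives a Bonferroni correction over n_tests tests.

    Assumption: Bonferroni is used here only as the simplest correction;
    the pipeline described in the abstract may use a different procedure.
    """
    return p_value <= alpha / n_tests


# With one million tests the per-variant threshold is 0.05 / 1e6 = 5e-8.
n_variants = 1_000_000
strong_hit = passes_bonferroni(1e-9, n_variants)    # clears the threshold
borderline_hit = passes_bonferroni(1e-6, n_variants)  # rejected despite a small p-value
```

A variant like the second one is the kind of candidate that a literature-backed model could revisit: it looks associated but does not survive correction.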


Subject(s)
Alzheimer Disease/genetics , Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Mutation , Polymorphism, Single Nucleotide , Algorithms , Computational Biology/methods , Data Mining/methods , Gene Regulatory Networks/genetics , Genotype , Humans , Phenotype , Protein Interaction Maps/genetics , Reproducibility of Results
2.
J Biomed Inform ; 126: 103973, 2022 02.
Article in English | MEDLINE | ID: mdl-34995810

ABSTRACT

MOTIVATION: Node embedding of biological entity networks has been widely investigated for downstream applications. To embed the full semantics of genes and diseases, we consider a multi-relational heterogeneous graph in which single-relation links between genes or diseases and other heterogeneous entities are abundant, while multi-relation links between genes and diseases are relatively sparse. Given this novel graph format, it is worthwhile to design a dedicated data integration algorithm that fully captures the graph information and yields high-quality embeddings. RESULTS: First, a typical multi-relational triple dataset was introduced, which carried significant associations between genes and diseases. Second, we curated all human genes and diseases in seven mainstream datasets and constructed a large-scale gene-disease network comprising 163,024 nodes and 25,265,607 edges, covering 27,165 genes, 2,665 diseases, 15,067 chemicals, 108,023 mutations, 2,363 pathways, and 7,732 phenotypes. Third, we proposed a Joint Decomposition of Heterogeneous Matrix and Tensor (JDHMT) model, which integrated all heterogeneous data resources and obtained an embedding for each gene or disease. Fourth, a visualized intrinsic evaluation was performed, which investigated the embeddings in terms of interpretable data clustering. Furthermore, an extrinsic evaluation was performed in the form of link prediction. Both intrinsic and extrinsic evaluation results showed that the JDHMT model outperformed eleven other state-of-the-art (SOTA) methods from the relation-learning, proximity-preserving, and message-passing paradigms. Finally, the constructed gene-disease network, embedding results, and code were made available. DATA AND CODE AVAILABILITY: The constructed gene-disease network is available at: https://hzaubionlp.com/heterogeneous-biological-network/. The code is available at: https://github.com/bionlp-hzau/JDHMT.


Subject(s)
Algorithms , Semantics , Learning , Phenotype
3.
J Med Internet Res ; 22(8): e20773, 2020 Aug 14.
Article in English | MEDLINE | ID: mdl-32759101

ABSTRACT

BACKGROUND: A novel disease poses special challenges for informatics solutions. Biomedical informatics relies for the most part on structured data, which require a preexisting data or knowledge model; however, novel diseases do not have preexisting knowledge models. In an emergent epidemic, language processing can enable rapid conversion of unstructured text to a novel knowledge model. However, although this idea has often been suggested, no opportunity has arisen to actually test it in real time. The current coronavirus disease (COVID-19) pandemic presents such an opportunity. OBJECTIVE: The aim of this study was to evaluate the added value of information from clinical text in response to emergent diseases using natural language processing (NLP). METHODS: We explored the effects of long-term treatment by calcium channel blockers on the outcomes of COVID-19 infection in patients with high blood pressure during in-patient hospital stays using two sources of information: data available strictly from structured electronic health records (EHRs) and data available through structured EHRs and text mining. RESULTS: In this multicenter study involving 39 hospitals, text mining increased the statistical power sufficiently to change a negative result for an adjusted hazard ratio to a positive one. Compared to the baseline structured data, the number of patients available for inclusion in the study increased by 2.95 times, the amount of available information on medications increased by 7.2 times, and the amount of additional phenotypic information increased by 11.9 times. CONCLUSIONS: In our study, use of calcium channel blockers was associated with decreased in-hospital mortality in patients with COVID-19 infection. This finding was obtained by quickly adapting an NLP pipeline to the domain of the novel disease; the adapted pipeline still performed sufficiently to extract useful information. 
When that information was used to supplement existing structured data, the sample size could be increased sufficiently to see treatment effects that were not previously statistically detectable.


Subject(s)
Betacoronavirus , Calcium Channel Blockers/therapeutic use , Coronavirus Infections/drug therapy , Hypertension/complications , Natural Language Processing , Pneumonia, Viral/drug therapy , COVID-19 , Coronavirus Infections/complications , Data Mining , Electronic Health Records , Humans , Pandemics , Pneumonia, Viral/complications , SARS-CoV-2 , Time Factors , COVID-19 Drug Treatment
4.
Breast Cancer Res Treat ; 162(3): 571-580, 2017 04.
Article in English | MEDLINE | ID: mdl-28190250

ABSTRACT

PURPOSE: To examine the association of plasma carotenoids, micronutrients in fruits and vegetables, with risk of premalignant breast disease (PBD) in younger women. METHODS: Blood samples were collected at the Siteman Cancer Center between 2008 and 2012 from 3537 women aged 50 or younger with no history of cancer or PBD. The analysis included 147 participants diagnosed with benign breast disease or breast carcinoma in situ during a 27-month follow-up and 293 controls. Cases and controls were matched on age, race/ethnicity, and date of and fasting status at blood draw. Plasma carotenoids were quantified. We used logistic regression to calculate odds ratios (ORs) and 95% confidence intervals (CIs) and linear regression to assess racial differences in plasma carotenoids. RESULTS: The risk reduction between the highest and lowest tertiles varied by carotenoid, with β-cryptoxanthin having the greatest reduction (OR 0.62; 95% CI, 0.62-1.09; P trend = 0.056) and total carotenoids the least (OR 0.83; 95% CI, 0.48-1.44; P trend = 0.12). We observed an inverse association between plasma carotenoids and risk of PBD in obese women (BMI ≥ 30 kg/m²; 61 cases and 115 controls) but not lean women (BMI < 25 kg/m²; 54 cases and 79 controls), although the interaction was not statistically significant. Compared to white women, black women had lower levels of α- and β-carotene and higher levels of β-cryptoxanthin and lutein/zeaxanthin. CONCLUSIONS: We observed suggestive inverse associations between plasma carotenoids and risk of PBD in younger women, consistent with inverse associations reported for invasive breast cancer. Carotenoids may play a role early in breast cancer development.
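
As a rough illustration of how an odds ratio and its 95% confidence interval are derived, the sketch below uses an unadjusted 2x2 table with a Wald interval. This is a simplification and not the study's method, which used logistic regression on matched cases and controls; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval from a 2x2 table.

    z=1.96 corresponds to a 95% interval. All four cells must be non-zero.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: "exposed" = highest carotenoid tertile, "unexposed" = lowest.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 40, 60)
```

An interval whose upper bound exceeds 1, as in the results reported above, means the data are still compatible with no association.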


Subject(s)
Breast Neoplasms/blood , Breast Neoplasms/pathology , Carotenoids/blood , Precancerous Conditions/blood , Precancerous Conditions/pathology , Adult , Age Factors , Biomarkers , Biopsy , Case-Control Studies , Female , Humans , Middle Aged , Odds Ratio , Risk , Young Adult
5.
Mycopathologia ; 182(3-4): 315-318, 2017 Apr.
Article in English | MEDLINE | ID: mdl-27822731

ABSTRACT

Pseudomonas aeruginosa and Aspergillus fumigatus are the leading bacterial and fungal pathogens in cystic fibrosis (CF). We have shown that Af biofilms are susceptible to Pseudomonas, particularly CF phenotypes. Those studies were performed with a reference virulent non-CF Aspergillus. Pseudomonas resident in CF airways undergo profound genetic and phenotypic adaptations to the abnormal environment. Studies have also indicated Aspergillus from CF patients have unexpected profiles of antifungal susceptibility. This would suggest that Aspergillus isolates from CF patients may be different or altered from other clinical isolates. It is important to know whether Aspergillus may also be altered, as a result of that CF environment, in susceptibility to Pseudomonas. CF Aspergillus proved not different in that susceptibility.


Subject(s)
Aspergillosis/microbiology , Aspergillus fumigatus/isolation & purification , Aspergillus fumigatus/physiology , Biofilms/growth & development , Cystic Fibrosis/complications , Microbial Interactions , Pseudomonas aeruginosa/physiology , Antifungal Agents/pharmacology , Aspergillus fumigatus/drug effects , Humans , Microbial Viability , Pseudomonas aeruginosa/isolation & purification
6.
Antimicrob Agents Chemother ; 59(10): 6514-20, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26239975

ABSTRACT

Iron acquisition is crucial for the growth of Aspergillus fumigatus. A. fumigatus biofilm formation occurs in vitro and in vivo and is associated with physiological changes. In this study, we assessed the effects of Fe chelators on biofilm formation and development. Deferiprone (DFP), deferasirox (DFS), and deferoxamine (DFM) were tested for MIC against a reference isolate via a broth macrodilution method. The metabolic effects (assessed by XTT [2,3-bis[2-methoxy-4-nitro-5-sulfophenyl]-2H-tetrazolium-5-carboxanilide inner salt]) on biofilm formation by conidia were studied upon exposure to DFP, DFM, DFP plus FeCl3, or FeCl3 alone. A preformed biofilm was exposed to DFP with or without FeCl3. The DFP and DFS MIC50 against planktonic A. fumigatus was 1,250 µM, and XTT gave the same result. DFM showed no planktonic inhibition at concentrations of ≤2,500 µM. By XTT testing, DFM concentrations of <1,250 µM had no effect, whereas DFP at 2,500 µM increased biofilms forming in A. fumigatus or preformed biofilms (P < 0.01). DFP at 156 to 2,500 µM inhibited biofilm formation (P < 0.01 to 0.001) in a dose-responsive manner. Biofilm formation with 625 µM DFP plus any concentration of FeCl3 was lower than that in the controls (P < 0.05 to 0.001). FeCl3 at ≥625 µM reversed the DFP inhibitory effect (P < 0.05 to 0.01), but the reversal was incomplete compared to the controls (P < 0.05 to 0.01). For preformed biofilms, DFP in the range of ≥625 to 1,250 µM was inhibitory compared to the controls (P < 0.01 to 0.001). FeCl3 at ≥625 µM overcame inhibition by 625 µM DFP (P < 0.001). FeCl3 alone at ≥156 µM stimulated biofilm formation (P < 0.05 to 0.001). Preformed A. fumigatus biofilm increased with 2,500 µM FeCl3 only (P < 0.05). In a strain survey, various susceptibilities of biofilms of A. fumigatus clinical isolates to DFP were noted. In conclusion, iron stimulates biofilm formation and preformed biofilms. Chelators can inhibit or enhance biofilms. 
Chelation may be a potential therapy for A. fumigatus, but we show here that chelators must be chosen carefully. Individual isolate susceptibility assessments may be needed.


Subject(s)
Antifungal Agents/pharmacology , Aspergillus fumigatus/drug effects , Benzoates/pharmacology , Biofilms/drug effects , Deferoxamine/pharmacology , Iron Chelating Agents/pharmacology , Pyridones/pharmacology , Triazoles/pharmacology , Aspergillus fumigatus/growth & development , Aspergillus fumigatus/metabolism , Biofilms/growth & development , Chlorides/pharmacology , Deferasirox , Deferiprone , Ferric Compounds/pharmacology , Iron/metabolism , Microbial Sensitivity Tests , Plankton/drug effects , Plankton/growth & development , Plankton/metabolism , Spores, Fungal/drug effects , Spores, Fungal/growth & development , Spores, Fungal/metabolism , Tetrazolium Salts
7.
BMC Med Inform Decis Mak ; 13: 53, 2013 Apr 24.
Article in English | MEDLINE | ID: mdl-23617267

ABSTRACT

BACKGROUND: Cincinnati Children's Hospital Medical Center (CCHMC) has built the initial Natural Language Processing (NLP) component to extract medications with their corresponding medical conditions (Indications, Contraindications, Overdosage, and Adverse Reactions) as triples of medication-related information ([(1) drug name]-[(2) medical condition]-[(3) LOINC section header]) for an intelligent database system, in order to improve patient safety and the quality of health care. The Food and Drug Administration's (FDA) drug labels are used to demonstrate the feasibility of building the triples as an intelligent database system task. METHODS: This paper discusses a hybrid NLP system, called AutoMCExtractor, to collect medical conditions (including disease/disorder and sign/symptom) from drug labels published by the FDA. Altogether, 6,611 medical conditions in a manually-annotated gold standard were used for the system evaluation. The pre-processing step extracted the plain text from XML file and detected eight related LOINC sections (e.g. Adverse Reactions, Warnings and Precautions) for medical condition extraction. Conditional Random Fields (CRF) classifiers, trained on token, linguistic, and semantic features, were then used for medical condition extraction. Lastly, dictionary-based post-processing corrected boundary-detection errors of the CRF step. We evaluated the AutoMCExtractor on manually-annotated FDA drug labels and report the results on both token and span levels. RESULTS: Precision, recall, and F-measure were 0.90, 0.81, and 0.85, respectively, for the span level exact match; for the token-level evaluation, precision, recall, and F-measure were 0.92, 0.73, and 0.82, respectively. CONCLUSIONS: The results demonstrate that (1) medical conditions can be extracted from FDA drug labels with high performance; and (2) it is feasible to develop a framework for an intelligent database system.
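
The span-level exact-match scores reported in this abstract can be sketched as follows. This is a minimal illustration, not the AutoMCExtractor evaluation code; the function names and the character-offset spans are hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(gold_spans, predicted_spans):
    """Exact-match scoring: a predicted span counts only if it equals a gold span."""
    gold, predicted = set(gold_spans), set(predicted_spans)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical (start, end) character offsets of medical-condition mentions.
gold = [(0, 12), (20, 28), (40, 55)]
predicted = [(0, 12), (20, 30)]  # the second span has a boundary error
p, r, f = precision_recall_f(gold, predicted)

# Plugging in the reported span-level precision and recall.
reported_f = round(f_measure(0.90, 0.81), 2)
```

The boundary error in the second predicted span is the kind of mistake that exact-match scoring penalizes fully, which is why a post-processing step that corrects span boundaries can raise span-level scores. With the reported span-level precision of 0.90 and recall of 0.81, the harmonic mean rounds to the reported 0.85.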


Subject(s)
Adverse Drug Reaction Reporting Systems , Data Mining/methods , Drug Labeling , United States Food and Drug Administration , Humans , Medication Systems , Natural Language Processing , Ohio , United States
8.
BMC Bioinformatics ; 13: 207, 2012 Aug 17.
Article in English | MEDLINE | ID: mdl-22901054

ABSTRACT

BACKGROUND: We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. RESULTS: Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. CONCLUSIONS: The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications.


Subject(s)
Data Mining/methods , Natural Language Processing , Software
9.
Abdom Radiol (NY) ; 47(8): 2721-2729, 2022 08.
Article in English | MEDLINE | ID: mdl-35072783

ABSTRACT

Abdominal radiologists perform a wide variety of image-guided interventions. Procedures performed by abdominal radiologists can be broadly categorized into paracentesis, thoracentesis, superficial and deep soft tissue biopsy, drain placement, and ablation. As these procedures continue to develop as an alternative to more invasive and potentially morbid interventions, and with continued improvements in minimally invasive technologies, it becomes increasingly important for abdominal radiologists to be familiar with options for peri-procedural analgesia and anxiolysis, as well as when to consult anesthesiology. In this review, we discuss analgesic, anxiolytic, and nonpharmacologic options available to the abdominal radiologist. We focus on practical agents that are relatively safe for general use, special populations, and considerations for post-procedural monitoring.


Subject(s)
Analgesia , Radiologists , Drainage/methods , Humans , Pain Management , Paracentesis
10.
BMC Bioinformatics ; 12 Suppl 8: S1, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151647

ABSTRACT

BACKGROUND: The overall goal of the BioCreative Workshops is to promote the development of text mining and text processing tools which are useful to the communities of researchers and database curators in the biological sciences. To this end BioCreative I was held in 2004, BioCreative II in 2007, and BioCreative II.5 in 2009. Each of these workshops involved humanly annotated test data for several basic tasks in text mining applied to the biomedical literature. Participants in the workshops were invited to compete in the tasks by constructing software systems to perform the tasks automatically and were given scores based on their performance. The results of these workshops have benefited the community in several ways. They have 1) provided evidence for the most effective methods currently available to solve specific problems; 2) revealed the current state of the art for performance on those problems; 3) and provided gold standard data and results on that data by which future advances can be gauged. This special issue contains overview papers for the three tasks of BioCreative III. RESULTS: The BioCreative III Workshop was held in September of 2010 and continued the tradition of a challenge evaluation on several tasks judged basic to effective text mining in biology, including a gene normalization (GN) task and two protein-protein interaction (PPI) tasks. In total the Workshop involved the work of twenty-three teams. Thirteen teams participated in the GN task which required the assignment of EntrezGene IDs to all named genes in full text papers without any species information being provided to a system. Ten teams participated in the PPI article classification task (ACT) requiring a system to classify and rank a PubMed® record as belonging to an article either having or not having "PPI relevant" information. 
Eight teams participated in the PPI interaction method task (IMT) where systems were given full text documents and were required to extract the experimental methods used to establish PPIs and a text segment supporting each such method. Gold standard data was compiled for each of these tasks and participants competed in developing systems to perform the tasks automatically. BioCreative III also introduced a new interactive task (IAT), run as a demonstration task. The goal was to develop an interactive system to facilitate a user's annotation of the unique database identifiers for all the genes appearing in an article. This task included ranking genes by importance (based preferably on the amount of described experimental information regarding genes). There was also an optional task to assist the user in finding the most relevant articles about a given gene. For BioCreative III, a user advisory group (UAG) was assembled and played an important role 1) in producing some of the gold standard annotations for the GN task, 2) in critiquing IAT systems, and 3) in providing guidance for a future more rigorous evaluation of IAT systems. Six teams participated in the IAT demonstration task and received feedback on their systems from the UAG. Besides innovations in the GN and PPI tasks making them more realistic and practical and the introduction of the IAT task, discussions were begun on community data standards to promote interoperability and on user requirements and evaluation metrics to address utility and usability of systems. CONCLUSIONS: In this paper we give a brief history of the BioCreative Workshops and how they relate to other text mining competitions in biology. This is followed by a synopsis of the three tasks GN, PPI, and IAT in BioCreative III with figures for best participant performance on the GN and PPI tasks. 
These results are discussed and compared with results from previous BioCreative Workshops and we conclude that the best performing systems for GN, PPI-ACT and PPI-IMT in realistic settings are not sufficient for fully automatic use. This provides evidence for the importance of interactive systems and we present our vision of how best to construct an interactive system for a GN or PPI like task in the remainder of the paper.


Subject(s)
Computational Biology/methods , Data Mining , Genes , Proteins/metabolism , Software , Animals , Computational Biology/standards , Humans , Periodicals as Topic , Proteins/genetics
11.
Int J Med Inform ; 149: 104410, 2021 05.
Article in English | MEDLINE | ID: mdl-33621793

ABSTRACT

BACKGROUND: Decision making in the Emergency Department (ED) requires timely identification of clinical information relevant to the complaints. Existing information retrieval solutions for the electronic health record (EHR) focus on patient cohort identification and lack clinical relevancy ranking. We aimed to compare knowledge-based (KB) and unsupervised statistical methods for ranking EHR information by relevancy to a chief complaint of chest or back pain among ED patients. METHODS: We used Pointwise-mutual information (PMI) with corpus level significance adjustment (cPMId), which modifies PMI to reward co-occurrence patterns with a higher absolute count. cPMId for each pair of medication/problem and chief complaint was estimated from a corpus of 100,000 un-annotated ED encounters. Five specialist physicians ranked the relevancy of medications and problems to each chief complaint on a 0-4 Likert scale to form the KB ranking. Reverse chronological order was used as a baseline. We directly compared the three methods on 1010 medications and 2913 problems from 99 patients with chest or back pain, where each item was manually labeled as relevant or not to the chief complaint, using mean average-precision. RESULTS: cPMId out-performed KB ranking on problems (86.8% vs. 81.3%, p < 0.01) but under-performed it on medications (93.1% vs. 96.8%, p < 0.01). Both methods significantly outperformed the baseline for both medications and problems (71.8% and 72.1%, respectively, p < 0.01 for both comparisons). The two complaints represented virtually completely different information needs (average Jaccard index of 0.008). CONCLUSION: A fully unsupervised statistical method can provide a reasonably accurate, low-effort and scalable means for situation-specific ranking of clinical information within the EHR.
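
The pointwise mutual information underlying the method in this abstract can be sketched as below. The sketch implements plain PMI only; the corpus-level significance adjustment that turns PMI into cPMId is not reproduced here. The function name and the counts are hypothetical.

```python
import math


def pmi(count_xy, count_x, count_y, n_encounters):
    """Plain pointwise mutual information, in bits, from co-occurrence counts.

    PMI = log2( P(x, y) / (P(x) * P(y)) ), with probabilities estimated as
    counts divided by the number of encounters. The ratio is rearranged to
    count_xy * n / (count_x * count_y) to avoid rounding in intermediate steps.
    """
    return math.log2(count_xy * n_encounters / (count_x * count_y))


# Hypothetical counts: a medication seen in 20 encounters, a chief complaint
# seen in 50, and both together in 10, out of 1,000 encounters.
score = pmi(count_xy=10, count_x=20, count_y=50, n_encounters=1000)
```

Plain PMI gives the same score to a pair seen together once as to a pair seen together a thousand times if the ratios match, which is the weakness the abstract's adjustment addresses by rewarding higher absolute counts.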


Subject(s)
Electronic Health Records , Emergency Service, Hospital , Humans , Information Storage and Retrieval
13.
Suicide Life Threat Behav ; 50(5): 939-947, 2020 10.
Article in English | MEDLINE | ID: mdl-32484597

ABSTRACT

OBJECTIVE: With early identification and intervention, many suicidal deaths are preventable. Tools that include machine learning methods have been able to identify suicidal language. This paper examines the persistence of this suicidal language up to 30 days after discharge from care. METHOD: In a multi-center study, 253 subjects were enrolled into either suicidal or control cohorts. Their responses to standardized instruments and interviews were analyzed using machine learning algorithms. Subjects were re-interviewed approximately 30 days later, and their language was compared to the original language to determine the presence of suicidal ideation. RESULTS: The results show that language characteristics used to classify suicidality at the initial encounter are still present in the speech 30 days later (AUC = 89% (95% CI: 85-95%), p < .0001) and that algorithms trained on the second interviews could also identify the subjects that produced the first interviews (AUC = 85% (95% CI: 81-90%), p < .0001). CONCLUSIONS: This approach explores the stability of suicidal language. When using advanced computational methods, the results show that a patient's language is similar 30 days after first captured, while responses to standard measures change. This can be useful when developing methods that identify the data-based phenotype of a subject.
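
The AUC figures in this abstract can be read as the probability that a randomly chosen subject from the suicidal cohort receives a higher classifier score than a randomly chosen control. A minimal sketch of that pairwise definition follows; the function name and the scores are hypothetical and do not come from the study.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (ties count as half a win)."""
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores for the two cohorts.
suicidal_cohort = [0.9, 0.8, 0.4]
control_cohort = [0.7, 0.3]
example_auc = auc(suicidal_cohort, control_cohort)  # 5 of 6 pairs are ordered correctly
```

The quadratic loop is fine for a few hundred subjects; larger datasets would normally use a rank-based formula instead.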


Subject(s)
Language , Suicidal Ideation , Algorithms , Humans , Machine Learning , Risk Assessment
14.
F1000Res ; 9: 136, 2020.
Article in English | MEDLINE | ID: mdl-32308977

ABSTRACT

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.


Subject(s)
Biological Science Disciplines , Computational Biology , Semantic Web , Data Mining , Metadata , Reproducibility of Results
15.
Brief Bioinform ; 8(5): 358-75, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17977867

ABSTRACT

It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or 'BioNLP' in general, focusing primarily on papers published within the past year.


Subject(s)
Abstracting and Indexing/trends , Artificial Intelligence , Biology/trends , Databases, Bibliographic/trends , Natural Language Processing , Periodicals as Topic , Forecasting , Vocabulary, Controlled
16.
Ann Clin Transl Neurol ; 6(7): 1352-1357, 2019 07.
Article in English | MEDLINE | ID: mdl-31353851

ABSTRACT

Communication accommodation describes how individuals adjust their communicative style to that of their conversational partner. We predicted that interpersonal prosodic correlation related to pitch and timing would be decreased in behavioral variant frontotemporal dementia (bvFTD). We predicted that the interpersonal correlation in a timing measure and a pitch measure would be increased in right temporal FTD (rtFTD) due to sparing of the neural substrate for speech timing and pitch modulation but loss of social semantics. We found no significant effects in bvFTD, but conversations including rtFTD demonstrated higher interpersonal correlations in speech rate than healthy controls.


Subject(s)
Communication , Frontotemporal Dementia/psychology , Speech , Aged , Female , Frontotemporal Dementia/pathology , Humans , Male , Middle Aged
17.
Math Biosci Eng ; 16(3): 1376-1391, 2019 02 20.
Article in English | MEDLINE | ID: mdl-30947425

ABSTRACT

For discovering new uses of drugs, the function type of their target genes plays an important role, and the "Antagonist-GOF" and "Agonist-LOF" hypothesis has laid a solid foundation for drug repurposing. In this research, an active gene annotation corpus was used as training data to predict whether each human gene shows a gain-of-function, loss-of-function, or unknown character after variation events. Unlike the (entity, predicate, entity) triples of a traditional three-way tensor, a four-way and a five-way tensor, the GMFD- and GMAFD-tensors, were designed to represent higher-order links among all or some of these entities: genes (G), mutations (M), functions (F), diseases (D), and annotation labels (A). A tensor decomposition algorithm, CP decomposition, was applied to the higher-order tensors to unveil the correlations among entities. A state-of-the-art baseline tensor decomposition algorithm, RESCAL, was applied to the three-way tensor for comparison. The results showed that CP decomposition on the higher-order tensors performed better than RESCAL on the traditional three-way tensor in recovering masked data and making predictions. In addition, the four-way tensor proved to be the best format for this problem. Finally, a case study reproducing two disease-gene-drug links (Myelodysplastic Syndromes-IL2RA-Aldesleukin and Lymphoma-IL2RA-Aldesleukin) demonstrated the feasibility of our prediction model for drug repurposing.
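
To make the CP (CANDECOMP/PARAFAC) model concrete, the sketch below scores one cell of a four-way tensor from its factor matrices: each mode has one factor matrix, and an entry is the sum over rank components of the product of the matching rows. It shows only this reconstruction step, not the fitting procedure or the authors' code; the function name, factor values, and mode sizes are hypothetical.

```python
from math import prod


def cp_score(factors, index):
    """Reconstruct one tensor entry from CP factor matrices.

    factors: one matrix per mode, each a list of rows of length `rank`.
    index:   one position per mode, e.g. (gene, mutation, function, disease).
    """
    rank = len(factors[0][0])
    return sum(
        prod(factors[mode][position][r] for mode, position in enumerate(index))
        for r in range(rank)
    )


# Hypothetical rank-2 factors for a tiny gene x mutation x function x disease tensor.
gene_factors = [[1.0, 0.0], [0.5, 2.0]]
mutation_factors = [[1.0, 1.0]]
function_factors = [[2.0, 0.0], [0.0, 3.0]]  # e.g. row 0 = GOF, row 1 = LOF
disease_factors = [[1.0, 1.0]]
factors = [gene_factors, mutation_factors, function_factors, disease_factors]

score_masked_cell = cp_score(factors, (1, 0, 1, 0))
score_known_cell = cp_score(factors, (0, 0, 0, 0))
```

Once the factors are fitted to the observed cells, the same formula gives a score for any masked or unobserved cell, which is how a decomposition can be used for prediction.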


Subject(s)
Drug Repositioning/economics , Drug Repositioning/methods , Genetic Variation , Machine Learning , Mutation , Algorithms , Cost-Benefit Analysis , Genetic Diseases, Inborn/genetics , Humans , Interleukin-2/analogs & derivatives , Interleukin-2/therapeutic use , Interleukin-2 Receptor alpha Subunit/genetics , Lymphoma/genetics , Models, Genetic , Molecular Sequence Annotation , Myelodysplastic Syndromes/genetics , Recombinant Proteins/therapeutic use , Software
18.
Stud Health Technol Inform ; 245: 298-302, 2017.
Article in English | MEDLINE | ID: mdl-29295103

ABSTRACT

Human-annotated data is a fundamental part of natural language processing system development and evaluation. The quality of that data is typically assessed by calculating the agreement between the annotators. It is widely assumed that this agreement between annotators is the upper limit on system performance in natural language processing: if humans can't agree with each other about the classification more than some percentage of the time, we don't expect a computer to do any better. We trace the logical positivist roots of the motivation for measuring inter-annotator agreement, demonstrate the prevalence of the widely-held assumption about the relationship between inter-annotator agreement and system performance, and present data that suggest that inter-annotator agreement is not, in fact, an upper bound on language processing system performance.
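
Agreement between two annotators is often summarized with a chance-corrected statistic. The sketch below computes Cohen's kappa as one common choice; the abstract does not say which agreement measure the authors used, so treat the metric, the function name, and the labels as assumptions for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled at random with their own label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four items.
annotator_a = ["yes", "yes", "no", "no"]
annotator_b = ["yes", "no", "no", "no"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

Here raw agreement is 75% but kappa is 0.5, because half of that agreement would be expected by chance given how often each annotator uses each label.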


Subject(s)
Data Curation , Natural Language Processing , Humans , Observer Variation
19.
Suicide Life Threat Behav ; 47(1): 112-121, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27813129

ABSTRACT

Death by suicide demonstrates profound personal suffering and societal failure. While basic sciences provide the opportunity to understand biological markers related to suicide, computer science provides opportunities to understand suicide thought markers. In this novel prospective, multimodal, multicenter, mixed demographic study, we used machine learning to measure and fuse two classes of suicidal thought markers: verbal and nonverbal. Machine learning algorithms were used with the subjects' words and vocal characteristics to classify 379 subjects recruited from two academic medical centers and a rural community hospital into one of three groups: suicidal, mentally ill but not suicidal, or controls. By combining linguistic and acoustic characteristics, subjects could be classified into one of the three groups with up to 85% accuracy. The results provide insight into how advanced technology can be used for suicide assessment and prevention.


Subject(s)
Machine Learning , Suicidal Ideation , Suicide Prevention , Suicide , Adolescent , Adult , Artificial Intelligence , Diagnosis, Computer-Assisted/methods , Female , Humans , Male , Prognosis , Prospective Studies , Suicide/psychology