Results 1 - 20 of 44
1.
Article in English | MEDLINE | ID: mdl-38584531

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) remains a significant contributor to mortality, often exacerbated by metastasis and chemoresistance. Novel therapeutic strategies are imperative to enhance current treatments. The dysregulation of the PI3K/Akt signaling pathway is implicated in CRC progression. This study investigates the therapeutic potential of Wortmannin, combined with 5-fluorouracil (5-FU), to target the PI3K/Akt pathway in CRC. METHODS: Anti-migratory and antiproliferative effects were assessed through wound healing and MTT assays. Apoptosis and cell cycle alterations were evaluated using the Annexin V/Propidium Iodide Apoptosis Assay. Wortmannin's impact on the oxidant/antioxidant equilibrium was examined via ROS, SOD, CAT, MDA, and T-SH levels. Downstream target genes of the PI3K/Akt pathway were analyzed at mRNA and protein levels using RT-PCR and western blot, respectively. RESULTS: Wortmannin demonstrated a significant inhibitory effect on cell proliferation, modulating survivin, cyclinD1, PI3K, and p-Akt. The PI3K inhibitor attenuated migratory activity, inducing E-cadherin expression. Wortmannin combined with 5-FU induced apoptosis, increasing cells in sub-G1 via elevated ROS levels. CONCLUSION: This study underscores Wortmannin's potential in inhibiting CRC cell growth and migration through PI3K/Akt pathway modulation. It also highlights its candidacy for further investigation as a promising therapeutic option in colorectal cancer treatment.

2.
Iran J Basic Med Sci ; 24(12): 1743-1752, 2021 Dec.
Article in English | MEDLINE | ID: mdl-35432810

ABSTRACT

Objectives: Dental pulp stem cells (DPSCs) can differentiate into functional neurons and have the potential for cell therapy in neurological diseases. Granulocyte colony-stimulating factor (G-CSF) is a glycoprotein that has shown neuroprotective effects in models of nerve damage. We evaluated the protective effects of G-CSF, conditioned media from DPSCs (DPSCs-CM) and conditioned media from DPSCs transfected with a plasmid encoding G-CSF (DPSC-CMT) on SH-SY5Y cells exposed to CoCl2 as a model of hypoxia-induced neural damage. Materials and Methods: SH-SY5Y cells exposed to CoCl2 were treated with DPSCs-CM, G-CSF, a simultaneous combination of DPSCs-CM and G-CSF, and finally DPSC-CMT. Cell viability and apoptosis were determined by resazurin (or alternatively the lactate dehydrogenase (LDH) assay) and propidium iodide (PI) staining. Western blot analysis was performed to detect changes in apoptotic protein levels. Interleukin-6 and interleukin-10 (IL-6/IL-10) levels were measured with an Enzyme-Linked Immunosorbent Assay (ELISA). Results: DPSCs-CM and G-CSF were able to significantly protect SH-SY5Y cells against neural cell damage caused by CoCl2 according to resazurin and LDH analysis. Also, the percentage of apoptotic cells decreased when SH-SY5Y cells were treated with DPSCs-CM and G-CSF simultaneously. After transfection of DPSCs with the G-CSF plasmid, DPSC-CMT could significantly improve the protection. According to Western blot, the amounts of β-catenin, cleaved PARP and caspase-3 were significantly decreased and the expression of survivin was considerably increased when hypoxic SH-SY5Y cells were treated with DPSCs-CM plus G-CSF. The decreased IL-6/IL-10 level in CoCl2-exposed cells after treatment with DPSCs-CM indicated the suppression of inflammatory mediators. Conclusion: Combination therapy of G-CSF and DPSCs-CM improved the protective activity.

3.
Exp Dermatol ; 30(2): 284-287, 2021 02.
Article in English | MEDLINE | ID: mdl-33217035

ABSTRACT

Previous studies have found an association between the HLA-B*1502 allele and the lamotrigine-induced Stevens-Johnson syndrome (SJS)/toxic epidermal necrolysis (TEN) spectrum in Han Chinese populations. This study aims to investigate the association between HLA-B*1502 and lamotrigine- or phenytoin-induced SJS/TEN in an Iranian population. The medical records of twenty-eight lamotrigine-induced SJS/TEN patients and twenty-five lamotrigine-tolerant controls as well as eight phenytoin-induced SJS/TEN patients and twelve phenytoin-tolerant controls were extracted between March 2013 and March 2019 from the university hospitals in Mashhad, Iran. The presence of the HLA-B*1502 allele was determined using real-time polymerase chain reaction (PCR). Among patients with lamotrigine-induced SJS/TEN, 11 (39.3%) tested positive for HLA-B*1502 while only 3 (12.0%) of the lamotrigine-tolerant controls tested positive for this allele. The risk of lamotrigine-induced SJS/TEN was significantly higher in patients with HLA-B*1502, with an odds ratio (OR) of 4.74 [95% confidence interval (CI) 1.14-19.73, p = 0.032]. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of HLA-B*1502 for lamotrigine-induced SJS/TEN were 39.29%, 88.00%, 78.57% and 56.41%, respectively. The HLA-B*1502 allele was present in 2 (25.0%) of the phenytoin-induced SJS/TEN cases and in 5 (41.7%) of the phenytoin-tolerant controls. The risk of phenytoin-induced SJS/TEN was not higher in patients with HLA-B*1502 (OR = 0.467 [95% confidence interval (CI) 0.065-3.34, p = 0.642]). Lamotrigine-induced SJS/TEN is associated with the HLA-B*1502 allele in an Iranian population, but this is not the case for phenytoin-induced SJS/TEN.


Subjects
Anticonvulsants/adverse effects , HLA-B15 Antigen/genetics , Lamotrigine/adverse effects , Phenytoin/adverse effects , Stevens-Johnson Syndrome/genetics , Adult , Alleles , Case-Control Studies , Female , Humans , Iran , Male , Middle Aged , Predictive Value of Tests , Risk Factors , Stevens-Johnson Syndrome/etiology , Young Adult
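The statistics quoted in the abstract above can be re-derived from the reported counts (11 of 28 cases and 3 of 25 controls carrying the allele for lamotrigine). The following is a minimal sketch, not the authors' analysis code: the function name `two_by_two_metrics` is hypothetical, and the confidence interval uses the Woolf (log odds ratio) approximation, which is an assumption because the abstract does not state which interval method was used.

```python
import math


def two_by_two_metrics(case_pos, case_total, ctrl_pos, ctrl_total, z=1.96):
    """Odds ratio, approximate 95% CI and diagnostic metrics from a 2x2 table.

    Hypothetical helper for illustration; the CI is the Woolf (log) interval,
    which is an assumption, not a method named in the abstract.
    """
    a = case_pos                 # cases carrying the allele
    b = ctrl_pos                 # controls carrying the allele
    c = case_total - case_pos    # cases without the allele
    d = ctrl_total - ctrl_pos    # controls without the allele

    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)

    return {
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(log_or - z * se_log_or),
        "ci_high": math.exp(log_or + z * se_log_or),
        "sensitivity": a / (a + c),   # allele-positive share of cases
        "specificity": d / (b + d),   # allele-negative share of controls
        "ppv": a / (a + b),           # cases among allele-positive subjects
        "npv": d / (c + d),           # controls among allele-negative subjects
    }


# Counts taken from the abstract.
lamotrigine = two_by_two_metrics(11, 28, 3, 25)
phenytoin = two_by_two_metrics(2, 8, 5, 12)
```

With the lamotrigine counts the odds ratio is 242/51, roughly 4.75 (the abstract reports 4.74), and sensitivity, specificity, PPV and NPV come out at 11/28, 22/25, 11/14 and 22/39, matching the reported percentages. The phenytoin counts give 14/30, roughly 0.467.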
4.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31603193

ABSTRACT

Knowledge of the molecular interactions of biological and chemical entities and their involvement in biological processes or clinical phenotypes is important for data interpretation. Unfortunately, this knowledge is mostly embedded in the literature in such a way that it is unavailable for automated data analysis procedures. Biological expression language (BEL) is a syntax representation allowing for the structured representation of a broad range of biological relationships. It is used in various situations to extract such knowledge and transform it into BEL networks. To support the tedious and time-intensive extraction work of curators with automated methods, we developed the BEL track within the framework of BioCreative Challenges. Within the BEL track, we provide training data and an evaluation environment to encourage the text mining community to tackle the automatic extraction of complex BEL relationships. In 2017 BioCreative VI, the 2015 BEL track was repeated with new test data. Although only minor improvements in text snippet retrieval for given statements were achieved during this second BEL task iteration, a significant increase of BEL statement extraction performance from provided sentences could be seen. The best performing system reached a 32% F-score for the extraction of complete BEL statements and with the given named entities this increased to 49%. This time, besides rule-based systems, new methods involving hierarchical sequence labeling and neural networks were applied for BEL statement extraction.


Subjects
Data Mining , Databases, Factual , Neural Networks, Computer , Vocabulary, Controlled
5.
J Biol Res (Thessalon) ; 26: 8, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31548928

ABSTRACT

BACKGROUND: Skeletal development and its cellular function are regulated by various transcription factors. The T-box (Tbx) family of transcription factors has critical roles in cellular differentiation as well as heart and limb organogenesis. These factors possess activator and/or repressor domains to modify the expression of target genes. Despite the obvious effects of Tbx20 on heart development, its impact on bone development is still unknown. METHODS: To investigate the consequences of forced Tbx20 expression on the osteogenic differentiation of human mesenchymal stem cells derived from adipose tissue (Ad-MSCs), these cells were transduced with a bicistronic lentiviral vector encoding Tbx20 and an enhanced green fluorescent protein. RESULTS: The Tbx20 gene delivery system suppressed the osteogenic differentiation of Ad-MSCs, as indicated by a reduction in alkaline phosphatase activity and Alizarin Red S staining. Consistently, reverse transcription-polymerase chain reaction analyses showed that Tbx20 gain-of-function reduced the expression levels of osteoblast marker genes in osteo-inductive Ad-MSC cultures. Accordingly, Tbx20 negatively affected osteogenesis by modulating the expression of key factors involved in this process. CONCLUSION: The present study suggests that Tbx20 could inhibit osteogenic differentiation in adipose-derived human mesenchymal stem cells.

6.
BMC Med Genet ; 20(1): 117, 2019 07 01.
Article in English | MEDLINE | ID: mdl-31262253

ABSTRACT

BACKGROUND: Mesenchymal stem cells (MSCs) are attractive choices in regenerative medicine and can be genetically modified to obtain better results in therapeutics. Bone development and metabolism are controlled by various factors, including interference by microRNAs (miRs), which are small non-coding endogenous RNAs. METHODS: In the current study, the effect of forced miR-148b expression on osteogenic activity was evaluated. Human bone marrow-derived mesenchymal stem cells (BM-MSCs) were transduced with a bicistronic lentiviral vector encoding hsa-miR-148b-3p or -5p and the enhanced green fluorescent protein. Fourteen days post-transduction, immunostaining as well as Western blotting were used to analyze osteogenesis. RESULTS: Overexpression of miR-148b-3p increased the osteogenic differentiation of human BM-MSCs, as demonstrated by an enhancement of mineralized nodular formation and an increase in the levels of the osteoblastic differentiation biomarkers alkaline phosphatase and collagen type I. CONCLUSIONS: Since lentivirally overexpressed miR-148b-3p increased the osteogenic differentiation capability of BM-MSCs, this miR could be applied as a therapeutic modulator to optimize bone function.


Subjects
Bone Marrow/metabolism , Mesenchymal Stem Cells/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , Osteogenesis/genetics , Alkaline Phosphatase , Base Sequence , Biomarkers , Bone Marrow/growth & development , Bone Marrow/pathology , Cell Differentiation , Collagen Type I , Genetic Vectors , HEK293 Cells , Humans , Lentivirus/genetics , Mesenchymal Stem Cells/cytology , Transduction, Genetic
7.
BMC Med Inform Decis Mak ; 19(Suppl 3): 69, 2019 04 04.
Article in English | MEDLINE | ID: mdl-30943957

ABSTRACT

BACKGROUND: The Health Information Technology for Economic and Clinical Health Act (HITECH) has greatly accelerated the adoption of electronic health records (EHRs) with the promise of better clinical decisions and patients' outcomes. One of the core criteria for "Meaningful Use" of EHRs is to have a problem list that shows the most important health problems faced by a patient. The implementation of problem lists in EHRs has a potential to help practitioners to provide customized care to patients. However, it remains an open question on how to leverage problem lists in different practice settings to provide tailored care, of which the bottleneck lies in the associations between problem list and practice setting. METHODS: In this study, using sampled clinical documents associated with a cohort of patients who received their primary care at Mayo Clinic, we investigated the associations between problem list and practice setting through natural language processing (NLP) and topic modeling techniques. Specifically, after practice settings and problem lists were normalized, statistical χ2 test, term frequency-inverse document frequency (TF-IDF) and enrichment analysis were used to choose representative concepts for each setting. Then Latent Dirichlet Allocations (LDA) were used to train topic models and predict potential practice settings using similarity metrics based on the problem concepts representative of practice settings. Evaluation was conducted through 5-fold cross validation and Recall@k, Precision@k and F1@k were calculated. RESULTS: Our method can generate prioritized and meaningful problem lists corresponding to specific practice settings. For practice setting prediction, recall increases from 0.719 (k = 2) to 0.931 (k = 10), precision increases from 0.882 (k = 2) to 0.931 (k = 10) and F1 increases from 0.790 (k = 2) to 0.931 (k = 10). 
CONCLUSION: To the best of our knowledge, our study is the first to attempt to discover the association between problem lists and hospital practice settings. In the future, we plan to investigate how to provide more tailored care by utilizing the association between problem list and practice setting revealed in this study.


Subjects
Meaningful Use , Medical Informatics , Algorithms , Electronic Health Records , Hospitals , Humans , Natural Language Processing , Primary Health Care
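The abstract above evaluates ranked predictions with Recall@k, Precision@k and F1@k. A minimal sketch of these metrics follows, using the common information-retrieval definitions. The abstract does not spell out the exact definitions the authors used, so these formulas, the function name `precision_recall_f1_at_k` and the toy practice-setting labels are all assumptions for illustration.

```python
def precision_recall_f1_at_k(ranked, relevant, k):
    """Precision@k, Recall@k and F1@k for one ranked prediction list.

    ranked   -- predicted items, best first
    relevant -- set of items considered correct
    k        -- cutoff rank (k >= 1)
    Uses the usual IR definitions; the paper's own definitions may differ.
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ranked practice settings for one patient's problem list.
ranked_settings = ["cardiology", "neurology", "oncology", "dermatology"]
true_settings = {"neurology", "dermatology"}

at_2 = precision_recall_f1_at_k(ranked_settings, true_settings, 2)
at_4 = precision_recall_f1_at_k(ranked_settings, true_settings, 4)
```

In a cross-validated evaluation such as the 5-fold setup mentioned in the abstract, these per-query values would typically be averaged within each fold and then across folds.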
8.
BMC Med Inform Decis Mak ; 19(1): 32, 2019 02 14.
Article in English | MEDLINE | ID: mdl-30764825

ABSTRACT

BACKGROUND: Existing resources to assist the diagnosis of rare diseases are usually curated from the literature that can be limited for clinical use. It often takes substantial effort before the suspicion of a rare disease is even raised to utilize those resources. The primary goal of this study was to apply a data-driven approach to enrich existing rare disease resources by mining phenotype-disease associations from electronic medical record (EMR). METHODS: We first applied association rule mining algorithms on EMR to extract significant phenotype-disease associations and enriched existing rare disease resources (Human Phenotype Ontology and Orphanet (HPO-Orphanet)). We generated phenotype-disease bipartite graphs for HPO-Orphanet, EMR, and enriched knowledge base HPO-Orphanet + and conducted a case study on Hodgkin lymphoma to compare performance on differential diagnosis among these three graphs. RESULTS: We used disease-disease similarity generated by the eRAM, an existing rare disease encyclopedia, as a gold standard to compare the three graphs with sensitivity and specificity as (0.17, 0.36, 0.46) and (0.52, 0.47, 0.51) for three graphs respectively. We also compared the top 15 diseases generated by the HPO-Orphanet + graph with eRAM and another clinical diagnostic tool, the Phenomizer. CONCLUSIONS: Per our evaluation results, our approach was able to enrich existing rare disease knowledge resources with phenotype-disease associations from EMR and thus support rare disease differential diagnosis.


Subjects
Algorithms , Data Mining , Electronic Health Records , Knowledge Bases , Rare Diseases , Humans , Phenotype , Rare Diseases/diagnosis
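The abstract above mines phenotype-disease associations with association rule mining but does not name the algorithm or thresholds. As a hedged illustration of the idea only, the sketch below computes the three standard rule measures (support, confidence, lift) for a single rule over toy patient records. The function name `rule_metrics` and all records are invented for illustration and are not data from the study.

```python
def rule_metrics(records, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    records    -- list of sets, one set of observed items per patient
    antecedent -- set of items on the left side of the rule (e.g. phenotypes)
    consequent -- set of items on the right side (e.g. a disease)
    """
    n = len(records)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_consequent = sum(1 for r in records if consequent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)

    support = n_both / n
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    # Lift > 1 means the antecedent makes the consequent more likely
    # than its baseline frequency in the records.
    lift = confidence / (n_consequent / n) if n_consequent else 0.0
    return support, confidence, lift


# Toy records (hypothetical, not from the paper).
records = [
    {"lymphadenopathy", "night sweats", "Hodgkin lymphoma"},
    {"lymphadenopathy", "Hodgkin lymphoma"},
    {"lymphadenopathy", "fever"},
    {"cough", "fever"},
    {"night sweats", "cough"},
]

support, confidence, lift = rule_metrics(
    records, {"lymphadenopathy"}, {"Hodgkin lymphoma"}
)
```

Rules that pass chosen thresholds on these measures could then be added as edges of a phenotype-disease bipartite graph, which is the kind of structure the abstract describes.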
9.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30295724

ABSTRACT

Relation extraction is an important task in the field of natural language processing. In this paper, we describe our approach for the BioCreative VI Task 5: text mining chemical-protein interactions. We investigate multiple deep neural network (DNN) models, including convolutional neural networks, recurrent neural networks (RNNs) and attention-based (ATT-) RNNs (ATT-RNNs) to extract chemical-protein relations. Our experimental results indicate that ATT-RNN models outperform the same models without using attention and the ATT-gated recurrent unit (ATT-GRU) achieves the best performing micro average F1 score of 0.527 on the test set among the tested DNNs. In addition, the result of word-level attention weights also shows that attention mechanism is effective on selecting the most important trigger words when trained with semantic relation labels without the need of semantic parsing and feature engineering. The source code of this work is available at https://github.com/ohnlp/att-chemprot.


Subjects
Algorithms , Databases, Chemical , Databases, Protein , Neural Networks, Computer , Proteins/chemistry
10.
J Biomed Inform ; 87: 12-20, 2018 11.
Article in English | MEDLINE | ID: mdl-30217670

ABSTRACT

BACKGROUND: Word embeddings have been prevalently used in biomedical Natural Language Processing (NLP) applications because vector representations can capture useful semantic properties and linguistic relationships between words. Different textual resources (e.g., Wikipedia and biomedical literature corpus) have been utilized in biomedical NLP to train word embeddings and these word embeddings have been commonly leveraged as feature input to downstream machine learning models. However, there has been little work on evaluating the word embeddings trained from different textual resources. METHODS: In this study, we empirically evaluated word embeddings trained from four different corpora, namely clinical notes, biomedical publications, Wikipedia, and news. For the former two resources, we trained word embeddings using unstructured electronic health record (EHR) data available at Mayo Clinic and articles (MedLit) from PubMed Central, respectively. For the latter two resources, we used publicly available pre-trained word embeddings, GloVe and Google News. The evaluation was done qualitatively and quantitatively. For the qualitative evaluation, we randomly selected medical terms from three categories (i.e., disorder, symptom, and drug), and manually inspected the five most similar words computed by embeddings for each term. We also analyzed the word embeddings through a 2-dimensional visualization plot of 377 medical terms. For the quantitative evaluation, we conducted both intrinsic and extrinsic evaluation. For the intrinsic evaluation, we evaluated the word embeddings' ability to capture medical semantics by measuring the semantic similarity between medical terms using four published datasets: Pedersen's dataset, Hliaoutakis's dataset, MayoSRS, and UMNSRS. For the extrinsic evaluation, we applied word embeddings to multiple downstream biomedical NLP applications, including clinical information extraction (IE), biomedical information retrieval (IR), and relation extraction (RE), with data from shared tasks. RESULTS: The qualitative evaluation shows that the word embeddings trained from EHR and MedLit can find more similar medical terms than those trained from GloVe and Google News. The intrinsic quantitative evaluation verifies that the semantic similarity captured by the word embeddings trained from EHR is closer to human experts' judgments on all four tested datasets. The extrinsic quantitative evaluation shows that the word embeddings trained on EHR achieved the best F1 score of 0.900 for the clinical IE task; no word embeddings improved the performance for the biomedical IR task; and the word embeddings trained on Google News had the best overall F1 score of 0.790 for the RE task. CONCLUSION: Based on the evaluation results, we can draw the following conclusions. First, the word embeddings trained from EHR and MedLit can capture the semantics of medical terms better, and find semantically relevant medical terms closer to human experts' judgments than those trained from GloVe and Google News. Second, there does not exist a consistent global ranking of word embeddings for all downstream biomedical NLP applications. However, adding word embeddings as extra features will improve results on most downstream tasks. Finally, the word embeddings trained from the biomedical domain corpora do not necessarily have better performance than those trained from the general domain corpora for any downstream biomedical NLP task.


Subjects
Electronic Health Records , Machine Learning , Medical Informatics/methods , Natural Language Processing , Unified Medical Language System , Adolescent , Adult , Aged , Family Health , Female , Humans , Information Storage and Retrieval , Linguistics , Male , Middle Aged , Minnesota , Models, Statistical , Probability , PubMed , Semantics , Young Adult
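The intrinsic evaluation described in the abstract above compares embedding-based similarity between medical term pairs with human judgments. A common way to do this, sketched below, is cosine similarity between term vectors followed by a Spearman rank correlation against the human scores. The abstract does not state which similarity or correlation measure was used, so both are assumptions here, and the 3-dimensional vectors and human scores are invented toy values.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


# Toy embeddings and human similarity scores (hypothetical values).
embeddings = {
    "diabetes": [0.9, 0.1, 0.0],
    "insulin": [0.8, 0.2, 0.1],
    "fracture": [0.0, 0.9, 0.3],
    "cast": [0.2, 0.7, 0.5],
    "aspirin": [0.4, 0.3, 0.8],
}
judged_pairs = [
    ("diabetes", "insulin", 3.8),
    ("fracture", "cast", 3.5),
    ("diabetes", "fracture", 1.0),
    ("insulin", "aspirin", 2.0),
]

model_scores = [cosine(embeddings[a], embeddings[b]) for a, b, _ in judged_pairs]
human_scores = [score for _, _, score in judged_pairs]
rho = spearman(model_scores, human_scores)
```

A higher correlation means the embedding orders term pairs more like the human raters do, which is the sense in which the abstract calls one embedding "closer to human experts' judgments" than another.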
11.
Front Pharmacol ; 9: 875, 2018.
Article in English | MEDLINE | ID: mdl-30131701

ABSTRACT

Multiple data sources are preferred in adverse drug event (ADE) surveillance owing to the inadequacies of any single source. However, analytic methods to monitor potential ADEs after prolonged drug exposure are still lacking. In this study we propose a method aiming to screen potential ADEs by combining the FDA Adverse Event Reporting System (FAERS) and Electronic Medical Records (EMR). The proposed method uses natural language processing (NLP) techniques to extract treatment outcome information captured in unstructured text and adopts a case-crossover design in EMR. Performance was evaluated using two ADE knowledge bases: Adverse Drug Reaction Classification System (ADReCS) and SIDER. We tested our method in ADE signal detection of conventional disease-modifying antirheumatic drugs (DMARDs) in rheumatoid arthritis patients. Findings showed that recall greatly increased when combining FAERS with EMR compared with FAERS alone and EMR alone, especially for the flexible mapping strategy. Precision (FAERS + EMR) in detecting ADEs improved using ADReCS as the gold standard compared with SIDER. In addition, signals detected from EMR considerably overlapped with signals detected from FAERS or ADE knowledge bases, implying the importance of EMR for pharmacovigilance. ADE signals detected from EMR and/or FAERS but not in existing knowledge bases provide hypotheses for future study.

12.
JMIR Cancer ; 4(1): e10, 2018 May 15.
Article in English | MEDLINE | ID: mdl-29764801

ABSTRACT

BACKGROUND: Patient education materials given to breast cancer survivors may not be a good fit for their information needs. Needs may change over time, be forgotten, or be misreported, for a variety of reasons. An automated content analysis of survivors' postings to online health forums can identify expressed information needs over a span of time and be repeated regularly at low cost. Identifying these unmet needs can guide improvements to existing education materials and the creation of new resources. OBJECTIVE: The primary goals of this project are to assess the unmet information needs of breast cancer survivors from their own perspectives and to identify gaps between information needs and current education materials. METHODS: This approach employs computational methods for content modeling and supervised text classification to data from online health forums to identify explicit and implicit requests for health-related information. Potential gaps between needs and education materials are identified using techniques from information retrieval. RESULTS: We provide a new taxonomy for the classification of sentences in online health forum data. 260 postings from two online health forums were selected, yielding 4179 sentences for coding. After annotation of data and training alternative one-versus-others classifiers, a random forest-based approach achieved F1 scores from 66% (Other, dataset2) to 90% (Medical, dataset1) on the primary information types. 136 expressions of need were used to generate queries to indexed education materials. Upon examination of the best two pages retrieved for each query, 12% (17/136) of queries were found to have relevant content by all coders, and 33% (45/136) were judged to have relevant content by at least one. CONCLUSIONS: Text from online health forums can be analyzed effectively using automated methods. 
Our analysis confirms that breast cancer survivors have many information needs that are not covered by the written documents they typically receive, as our results suggest that at most a third of breast cancer survivors' questions would be addressed by the materials currently provided to them.

13.
BMC Med Genet ; 19(1): 81, 2018 05 18.
Article in English | MEDLINE | ID: mdl-29776397

ABSTRACT

BACKGROUND: Genetic heterogeneity and consanguineous marriages make recessive inherited hearing loss in Iran the second most common genetic disorder. Only two reported pathogenic variants (c.323G>C, p.Arg108Pro and c.419A>G, p.Tyr140Cys) in the S1PR2 gene have previously been linked to autosomal recessive hearing loss (DFNB68) in two Pakistani families. We describe a segregating novel homozygous c.323G>A, p.Arg108Gln pathogenic variant in S1PR2 that was identified in four affected individuals from a consanguineous five generation Iranian family. METHODS: Whole exome sequencing and bioinformatics analysis of 116 hearing loss-associated genes was performed in an affected individual from a five generation Iranian family. Segregation analysis and 3D protein modeling of the p.Arg108 exchange was performed. RESULTS: The two Pakistani families previously identified with S1PR2 pathogenic variants presented profound hearing loss that is also observed in the affected Iranian individuals described in the current study. Interestingly, we confirmed mixed hearing loss in one affected individual. 3D protein modeling suggests that the p.Arg108 position plays a key role in ligand receptor interaction, which is disturbed by the p.Arg108Gln change. CONCLUSION: In summary, we report the third overall mutation in S1PR2 and the first report outside the Pakistani population. Furthermore, we describe a novel variant that causes an amino acid exchange (p.Arg108Gln) in the same amino acid residue as one of the previously reported Pakistani families (p.Arg108Pro). This finding emphasizes the importance of the p.Arg108 amino acid in normal hearing and confirms and consolidates the role of S1PR2 in autosomal recessive hearing loss.


Subjects
Amino Acid Substitution , Arginine/genetics , Hearing Loss/genetics , Receptors, Lysosphingolipid/genetics , Adolescent , Consanguinity , Female , Humans , Iran , Male , Models, Molecular , Pedigree , Protein Binding , Receptors, Lysosphingolipid/chemistry , Receptors, Lysosphingolipid/metabolism , Sphingosine-1-Phosphate Receptors , Exome Sequencing/methods
15.
PLoS One ; 13(1): e0191568, 2018.
Article in English | MEDLINE | ID: mdl-29373609

ABSTRACT

Recent scientific advances have accumulated a tremendous amount of biomedical knowledge providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to the large data volume and complicated, noisy associations, interpreting such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noise in large-volume data, respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.


Subjects
Genetic Predisposition to Disease , PubMed , Data Mining , Humans
16.
J Biomed Inform ; 77: 34-49, 2018 01.
Article in English | MEDLINE | ID: mdl-29162496

ABSTRACT

BACKGROUND: With the rapid adoption of electronic health records (EHRs), it is desirable to harvest information and knowledge from EHRs to support automated systems at the point of care and to enable secondary use of EHRs for clinical and translational research. One critical component used to facilitate the secondary use of EHR data is the information extraction (IE) task, which automatically extracts and encodes clinical information from text. OBJECTIVES: In this literature review, we present a review of recent published research on clinical information extraction (IE) applications. METHODS: A literature search was conducted for articles published from January 2009 to September 2016 based on Ovid MEDLINE In-Process & Other Non-Indexed Citations, Ovid MEDLINE, Ovid EMBASE, Scopus, Web of Science, and ACM Digital Library. RESULTS: A total of 1917 publications were identified for title and abstract screening. Of these publications, 263 articles were selected and discussed in this review in terms of publication venues and data sources, clinical IE tools, methods, and applications in the areas of disease- and drug-related studies, and clinical workflow optimizations. CONCLUSIONS: Clinical IE has been used for a wide range of applications, however, there is a considerable gap between clinical studies using EHR data and studies using clinical IE. This study enabled us to gain a more concrete understanding of the gap and to provide potential solutions to bridge this gap.


Subjects
Electronic Health Records , Information Storage and Retrieval/methods , Medical Informatics/trends , Humans , Meaningful Use , Natural Language Processing , Research Design
17.
AMIA Annu Symp Proc ; 2018: 574-583, 2018.
Article in English | MEDLINE | ID: mdl-30815098

ABSTRACT

Manually annotated clinical corpora are commonly used as the gold standards for the training and evaluation of clinical natural language processing (NLP) tools. The creation of these manual annotation corpora, however, is both costly and time-consuming. There is an emerging need in the clinical NLP community for reusing existing annotation corpora across different clinical NLP tasks. The objective of this study is to design, develop and evaluate a framework and accompanying tools to support the standardization and integration of annotation corpora using the HL7 Fast Healthcare Interoperability Resources (FHIR) specification. The framework contains two main modules: 1) an automatic schema transformation module, in which the annotation schema in each corpus is automatically transformed into the FHIR-based schema; 2) an expert-based verification and annotation module, in which existing annotations can be verified and new annotations can be added for new elements defined in FHIR. We evaluated the framework using various annotation corpora created as part of different clinical NLP projects at the Mayo Clinic. We demonstrated that it is feasible to leverage FHIR as a standard data model for standardizing heterogeneous annotation corpora for their reuse and integration in advanced clinical NLP research and practices.


Subjects
Electronic Health Records/standards , Health Information Interoperability/standards , Health Level Seven , Natural Language Processing , Feasibility Studies , Humans
18.
Int J Med Inform ; 108: 78-84, 2017 12.
Article in English | MEDLINE | ID: mdl-29132635

ABSTRACT

Clinical registries are designed to collect information relating to a particular condition for research or quality improvement. Intuitively, informatics in the area of data management and extraction plays a central role in clinical registries. Due to various reasons such as lack of informatics awareness or expertise, there may be little informatics involvement in designing clinical registries. In this paper, we studied a clinical registry from two critical perspectives, data quality and interoperability, where informatics can play a role. We evaluated these two aspects of an existing registry, Gynecology Surgery Registry, by mapping data elements and value sets, used in the registry, to a standardized terminology, SNOMED-CT. The results showed that majority of the values are ad-hoc and only 6 of 91 procedures in the registry could be mapped to the SNOMED-CT. To tackle this issue, we assessed the feasibility of automated data abstraction process, by training machine learning classifiers, based on existing manually extracted data. These classifiers achieved a reasonable average F-measure of 0.94. We concluded that more informatics engagement is needed to improve the interoperability, reusability, and quality of the registry.


Subjects
Health Information Interoperability/standards , Information Storage and Retrieval/methods , Quality Improvement , Registries/standards , Systematized Nomenclature of Medicine , Humans , Machine Learning
19.
J Med Internet Res ; 19(10): e342, 2017 Oct 16.
Article in English | MEDLINE | ID: mdl-29038097

ABSTRACT

BACKGROUND: Self-management is crucial to diabetes care, and providing expert-vetted content for answering patients' questions is crucial in facilitating patient self-management. OBJECTIVE: The aim is to investigate the use of information retrieval techniques in recommending patient education materials for patients' diabetes-related questions. METHODS: We compared two retrieval algorithms, one based on Latent Dirichlet Allocation topic modeling (topic modeling-based model) and one based on semantic groups (semantic group-based model), with a baseline retrieval model, the vector space model (VSM), in recommending diabetic patient education materials for diabetic questions posted on the TuDiabetes forum. The evaluation was based on a gold standard dataset consisting of 50 randomly selected diabetic questions for which the relevancy of diabetic education materials to the questions was manually assigned by two experts. The performance was assessed using the precision of top-ranked documents. RESULTS: We retrieved 7510 diabetic questions from the forum and 144 diabetic patient educational materials from the patient education database at Mayo Clinic. The rate at which words mapped to the Unified Medical Language System (UMLS) differed significantly between the two corpora (P<.001). The topic modeling-based model outperformed the other retrieval algorithms. For example, for the top-retrieved document, the precision of the topic modeling-based, semantic group-based, and VSM models was 67.0%, 62.8%, and 54.3%, respectively. CONCLUSIONS: This study demonstrated that topic modeling can mitigate the vocabulary difference and that it achieved the best performance in recommending education materials for answering patients' questions. One direction for future work is to assess the generalizability of our findings and to extend our study to other disease areas, other patient education material resources, and online forums.
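
To illustrate the evaluation metric named in the abstract, the sketch below computes precision over the top-k ranked documents for one question and its mean over several questions. It is a generic illustration, not the authors' evaluation code; the function names, the document identifiers, and the convention of dividing by k are assumptions.

```python
def precision_at_k(ranked_doc_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that experts judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k  # assumed convention: divide by k even if fewer were returned


def mean_precision_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average precision@k over (ranking, relevant set) pairs, one pair per question."""
    return sum(precision_at_k(ranking, relevant, k) for ranking, relevant in runs) / len(runs)
```

With k = 1 this reduces to the share of questions whose top-retrieved document is relevant, which is the kind of figure the abstract reports for the top-retrieved document.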


Subjects
Diabetes Mellitus/therapy , Information Storage and Retrieval/methods , Patient Education as Topic/methods , Databases, Factual , Humans , Surveys and Questionnaires
20.
AMIA Jt Summits Transl Sci Proc ; 2017: 95-103, 2017.
Article in English | MEDLINE | ID: mdl-28815115

ABSTRACT

The use of multiple data sources has been preferred in the surveillance of adverse drug events (ADEs) due to the shortcomings of using only a single source. In this study, we proposed a framework in which the ADEs associated with drugs of interest are systematically discovered from the FDA's Adverse Event Reporting System (AERS) and then validated by mining unstructured clinical notes from Electronic Medical Records (EMRs). This framework has two features. First, a higher priority was given to clinical practice during signal detection and validation. Second, the normalization by NLP facilitated the interoperation between AERS-DM and the EMR. To demonstrate this methodology, we investigated potential ADEs associated with drugs (class level) for rheumatoid arthritis (RA) patients. The results demonstrated the feasibility and sufficient accuracy of the framework. The framework can serve as the interface between the informatics domain and the medical domain to facilitate ADE discovery.
