1.
Front Mol Biosci ; 11: 1268019, 2024.
Article in English | MEDLINE | ID: mdl-38903180

ABSTRACT

Skeletal diseases impose a considerable burden on society. The clinical and tissue-engineering therapies applied to alleviate such diseases frequently result in complications and are inadequately effective. Research has shifted from conventional therapies based on mesenchymal stem cells (MSCs) to exosomes derived from MSCs. Exosomes are natural nanocarriers of endogenous DNA, RNA, proteins, and lipids and have a low immune clearance rate and good barrier penetration and allow targeted delivery of therapeutics. MSC-derived exosomes (MSC-exosomes) have the characteristics of both MSCs and exosomes, and so they can have both immunosuppressive and tissue-regenerative effects. Despite advances in our knowledge of MSC-exosomes, their regulatory mechanisms and functionalities are unclear. Here we review the therapeutic potential of MSC-exosomes for skeletal diseases.

2.
Front Cardiovasc Med ; 10: 1198526, 2023.
Article in English | MEDLINE | ID: mdl-37705687

ABSTRACT

Introduction: Venous thromboembolism (VTE) risk assessment at admission is of great importance for early screening and timely prophylaxis and management during hospitalization. The purpose of this study is to develop and validate novel risk assessment models at admission based on machine learning (ML) methods. Methods: In this retrospective study, a total of 3078 individuals were included with their Caprini variables within 24 hours of admission. Then several ML models were built, including logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB). The prediction performance of ML models and the Caprini risk score (CRS) was then validated and compared through a series of evaluation metrics. Results: The values of AUROC and AUPRC were 0.798 and 0.303 for LR, 0.804 and 0.360 for RF, and 0.796 and 0.352 for XGB, respectively, which outperformed CRS significantly (0.714 and 0.180, P < 0.001). When prediction scores were stratified into three risk levels for application, RF could obtain more reasonable results than CRS, including fewer false-positive alerts and larger lower-risk proportions. The improved stratification results were further verified by the net-reclassification-improvement (NRI) analysis. Discussion: This study indicated that machine learning models could improve VTE risk prediction at admission compared with CRS. Among the ML models, RF was found to have superior performance and great potential in clinical practice.
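
The AUROC values reported in this abstract can be read as the probability that a randomly chosen VTE-positive patient receives a higher predicted score than a randomly chosen VTE-negative patient. The following is a minimal sketch of that definition in plain Python; the function name `auroc` and the toy labels and scores are assumptions made for illustration and do not come from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data invented for illustration only (not from the study).
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
toy_auroc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs are ordered correctly
```

The pairwise loop is quadratic and is chosen here only for readability; a rank-based computation would likely be preferable on a cohort of several thousand patients.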

3.
Molecules ; 28(11)2023 May 26.
Article in English | MEDLINE | ID: mdl-37298846

ABSTRACT

Cancer, which presents with high incidence and mortality rates, has become a significant health threat worldwide. However, there is currently no effective solution for rapid screening and high-quality treatment of early-stage cancer patients. Metal-based nanoparticles (MNPs), as a new type of compound with stable properties, convenient synthesis, high efficiency, and few adverse reactions, have become highly competitive tools for early cancer diagnosis. Nevertheless, challenges such as the difference between the microenvironment of detected markers and the real-life body fluids remain in achieving widespread clinical application of MNPs. This article provides a comprehensive review of the research progress made in the field of in vitro cancer diagnosis using metal-based nanoparticles. By delving into the characteristics and advantages of these materials, this paper aims to inspire and guide researchers towards fully exploiting the potential of metal-based nanoparticles in the early diagnosis and treatment of cancer.


Subject(s)
Metal Nanoparticles , Nanostructures , Neoplasms , Humans , Tumor Biomarkers , Nanostructures/therapeutic use , Neoplasms/diagnosis , Metals , Tumor Microenvironment
4.
ACS Nano ; 17(7): 6247-6260, 2023 04 11.
Article in English | MEDLINE | ID: mdl-36961255

ABSTRACT

How to effectively treat malignant osteosarcoma remains clinically challenging. Programmed delivery of chemotherapeutic agents and immunostimulants may offer a universal strategy for killing osteosarcoma cells while simultaneously eliciting in situ antitumor immunity. However, targeted chemoimmunotherapy lacks a reliable delivery system. To address this issue, we herein developed a bioinspired calcium phosphonate nanoagent that was synthesized by chemical reactions between Ca2+ and phosphonate residue from zoledronic acid using bovine serum albumin as a scaffold. In addition, methotrexate in combination with a phosphorothioate CpG immunomodulator was also loaded for pH-responsive delivery to enable synergistic chemoimmunotherapy of osteosarcoma. The calcium phosphonate nanoagents were found to effectively accumulate in osteosarcoma for nearly 1 week, which is favorable for exerting the vaccination effects in situ by maturing dendritic cells and priming CD8+ T cells to suppress the osteosarcoma progression and pulmonary metastasis through controlled release of the three loaded agents in the acidic tumor microenvironment. The current study may thus offer a reliable delivery platform for achieving targeted chemotherapy-induced in situ antitumor immunity.


Subject(s)
Bone Neoplasms , Organophosphonates , Osteosarcoma , Humans , Calcium , Organophosphonates/therapeutic use , CD8-Positive T-Lymphocytes , Osteosarcoma/drug therapy , Bone Neoplasms/drug therapy , Vaccination , Tumor Cell Line , Doxorubicin/chemistry , Tumor Microenvironment
5.
ACS Biomater Sci Eng ; 8(12): 5329-5337, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36383732

ABSTRACT

Osteosarcoma is a malignant osteogenic tumor with a high metastatic rate commonly occurring in adolescents. Although radiotherapy is applied to treat unresectable osteosarcoma with radiation resistance, a high dose of radiotherapy is required, which may weaken the immune microenvironment. Therefore, there is an urgent need to develop novel agents to maximize the radiotherapeutic effects by eliciting immune activation effects. In this study, we synthesized therapeutic gadolinium-based metal-bisphosphonate nanoparticles (NPs) for osteosarcoma treatment that can be combined with radiotherapy. The gadolinium ion (Gd) was chelated with zoledronic acid (Zol), a commonly used drug to prevent/treat osteoporosis or bone metastases from advanced cancers, and stabilized by ovalbumin (OVA) to produce OVA-GdZol NPs. OVA-GdZol NPs were internalized into K7M2 osteosarcoma cells, showing a high sensitization effect under X-ray irradiation. Cell pretreatment with OVA-GdZol NPs significantly enhanced the radiation therapeutic effect in vitro by reducing the cell colonies and increasing the signal of γH2AX-positive cells. More importantly, OVA-GdZol NPs promoted the maturation of bone marrow-derived dendritic cells (BMDCs) and M1 polarization of macrophages. The inhibitory effect on K7M2 osteosarcoma of OVA-GdZol NPs and X-ray radiation was evident, indicated by a significantly reduced tumor volume, high survival rate, and decreased lung metastasis. Meanwhile, both innate and adaptive immune systems were activated to exert a strong antitumor effect. The above results strongly suggest that OVA-GdZol NPs serve as both radiosensitizers and immune adjuvants, suitable for the sequential combination of vaccination and radiotherapy.


Subject(s)
Nanoparticles , Neoplasms , Humans , Adolescent , Gadolinium , Diphosphonates/therapeutic use , Nanoparticles/therapeutic use , Ovalbumin , Tumor Microenvironment
6.
J Biomed Inform ; 134: 104210, 2022 10.
Article in English | MEDLINE | ID: mdl-36122879

ABSTRACT

Venous thromboembolism (VTE) is the world's third most common cause of vascular mortality and a serious complication from multiple departments. Risk assessment of VTE guides clinical intervention in time and is of great importance to in-hospital patients. Traditional VTE risk assessment methods based on scaling tools, which always require rules carefully designed by human experts, are difficult to apply to large-population scenarios since the manually designed rules are not guaranteed to be accurate to all populations. In contrast, with the development of the electronic health record (EHR) datasets, data-driven machine-learning-based risk assessment methods have proven superior predictability in many studies in recent years. This paper uses the gradient boosting tree model to study the VTE risk assessment problem with multi-department data. There exist two distinct characteristics of VTE data collected at the level of the entire hospital: its wide distribution and heterogeneity across multiple departments. To this end, we consider the prediction task over multiple departments as a multi-task learning process, and introduce the algorithm of a task-aware tree-based method TSGB to tackle the multi-task prediction problem. Although the introduction of multi-task learning improves overall across-department performance, we reveal the problem of task-wise performance decline while dealing with imbalanced VTE data volume. According to the analysis, we finally propose two variants of TSGB to alleviate the problems and further boost the prediction performance. Compared with state-of-the-art rule-based and multi-task tree-based methods, the experimental results show the proposed methods not only improve the overall across-department AUC performance effectively, but also ensure the improvement of performance over every single department prediction.


Subject(s)
Venous Thromboembolism , Electronic Health Records , Hospitals , Humans , Risk Assessment/methods , Risk Factors , Venous Thromboembolism/diagnosis , Venous Thromboembolism/etiology
7.
J Nanobiotechnology ; 20(1): 185, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35414075

ABSTRACT

Albumin-biomineralized copper sulfide nanoparticles (Cu2-xS NPs) have attracted much attention as an emerging phototheranostic agent due to their facile preparation and high biocompatibility. However, comprehensive preclinical safety evaluation is required for their further clinical translation. We herein evaluate in detail the safety and hepatotoxicity of bovine serum albumin-biomineralized Cu2-xS (BSA@Cu2-xS) NPs with two different sizes in rats. Large-sized (LNPs, 17.8 nm) and small-sized (SNPs, 2.8 nm) BSA@Cu2-xS NPs with great near-infrared absorption and photothermal conversion efficiency are first obtained. Seven days after a single-dose intravenous administration, SNPs distributed throughout the body are cleared primarily through the feces, while a large amount of LNPs remain in the liver. A 14-day subacute toxicity study with a 28-day recovery period is conducted, showing long-term hepatotoxicity without recovery for LNPs but reversible toxicity for SNPs. Cellular uptake studies indicate that LNPs prefer to reside in Kupffer cells, leading to prolonged and delayed hepatotoxicity even after the cessation of NPs administration, while SNPs have much less Kupffer cell uptake. RNA-sequencing analysis for gene expression indicates that the inflammatory pathway, lipid metabolism pathway, drug metabolism-cytochrome P450 pathway, cholesterol/bile acid metabolism pathway, and copper ion transport/metabolism pathway are compromised in the liver by two sizes of BSA@Cu2-xS NPs, while only SNPs show a complete recovery of altered gene expression after NPs discontinuation. This study demonstrates the translational feasibility of small-sized BSA@Cu2-xS NPs as excellent nanoagents with manageable hepatotoxicity.


Subject(s)
Chemical and Drug Induced Liver Injury , Nanoparticles , Animals , Copper/toxicity , Rats , Bovine Serum Albumin , Sulfides/toxicity
8.
J Biomed Inform ; 122: 103892, 2021 10.
Article in English | MEDLINE | ID: mdl-34454079

ABSTRACT

Venous thromboembolism (VTE) is a common vascular disease and potentially fatal complication during hospitalization, and so the early identification of VTE risk is of significant importance. Compared with traditional scale assessments, machine learning methods provide new opportunities for precise early warning of VTE from clinical medical records. This research aimed to propose a two-stage hierarchical machine learning model for VTE risk prediction in patients from multiple departments. First, we built a machine learning prediction model that covered the entire hospital, based on all cohorts and common risk factors. Then, we took the prediction output of the first stage as an initial assessment score and then built specific models for each department. Over the duration of the study, a total of 9213 inpatients, including 1165 VTE-positive samples, were collected from four departments, which were split into developing and test datasets. The proposed model achieved an AUC of 0.879 in the department of oncology, which outperformed the first-stage model (0.730) and the department model (0.787). This was attributed to the full use of both the large sample size at the hospital level and variable abundance at the department level. Experimental results show that our model could effectively improve the prediction of hospital-acquired VTE risk before image diagnosis and provide decision support for further nursing and medical intervention.


Subject(s)
Venous Thromboembolism , Hospitals , Humans , Machine Learning , Risk Assessment , Risk Factors , Venous Thromboembolism/diagnosis , Venous Thromboembolism/epidemiology
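
The two-stage design described in this abstract (a hospital-wide model whose output becomes an input to a department-specific model) can be sketched roughly as follows. This is only an illustration of the data flow: the abstract does not name the learners used, so plain logistic functions stand in for them, and every feature name, weight, and bias below is a made-up placeholder rather than a fitted value.

```python
import math


def logistic(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder parameters, invented for illustration; not from the study.
HOSPITAL_WEIGHTS = {"age_over_60": 0.8, "prior_vte": 1.5}
HOSPITAL_BIAS = -2.0
DEPARTMENT_MODELS = {
    "oncology": {"stage_one": 2.0, "active_chemotherapy": 0.9, "bias": -1.5},
}


def stage_one_score(common_features: dict) -> float:
    """Hospital-wide model over risk factors shared by all departments."""
    z = HOSPITAL_BIAS + sum(HOSPITAL_WEIGHTS[k] * v for k, v in common_features.items())
    return logistic(z)


def stage_two_score(department: str, common_features: dict, department_features: dict) -> float:
    """Department model that takes the stage-one score as an extra input."""
    model = DEPARTMENT_MODELS[department]
    z = model["bias"] + model["stage_one"] * stage_one_score(common_features)
    z += sum(model[k] * v for k, v in department_features.items())
    return logistic(z)


patient_common = {"age_over_60": 1, "prior_vte": 0}
patient_oncology = {"active_chemotherapy": 1}
first = stage_one_score(patient_common)
final = stage_two_score("oncology", patient_common, patient_oncology)
```

The point of the arrangement is that the first stage can be trained on the whole hospital cohort, while the second stage only needs to learn a department-level correction from richer but scarcer variables.
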
9.
Nanomedicine (Lond) ; 16(17): 1487-1504, 2021 07.
Article in English | MEDLINE | ID: mdl-34184559

ABSTRACT

Aim: To explore the hepatotoxicity of copper sulfide nanoparticles (CuSNPs) toward hepatocyte spheroids. Materials & methods: Other than the traditional agarose method to generate hepatocyte spheroids, we developed a multi-concave agarose chip (MCAC) method to investigate changes in hepatocyte viability, morphology, mitochondrial membrane potential, reactive oxygen species and hepatobiliary transporter by CuSNPs. Results: The MCAC method allowed a large number of spheroids to be obtained per sample. CuSNPs showed hepatotoxicity in vitro through a decrease in spheroid viability, albumin/urea production and glycogen deposition. CuSNPs also introduced hepatocyte spheroid injury through alteration of mitochondrial membrane potential and reactive oxygen species, that could be reversed by N-acetyl-l-cysteine. CuSNPs significantly decreased the activity of BSEP transporter by downregulating its mRNA and protein levels. Activity of the MRP2 transporter remained unchanged. Conclusion: We observed the hepatotoxicity of CuSNPs in vitro with associated mechanisms in an advanced 3D culture system.


Subject(s)
Chemical and Drug Induced Liver Injury , Nanoparticles , Cultured Cells , Copper/toxicity , Hepatocytes , Humans , Nanoparticles/toxicity , Sepharose , Cellular Spheroids , Sulfides/toxicity
10.
Biopreserv Biobank ; 19(5): 386-393, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34042506

ABSTRACT

Objective: To establish a structured and integrated platform of clinical data and biobank data, and a client to retrieve these data. Study Design: Initially, the hospital information system (HIS) and biobank information system (BIS) were integrated through the patients' ID numbers. Then, natural language processing (NLP) was used to process the integrated unstructured clinical information. A query interface was designed for this system, which enabled researchers to retrieve clinical or biobank data. Finally, several queries were listed and manually checked to test the retrieval performance of the system. Results: The construction of the biobank screening system (BSS) was completed, and the data were structured. The BSS took an average of 2 seconds to perform a search for target patients/samples. The retrieval results were consistent with the HIS and BIS. For complex queries, we manually checked the retrieved patients/samples, and the system's accuracy was 100%. Conclusion: This NLP-based system improved biological sample screening and the use of clinical data. We will continue to improve this system, enhance resource sharing, and promote the development of translational medicine.


Subject(s)
Artificial Intelligence , Electronic Health Records , Biological Specimen Banks , China , Humans , Natural Language Processing
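
The first integration step in this abstract, linking HIS and BIS records through the patient ID, amounts to a keyed join. A minimal sketch is given below, assuming hypothetical record layouts; the field names `patient_id`, `diagnosis`, and `sample_type` and all example values are invented and do not reflect the actual schemas.

```python
from collections import defaultdict

# Hypothetical records; field names are illustrative, not the real HIS/BIS schema.
his_records = [
    {"patient_id": "P001", "diagnosis": "gastric cancer"},
    {"patient_id": "P002", "diagnosis": "pneumonia"},
]
bis_records = [
    {"patient_id": "P001", "sample_type": "serum"},
    {"patient_id": "P001", "sample_type": "tissue"},
    {"patient_id": "P003", "sample_type": "plasma"},
]


def link_by_patient_id(clinical, samples):
    """Attach every biobank sample to the clinical record sharing its patient ID."""
    samples_by_id = defaultdict(list)
    for sample in samples:
        samples_by_id[sample["patient_id"]].append(sample["sample_type"])
    # Clinical records without any sample get an empty list.
    return [
        {**record, "samples": samples_by_id.get(record["patient_id"], [])}
        for record in clinical
    ]


linked = link_by_patient_id(his_records, bis_records)
```

This sketch keeps every clinical record and drops samples with no matching patient; whether the real system behaves the same way is not stated in the abstract.
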
11.
Anal Chem ; 93(16): 6414-6420, 2021 04 27.
Article in English | MEDLINE | ID: mdl-33843203

ABSTRACT

The development of a specific and noninvasive technology for understanding gastritic response together with efficient therapy is an urgent clinical issue. Herein, we fabricated a novel iodinated bovine serum albumin (BSA) nanoparticle based on gastritic microenvironment for computed tomography (CT) imaging and repair of acute gastritis. Derived from the characteristic mucosa defect and inflammatory cell (e.g., macrophage and neutrophil) infiltration in acute gastritis, the pH-sensitive nanoparticles can sediment under acidic conditions and be uniformly distributed in the defective mucosa via the phagocytosis of inflammatory cells. Hence, enhanced CT images can clearly reveal the mucosal morphology in the nanoparticle-treated gastritic rat over a long time window in comparison with nanoparticle-treated healthy rats and clinical small-molecule-treated gastritic rats. In addition, we have discovered that nanoparticles can repair the atrophic gastric mucosa to a normal state. This repair process mainly stems from inflammatory immune response caused by phagocytized nanoparticles, such as the polarization of proinflammatory macrophages (M1) to anti-inflammatory macrophages (M2). The biocompatible nanoparticles that avoid the inherent defects of the clinical small molecules have great potential for accurate diagnosis and treatment of gastritis in the early stage.


Subject(s)
Gastritis , Nanoparticles , Bovine Serum Albumin , X-Ray Computed Tomography , Animals , Gastritis/diagnostic imaging , Gastritis/drug therapy , Macrophages , Rats
13.
BMC Med Inform Decis Mak ; 19(1): 156, 2019 08 07.
Article in English | MEDLINE | ID: mdl-31391038

ABSTRACT

BACKGROUND: Imaging examinations, such as ultrasonography, magnetic resonance imaging and computed tomography scans, play key roles in healthcare settings. To assess and improve the quality of imaging diagnosis, we need to manually find and compare the pre-existing reports of imaging and pathology examinations which contain overlapping exam body sites from electronic medical records (EMRs). The process of retrieving those reports is time-consuming. In this paper, we propose a convolutional neural network (CNN) based method which can better utilize semantic information contained in report texts to accelerate the retrieving process. METHODS: We included 16,354 imaging and pathology report-pairs from 1926 patients who were admitted to Shanghai Tongren Hospital and had ultrasonic examinations between 1st May 2017 and 31st July 2017. We adapted the CNN model to calculate the similarities among the report-pairs to identify target report-pairs with overlapping body sites, and compared the performance with six other conventional models, including keyword mapping, latent semantic analysis (LSA), latent Dirichlet allocation (LDA), Doc2Vec, Siamese long short term memory (LSTM) and a model based on named entity recognition (NER). We also utilized a graph embedding method to enhance the word representation by capturing the semantic relations information from medical ontologies. Additionally, we used the LIME algorithm to identify which features (or words) are decisive for the prediction results and improved the model interpretability. RESULTS: Experiment results showed that our CNN model gained significant improvement compared to all other conventional models on area under the receiver operating characteristic (AUROC), precision, recall and F1-score in our test dataset. The AUROC of our CNN models gained approximately 3-7% improvement. The AUROC of CNN model with graph-embedding and ontology based medical concept vectors was 0.8% higher than the model with randomly initialized vectors and 1.5% higher than the one with pre-trained word vectors. CONCLUSION: Our study demonstrates that CNN model with pre-trained medical concept vectors could accurately identify target report-pairs with overlapping body sites and potentially accelerate the retrieving process for imaging diagnosis quality measurement.


Subject(s)
Algorithms , Electronic Health Records , Information Storage and Retrieval/methods , Computer Neural Networks , Humans , Pathology , ROC Curve , Semantics , Ultrasonography
14.
JMIR Med Inform ; 7(3): e13331, 2019 Jul 16.
Article in English | MEDLINE | ID: mdl-31313661

ABSTRACT

BACKGROUND: The growing interest in observational trials using patient data from electronic medical records poses challenges to both efficiency and quality of clinical data collection and management. Even with the help of electronic data capture systems and electronic case report forms (eCRFs), the manual data entry process followed by chart review is still time consuming. OBJECTIVE: To facilitate the data entry process, we developed a natural language processing-driven medical information extraction system (NLP-MIES) based on the i2b2 reference standard. We aimed to evaluate whether the NLP-MIES-based eCRF application could improve the accuracy and efficiency of the data entry process. METHODS: We conducted a randomized and controlled field experiment, and 24 eligible participants were recruited (12 for the manual group and 12 for NLP-MIES-supported group). We simulated the real-world eCRF completion process using our system and compared the performance of data entry on two research topics, pediatric congenital heart disease and pneumonia. RESULTS: For the congenital heart disease condition, the NLP-MIES-supported group increased accuracy by 15% (95% CI 4%-120%, P=.03) and reduced elapsed time by 33% (95% CI 22%-42%, P<.001) compared with the manual group. For the pneumonia condition, the NLP-MIES-supported group increased accuracy by 18% (95% CI 6%-32%, P=.008) and reduced elapsed time by 31% (95% CI 19%-41%, P<.001). CONCLUSIONS: Our system could improve both the accuracy and efficiency of the data entry process.

15.
JMIR Med Inform ; 7(2): e12704, 2019 May 23.
Article in English | MEDLINE | ID: mdl-31124461

ABSTRACT

BACKGROUND: The vocabulary gap between consumers and professionals in the medical domain hinders information seeking and communication. Consumer health vocabularies have been developed to aid such informatics applications. This purpose is best served if the vocabulary evolves with consumers' language. OBJECTIVE: Our objective is to develop a method for identifying and adding new terms to consumer health vocabularies, so that they can keep up with the constantly evolving medical knowledge and language use. METHODS: In this paper, we propose a consumer health term-finding framework based on a distributed word vector space model. We first learned word vectors from a large-scale text corpus and then adopted a supervised method with existing consumer health vocabularies for learning vector representation of words, which can provide additional supervised fine tuning after unsupervised word embedding learning. With a fine-tuned word vector space, we identified pairs of professional terms and their consumer variants by their semantic distance in the vector space. A subsequent manual review of the extracted and labeled pairs of entities was conducted to validate the results generated by the proposed approach. The results were evaluated using mean reciprocal rank (MRR). RESULTS: Manual evaluation showed that it is feasible to identify alternative medical concepts by using professional or consumer concepts as queries in the word vector space without fine tuning, but the results are more promising in the final fine-tuned word vector space. The MRR values indicated that on average, a professional or consumer concept is about 14th closest to its counterpart in the word vector space without fine tuning, and the MRR in the final fine-tuned word vector space is 8. Furthermore, the results demonstrate that our method can collect abbreviations and common typos frequently used by consumers. CONCLUSIONS: By integrating a large amount of text information and existing consumer health vocabularies, our method outperformed several baseline ranking methods and is effective for generating a list of candidate terms for human review during consumer health vocabulary development.
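
For readers unfamiliar with the metric named in this abstract, the sketch below shows the standard definition of mean reciprocal rank: the average of 1/rank, where rank is the 1-based position of the correct counterpart in each query's result list. The abstract reports its values as positions ("about 14th closest", "8") and does not spell out its exact computation, so this is the textbook form under that caveat; the function name and the toy ranks are assumptions.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over queries; each rank is the 1-based position of the correct answer."""
    if not ranks:
        raise ValueError("need at least one query")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy ranks invented for illustration: the counterpart term was found
# at positions 1, 2 and 4 for three hypothetical queries.
toy_ranks = [1, 2, 4]
toy_mrr = mean_reciprocal_rank(toy_ranks)  # (1 + 1/2 + 1/4) / 3
```

Taking the reciprocal of an MRR value gives a rough "typical position" of the correct answer, which may be how the positions quoted in the abstract were derived.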

16.
Proc Int World Wide Web Conf ; 2017: 1073-1081, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28967000

ABSTRACT

Patients discuss complementary and alternative medicine (CAM) in online health communities. Sometimes, patients' conflicting opinions toward CAM-related issues trigger debates in the community. The objectives of this paper are to identify such debates, identify controversial CAM therapies in a popular online breast cancer community, as well as patients' stances towards them. To scale our analysis, we trained a set of classifiers. We first constructed a supervised classifier based on a long short-term memory neural network (LSTM) stacked over a convolutional neural network (CNN) to detect automatically CAM-related debates from a popular breast cancer forum. Members' stances in these debates were also identified by a CNN-based classifier. Finally, posts automatically flagged as debates by the classifier were analyzed to explore which specific CAM therapies trigger debates more often than others. Our methods are able to detect CAM debates with F score of 77%, and identify stances with F score of 70%. The debate classifier identified about 1/6 of all CAM-related posts as debate. About 60% of CAM-related debate posts represent the supportive stance toward CAM usage. Qualitative analysis shows that some specific therapies, such as Gerson therapy and usage of laetrile, trigger debates frequently among members of the breast cancer community. This study demonstrates that neural networks can effectively locate debates on usage and effectiveness of controversial CAM therapies, and can help make sense of patients' opinions on such issues under dispute. As to CAM for breast cancer, perceptions of their effectiveness vary among patients. Many of the specific therapies trigger debates frequently and are worth more exploration in future work.

17.
Proc Int World Wide Web Conf ; 2017: 123-131, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28736777

ABSTRACT

A large number of patients discuss treatments in online health communities (OHCs). One research question of interest to health researchers is whether treatments being discussed in OHCs are eventually used by community members in their real lives. In this paper, we rely on machine learning methods to automatically identify attributions of mentions of treatments from an online autism community. The context of our work is online autism communities, where parents exchange support for the care of their children with autism spectrum disorder. Our methods are able to distinguish discussions of treatments that are associated with patients, caregivers, and others, as well as identify whether a treatment is actually taken. We investigate treatments that are not just discussed but also used by patients according to two types of content analysis, cross-sectional and longitudinal. The treatments identified through our content analysis help create a catalogue of real-world treatments. The results of this study lay the foundation for future research to compare real-world drug usage with established clinical guidelines.

18.
J Biomed Inform ; 73: 76-83, 2017 09.
Article in English | MEDLINE | ID: mdl-28756160

ABSTRACT

With rapid adoption of Electronic Health Records (EHR) in China, an increasing amount of clinical data has been available to support clinical research. Clinical data secondary use usually requires de-identification of personal information to protect patient privacy. Since manual de-identification of free clinical text requires a significant amount of human work, developing an automated de-identification system is necessary. While there are many de-identification systems available for English clinical text, designing a de-identification system for Chinese clinical text faces many challenges such as unavailability of necessary lexical resources and sparsity of patient health information (PHI) in Chinese clinical text. In this paper, we designed a de-identification pipeline taking advantage of both rule-based and machine learning techniques. Our method, in particular, can effectively construct a data set with dense PHI information, which saves annotation time significantly for subsequent supervised learning. We experiment on a dataset of 3000 heterogeneous clinical documents to evaluate the annotation cost and the de-identification performance. Our approach can increase the efficiency of the annotation effort by over 60% while reaching performance as high as over 90% measured by F score. We demonstrate that combining rule-based and machine learning techniques is an effective way to reduce the annotation cost and achieve high performance in the Chinese clinical text de-identification task.


Subject(s)
Confidentiality , Data Curation , Electronic Health Records , Natural Language Processing , China , Humans
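
The rule-based half of a pipeline like the one in this abstract is often built from pattern matching. The sketch below illustrates that general idea only: the two regular expressions, the tag names, and the example sentence (including the phone number) are invented for illustration, since the abstract does not list the actual rules or PHI categories used.

```python
import re

# Illustrative patterns only; the paper's actual rules are not given in the abstract.
PHI_PATTERNS = [
    # Dates written as year/month/day with Chinese unit characters.
    ("DATE", re.compile(r"\d{4}年\d{1,2}月\d{1,2}日")),
    # 11-digit numbers starting with 1, not embedded in a longer digit run.
    ("PHONE", re.compile(r"(?<!\d)1\d{10}(?!\d)")),
]


def mask_phi(text: str) -> str:
    """Replace each matched span with a bracketed category tag."""
    for tag, pattern in PHI_PATTERNS:
        text = pattern.sub(f"[{tag}]", text)
    return text


# Made-up sentence: "The patient was admitted on 2016-03-05, contact phone ...".
sample = "患者于2016年3月5日入院,联系电话13812345678。"
masked = mask_phi(sample)
```

Rules like these tend to have high precision on well-formatted fields, which is one plausible reason to pair them with a learned model for names and other free-form PHI.
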
19.
J Am Med Inform Assoc ; 24(6): 1062-1071, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-28379377

ABSTRACT

OBJECTIVE: To develop an open-source information extraction system called Eligibility Criteria Information Extraction (EliIE) for parsing and formalizing free-text clinical research eligibility criteria (EC) following Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) version 5.0. MATERIALS AND METHODS: EliIE parses EC in 4 steps: (1) clinical entity and attribute recognition, (2) negation detection, (3) relation extraction, and (4) concept normalization and output structuring. Informaticians and domain experts were recruited to design an annotation guideline and generate a training corpus of annotated EC for 230 Alzheimer's clinical trials, which were represented as queries against the OMOP CDM and included 8008 entities, 3550 attributes, and 3529 relations. A sequence labeling-based method was developed for automatic entity and attribute recognition. Negation detection was supported by NegEx and a set of predefined rules. Relation extraction was achieved by a support vector machine classifier. We further performed terminology-based concept normalization and output structuring. RESULTS: In task-specific evaluations, the best F1 score for entity recognition was 0.79, and for relation extraction was 0.89. The accuracy of negation detection was 0.94. The overall accuracy for query formalization was 0.71 in an end-to-end evaluation. CONCLUSIONS: This study presents EliIE, an OMOP CDM-based information extraction system for automatic structuring and formalization of free-text EC. According to our evaluation, machine learning-based EliIE outperforms existing systems and shows promise to improve.


Subject(s)
Clinical Trials as Topic , Eligibility Determination/methods , Machine Learning , Natural Language Processing , Patient Selection , Humans
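
Step (2) of the pipeline in this abstract, negation detection, is described as relying on NegEx plus predefined rules. The sketch below is not NegEx; it is a deliberately simplified trigger-and-window rule in the same spirit, with a three-word trigger list, a five-token forward window, and an example criterion that are all assumptions made for illustration.

```python
import re

# A tiny, illustrative trigger list; NegEx itself uses a much larger curated lexicon.
NEGATION_TRIGGERS = {"no", "without", "denies"}
WINDOW = 5  # assumed scope: up to 5 tokens after a trigger


def negated_entities(sentence, entities):
    """Return the single-token, lowercase entities found within a trigger's forward window."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    negated = set()
    for i, token in enumerate(tokens):
        if token in NEGATION_TRIGGERS:
            scope = tokens[i + 1 : i + 1 + WINDOW]
            negated.update(e for e in entities if e in scope)
    return negated


# Made-up eligibility criterion for illustration.
criterion = "Patients with dementia and no history of stroke or seizure"
result = negated_entities(criterion, ["dementia", "stroke", "seizure"])
```

In this example "stroke" and "seizure" fall inside the window after "no" while "dementia" precedes the trigger, so only the first two are marked as negated. A fixed window is crude; it would mislabel entities after a scope-ending word such as "but".
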
20.
Comput Methods Programs Biomed ; 140: 53-59, 2017 Mar.
Article in English | MEDLINE | ID: mdl-28254090

ABSTRACT

BACKGROUND AND OBJECTIVES: Researchers have developed effective methods to index free-text clinical notes into structured databases, in which negation detection is a critical but challenging step. In Chinese clinical records, negation detection is particularly challenging because it may depend on upstream Chinese information processing components such as word segmentation [1]. Traditionally, negation detection was carried out mostly using rule-based methods, whose comprehensiveness and portability were usually limited. Our objectives in this paper are to: 1) construct a large Chinese clinical notes corpus with negation annotated; 2) develop a negation detection tool for Chinese clinical notes; 3) evaluate the performance of character and word embedding features in Chinese clinical natural language processing. METHODS: In this paper, we construct a Chinese clinical corpus consisting of admission and discharge summaries, and propose sequence labeling based systems for negation and scope detection. Our systems rely on features from bag of characters, bag of words, character embedding and word embedding. For scopes, we introduce an additional feature to handle nested scopes with multiple negations. RESULTS: The two annotators reached an agreement of 0.79 measured by Kappa in manual annotation. In cue detection, our systems are able to achieve a performance as high as 99.0% measured by F score, which significantly outperforms the rule-based counterpart (79% F). The best system uses word embedding as features, which yields precision of 99.0% and recall of 99.1%. In scope detection, our system is able to achieve a performance of 94.6% measured by F score. CONCLUSIONS: Our study provides a state-of-the-art negation-detecting tool for Chinese clinical free-text notes; experimental results demonstrate that word embedding is effective in identifying negations, and that nested scopes can be identified effectively by our method.


Subject(s)
Natural Language Processing , Semantics , China , Electronic Health Records , Machine Learning
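
The F score quoted in this abstract is presumably the balanced F1 measure, the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes it from the reported precision (99.0%) and recall (99.1%); the function name is an assumption, and the abstract does not state which F variant was used.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for cue detection.
reported_precision = 0.990
reported_recall = 0.991
f1 = f1_score(reported_precision, reported_recall)
```

Under the F1 assumption the result is approximately 0.9905, which agrees with the 99.0% F score stated for cue detection.
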