Results 1 - 5 of 5
1.
J Med Internet Res ; 22(12): e18526, 2020 12 09.
Article in English | MEDLINE | ID: mdl-33295294

ABSTRACT

BACKGROUND: Common data models (CDMs) help standardize electronic health record data and facilitate outcome analysis for observational and longitudinal research. An analysis of pathology reports is required to establish fundamental information infrastructure for data-driven colon cancer research. The Observational Medical Outcomes Partnership (OMOP) CDM is used in distributed research networks for clinical data; however, it requires conversion of free text-based pathology reports into the CDM's format. There are few use cases of representing cancer data in CDM. OBJECTIVE: In this study, we aimed to construct a CDM database of colon cancer-related pathology with natural language processing (NLP) for a research platform that can utilize both clinical and omics data. The essential text entities from the pathology reports are extracted, standardized, and converted to the OMOP CDM format in order to utilize the pathology data in cancer research. METHODS: We extracted clinical text entities, mapped them to the standard concepts in the Observational Health Data Sciences and Informatics vocabularies, and built databases and defined relations for the CDM tables. Major clinical entities were extracted through NLP on pathology reports of surgical specimens, immunohistochemical studies, and molecular studies of colon cancer patients at a tertiary general hospital in South Korea. Items were extracted from each report using regular expressions in Python. Unstructured data, such as text that does not have a pattern, were handled with expert advice by adding regular expression rules. Our own dictionary was used for normalization and standardization to deal with biomarker and gene names and other ungrammatical expressions. The extracted clinical and genetic information was mapped to the Logical Observation Identifiers Names and Codes databases and the Systematized Nomenclature of Medicine (SNOMED) standard terminologies recommended by the OMOP CDM. 
The database-table relationships were newly defined through SNOMED standard terminology concepts. The standardized data were inserted into the CDM tables. For evaluation, 100 reports were randomly selected and independently annotated by a medical informatics expert and a nurse. RESULTS: We examined and standardized 1848 immunohistochemical study reports, 3890 molecular study reports, and 12,352 pathology reports of surgical specimens (from 2017 to 2018). The constructed and updated database contained the following extracted colorectal entities: (1) NOTE_NLP, (2) MEASUREMENT, (3) CONDITION_OCCURRENCE, (4) SPECIMEN, and (5) FACT_RELATIONSHIP of specimen with condition and measurement. CONCLUSIONS: This study aimed to prepare CDM data for a research platform to take advantage of all omics, clinical, and patient data at Seoul National University Bundang Hospital for colon cancer pathology. A more sophisticated preparation of the pathology data is needed for further research on cancer genomics, and various types of text narratives are the next target for additional research on the use of data in the CDM.
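
The extraction step described in the METHODS above (regular expressions in Python plus a dictionary for normalizing biomarker names) might look roughly like the following minimal sketch. The report layout, the `PATTERN` expression, the `SYNONYMS` dictionary, and the `extract_entities` function are all hypothetical illustrations; the abstract does not publish the study's actual rules or dictionary.

```python
import re

# Hypothetical synonym dictionary (assumption): maps lowercase surface forms
# to a preferred biomarker name. The study's own dictionary is not shown.
SYNONYMS = {
    "c-erbb2": "HER2",
    "her-2": "HER2",
    "her2": "HER2",
    "ki67": "Ki-67",
    "ki-67": "Ki-67",
}

# Hypothetical report layout (assumption): one "<name> : <result>" or
# "<name> = <result>" pair per line.
PATTERN = re.compile(
    r"^[ \t]*(?P<name>[A-Za-z0-9\-]+)[ \t]*[:=][ \t]*(?P<result>.+?)[ \t]*$",
    re.MULTILINE,
)


def extract_entities(report: str) -> list[dict]:
    """Return normalized (name, result) pairs found in a free-text report."""
    entities = []
    for match in PATTERN.finditer(report):
        raw_name = match.group("name")
        # Fall back to the raw name when the dictionary has no entry.
        name = SYNONYMS.get(raw_name.lower(), raw_name)
        entities.append({"name": name, "result": match.group("result")})
    return entities


sample_report = "c-erbB2 : Negative\nKi67 = 30%\n"
entities = extract_entities(sample_report)
```

In a pipeline of the kind the abstract describes, each normalized pair would then be mapped to a standard concept before being inserted into a CDM table; that mapping step is not sketched here.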


Subjects
Colonic Neoplasms/pathology , Electronic Health Records/standards , Medical Informatics/methods , Medical Oncology/methods , Databases, Factual , Humans
2.
BMC Med Inform Decis Mak ; 18(1): 29, 2018 05 21.
Article in English | MEDLINE | ID: mdl-29783980

ABSTRACT

BACKGROUND: Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. METHODS: We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a "slide paragraph" unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital. RESULTS: High F-scores were obtained for parsing biomarker name and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, which represents a richer result than biomarker profiles obtained based on the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms determined by expanded dictionary look-up and text similarity-based search. CONCLUSIONS: Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting.
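
The term normalization mentioned in the RESULTS above (dictionary look-up followed by a text similarity-based search) could be sketched as follows. This is only an illustration under stated assumptions: the `DICTIONARY` and `PREFERRED` contents, the `normalize` function, and the 0.8 similarity cutoff are hypothetical, and the standard-library `difflib` matcher stands in for whatever similarity measure the authors actually used.

```python
import difflib

# Hypothetical preferred terms and synonym dictionary (assumptions).
PREFERRED = ["Ki-67", "HER2", "p53", "CD20", "Cytokeratin 7"]
DICTIONARY = {"c-erbb2": "HER2", "ck7": "Cytokeratin 7"}


def normalize(term, cutoff=0.8):
    """Map a raw biomarker mention to a preferred term, or None if unknown."""
    key = term.strip().lower()

    # Step 1: exact dictionary look-up of known variants.
    if key in DICTIONARY:
        return DICTIONARY[key]

    # Step 2: case-insensitive exact match against the preferred terms.
    lowered = {p.lower(): p for p in PREFERRED}
    if key in lowered:
        return lowered[key]

    # Step 3: similarity search; the cutoff is an arbitrary illustrative value.
    matches = difflib.get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None
```

Trying the dictionary first keeps known variants deterministic, and the similarity search only handles spellings the dictionary has not seen, such as a dropped hyphen.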


Subjects
Biomarkers , Clinical Decision-Making , Immunohistochemistry , Models, Theoretical , Natural Language Processing , Neoplasms/metabolism , Neoplasms/pathology , Neoplasms/surgery , Biomarkers/metabolism , Humans
3.
J Am Soc Nephrol ; 26(6): 1426-33, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25343954

ABSTRACT

Although renal hyperfiltration (RHF) or an abnormal increase in GFR has been associated with many lifestyles and clinical conditions, including diabetes, its clinical consequence is not clear. RHF is frequently considered to be the result of overestimating true GFR in subjects with muscle wasting. To evaluate the association between RHF and mortality, 43,503 adult Koreans who underwent voluntary health screening at Seoul National University Hospital between March of 1995 and May of 2006 with baseline GFR≥60 ml/min per 1.73 m² were followed up for mortality until December 31, 2012. GFR was estimated with the Chronic Kidney Disease Epidemiology Collaboration creatinine equation, and RHF was defined as GFR>95th percentile after adjustment for age, sex, muscle mass, and history of diabetes and/or hypertension medication. Muscle mass was measured with bioimpedance analysis at baseline. During the median follow-up of 12.4 years, 1743 deaths occurred. The odds ratio of RHF in participants with the highest quartile of muscle mass was 1.31 (95% confidence interval [95% CI], 1.11 to 1.54) compared with the lowest quartile after adjusting for confounding factors, including body mass index. The hazard ratio of all-cause mortality for RHF was 1.37 (95% CI, 1.11 to 1.70) by Cox proportional hazards model with adjustment for known risk factors, including smoking. These data suggest RHF may be associated with increased all-cause mortality in an apparently healthy population. The possibility of RHF as a novel marker of all-cause mortality should be confirmed.


Subjects
Cause of Death , Kidney Glomerulus/physiopathology , Life Style , Renal Insufficiency, Chronic/mortality , Adult , Age Factors , Aged , Anthropometry , Body Mass Index , Comorbidity , Female , Glomerular Filtration Rate , Humans , Incidence , Kidney Glomerulus/metabolism , Male , Mass Screening/methods , Middle Aged , Odds Ratio , Proportional Hazards Models , Renal Insufficiency, Chronic/physiopathology , Republic of Korea , Retrospective Studies , Risk Factors , Severity of Illness Index , Sex Factors , Smoking/epidemiology , Survival Analysis
4.
Appl Clin Inform ; 13(3): 521-531, 2022 05.
Article in English | MEDLINE | ID: mdl-35705182

ABSTRACT

BACKGROUND: Cancer staging information is an essential component of cancer research. However, the information is primarily stored as either a full or semistructured free-text clinical document, which limits data use. By transforming the cancer-specific data to the Observational Medical Outcome Partnership Common Data Model (OMOP CDM), the information can contribute to establishing multicenter observational cancer studies. To the best of our knowledge, there have been no studies on OMOP CDM transformation and natural language processing (NLP) for thyroid cancer to date. OBJECTIVE: We aimed to demonstrate the applicability of the OMOP CDM oncology extension module for thyroid cancer diagnosis and cancer stage information by processing free-text medical reports. METHODS: Thyroid cancer diagnosis and stage-related modifiers were extracted with rule-based NLP from 63,795 thyroid cancer pathology reports and 56,239 iodine whole-body scan reports from three medical institutions in the Observational Health Data Sciences and Informatics data network. The data were converted into the OMOP CDM v6.0 according to the OMOP CDM oncology extension module. The cancer staging group was derived and populated using the transformed CDM data. RESULTS: The extracted thyroid cancer data were completely converted into the OMOP CDM. The distributions of histopathological types of thyroid cancer were approximately 95.3 to 98.8% of papillary carcinoma, 0.9 to 3.7% of follicular carcinoma, 0.04 to 0.54% of adenocarcinoma, 0.17 to 0.81% of medullary carcinoma, and 0 to 0.3% of anaplastic carcinoma. Regarding cancer staging, stage-I thyroid cancer accounted for 55 to 64% of the cases, while stage III accounted for 24 to 26% of the cases. Stage-II and -IV thyroid cancers were detected at a low rate of 2 to 6%.
CONCLUSION: As a first study on OMOP CDM transformation and NLP for thyroid cancer, this study will help other institutions to standardize thyroid cancer-specific data for retrospective observational research and participate in multicenter studies.


Subjects
Carcinoma, Neuroendocrine , Thyroid Neoplasms , Databases, Factual , Electronic Health Records , Humans , Retrospective Studies , Thyroid Neoplasms/diagnosis
5.
Article in English | MEDLINE | ID: mdl-18255361

ABSTRACT

We developed a quantitative method for the determination of methyl esterase activity, analyzing substrate specificity against three major signal molecules, jasmonic acid methyl ester (MeJA), salicylic acid methyl ester (MeSA), and indole-3-acetic acid methyl ester (MeIAA). We used a silylation reagent for chemical derivatization and used gas chromatography (GC)-mass spectroscopy in analyses, for high precision. To test this method, an Arabidopsis esterase gene, AtME8, was expressed in Escherichia coli, and then the kinetic parameters of the recombinant enzyme were determined for three substrates. Finally, this method was also applied to the direct quantification of phytohormones in petals from lilies and roses.


Subjects
Carboxylic Ester Hydrolases/analysis , Arabidopsis/chemistry , Chromatography, Thin Layer , Escherichia coli/chemistry , Gas Chromatography-Mass Spectrometry , Hydrogen-Ion Concentration , Indicators and Reagents , Kinetics , Plant Growth Regulators/analysis , Plants/chemistry , Reference Standards , Trimethylsilyl Compounds