Results 1 - 6 of 6
1.
BMC Med Inform Decis Mak ; 19(1): 141, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31340796

ABSTRACT

BACKGROUND: Usage of structured fields in Electronic Health Records (EHRs) to ascertain smoking history is important but fails in capturing the nuances of smoking behaviors. Knowledge of smoking behaviors, such as pack year history and most recent cessation date, allows care providers to select the best care plan for patients at risk of smoking attributable diseases. METHODS: We developed and evaluated a health informatics pipeline for identifying complete smoking history from clinical notes in EHRs. We utilized 758 patient-visit notes (from visits between 03/28/2016 and 04/04/2016) from our local EHR in addition to a public dataset of 502 clinical notes from the 2006 i2b2 Challenge to assess the performance of this pipeline. We used a machine-learning classifier to extract smoking status and a comprehensive set of text processing regular expressions to extract pack years and cessation date information from these clinical notes. RESULTS: We identified smoking status with an F1 score of 0.90 on both the i2b2 and local data sets. Regular expression identification of pack year history in the local test set was 91.7% sensitive and 95.2% specific, but due to variable context the pack year extraction was incomplete in 25% of cases, extracting packs per day or years smoked only. Regular expression identification of cessation date was 63.2% sensitive and 94.6% specific. CONCLUSIONS: Our work indicates that the development of an EHR-based Smokers' Registry containing information relating to smoking behaviors, not just status, from free-text clinical notes using an informatics pipeline is feasible. This pipeline is capable of functioning in external EHRs, reducing the amount of time and money needed at the institute-level to create a Smokers' Registry for improved identification of patient risk and eligibility for preventative and early detection services.
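As an illustration of the regular-expression step described in the abstract above, the following is a minimal sketch only. The abstract does not give the pipeline's actual patterns, so the two expressions and the function names `extract_pack_years` and `extract_quit_year` are hypothetical and cover only a few common phrasings.

```python
import re

# Hypothetical patterns for illustration; not the expressions used in the paper.
# Matches e.g. "30 pack-year", "12.5 pack years", "40 pk yrs".
PACK_YEAR_RE = re.compile(
    r"(\d+(?:\.\d+)?)[\s-]*(?:pack|pk)[\s-]*(?:years?|yrs?)",
    re.IGNORECASE,
)
# Matches e.g. "quit in 2010", "quit smoking 2005" (four-digit year only).
QUIT_YEAR_RE = re.compile(
    r"\bquit(?:\s+smoking)?\s+(?:in\s+)?((?:19|20)\d{2})\b",
    re.IGNORECASE,
)


def extract_pack_years(note):
    """Return the first pack-year value found in the note, or None."""
    match = PACK_YEAR_RE.search(note)
    return float(match.group(1)) if match else None


def extract_quit_year(note):
    """Return the first cessation year found in the note, or None."""
    match = QUIT_YEAR_RE.search(note)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    note = "Former smoker, 30 pack-year history, quit in 2010."
    print(extract_pack_years(note), extract_quit_year(note))
```

A note such as "1.5 ppd x 20 years" yields no pack-year value with these patterns, which mirrors the kind of incomplete extraction the abstract reports when only packs per day or years smoked are written down.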


Subject(s)
Algorithms ; Cigarette Smoking/epidemiology ; Electronic Health Records ; Registries ; Datasets as Topic ; Humans ; Machine Learning ; Medical Informatics ; Natural Language Processing
2.
BMC Med Inform Decis Mak ; 19(1): 143, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31345210

ABSTRACT

BACKGROUND: Approximately 20% of deaths in the US each year are attributable to smoking, yet current practices in the recording of this health risk in electronic health records (EHRs) have not led to discernable changes in health outcomes. Several groups have developed algorithms for extracting smoking behaviors from clinical notes, but none of these approaches were assessed with external data to report on anticipated clinical performance. METHODS: Previously, we developed an informatics pipeline that extracts smoking status, pack year history, and cessation date from clinical notes. Here we report on the clinical implementation performance of our pipeline using 1,504 clinical notes matched to an external questionnaire. RESULTS: We found that 73% of available notes contained no smoking behavior information. The weighted Cohen's kappa between the external questionnaire and EHR smoking status was 0.62 (95% CI 0.56-0.69) for the clinical notes we were able to extract information from. The correlation between pack years reported by our pipeline and the external questionnaire was 0.39 on the 81 notes for which this information was present in both. We also assessed for lung cancer screening eligibility using notes from individuals identified as never smokers or smokers with pack year history extracted by our pipeline (n = 196). We found a positive predictive value of 85.4%, a negative predictive value of 83.8%, sensitivity of 63.1%, and specificity of 94.7%. CONCLUSIONS: We have demonstrated that our pipeline can extract smoking behaviors from unannotated EHR notes when the information is present. This information is reliable enough to identify patients most likely to be eligible for smoking related services. Ensuring capture of smoking information during clinical encounters should continue to be a high priority.
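The four screening-eligibility figures in the abstract above all derive from one 2x2 confusion matrix. The sketch below shows the arithmetic. The abstract does not report the underlying counts; the values used here (41 true positives, 7 false positives, 24 false negatives, 124 true negatives) are an assumption, chosen only because they sum to n = 196 and reproduce the reported percentages after rounding. The function name `screening_metrics` is likewise hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic metrics from 2x2 confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # eligible patients correctly flagged
        "specificity": tn / (tn + fp),  # ineligible patients correctly excluded
        "ppv": tp / (tp + fp),          # flagged patients who are truly eligible
        "npv": tn / (tn + fn),          # excluded patients who are truly ineligible
    }


if __name__ == "__main__":
    # Assumed counts (not stated in the abstract), consistent with n = 196.
    metrics = screening_metrics(tp=41, fp=7, fn=24, tn=124)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

The gap between sensitivity and specificity reflects the trade-off stated in the abstract: patients flagged as eligible are usually correct, but a sizeable share of truly eligible patients are missed.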


Subject(s)
Algorithms ; Cigarette Smoking ; Electronic Health Records ; Information Storage and Retrieval/methods ; Natural Language Processing ; Adult ; Early Detection of Cancer ; Humans ; Lung Neoplasms/diagnosis ; Medical Records Systems, Computerized ; Registries ; Surveys and Questionnaires
3.
Res Sq ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-39041033

ABSTRACT

Spatial proteomics enables detailed analysis of tissue at single-cell resolution. However, creating reliable segmentation masks and assigning accurate phenotypes to individual cells can be challenging. We introduce IMmuneCite, a computational framework for comprehensive image pre-processing and single-cell dataset creation, focused on defining complex immune landscapes when using spatial proteomics platforms. We demonstrate that IMmuneCite facilitates the identification of 32 discrete immune cell phenotypes using data from human liver samples while substantially reducing nonbiological cell clusters arising from co-localization of markers for different cell lineages. We established its versatility and ability to accommodate any antibody panel and different species by applying IMmuneCite to data from murine liver tissue. This approach enabled deep characterization of different functional states in each immune compartment, uncovering key features of the immune microenvironment in clinical liver transplantation and murine hepatocellular carcinoma. In conclusion, we demonstrated that IMmuneCite is a user-friendly, integrated computational platform that facilitates investigation of the immune microenvironment across species, while ensuring the creation of an immune-focused, spatially resolved single-cell proteomic dataset to provide high-fidelity, biologically relevant analyses.

4.
Lung Cancer ; 166: 242-249, 2022 04.
Article in English | MEDLINE | ID: mdl-35378489

ABSTRACT

OBJECTIVES: Targeted RNA-based Next-Generation Sequencing (tRNA-seq) is increasingly being used in molecular diagnostics for gene fusion detection in non-small cell lung cancer (NSCLC). However, few data support its clinical application for the detection of single nucleotide variants (SNVs) and small insertions/deletions. In this study, we evaluated the performance of tRNA-seq using Archer FusionPlex for simultaneous detection of actionable gene fusions, splice variants, SNVs and indels in formalin-fixed, paraffin-embedded NSCLC tissue. MATERIALS AND METHODS: A total of 126 NSCLC samples, including 20 validation samples and 106 diagnostic cases, were analyzed by targeted DNA-based Next-Generation Sequencing (tDNA-seq) followed by tRNA-seq. RESULTS: All 28 SNVs and indels in the validation set, and 34 out of 35 mutations in the diagnostic set were identified by tRNA-seq. The only mutation undetected by tRNA-seq, ERBB2 p.(Ser310Tyr), was not included in the current Archer panel design. tRNA-seq revealed one additional BRAF p.(Val600Glu) mutation not found by tDNA-seq. SNVs and indels were correctly called by the vendor supplied software, except for ERBB2 duplication p.(Tyr772_A775dup) which was only detected by an additional in-house developed bio-informatics pipeline. Variant allelic frequency (VAF) values were generally higher at the expression level compared to the genomic level (range 6-96% for tRNA-seq versus 6-61% for tDNA-seq) and low VAF mutations in DNA (6-8% VAF) were all confirmed by tRNA-seq. Finally, tRNA-seq additionally identified a driver fusion or splice variant in 10 diagnostic NSCLC samples including one MET exon 14 skipping variant not detected by tDNA-seq. 
CONCLUSION: Our results demonstrate that tRNA-seq can be implemented in a diagnostic setting as an efficient strategy for simultaneous detection of actionable gene fusions, splice variants, SNVs and indels in NSCLC provided that adequate RNA-seq analysis tools are available, especially for the detection of indels. This approach allows upfront identification of currently recommended targetable molecular alterations in NSCLC samples.


Subject(s)
Carcinoma, Non-Small-Cell Lung ; Lung Neoplasms ; Carcinoma, Non-Small-Cell Lung/diagnosis ; Carcinoma, Non-Small-Cell Lung/genetics ; DNA ; High-Throughput Nucleotide Sequencing/methods ; Humans ; Lung Neoplasms/diagnosis ; Lung Neoplasms/genetics ; Mutation ; Sequence Analysis, RNA/methods
5.
PeerJ ; 7: e6374, 2019.
Article in English | MEDLINE | ID: mdl-30723633

ABSTRACT

The pig is a well-studied model animal of biomedical and agricultural importance. Genes of this species, Sus scrofa, are known from experiments and predictions, and are collected in the NCBI Reference Sequence database. Gene reconstruction from transcribed gene evidence of RNA-seq can now accurately and completely reproduce the biological gene sets of animals and plants. Such a gene set for the pig is reported here, including human orthologs missing from current NCBI and Ensembl reference pig gene sets, additional alternate transcripts, and other improvements. The methodology used for accurate and complete gene set reconstruction from RNA is the automated SRA2Genes pipeline of the EvidentialGene project.

6.
Appl Transl Genom ; 3(3): 60-7, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-27284505

ABSTRACT

The field of medical genomics involves translating high-throughput genetic methods to the clinic in order to improve diagnostic efficiency and treatment decision making. Technical questions related to sample enrichment, sequencing methodologies, and variant identification and calling algorithms still need careful investigation in order to validate the analytical step of next-generation sequencing techniques for clinical applications. However, the main foreseeable challenge will be interpreting the clinical significance of the variants observed in a given patient, as well as their significance for family members and for other patients. Every step in the variant interpretation process has limitations and difficulties, and contributes its share of false positive and false negative results. No single piece of evidence is sufficient on its own to support firm conclusions on the pathogenicity and disease causality of a given variant. A plethora of automated analysis software tools is being developed that will enhance efficiency and accuracy. However, a risk of misinterpretation could derive from biased biorepository content, facilitated by annotation of variant functional consequences using previous datasets stored in the same or linked repositories. In order to improve variant interpretation and avoid an exponential accumulation of confounding noise in the medical literature, the use of terms in a standard way should be sought and requested when reporting genetic variants and their consequences. Generally, stepwise and linear interpretation processes are likely to overrate some pieces of evidence while underrating others. Algorithms are needed that allow a multidimensional, parallel analysis of diverse lines of evidence to be carried out by expert teams for specific genes, cellular pathways or disorders.
