ABSTRACT
The aim of this study was to evaluate the feasibility of a noninvasive, low-operator-dependence imaging method for carotid artery stenosis diagnosis. A previously developed prototype for 3D ultrasound scans, based on a standard ultrasound machine and a pose-reading sensor, was used for this study. Working in a 3D space and processing data using automatic segmentation lowers operator dependency. Additionally, ultrasound imaging is a noninvasive diagnostic method. Artificial intelligence (AI)-based automatic segmentation of the acquired data was performed for the reconstruction and visualization of the scanned area: the carotid artery wall, the carotid artery lumen, soft plaque, and calcified plaque. A qualitative evaluation was conducted by comparing the ultrasound reconstruction results with the CT angiographies of healthy patients and patients with carotid artery disease. The overall scores for automated segmentation using the MultiResUNet model across all segmented classes in our study were 0.80 for the IoU and 0.94 for the Dice coefficient. The present study demonstrated the potential of the MultiResUNet-based model for automated 2D ultrasound image segmentation for atherosclerosis diagnosis purposes. Using 3D ultrasound reconstructions may help operators achieve better spatial orientation and evaluation of segmentation results.
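The IoU and Dice scores reported above are standard overlap metrics for comparing a predicted segmentation mask against a ground-truth mask. A minimal sketch of how they are typically computed for a binary mask, using toy arrays rather than the study's data:

```python
import numpy as np

def iou_and_dice(pred, target):
    """Compute IoU and Dice coefficient for two binary segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / total if total else 1.0
    return iou, dice

# Toy 4x4 masks: each marks 3 pixels, 2 of which overlap
pred = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
target = np.array([[1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]])
iou, dice = iou_and_dice(pred, target)  # intersection=2, union=4 -> IoU=0.5, Dice=2*2/6
```

In multi-class settings such as the four classes above, these scores are usually computed per class and then averaged.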
Subjects
Artificial Intelligence , Computed Tomography Angiography , Humans , Thyroid Gland , Carotid Arteries/diagnostic imaging , Ultrasonography/methods , Intelligence , Imaging, Three-Dimensional/methods
ABSTRACT
BACKGROUND: Automated language analysis of radiology reports using natural language processing (NLP) can provide valuable information on patients' health and disease. With its rapid development, NLP studies should have transparent methodology to allow comparison of approaches and reproducibility. This systematic review aims to summarise the characteristics and reporting quality of studies applying NLP to radiology reports. METHODS: We searched Google Scholar for studies published in English that applied NLP to radiology reports of any imaging modality between January 2015 and October 2019. At least two reviewers independently performed screening and completed data extraction. We specified 15 criteria relating to data source, datasets, ground truth, outcomes, and reproducibility for quality assessment. The primary NLP performance measures were precision, recall, and F1 score. RESULTS: Of the 4,836 records retrieved, we included 164 studies that used NLP on radiology reports. The commonest clinical applications of NLP were disease information or classification (28%) and diagnostic surveillance (27.4%). Most studies used English radiology reports (86%). Reports from mixed imaging modalities were used in 28% of the studies. Oncology (24%) was the most frequent disease area. Most studies had a dataset size > 200 (85.4%), but the proportions of studies that described their annotated, training, validation, and test sets were 67.1%, 63.4%, 45.7%, and 67.7%, respectively. About half of the studies reported precision (48.8%) and recall (53.7%). Few studies reported external validation (10.8%), data availability (8.5%), or code availability (9.1%). There was no pattern of performance associated with overall reporting quality. CONCLUSIONS: There is a range of potential clinical applications for NLP of radiology reports in health services and research. However, we found suboptimal reporting quality that precludes comparison, reproducibility, and replication.
Our results support the need for development of reporting standards specific to clinical NLP studies.
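Precision, recall, and F1, the primary NLP performance measures referred to above, are all derived from confusion-matrix counts. A minimal sketch with illustrative counts (not values from any reviewed study):

```python
def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical report-classification outcome:
# 90 reports correctly flagged, 10 incorrectly flagged, 30 missed
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)  # p=0.90, r=0.75, f1=9/11
```

F1 is the harmonic mean of precision and recall, which is why reporting only one of the two (as roughly half the reviewed studies do) leaves performance hard to compare.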
Subjects
Natural Language Processing , Radiography , Radiology/standards , Datasets as Topic , Humans , Reproducibility of Results , Research Report/standards
ABSTRACT
BACKGROUND: Patient-based analysis of social media is a growing research field that aims to deliver precision medicine, but it requires accurate classification of posts relating to patients' experiences. We motivate the need for this type of classification as a pre-processing step for further analysis of social media data, in the context of related work in this area. In this paper, we present experiments on a three-way document classification into patient voice, professional voice, or other. We present results for a convolutional neural network classifier trained on English data from two data sources (Reddit and Twitter) and two domains (cardiovascular and skin diseases). RESULTS: We found that document classification into patient voice, professional voice, or other can be done consistently by manual annotation (0.92 accuracy). Annotators agreed roughly equally for each domain (cardiovascular and skin), but they agreed more when annotating Reddit posts than Twitter posts. The best classification performance was obtained by training two separate classifiers, one for Reddit and one for Twitter posts; evaluated on in-source test data for both test sets combined, this setup achieved an overall accuracy of 0.95 (macro-averaged F1 of 0.92) and an F1-score of 0.95 for patient voice alone. CONCLUSION: The main conclusion of this work is that combining social media data from platforms with different characteristics to train a patient and professional voice classifier does not yield the best possible performance. We showed that it is best to train separate models per data source (Reddit and Twitter) rather than a single model on the combined training data from both sources. We also found that it is preferable to train separate models per domain (cardiovascular and skin), although the difference from the combined model is only minor (0.01 accuracy).
Our highest overall F1-score (0.95), obtained for classifying posts as patient voice, is a very good starting point for further analysis of social media data reflecting the experience of patients.
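The macro-averaged F1 reported above weights each of the three classes equally, regardless of how frequent each class is, which is why it can differ from overall accuracy. A minimal sketch for the three-way patient/professional/other labels, using toy predictions rather than the study's data:

```python
def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if (tp + fp) else 0.0
        rec = tp / (tp + fn) if (tp + fn) else 0.0
        f1_scores.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Toy gold labels and predictions for six posts
y_true = ["patient", "patient", "professional", "other", "patient", "other"]
y_pred = ["patient", "patient", "professional", "patient", "patient", "other"]
score = macro_f1(y_true, y_pred, ["patient", "professional", "other"])
```

Because the rare classes count as much as the dominant one, a classifier that ignores minority classes is penalised more by macro-F1 than by accuracy.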
Subjects
Social Media , Humans , Neural Networks, Computer , Precision Medicine
ABSTRACT
BACKGROUND: Natural language processing (NLP) has a significant role in advancing healthcare and has been found to be key to extracting structured information from radiology reports. Understanding recent developments in the application of NLP to radiology is therefore important, but recent reviews of this area are limited. This study systematically assesses and quantifies recent literature on NLP applied to radiology reports. METHODS: We conducted an automated literature search yielding 4,836 results, using automated filtering, metadata-enrichment steps, and citation search combined with manual review. Our analysis is based on 21 variables, including radiology characteristics, NLP methodology, performance, and study and clinical application characteristics. RESULTS: We present a comprehensive analysis of the 164 publications retrieved, with publications in 2019 almost triple those in 2015. Each publication is categorised into one of six clinical application categories. Deep learning use increased over the period, but conventional machine learning approaches remained prevalent. Deep learning remains challenged when data are scarce, and there is little evidence of adoption into clinical practice. Although 17% of studies report F1 scores greater than 0.85, it is hard to evaluate these approaches comparatively, given that most use different datasets. Only 14 studies made their data available and 15 their code, with 10 externally validating their results. CONCLUSIONS: Automated understanding of the clinical narratives in radiology reports has the potential to enhance the healthcare process, and we show that research in this field continues to grow. Reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code, enabling validation of methods on different institutional data, and to reduce heterogeneity in the reporting of study properties, allowing inter-study comparisons.
Our results are significant for researchers in the field, providing a systematic synthesis of existing work to build on and helping to identify gaps and opportunities for collaboration while avoiding duplication.
Subjects
Radiology Information Systems , Radiology , Humans , Machine Learning , Natural Language Processing , Reproducibility of Results
ABSTRACT
The classic ultrasonographic differentiation between benign and malignant adnexal masses encounters several limitations. Ultrasonography-based texture analysis (USTA) offers a new perspective, but its role has been incompletely evaluated. This study aimed to further investigate USTA's capacity to differentiate benign from malignant adnexal tumors and to compare the workflow and results with previously published research. A total of 123 adnexal lesions (88 benign, 35 malignant) were retrospectively included. The USTA was performed using dedicated software. By applying three reduction techniques, the 23 features with the highest discriminatory potential were selected. The features' ability to identify ovarian malignancies was evaluated through univariate, multivariate, and receiver operating characteristic analyses, as well as with a k-nearest neighbor (KNN) classifier. Three parameters were independent predictors of ovarian neoplasms (sum variance and two variations of the sum of squares). Benign and malignant lesions were differentiated with 90.48% sensitivity and 93.1% specificity by the prediction model (which included the three independent predictors), and with 71.43-80% sensitivity and 87.5-89.77% specificity by the KNN classifier. The USTA shows statistically significant differences between the textures of the two groups, but it is unclear whether the parameters reflect the true histopathological characteristics of adnexal lesions.
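The KNN classification step described above assigns each lesion the majority label among its k closest training samples in feature space. A minimal sketch using hypothetical two-dimensional texture vectors, not the study's 23 selected parameters or its software:

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training samples (Euclidean distance)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical texture features for six lesions: 0 = benign, 1 = malignant
X_train = np.array([[0.10, 0.20], [0.20, 0.10], [0.15, 0.25],
                    [0.90, 0.80], [0.85, 0.90], [0.95, 0.85]])
y_train = [0, 0, 0, 1, 1, 1]
label = knn_predict(X_train, y_train, np.array([0.88, 0.82]), k=3)  # -> 1 (malignant)
```

In practice, features are usually standardised first, so that no single texture parameter dominates the distance computation.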
ABSTRACT
BACKGROUND AND AIMS: While there is an increasing emphasis on the value of interdisciplinarity in medical humanities scholarship, it is unknown to what extent historians and clinicians work jointly in medical history. We aimed to quantify evidence of joint working in the authorship of medical history papers. METHODS: Observational survey of authorship. We studied authorship data for all papers published in the three major medical history journals between 2009 and 2019 (n = 634). RESULTS: The majority of medical history papers are written by single authors with single disciplinary affiliations (68%), most commonly history (65%); fewer than one paper in seven (14%) shows evidence of joint working between disciplines in authorship. A minority of papers (8%) are written by authors with primary medical affiliations. Almost three-quarters (71%) of papers have an acknowledgements section, but only 6% show clear evidence of joint working between disciplines in the acknowledgements. CONCLUSIONS: Scholarship engaging both historians and clinicians is rare in medical history journals. Possible solutions include enhanced research collaborations between historians and clinicians, interdisciplinary educational seminars, and cross-institutional knowledge exchanges.
Subjects
Authorship , Fellowships and Scholarships , Health Personnel , Humans , Knowledge , Publications
ABSTRACT
How does scientific research affect the world around us? Being able to answer this question is of great importance in order to channel efforts and resources in science appropriately. The impact of scientists in academia is currently measured by citation-based metrics such as the h-index, i-index, and citation counts. These academic metrics aim to represent the dissemination of knowledge among scientists rather than the impact of the research on the wider world. In this work, we are interested in measuring scientific impact beyond academia: on the economy, society, health, and legislation (comprehensive impact). Indeed, scientists are asked to demonstrate evidence of such comprehensive impact by authoring case studies in the context of the Research Excellence Framework (REF). We first investigate the extent to which existing citation-based metrics can be indicative of comprehensive impact. We collected all recent REF impact case studies from 2014 and linked them to papers in citation networks that we constructed and derived from CiteSeerX, arXiv, and PubMed Central using a number of text processing and information retrieval techniques. We demonstrate that existing citation-based metrics for impact measurement do not correlate well with REF impact results. We also consider metrics of online attention surrounding scientific works, such as those provided by the Altmetric API. We argue that in order to evaluate wider non-academic impact we need to mine information from a much wider set of resources, including social media posts, press releases, news articles, and political debates stemming from academic work. We also provide our data as a free and reusable collection for further analysis, including the PubMed citation network and the correspondence between REF case studies, grant applications, and the academic literature.
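The h-index mentioned above is defined as the largest h such that an author has h papers with at least h citations each. A minimal sketch of the computation, using toy citation counts:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # the rank-th most-cited paper still has >= rank citations
        else:
            break
    return h

# Toy example: five papers with these citation counts
h = h_index([10, 8, 5, 4, 3])  # -> 4: four papers have at least 4 citations each
```

As the abstract argues, such counts capture dissemination among scientists but say nothing about economic, societal, health, or legislative impact.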