ABSTRACT
Grain boundary (GB) migration in polycrystalline materials necessarily implies the concurrent motion of triple junctions (TJs), the lines along which three GBs meet. Today, we understand that GB migration occurs through the motion of disconnections in the GB plane (line defects with both step and dislocation character). We present evidence from molecular dynamics grain growth simulations and idealized microstructures that demonstrates that TJ motion and GB migration are coupled through disconnection dynamics. Based on these results, we develop a theory of coupled GB/TJ migration and use it to construct a physically based, disconnection mechanism-specific continuum model of microstructure evolution. The continuum approach provides a means of reducing the complexity of the discrete disconnection picture to extract the features of disconnection dynamics that are important for microstructure evolution. We implement this model in a numerical, continuum simulation and demonstrate that it is capable of reproducing the molecular dynamics (MD) simulation results.
ABSTRACT
Mass spectrometry imaging can produce large amounts of complex spectral and spatial data. Such data sets are often analyzed with unsupervised machine learning approaches, which aim at reducing their complexity and facilitating their interpretation. However, choices made during data processing can impact the overall interpretation of these analyses. This work investigates the impact of the choices made at the peak selection step, which often occurs early in the data processing pipeline. The discussion is framed in terms of visualization and interpretation of the results of two commonly used unsupervised approaches: t-distributed stochastic neighbor embedding and k-means clustering, which differ in nature and complexity. Criteria considered for peak selection include those based on hypotheses (exemplified herein in the analysis of metabolic alterations in genetically engineered mouse models of human colorectal cancer), particular molecular classes, and ion intensity. The results suggest that the choices made at the peak selection step have a significant impact on the visual interpretation of the results of either dimensionality reduction or clustering techniques, and consequently on any downstream analysis that relies on these. Of particular significance, the results of this work show that while using the most abundant ions can result in interesting structure-related segmentation patterns that correlate well with histological features, using a smaller number of ions specifically selected based on prior knowledge about the biochemistry of the tissues under investigation can result in an easier-to-interpret, potentially more valuable, hypothesis-confirming result. Findings presented will help researchers understand and better utilize unsupervised machine learning approaches to mine high-dimensionality data.
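As a rough illustration of the pipeline described above, the sketch below applies intensity-based peak selection followed by k-means segmentation to synthetic data. The array sizes, the gamma-distributed intensities, and the choice of 50 peaks and 4 clusters are placeholder assumptions, not values from the study.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic "mass spectrometry image": 20x20 pixels, 500 peaks per spectrum.
n_pixels, n_peaks = 400, 500
spectra = rng.gamma(shape=2.0, scale=1.0, size=(n_pixels, n_peaks))

# Intensity-based peak selection: keep the 50 most abundant peaks on average.
mean_intensity = spectra.mean(axis=0)
top_idx = np.argsort(mean_intensity)[-50:]
selected = spectra[:, top_idx]

# Segment pixels into tissue-like regions with k-means on the selected peaks.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(selected)
segmentation = kmeans.labels_.reshape(20, 20)
```

Swapping the intensity criterion for a hypothesis-driven list of ion indices changes only the `top_idx` line, which is precisely the peak-selection choice whose downstream effects the study examines.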
ABSTRACT
In polycrystalline materials, grain boundaries are sites of enhanced atomic motion, but the complexity of the atomic structures within a grain boundary network makes it difficult to link the structure and atomic dynamics. Here, we use a machine learning technique to establish a connection between local structure and dynamics of these materials. Following previous work on bulk glassy materials, we define a purely structural quantity (softness) that captures the propensity of an atom to rearrange. This approach correctly identifies crystalline regions, stacking faults, and twin boundaries as having low likelihood of atomic rearrangements while finding a large variability within high-energy grain boundaries. As has been found in glasses, the probability that atoms of a given softness will rearrange is nearly Arrhenius. This indicates a well-defined energy barrier as well as a well-defined prefactor for the Arrhenius form for atoms of a given softness. The decrease in the prefactor for low-softness atoms indicates that variations in entropy exhibit a dominant influence on the atomic dynamics in grain boundaries.
ABSTRACT
Introduction: Renal transplant biopsies provide insights into graft health and support decision making. The current evidence on links between biopsy scores and transplant outcomes suggests there may be numerous factors affecting biopsy scores. Here we adopt a measurement science approach to investigate the sources of uncertainty in biopsy assessment and suggest techniques to improve its robustness. Methods: Histological assessments, Remuzzi scores, biopsy processing and clinical variables were obtained from 144 repeat biopsies originating from 16 deceased-donor kidneys. We conducted a sensitivity analysis to find the morphometric features with the highest discriminatory power and studied the dependencies of these features on biopsy and stain type. The analysis results formed a basis for recommendations on reducing the assessment variability. Results: Most morphometric variables are influenced by the biopsy and stain types. The variables with the highest discriminatory power are sclerotic glomeruli counts, healthy glomeruli counts per unit area, percentages of interstitial fibrosis and tubular atrophy, as well as the diameter and lumen of the worst artery. A revised glomeruli adequacy score is proposed to improve the robustness of the glomeruli statistics, whereby a minimum of 10⁴ µm² of cortex tissue is recommended to keep type 1 and type 2 error probabilities below 0.15 and 0.2. Discussion: The findings are transferable to several biopsy scoring systems. We hope that this work will help practitioners to understand the sources of statistical uncertainty and improve the utility of renal biopsy.
ABSTRACT
Molecular imaging is a key tool in the diagnosis and treatment of prostate cancer (PCa). Magnetic resonance (MR) plays a major role in this respect, with nuclear medicine imaging, particularly prostate-specific membrane antigen-based (PSMA-based) positron emission tomography with computed tomography (PET/CT), of rapidly increasing importance. Another key technology finding growing application across medicine, and specifically in molecular imaging, is machine learning (ML) and artificial intelligence (AI). Several authoritative reviews cover the role of AI in MR-based molecular imaging, but reviews of its role in PET/CT remain sparse. This review therefore focuses on the use of AI for molecular imaging of PCa. It aims to achieve two goals: first, to give the reader an introduction to the AI technologies available; and second, to provide an overview of AI applied to PET/CT in PCa. The clinical applications include diagnosis, staging, target volume definition for treatment planning, outcome prediction and outcome monitoring. ML and AI techniques discussed include radiomics, convolutional neural networks (CNN), generative adversarial networks (GAN) and training methods: supervised, unsupervised and semi-supervised learning.
ABSTRACT
Background: Preimplantation biopsy combines measurements of injury into a composite index to inform organ acceptance. The uncertainty in these measurements remains poorly characterized, raising concerns that variability may contribute to inappropriate clinical decisions. Methods: We adopted a metrological approach to evaluate biopsy score reliability. Variability was assessed by performing repeat biopsies (n = 293) on discarded allografts (n = 16) using 3 methods (core, punch, and wedge). Uncertainty was quantified using a bootstrapping analysis. Observer effects were controlled by semi-blinded scoring, and the findings were validated by comparison with standard glass evaluation. Results: The surgical method strongly determined the size (core biopsy area 9.04 mm², wedge 37.9 mm²) and, therefore, yield (glomerular yield r = 0.94, arterial r = 0.62) of each biopsy. Core biopsies yielded inadequate slides most frequently. Repeat biopsy of the same kidney led to marked variation in biopsy scores. In 10 of 16 cases, scores were contradictory, crossing at least 1 decision boundary (i.e., to transplant or to discard). Bootstrapping demonstrated significant uncertainty associated with single-slide assessment; however, scores were similar for paired kidneys from the same donor. Conclusions: Our investigation highlights the risks of relying on single-slide assessment to quantify organ injury. Biopsy evaluation is subject to uncertainty, meaning each slide is better conceptualized as providing an estimate of the kidney's condition rather than a definitive result. Pooling multiple assessments could improve the reliability of biopsy analysis, enhancing confidence. Where histological quantification is necessary, clinicians should seek to develop new protocols using more tissue and consider automated methods to assist pathologists in delivering analysis within clinical time frames.
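A bootstrapping analysis of the kind mentioned above can be sketched as follows; the per-slide scores, the resampling count, and the discard threshold in the comment are hypothetical placeholders, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical composite biopsy scores from repeat slides of one kidney
# (e.g. Remuzzi-like scores on a 0-12 scale).
slide_scores = np.array([4, 5, 3, 6, 4, 5, 7, 4])

# Bootstrap the mean score to quantify the uncertainty of the assessment.
n_boot = 10_000
boot_means = np.array([
    rng.choice(slide_scores, size=slide_scores.size, replace=True).mean()
    for _ in range(n_boot)
])
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

# A wide interval straddling a decision boundary (say, score >= 5 -> discard)
# signals that a single slide cannot support a confident accept/discard call.
```

Pooling assessments, as the conclusions suggest, corresponds to widening `slide_scores` with more slides, which narrows the bootstrap interval.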
ABSTRACT
Performing a mitosis count (MC) is a key task in histologically grading canine soft tissue sarcoma (cSTS). However, the MC is subject to inter- and intra-observer variability. Deep learning models can offer standardisation of the MC process used to histologically grade canine soft tissue sarcomas. Accordingly, the focus of this study was mitosis detection in canine perivascular wall tumours (cPWTs). Generating mitosis annotations is a long and arduous process open to inter-observer variability. Therefore, keeping pathologists in the loop, a two-step annotation process was performed: a pre-trained Faster R-CNN model was first trained on initial annotations provided by veterinary pathologists; the pathologists then reviewed the model's false positive mitosis candidates and determined whether any were overlooked mitoses, updating the dataset accordingly. Faster R-CNN was then retrained on this updated dataset. A decision threshold chosen on the validation set to maximise the F1-score was applied, producing our best F1-score of 0.75, which is competitive with the state of the art in the canine mitosis domain.
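The validation-set threshold selection described above can be sketched as a simple sweep over candidate confidence thresholds; the detector scores and labels below are invented placeholders, not outputs of the Faster R-CNN model.

```python
import numpy as np

def best_f1_threshold(scores, labels, thresholds):
    """Sweep candidate confidence thresholds and return the one maximising F1."""
    best_t, best_f1 = None, -1.0
    for t in thresholds:
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Hypothetical validation-set detection confidences and ground-truth labels.
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.2, 0.7])
labels = np.array([0,   0,   1,    1,   1,    1,   0,   0])
t, f1 = best_f1_threshold(scores, labels, np.linspace(0.05, 0.95, 19))
```

The threshold is fixed on the validation split and only then applied to held-out data, so the reported test F1 is not tuned on the test set.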
ABSTRACT
INTRODUCTION: Worldwide, pancreatic cancer has a poor prognosis. Early diagnosis may improve survival by enabling curative treatment. Statistical and machine learning diagnostic prediction models using risk factors such as patient demographics and blood tests are being developed for clinical use to improve early diagnosis. One example is the Enriching New-onset Diabetes for Pancreatic Cancer (ENDPAC) model, which employs patients' age, blood glucose and weight changes to provide pancreatic cancer risk scores. These values are routinely collected in primary care in the UK. Primary care's central role in cancer diagnosis makes it an ideal setting to implement ENDPAC, but the model has yet to be used in clinical settings. This study aims to determine the feasibility of applying ENDPAC to data held by UK primary care practices. METHODS AND ANALYSIS: This will be a multicentre observational study with a cohort design, determining the feasibility of applying ENDPAC in UK primary care. We will develop software to search, extract and process anonymised data from 20 primary care providers' electronic patient record management systems on participants aged 50+ years, with a glycated haemoglobin (HbA1c) test result of ≥48 mmol/mol (6.5%) and no previous abnormal HbA1c results. Software to calculate ENDPAC scores will be developed, and descriptive statistics will be used to summarise the cohort's demographics and assess data quality. Findings will inform the development of a future UK clinical trial to test ENDPAC's effectiveness for the early detection of pancreatic cancer. ETHICS AND DISSEMINATION: This project has been reviewed by the University of Surrey University Ethics Committee and received a favourable ethical opinion (FHMS 22-23151 EGA). Study findings will be presented at scientific meetings and published in international peer-reviewed journals. Participating primary care practices, clinical leads and policy makers will be provided with summaries of the findings.
Subjects
Diabetes Mellitus, Pancreatic Neoplasms, Humans, Feasibility Studies, Glycated Hemoglobin, Observational Studies as Topic, Primary Health Care, Risk Factors, Middle Aged, Multicenter Studies as Topic, Aged
ABSTRACT
Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71) and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). The effect of adding LLM-generated synthetic data to training varied across models and architectures, but it improved the performance of the smaller Flan-T5 models (ΔF1 +0.12 to +0.23). Our best fine-tuned models outperformed ChatGPT-family models in the zero- and few-shot settings, except GPT-4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p < 0.05). Our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. These results demonstrate the potential of LLMs to improve real-world evidence on SDoH and assist in identifying patients who could benefit from resource support.
ABSTRACT
We have analysed mental health data for in-patient admissions from 1997 to 2021 in Scotland. The number of admissions for mental health patients is declining despite population numbers increasing. This decline is driven by the adult population; child and adolescent numbers are consistent. We find that mental health in-patients are more likely to be from deprived areas: 33% of patients are from the most deprived areas, compared to only 11% from the least deprived. The average length of stay for a mental health in-patient is decreasing, with a rise in stays lasting less than a day. The number of mental health patients readmitted within a month fell from 1997 to 2011, then increased through 2021. Despite the average stay length decreasing, the overall number of readmissions is increasing, suggesting patients are having more, shorter stays.
Subjects
Mental Health, Patient Readmission, Adolescent, Adult, Child, Humans, Hospitalization, Patient Admission, Scotland/epidemiology
ABSTRACT
Self-supervised learning (SSL) has become a popular method for generating invariant representations without the need for human annotations. Nonetheless, the desired invariant representation is achieved by utilizing prior online transformation functions on the input data. As a result, each SSL framework is customized for a particular data type, for example, visual data, and further modifications are required if it is used for other dataset types. On the other hand, the autoencoder (AE), which is a generic and widely applicable framework, mainly focuses on dimension reduction and is not suited to learning invariant representations. This article proposes a generic SSL framework based on a constrained self-labeling assignment process that prevents degenerate solutions. Specifically, the prior transformation functions are replaced with a self-transformation mechanism, derived through an unsupervised process of adversarial training, for imposing invariant representations. Via the self-transformation mechanism, pairs of augmented instances can be generated from the same input data. Finally, a training objective based on contrastive learning is designed by leveraging both the self-labeling assignment and the self-transformation mechanism. Although the self-transformation process is very generic, the proposed training strategy outperforms a majority of state-of-the-art representation learning methods based on AE structures. To validate the performance of our method, we conduct experiments on four types of data, namely visual, audio, text, and mass spectrometry data, and compare them in terms of four quantitative metrics. Our comparison results demonstrate that the proposed method is effective and robust in identifying patterns within the tested datasets.
ABSTRACT
The definitive diagnosis of canine soft-tissue sarcomas (STSs) is based on histological assessment of formalin-fixed tissues. Assessment of parameters such as degree of differentiation, necrosis score and mitotic score gives rise to a final tumour grade, which is important in determining prognosis and subsequent treatment modalities. However, grading discrepancies are reported to occur in human and canine STSs, which can result in complications regarding treatment plans. The introduction of digital pathology has the potential to help improve STS grading via automated determination of the presence and extent of necrosis. The detected necrotic regions can be factored into the grading scheme or excluded before analysing the remaining tissue. Here we describe a method to detect tumour necrosis in histopathological whole-slide images (WSIs) of STSs using machine learning. Annotated areas of necrosis were extracted from WSIs, and the patches containing necrotic tissue were fed into a pre-trained DenseNet161 convolutional neural network (CNN) for training, testing and validation. The proposed CNN architecture reported favourable results, with an overall validation accuracy of 92.7% for necrosis detection, which represents the number of correctly classified data instances over the total number of data instances. The proposed method, when rigorously validated, represents a promising tool to assist pathologists in evaluating necrosis in canine STS tumours by increasing efficiency and accuracy and reducing inter-rater variation.
ABSTRACT
Image clustering has recently attracted significant attention due to the increased availability of unlabeled datasets. The efficiency of traditional clustering algorithms heavily depends on the distance functions used and the dimensionality of the features. Therefore, performance degradation is often observed when tackling either unprocessed images or high-dimensional features extracted from processed images. To deal with these challenges, we propose a deep clustering framework consisting of a modified generative adversarial network (GAN) and an auxiliary classifier. The modification employs Sobel operations prior to the discriminator of the GAN to enhance the separability of the learned features. The discriminator is then leveraged to generate representations as the input to an auxiliary classifier. An objective function is utilized to train the auxiliary classifier by maximizing the mutual information between the representations obtained via the discriminator model and the same representations perturbed via adversarial training. We further improve the robustness of the auxiliary classifier by introducing a penalty term into the objective function. This minimizes the divergence across multiple transformed representations generated by the discriminator model with a low dropout rate. The auxiliary classifier is implemented with a group of multiple cluster-heads, where a tolerance hyper-parameter is used to tackle imbalanced data. Our results indicate that the proposed method achieves competitive results compared with state-of-the-art clustering methods on a wide range of benchmark datasets including CIFAR-10, CIFAR-100/20, and STL10.
ABSTRACT
There is a global emergency in relation to mental health (MH) and healthcare. In the UK each year, 1 in 4 people will experience MH problems. Healthcare services are increasingly oversubscribed, and COVID-19 has deepened the healthcare gap. We investigated the effect of COVID-19 on waiting times for MH services in Scotland. We used national registers of MH services provided by Public Health Scotland. The results show that waiting times for adults and children increased drastically during the pandemic. This was seen nationally and across most of the administrative regions of Scotland. We find, however, that child and adolescent services were comparatively less impacted by the pandemic than adult services. This is potentially due to prioritisation of paediatric patients, or due to an increasing demand on adult services triggered by the pandemic itself.
Subjects
COVID-19, Mental Health Services, Adolescent, Adult, COVID-19/epidemiology, Child, Humans, Mental Health, Scotland/epidemiology, United Kingdom/epidemiology
ABSTRACT
Background: Pathology services experienced a surge in demand during the COVID-19 pandemic. Digitalisation of pathology workflows can help to increase throughput, yet many existing digitalisation solutions use non-standardised workflows captured in proprietary data formats and processed by black-box software, yielding data of varying quality. This study presents the views of a UK-led expert group on the barriers to adoption and the required input of measurement science to improve current practices in digital pathology. Methods: With an aim to support the UK's efforts in digitalisation of pathology services, this study comprised: (1) a review of existing evidence, (2) an online survey of domain experts, and (3) a workshop with 42 representatives from healthcare, regulatory bodies, the pharmaceutical industry, academia, and equipment and software manufacturers. The discussion topics included sample processing, data interoperability, image analysis, equipment calibration, and use of novel imaging modalities. Findings: The lack of data interoperability within digital pathology workflows hinders data lookup and navigation, according to 80% of attendees. All participants stressed the importance of integrating imaging and non-imaging data for diagnosis, while 80% saw data integration as a priority challenge. 90% recognised the benefits of artificial intelligence and machine learning but identified the need for training and sound performance metrics. Methods for calibration and for providing traceability were seen as essential to establish harmonised, reproducible sample processing and image acquisition pipelines. Vendor-neutral data standards were seen as a "must-have" for providing meaningful data for downstream analysis. Users and vendors need good practice guidance on evaluation of uncertainty, fitness-for-purpose, and reproducibility of artificial intelligence/machine learning tools. All of the above needs to be accompanied by an upskilling of the pathology workforce.
Conclusions: Digital pathology requires interoperable data formats, reproducible and comparable laboratory workflows, and trustworthy computer analysis software. Despite high interest in the use of novel imaging techniques and artificial intelligence tools, their adoption is slowed by the lack of guidance and evaluation tools to assess the suitability of these techniques for specific clinical questions. Measurement science expertise in uncertainty estimation, standardisation, reference materials, and calibration can help establish reproducibility and comparability between laboratory procedures, yielding high-quality data and providing higher confidence in diagnosis.
ABSTRACT
Necrosis seen in histopathology whole-slide images is a major criterion contributing to the tumour grade, which in turn determines treatment options. However, conventional manual assessment suffers from poor inter-operator reproducibility, impacting grading precision. To address this, automatic AI-based necrosis detection may be used to assess necrosis for the final scoring that contributes to the clinical grade. Using deep learning, we describe a novel approach for automating necrosis detection in whole-slide images, tested on a canine soft tissue sarcoma (cSTS) dataset consisting of canine perivascular wall tumours (cPWTs). A patch-based deep learning approach was developed in which different variations of training a DenseNet-161 convolutional neural network architecture were investigated, as well as a stacking ensemble. An optimised DenseNet-161 with post-processing produced a hold-out test F1-score of 0.708, demonstrating state-of-the-art performance. This represents the first automated necrosis detection method in the cSTS domain, and specifically in cPWTs, a significant step forward in reproducible and reliable necrosis assessment for improving the precision of tumour grading.
Subjects
Deep Learning, Neoplasms, Connective and Soft Tissue, Animals, Dogs, Necrosis, Neural Networks, Computer, Reproducibility of Results
ABSTRACT
In this work, we compare the performance of six state-of-the-art deep neural networks in classification tasks when using only image features to when these are combined with patient metadata. We utilise transfer learning from networks pretrained on ImageNet to extract image features from the ISIC HAM10000 dataset prior to classification. Using several classification performance metrics, we evaluate the effects of including metadata with the image features. Furthermore, we repeat our experiments with data augmentation. Our results show an overall enhancement in the performance of each network as assessed by all metrics, with degradation noted only in the VGG16 architecture. Our results indicate that this performance enhancement may be a general property of deep networks and should be explored in other areas. Moreover, these improvements come at a negligible additional cost in computation time, and therefore are a practical method for other applications.
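The feature-plus-metadata fusion evaluated above can be sketched as a simple concatenation ahead of a classifier head. The random arrays below stand in for backbone embeddings and metadata fields; the 512-dimensional feature size, the specific metadata fields, the site coding, and the logistic-regression head are illustrative assumptions, not the study's configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Stand-ins for features from an ImageNet-pretrained backbone
# (e.g. a 512-d penultimate-layer embedding per skin-lesion image).
n_samples = 200
image_features = rng.normal(size=(n_samples, 512))

# Patient metadata: age (numeric) and anatomical site (categorical, one-hot).
age = rng.integers(20, 90, size=(n_samples, 1)).astype(float)
site = rng.integers(0, 3, size=n_samples)  # hypothetical 3-way site coding
site_onehot = np.eye(3)[site]

# Fuse modalities by concatenating metadata onto the image features.
X = np.hstack([image_features, age, site_onehot])
y = rng.integers(0, 2, size=n_samples)     # placeholder binary labels

clf = LogisticRegression(max_iter=1000).fit(X, y)
```

Because the metadata adds only a handful of columns to a large feature vector, the extra training cost is negligible, consistent with the abstract's observation.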
Subjects
Metadata, Neural Networks, Computer, Humans, Machine Learning
ABSTRACT
The effect of the 2020 pandemic, and of the national measures introduced to control it, is not yet fully understood. The aim of this study was to investigate how different types of primary care data can help quantify the effect of the coronavirus disease (COVID-19) crisis on mental health. A retrospective cohort study investigated changes in weekly counts of mental health consultations and prescriptions. The data were extracted from one of the UK's largest primary care databases between January 1st 2015 and October 31st 2020 (end of follow-up). The 2020 trends were compared to the 2015-19 average with 95% confidence intervals using longitudinal plots and analysis of covariance (ANCOVA). A total of 504 practices (7,057,447 patients) contributed data. During the period of national restrictions, on average, there were 31% (3957 ± 269, p < 0.001) fewer events and 6% (4878 ± 1108, p < 0.001) more prescriptions per week compared to the 2015-19 average. The number of events was recovering, increasing by 75 (± 29, p = 0.012) per week. Prescriptions returned to the 2015-19 levels by the end of the study (p = 0.854). The significant reduction in the number of consultations represents part of the crisis. Future service planning and quality improvements are needed to reduce the negative effect on health and healthcare.
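The comparison of 2020 weekly counts against a 2015-19 baseline can be sketched as below. The Poisson-simulated counts and rates are invented stand-ins for the study's database extract, and a simple normal-approximation confidence band stands in for the full ANCOVA.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical weekly consultation counts: 5 baseline years x 52 weeks, and 2020.
baseline = rng.poisson(lam=12000, size=(5, 52))  # 2015-19
year_2020 = rng.poisson(lam=9000, size=52)       # restriction-depressed counts

# 2015-19 weekly mean with a 95% confidence band (normal approximation).
mean = baseline.mean(axis=0)
sem = baseline.std(axis=0, ddof=1) / np.sqrt(baseline.shape[0])
lower, upper = mean - 1.96 * sem, mean + 1.96 * sem

# Weeks in 2020 falling below the baseline band flag a significant shortfall.
shortfall_weeks = int(np.sum(year_2020 < lower))
relative_change = (year_2020.mean() - mean.mean()) / mean.mean()
```

A longitudinal plot of `year_2020` against the `(lower, upper)` band reproduces the kind of visual comparison the study describes, with `relative_change` summarising the average weekly deficit.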
Subjects
COVID-19, Mental Health, Humans, Prescriptions, Primary Health Care, Referral and Consultation, Retrospective Studies, SARS-CoV-2
ABSTRACT
Healthcare is increasingly and routinely generating large volumes of data from different sources, which are difficult to handle and integrate. Confidence in data can be established through the knowledge that the data are validated, well-curated and have minimal bias or errors. As the National Measurement Institute of the UK, the National Physical Laboratory (NPL) is running an interdisciplinary project on digital health data curation. The project addresses one of the key challenges of the UK's Measurement Strategy: to provide confidence in the intelligent and effective use of data. A workshop was organised by NPL in which key stakeholders from the NHS, industry and academia outlined the current and future challenges in healthcare data curation. This paper summarises the findings of the workshop and outlines NPL's views on how a metrological approach to the curation of healthcare data sets could help solve some of the important and emerging challenges of utilising healthcare data.
Subjects
Data Collection/methods, Medical Informatics/methods, Research Design/standards, Data Collection/standards, Diffusion of Innovation, Humans, Medical Informatics/standards, Metadata/standards, Telemedicine/methods, Telemedicine/standards, United Kingdom
ABSTRACT
Although routine healthcare data are not collected for research, they are increasingly used in epidemiology and are key real-world evidence for improving healthcare. This study presents a method to identify prostate cancer cases from a large English primary care database. 19,619 (1.3%) men had a code for prostate cancer diagnosis. Codes for medium and high Gleason grading enabled identification of an additional 94 (0.5%) cases. Many studies do not report the codes used to identify patients, and where published, the lists of codes differ from study to study. This can lead to poor research reproducibility and hinder validation. This work demonstrates that carefully developed, comprehensive lists of clinical codes can be used to identify prostate cancer, and that approaches that do not rely solely on clinical codes, such as ontologies or data linkage, should also be considered.