Results 1 - 20 of 49

1.
Mod Pathol ; 37(4): 100439, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38286221

ABSTRACT

This work puts forth and demonstrates the utility of a reporting framework for collecting and evaluating annotations of medical images used for training and testing artificial intelligence (AI) models in assisting detection and diagnosis. AI has unique reporting requirements, as shown by the AI extensions to the Consolidated Standards of Reporting Trials (CONSORT) and Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) checklists and the proposed AI extensions to the Standards for Reporting Diagnostic Accuracy (STARD) and Transparent Reporting of a Multivariable Prediction model for Individual Prognosis or Diagnosis (TRIPOD) checklists. AI for detection and/or diagnostic image analysis requires complete, reproducible, and transparent reporting of the annotations and metadata used in training and testing data sets. In an earlier work by other researchers, an annotation workflow and quality checklist for computational pathology annotations were proposed. In this manuscript, we operationalize this workflow into an evaluable quality checklist that applies to any reader-interpreted medical images, and we demonstrate its use for an annotation effort in digital pathology. We refer to this quality framework as the Collection and Evaluation of Annotations for Reproducible Reporting of Artificial Intelligence (CLEARR-AI).


Subject(s)
Artificial Intelligence; Checklist; Humans; Prognosis; Image Processing, Computer-Assisted; Research Design
2.
Histopathology ; 84(6): 915-923, 2024 May.
Article in English | MEDLINE | ID: mdl-38433289

ABSTRACT

A growing body of research supports stromal tumour-infiltrating lymphocyte (TIL) density in breast cancer as a robust prognostic and predictive biomarker. The gold standard for stromal TIL density quantitation in breast cancer is pathologist visual assessment using haematoxylin and eosin-stained slides. Artificial intelligence/machine-learning algorithms are in development to automate the stromal TIL scoring process and must be validated against a reference standard such as pathologist visual assessment. Visual TIL assessment may suffer from significant interobserver variability. To improve interobserver agreement, regulatory science experts at the US Food and Drug Administration partnered with academic pathologists internationally to create a freely available online continuing medical education (CME) course that trains pathologists to assess breast cancer stromal TILs in an interactive format with expert commentary. Here we describe and provide a user guide to this CME course, whose content was designed to improve pathologist accuracy in scoring breast cancer TILs. We also suggest subsequent steps to translate this knowledge into clinical practice through proficiency testing.


Subject(s)
Breast Neoplasms; Humans; Female; Pathologists; Lymphocytes, Tumor-Infiltrating; Artificial Intelligence; Prognosis
3.
J Pathol ; 261(4): 378-384, 2023 12.
Article in English | MEDLINE | ID: mdl-37794720

ABSTRACT

Quantifying tumor-infiltrating lymphocytes (TILs) in breast cancer tumors is a challenging task for pathologists. With the advent of whole slide imaging that digitizes glass slides, it is possible to apply computational models to quantify TILs for pathologists. Development of computational models requires significant time, expertise, consensus, and investment. To reduce this burden, we are preparing a dataset for developers to validate their models and a proposal to the Medical Device Development Tool (MDDT) program in the Center for Devices and Radiological Health of the U.S. Food and Drug Administration (FDA). If the FDA qualifies the dataset for its submitted context of use, model developers can use it in a regulatory submission within the qualified context of use without additional documentation. Our dataset aims to reduce the regulatory burden placed on developers of models that estimate the density of TILs and will allow head-to-head comparison of multiple computational models on the same data. In this paper, we discuss the MDDT preparation and submission process, including the feedback we received from our initial interactions with the FDA, and propose how a qualified MDDT validation dataset could be a mechanism for open, fair, and consistent measures of computational model performance. Our experiences will help the community understand what the FDA considers relevant and appropriate (from the perspective of the submitter), at the early stages of the MDDT submission process, for validating stromal TIL density estimation models and other potential computational models. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.


Subject(s)
Lymphocytes, Tumor-Infiltrating; Pathologists; United States; Humans; United States Food and Drug Administration; Lymphocytes, Tumor-Infiltrating/pathology; United Kingdom
4.
Stat Med ; 34(4): 685-703, 2015 Feb 20.
Article in English | MEDLINE | ID: mdl-25399736

ABSTRACT

The area under the receiver operating characteristic curve is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of the area under the receiver operating characteristic curve, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimates and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study.
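As a concrete illustration of the concordance measure this abstract builds on, the following stdlib-only Python sketch computes Harrell's C index over usable pairs (the toy data are hypothetical; tied risk scores are counted as one half):

```python
import itertools

def harrell_c(times, events, risk):
    """Harrell's C index for a right-censored outcome.

    A pair is usable when the subject with the shorter observed time
    actually had the event; it is concordant when that subject also
    has the higher risk score.  Tied scores count as one half.
    """
    concordant, usable = 0.0, 0
    for i, j in itertools.combinations(range(len(times)), 2):
        a, b = (i, j) if times[i] < times[j] else (j, i)  # a fails/censors first
        if times[a] == times[b] or not events[a]:
            continue  # tied times or early censoring: pair not usable
        usable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5
    return concordant / usable

# hypothetical data: higher risk scores paired with earlier events
times  = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]   # 1 = event observed, 0 = censored
risk   = [0.9, 0.7, 0.6, 0.4, 0.1]
print(harrell_c(times, events, risk))  # perfectly concordant -> 1.0
```

Comparing two such C estimates with a z-score, as the paper proposes, additionally requires the variance and covariance estimators developed there; this sketch covers only the point estimate.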


Subject(s)
Biostatistics/methods; Statistics, Nonparametric; Algorithms; Area Under Curve; Biomarkers; Cardiovascular Diseases/etiology; Computer Simulation; Humans; Models, Statistical; Multivariate Analysis; Prospective Studies; ROC Curve; Survival Analysis
5.
BMC Med Res Methodol ; 13: 98, 2013 Jul 29.
Article in English | MEDLINE | ID: mdl-23895587

ABSTRACT

BACKGROUND: The surge in biomarker development calls for research on statistical evaluation methodology to rigorously assess emerging biomarkers and classification models. Recently, several authors reported the puzzling observation that, in assessing the added value of new biomarkers to existing ones in a logistic regression model, statistical significance of new predictor variables does not necessarily translate into a statistically significant increase in the area under the ROC curve (AUC). Vickers et al. concluded that this inconsistency is because AUC "has vastly inferior statistical properties," i.e., it is extremely conservative. This statement is based on simulations that misuse the DeLong et al. method. Our purpose is to provide a fair comparison of the likelihood ratio (LR) test and the Wald test versus diagnostic accuracy (AUC) tests. DISCUSSION: We present a test to compare ideal AUCs of nested linear discriminant functions via an F test. We compare it with the LR test and the Wald test for the logistic regression model. The null hypotheses of these three tests are equivalent; however, the F test is an exact test whereas the LR test and the Wald test are asymptotic tests. Our simulation shows that the F test has the nominal type I error even with a small sample size. Our results also indicate that the LR test and the Wald test have inflated type I errors when the sample size is small, while the type I error converges to the nominal value asymptotically with increasing sample size as expected. We further show that the DeLong et al. method tests a different hypothesis and has the nominal type I error when it is used within its designed scope. Finally, we summarize the pros and cons of all four methods we consider in this paper. SUMMARY: We show that there is nothing inherently less powerful or disagreeable about ROC analysis for showing the usefulness of new biomarkers or characterizing the performance of classification models. Each statistical method for assessing biomarkers and classification models has its own strengths and weaknesses. Investigators need to choose methods based on the assessment purpose, the biomarker development phase at which the assessment is being performed, the available patient data, and the validity of assumptions behind the methodologies.
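For intuition about the ideal AUCs of nested linear discriminant functions discussed above, here is a small Python sketch (a simplifying assumption: two Gaussian classes with identity covariance, so the Mahalanobis distance reduces to the Euclidean distance between class means; the marker effect sizes are hypothetical):

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def ideal_auc(mean_diff):
    """Ideal-observer AUC for two equal-covariance Gaussian classes.

    With identity covariance, AUC = Phi(Delta / sqrt(2)), where Delta
    is the Mahalanobis (here Euclidean) distance between class means.
    """
    delta = sqrt(sum(d * d for d in mean_diff))
    return norm_cdf(delta / sqrt(2.0))

# nested models: existing biomarkers vs. existing plus one new marker
auc_old = ideal_auc([1.0, 0.5])        # two existing markers
auc_new = ideal_auc([1.0, 0.5, 0.3])   # add a weakly informative marker
print(auc_old < auc_new)  # True: any nonzero mean shift raises the ideal AUC
```

Whether that increase is statistically detectable at a given sample size is exactly what the F, LR, and Wald tests compared in the paper address.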


Subject(s)
Biomarkers; Models, Statistical; Predictive Value of Tests; Area Under Curve; Humans; Likelihood Functions; Logistic Models
6.
J Med Imaging (Bellingham) ; 10(5): 051804, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37361549

ABSTRACT

Purpose: To introduce developers to medical device regulatory processes and data considerations in artificial intelligence and machine learning (AI/ML) device submissions and to discuss ongoing AI/ML-related regulatory challenges and activities. Approach: AI/ML technologies are being used in an increasing number of medical imaging devices, and the fast evolution of these technologies presents novel regulatory challenges. We provide AI/ML developers with an introduction to U.S. Food and Drug Administration (FDA) regulatory concepts, processes, and fundamental assessments for a wide range of medical imaging AI/ML device types. Results: The device type and appropriate premarket regulatory pathway for an AI/ML device are based on the level of risk associated with the device and are informed by both its technological characteristics and intended use. AI/ML device submissions contain a wide array of information and testing to facilitate the review process; the model description, data, nonclinical testing, and multi-reader multi-case testing are critical aspects of the review for many AI/ML device submissions. The agency is also involved in AI/ML-related activities that support guidance document development, good machine learning practice development, AI/ML transparency, AI/ML regulatory research, and real-world performance assessment. Conclusion: FDA's AI/ML regulatory and scientific efforts support the joint goals of ensuring patients have access to safe and effective AI/ML devices over the entire device lifecycle and stimulating medical AI/ML innovation.

7.
Stat Methods Med Res ; 31(11): 2069-2086, 2022 11.
Article in English | MEDLINE | ID: mdl-35790462

ABSTRACT

The area under the receiver operating characteristic curve (AUC) is widely used in evaluating diagnostic performance for many clinical tasks. It is still challenging to evaluate the reading performance of distinguishing between positive and negative regions of interest (ROIs) in the nested-data problem, where multiple ROIs are nested within the cases. To address this issue, we identify two kinds of AUC estimators, within-cases AUC and between-cases AUC. We focus on the between-cases AUC estimator, since our main research interest is in patient-level diagnostic performance rather than location-level performance (the ability to separate ROIs with and without disease within each patient). Another reason is that as the case number increases, the number of between-cases paired ROIs is much larger than the number of within-cases ROIs. We provide estimators for the variance of the between-cases AUC and for the covariance when there are two readers. We derive and prove the above estimators' theoretical values based on a simulation model and characterize their behavior using Monte Carlo simulation results. We also provide a real-data example. Moreover, we connect the distribution-based simulation model with the simulation model based on the linear mixed-effect model, which helps better understand the sources of variation in the simulated dataset.
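The distinction between the two estimators can be made concrete with a stdlib-only sketch (the scores are hypothetical; each ROI is a (case, truth, score) triple, and the empirical AUC is restricted to positive/negative ROI pairs drawn either from different cases or from the same case):

```python
import itertools

# hypothetical ROIs: (case_id, truth, reader_score); truth 1 = diseased ROI
rois = [
    (1, 1, 0.9), (1, 0, 0.2),
    (2, 1, 0.3), (2, 0, 0.4),
    (3, 1, 0.8), (3, 0, 0.1),
]

def paired_auc(rois, between_cases):
    """Empirical AUC over positive/negative ROI pairs, keeping only
    pairs from different cases (between) or the same case (within)."""
    pos = [r for r in rois if r[1] == 1]
    neg = [r for r in rois if r[1] == 0]
    num, den = 0.0, 0
    for (cp, _, sp), (cn, _, sn) in itertools.product(pos, neg):
        if (cp != cn) != between_cases:
            continue  # pair belongs to the other estimator
        den += 1
        num += 1.0 if sp > sn else (0.5 if sp == sn else 0.0)
    return num / den

print(paired_auc(rois, between_cases=True))   # between-cases AUC -> 1.0
print(paired_auc(rois, between_cases=False))  # within-cases AUC -> 2/3
```

As the abstract observes, the number of between-cases pairs (here 6) quickly outgrows the number of within-cases pairs (here 3) as cases accrue.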


Subject(s)
Area Under Curve; Humans; ROC Curve; Monte Carlo Method; Computer Simulation; Linear Models
8.
Cancers (Basel) ; 14(10)2022 May 17.
Article in English | MEDLINE | ID: mdl-35626070

ABSTRACT

The High Throughput Truthing project aims to develop a dataset for validating artificial intelligence and machine learning (AI/ML) models fit for regulatory purposes. The context of this AI/ML validation dataset is the reporting of stromal tumor-infiltrating lymphocytes (sTILs) density evaluations in hematoxylin and eosin-stained invasive breast cancer biopsy specimens. After completing the pilot study, we found notable variability in the sTILs estimates as well as inconsistencies and gaps in the training provided to pathologists. Using the pilot study data and an expert panel, we created custom training materials to improve pathologist annotation quality for the pivotal study. We categorized regions of interest (ROIs) based on their mean sTILs density and selected ROIs with the highest and lowest sTILs variability. In a series of eight one-hour sessions, the expert panel reviewed each ROI and provided verbal density estimates and comments on features that confounded the sTILs evaluation. We aggregated and shaped the comments to identify pitfalls and instructions to improve our training materials. From these selected ROIs, we created a training set and a proficiency test set to improve pathologist training, with the goal of improving data collection for the pivotal study. We are not exploring AI/ML performance in this paper. Instead, we are creating materials that will train crowd-sourced pathologists to be the reference standard in a pivotal study to create an AI/ML model validation dataset. The issues discussed here are also important for clinicians to understand about the evaluation of sTILs in clinical practice and can provide insight to developers of AI/ML models.

9.
J Med Imaging (Bellingham) ; 9(4): 047501, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35911208

ABSTRACT

Purpose: Validation of artificial intelligence (AI) algorithms in digital pathology with a reference standard is necessary before widespread clinical use, but few examples focus on creating a reference standard based on pathologist annotations. This work assesses the results of a pilot study that collects density estimates of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer biopsy specimens. This work will inform the creation of a validation dataset for the evaluation of AI algorithms fit for a regulatory purpose. Approach: Collaborators and crowdsourced pathologists contributed glass slides, digital images, and annotations. Here, "annotations" refer to any marks, segmentations, measurements, or labels a pathologist adds to a report, image, region of interest (ROI), or biological feature. Pathologists estimated sTILs density in 640 ROIs from hematoxylin and eosin stained slides of 64 patients via two modalities: an optical light microscope and two digital image viewing platforms. Results: The pilot study generated 7373 sTILs density estimates from 29 pathologists. Analysis of the annotations found that the variability of density estimates per ROI increases with the mean; the root mean square differences were 4.46, 14.25, and 26.25 as the mean density ranged from 0% to 10%, 11% to 40%, and 41% to 100%, respectively. The pilot study informs three areas of improvement for future work: technical workflows, annotation platforms, and agreement analysis methods. Upgrades to the workflows and platforms will improve operability and increase annotation speed and consistency. Conclusions: Exploratory data analysis demonstrates the need to develop new statistical approaches for agreement. The pilot study dataset and analysis methods are publicly available to allow community feedback. The development and results of the validation dataset will be publicly available to serve as an instructive tool that can be replicated by developers and researchers.
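The reported pattern of spread growing with the mean can be illustrated with a stdlib-only sketch (the reader estimates are hypothetical, and the RMS deviation of reader estimates from the ROI mean is used as one simple per-ROI variability measure, a stand-in for the paper's root mean square differences):

```python
from math import sqrt

# hypothetical sTILs density estimates (%) from several readers per ROI
roi_estimates = {
    "low_density_roi":  [2, 4, 3],
    "mid_density_roi":  [20, 35, 26],
    "high_density_roi": [40, 75, 62],
}

def rms_deviation(estimates):
    """Root-mean-square deviation of reader estimates from the ROI mean."""
    m = sum(estimates) / len(estimates)
    return sqrt(sum((x - m) ** 2 for x in estimates) / len(estimates))

for name, est in roi_estimates.items():
    print(name, round(rms_deviation(est), 2))  # spread grows with the mean
```

Stratifying such a measure by mean-density bands (0-10%, 11-40%, 41-100%) reproduces the kind of summary the abstract reports.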

10.
JNCI Cancer Spectr ; 6(1)2022 01 05.
Article in English | MEDLINE | ID: mdl-35699495

ABSTRACT

Medical image interpretation is central to detecting, diagnosing, and staging cancer and many other disorders. At a time when medical imaging is being transformed by digital technologies and artificial intelligence, understanding the basic perceptual and cognitive processes underlying medical image interpretation is vital for increasing diagnosticians' accuracy and performance, improving patient outcomes, and reducing diagnostician burnout. Medical image perception remains substantially understudied. In September 2019, the National Cancer Institute convened a multidisciplinary panel of radiologists and pathologists together with researchers working in medical image perception and adjacent fields of cognition and perception for the "Cognition and Medical Image Perception Think Tank." The Think Tank's key objectives were to identify critical unsolved problems related to visual perception in pathology and radiology from the perspective of diagnosticians, discuss how these clinically relevant questions could be addressed through cognitive and perception research, identify barriers and solutions for transdisciplinary collaborations, define ways to elevate the profile of cognition and perception research within the medical image community, determine the greatest needs to advance medical image perception, and outline future goals and strategies to evaluate progress. The Think Tank emphasized diagnosticians' perspectives as the crucial starting point for medical image perception research, with diagnosticians describing their interpretation process and identifying perceptual and cognitive problems that arise. This article reports the deliberations of the Think Tank participants to address these objectives and highlight opportunities to expand research on medical image perception.


Subject(s)
Artificial Intelligence; Radiology; Cognition; Diagnostic Imaging; Humans; Radiology/methods; Visual Perception
11.
J Opt Soc Am A Opt Image Sci Vis ; 28(6): 1145-63, 2011 Jun 01.
Article in English | MEDLINE | ID: mdl-21643400

ABSTRACT

Current clinical practice is rapidly moving in the direction of volumetric imaging. For two-dimensional (2D) images, task-based medical image quality is often assessed using numerical model observers. For three-dimensional (3D) images, however, these models have so far been little explored. In this work, first, two novel designs of a multislice channelized Hotelling observer (CHO) are proposed for the task of detecting 3D signals in 3D images. The novel designs are then compared and evaluated in a simulation study with five different CHO designs: a single-slice model, three multislice models, and a volumetric model. Four different random background statistics are considered, both Gaussian (uncorrelated and correlated Gaussian noise) and non-Gaussian (lumpy and clustered lumpy backgrounds). Overall, the results show that the volumetric model outperforms the others, while the disparity between the models decreases for greater complexity of the detection task. Among the multislice models, the second proposed CHO could most closely approach the volumetric model, whereas the first new CHO seems to be least affected by the number of training samples.
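A minimal single-slice CHO can be sketched in stdlib Python. Everything here is hypothetical and far simpler than the paper's designs: two hand-made channels instead of a proper channel set, white Gaussian backgrounds, a centre-pixel signal, and a resubstitution AUC rather than an independent test set:

```python
import random

random.seed(0)
N = 9                          # 3x3 "images", flattened
channels = [
    [1.0] * N,                       # DC channel
    [1, 0, -1, 1, 0, -1, 1, 0, -1],  # crude horizontal-gradient channel
]
signal = [0.0] * N
signal[4] = 2.0                # additive bump in the centre pixel

def channelize(img):
    """Project an image onto the channel responses."""
    return [sum(u * g for u, g in zip(ch, img)) for ch in channels]

def make_samples(with_signal, n=200):
    out = []
    for _ in range(n):
        img = [random.gauss(0.0, 1.0) for _ in range(N)]
        if with_signal:
            img = [g + s for g, s in zip(img, signal)]
        out.append(channelize(img))
    return out

def mean(vs):
    return [sum(col) / len(vs) for col in zip(*vs)]

def cov(vs, m):
    """2x2 sample covariance of the channel outputs."""
    c = [[0.0, 0.0], [0.0, 0.0]]
    for v in vs:
        d = [v[0] - m[0], v[1] - m[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j] / len(vs)
    return c

neg, pos = make_samples(False), make_samples(True)
m0, m1 = mean(neg), mean(pos)
c0, c1 = cov(neg, m0), cov(pos, m1)
s = [[(c0[i][j] + c1[i][j]) / 2.0 for j in range(2)] for i in range(2)]
det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
sinv = [[s[1][1] / det, -s[0][1] / det],
        [-s[1][0] / det, s[0][0] / det]]
dm = [m1[k] - m0[k] for k in range(2)]
w = [sinv[k][0] * dm[0] + sinv[k][1] * dm[1] for k in range(2)]  # Hotelling template

def t_stat(v):
    return w[0] * v[0] + w[1] * v[1]

# resubstitution AUC of the CHO test statistic
auc = sum(1.0 for p in pos for n in neg if t_stat(p) > t_stat(n)) / (len(pos) * len(neg))
print(auc > 0.5)  # the observer detects the signal better than chance
```

The multislice and volumetric variants compared in the paper differ mainly in how the channels are defined across slices; the template and test statistic follow the same pattern.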


Subject(s)
Imaging, Three-Dimensional/methods; Models, Theoretical; Quality Control
12.
J Pathol Inform ; 12: 45, 2021.
Article in English | MEDLINE | ID: mdl-34881099

ABSTRACT

PURPOSE: Validating artificial intelligence algorithms for clinical use in medical images is a challenging endeavor due to a lack of standard reference data (ground truth). This topic typically occupies a small portion of the discussion in research papers since most of the efforts are focused on developing novel algorithms. In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images. We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer. METHODS: We digitized 64 glass slides of hematoxylin- and eosin-stained invasive ductal carcinoma core biopsies prepared at a single clinical site. A collaborating pathologist selected 10 regions of interest (ROIs) per slide for evaluation. We created training materials and workflows to crowdsource pathologist image annotations on two modes: an optical microscope and two digital platforms. The microscope platform allows the same ROIs to be evaluated in both modes. The workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and if appropriate, the sTIL density value for that ROI. RESULTS: In total, 19 pathologists made 1645 ROI evaluations during a data collection event and the following 2 weeks. The pilot study yielded an abundant number of cases with nominal sTIL infiltration. Furthermore, we found that the sTIL densities are correlated within a case, and there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm. CONCLUSION: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will investigate methods to use the dataset as an external validation tool for algorithms. We will also consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the Food and Drug Administration via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.

13.
J Pathol Inform ; 11: 22, 2020.
Article in English | MEDLINE | ID: mdl-33042601

ABSTRACT

Unlocking the full potential of pathology data by gaining computational access to histological pixel data and metadata (digital pathology) is one of the key promises of computational pathology. Despite scientific progress and several regulatory approvals for primary diagnosis using whole-slide imaging, true clinical adoption at scale is slower than anticipated. In the U.S., advances in digital pathology are often siloed pursuits by individual stakeholders, and to our knowledge, there has not been a systematic approach to advance the field through a regulatory science initiative. The Alliance for Digital Pathology (the Alliance) is a recently established, volunteer, collaborative, regulatory science initiative to standardize digital pathology processes to speed up innovation to patients. The purpose is: (1) to account for the patient perspective by including patient advocacy; (2) to investigate and develop methods and tools for the evaluation of effectiveness, safety, and quality to specify risks and benefits in the precompetitive phase; (3) to help strategize the sequence of clinically meaningful deliverables; (4) to encourage and streamline the development of ground-truth data sets for machine learning model development and validation; and (5) to clarify regulatory pathways by investigating relevant regulatory science questions. The Alliance accepts participation from all stakeholders, and we solicit clinically relevant proposals that will benefit the field at large. The initiative will dissolve once a clinical, interoperable, modularized, integrated solution (from tissue acquisition to diagnostic algorithm) has been implemented. In times of rapidly evolving discoveries, scientific input from subject-matter experts is one essential element to inform regulatory guidance and decision-making. The Alliance aims to establish and promote synergistic regulatory science efforts that will leverage diverse inputs to move digital pathology forward and ultimately improve patient care.

14.
NPJ Breast Cancer ; 6: 17, 2020.
Article in English | MEDLINE | ID: mdl-32411819

ABSTRACT

Stromal tumor-infiltrating lymphocytes (sTILs) are important prognostic and predictive biomarkers in triple-negative (TNBC) and HER2-positive breast cancer. Incorporating sTILs into clinical practice necessitates reproducible assessment. Previously developed standardized scoring guidelines have been widely embraced by the clinical and research communities. We evaluated sources of variability in sTIL assessment by pathologists in three previous sTIL ring studies. We identify common challenges and evaluate the impact of discrepancies on outcome estimates in early TNBC using a newly developed prognostic tool. Discordant sTIL assessment is driven by heterogeneity in lymphocyte distribution. Additional factors include technical slide-related issues; scoring outside the tumor boundary; tumors with minimal assessable stroma; inclusion of lymphocytes associated with other structures; and inclusion of other inflammatory cells. Small variations in sTIL assessment modestly alter risk estimation in early TNBC but have the potential to affect treatment selection if cutpoints are employed. Scoring and averaging multiple areas, as well as use of reference images, improve consistency of sTIL evaluation. Moreover, to assist in avoiding the pitfalls identified in this analysis, we developed an educational resource available at www.tilsinbreastcancer.org/pitfalls.

15.
J Med Imaging (Bellingham) ; 6(1): 015501, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30713851

ABSTRACT

We investigated effects of prevalence and case distribution on radiologist diagnostic performance as measured by area under the receiver operating characteristic curve (AUC) and sensitivity-specificity in lab-based reader studies evaluating imaging devices. Our retrospective reader studies compared full-field digital mammography (FFDM) to screen-film mammography (SFM) for women with dense breasts. Mammograms were acquired from the prospective Digital Mammographic Imaging Screening Trial. We performed five reader studies that differed in terms of cancer prevalence and the distribution of noncancers. Twenty radiologists participated in each reader study. Using split-plot study designs, we collected recall decisions and multilevel scores from the radiologists for calculating sensitivity, specificity, and AUC. Differences in reader-averaged AUCs slightly favored SFM over FFDM (biggest AUC difference: 0.047, SE = 0.023, p = 0.047), where the standard error accounts for reader and case variability. The differences were not significant at a level of 0.01 (0.05/5 reader studies). The differences in sensitivities and specificities were also indeterminate. Prevalence had little effect on AUC (largest difference: 0.02), whereas sensitivity increased and specificity decreased as prevalence increased. We found that AUC is robust to changes in prevalence, while radiologists were more aggressive with recall decisions as prevalence increased.
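The robustness of AUC to prevalence can be demonstrated with a stdlib-only simulation (hypothetical score distributions; because the empirical AUC conditions on truth, changing the positive fraction leaves its expectation unchanged):

```python
import random

random.seed(1)

def empirical_auc(pos, neg):
    """Probability a random diseased score beats a random non-diseased
    score, with ties counted as one half."""
    wins = sum(1.0 if p > n else (0.5 if p == n else 0.0)
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sample_study(n_total, prevalence):
    """Draw a reader-study sample with the given cancer prevalence."""
    n_pos = int(n_total * prevalence)
    pos = [random.gauss(1.0, 1.0) for _ in range(n_pos)]   # diseased scores
    neg = [random.gauss(0.0, 1.0) for _ in range(n_total - n_pos)]
    return pos, neg

auc_low = empirical_auc(*sample_study(2000, 0.1))    # 10% prevalence
auc_high = empirical_auc(*sample_study(2000, 0.5))   # 50% prevalence
print(round(auc_low, 2), round(auc_high, 2))  # nearly equal estimates
```

Sensitivity and specificity, by contrast, move with the reader's operating point, which is exactly the prevalence-driven shift in recall behavior the study observed.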

16.
Diagn Pathol ; 14(1): 65, 2019 Jun 26.
Article in English | MEDLINE | ID: mdl-31238983

ABSTRACT

BACKGROUND: The establishment of whole-slide imaging (WSI) as a medical diagnostic device allows pathologists to evaluate mitotic activity with this new technology. Furthermore, image digitalization provides an opportunity to develop algorithms for automatic quantification, ideally leading to improved reproducibility compared with naked-eye examination by pathologists. To implement such algorithms effectively, the accuracy of mitotic figure detection using WSI should be investigated. In this study, we aimed to measure pathologist performance in detecting mitotic figures (MFs) using multiple platforms (multiple scanners) and compare the results with those obtained using a brightfield microscope. METHODS: Four slides of canine oral melanoma were prepared and digitized using 4 WSI scanners. In these slides, 40 regions of interest (ROIs) were demarcated, and five observers identified the MFs using different viewing modes: microscopy and WSI. We evaluated the inter- and intra-observer agreements between modes with Cohen's Kappa and determined "true" MFs with a consensus panel. We then assessed the accuracy (agreement with truth) using the average of sensitivity and specificity. RESULTS: In the 40 ROIs, 155 candidate MFs were detected by five pathologists; 74 of them were determined to be true MFs. Inter- and intra-observer agreement was mostly "substantial" or greater (Kappa = 0.594-0.939). Accuracy was between 0.632 and 0.843 across all readers and modes. After averaging over readers for each modality, we found that mitosis detection accuracy for 3 of the 4 WSI scanners was significantly less than that of the microscope (p = 0.002, 0.012, and 0.001). CONCLUSIONS: This study is the first to compare WSI and microscopy in detecting MFs at the level of individual cells. Our results suggest that WSI can be used for mitotic cell detection and offers reproducibility similar to the microscope, with slightly less accuracy.
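The two agreement measures used in this study can be sketched in stdlib Python (the mitotic-figure calls below are hypothetical; accuracy is computed as the average of sensitivity and specificity against the consensus truth, as in the abstract):

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two binary raters (1 = mitotic figure called)."""
    n = len(a)
    po = sum(1 for x, y in zip(a, b) if x == y) / n   # observed agreement
    pa, pb = sum(a) / n, sum(b) / n
    pe = pa * pb + (1 - pa) * (1 - pb)                # chance agreement
    return (po - pe) / (1 - pe)

def balanced_accuracy(truth, calls):
    """Average of sensitivity and specificity against consensus truth."""
    tp = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 1)
    tn = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 0)
    sens = tp / sum(truth)
    spec = tn / (len(truth) - sum(truth))
    return (sens + spec) / 2

# hypothetical candidate figures: consensus truth and two raters' calls
truth  = [1, 1, 1, 1, 0, 0, 0, 0]
rater1 = [1, 1, 1, 0, 0, 0, 0, 1]
rater2 = [1, 1, 0, 0, 0, 0, 0, 1]
print(cohens_kappa(rater1, rater2))      # 0.75, i.e. "substantial" agreement
print(balanced_accuracy(truth, rater1))  # 0.75
```

On the usual Landis-Koch scale, kappa between 0.61 and 0.80 is read as "substantial" agreement, the range most of the study's comparisons fell into.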


Subject(s)
Dog Diseases/pathology; Melanoma/pathology; Mouth Neoplasms/pathology; Animals; Dog Diseases/drug therapy; Dogs; Image Interpretation, Computer-Assisted; Melanoma/diagnosis; Microscopy; Mitosis; Mouth Neoplasms/diagnosis; Observer Variation; Pathologists; Reproducibility of Results
17.
Acad Pathol ; 6: 2374289519859841, 2019.
Article in English | MEDLINE | ID: mdl-31321298

ABSTRACT

Validating digital pathology as a substitute for conventional microscopy in diagnosis remains a priority to assure effectiveness. Intermodality concordance studies typically focus on achieving the same diagnosis by digital display of whole slide images and by conventional microscopy. Assessment of discrete histological features in whole slide images, such as mitotic figures, has not been thoroughly evaluated in diagnostic practice. To further gauge the interchangeability of conventional microscopy with digital display for primary diagnosis, 12 pathologists examined 113 naturally occurring canine mucosal melanomas exhibiting a wide range of mitotic activity. The study design reflected diverse diagnostic settings and investigated independent location, interpretation, and enumeration of mitotic figures. Intermodality agreement was assessed using conventional microscopy (CM40×) and whole slide image specimens scanned at 20× (WSI20×) and 40× (WSI40×) objective magnifications. An aggregate of 1647 mitotic figure count observations was available from conventional microscopy and whole slide images for comparison. The intraobserver concordance rate of paired observations was 0.785 to 0.801; the interobserver rate was 0.784 to 0.794. Correlation coefficients between the 2 digital modes, and between each and conventional microscopy, were similar and suggest noninferiority among modalities, including whole slide images acquired at the lower 20× resolution. As mitotic figure counts serve for prognostic grading of several tumor types, including melanoma, 6 of 8 pathologists retrospectively predicted survival prognosis using whole slide images, compared with 9 of 10 by conventional microscopy, a first evaluation of whole slide images for mitotic figure prognostic grading. This study demonstrated agreement of replicate reads obtained across conventional microscopy and whole slide images. Hence, quantifying mitotic figures served as a surrogate histological feature with which to further credential the interchangeability of whole slide images for primary diagnosis.
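The agreement statistic above can be illustrated with a minimal sketch. The function name, the sample counts, and the zero-tolerance definition of "concordance" are illustrative assumptions; the study's actual statistic may have been defined differently.

```python
# Sketch: concordance rate of paired mitotic-figure counts read on two
# modalities by the same (intraobserver) or different (interobserver)
# pathologists. Data and tolerance are illustrative assumptions.

def concordance_rate(counts_a, counts_b, tolerance=0):
    """Fraction of paired observations agreeing within a count tolerance."""
    if len(counts_a) != len(counts_b):
        raise ValueError("observations must be paired")
    agree = sum(abs(a - b) <= tolerance
                for a, b in zip(counts_a, counts_b))
    return agree / len(counts_a)

cm40 = [3, 7, 0, 12, 5]   # hypothetical counts on conventional microscopy
wsi40 = [3, 6, 0, 12, 5]  # same fields re-read on whole slide images
print(concordance_rate(cm40, wsi40))  # 0.8
```

Loosening the tolerance (e.g. `tolerance=1`) shows how sensitive the concordance rate is to the chosen agreement criterion.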

18.
Med Phys ; 35(10): 4744-56, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18975719

ABSTRACT

The H operator represents the deterministic performance of any imaging system. For a linear, digital imaging system, this system operator can be written as a matrix, H, that describes the deterministic response of the system to a set of point objects. A singular value decomposition of this matrix yields a set of orthogonal functions (singular vectors) that form the system basis. A linear combination of these vectors completely describes the transfer of objects through the linear system, where the singular value associated with each singular vector gives the magnitude with which that component of the object is transferred through the system. This paper focuses on the measurement, analysis, and interpretation of the H matrix for digital x-ray detectors. A key ingredient in the measurement of the H matrix is the detector response to a single x ray (or infinitesimal x-ray beam). The authors have developed a method to estimate the 2D shift-variant, asymmetric detector ray response function (RRF) from multiple measured line response functions (LRFs) using a modified edge technique. The RRF measurements cover x-ray incident angles from 0 degrees (equivalent to a location at the detector center) to 30 degrees (equivalent to a location at the detector edge) for a standard radiographic or cone-beam CT geometric setup. To demonstrate the method, three beam qualities were tested using the inherent, Lu/Er, and Yb beam filtration. The authors show that measures using the LRF, derived from an edge measurement, underestimate the system's performance compared with the H matrix derived using the RRF. Furthermore, the authors show that edge measurements must be performed in multiple directions in order to capture rotational asymmetries of the RRF. The authors interpret the results of the H matrix SVD and provide correlations with the familiar MTF methodology. The benefits of the H matrix technique are discussed with regard to signal detection theory and the characterization of shift-variant imaging systems.
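The SVD machinery described above can be sketched in a few lines. The toy 3×3 matrix H and the point object below are placeholders, not measured detector data; the point is only that expressing an object in the singular-vector basis and scaling each component by its singular value reproduces the system's action.

```python
# Sketch: SVD of a toy system matrix H; the singular vectors form the
# system basis and the singular values set the transfer magnitudes.
import numpy as np

H = np.array([[1.0, 0.4, 0.1],
              [0.4, 1.0, 0.4],
              [0.1, 0.4, 1.0]])           # placeholder response matrix

U, s, Vt = np.linalg.svd(H)               # H = U @ diag(s) @ Vt
obj = np.array([0.0, 1.0, 0.0])           # a point object

coeffs = Vt @ obj                         # object expressed in system basis
image = U @ (s * coeffs)                  # each component scaled by its
                                          # singular value, then synthesized
print(image)
```

Components with small singular values are transferred weakly, which is what makes the singular-value spectrum a compact description of system performance.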


Subject(s)
Computer-Aided Design , Equipment Design , Equipment Failure Analysis , Radiographic Image Enhancement/instrumentation , Transducers , X-Ray Intensifying Screens , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity , X-Rays
19.
Acad Radiol ; 15(3): 370-82, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18280935

ABSTRACT

RATIONALE AND OBJECTIVES: Statistics show that radiologists are reading more studies than ever before, creating the challenge of interpreting an increasing number of images without compromising diagnostic performance. Stack-mode image display has the potential to allow radiologists to browse large three-dimensional (3D) datasets at refresh rates as high as 30 images/second. In this framework, the slow temporal response of liquid crystal displays (LCDs) can compromise image quality when images are browsed in a fast sequence. MATERIALS AND METHODS: In this article, we report on the effect of LCD response time at different image browsing speeds based on the performance of a contrast-sensitive channelized Hotelling observer. A stack of simulated 3D clustered lumpy background images with a designer nodule to be detected is used. The effect of different browsing speeds is calculated with LCD temporal response measurements from our previous work. The image set is then analyzed by the model observer, which has been shown to predict human detection performance in Gaussian and non-Gaussian lumpy backgrounds. This methodology allows us to quantify the effect of the slow temporal response of medical liquid crystal displays on the performance of anthropomorphic observers. RESULTS: We find that the slow temporal response of the display device greatly affects lesion contrast and observer performance. A detectability decrease of more than 40% could be caused by the slow response of the display. CONCLUSIONS: After validation with human observers, this methodology can be applied to more realistic background data with the goal of providing recommendations for the browsing speed of large volumetric image datasets (from computed tomography, magnetic resonance, or tomosynthesis) when read in stack-mode.
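A minimal sketch of a channelized Hotelling observer detectability calculation follows. The random channel templates, simulated white-noise backgrounds, and rectangular signal profile are all illustrative assumptions; a real study of this kind would use structured channels (e.g. Gabor or Laguerre-Gauss) and display-degraded lumpy-background images.

```python
# Sketch: channelized Hotelling observer (CHO) detectability index.
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_ch, n_img = 64, 4, 500
T = rng.standard_normal((n_ch, n_pix))     # toy channel templates (rows)

signal = np.zeros(n_pix)
signal[28:36] = 1.0                        # assumed nodule profile

noise = rng.standard_normal((n_img, n_pix))
v0 = noise @ T.T                           # signal-absent channel outputs
v1 = (noise + signal) @ T.T                # signal-present channel outputs

K = 0.5 * (np.cov(v0.T) + np.cov(v1.T))    # class-averaged channel covariance
ds = v1.mean(axis=0) - v0.mean(axis=0)     # mean channel-domain signal

d2 = ds @ np.linalg.solve(K, ds)           # squared detectability d'^2
print(np.sqrt(d2))                         # CHO detectability index d'
```

Reduced lesion contrast from a slow display enters this calculation by shrinking `ds`, which lowers the detectability index directly.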


Subject(s)
Computer Terminals , Data Display , Liquid Crystals , Radiology Information Systems , Algorithms , Area Under Curve , Color , Humans , Imaging, Three-Dimensional , Light , Mammography , ROC Curve , Radiographic Image Enhancement/methods , Time Factors
20.
Neural Netw ; 21(2-3): 387-97, 2008.
Article in English | MEDLINE | ID: mdl-18215501

ABSTRACT

Evaluation of computational intelligence (CI) systems designed to improve the performance of a human operator is complicated by the need to include the effect of human variability. In this paper we consider human (reader) variability in the context of medical imaging computer-assisted diagnosis (CAD) systems, and we outline how to compare the detection performance of readers with and without the CAD. An effective and statistically powerful comparison can be accomplished with a receiver operating characteristic (ROC) experiment, summarized by the reader-averaged area under the ROC curve (AUC). The comparison requires sophisticated yet well-developed methods for multi-reader multi-case (MRMC) variance analysis. MRMC variance analysis accounts for random readers, random cases, and correlations in the experiment. In this paper, we extend the methods available for estimating this variability. Specifically, we present a method that can treat arbitrary study designs. Most methods treat only the fully-crossed study design, where every reader reads every case in two experimental conditions. We demonstrate our method with a computer simulation, and we assess the statistical power of a variety of study designs.
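The reader-averaged AUC summary described above can be sketched as follows. The simulated reader scores are placeholders, and a full MRMC analysis would additionally estimate the reader and case variance components, which this toy omits.

```python
# Sketch: reader-averaged empirical AUC from an ROC reading study.
import numpy as np

def auc(neg_scores, pos_scores):
    """Empirical AUC: chance a positive case outscores a negative one."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

rng = np.random.default_rng(1)
aucs = []
for _ in range(4):                     # four hypothetical readers
    neg = rng.normal(0.0, 1.0, 50)     # scores on signal-absent cases
    pos = rng.normal(1.0, 1.0, 50)     # scores on signal-present cases
    aucs.append(auc(neg, pos))

reader_avg_auc = float(np.mean(aucs))  # summary figure of merit
print(reader_avg_auc)
```

Comparing this figure of merit between the with-CAD and without-CAD conditions, with variance terms for random readers and random cases, is the core of the MRMC comparison.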


Subject(s)
Biomedical Research , Diagnosis, Computer-Assisted , Neural Networks, Computer , ROC Curve , Computer Simulation , Diagnostic Imaging , Disease , Humans , Monte Carlo Method , Reproducibility of Results , Species Specificity