Results 1 - 20 of 40
1.
J Biomed Opt ; 28(6): 065003, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37325190

ABSTRACT

Significance: We present a fiberless, portable, and modular continuous-wave functional near-infrared spectroscopy system, Spotlight, consisting of multiple palm-sized modules, each containing high-density light-emitting diode and silicon photomultiplier detector arrays embedded in a flexible membrane that facilitates optode coupling to scalp curvature. Aim: Spotlight's goal is to be a more portable, accessible, and powerful functional near-infrared spectroscopy (fNIRS) device for neuroscience and brain-computer interface (BCI) applications. We hope that the Spotlight designs we share here can spur more advances in fNIRS technology and better enable future non-invasive neuroscience and BCI research. Approach: We report sensor characteristics in system validation on phantoms and motor cortical hemodynamic responses in a human finger-tapping experiment, where subjects wore custom 3D-printed caps with two sensor modules. Results: The task conditions can be decoded offline with a median accuracy of 69.6%, reaching 94.7% for the best subject, and at a comparable accuracy in real time for a subset of subjects. We quantified how well the custom caps fitted each subject and observed that better fit leads to more observed task-dependent hemodynamic response and better decoding accuracy. Conclusions: The advances presented here should serve to make fNIRS more accessible for BCI applications.


Subjects
Hemodynamics, Near-Infrared Spectroscopy, Humans, Near-Infrared Spectroscopy/methods, Hemodynamics/physiology, Hand
3.
Med Phys ; 44(6): 2207-2222, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28382718

ABSTRACT

PURPOSE: The objective was to design and implement a bivariate extension to the contaminated binormal model (CBM) to fit paired receiver operating characteristic (ROC) datasets (possibly degenerate) with proper ROC curves. Paired datasets yield two correlated ratings per case. Degenerate datasets have no interior operating points, and proper ROC curves do not inappropriately cross the chance diagonal. The existing method, developed more than three decades ago, utilizes a bivariate extension to the binormal model, implemented in CORROC2 software, which yields improper ROC curves and cannot fit degenerate datasets. CBM can fit proper ROC curves to unpaired (i.e., yielding one rating per case) and degenerate datasets, and there is a clear scientific need to extend it to handle paired datasets. METHODS: In CBM, nondiseased cases are modeled by a probability density function (pdf) consisting of a unit variance peak centered at zero. Diseased cases are modeled with a mixture distribution whose pdf consists of two unit variance peaks, one centered at positive µ with integrated probability α, the mixing fraction parameter, corresponding to the fraction of diseased cases where the disease was visible to the radiologist, and one centered at zero, with integrated probability (1-α), corresponding to disease that was not visible. It is shown that: (a) for nondiseased cases the bivariate extension is a unit-variance bivariate normal distribution centered at (0,0) with a specified correlation ρ1; (b) for diseased cases the bivariate extension is a mixture distribution with four peaks, corresponding to disease not visible in either condition, disease visible in only one condition, contributing two peaks, and disease visible in both conditions. An expression for the likelihood function is derived. A maximum likelihood estimation (MLE) algorithm, CORCBM, was implemented in the R programming language that yields parameter estimates, the covariance matrix of the parameters, and other statistics. A limited simulation validation of the method was performed. RESULTS: CORCBM and CORROC2 were applied to two datasets containing nine readers each contributing paired interpretations. CORCBM successfully fitted the data for all readers, whereas CORROC2 failed to fit a degenerate dataset. All fits were visually reasonable. All CORCBM fits were proper, whereas all CORROC2 fits were improper. CORCBM and CORROC2 were in agreement (a) in declaring only one of the nine readers as having significantly different performances in the two modalities; (b) in estimating higher correlations for diseased cases than for nondiseased ones; and (c) in finding that the intermodality correlation estimates for nondiseased cases were consistent between the two methods. All CORCBM fits yielded higher area under curve (AUC) than the CORROC2 fits, consistent with the fact that a proper ROC model like CORCBM is based on a likelihood-ratio-equivalent decision variable, and consequently yields higher performance than the binormal model-based CORROC2. The method gave satisfactory fits to four simulated datasets. CONCLUSIONS: CORCBM is a robust method for fitting paired ROC datasets, always yielding proper ROC curves, and able to fit degenerate datasets.
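
The univariate CBM described in the METHODS above can be illustrated with a short sketch. It is not the CORCBM code (which the abstract says was written in R) and does not cover the bivariate extension. It only evaluates the ROC operating point and the area under the curve implied by the stated densities: nondiseased ratings drawn from N(0, 1), diseased ratings drawn from the mixture (1-α) N(0, 1) + α N(µ, 1). The function names `phi`, `cbm_roc_point`, and `cbm_auc` are hypothetical and chosen here for illustration.

```python
import math


def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cbm_roc_point(zeta, mu, alpha):
    """Operating point (FPF, TPF) at reporting threshold zeta.

    Assumes the univariate CBM stated in the abstract:
    nondiseased ~ N(0, 1); diseased ~ (1 - alpha) N(0, 1) + alpha N(mu, 1).
    """
    fpf = phi(-zeta)
    tpf = (1.0 - alpha) * phi(-zeta) + alpha * phi(mu - zeta)
    return fpf, tpf


def cbm_auc(mu, alpha):
    """Area under the CBM ROC curve, i.e. P(diseased rating > nondiseased rating).

    The 'invisible' component contributes chance-level 0.5; the 'visible'
    component contributes Phi(mu / sqrt(2)), as for two unit-variance normals.
    """
    return 0.5 * (1.0 - alpha) + alpha * phi(mu / math.sqrt(2.0))
```

For µ ≥ 0 the computed TPF is never below the FPF at any threshold, which is the "does not cross the chance diagonal" property the abstract calls proper.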


Subjects
Algorithms, Likelihood Functions, ROC Curve, Area Under Curve, Humans, Statistical Models, Software
5.
Radiology ; 282(1): 236-250, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27439324

ABSTRACT

Purpose To conduct a multi-institutional, multireader study to compare the performance of digital tomosynthesis, dual-energy (DE) imaging, and conventional chest radiography for pulmonary nodule detection and management. Materials and Methods In this binational, institutional review board-approved, HIPAA-compliant prospective study, 158 subjects (43 subjects with normal findings) were enrolled at four institutions. Informed consent was obtained prior to enrollment. Subjects underwent chest computed tomography (CT) and imaging with conventional chest radiography (posteroanterior and lateral), DE imaging, and tomosynthesis with a flat-panel imaging device. Three experienced thoracic radiologists identified true locations of nodules (n = 516, 3-20-mm diameters) with CT and recommended case management by using Fleischner Society guidelines. Five other radiologists marked nodules and indicated case management by using images from conventional chest radiography, conventional chest radiography plus DE imaging, tomosynthesis, and tomosynthesis plus DE imaging. Sensitivity, specificity, and overall accuracy were measured by using the free-response receiver operating characteristic method and the receiver operating characteristic method for nodule detection and case management, respectively. Results were further analyzed according to nodule diameter categories (3-4 mm, >4 mm to 6 mm, >6 mm to 8 mm, and >8 mm to 20 mm). Results Maximum lesion localization fraction was higher for tomosynthesis than for conventional chest radiography in all nodule size categories (3.55-fold for all nodules, P < .001; 95% confidence interval [CI]: 2.96, 4.15). Case-level sensitivity was higher with tomosynthesis than with conventional chest radiography for all nodules (1.49-fold, P < .001; 95% CI: 1.25, 1.73). Case management decisions showed better overall accuracy with tomosynthesis than with conventional chest radiography, as given by the area under the receiver operating characteristic curve (1.23-fold, P < .001; 95% CI: 1.15, 1.32). There were no differences in any specificity measures. DE imaging did not significantly affect nodule detection when paired with either conventional chest radiography or tomosynthesis. Conclusion Tomosynthesis outperformed conventional chest radiography for lung nodule detection and determination of case management; DE imaging did not show significant differences over conventional chest radiography or tomosynthesis alone. These findings indicate performance likely achievable with a range of reader expertise. © RSNA, 2016 Online supplemental material is available for this article.


Subjects
Multiple Pulmonary Nodules/diagnostic imaging, Multiple Pulmonary Nodules/therapy, Radiographic Image Enhancement/methods, Dual-Energy Scanned Projection Radiography, Thoracic Radiography, Adult, Aged, Case-Control Studies, Female, Humans, Male, Middle Aged, Sensitivity and Specificity, Sweden, X-Ray Computed Tomography, United States, X-Ray Intensifying Screens
6.
Med Phys ; 43(5): 2548, 2016 May.
Article in English | MEDLINE | ID: mdl-27147365

ABSTRACT

PURPOSE: The free-response receiver operating characteristic (FROC) method is being increasingly used to evaluate observer performance in search tasks. Data analysis requires definition of a figure of merit (FOM) quantifying performance. While a number of FOMs have been proposed, the recommended one, namely, the weighted alternative FROC (wAFROC) FOM, is not well understood. The aim of this work is to clarify the meaning of this FOM by relating it to the empirical area under a proposed wAFROC curve. METHODS: The wAFROC FOM is defined in terms of a quasi-Wilcoxon statistic that involves weights, coding the clinical importance, assigned to each lesion. A new wAFROC curve is proposed, the y-axis of which incorporates the weights, giving more credit for marking clinically important lesions, while the x-axis is identical to that of the AFROC curve. An expression is derived relating the area under the empirical wAFROC curve to the wAFROC FOM. Examples are presented with small numbers of cases showing how AFROC and wAFROC curves are affected by correct and incorrect decisions and how the corresponding FOMs credit or penalize these decisions. The wAFROC, AFROC, and inferred ROC FOMs were applied to three clinical data sets involving multiple reader FROC interpretations in different modalities. RESULTS: It is shown analytically that the area under the empirical wAFROC curve equals the wAFROC FOM. This theorem is the FROC analog of a well-known theorem developed in 1975 for ROC analysis, which gave meaning to a Wilcoxon statistic-based ROC FOM. A similar equivalence applies between the area under the empirical AFROC curve and the AFROC FOM. The examples show explicitly that the wAFROC FOM gives equal importance to all diseased cases, regardless of the number of lesions, a desirable statistical property not shared by the AFROC FOM. Applications to the clinical data sets show that the wAFROC FOM yields results comparable to those obtained using the AFROC FOM. CONCLUSIONS: The equivalence theorem gives meaning to the weighted AFROC FOM, namely, it is identical to the empirical area under the weighted AFROC curve.
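
The quasi-Wilcoxon statistic mentioned in the METHODS can be sketched as follows. The abstract does not give the formula, so this follows the form commonly stated for the wAFROC FOM and should be read as an assumption: each nondiseased case contributes its highest-rated non-lesion mark, each lesion contributes its rating multiplied by its weight, unmarked cases and unmarked lesions are assigned a rating of minus infinity, and the lesion weights on each diseased case sum to one. The names `psi` and `wafroc_fom` are hypothetical.

```python
NEG_INF = float("-inf")  # rating assigned to unmarked cases and unmarked lesions


def psi(x, y):
    """Wilcoxon kernel: 1 if y beats x, 0.5 for a tie, 0 otherwise."""
    if y > x:
        return 1.0
    if y == x:
        return 0.5
    return 0.0


def wafroc_fom(fp_ratings, lesion_ratings, lesion_weights):
    """Empirical wAFROC figure of merit (assumed quasi-Wilcoxon form).

    fp_ratings     : highest non-lesion rating on each nondiseased case
                     (NEG_INF if the case has no marks).
    lesion_ratings : one list per diseased case, one rating per lesion
                     (NEG_INF if the lesion was not marked).
    lesion_weights : same shape as lesion_ratings; the weights on each
                     case are assumed to sum to 1, so every diseased
                     case carries equal total importance.
    """
    total = 0.0
    for fp in fp_ratings:
        for ratings, weights in zip(lesion_ratings, lesion_weights):
            for z, w in zip(ratings, weights):
                total += w * psi(fp, z)
    return total / (len(fp_ratings) * len(lesion_ratings))


# Two nondiseased cases (one unmarked) and two diseased cases:
# case A has one lesion rated 2.0; case B has two equally weighted
# lesions, one rated 0.5 and one unmarked.
example = wafroc_fom(
    [1.0, NEG_INF],
    [[2.0], [0.5, NEG_INF]],
    [[1.0], [0.5, 0.5]],
)
```

Because the weights are normalized per case, adding lesions to a case redistributes credit within that case rather than increasing its influence, which is the equal-importance property the RESULTS describe.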


Subjects
Statistical Models, ROC Curve, Algorithms, Area Under Curve, Breast/diagnostic imaging, Breast Diseases/diagnostic imaging, Calcinosis/diagnostic imaging, Computer Simulation, Statistical Data Interpretation, Datasets as Topic, Humans, Mammography/instrumentation, Mammography/methods, Anatomic Models, Imaging Phantoms, Positron-Emission Tomography/instrumentation, Positron-Emission Tomography/methods, Software, X-Ray Computed Tomography/instrumentation, X-Ray Computed Tomography/methods
7.
Phys Med ; 32(4): 568-74, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27061872

ABSTRACT

PURPOSE: To investigate the relationship between image quality measurements and the clinical performance of digital mammographic systems. METHODS: Mammograms containing subtle malignant non-calcification lesions and simulated malignant calcification clusters were adapted to appear as if acquired by four types of detector. Observers searched for suspicious lesions and gave these a malignancy score. Analysis was undertaken using jackknife alternative free-response receiver operating characteristics weighted figure of merit (FoM). Images of a CDMAM contrast-detail phantom were adapted to appear as if acquired using the same four detectors as the clinical images. The resultant threshold gold thicknesses were compared to the FoMs using a linear regression model and an F-test was used to find if the gradient of the relationship was significantly non-zero. RESULTS: The detectors with the best image quality measurement also had the highest FoM values. The gradient of the inverse relationship between FoMs and threshold gold thickness for the 0.25 mm diameter disk was significantly different from zero for calcification clusters (p=0.027), but not for non-calcification lesions (p=0.11). Systems performing just above the minimum image quality level set in the European Guidelines for Quality Assurance in Breast Cancer Screening and Diagnosis resulted in reduced cancer detection rates compared to systems performing at the achievable level. CONCLUSIONS: The clinical effectiveness of mammography for the task of detecting calcification clusters was found to be linked to image quality assessment using the CDMAM phantom. The European Guidelines should be reviewed as the current minimum image quality standards may be too low.


Subjects
Breast Neoplasms/diagnostic imaging, Mammography/methods, Breast Neoplasms/metabolism, Breast Neoplasms/pathology, Calcinosis/diagnostic imaging, Calcinosis/metabolism, Calcinosis/pathology, Female, Guidelines as Topic, Humans, Mammography/standards, Radiographic Image Enhancement/methods
8.
Eur Radiol ; 26(3): 874-83, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26105023

ABSTRACT

OBJECTIVE: To compare the performance of different types of detectors in breast cancer detection. METHODS: A mammography image set containing subtle malignant non-calcification lesions, biopsy-proven benign lesions, simulated malignant calcification clusters and normals was acquired using amorphous-selenium (a-Se) detectors. The images were adapted to simulate four types of detectors at the same radiation dose: digital radiography (DR) detectors with a-Se and caesium iodide (CsI) convertors, and computed radiography (CR) detectors with a powder phosphor (PIP) and a needle phosphor (NIP). Seven observers marked suspicious and benign lesions. Analysis was undertaken using jackknife alternative free-response receiver operating characteristics weighted figure of merit (FoM). The cancer detection fraction (CDF) was estimated for a representative image set from screening. RESULTS: No significant differences in the FoMs between the DR detectors were measured. For calcification clusters and non-calcification lesions, both CR detectors' FoMs were significantly lower than for DR detectors. The calcification cluster FoM for CR NIP was significantly better than for CR PIP. The estimated CDFs with CR PIP and CR NIP detectors were up to 15% and 22% lower, respectively, than for DR detectors. CONCLUSION: Cancer detection is affected by detector type, and the use of CR in mammography should be reconsidered. KEY POINTS: The type of mammography detector can affect the cancer detection rates. CR detectors performed worse than DR detectors in mammography. Needle phosphor CR performed better than powder phosphor CR. Calcification cluster detection is more sensitive to detector type than other cancers.


Subjects
Breast Neoplasms/diagnostic imaging, Calcinosis/diagnostic imaging, Mammography/instrumentation, Aged, Early Detection of Cancer/instrumentation, Early Detection of Cancer/methods, Female, Humans, Mammography/methods, Mass Screening/instrumentation, Mass Screening/methods, Middle Aged, Needles, Observer Variation, ROC Curve, Radiographic Image Enhancement/methods
9.
AJR Am J Roentgenol ; 203(2): 387-93, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25055275

ABSTRACT

OBJECTIVE. The objective of our study was to investigate the effect of image processing on the detection of cancers in digital mammography images. MATERIALS AND METHODS. Two hundred seventy pairs of breast images (both breasts, one view) were collected from eight systems using Hologic amorphous selenium detectors: 80 image pairs showed breasts containing subtle malignant masses; 30 image pairs, biopsy-proven benign lesions; 80 image pairs, simulated calcification clusters; and 80 image pairs, no cancer (normal). The 270 image pairs were processed with three types of image processing: standard (full enhancement), low contrast (intermediate enhancement), and pseudo-film-screen (no enhancement). Seven experienced observers inspected the images, locating and rating regions they suspected to be cancer for likelihood of malignancy. The results were analyzed using a jackknife-alternative free-response receiver operating characteristic (JAFROC) analysis. RESULTS. The detection of calcification clusters was significantly affected by the type of image processing: The JAFROC figure of merit (FOM) decreased from 0.65 with standard image processing to 0.63 with low-contrast image processing (p = 0.04) and from 0.65 with standard image processing to 0.61 with film-screen image processing (p = 0.0005). The detection of noncalcification cancers was not significantly different among the image-processing types investigated (p > 0.40). CONCLUSION. These results suggest that image processing has a significant impact on the detection of calcification clusters in digital mammography. For the three image-processing versions and the system investigated, standard image processing was optimal for the detection of calcification clusters. The effect on cancer detection should be considered when selecting the type of image processing in the future.


Subjects
Breast Neoplasms/diagnostic imaging, Calcinosis/diagnostic imaging, Mammography/methods, Radiographic Image Enhancement/methods, Computer-Assisted Radiographic Image Interpretation/methods, Aged, Biopsy, Female, Humans, Middle Aged, United Kingdom
10.
Acad Radiol ; 21(4): 538-45, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24594424

ABSTRACT

RATIONALE AND OBJECTIVES: The purpose of this study was to compare lesion-detection performance when interpreting computed tomography (CT) images that are acquired for attenuation correction when performing single photon emission computed tomography/computed tomography (SPECT/CT) myocardial perfusion studies. In the United Kingdom, there is a requirement that these images be interpreted; thus, it is necessary to understand observer performance on these images. MATERIALS AND METHODS: An anthropomorphic chest phantom with inserted spherical lesions of different sizes and contrasts was scanned on five different SPECT/CT systems using site-specific CT protocols for SPECT/CT myocardial perfusion imaging. Twenty-one observers (0-4 years of CT experience) searched 26 image slices (17 abnormal, containing 1-3 lesions, and 9 normal, containing no lesions) for each CT acquisition. The observers marked and rated perceived lesions under the free-response paradigm. Four analyses were conducted using jackknife alternative free-response receiver operating characteristic (JAFROC) analysis: (1) 20-pixel acceptance radius (AR) with all 21 readers, abbreviated to 20/ALL analysis, (2) 40-pixel AR with 21 readers (40/ALL), (3) 20-pixel AR with 14 readers experienced in CT (20/EXP), and (4) 20-pixel AR with 7 readers with no CT experience (20/NOT). The significance level of the test was set so as to conservatively control the overall probability of a type I error to <0.05. RESULTS: The mean JAFROC figures of merit (FOMs) for the five CT acquisitions in the 20/ALL analysis were 0.602, 0.639, 0.372, 0.475, and 0.719, with a significant difference in lesion-detection performance evident between all individual treatment pairs (P < .0001) with the exception of the 1-2 pairing, which was not significant (these differed only in milliamp seconds). System 5, which had the highest performance, had the smallest slice thickness and the largest matrix size. For the other analyses, the system orderings remained unchanged, and the significance of FOM difference findings remained identical to those for 20/ALL, with one exception: for 20/EXP analysis the 1-2 difference became significant with the higher milliamp seconds superior. Improved detection performance was associated with a smaller slice thickness, increased matrix size, and, to a lesser extent, increased tube charge. CONCLUSIONS: Protocol variations for CT-based attenuation correction (AC) in SPECT/CT imaging have a measurable impact on lesion-detection performance. The results imply that z-axis resolution and matrix size had the greatest impact on lesion detection, with a weaker but detectable dependence on the product of milliamp and seconds.


Subjects
Algorithms, Incidental Findings, Lung Neoplasms/diagnostic imaging, Imaging Phantoms, Radiographic Image Enhancement/methods, Computer-Assisted Radiographic Image Interpretation/methods, X-Ray Computed Tomography/methods, Artifacts, Clinical Competence, Humans, Observer Variation, Thoracic Radiography/methods, Reproducibility of Results, Sensitivity and Specificity, Single-Photon Emission Computed Tomography/methods
11.
Eur Radiol ; 23(11): 3205-12, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23732690

ABSTRACT

OBJECTIVES: To investigate pulmonary vasculature opacification during CTPA using an optimised patient-specific protocol for administering contrast agent. METHODS: CTPA was performed on 200 patients with suspected PE. Patients were assigned to two protocol groups: protocol A used a fixed 80 ml of contrast agent; protocol B used a patient-specific approach. The mean cross-sectional opacification profile of 8 central and 11 peripheral pulmonary arteries and veins was measured and the arteriovenous contrast ratio (AVCR) calculated. Protocols were compared using Mann-Whitney U non-parametric statistics. Jackknife alternative free-response receiver-operating characteristic (JAFROC) analyses assessed diagnostic efficacy. Interobserver variations were investigated using kappa methods. RESULTS: A number of pulmonary arteries demonstrated increases in opacification (P < 0.03) for protocol B compared to A, whilst opacification in the heart and veins was reduced in protocol B (P = 0.05). Increased AVCR in protocol B compared with A was observed at all anatomic locations (P < 0.0002). Increased JAFROC (P < 0.0002) and kappa variation were observed with protocol B (κ = 0.78) compared to A (κ = 0.25). Mean contrast volume was reduced in protocol B (33 ± 9 ml) compared to A (80 ± 1 ml). CONCLUSIONS: Significant improvements in visualisation of the pulmonary vasculature can be achieved with a low volume of contrast agent using injection timing based on a patient-specific contrast formula. KEY POINTS: • Optimal opacification of the pulmonary arteries is essential for CT pulmonary angiography. • Matching timing with vessel dynamics significantly improves vessel opacification. • This leads to increased arterial opacification and reduced venous opacification. • This can also lead to a reduced volume of contrast agent.


Subjects
Angiography/methods, Contrast Media/administration & dosage, Pulmonary Artery/diagnostic imaging, Pulmonary Embolism/diagnostic imaging, Pulmonary Veins/diagnostic imaging, X-Ray Computed Tomography/methods, Female, Humans, Injections, Male, Middle Aged, Retrospective Studies
12.
Acad Radiol ; 20(7): 915-9, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23583665

ABSTRACT

In the receiver operating characteristic paradigm the observer assigns a single rating to each image and the location of the perceived abnormality, if any, is ignored. In the free-response receiver operating characteristic paradigm the observer is free to mark and rate as many suspicious regions as are considered clinically reportable. Credit for a correct localization is given only if a mark is sufficiently close to an actual lesion; otherwise, the observer's mark is scored as a location-level false positive. Until fairly recently there existed no accepted method for analyzing the resulting relatively unstructured data containing random numbers of mark-rating pairs per image. This report reviews the history of work in this field, which has now spanned more than five decades. It introduces terminology used to describe the paradigm, proposed measures of performance (figures of merit), ways of visualizing the data (operating characteristics), and software for analyzing free-response receiver operating characteristic studies.
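
The proximity rule described above (credit only if a mark is sufficiently close to an actual lesion) can be sketched in a few lines. This is an illustration under stated assumptions, not a procedure taken from the review: closeness is taken to be Euclidean distance within a fixed acceptance radius, each mark is matched to its nearest lesion, and when several marks fall on the same lesion only the highest rating is kept and the others are ignored. The function name `score_marks` and the data layout are hypothetical.

```python
import math


def score_marks(marks, lesions, radius):
    """Classify free-response marks on one image.

    marks   : list of (x, y, rating) tuples made by the observer.
    lesions : list of (x, y) true lesion locations.
    radius  : acceptance radius (an assumed proximity criterion).

    Returns (ll_ratings, nl_ratings):
      ll_ratings has one entry per lesion (None if the lesion was not hit);
      nl_ratings lists the ratings of marks scored as false positives.
    """
    ll_ratings = [None] * len(lesions)
    nl_ratings = []
    for x, y, rating in marks:
        # Find the nearest lesion that lies within the acceptance radius.
        best, best_dist = None, radius
        for i, (lx, ly) in enumerate(lesions):
            dist = math.hypot(x - lx, y - ly)
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is None:
            # Not close to any lesion: location-level false positive.
            nl_ratings.append(rating)
        elif ll_ratings[best] is None or rating > ll_ratings[best]:
            # Keep only the highest rating per lesion (assumption).
            ll_ratings[best] = rating
    return ll_ratings, nl_ratings


ll, nl = score_marks(
    marks=[(10, 10, 4), (50, 50, 2), (12, 10, 3)],
    lesions=[(11, 10), (100, 100)],
    radius=5,
)
```

In this example the first and third marks both fall on the first lesion, the second mark is a false positive, and the second lesion is missed.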


Subjects
Breast Neoplasms/diagnostic imaging, Statistical Data Interpretation, Mammography/statistics & numerical data, ROC Curve, Female, Humans, Mammography/methods, Computer-Assisted Radiographic Image Interpretation/methods, Reproducibility of Results, Software
13.
Radiology ; 268(1): 46-53, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23481165

ABSTRACT

PURPOSE: To establish the extent to which test set reading can represent actual clinical reporting in screening mammography. MATERIALS AND METHODS: Institutional ethics approval was granted, and informed consent was obtained from each participating screen reader. The need for informed consent with respect to the use of patient materials was waived. Two hundred mammographic examinations were selected from examinations reported by 10 individual expert screen readers, resulting in 10 reader-specific test sets. Data generated from actual clinical reports were compared with three test set conditions: clinical test set reading with prior images, laboratory test set reading with prior images, and laboratory test set reading without prior images. A further set of five expert screen readers was asked to interpret a common set of images in two identical test set conditions to establish a baseline for intraobserver variability. Confidence scores (from 1 to 4) were assigned to the respective decisions made by readers. Region-of-interest (ROI) figures of merit (FOMs) and side-specific sensitivity and specificity were described for the actual clinical reporting of each reader-specific test set and were compared with those for the three test set conditions. Agreement between pairs of readings was assessed by using the Kendall coefficient of concordance. RESULTS: Moderate or acceptable levels of agreement were evident (W = 0.69-0.73, P < .01) when describing group performance between actual clinical reporting and test set conditions that were reasonably close to the established baseline (W = 0.77, P < .01) and were lowest when prior images were excluded. Higher median values for ROI FOMs were demonstrated for the test set conditions than for the actual clinical reporting values; this was possibly linked to changes in sensitivity. CONCLUSION: Reasonable levels of agreement between actual clinical reporting and test set conditions can be achieved, although inflated sensitivity may be evident with test set conditions.


Subjects
Breast Neoplasms/diagnostic imaging, Mammography, Professional Competence, Decision Making, Differential Diagnosis, Female, Humans, Observer Variation, Reproducibility of Results, Sensitivity and Specificity
14.
Acad Radiol ; 19(12): 1474-83, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23040503

ABSTRACT

RATIONALE AND OBJECTIVES: Studies of medical image interpretation have focused on either assessing radiologists' performance using, for example, the receiver operating characteristic (ROC) paradigm, or assessing the interpretive process by analyzing their eye-tracking (ET) data. Analysis of ET data has not benefited from threshold-bias independent figures of merit (FOMs) analogous to the area under the receiver operating characteristic (ROC) curve. The aim was to demonstrate the feasibility of such FOMs and to measure the agreement between FOMs derived from free-response ROC (FROC) and ET data. METHODS: Eight expert breast radiologists interpreted a case set of 120 two-view mammograms while eye-position data and FROC data were continuously collected during the interpretation interval. Regions that attract prolonged (>800 ms) visual attention were considered to be virtual marks, and ratings based on the dwell and approach-rate (inverse of time-to-hit) were assigned to them. The virtual ratings were used to define threshold-bias independent FOMs in a manner analogous to the area under the trapezoidal alternative FROC (AFROC) curve (0 = worst, 1 = best). Agreement at the case level (0.5 = chance, 1 = perfect) was measured using the jackknife and 95% confidence intervals (CI) for the FOMs and agreement were estimated using the bootstrap. RESULTS: The AFROC mark-ratings' FOM was largest at 0.734 (CI 0.65-0.81) followed by the dwell at 0.460 (0.34-0.59) and then by the approach-rate FOM 0.336 (0.25-0.46). The differences between the FROC mark-ratings' FOM and the perceptual FOMs were significant (P < .05). All pairwise agreements were significantly better than chance: ratings vs. dwell 0.707 (0.63-0.88), dwell vs. approach-rate 0.703 (0.60-0.79) and rating vs. approach-rate 0.606 (0.53-0.68). The ratings vs. approach-rate agreement was significantly smaller than the dwell vs. approach-rate agreement (P = .008). CONCLUSIONS: Leveraging current methods developed for analyzing observer performance data could complement current ways of analyzing ET data and lead to new insights.


Subjects
Eye Movement Measurements, Mammography, ROC Curve, Computer-Assisted Radiographic Image Interpretation, Female, Humans
15.
Med Phys ; 39(6): 3202-13, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22755704

ABSTRACT

PURPOSE: This study aims to investigate whether microcalcification detection varies significantly when mammographic images are acquired using different image qualities, including different detectors, dose levels, and image processing algorithms. An additional aim was to determine how the standard European method of measuring image quality using threshold gold thickness measured with a CDMAM phantom and the associated limits in current EU guidelines relate to calcification detection. METHODS: One hundred sixty-two normal breast images were acquired on an amorphous selenium direct digital (DR) system. Microcalcification clusters extracted from magnified images of slices of mastectomies were electronically inserted into half of the images. The calcification clusters had a subtle appearance. All images were adjusted using a validated mathematical method to simulate the appearance of images from a computed radiography (CR) imaging system at the same dose, from both systems at half this dose, and from the DR system at a quarter of this dose. The original 162 images were processed with both Hologic and Agfa (Musica-2) image processing. All other image qualities were processed with Agfa (Musica-2) image processing only. Seven experienced observers marked and rated any identified suspicious regions. Free-response receiver operating characteristic (FROC) and ROC analyses were performed on the data. The lesion sensitivity at a nonlesion localization fraction (NLF) of 0.1 was also calculated. Images of the CDMAM mammographic test phantom were acquired using the automatic setting on the DR system. These images were modified to the additional image qualities used in the observer study. The images were analyzed using automated software. In order to assess the relationship between threshold gold thickness and calcification detection a power law was fitted to the data. RESULTS: There was a significant reduction in calcification detection using CR compared with DR: the alternative FROC (AFROC) area decreased from 0.84 to 0.63 and the ROC area decreased from 0.91 to 0.79 (p < 0.0001). This corresponded to a 30% drop in lesion sensitivity at a NLF equal to 0.1. Detection was also sensitive to the dose used. There was no significant difference in detection between the two image processing algorithms used (p > 0.05). It was additionally found that lower threshold gold thickness from CDMAM analysis implied better cluster detection. The measured threshold gold thickness passed the acceptable limit set in the EU standards for all image qualities except half dose CR. However, calcification detection varied significantly between image qualities. This suggests that the current EU guidelines may need revising. CONCLUSIONS: Microcalcification detection was found to be sensitive to detector and dose used. Standard measurements of image quality were a good predictor of microcalcification cluster detection.


Subjects
Calcinosis/diagnostic imaging , Mammography/methods , Radiographic Image Enhancement/methods , Breast Neoplasms/complications , Breast Neoplasms/diagnostic imaging , Calcinosis/complications , Humans , Image Processing, Computer-Assisted , Phantoms, Imaging , Quality Control , ROC Curve , Radiation Dosage
16.
Phys Med Biol ; 57(10): 2873-904, 2012 May 21.
Article in English | MEDLINE | ID: mdl-22516804

ABSTRACT

Laboratory receiver operating characteristic (ROC) studies, which are often used to evaluate medical imaging systems, differ from 'live' clinical interpretations in several respects which could compromise their clinical relevance. The aim was to develop methodology for quantifying the clinical relevance of a laboratory ROC study. A simulator was developed to generate ROC ratings data and binary clinical interpretations classified as correct or incorrect for a common set of images interpreted under clinical and laboratory conditions. The area under the trapezoidal ROC curve (AUC) was used as the laboratory figure-of-merit and the fraction of correct clinical decisions as the clinical figure-of-merit. Conventional agreement measures (Pearson, Spearman, Kendall and kappa) between the bootstrap-induced fluctuations of the two figures of merit were estimated. A jackknife pseudovalue transformation applied to the figures of merit was also investigated as a way to capture agreement existing at the individual image level that could be lost at the figure-of-merit level. It is shown that the pseudovalues define a relevance-ROC curve. The area under this curve (rAUC) measures the ability of the laboratory figure-of-merit-based pseudovalues to correctly classify incorrect versus correct clinical interpretations. Therefore, rAUC is a measure of the clinical relevance of an ROC study. The conventional measures and rAUC were compared under varying simulator conditions. It was found that design details of the ROC study, namely the number of bins, the difficulty level of the images, the ratio of disease-present to disease-absent images and the unavoidable difference between laboratory and clinical performance levels, can lead to serious underestimation of the agreement as indicated by conventional agreement measures, even for perfectly correlated data, while rAUC showed high agreement and was relatively immune to these details.
At the same time rAUC was sensitive to factors such as intrinsic correlation between the laboratory and clinical decision variables and differences in reporting thresholds that are expected to influence agreement both at the individual image level and at the figure-of-merit level. Suggestions are made for how to conduct relevance-ROC studies aimed at assessing agreement between laboratory and clinical interpretations. The method could be used to evaluate the clinical relevance of alternative scalar figures of merit, such as the sensitivity at a predefined specificity.
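
The two building blocks named in this abstract, the trapezoidal AUC and jackknife pseudovalues, can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the standard identity that the trapezoidal AUC equals the Wilcoxon statistic (ties counted as one half) and the textbook pseudovalue definition n * theta - (n - 1) * theta_without_case, with n the total number of images. The function names and ratings are hypothetical.

```python
import numpy as np

def trapezoidal_auc(neg, pos):
    """Trapezoidal (Wilcoxon) AUC from disease-absent and disease-present ratings."""
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    # Compare every disease-present rating with every disease-absent rating.
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def jackknife_pseudovalues(neg, pos):
    """One pseudovalue per image: disease-absent images first, then disease-present."""
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    n = len(neg) + len(pos)
    full = trapezoidal_auc(neg, pos)
    values = []
    for i in range(len(neg)):
        # AUC recomputed with the i-th disease-absent image removed.
        values.append(n * full - (n - 1) * trapezoidal_auc(np.delete(neg, i), pos))
    for j in range(len(pos)):
        # AUC recomputed with the j-th disease-present image removed.
        values.append(n * full - (n - 1) * trapezoidal_auc(neg, np.delete(pos, j)))
    return np.array(values)

# Hypothetical ratings for illustration only.
absent_ratings = [1, 2, 3]
present_ratings = [2, 3, 4]
auc = trapezoidal_auc(absent_ratings, present_ratings)
pseudovalues = jackknife_pseudovalues(absent_ratings, present_ratings)
```

Under the approach described in the abstract, per-image pseudovalues like these would then be used as a decision variable for separating incorrect from correct clinical interpretations, and the area under the resulting curve would give rAUC; that step needs the clinical correctness labels and is not shown here.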


Subjects
Image Interpretation, Computer-Assisted/methods , Models, Theoretical , Area Under Curve , Humans , Laboratories , Observer Variation , ROC Curve
17.
Semin Nucl Med ; 41(6): 401-18, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21978444

ABSTRACT

A common task in medical imaging is assessing whether a new imaging system, or a variant of an existing one, is an improvement over an existing imaging technology. Imaging systems are generally quite complex, consisting of several components (for example, image acquisition hardware, image processing and display hardware and software, and image interpretation by radiologists), each of which can affect performance. Although it may appear odd to include the radiologist as a "component" of the imaging chain, because the radiologist's decision determines subsequent patient care, the effect of the human interpretation has to be included. Physical measurements such as modulation transfer function and signal-to-noise ratio are useful for characterizing the nonhuman parts of the imaging chain under idealized and often unrealistic conditions, such as uniform background phantoms and target objects with sharp edges. Measuring the performance of the entire imaging chain, including the radiologist, and using real clinical images requires different methods that fall under the rubric of observer performance methods or "ROC" analysis, which involve collecting rating data on images. The purpose of this work is to review recent developments in this field, particularly with respect to the free-response method, where location information is also collected.


Subjects
Diagnosis, Computer-Assisted/methods , Diagnostic Imaging/methods , Image Processing, Computer-Assisted/methods , ROC Curve , Area Under Curve , False Negative Reactions , False Positive Reactions , Humans , Observer Variation , Phantoms, Imaging , Predictive Value of Tests , Reproducibility of Results , Task Performance and Analysis
18.
Nucl Instrum Methods Phys Res A ; 648 Supplement 1: S297-S301, 2011 Aug 21.
Article in English | MEDLINE | ID: mdl-21804679

ABSTRACT

A frequent problem in imaging is assessing whether a new imaging system is an improvement over an existing standard. Observer performance methods, in particular the receiver operating characteristic (ROC) paradigm, are widely used in this context. In ROC analysis lesion location information is not used and consequently scoring ambiguities can arise in tasks, such as nodule detection, involving finding localized lesions. This paper reviews progress in the free-response ROC (FROC) paradigm in which the observer marks and rates suspicious regions and the location information is used to determine whether lesions were correctly localized. Reviewed are FROC data analysis, a search-model for simulating FROC data, predictions of the model and a method for estimating the parameters. The search model parameters are physically meaningful quantities that can guide system optimization.

20.
Acad Radiol ; 17(5): 628-38, 2010 May.
Article in English | MEDLINE | ID: mdl-20380980

ABSTRACT

RATIONALE AND OBJECTIVES: Sample-size estimation is an important consideration when planning a receiver operating characteristic (ROC) study. The aim of this work was to assess the prediction accuracy of a sample-size estimation method using the Monte Carlo simulation method. MATERIALS AND METHODS: Two ROC ratings simulators characterized by low reader and high case variabilities (LH) and high reader and low case variabilities (HL) were used to generate pilot data sets in two modalities. Dorfman-Berbaum-Metz multiple-reader multiple-case (DBM-MRMC) analysis of the ratings yielded estimates of the modality-reader, modality-case, and error variances. These were input to the Hillis-Berbaum (HB) sample-size estimation method, which predicted the number of cases needed to achieve 80% power for 10 readers and an effect size of 0.06 in the pivotal study. Predictions that generalized to readers and cases (random-all), to cases only (random-cases), and to readers only (random-readers) were generated. A prediction-accuracy index defined as the probability that any single prediction yields true power in the 75%-90% range was used to assess the HB method. RESULTS: For random-case generalization, the HB-method prediction-accuracy was reasonable, approximately 50% for five readers and 100 cases in the pilot study. Prediction-accuracy was generally higher under LH conditions than under HL conditions. Under ideal conditions (many readers in the pilot study) the DBM-MRMC-based HB method overestimated the number of cases. The overestimates could be explained by the larger modality-reader variance estimates when reader variability was large (HL). The largest benefit of increasing the number of readers in the pilot study was realized for LH, where 15 readers were enough to yield prediction accuracy >50% under all generalization conditions, but the benefit was smaller for HL, where prediction accuracy was approximately 36% for 15 readers under random-all and random-reader conditions.
CONCLUSION: The HB method tends to overestimate the number of cases. Random-case generalization had reasonable prediction accuracy. Provided about 15 readers were used in the pilot study the method performed reasonably under all conditions for LH. When reader variability was large, the prediction-accuracy for random-all and random-reader generalizations was compromised. Study designers may wish to compare the HB predictions to those of other methods and to sample-sizes used in previous similar studies.
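
The prediction-accuracy index defined in the methods can be illustrated with a short sketch. It assumes that each Monte Carlo pilot study yields one sample-size prediction and that the true power achieved by that prediction has already been estimated; how that true power is obtained is not described in the abstract and is not modelled here. The function name, the inclusive range endpoints, and the example values are assumptions made for illustration.

```python
def prediction_accuracy_index(true_powers, low=0.75, high=0.90):
    """Fraction of predictions whose true power lies in [low, high].

    true_powers: one true-power value per simulated pilot study.
    The endpoints are treated as inclusive, which is an assumption.
    """
    in_range = [low <= p <= high for p in true_powers]
    return sum(in_range) / len(in_range)

# Hypothetical true-power values, for illustration only.
example_index = prediction_accuracy_index([0.70, 0.80, 0.85, 0.95])
```

In this hypothetical example two of the four predictions fall inside the 75%-90% range, so the index is 0.5.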


Subjects
Algorithms , Data Interpretation, Statistical , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , ROC Curve , Observer Variation , Reproducibility of Results , Sample Size , Sensitivity and Specificity