1.
Ann Surg Oncol ; 31(2): 957-965, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37947974

ABSTRACT

BACKGROUND: Breast cancer patients with residual disease after neoadjuvant systemic treatment (NAST) have a worse prognosis compared with those achieving a pathologic complete response (pCR). Earlier identification of these patients might allow timely, extended neoadjuvant treatment strategies. We explored the feasibility of a vacuum-assisted biopsy (VAB) after NAST to identify patients with residual disease (ypT+ or ypN+) prior to surgery. METHODS: We used data from a multicenter trial, collected at 21 study sites (NCT02948764). The trial included women with cT1-3, cN0/+ breast cancer undergoing routine post-neoadjuvant imaging (ultrasound, MRI, mammography) and VAB prior to surgery. We compared the findings of VAB and routine imaging with the histopathologic evaluation of the surgical specimen. RESULTS: Of 398 patients, 34 patients with missing ypN status and 127 patients with luminal tumors were excluded. Among the remaining 237 patients, tumor cells in the VAB indicated a surgical non-pCR in all patients (73/73, positive predictive value [PPV] 100%), whereas PPV of routine imaging after NAST was 56.0% (75/134). Sensitivity of the VAB was 72.3% (73/101), and 74.3% for sensitivity of imaging (75/101). CONCLUSION: Residual cancer found in a VAB specimen after NAST always corresponds to non-pCR. Residual cancer assumed on routine imaging after NAST corresponds to actual residual cancer in about half of patients. Response assessment by VAB is not safe for the exclusion of residual cancer. Response assessment by biopsies after NAST may allow studying the new concept of extended neoadjuvant treatment for patients with residual disease in future trials.
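
The predictive values and sensitivities quoted in this abstract follow directly from the reported counts. The sketch below recomputes them in Python, assuming only the numerators and denominators stated in the abstract; the helper name `proportion_pct` is illustrative and does not come from the study.

```python
def proportion_pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator, 1)

# Counts as reported in the abstract: 101 of the analyzed patients had
# residual disease; 73 VAB results and 134 imaging results were positive.
vab_ppv = proportion_pct(73, 73)              # positive VABs that were true non-pCR
imaging_ppv = proportion_pct(75, 134)         # positive imaging that was true non-pCR
vab_sensitivity = proportion_pct(73, 101)     # residual disease detected by VAB
imaging_sensitivity = proportion_pct(75, 101) # residual disease detected by imaging

print(vab_ppv, imaging_ppv, vab_sensitivity, imaging_sensitivity)
```

The perfect PPV of the biopsy alongside a sensitivity near 72% is what supports the abstract's two conclusions: a positive VAB reliably indicates residual disease, but a negative VAB does not exclude it.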


Subject(s)
Breast Neoplasms , Humans , Female , Breast Neoplasms/drug therapy , Breast Neoplasms/surgery , Breast Neoplasms/pathology , Neoadjuvant Therapy/methods , Residual Neoplasm/pathology , Breast/pathology , Image-Guided Biopsy/methods
2.
Eur Radiol ; 34(4): 2560-2573, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37707548

ABSTRACT

OBJECTIVES: Response assessment to neoadjuvant systemic treatment (NAST) to guide individualized treatment in breast cancer is a clinical research priority. We aimed to develop an intelligent algorithm using multi-modal pretreatment ultrasound and tomosynthesis radiomics features in addition to clinical variables to predict pathologic complete response (pCR) prior to the initiation of therapy. METHODS: We used retrospective data on patients who underwent ultrasound and tomosynthesis before starting NAST. We developed a support vector machine algorithm using pretreatment ultrasound and tomosynthesis radiomics features in addition to patient and tumor variables to predict pCR status (ypT0 and ypN0). Findings were compared to the histopathologic evaluation of the surgical specimen. The main outcome measures were area under the curve (AUC) and false-negative rate (FNR). RESULTS: We included 720 patients, 504 in the development set and 216 in the validation set. Median age was 51.6 years and 33.6% (242 of 720) achieved pCR. The addition of radiomics features significantly improved the performance of the algorithm (AUC 0.72 to 0.81; p = 0.007). The FNR of the multi-modal radiomics and clinical algorithm was 6.7% (10 of 150 with missed residual cancer). Surface/volume ratio at tomosynthesis and peritumoral entropy characteristics at ultrasound were the most relevant radiomics. Hormonal receptors and HER-2 status were the most important clinical predictors. CONCLUSION: A multi-modal machine learning algorithm with pretreatment clinical, ultrasound, and tomosynthesis radiomics features may aid in predicting residual cancer after NAST. Pending prospective validation, this may facilitate individually tailored NAST regimens. CLINICAL RELEVANCE STATEMENT: Multi-modal radiomics using pretreatment ultrasound and tomosynthesis showed significant improvement in assessing response to NAST compared to an algorithm using clinical variables only. 
Further prospective validation of our findings seems warranted to enable individualized predictions of NAST outcomes. KEY POINTS: • We proposed a multi-modal machine learning algorithm with pretreatment clinical, ultrasound, and tomosynthesis radiomics features to predict response to neoadjuvant breast cancer treatment. • Compared with the clinical algorithm, the AUC of this integrative algorithm is significantly higher. • Used prior to the initiation of therapy, our algorithm can identify patients who will experience pathologic complete response following neoadjuvant therapy with a high negative predictive value.


Subject(s)
Breast Neoplasms , Humans , Middle Aged , Female , Breast Neoplasms/therapy , Breast Neoplasms/drug therapy , Neoadjuvant Therapy , Retrospective Studies , Residual Neoplasm , Radiomics
3.
J Ultrasound Med ; 43(3): 467-478, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38069582

ABSTRACT

OBJECTIVES: Patients with triple-negative breast cancer (TNBC) exhibit a fast tumor growth rate and poor survival outcomes. In this study, we aimed to develop and compare intelligent algorithms using ultrasound radiomics features in addition to clinical variables to identify patients with TNBC prior to histopathologic diagnosis. METHODS: We used single-center, retrospective data of patients who underwent ultrasound before histopathologic verification and subsequent neoadjuvant systemic treatment (NAST). We developed a logistic regression with an elastic net penalty algorithm using pretreatment ultrasound radiomics features in addition to patient and tumor variables to identify patients with TNBC. Findings were compared to the histopathologic evaluation of the biopsy specimen. The main outcome measure was the area under the curve (AUC). RESULTS: We included 1161 patients, 813 in the development set and 348 in the validation set. Median age was 50.1 years and 24.4% (283 of 1161) had TNBC. The integrative model using radiomics and clinical information showed significantly better performance in identifying TNBC compared to the radiomics model (AUC: 0.71, 95% confidence interval [CI]: 0.65-0.76 versus 0.64, 95% CI: 0.57-0.71, P = .004). The five most important variables were cN status, shape surface volume ratio (SA:V), gray level co-occurrence matrix (GLCM) correlation, gray level dependence matrix (GLDM) dependence nonuniformity normalized, and age. Patients with TNBC were more often categorized as BI-RADS 4 than BI-RADS 5 compared to non-TNBC patients (P = .002). CONCLUSION: A machine learning algorithm showed promising potential to identify patients with TNBC using ultrasound radiomics features and clinical information prior to histopathologic evaluation.


Subject(s)
Breast Neoplasms , Triple Negative Breast Neoplasms , Humans , Middle Aged , Female , Radiomics , Retrospective Studies , Ultrasonography , Algorithms
4.
Ann Surg ; 277(1): e144-e152, 2023 Jan 01.
Article in English | MEDLINE | ID: mdl-33914464

ABSTRACT

OBJECTIVE: We developed, tested, and validated machine learning algorithms to predict individual patient-reported outcomes at 1-year follow-up to facilitate individualized, patient-centered decision-making for women with breast cancer. SUMMARY OF BACKGROUND DATA: Satisfaction with breasts is a key outcome for women undergoing cancer-related mastectomy and reconstruction. Current decision-making relies on group-level evidence which may lead to suboptimal treatment recommendations for individuals. METHODS: We trained, tested, and validated 3 machine learning algorithms using data from 1921 women undergoing cancer-related mastectomy and reconstruction conducted at 11 study sites in North America from 2011 to 2016. Data were collected before surgery and at 1-year follow-up. Data from 10 of the 11 sites were randomly split into training and test samples (2:1 ratio) to develop and test 3 algorithms (logistic regression with elastic net penalty, extreme gradient boosting tree, and neural network) which were further validated using the additional site's data. Accuracy and AUC for predicting clinically significant changes in satisfaction with breasts at 1-year follow-up using the validated BREAST-Q were the outcome measures. RESULTS: The 3 algorithms performed equally well when predicting both improved and decreased satisfaction with breasts in both testing and validation datasets: For the testing dataset median accuracy = 0.81 (range 0.73-0.83), median AUC = 0.84 (range 0.78-0.85). For the validation dataset median accuracy = 0.83 (range 0.81-0.84), median AUC = 0.86 (range 0.83-0.89). CONCLUSION: Individual patient-reported outcomes can be accurately predicted using machine learning algorithms, which may facilitate individualized, patient-centered decision-making for women undergoing breast cancer treatment.
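
The data partition described in the methods (a random 2:1 training/test split of ten sites, with the remaining site held out for external validation) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the record layout, the seed, and the toy data are all assumptions.

```python
import random

def split_by_site(records, holdout_site, train_fraction=2 / 3, seed=0):
    """Split records into train/test (random, about 2:1) plus an external set.

    `records` is a list of dicts with a "site" key (hypothetical layout).
    Every record from `holdout_site` is reserved for external validation.
    """
    external = [r for r in records if r["site"] == holdout_site]
    internal = [r for r in records if r["site"] != holdout_site]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(internal)
    cut = round(len(internal) * train_fraction)
    return internal[:cut], internal[cut:], external

# Hypothetical toy data: 11 sites with 30 records each.
records = [{"site": s, "id": f"{s}-{i}"} for s in range(1, 12) for i in range(30)]
train, test, external = split_by_site(records, holdout_site=11)
print(len(train), len(test), len(external))
```

Holding out a whole site, rather than a random subset of patients, checks whether a model transfers to a center it has never seen.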


Subject(s)
Breast Neoplasms , Mammaplasty , Female , Humans , Breast Neoplasms/surgery , Mastectomy , Follow-Up Studies , Patient Reported Outcome Measures , Machine Learning , Patient-Centered Care
5.
Ann Surg Oncol ; 30(12): 7046-7059, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37516723

ABSTRACT

BACKGROUND: We sought to predict clinically meaningful changes in physical, sexual, and psychosocial well-being for women undergoing cancer-related mastectomy and breast reconstruction 2 years after surgery using machine learning (ML) algorithms trained on clinical and patient-reported outcomes data. PATIENTS AND METHODS: We used data from women undergoing mastectomy and reconstruction at 11 study sites in North America to develop three distinct ML models. We used data from ten sites to predict clinically meaningful improvement or worsening by comparing pre-surgical scores with 2-year follow-up data measured by validated Breast-Q domains. We employed ten-fold cross-validation to train and test the algorithms, and then externally validated them using the 11th site's data. We considered area-under-the-receiver-operating-characteristics-curve (AUC) as the primary metric to evaluate performance. RESULTS: Overall, between 1454 and 1538 patients completed 2-year follow-up with data for physical, sexual, and psychosocial well-being. In the hold-out validation set, our ML algorithms were able to predict clinically significant changes in physical well-being (chest and upper body) (worsened: AUC range 0.69-0.70; improved: AUC range 0.81-0.82), sexual well-being (worsened: AUC range 0.76-0.77; improved: AUC range 0.74-0.76), and psychosocial well-being (worsened: AUC range 0.64-0.66; improved: AUC range 0.66-0.66). Baseline patient-reported outcome (PRO) variables showed the largest influence on model predictions. CONCLUSIONS: Machine learning can predict long-term individual PROs of patients undergoing postmastectomy breast reconstruction with acceptable accuracy. This may better help patients and clinicians make informed decisions regarding expected long-term effect of treatment, facilitate patient-centered care, and ultimately improve postoperative health-related quality of life.


Subject(s)
Breast Neoplasms , Mammaplasty , Humans , Female , Mastectomy/adverse effects , Breast Neoplasms/surgery , Breast Neoplasms/psychology , Quality of Life , Patient Satisfaction , Mammaplasty/adverse effects
6.
Qual Life Res ; 32(3): 713-727, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36308591

ABSTRACT

PURPOSE: The objective of the current study was to develop and test the performances of different ML algorithms which were trained using patient-reported symptom severity data to predict mortality within 180 days for patients with advanced cancer. METHODS: We randomly selected 630 of 689 patients with advanced cancer at our institution who completed symptom PRO measures as part of routine care between 2009 and 2020. Using clinical, demographic, and PRO data, we trained and tested four ML algorithms: generalized regression with elastic net regularization (GLM), extreme gradient boosting (XGBoost) trees, support vector machines (SVM), and a single hidden layer neural network (NNET). We assessed the performance of algorithms individually and as part of an unweighted voting ensemble on the hold-out testing sample. Performance was assessed using area under the receiver-operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). RESULTS: The starting cohort of 630 patients was randomly partitioned into training (n = 504) and testing (n = 126) samples. Of the four ML models, the XGBoost algorithm demonstrated the best performance for 180-day mortality prediction in testing data (AUROC = 0.69, sensitivity = 0.68, specificity = 0.62, PPV = 0.66, NPV = 0.64). Ensemble of all algorithms performed worst (AUROC = 0.65, sensitivity = 0.65, specificity = 0.62, PPV = 0.65, NPV = 0.62). Of individual PRO symptoms, shortness of breath emerged as the variable of highest impact on the XGBoost 180-day mortality prediction (1-AUROC = 0.30). CONCLUSION: Our findings support ML models driven by patient-reported symptom severity as accurate predictors of short-term mortality in patients with advanced cancer, highlighting the opportunity to integrate these models prospectively into future studies of goal-concordant care.
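
An unweighted voting ensemble and the four threshold-based metrics named in the methods can be sketched in a few lines of Python. The predictions below are hypothetical toy values, not study data, and the tie-breaking rule is an arbitrary assumption, since the abstract does not state how ties among the four models were resolved.

```python
def majority_vote(predictions_per_model):
    """Combine binary (0/1) predictions from several models by unweighted vote.

    Ties are resolved as the positive class (an assumption of this sketch).
    """
    n_models = len(predictions_per_model)
    return [1 if 2 * sum(votes) >= n_models else 0
            for votes in zip(*predictions_per_model)]

def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV; assumes every denominator is non-zero."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical outcomes (1 = died within 180 days) and four models' predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_predictions = [
    [1, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
]
ensemble = majority_vote(model_predictions)
metrics = confusion_metrics(y_true, ensemble)
print(ensemble, metrics)
```

Voting only helps when the individual models make different errors; when they mostly agree, the ensemble cannot beat its best member, which is consistent with the result reported here.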


Subject(s)
Neoplasms , Quality of Life , Humans , Quality of Life/psychology , Algorithms , Machine Learning , Patient Reported Outcome Measures
7.
J Med Internet Res ; 25: e41870, 2023 04 27.
Article in English | MEDLINE | ID: mdl-37104031

ABSTRACT

BACKGROUND: Routine use of patient-reported outcome measures (PROMs) and computerized adaptive tests (CATs) may improve care in a range of surgical conditions. However, most available CATs are neither condition-specific nor coproduced with patients and lack clinically relevant score interpretation. Recently, a PROM called the CLEFT-Q has been developed for use in the treatment of cleft lip or palate (CL/P), but the assessment burden may be limiting its uptake into clinical practice. OBJECTIVE: We aimed to develop a CAT for the CLEFT-Q, which could facilitate the uptake of the CLEFT-Q PROM internationally. We aimed to conduct this work with a novel patient-centered approach and make source code available as an open-source framework for CAT development in other surgical conditions. METHODS: CATs were developed with the Rasch measurement theory, using full-length CLEFT-Q responses collected during the CLEFT-Q field test (this included 2434 patients across 12 countries). These algorithms were validated in Monte Carlo simulations involving full-length CLEFT-Q responses collected from 536 patients. In these simulations, the CAT algorithms approximated full-length CLEFT-Q scores iteratively, using progressively fewer items from the full-length PROM. Agreement between full-length CLEFT-Q score and CAT score at different assessment lengths was measured using the Pearson correlation coefficient, root-mean-square error (RMSE), and 95% limits of agreement. CAT settings, including the number of items to be included in the final assessments, were determined in a multistakeholder workshop that included patients and health care professionals. A user interface was developed for the platform, and it was prospectively piloted in the United Kingdom and the Netherlands. Interviews were conducted with 6 patients and 4 clinicians to explore end-user experience. 
RESULTS: The length of all 8 CLEFT-Q scales in the International Consortium for Health Outcomes Measurement (ICHOM) Standard Set combined was reduced from 76 to 59 items, and at this length, CAT assessments reproduced full-length CLEFT-Q scores accurately (with correlations between full-length CLEFT-Q score and CAT score exceeding 0.97, and the RMSE ranging from 2 to 5 out of 100). Workshop stakeholders considered this the optimal balance between accuracy and assessment burden. The platform was perceived to improve clinical communication and facilitate shared decision-making. CONCLUSIONS: Our platform is likely to facilitate routine CLEFT-Q uptake, and this may have a positive impact on clinical care. Our free source code enables other researchers to rapidly and economically reproduce this work for other PROMs.
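
The three agreement statistics used to compare CAT scores with full-length scores (Pearson correlation, RMSE, and 95% limits of agreement) can be computed as sketched below. The scores are hypothetical values on a 0-100 scale, and the limits of agreement use the usual mean difference ± 1.96 standard deviations, which is an assumption about how the authors defined them.

```python
import math
import statistics

def agreement(full_scores, cat_scores):
    """Pearson r, RMSE and 95% limits of agreement between paired scores."""
    diffs = [c - f for f, c in zip(full_scores, cat_scores)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    mean_f = statistics.fmean(full_scores)
    mean_c = statistics.fmean(cat_scores)
    cov = sum((f - mean_f) * (c - mean_c) for f, c in zip(full_scores, cat_scores))
    var_f = sum((f - mean_f) ** 2 for f in full_scores)
    var_c = sum((c - mean_c) ** 2 for c in cat_scores)
    pearson_r = cov / math.sqrt(var_f * var_c)

    bias = statistics.fmean(diffs)   # mean difference (CAT minus full-length)
    sd = statistics.stdev(diffs)     # sample standard deviation of differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return {"pearson_r": pearson_r, "rmse": rmse, "limits_of_agreement": limits}

# Hypothetical paired scores for five respondents.
full_length = [20, 35, 50, 65, 80]
cat = [22, 33, 51, 66, 78]
result = agreement(full_length, cat)
print(result)
```

Correlation alone can hide a systematic offset between the two scores, which is why it is reported together with RMSE and limits of agreement.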


Subject(s)
Cleft Lip , Cleft Palate , Plastic Surgery Procedures , Plastic Surgery , Humans , Cleft Lip/surgery , Cleft Palate/surgery , Patient Reported Outcome Measures , Computerized Adaptive Testing
8.
PLoS Med ; 19(4): e1003954, 2022 04.
Article in English | MEDLINE | ID: mdl-35385471

ABSTRACT

BACKGROUND: The importance of patient-reported outcome measurement in chronic kidney disease (CKD) populations has been established. However, there remains a lack of research that has synthesised data around CKD-specific symptom and health-related quality of life (HRQOL) burden globally, to inform focused measurement of the most relevant patient-important information in a way that minimises patient burden. The aim of this review was to synthesise symptom prevalence/severity and HRQOL data across the following CKD clinical groups globally: (1) stage 1-5 and not on renal replacement therapy (RRT), (2) receiving dialysis, or (3) in receipt of a kidney transplant. METHODS AND FINDINGS: MEDLINE, PsycINFO, and CINAHL were searched for English-language cross-sectional/longitudinal studies reporting prevalence and/or severity of symptoms and/or HRQOL in CKD, published between January 2000 and September 2021, including adult patients with CKD, and measuring symptom prevalence/severity and/or HRQOL using a patient-reported outcome measure (PROM). Random effects meta-analyses were used to pool data, stratified by CKD group: not on RRT, receiving dialysis, or in receipt of a kidney transplant. Methodological quality of included studies was assessed using the Joanna Briggs Institute Critical Appraisal Checklist for Studies Reporting Prevalence Data, and an exploration of publication bias performed. The search identified 1,529 studies, of which 449, with 199,147 participants from 62 countries, were included in the analysis. Studies used 67 different symptom and HRQOL outcome measures, which provided data on 68 reported symptoms. Random effects meta-analyses highlighted the considerable symptom and HRQOL burden associated with CKD, with fatigue particularly prevalent, both in patients not on RRT (14 studies, 4,139 participants: 70%, 95% CI 60%-79%) and those receiving dialysis (21 studies, 2,943 participants: 70%, 95% CI 64%-76%). 
A number of symptoms were significantly (p < 0.05 after adjustment for multiple testing) less prevalent and/or less severe within the post-transplantation population, which may suggest attribution to CKD (fatigue, depression, itching, poor mobility, poor sleep, and dry mouth). Quality of life was commonly lower in patients on dialysis (36-Item Short Form Health Survey [SF-36] Mental Component Summary [MCS] 45.7 [95% CI 45.5-45.8]; SF-36 Physical Component Summary [PCS] 35.5 [95% CI 35.3-35.6]; 91 studies, 32,105 participants for MCS and PCS) than in other CKD populations (patients not on RRT: SF-36 MCS 66.6 [95% CI 66.5-66.6], p = 0.002; PCS 66.3 [95% CI 66.2-66.4], p = 0.002; 39 studies, 24,600 participants; transplant: MCS 50.0 [95% CI 49.9-50.1], p = 0.002; PCS 48.0 [95% CI 47.9-48.1], p = 0.002; 39 studies, 9,664 participants). Limitations of the analysis are the relatively few studies contributing to symptom severity estimates and inconsistent use of PROMs (different measures and time points) across the included literature, which hindered interpretation. CONCLUSIONS: The main findings highlight the considerable symptom and HRQOL burden associated with CKD. The synthesis provides a detailed overview of the symptom/HRQOL profile across clinical groups, which may support healthcare professionals when discussing, measuring, and managing the potential treatment burden associated with CKD. PROTOCOL REGISTRATION: PROSPERO CRD42020164737.
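
For readers unfamiliar with random effects pooling, the sketch below shows one common estimator, DerSimonian-Laird. The abstract does not say which estimator or which transformation of the prevalence data the review used, so both the method choice and the pooling of raw proportions are assumptions, and the three study values are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool study estimates with the DerSimonian-Laird random-effects method.

    Returns the pooled estimate, its 95% confidence interval, and tau^2
    (the estimated between-study variance).
    """
    k = len(effects)
    w = [1 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical fatigue prevalences from three studies of 100 participants each.
prevalences = [0.60, 0.70, 0.80]
sizes = [100, 100, 100]
variances = [p * (1 - p) / n for p, n in zip(prevalences, sizes)]
pooled, ci, tau2 = random_effects_pool(prevalences, variances)
print(round(pooled, 3), tuple(round(x, 3) for x in ci), round(tau2, 4))
```

Adding tau^2 to each study's variance flattens the weights and widens the interval, reflecting real differences between study populations rather than sampling error alone.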


Subject(s)
Quality of Life , Chronic Renal Insufficiency , Adult , Cross-Sectional Studies , Fatigue , Humans , Renal Dialysis , Chronic Renal Insufficiency/complications , Chronic Renal Insufficiency/therapy
9.
Eur Radiol ; 32(6): 4101-4115, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35175381

ABSTRACT

OBJECTIVES: AI-based algorithms for medical image analysis showed comparable performance to human image readers. However, in practice, diagnoses are made using multiple imaging modalities alongside other data sources. We determined the importance of this multi-modal information and compared the diagnostic performance of routine breast cancer diagnosis to breast ultrasound interpretations by humans or AI-based algorithms. METHODS: Patients were recruited as part of a multicenter trial (NCT02638935). The trial enrolled 1288 women undergoing routine breast cancer diagnosis (multi-modal imaging, demographic, and clinical information). Three physicians specialized in ultrasound diagnosis performed a second read of all ultrasound images. We used data from 11 of 12 study sites to develop two machine learning (ML) algorithms using unimodal information (ultrasound features generated by the ultrasound experts) to classify breast masses which were validated on the remaining study site. The same ML algorithms were subsequently developed and validated on multi-modal information (clinical and demographic information plus ultrasound features). We assessed performance using area under the curve (AUC). RESULTS: Of 1288 breast masses, 368 (28.6%) were histopathologically malignant. In the external validation set (n = 373), the performance of the two unimodal ultrasound ML algorithms (AUC 0.83 and 0.82) was commensurate with performance of the human ultrasound experts (AUC 0.82 to 0.84; p for all comparisons > 0.05). The multi-modal ultrasound ML algorithms performed significantly better (AUC 0.90 and 0.89) but were statistically inferior to routine breast cancer diagnosis (AUC 0.95, p for all comparisons ≤ 0.05). CONCLUSIONS: The performance of humans and AI-based algorithms improves with multi-modal information. KEY POINTS: • The performance of humans and AI-based algorithms improves with multi-modal information. 
• Multimodal AI-based algorithms do not necessarily outperform expert humans. • Unimodal AI-based algorithms do not represent optimal performance to classify breast masses.


Subject(s)
Artificial Intelligence , Breast Neoplasms , Algorithms , Breast/diagnostic imaging , Breast/pathology , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Female , Humans , Multimodal Imaging
10.
BMC Med Res Methodol ; 22(1): 282, 2022 11 01.
Article in English | MEDLINE | ID: mdl-36319956

ABSTRACT

BACKGROUND: There is growing enthusiasm for the application of machine learning (ML) and artificial intelligence (AI) techniques to clinical research and practice. However, instructions on how to develop robust high-quality ML and AI in medicine are scarce. In this paper, we provide a practical example of techniques that facilitate the development of high-quality ML systems including data pre-processing, hyperparameter tuning, and model comparison using open-source software and data. METHODS: We used open-source software and a publicly available dataset to train and validate multiple ML models to classify breast masses into benign or malignant using mammography image features and patient age. We compared algorithm predictions to the ground truth of histopathologic evaluation. We provide step-by-step instructions with accompanying code lines. FINDINGS: Performance of the five algorithms at classifying breast masses as benign or malignant based on mammography image features and patient age was statistically equivalent (P > 0.05). Area under the receiver operating characteristics curve (AUROC) for the logistic regression with elastic net penalty was 0.89 (95% CI 0.85 - 0.94), for the Extreme Gradient Boosting Tree 0.88 (95% CI 0.83 - 0.93), for the Multivariate Adaptive Regression Spline algorithm 0.88 (95% CI 0.83 - 0.93), for the Support Vector Machine 0.89 (95% CI 0.84 - 0.93), and for the neural network 0.89 (95% CI 0.84 - 0.93). INTERPRETATION: Our paper allows clinicians and medical researchers who are interested in using ML algorithms to understand and recreate the elements of a comprehensive ML analysis. Following our instructions may help to improve model generalizability and reproducibility in medical ML studies.
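
The AUROC used to compare the five algorithms has a simple rank interpretation: it is the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case. The sketch below implements that definition directly in Python. It is an illustration with hypothetical scores and is not taken from the paper's accompanying code.

```python
def auroc(y_true, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = malignant) and predicted probabilities.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, predicted))
```

This pairwise version runs in quadratic time, which is fine for illustration; library implementations compute the same quantity from sorted ranks.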


Subject(s)
Artificial Intelligence , Machine Learning , Humans , Reproducibility of Results , Computer Neural Networks , Algorithms
11.
Qual Life Res ; 31(3): 917-925, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34590202

ABSTRACT

PURPOSE: This study aimed to evaluate and improve the accuracy and efficiency of the QuickDASH for use in assessment of limb function in patients with upper extremity lymphedema using modern psychometric techniques. METHOD: We conducted confirmatory factor analysis (CFA) and Mokken analysis to examine the assumption of unidimensionality for the IRT model on data from 285 patients who completed the QuickDASH, and then fit the data to Samejima's graded response model (GRM) and assessed the assumption of local independence of items and calibrated the item responses for CAT simulation. RESULTS: Initial CFA and Mokken analyses demonstrated good scalability of items and unidimensionality. However, the local independence of items assumption was violated between items 9 (severity of pain) and 11 (sleeping difficulty due to pain) (Yen's Q3 = 0.46) and disordered thresholds were evident for item 5 (cutting food). After addressing these breaches of assumptions, the re-analyzed GRM with the remaining 10 items achieved an improved fit. Simulation of CAT administration demonstrated a high correlation between scores on the CAT and the QuickDASH (r = 0.98). Items 2 (doing heavy chores) and 8 (limiting work or daily activities) were the most frequently used. The correlation among factor scores derived from the QuickDASH version with 11 items and the Ultra-QuickDASH version with items 2 and 8 was as high as 0.91. CONCLUSION: By administering just these two best performing QuickDASH items we can obtain estimates that are very similar to those obtained from the full-length QuickDASH without the need for CAT technology.


Subject(s)
Computerized Adaptive Testing , Lymphedema , Humans , Lymphedema/diagnosis , Psychometrics , Quality of Life/psychology , Surveys and Questionnaires
12.
Aesthetic Plast Surg ; 46(6): 2769-2780, 2022 12.
Article in English | MEDLINE | ID: mdl-35764813

ABSTRACT

INTRODUCTION: In the past decade there has been an increasing interest in the field of patient-reported outcome measures (PROMs) which are now commonly used alongside traditional outcome measures, such as morbidity and mortality. Since the FACE-Q Aesthetic development in 2010, it has been widely used in clinical practice and research, measuring the quality of life and patient satisfaction. It quantifies the impact and change across different aspects of cosmetic facial surgery and minimally invasive treatments. We review how researchers have utilized the FACE-Q Aesthetic module to date, and aim to understand better whether and how it has enhanced our understanding and practice of aesthetic facial procedures. METHODS: We performed a systematic search of the literature. Publications that used the FACE-Q Aesthetic module to evaluate patient outcomes were included. Publications about the development of PROMs or modifications of the FACE-Q Aesthetic, translation or validation studies of the FACE-Q Aesthetic scales, papers not published in English, reviews, comments/discussions, or letters to the editor were excluded. RESULTS: Our search produced 1189 different articles; 70 remained after applying inclusion and exclusion criteria. Significant findings and associations were further explored. The need for evidence-based patient-reported outcomes has driven a growing uptake of the FACE-Q Aesthetic in cosmetic surgery and dermatology, and an increasing amount of evidence concerning facelift surgery, botulinum toxin, rhinoplasty, soft tissue fillers, scar treatments, and experimental areas. DISCUSSION: The FACE-Q Aesthetic has been used to contribute substantial evidence about the outcome from the patient perspective in cosmetic facial surgery and minimally invasive treatments. The FACE-Q Aesthetic holds great potential to improve quality of care and may fundamentally change the way we measure success in plastic surgery and dermatology. 
LEVEL OF EVIDENCE III: This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266 .


Subject(s)
Patient Reported Outcome Measures , Plastic Surgery Procedures , Quality of Life , Humans , Esthetics
13.
BMC Med Res Methodol ; 21(1): 158, 2021 07 31.
Article in English | MEDLINE | ID: mdl-34332525

ABSTRACT

BACKGROUND: Unstructured text, including medical records, patient feedback, and social media comments, can be a rich source of data for clinical research. Natural language processing (NLP) describes a set of techniques used to convert passages of written text into interpretable datasets that can be analysed by statistical and machine learning (ML) models. The purpose of this paper is to provide a practical introduction to contemporary techniques for the analysis of text-data, using freely-available software. METHODS: We performed three NLP experiments using publicly-available data obtained from medicine review websites. First, we conducted lexicon-based sentiment analysis on open-text patient reviews of four drugs: Levothyroxine, Viagra, Oseltamivir and Apixaban. Next, we used unsupervised ML (latent Dirichlet allocation, LDA) to identify similar drugs in the dataset, based solely on their reviews. Finally, we developed three supervised ML algorithms to predict whether a drug review was associated with a positive or negative rating. These algorithms were: a regularised logistic regression, a support vector machine (SVM), and an artificial neural network (ANN). We compared the performance of these algorithms in terms of classification accuracy, area under the receiver operating characteristic curve (AUC), sensitivity and specificity. RESULTS: Levothyroxine and Viagra were reviewed with a higher proportion of positive sentiments than Oseltamivir and Apixaban. One of the three LDA clusters clearly represented drugs used to treat mental health problems. A common theme suggested by this cluster was drugs taking weeks or months to work. Another cluster clearly represented drugs used as contraceptives. Supervised machine learning algorithms predicted positive or negative drug ratings with classification accuracies ranging from 0.664, 95% CI [0.608, 0.716] for the regularised regression to 0.720, 95% CI [0.664,0.776] for the SVM. 
CONCLUSIONS: In this paper, we present a conceptual overview of common techniques used to analyse large volumes of text, and provide reproducible code that can be readily applied to other research studies using open-source software.
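
Lexicon-based sentiment analysis, the first of the three experiments described, scores a text by summing the polarity of the words it contains. The sketch below uses a tiny hypothetical lexicon for illustration only. It is not the lexicon or the software used in the paper, and real lexicons contain thousands of entries.

```python
import re

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
LEXICON = {
    "good": 1, "great": 1, "effective": 1, "helped": 1,
    "bad": -1, "awful": -1, "nausea": -1, "worse": -1,
}

def sentiment_score(review: str) -> int:
    """Sum the lexicon polarity of every word in the review."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)

def sentiment_label(review: str) -> str:
    """Map the summed score to a positive, negative or neutral label."""
    score = sentiment_score(review)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment_label("This drug helped and was very effective"))
print(sentiment_label("Awful nausea, felt worse"))
```

Because this approach ignores negation and context ("not good" scores as positive), it is usually treated as a baseline rather than a substitute for supervised models.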


Subject(s)
Machine Learning , Natural Language Processing , Algorithms , Humans , Computer Neural Networks , Support Vector Machine
14.
J Med Internet Res ; 23(7): e26412, 2021 07 30.
Article in English | MEDLINE | ID: mdl-34328443

ABSTRACT

BACKGROUND: Computerized adaptive testing (CAT) has been shown to deliver short, accurate, and personalized versions of the CLEFT-Q patient-reported outcome measure for children and young adults born with a cleft lip and/or palate. Decision trees may integrate clinician-reported data (eg, age, gender, cleft type, and planned treatments) to make these assessments even shorter and more accurate. OBJECTIVE: We aimed to create decision tree models incorporating clinician-reported data into adaptive CLEFT-Q assessments and compare their accuracy to traditional CAT models. METHODS: We used relevant clinician-reported data and patient-reported item responses from the CLEFT-Q field test to train and test decision tree models using recursive partitioning. We compared the prediction accuracy of decision trees to CAT assessments of similar length. Participant scores from the full-length questionnaire were used as ground truth. Accuracy was assessed through Pearson's correlation coefficient of predicted and ground truth scores, mean absolute error, root mean squared error, and a two-tailed Wilcoxon signed-rank test comparing squared error. RESULTS: Decision trees demonstrated poorer accuracy than CAT comparators and generally made data splits based on item responses rather than clinician-reported data. CONCLUSIONS: When predicting CLEFT-Q scores, individual item responses are generally more informative than clinician-reported data. Decision trees that make binary splits are at risk of underfitting polytomous patient-reported outcome measure data and demonstrated poorer performance than CATs in this study.
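
The accuracy measures named in the abstract above (Pearson's correlation, mean absolute error, and root mean squared error) can be sketched as follows. This is an illustrative sketch only: the function names and the `truth`/`pred` score lists are invented for the example and do not come from the CLEFT-Q field test.

```python
import math

def mae(pred, truth):
    """Mean absolute error between predicted and ground-truth scores."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

def rmse(pred, truth):
    """Root mean squared error between predicted and ground-truth scores."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Made-up full-length (ground truth) and short-form (predicted) scores.
truth = [50.0, 60.0, 70.0, 80.0]
pred = [52.0, 58.0, 73.0, 79.0]

error_mae = mae(pred, truth)
error_rmse = rmse(pred, truth)
correlation = pearson_r(pred, truth)
```

Because RMSE squares each error before averaging, it penalises large individual misses more heavily than MAE does, which is one reason the two are often reported together.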


Subject(s)
Cleft Lip , Cleft Palate , Cleft Lip/diagnosis , Cleft Palate/diagnosis , Humans , Patient Reported Outcome Measures , Quality of Life
15.
Qual Life Res ; 29(4): 1065-1072, 2020 Apr.
Article in English | MEDLINE | ID: mdl-31758485

ABSTRACT

PURPOSE: With the BODY-Q, one can assess outcomes, such as satisfaction with appearance, in weight loss and body contouring patients using multiple scales. All scales can be used independently in any given combination or order. Currently, the BODY-Q cannot provide overall appearance scores across scales that measure a similar super-ordinate construct (i.e., overall appearance), which could improve the scales' usefulness as a benchmarking tool and improve the comprehensibility of patient feedback. We explored the possibility of establishing overall appearance scores by applying a bifactor model to the BODY-Q appearance scales. METHODS: In a bifactor model, questionnaire items load onto both a primary specific factor and a general factor, such as satisfaction with appearance. The international BODY-Q validation patient sample (n = 734) was used to fit a bifactor model to the appearance domain. Factor loadings, fit indices, and the correlation between the bifactor appearance domain and the satisfaction with body scale were assessed. RESULTS: All items loaded on the general factor of their corresponding domain. In the appearance domain, all items demonstrated adequate item fit to the model. All scales had satisfactory fit to the bifactor model (RMSEA 0.045, CFI 0.969, and TLI 0.964). The correlation between the appearance domain summary scores and satisfaction with body scale scores was found to be 0.77. DISCUSSION: We successfully applied a bifactor model to BODY-Q data with good item and model fit indices. With this method, we were able to produce reliable overall appearance scores which may improve the interpretability of the BODY-Q while increasing flexibility.


Subject(s)
Body Image/psychology , Patient Satisfaction/statistics & numerical data , Physical Appearance, Body/physiology , Psychometrics/methods , Benchmarking , Health Status , Humans , Quality of Life/psychology , Surveys and Questionnaires , Weight Loss
16.
J Med Internet Res ; 22(10): e20950, 2020 10 29.
Article in English | MEDLINE | ID: mdl-33118937

ABSTRACT

Patient-reported assessments are transforming many facets of health care, but there is scope to modernize their delivery. Contemporary assessment techniques like computerized adaptive testing (CAT) and machine learning can be applied to patient-reported assessments to reduce burden on both patients and health care professionals; improve test accuracy; and provide individualized, actionable feedback. The Concerto platform is a highly adaptable, secure, and easy-to-use console that can harness the power of CAT and machine learning for developing and administering advanced patient-reported assessments. This paper introduces readers to contemporary assessment techniques and the Concerto platform. It reviews advances in the field of patient-reported assessment that have been driven by the Concerto platform and explains how to create an advanced, adaptive assessment, for free, with minimal prior experience with CAT or programming.


Subject(s)
Machine Learning/standards , Patient Reported Outcome Measures , Psychometrics/methods , Computers , Feedback , Female , Humans , Male
18.
Clin Genet ; 96(5): 411-417, 2019 11.
Article in English | MEDLINE | ID: mdl-31323115

ABSTRACT

Genome sequencing (GS) is increasingly being used to diagnose rare diseases in paediatric patients; however, no measures exist to evaluate their knowledge of this technology. We aimed to develop a robust measure of knowledge of GS (the 'kids-KOGS') suitable for use in the paediatric setting as well as for general public education. The target age was 11- to 15-year-olds. An iterative process involving six sequential stages was conducted to develop a set of draft true/false items. These were then administered to 539 target-age school pupils (mean age 12.8; SD 1.3) from the United Kingdom. Item-response theory was used to confirm the psychometric suitability of the candidate items. None of the items was identified as a misfit. All 10 items performed well under the two-parameter logistic model. The internal consistency of the test was 0.84 (Cronbach alpha value), indicating excellent reliability. The mean kids-KOGS score in the sample overall was 4.24 (SD 2.49), where 0 = low knowledge and 10 = high knowledge. Age was positively associated with score in a multivariate linear regression. The kids-KOGS is a short and reliable tool that can be used by researchers and healthcare professionals offering GS to paediatric patients. Further validation in a clinical setting is required.
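
For readers unfamiliar with the two psychometric quantities mentioned in the abstract above, the sketch below shows the standard two-parameter logistic (2PL) item response function and the usual formula for Cronbach's alpha. The function names and the small response matrix are made up for illustration; they are not the kids-KOGS data, and the study's own fitting procedure is not reproduced here.

```python
import math
from statistics import variance

def p_correct_2pl(theta: float, a: float, b: float) -> float:
    """2PL model: probability of a correct answer for ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def cronbach_alpha(responses) -> float:
    """Cronbach's alpha for a list of respondents' item-score rows:
    k / (k - 1) * (1 - sum of item variances / variance of total scores)."""
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical true/false responses (1 = correct) for 4 respondents, 3 items.
responses = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]
alpha = cronbach_alpha(responses)
half = p_correct_2pl(theta=0.0, a=1.2, b=0.0)  # ability equal to difficulty
```

When a respondent's ability equals the item difficulty, the 2PL function returns 0.5 regardless of the discrimination parameter, which is a convenient sanity check on any implementation.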


Subject(s)
Pediatrics , Rare Diseases/genetics , Whole Genome Sequencing , Adolescent , Child , Female , Humans , Linear Models , Male , Multivariate Analysis , Rare Diseases/diagnosis , Rare Diseases/epidemiology , Rare Diseases/pathology , United Kingdom/epidemiology
19.
BMC Med Res Methodol ; 19(1): 64, 2019 03 19.
Article in English | MEDLINE | ID: mdl-30890124

ABSTRACT

BACKGROUND: Following visible successes on a wide range of predictive tasks, machine learning techniques are attracting substantial interest from medical researchers and clinicians. We address the need for capacity development in this area by providing a conceptual introduction to machine learning alongside a practical guide to developing and evaluating predictive algorithms using freely available open-source software and public domain data. METHODS: We demonstrate the use of machine learning techniques by developing three predictive models for cancer diagnosis using descriptions of nuclei sampled from breast masses. These algorithms include regularized General Linear Model regression (GLMs), Support Vector Machines (SVMs) with a radial basis function kernel, and single-layer Artificial Neural Networks. The publicly available dataset describing the breast mass samples (N=683) was randomly split into evaluation (n=456) and validation (n=227) samples. We trained algorithms on data from the evaluation sample before they were used to predict the diagnostic outcome in the validation dataset. We compared the predictions made on the validation dataset with the real-world diagnostic decisions to calculate the accuracy, sensitivity, and specificity of the three models. We explored the use of averaging and voting ensembles to improve predictive performance. We provide a step-by-step guide to developing algorithms using the open-source R statistical programming environment. RESULTS: The trained algorithms were able to classify cell nuclei with high accuracy (.94-.96), sensitivity (.97-.99), and specificity (.85-.94). Maximum accuracy (.96) and area under the curve (.97) were achieved using the SVM algorithm. Prediction performance increased marginally (accuracy = .97, sensitivity = .99, specificity = .95) when algorithms were arranged into a voting ensemble.
CONCLUSIONS: We use a straightforward example to demonstrate the theory and practice of machine learning for clinicians and medical researchers. The principles which we demonstrate here can be readily applied to other complex tasks including natural language processing and image recognition.
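
The evaluation metrics and the voting ensemble described in the abstract above can be sketched in a few lines. The paper itself works in R; this Python sketch is only an illustration, and the prediction lists `glm`, `svm`, and `ann` are invented stand-ins for model outputs, not results from the study.

```python
def majority_vote(*model_preds):
    """Combine equal-length lists of 0/1 predictions by simple majority."""
    return [1 if 2 * sum(votes) > len(votes) else 0 for votes in zip(*model_preds)]

def classification_metrics(pred, truth):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

# Hypothetical validation labels and per-model predictions.
truth = [1, 1, 1, 0, 0, 0]
glm = [1, 1, 0, 0, 0, 1]
svm = [1, 0, 1, 0, 0, 0]
ann = [1, 1, 1, 0, 1, 0]

ensemble = majority_vote(glm, svm, ann)
ensemble_metrics = classification_metrics(ensemble, truth)
glm_metrics = classification_metrics(glm, truth)
```

In this toy data each model makes different mistakes, so the majority vote corrects all of them; with real models whose errors are correlated, the gain is typically much smaller, consistent with the marginal improvement the abstract reports.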


Subject(s)
Algorithms , Breast Neoplasms/diagnosis , Diagnosis, Computer-Assisted/methods , Machine Learning , Neural Networks, Computer , Support Vector Machine , Female , Humans , Sensitivity and Specificity , Software
20.
J Med Internet Res ; 21(7): e12212, 2019 07 11.
Article in English | MEDLINE | ID: mdl-31298217

ABSTRACT

BACKGROUND: Quality of life (QoL) assessments, or patient-reported outcome measures (PROMs), are becoming increasingly important in health care and have been associated with improved decision making, higher satisfaction, and better outcomes of care. Some physicians and patients may find questionnaires too burdensome; however, this issue could be addressed by making use of computerized adaptive testing (CAT). In addition, making the questionnaire more interesting, for example by providing graphical and contextualized feedback, may further improve the experience of the users. However, little is known about how shorter assessments and feedback impact user experience. OBJECTIVE: We conducted a controlled experiment to assess the impact of tailored multimodal feedback and CAT on user experience in QoL assessment using validated PROMs. METHODS: We recruited a representative sample from the general population in the United Kingdom using the Oxford Prolific academic Web panel. Participants completed either a CAT version of the World Health Organization Quality of Life assessment (WHOQOL-CAT) or the fixed-length WHOQOL-BREF, an abbreviated version of the WHOQOL-100. We randomly assigned participants to conditions in which they would receive no feedback, graphical feedback only, or graphical and adaptive text-based feedback. Participants rated the assessment in terms of perceived acceptability, engagement, clarity, and accuracy. RESULTS: We included 1386 participants in our analysis. Assessment experience was improved when graphical and tailored text-based feedback was provided along with PROMs (Δ=0.22, P<.001). Providing graphical feedback alone was weakly associated with improvement in overall experience (Δ=0.10, P=.006). Graphical and text-based feedback made the questionnaire more interesting, and users were more likely to report they would share the results with a physician or family member (Δ=0.17, P<.001, and Δ=0.17, P<.001, respectively). 
No difference was found in perceived accuracy of the graphical feedback scores of the WHOQOL-CAT and WHOQOL-BREF (Δ=0.06, P=.05). CAT (stopping rule [SE<0.45]) resulted in the administration of 25% fewer items than the fixed-length assessment, but it did not result in an improved user experience (P=.21). CONCLUSIONS: Using tailored text-based feedback to contextualize numeric scores maximized the acceptability of electronic QoL assessment. Improving user experience may increase response rates and reduce attrition in research and clinical use of PROMs. In this study, CAT administration was associated with a modest decrease in assessment length but did not improve user experience. Patient-perceived accuracy of feedback was equivalent when comparing CAT with fixed-length assessment. Fixed-length forms are already generally acceptable to respondents; however, CAT might have an advantage over longer questionnaires that would be considered burdensome. Further research is warranted to explore the relationship between assessment length, feedback, and response burden in diverse populations.
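
The abstract above mentions a CAT stopping rule of SE<0.45. A minimal sketch of how such a rule works is given below, under strong simplifying assumptions that are not from the paper: items are treated as dichotomous 2PL items (the WHOQOL items are polytomous), the ability estimate `theta` is held fixed instead of being re-estimated after each response, and the item bank values are invented. It relies on the standard relation that the standard error of the ability estimate is one over the square root of the accumulated test information.

```python
import math

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def items_until_stop(theta, item_bank, se_threshold=0.45):
    """Administer items in order of decreasing information at theta and stop
    once SE = 1 / sqrt(total information) drops below the threshold.
    Returns (number of items administered, final SE)."""
    ranked = sorted(item_bank, key=lambda ab: item_information(theta, *ab), reverse=True)
    info = 0.0
    for n, (a, b) in enumerate(ranked, start=1):
        info += item_information(theta, a, b)
        se = 1.0 / math.sqrt(info)
        if se < se_threshold:
            return n, se
    # Bank exhausted before reaching the target precision.
    return len(ranked), (1.0 / math.sqrt(info) if info > 0 else math.inf)

# Hypothetical bank of (discrimination, difficulty) pairs.
bank = [(2.0, 0.0)] * 10
n_items, final_se = items_until_stop(theta=0.0, item_bank=bank)
```

Because each additional item adds information, the SE shrinks with every administered item, and the test ends as soon as the target precision is reached; this is how an adaptive test can end up shorter than a fixed-length form.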


Subject(s)
Patient Reported Outcome Measures , Psychometrics/methods , Quality of Life/psychology , Adolescent , Adult , Aged , Computers , Feedback , Female , Humans , Male , Middle Aged , Surveys and Questionnaires , Young Adult