Results 1 - 7 of 7
1.
BMC Med Res Methodol; 24(1): 158, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39044195

ABSTRACT

BACKGROUND: In randomized clinical trials, treatment effects may vary, a possibility referred to as heterogeneity of treatment effect (HTE). One way to quantify HTE is to partition participants into subgroups based on individuals' risk of experiencing an outcome and then measure the treatment effect within each subgroup. Given the limited availability of externally validated outcome risk prediction models, internal models (created using the same dataset in which the HTE analyses will also be performed) are commonly developed for subgroup identification. We aim to compare different methods for generating internally developed outcome risk prediction models for subject partitioning in HTE analysis.

METHODS: Three approaches were selected for generating subgroups for the 2,441 participants from the United States enrolled in the ASPirin in Reducing Events in the Elderly (ASPREE) randomized controlled trial. An existing proportional hazards-based outcome risk prediction model, developed on the overall ASPREE cohort of 19,114 participants, was used to partition the United States participants by risk of experiencing a composite outcome of death, dementia, or persistent physical disability. Next, two supervised non-parametric machine learning outcome classifiers, decision trees and random forests, were used to develop multivariable risk prediction models and partition participants into subgroups with varied risks of experiencing the composite outcome. We then assessed how the partitioning from the proportional hazards model compared with the partitions generated by the machine learning models in an HTE analysis of the 5-year absolute risk reduction (ARR) and hazard ratio for aspirin vs. placebo in each subgroup. Cochran's Q test was used to detect whether the ARR varied significantly by subgroup.

RESULTS: The proportional hazards model was used to generate 5 subgroups from the quintiles of the estimated risk scores; the decision tree model was used to generate 6 subgroups (6 automatically determined tree leaves); and the random forest model was used to generate 5 subgroups from the quintiles of the prediction probability used as a risk score. Using the semi-parametric proportional hazards model, the ARR at 5 years was 15.1% (95% CI 4.0-26.3%) for participants with the highest 20% of predicted risk. Using the random forest model, the ARR at 5 years was 13.7% (95% CI 3.1-24.4%) for participants with the highest 20% of predicted risk. The highest outcome risk group in the decision tree model also exhibited a risk reduction, but the confidence interval was wider (5-year ARR = 17.0%, 95% CI -5.4-39.4%). Cochran's Q test indicated that the ARR varied significantly only across the subgroups created using the proportional hazards model. The hazard ratio for aspirin vs. placebo did not vary significantly by subgroup in any of the models. The highest-risk groups for the proportional hazards model and the random forest model contained 230 participants each, while the highest-risk group in the decision tree model contained 41 participants.

CONCLUSIONS: The choice of technique for internally developed outcome risk models used to define subgroups influences HTE analyses. The rationale for using a particular subgroup determination model in HTE analyses should be explicitly defined based on the desired level of explainability (including feature importance), prediction uncertainty, risk of overfitting, and assumptions about the underlying data structure. Replication of these analyses using data from other mid-size clinical trials may help establish guidance for selecting an outcome risk prediction modelling technique for HTE analyses.
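To make the subgrouping step concrete, the sketch below illustrates the general pattern described in this abstract on synthetic data (not ASPREE, which is not publicly reproduced here): fit an internal proportional hazards risk model blinded to treatment, partition participants into quintiles of the predicted risk score, and estimate the ARR for aspirin vs. placebo within each quintile. It assumes the lifelines library; the variable names and the crude 5-year risk calculation (which ignores censoring) are illustrative simplifications, not the paper's implementation.

```python
# Illustrative sketch only: synthetic data, not ASPREE; assumes lifelines is installed.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "age": rng.normal(75, 5, n),
    "frailty": rng.normal(0, 1, n),
    "aspirin": rng.integers(0, 2, n),
})
# Synthetic time-to-event outcome whose baseline risk is driven by age and frailty.
hazard = np.exp(0.04 * (df["age"] - 75) + 0.5 * df["frailty"] - 0.1 * df["aspirin"])
df["time"] = rng.exponential((10.0 / hazard).to_numpy())
df["event"] = (df["time"] <= 5).astype(int)
df["time"] = df["time"].clip(upper=5.0)          # administrative censoring at 5 years

# Internal risk model: proportional hazards fit on baseline covariates, blinded to treatment.
cph = CoxPHFitter()
cph.fit(df[["age", "frailty", "time", "event"]], duration_col="time", event_col="event")
df["risk_score"] = cph.predict_partial_hazard(df[["age", "frailty"]])

# Partition into risk quintiles, then estimate ARR (placebo risk minus aspirin risk) per subgroup.
df["subgroup"] = pd.qcut(df["risk_score"], 5, labels=False)
for g, grp in df.groupby("subgroup"):
    risk = grp.groupby("aspirin")["event"].mean()   # crude 5-year risk, ignoring censoring
    print(f"quintile {g}: ARR = {risk[0] - risk[1]:.3f}")
```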


Subjects
Aspirin; Machine Learning; Proportional Hazards Models; Humans; Aspirin/therapeutic use; Aged; Female; Male; Treatment Outcome; United States; Risk Assessment/methods; Risk Assessment/statistics & numerical data; Models, Statistical; Randomized Controlled Trials as Topic/methods; Randomized Controlled Trials as Topic/statistics & numerical data; Decision Trees; Outcome Assessment, Health Care/methods; Outcome Assessment, Health Care/statistics & numerical data
2.
J Clin Transl Sci; 8(1): e70, 2024.
Article in English | MEDLINE | ID: mdl-38690227

ABSTRACT

[This corrects the article DOI: 10.1017/cts.2024.284.].

3.
Pac Symp Biocomput; 29: 108-119, 2024.
Article in English | MEDLINE | ID: mdl-38160273

ABSTRACT

Classical machine learning and deep learning models for Computer-Aided Diagnosis (CAD) commonly focus on overall classification performance, treating misclassification errors (false negatives and false positives) equally during training. This uniform treatment overlooks the distinct costs associated with each type of error and leads to suboptimal decision-making, particularly in the medical domain, where it is important to improve prediction sensitivity without significantly compromising overall accuracy. This study introduces a novel deep learning-based CAD system that incorporates a cost-sensitive parameter into the activation function. Applied to two medical imaging datasets, the proposed approach yields statistically significant sensitivity increases of 3.84% on the Lung Image Database Consortium (LIDC) dataset and 5.4% on the Breast Cancer Histological Database (BreakHis) while maintaining overall accuracy. Our findings underscore the significance of integrating cost-sensitive parameters into future CAD systems to optimize performance and, ultimately, reduce costs and improve patient outcomes.
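The abstract does not specify the exact form of the cost-sensitive activation, so the sketch below shows one common way to realize the idea: shifting the output sigmoid's logit by the log of a false-negative/false-positive cost ratio so that predictions are biased toward sensitivity. The class and parameter names are hypothetical, and the formulation is an assumption, not the authors' method.

```python
# Generic illustration; the paper's exact activation formulation is not given in the
# abstract, so the cost_ratio shift below is an assumption, not the authors' method.
import torch
import torch.nn as nn

class CostSensitiveSigmoid(nn.Module):
    """Sigmoid whose logit is shifted by log(cost_ratio), where cost_ratio is the
    assumed relative cost of a false negative versus a false positive."""
    def __init__(self, cost_ratio: float = 3.0):
        super().__init__()
        self.register_buffer("shift", torch.log(torch.tensor(float(cost_ratio))))

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # Shifting the logit raises the predicted probability of the positive (disease)
        # class, trading some false positives for fewer false negatives.
        return torch.sigmoid(logits + self.shift)

# Usage sketch: attach the activation to any binary CAD classifier head.
head = nn.Sequential(nn.Linear(128, 1), CostSensitiveSigmoid(cost_ratio=3.0))
probs = head(torch.randn(4, 128))   # probabilities biased toward higher sensitivity
```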


Subjects
Deep Learning; Humans; Computational Biology; Diagnosis, Computer-Assisted/methods; Lung; Computers
4.
Front Big Data; 6: 1173038, 2023.
Article in English | MEDLINE | ID: mdl-37139170

ABSTRACT

Data integration is a well-motivated problem in the clinical data science domain. Availability of patient data, reference clinical cases, and research datasets has the potential to advance the healthcare industry. However, the unstructured (text, audio, or video) and heterogeneous nature of the data, the variety of data standards and formats, and patient privacy constraints make data interoperability and integration a challenge. Clinical text is further categorized into different semantic groups and may be stored in different files and formats; even the same organization may store cases in different data structures, making data integration more challenging. Given this inherent complexity, domain experts and domain knowledge are often necessary to perform data integration, but expert human labor is time- and cost-prohibitive. To overcome the variability in the structure, format, and content of the different data sources, we map the text into common categories and compute similarity within each category. In this paper, we present a method to categorize and merge clinical data by considering the underlying semantics of the cases and using reference information about them to perform data integration. Evaluation shows that we were able to merge 88% of the clinical data from five different sources.
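A minimal sketch of the general idea of category-wise similarity for record matching appears below; the semantic categories, example records, similarity threshold, and TF-IDF/cosine-similarity choice are illustrative assumptions rather than the paper's actual pipeline.

```python
# Hypothetical sketch of category-wise text similarity for record matching; the
# semantic categories, example records, and threshold are illustrative, not the paper's.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Records from two sources, already mapped into shared semantic categories.
source_a = {"diagnosis": "non-small cell lung carcinoma, right upper lobe",
            "medication": "carboplatin and paclitaxel"}
source_b = {"diagnosis": "NSCLC of the right upper lobe",
            "medication": "carboplatin/paclitaxel doublet"}

def category_similarity(a: dict, b: dict) -> float:
    """Average TF-IDF cosine similarity over the categories the two records share."""
    shared = [c for c in a if c in b]
    scores = []
    for c in shared:
        tfidf = TfidfVectorizer().fit_transform([a[c], b[c]])
        scores.append(cosine_similarity(tfidf[0], tfidf[1])[0, 0])
    return sum(scores) / len(shared) if shared else 0.0

score = category_similarity(source_a, source_b)
print(f"similarity = {score:.2f}")   # merge if above a tuned threshold (e.g., 0.4)
```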

5.
Annu Int Conf IEEE Eng Med Biol Soc; 2020: 1254-1257, 2020 07.
Article in English | MEDLINE | ID: mdl-33018215

ABSTRACT

Computer-aided Diagnosis (CAD) systems have long aimed to support clinical practice by providing doctors with a second opinion. However, most machine learning-based CAD systems make predictions without explicitly showing how those predictions were generated. Since the cognitive process of diagnostic image interpretation involves various visual characteristics of the region of interest, the explainability of the results should leverage those characteristics. We encode visual characteristics of the region of interest based on pairs of similar images rather than the image content by itself. Using a Siamese convolutional neural network (SCNN), we first learn the similarity among nodules, then encode image content using the SCNN similarity-based feature representation, and lastly apply the K-nearest neighbor (KNN) approach to make diagnostic characterizations from the Siamese-based image features. We demonstrate the feasibility of our approach on spiculation, a visual characteristic that radiologists consider when interpreting the degree of cancer malignancy, using the NIH/NCI Lung Image Database Consortium (LIDC) dataset, which contains both spiculation and malignancy characteristics for lung nodules. Clinical Relevance - This establishes that spiculation can be quantified to automate the diagnostic characterization of lung nodules in Computed Tomography images.
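The sketch below illustrates only the downstream step described above: embedding nodule patches with one branch of a convolutional network and characterizing spiculation with a K-nearest neighbor classifier in that embedding space. The architecture, random data, and untrained network are stand-ins, not the trained SCNN or LIDC data from the paper.

```python
# Placeholder sketch: an untrained CNN stands in for the trained Siamese branch;
# in the paper the embedding is learned from pairs of similar/dissimilar nodules.
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier

embed = nn.Sequential(            # one branch of the Siamese network
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
    nn.Flatten(), nn.Linear(16 * 4 * 4, 32),
)

def embeddings(images: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        return embed(images)

# Toy data: 64x64 nodule patches with spiculation labels (1 = spiculated).
train_imgs, train_y = torch.randn(20, 1, 64, 64), torch.randint(0, 2, (20,))
test_imgs = torch.randn(5, 1, 64, 64)

# KNN in the similarity (embedding) space characterizes new nodules by their neighbors.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(embeddings(train_imgs).numpy(), train_y.numpy())
print(knn.predict(embeddings(test_imgs).numpy()))
```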


Subjects
Lung Neoplasms; Radiographic Image Interpretation, Computer-Assisted; Humans; Lung; Lung Neoplasms/diagnostic imaging; Neural Networks, Computer; Tomography, X-Ray Computed
6.
J Chem Educ; 93(9): 1561-1568, 2016 09 13.
Article in English | MEDLINE | ID: mdl-27795574

ABSTRACT

Structured databases of chemical and physical properties play a central role in the everyday research activities of scientists and engineers. In materials science, researchers and engineers turn to these databases to quickly query, compare, and aggregate various properties, thereby allowing for the development or application of new materials. The vast majority of these databases have been generated manually, through decades of labor-intensive harvesting of information from the literature; yet, while there are many examples of commonly used databases, a significant number of important properties remain locked within the tables, figures, and text of publications. The question addressed in our work is whether, and to what extent, the process of data collection can be automated. Students of the physical sciences and engineering are often confronted with the challenge of finding and applying property data from the literature, and a central aspect of their education is to develop the critical skills needed to identify such data and discern their meaning or validity. To address shortcomings associated with automated information extraction, while simultaneously preparing the next generation of scientists for their future endeavors, we developed a novel course-based approach in which students develop skills in polymer chemistry and physics and apply their knowledge by assisting with the semi-automated creation of a thermodynamic property database.

7.
Procedia Comput Sci; 80: 386-397, 2016.
Article in English | MEDLINE | ID: mdl-28649288

ABSTRACT

A wealth of valuable data is locked within the millions of research articles published each year. Reading and extracting pertinent information from those articles has become an unmanageable task for scientists. This problem hinders scientific progress by making it hard to build on results buried in the literature. Moreover, these data are loosely structured, encoded in manuscripts of various formats, embedded in different content types, and, in general, not machine-accessible. We present a hybrid human-computer solution for semi-automatically extracting scientific facts from the literature. This solution combines an automated discovery, download, and extraction phase with a semi-expert crowd assembled from students to extract specific scientific facts. To evaluate our approach, we apply it to a challenging molecular engineering scenario: extraction of a polymer property, the Flory-Huggins interaction parameter. We demonstrate useful contributions to a comprehensive database of polymer properties.
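As a toy illustration of the automated extraction phase, the snippet below flags sentences that mention the Flory-Huggins parameter alongside a numeric value, the kind of candidate such a pipeline might forward to the student crowd for verification. The regular expression and example sentence are illustrative assumptions, not the paper's implementation.

```python
# Toy illustration of the automated candidate-extraction phase; the real pipeline
# also handles full documents and routes candidates to crowd workers for verification.
import re

text = ("The Flory-Huggins interaction parameter for the PS/PMMA blend was "
        "determined to be chi = 0.037 at 160 °C.")

# Flag text that mentions chi (or the spelled-out parameter) followed by a number.
pattern = re.compile(
    r"(?:flory.huggins|interaction parameter|\bchi\b|\u03c7)[^.]*?(-?\d+\.\d+)",
    re.IGNORECASE,
)
for match in pattern.finditer(text):
    print("candidate value:", match.group(1))  # -> 0.037, to be verified by a human
```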
