Results 1 - 2 of 2
1.
J Biopharm Stat: 1-7, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38549510

ABSTRACT

The U.S. Food and Drug Administration (FDA) has broadly supported quality-by-design initiatives for clinical trials, including monitoring and data validation, by releasing two related guidance documents (FDA 2013 and 2019). Centralized statistical monitoring (CSM) can be a component of a quality-by-design process. In this article, we describe our experience with a CSM platform as part of a Cooperative Research and Development Agreement between CluePoints and FDA. This agreement's approach to CSM is based on many statistical tests performed on all relevant submitted subject-level data to identify outlying sites. An overall data inconsistency score is calculated to assess the inconsistency of data from one site compared to data from all sites. Sites are ranked by the data inconsistency score (-log10 p, where p is an aggregated p-value). Results from a deidentified trial demonstrate the typical data anomaly findings through Statistical Monitoring Applied to Research Trials analyses. Sensitivity analyses were performed after excluding laboratory data and questionnaire data. Graphics from deidentified subject-level trial data illustrate abnormal data patterns. The analyses were performed separately by site, country/region, and patient. Key risk indicator analyses were conducted for the selected endpoints. Potential data anomalies and their possible causes are discussed. This data-driven approach can be effective and efficient in selecting sites that exhibit data anomalies, and it provides insights to statistical reviewers for conducting sensitivity analyses, subgroup analyses, and site-by-treatment effect explorations. Messy data, data failing to conform to standards, and other disruptions (e.g., the COVID-19 pandemic) can pose challenges.
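The site ranking described above can be sketched in a few lines. This is a minimal illustration, not the platform's implementation: the per-site aggregated p-values are hypothetical, and how the platform aggregates its many statistical tests into one p-value per site is not specified in the abstract.

```python
import math

def inconsistency_score(p_value: float) -> float:
    """Data inconsistency score: -log10(p); a smaller p-value yields a larger score."""
    return -math.log10(p_value)

# Hypothetical aggregated p-values per site (illustrative values only).
site_p_values = {"site_A": 0.04, "site_B": 0.5, "site_C": 0.001}

# Rank sites from most to least inconsistent by descending score.
ranked = sorted(site_p_values,
                key=lambda s: inconsistency_score(site_p_values[s]),
                reverse=True)
print(ranked)  # site_C ranks first: p = 0.001 gives a score of 3.0
```

Because -log10 is monotonically decreasing, ranking by score is equivalent to ranking by ascending p-value; the log scale simply makes very small p-values easier to compare visually.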

2.
Sensors (Basel); 22(23), 2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36502025

ABSTRACT

Addressing data anomalies (e.g., garbage data, outliers, redundant data, and missing data) plays a vital role in performing accurate analytics (billing, forecasting, load profiling, etc.) on smart homes' energy consumption data. The literature shows that data imputation with machine learning (ML)-based single-classifier approaches is used to address data quality issues. However, these approaches are not effective at addressing the hidden issues in smart home energy consumption data, owing to the presence of a variety of anomalies. Hence, this paper proposes ML-based ensemble classifiers using random forest (RF), support vector machine (SVM), decision tree (DT), naive Bayes, K-nearest neighbor, and neural networks to handle all the possible anomalies in smart home energy consumption data. The proposed approach first identifies and removes all anomalies, and then imputes the removed/missing information. The implementation consists of four parts: part 1 presents anomaly detection and removal, part 2 presents data imputation, part 3 presents single-classifier approaches, and part 4 presents ensemble-classifier approaches. To assess the classifiers' performance, various metrics, namely accuracy, precision, recall/sensitivity, specificity, and F1 score, are computed. From these metrics, the ensemble classifier "RF+SVM+DT" shows superior performance over the conventional single classifiers as well as the other ensemble classifiers for anomaly handling.
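The core of an ensemble such as "RF+SVM+DT" is combining the member classifiers' labels, commonly by hard (majority) voting. The sketch below shows only that combination step with hypothetical per-sample predictions; the paper's actual training pipeline, features, and voting scheme are not specified in the abstract.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier labels for one sample by majority (hard) voting."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-sample anomaly labels from three trained classifiers
# (1 = anomaly, 0 = normal); real predictions would come from fitted models.
rf_preds  = [1, 0, 1, 0]
svm_preds = [1, 1, 1, 0]
dt_preds  = [0, 0, 1, 0]

# Ensemble label per sample: the majority label across RF, SVM, and DT.
ensemble_preds = [majority_vote(trio)
                  for trio in zip(rf_preds, svm_preds, dt_preds)]
print(ensemble_preds)  # [1, 0, 1, 0]
```

With an odd number of voters there is always a strict majority for binary labels, which is one practical reason three-member ensembles like RF+SVM+DT are a common choice.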


Subjects
Machine Learning; Support Vector Machine; Bayes Theorem; Neural Networks, Computer; Cluster Analysis