ABSTRACT
PURPOSE: During critical illness, interpretation of serum creatinine is affected by non-steady-state conditions, reduced creatinine generation, and altered distribution. We evaluated healthcare professionals' ability to adjudicate underlying kidney function based on simulated creatinine values. METHODS: We developed an online survey incorporating 12 scenarios with simulated trajectories of creatinine, based on profiles of muscle mass, GFR, and fluid balance generated with bespoke kinetic modelling. Participants predicted true underlying GFR (<5, 5-14, 15-29, 30-44, 45-59, 60-90, >90 mL/min/1.73 m2) and AKI stage (stages 1-3, defined as 33%, 50%, and 66% decreases in GFR from baseline) during the first 7 days and at ICU discharge. RESULTS: Of 103 respondents from 16 countries, 94 completed one or more scenarios; 43 (43%) were senior physicians, 74 (74%) were critical care physicians, and 31 (31%) were nephrology physicians. Over the first 7 days, true GFR was correctly estimated 43% of the time and underlying AKI stage on 57% of patient days. At ICU discharge, GFR was correctly predicted 35% of the time. At all timepoints, both over- and under-estimation of GFR were observed. CONCLUSION: Participants displayed marked variation in estimation of kidney function, suggesting difficulty in accounting for multiple confounders. There is a need for alternative, unbiased measures of kidney function in critical illness to avoid misclassifying kidney disease.
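For orientation, the sketch below shows the general form of a one-compartment creatinine kinetic model of the kind described (constant generation, GFR-dependent clearance, and a distribution volume that drifts with fluid balance). All parameter values are illustrative assumptions and are not those of the survey's bespoke model.

# Minimal one-compartment creatinine kinetics sketch (illustrative assumptions only).
# Mass balance: dA/dt = G - (GFR/V) * A, with the distribution volume V varying with fluid balance.
import numpy as np

def simulate_creatinine(days=7, dt_h=1.0, gen_mg_per_h=60.0,
                        gfr_ml_min=30.0, v0_l=42.0, fluid_gain_l_per_day=1.0,
                        scr0_mg_dl=0.9):
    """Return hourly serum creatinine (mg/dL) under non-steady-state conditions."""
    n = int(days * 24 / dt_h)
    v = v0_l                                 # distribution volume (L), roughly total body water
    a = scr0_mg_dl * 10.0 * v                # total creatinine mass (mg); mg/dL -> mg/L is *10
    scr = np.empty(n)
    for t in range(n):
        clearance_l_per_h = gfr_ml_min * 60.0 / 1000.0   # mL/min -> L/h
        conc_mg_l = a / v
        a += (gen_mg_per_h - clearance_l_per_h * conc_mg_l) * dt_h
        v += fluid_gain_l_per_day / 24.0 * dt_h          # dilution from positive fluid balance
        scr[t] = (a / v) / 10.0                          # back to mg/dL
    return scr

trajectory = simulate_creatinine()
print(f"Day-7 creatinine: {trajectory[-1]:.2f} mg/dL")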
ABSTRACT
BACKGROUND: Stepped-wedge cluster trials (SW-CTs) are a cluster trial design in which treatment rollout is staggered over the course of the trial. Clusters are typically randomized to begin treatment at different time points (a design commonly referred to as a stepped-wedge cluster randomized trial; SW-CRT), but they can also be non-randomized. Trials with this design regularly have a low number of clusters and can be vulnerable to covariate imbalance. To address such imbalance, previous work has examined covariate-constrained randomization and analysis adjustment for imbalanced covariates in mixed-effects models. These methods require the imbalanced covariate to be known and measured. In contrast, the fixed-effects model automatically adjusts for all imbalanced time-invariant covariates, both measured and unmeasured, and has been shown to provide proper type I error control in SW-CTs with a small number of clusters and binary outcomes. METHODS: We present a simulation study comparing the performance of the fixed-effects model against the mixed-effects model in randomized and non-randomized SW-CTs with small numbers of clusters and continuous outcomes. Additionally, we compare these models in scenarios with cluster-level covariate imbalances or confounding. RESULTS: We found that the mixed-effects model can have low coverage probabilities and inflated type I error rates in SW-CTs with continuous outcomes, especially with a small number of clusters or when the ICC is low. Furthermore, mixed-effects models with a Satterthwaite or Kenward-Roger small sample correction can still result in inflated or overly conservative type I error rates, respectively. In contrast, the fixed-effects model consistently produced the target level of coverage probability and type I error rates without dramatically compromising power. Furthermore, the fixed-effects model was able to automatically account for all time-invariant cluster-level covariate imbalances and confounding to robustly yield unbiased estimates. CONCLUSIONS: We recommend the fixed-effects model for robust analysis of SW-CTs with a small number of clusters and continuous outcomes, due to its proper type I error control and ability to automatically adjust for all potential imbalanced time-invariant cluster-level covariates and confounders.
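A minimal sketch of the two analysis approaches being contrasted, fitted to one simulated stepped-wedge dataset with statsmodels, is shown below. The cluster count, effect sizes, and variance components are arbitrary assumptions, not the paper's simulation settings.

# Sketch: fixed-effects vs mixed-effects analysis of one simulated stepped-wedge dataset.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_clusters, n_periods, n_per_cell = 6, 7, 10
rows = []
for c in range(n_clusters):
    u_c = rng.normal(0, 0.5)                     # time-invariant cluster effect
    for p in range(n_periods):
        treat = int(p >= c + 1)                  # staggered rollout: cluster c crosses over at period c+1
        for _ in range(n_per_cell):
            rows.append({"y": 1.0 + 0.3 * p + 0.5 * treat + u_c + rng.normal(),
                         "treat": treat, "period": p, "cluster": c})
df = pd.DataFrame(rows)

# Mixed-effects model: random cluster intercept plus fixed period effects.
mixed = smf.mixedlm("y ~ treat + C(period)", df, groups=df["cluster"]).fit()
# Fixed-effects model: cluster indicators absorb all time-invariant cluster-level covariates.
fixed = smf.ols("y ~ treat + C(period) + C(cluster)", df).fit()
print("mixed-effects treatment estimate:", round(mixed.params["treat"], 3))
print("fixed-effects treatment estimate:", round(fixed.params["treat"], 3))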
Subject(s)
Computer Simulation, Statistical Models, Randomized Controlled Trials as Topic, Research Design, Humans, Cluster Analysis, Randomized Controlled Trials as Topic/methods, Randomized Controlled Trials as Topic/statistics & numerical data, Statistical Data Interpretation, Time Factors, Treatment Outcome, Sample Size
ABSTRACT
Measuring disease progression in clinical trials testing novel treatments for multifaceted diseases such as progressive supranuclear palsy (PSP) remains challenging. In this study we assess a range of statistical approaches to compare outcomes as measured by the items of the Progressive Supranuclear Palsy Rating Scale (PSPRS). We consider several statistical approaches, including sum scores, a modified PSPRS rating scale recommended by the FDA in a pre-IND meeting, multivariate tests, and analysis approaches based on multiple comparisons of the individual items. In addition, we propose two novel approaches that measure disease status based on Item Response Theory (IRT) models. We assess the performance of these tests under various scenarios in an extensive simulation study and illustrate their use with a re-analysis of the ABBV-8E12 clinical trial. Furthermore, we discuss the impact of the FDA-recommended scoring of item scores on the power of the statistical tests. We find that classical approaches such as the PSPRS sum score demonstrate moderate to high power when treatment effects are consistent across the individual items. The tests based on IRT models yield the highest power when the simulated data are generated from an IRT model. The multiple-testing-based approaches have higher power in settings where the treatment effect is limited to certain domains or items. The study demonstrates that there is no one-size-fits-all testing procedure for evaluating treatment effects using PSPRS items; the optimal method varies with the specific pattern of effect sizes. The efficiency of the PSPRS sum score, while generally robust and straightforward to apply, depends on the pattern of effect sizes encountered, and more powerful alternatives are available in specific settings. These findings can have important implications for the design of future clinical trials in PSP and similar multifaceted diseases.
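The following small sketch illustrates two of the compared strategies on simulated item-level data: a t-test on the sum score versus a Bonferroni-adjusted minimum-p test over individual items, with the treatment effect confined to a few items. The item count, effect pattern, and sample sizes are illustrative assumptions only.

# Sketch: sum-score test vs item-wise multiple-comparison test on simulated PSPRS-like items.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_arm, n_items = 100, 28                     # illustrative; PSPRS has 28 items
effect = np.zeros(n_items)
effect[:5] = 0.4                                 # treatment effect confined to a few items

control = rng.normal(0.0, 1.0, size=(n_per_arm, n_items))
treated = rng.normal(0.0, 1.0, size=(n_per_arm, n_items)) - effect  # lower = less progression

# 1) Test on the sum score.
t_sum, p_sum = stats.ttest_ind(treated.sum(axis=1), control.sum(axis=1))

# 2) Bonferroni-adjusted minimum p-value over individual items.
p_items = np.array([stats.ttest_ind(treated[:, j], control[:, j]).pvalue
                    for j in range(n_items)])
p_minp = min(1.0, p_items.min() * n_items)

print(f"sum-score p = {p_sum:.3f}, Bonferroni min-p = {p_minp:.3f}")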
Subject(s)
Progressive Supranuclear Palsy, Progressive Supranuclear Palsy/diagnosis, Humans, Multivariate Analysis, Clinical Trials as Topic, Disease Progression
ABSTRACT
Introduction: Parkinson's disease (PD) affects over 8.5 million people, and there are currently no medications approved to treat the underlying disease. Clinical trials of disease-modifying therapies (DMTs) are hampered by a lack of sufficiently sensitive measures to detect treatment effects. Reliable digital assessments of motor function allow frequent at-home measurements that may sensitively detect disease progression. Methods: Here, we estimate the test-retest reliability of a suite of at-home motor measures derived from raw triaxial accelerometry data collected from 44 participants (21 with confirmed PD) and use the estimates to simulate digital measures in DMT trials. We consider three schedules of assessments and fit linear mixed models to the simulated data to determine whether a treatment effect can be detected. Results: We find that the at-home measures vary in reliability; many have ICCs as high as or higher than that of the MDS-UPDRS Part III total score. Compared with quarterly in-clinic assessments, frequent at-home measures reduce the sample size needed to detect a 30% reduction in disease progression from over 300 per study arm to 150 for assessment bursts and to fewer than 100 for evenly spaced at-home assessments. The results regarding the superiority of at-home assessments for detecting change over time are robust to relaxing assumptions about responsiveness to disease progression and variability in progression rates. Discussion: Overall, at-home measures have a favorable reliability profile for sensitive detection of treatment effects in DMT trials. Future work is needed to better understand the causes of variability in PD progression and to identify the most appropriate statistical methods for effect detection.
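As a minimal sketch of the reliability metric involved, test-retest ICC can be estimated from repeated measurements with a random-intercept model as the between-subject variance divided by total variance. The data below are simulated with arbitrary variance components and are not the study's measurements.

# Sketch: test-retest ICC from repeated measurements via a random-intercept model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_subj, n_rep = 44, 5
sigma_b, sigma_e = 1.0, 0.6                      # illustrative variance components
subj = np.repeat(np.arange(n_subj), n_rep)
y = rng.normal(0, sigma_b, n_subj)[subj] + rng.normal(0, sigma_e, n_subj * n_rep)
df = pd.DataFrame({"y": y, "subj": subj})

fit = smf.mixedlm("y ~ 1", df, groups=df["subj"]).fit()
var_between = float(fit.cov_re.iloc[0, 0])       # random-intercept (between-subject) variance
var_within = fit.scale                           # residual (within-subject) variance
icc = var_between / (var_between + var_within)
print(f"estimated ICC = {icc:.2f}")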
ABSTRACT
BACKGROUND: Up to half of the children with new-onset type 1 diabetes present to the hospital with diabetic ketoacidosis, a life-threatening condition that can develop because of diagnostic delay. Three-quarters of Australian children visit their general practitioner (GP) the week before presenting to the hospital with diabetic ketoacidosis. Our prototype, DIRECT-T1DM (Decision-Support for Integrated, Real-Time Evaluation and Clinical Treatment of Type 1 Diabetes Mellitus), is an electronic clinical decision support tool that promotes immediate point-of-care testing in general practice to confirm the suspicion of diabetes. This avoids laboratory testing, which has been documented internationally as a cause of diagnostic delay. OBJECTIVE: In this investigation, we aimed to pilot and assess the feasibility and acceptability of our prototype to GP end users. We also explored the challenges of diagnosing type 1 diabetes in the Australian general practice context. METHODS: In total, 4 GPs, a pediatric endocrinologist, and a PhD candidate were involved in conceptualizing the DIRECT-T1DM prototype, which was developed at the Department of General Practice and Primary Care at the University of Melbourne. Furthermore, 6 GPs were recruited via convenience sampling to evaluate the tool. The study involved 3 phases: a presimulation interview, simulated clinical scenarios, and a postsimulation interview. The interview guide was developed based on the Consolidated Framework for Implementation Research (CFIR). All phases of the study were video, audio, and screen recorded. Audio recordings were transcribed by the investigating team. Analysis was carried out using CFIR as the underlying framework. RESULTS: Major themes were identified across 3 domains and 7 constructs of the CFIR: (1) outer setting: time pressure, difficulty in diagnosing pediatric type 1 diabetes, and secondary care considerations influenced GPs' needs regarding DIRECT-T1DM; (2) inner setting: DIRECT-T1DM fits within existing workflows, it has a high relative priority due to its importance in patient safety, and GPs exhibited high tension for change; and (3) innovation: design recommendations included altering coloring to reflect urgency, font style and bolding, specific language, information and guidelines, and inclusion of patient information sheets. CONCLUSIONS: End-user acceptability of DIRECT-T1DM was high. This was largely due to its implications for patient safety and its "real-time" nature. DIRECT-T1DM may assist in appropriate management of children with new-onset diabetes, which is an uncommon event in general practice, through safety netting.
Subject(s)
Clinical Decision Support Systems, Type 1 Diabetes Mellitus, Feasibility Studies, General Practitioners, Qualitative Research, Humans, Type 1 Diabetes Mellitus/diagnosis, Child, Female, Male, Australia, Interviews as Topic, Adult
ABSTRACT
Methodological development for the health, biomedical, and biological sciences is an active area of the statistical literature. The approach of modelling a declining hazard function obtained by compounding a truncated Poisson distribution with a lifetime distribution has received particular attention in a few studies. In this paper we propose a newly introduced distribution, the inverse Lomax-Uniform Poisson distribution, intended mainly for applications in the health, biomedical, biological, and related fields. Some basic statistical properties of the distribution are discussed. The capability of the model is assessed by comparing it with six candidate models on a practical real data set; based on these comparisons, the newly proposed model outperforms all of its counterparts. A simulation study is also conducted. Furthermore, the joint modelling of repeatedly measured and time-to-event processes is discussed in detail using a real data set from the health sector.
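For context, the compounding mechanism mentioned has a standard generic form, sketched below for the minimum of N i.i.d. lifetimes with baseline cdf G and a zero-truncated Poisson N; with an exponential baseline this reduces to the well-known exponential-Poisson model with decreasing hazard. The specific inverse Lomax-Uniform Poisson construction in the paper may differ in detail from this sketch.

\[
F(x) \;=\; \frac{1 - e^{-\theta\, G(x)}}{1 - e^{-\theta}}, \qquad
f(x) \;=\; \frac{\theta\, g(x)\, e^{-\theta\, G(x)}}{1 - e^{-\theta}}, \qquad \theta > 0,
\]

where G(x) and g(x) are the cdf and pdf of the baseline lifetime distribution and N is zero-truncated Poisson with mean parameter theta.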
Subject(s)
Statistical Models, Poisson Distribution, Humans, Computer Simulation
ABSTRACT
BACKGROUND: A more sustainable diet with fewer animal-based products has a lower ecological impact but might provide less protein, in both quantity and quality. The extent to which shifting to more plant-based diets affects the adequacy of protein intake in older adults needs to be studied. OBJECTIVES: We simulated how a transition towards a more plant-based diet (flexitarian, pescetarian, vegetarian, or vegan) affects protein availability in the diets of older adults. SETTING: Community. PARTICIPANTS: Data from the Dutch National Food Consumption Survey 2019-2021 of community-dwelling older adults (n = 607) were used. MEASUREMENTS: Food consumption data were collected via two 24-h dietary recalls per participant. Protein availability was expressed as total protein, digestible protein, and utilizable protein (based on the digestibility-corrected amino acid score) intake. The percentage below the estimated average requirement (EAR) for utilizable protein was assessed using an adjusted EAR. RESULTS: Compared to the original diet (~62% animal-based), utilizable protein intake decreased by about 5% in the flexitarian, pescetarian, and vegetarian scenarios. In the vegan scenario, both total and utilizable protein intake were lower, resulting in nearly 50% less utilizable protein compared with the original diet. In the original diet, the protein intake of 7.5% of men and 11.1% of women did not meet the EAR. This proportion increased slightly in the flexitarian, pescetarian, and vegetarian scenarios. In the vegan scenario, 83.3% (both genders) had a protein intake below the EAR. CONCLUSIONS: Replacing animal-based protein sources with plant-based food products in older adults reduces both protein quantity and quality, albeit minimally in non-vegan plant-rich diets. In a vegan scenario, the risk of inadequate protein intake is imminent.
Subject(s)
Vegan Diet, Dietary Proteins, Humans, Aged, Male, Female, Dietary Proteins/administration & dosage, Netherlands, Independent Living, Vegetarian Diet/statistics & numerical data, Aged 80 and over, Nutritional Requirements, Diet Surveys, Prevalence, Dietary Patterns
ABSTRACT
The development of methods for the meta-analysis of diagnostic test accuracy (DTA) studies is still an active area of research. While methods for the standard case, in which each study reports a single pair of sensitivity and specificity, are now applied almost routinely, methods to meta-analyze receiver operating characteristic (ROC) curves are not widely used. This situation is more complex, as each primary DTA study may report several pairs of sensitivity and specificity, each corresponding to a different threshold. In a case study published earlier, we applied a number of methods for meta-analyzing DTA studies with multiple thresholds to a real-world data example (Zapf et al., Biometrical Journal. 2021; 63(4): 699-711). To date, no simulation study exists that systematically compares different approaches with respect to their performance in various scenarios when the truth is known. In this article, we aim to fill this gap and present the results of a simulation study comparing three frequentist approaches for the meta-analysis of ROC curves. We performed a systematic simulation study motivated by an example from medical research. In the simulations, all three approaches worked reasonably well, though none performed well in every scenario. The approach by Hoyer and colleagues was slightly superior in most scenarios and is recommended in practice.
Subject(s)
Biometry, Meta-Analysis as Topic, ROC Curve, Biometry/methods, Routine Diagnostic Tests/methods, Humans, Computer Simulation
ABSTRACT
The primary purpose of this article is to examine the estimation of the finite population distribution function using known auxiliary information, such as the population mean and ranks of the auxiliary variables. Two improved estimators are developed to better estimate the distribution function (DF) of a finite population. The bias and mean squared error of the suggested and existing estimators are derived up to the first order of approximation. To assess efficiency, we compare the suggested estimators with their existing counterparts. The numerical results, based on six real data sets, show that the suggested classes of estimators perform well. The strength and generality of the suggested estimators are also verified through a simulation analysis. Based on the results for the real data sets and the simulation study, the suggested estimators outperform all existing estimators considered in this study.
Subject(s)
Statistical Models, Computer Simulation, Humans, Algorithms
ABSTRACT
Abundance estimation is frequently an objective of conservation and monitoring initiatives for threatened and other managed populations. While abundance estimation via capture-mark-recapture or spatially explicit capture-recapture is now common, such approaches are logistically challenging and expensive for species such as boreal caribou (Rangifer tarandus), which inhabit remote regions, are widely dispersed, and exist at low densities. Fortunately, the recently developed 'close-kin mark-recapture' (CKMR) framework, which uses the number of kin pairs obtained within a sample to generate an abundance estimate, eliminates the need for multiple sampling events. As a result, some caribou managers are interested in using this method to generate an abundance estimate from a single, non-invasive sampling event for caribou populations. We conducted a simulation study using realistic boreal caribou demographic rates and population sizes to assess how population size and the proportion of the population surveyed impact the accuracy and precision of single-survey CKMR-based abundance estimates. Our results indicated that abundance estimates were biased and highly imprecise when very small proportions of the population were sampled, regardless of the population size. However, the larger the population size, the smaller the required proportion of the population surveyed to generate both accurate and reasonably precise estimates. We also present a case study in which we used the CKMR framework to generate annual female abundance estimates for a small caribou population in Jasper National Park, Alberta, Canada, from 2006 to 2015 and compared them to existing published capture-mark-recapture-based estimates. Both the accuracy and precision of the annual CKMR-based abundance estimates varied across years and were sensitive to the proportion of pairwise kinship comparisons that yielded a mother-offspring pair. Taken together, our study demonstrates that it is possible to generate CKMR-based abundance estimates from a single sampling event for small caribou populations, so long as a sufficient sampling intensity can be achieved.
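The following is a highly simplified sketch of the core CKMR idea for a single survey: with roughly N reproductive females, each (sampled female, sampled offspring) comparison is a mother-offspring pair with probability about 1/N, so N can be estimated from the number of comparisons and matches. The sketch ignores mortality, age structure, and variation in fecundity, and the population sizes and sampling fractions are arbitrary assumptions.

# Highly simplified close-kin mark-recapture sketch (ignores survival, age structure, fecundity variation).
import numpy as np

rng = np.random.default_rng(3)
n_females, n_offspring = 500, 400
mothers = rng.integers(0, n_females, size=n_offspring)    # each offspring's mother

def ckmr_estimate(sample_fraction):
    sampled_females = rng.random(n_females) < sample_fraction
    sampled_offspring = rng.random(n_offspring) < sample_fraction
    comparisons = sampled_females.sum() * sampled_offspring.sum()
    # mother-offspring pairs found: sampled offspring whose mother was also sampled
    mops = int(np.sum(sampled_females[mothers] & sampled_offspring))
    # E[MOPs] is roughly comparisons / N_females, so N_hat = comparisons / MOPs
    return comparisons / mops if mops > 0 else np.nan

for frac in (0.05, 0.2, 0.5):
    est = np.nanmean([ckmr_estimate(frac) for _ in range(200)])
    print(f"sampling {frac:.0%}: mean N_hat ~ {est:.0f} (true {n_females})")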
ABSTRACT
The fit of a regression model to new data is often worse than its apparent fit to the development data because of overfitting. Analysts use variable selection techniques to develop parsimonious regression models, which may introduce bias into regression estimates. Shrinkage methods have been proposed to mitigate overfitting and reduce bias in estimates, and post-estimation shrinkage is an alternative to penalized methods. This study evaluates the effectiveness of post-estimation shrinkage in improving the prediction performance of full and selected models. In a simulation study, post-estimation shrinkage was compared with ordinary least squares (OLS) and ridge regression for full models, and with best subset selection (BSS) and the lasso for selected models, focusing on prediction errors and the number of selected variables. Additionally, we propose a modified version of the parameter-wise shrinkage (PWS) approach, named non-negative PWS (NPWS), to address weaknesses of PWS. Results showed that no method was superior in all scenarios. In full models, NPWS outperformed global shrinkage, whereas PWS was inferior to OLS. Under low correlation with moderate-to-high signal-to-noise ratio (SNR), NPWS outperformed ridge regression, but ridge performed best with small sample sizes, high correlation, and low SNR. In selected models, all post-estimation shrinkage methods performed similarly, with global shrinkage slightly inferior. The lasso outperformed BSS and post-estimation shrinkage with small sample sizes, low SNR, and high correlation, but was inferior when the opposite held. Our study suggests that, with sufficient information, NPWS is more effective than global shrinkage in improving the prediction accuracy of models. However, with high correlation, small sample sizes, and low SNR, penalized methods generally outperform post-estimation shrinkage methods.
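A minimal sketch of global post-estimation shrinkage is given below: fit OLS, estimate a single shrinkage factor by regressing held-out outcomes on cross-validated linear predictors, then shrink the slopes. Parameter-wise shrinkage instead estimates one factor per coefficient, and the NPWS variant described presumably constrains those factors to be non-negative; that reading, and all data-generating values here, are assumptions rather than the paper's exact algorithm.

# Sketch: global post-estimation shrinkage of OLS coefficients via cross-validation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(42)
n, p = 100, 8
X = rng.normal(size=(n, p))
beta = np.array([1.0, 0.8, 0.5, 0.3, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.normal(scale=2.0, size=n)

# Cross-validated linear predictors from the full OLS model.
lp_cv = np.empty(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    fit = LinearRegression().fit(X[train], y[train])
    lp_cv[test] = fit.predict(X[test])

# Global shrinkage factor: slope of y on the (centered) cross-validated linear predictor.
lp_c = lp_cv - lp_cv.mean()
c_hat = float(np.sum(lp_c * (y - y.mean())) / np.sum(lp_c ** 2))

full_fit = LinearRegression().fit(X, y)
shrunk_coef = c_hat * full_fit.coef_          # intercept would be re-estimated in practice
print(f"estimated global shrinkage factor = {c_hat:.2f}")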
Subject(s)
Biometry, Biometry/methods, Linear Models, Humans
ABSTRACT
In health technology assessment, matching-adjusted indirect comparison (MAIC) is the most common method for pairwise comparisons that control for imbalances in baseline characteristics across trials. One of the primary challenges in MAIC is the need to properly account for the additional uncertainty introduced by the matching process. Limited evidence and guidance are available on variance estimation in MAICs. Therefore, we conducted a comprehensive Monte Carlo simulation study to evaluate the performance of different statistical methods across 108 scenarios. Four general approaches for variance estimation were compared in both anchored and unanchored MAICs of binary and time-to-event outcomes: (1) conventional estimators (CE) using raw weights; (2) CE using weights rescaled to the effective sample size (ESS); (3) robust sandwich estimators; and (4) bootstrapping. Several variants of sandwich estimators and bootstrap methods were tested. Performance was quantified on the basis of empirical coverage probabilities for 95% confidence intervals and variability ratios. Variability was underestimated by CE + raw weights when population overlap was poor or moderate. Despite several theoretical limitations, CE + ESS weights accurately estimated uncertainty across most scenarios. Original implementations of sandwich estimators had a downward bias in MAICs with a small ESS, and finite sample adjustments led to marked improvements. Bootstrapping was unstable if population overlap was poor and the sample size was limited. All methods produced valid coverage probabilities and standard errors in cases of strong population overlap. Our findings indicate that the sample size, population overlap, and outcome type are important considerations for variance estimation in MAICs.
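For orientation, the sketch below shows the standard MAIC weighting step (method-of-moments weights that match the individual-level covariate means to aggregate target means) together with the effective sample size used when rescaling weights. The covariates and target means are illustrative assumptions, and the variance-estimation methods compared in the paper are not reproduced here.

# Sketch: MAIC weights by method of moments, plus effective sample size (ESS).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 300
X = np.column_stack([rng.normal(60, 10, n),      # e.g. age
                     rng.binomial(1, 0.4, n)])   # e.g. proportion female
target_means = np.array([65.0, 0.55])            # aggregate means from the comparator trial

Xc = X - target_means                            # center covariates at the target means
# Weights w_i = exp(Xc_i @ a); solving sum_i w_i * Xc_i = 0 is equivalent to
# minimizing the convex objective Q(a) = sum_i exp(Xc_i @ a).
res = minimize(lambda a: np.sum(np.exp(Xc @ a)), x0=np.zeros(Xc.shape[1]), method="BFGS")
w = np.exp(Xc @ res.x)

print("weighted covariate means:", (w[:, None] * X).sum(axis=0) / w.sum())  # should match targets
ess = w.sum() ** 2 / np.sum(w ** 2)
print(f"effective sample size = {ess:.1f} (from n = {n})")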
ABSTRACT
Many clinical trials involve partially clustered data, where some observations belong to a cluster and others can be considered independent. For example, neonatal trials may include infants from single or multiple births. Sample size and analysis methods for these trials have received limited attention. A simulation study was conducted to (1) assess whether existing power formulas based on generalized estimating equations (GEEs) provide an adequate approximation to the power achieved by mixed effects models, and (2) compare the performance of mixed models vs GEEs in estimating the effect of treatment on a continuous outcome. We considered clusters that exist prior to randomization with a maximum cluster size of 2, three methods of randomizing the clustered observations, and simulated datasets with uninformative cluster size and the sample size required to achieve 80% power according to GEE-based formulas with an independence or exchangeable working correlation structure. The empirical power of the mixed model approach was close to the nominal level when sample size was calculated using the exchangeable GEE formula, but was often too high when the sample size was based on the independence GEE formula. The independence GEE always converged and performed well in all scenarios. Performance of the exchangeable GEE and mixed model was also acceptable under cluster randomization, though under-coverage and inflated type I error rates could occur with other methods of randomization. Analysis of partially clustered trials using GEEs with an independence working correlation structure may be preferred to avoid the limitations of mixed models and exchangeable GEEs.
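A small sketch of the recommended analysis is shown below: a GEE with an independence working correlation fitted to simulated partially clustered data, with singletons treated as clusters of size one. The proportion of twins, effect size, and variance components are illustrative assumptions, not the simulation settings of the paper.

# Sketch: GEE with independence working correlation for a partially clustered trial.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
rows, cid = [], 0
for _ in range(120):                              # 120 randomization units (clusters)
    size = 2 if rng.random() < 0.3 else 1         # ~30% twins, the rest singletons
    treat = int(rng.random() < 0.5)               # cluster randomization
    u = rng.normal(0, 0.5)                        # shared within-cluster effect
    for _ in range(size):
        rows.append({"cluster": cid, "treat": treat,
                     "y": 0.3 * treat + u + rng.normal(0, 1.0)})
    cid += 1
df = pd.DataFrame(rows)

gee = sm.GEE.from_formula("y ~ treat", groups="cluster", data=df,
                          cov_struct=sm.cov_struct.Independence()).fit()
print(gee.summary().tables[1])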
Subject(s)
Computer Simulation, Statistical Models, Humans, Cluster Analysis, Sample Size, Randomized Controlled Trials as Topic/statistics & numerical data, Randomized Controlled Trials as Topic/methods, Clinical Trials as Topic/methods, Clinical Trials as Topic/statistics & numerical data, Statistical Data Interpretation, Newborn Infant
ABSTRACT
Ranked set sampling (RSS) is known to increase the efficiency of estimators relative to simple random sampling. Missingness creates a gap in the information that needs to be addressed before proceeding to estimation, yet little work has been carried out on handling missingness under RSS. This paper proposes some logarithmic-type methods of imputation for the estimation of the population mean under RSS using auxiliary information. The properties of the suggested imputation procedures are examined. A simulation study shows that the proposed imputation procedures perform better than some of the existing imputation procedures. A few real-data applications of the proposed imputation procedures are also provided to complement the simulation study.
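For readers unfamiliar with the sampling scheme, the sketch below draws a ranked set sample of set size m over r cycles, with judgement ranking simulated by ranking on the variable itself. The population and sizes are arbitrary assumptions, and the paper's logarithmic imputation estimators are not reproduced.

# Sketch: drawing a ranked set sample of size m*r from a population.
import numpy as np

rng = np.random.default_rng(11)
population = rng.lognormal(mean=3.0, sigma=0.6, size=10_000)

def ranked_set_sample(pop, m=4, r=5):
    """m = set size, r = number of cycles; returns m*r measured units."""
    sample = []
    for _ in range(r):
        for i in range(m):
            candidates = rng.choice(pop, size=m, replace=False)
            sample.append(np.sort(candidates)[i])   # measure only the i-th ranked unit
    return np.array(sample)

rss = ranked_set_sample(population)
srs = rng.choice(population, size=rss.size, replace=False)
print(f"RSS mean = {rss.mean():.1f}, SRS mean = {srs.mean():.1f}, true mean = {population.mean():.1f}")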
ABSTRACT
Fertility awareness-based methods (FABMs), also known as natural family planning (NFP), enable couples to identify the days of the menstrual cycle when intercourse may result in pregnancy ("fertile days"), and to avoid intercourse on fertile days if they wish to avoid pregnancy. Thus, these methods are fully dependent on user behavior for effectiveness to avoid pregnancy. For couples and clinicians considering the use of an FABM, one important metric to consider is the highest expected effectiveness (lowest possible pregnancy rate) during the correct use of the method to avoid pregnancy. To assess this, most studies of FABMs have reported a method-related pregnancy rate (a cumulative proportion), which is calculated based on all cycles (or months) in the study. In contrast, the correct use to avoid pregnancy rate (also a cumulative proportion) has the denominator of cycles with the correct use of the FABM to avoid pregnancy. The relationship between these measures has not been evaluated quantitatively. We conducted a series of simulations demonstrating that the method-related pregnancy rate is artificially decreased in direct proportion to the proportion of cycles with intermediate use (any use other than correct use to avoid or targeted use to conceive), which also increases the total pregnancy rate. Thus, as the total pregnancy rate rises (related to intermediate use), the method-related pregnancy rate falls artificially while the correct use pregnancy rate remains constant. For practical application, we propose the core elements needed to assess correct use cycles in FABM studies. Summary: Fertility awareness-based methods (FABMs) can be used by couples to avoid pregnancy, by avoiding intercourse on fertile days. Users want to know what the highest effectiveness (lowest pregnancy rate) would be if they use an FABM correctly and consistently to avoid pregnancy. In this simulation study, we compare two different measures: (1) the method-related pregnancy rate; and (2) the correct use pregnancy rate. We show that the method-related pregnancy rate is biased too low if some users in the study are not using the method consistently to avoid pregnancy, while the correct use pregnancy rate obtains an accurate estimate. Short Summary: In FABM studies, the method-related pregnancy rate is biased too low, but the correct use pregnancy rate is unbiased.
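A tiny numerical sketch of the dilution effect described follows: with a fixed number of pregnancies occurring despite correct use, adding intermediate-use cycles to the denominator lowers the method-related rate while the correct-use rate is unchanged. Simple per-cycle proportions are used here for illustration rather than the cumulative life-table estimates used in actual FABM studies, and all counts are arbitrary.

# Sketch: dilution of the method-related pregnancy rate by intermediate-use cycles.
method_pregnancies = 3            # pregnancies despite correct use to avoid
correct_use_cycles = 1000

for intermediate_cycles in (0, 500, 1000):
    all_cycles = correct_use_cycles + intermediate_cycles
    method_rate = method_pregnancies / all_cycles            # denominator: all cycles
    correct_rate = method_pregnancies / correct_use_cycles   # denominator: correct-use cycles only
    print(f"intermediate cycles {intermediate_cycles:4d}: "
          f"method-related rate {100*method_rate:.2f}%, correct-use rate {100*correct_rate:.2f}%")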
ABSTRACT
Many clinical trials assess time-to-event endpoints. To describe the difference between groups in terms of time to event, hazard ratios are commonly employed. However, the hazard ratio is only informative in the case of proportional hazards (PHs) over time. Many other effect measures exist that do not require PHs; one of them is the average hazard ratio (AHR). Its core idea is to utilize a time-dependent weighting function that accounts for time variation. Although advocated in methodological research papers, the AHR is rarely used in practice. To facilitate its application, we describe approaches for sample size calculation for an AHR test. We assess the reliability of the sample size calculation through extensive simulation studies covering various survival and censoring distributions with proportional as well as nonproportional hazards (N-PHs). The findings suggest that a simulation-based sample size calculation approach can be useful for designing clinical trials with N-PHs. Using the AHR can result in increased statistical power to detect differences between groups with more efficient sample sizes.
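Below is a generic simulation-based power loop of the kind described, using Weibull survival times with different shapes to induce non-proportional hazards. A standard log-rank test (via the lifelines package) is used purely as a placeholder for the AHR test discussed in the paper, and the distributions, censoring time, and sample sizes are illustrative assumptions.

# Sketch: simulation-based power estimation under non-proportional hazards.
import numpy as np
from lifelines.statistics import logrank_test   # placeholder for an AHR-based test

rng = np.random.default_rng(2024)

def simulate_power(n_per_arm, n_sim=500, censor_time=24.0):
    hits = 0
    for _ in range(n_sim):
        # Weibull times with different shapes -> non-proportional hazards between arms.
        t0 = 12.0 * rng.weibull(1.0, n_per_arm)   # control: exponential-like hazard
        t1 = 14.0 * rng.weibull(1.5, n_per_arm)   # treatment: increasing hazard
        e0, e1 = t0 < censor_time, t1 < censor_time
        t0, t1 = np.minimum(t0, censor_time), np.minimum(t1, censor_time)
        res = logrank_test(t0, t1, event_observed_A=e0, event_observed_B=e1)
        hits += res.p_value < 0.05
    return hits / n_sim

for n in (100, 150, 200):
    print(f"n per arm {n}: estimated power {simulate_power(n):.2f}")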
Subject(s)
Proportional Hazards Models, Sample Size, Humans, Clinical Trials as Topic, Biometry/methods
ABSTRACT
Ultrasound computed tomography (USCT) is an emerging imaging modality that holds great promise for breast imaging. Full-waveform inversion (FWI)-based image reconstruction methods incorporate accurate wave physics to produce high spatial resolution quantitative images of speed of sound or other acoustic properties of the breast tissues from USCT measurement data. However, the high computational cost of FWI reconstruction represents a significant burden for its widespread application in a clinical setting. The research reported here investigates the use of a convolutional neural network (CNN) to learn a mapping from USCT waveform data to speed of sound estimates. The CNN was trained using a supervised approach with a task-informed loss function aimed at preserving features of the image that are relevant to the detection of lesions. A large set of anatomically and physiologically realistic numerical breast phantoms (NBPs) and corresponding simulated USCT measurements was employed during training. Once trained, the CNN can perform real-time FWI image reconstruction from USCT waveform data. The performance of the proposed method was assessed and compared against FWI using a hold-out sample of 41 NBPs and corresponding USCT data. Accuracy was measured using relative mean square error (RMSE), the structural similarity index measure (SSIM), and lesion detection performance (DICE score). This numerical experiment demonstrates that a supervised learning model can achieve accuracy comparable to FWI in terms of RMSE and SSIM, better task performance, and a significant reduction in computational time.
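A minimal sketch of a task-informed reconstruction loss of the general kind described is given below: a standard image-error term plus a term that up-weights error inside lesion regions. The weighting scheme, mask handling, and tensor shapes are assumptions for illustration, not the paper's actual loss.

# Sketch: task-informed reconstruction loss = global MSE + extra weight on lesion regions.
import torch

def task_informed_loss(pred, target, lesion_mask, lesion_weight=5.0):
    """pred, target: (batch, 1, H, W) speed-of-sound maps; lesion_mask: same shape, 0/1."""
    global_mse = torch.mean((pred - target) ** 2)
    lesion_mse = torch.sum(lesion_mask * (pred - target) ** 2) / lesion_mask.sum().clamp(min=1)
    return global_mse + lesion_weight * lesion_mse

# Illustrative usage with random tensors standing in for CNN output and ground truth.
pred = torch.rand(2, 1, 64, 64)
target = torch.rand(2, 1, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.95).float()
print(task_informed_loss(pred, target, mask).item())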
ABSTRACT
This article proposes and discusses a novel approach for generating trigonometric G-families using hybrid generalizers of distributions. The proposed generalizer is constructed using the tangent trigonometric function and the distribution function of a base model G(x). The newly proposed family of univariate continuous distributions is named the "Lomax Tangent Generalized Family of Distributions (LT-G)", and its structural, mathematical, and statistical properties are derived. Some special cases and sub-models of the proposed family are also presented. A Weibull-based model, the Lomax Tangent Weibull (LT-W) distribution, is discussed, and plots of its density (pdf) and hazard rate (hrf) functions are presented. Model parameters are estimated via maximum likelihood estimation (MLE), and the accuracy of the MLEs is evaluated through Monte Carlo simulation. Finally, to demonstrate the flexibility and potential of the proposed distribution, two real hydrological and strength data sets are analyzed, and the results are compared with well-known, competitive, and related existing distributions.
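For orientation only, a commonly used tangent-based generator in this strand of the literature takes the form below; the LT-G family layers a Lomax-based generalizer on top of such a construction, so its exact form may differ from this sketch.

\[
F(x) \;=\; \tan\!\Big(\tfrac{\pi}{4}\, G(x)\Big), \qquad
f(x) \;=\; \tfrac{\pi}{4}\, g(x)\, \sec^{2}\!\Big(\tfrac{\pi}{4}\, G(x)\Big),
\]

where G(x) and g(x) are the cdf and pdf of the baseline model; F increases from 0 to tan(pi/4) = 1 as G runs from 0 to 1, so it is a valid cdf.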
ABSTRACT
BACKGROUND AND OBJECTIVES: Comprehending the research dataset is crucial for obtaining reliable and valid outcomes. Health analysts must have a deep understanding of the data being analyzed, which allows them to suggest practical solutions for handling missing data in a clinical data source. Accurate handling of missing values is critical for producing precise estimates and making informed decisions, especially in crucial areas such as clinical research. With the increasing diversity and complexity of data, numerous scholars have developed a range of imputation techniques. We therefore conducted a systematic review to characterize imputation techniques according to tabular dataset characteristics, including the mechanism, pattern, and ratio of missingness, in order to identify the most appropriate imputation methods in the healthcare field. MATERIALS AND METHODS: We searched four databases, namely PubMed, Web of Science, Scopus, and IEEE Xplore, for articles published up to September 20, 2023, that discussed imputation methods for addressing missing values in clinically structured datasets. Our investigation of the selected articles focused on four key aspects: the mechanism, pattern, and ratio of missingness, and the imputation strategies used. By synthesizing insights from these perspectives, we constructed an evidence map to recommend suitable imputation methods for handling missing values in a tabular dataset. RESULTS: Out of 2955 articles, 58 were included in the analysis. Based on the structure of the missing values and the types of imputation methods used in the included studies, the evidence map showed that 45% of the studies employed conventional statistical methods, 31% utilized machine learning and deep learning methods, and 24% applied hybrid imputation techniques for handling missing values. CONCLUSION: Considering the structure and characteristics of missing values in a clinical dataset is essential for choosing the most appropriate data imputation technique, especially within conventional statistical methods. Accurately estimating missing values so that they reflect reality increases the likelihood of obtaining high-quality and reusable data, contributing significantly to precise medical decision-making processes. This review provides a guideline for choosing the most appropriate imputation methods during data preprocessing before performing analytical processes on structured clinical datasets.
Subject(s)
Biomedical Research, Humans, Statistical Data Interpretation, Biomedical Research/methods, Biomedical Research/standards, Biomedical Research/statistics & numerical data, Datasets as Topic
ABSTRACT
The test-negative design (TND) is a popular method for evaluating vaccine effectiveness (VE). A "classical" TND study includes symptomatic individuals tested for the disease targeted by the vaccine to estimate VE against symptomatic infection. However, recent applications of the TND have attempted to estimate VE against infection by including all tested individuals, regardless of their symptoms. In this article, we use directed acyclic graphs and simulations to investigate potential biases in TND studies of COVID-19 VE arising from the use of this "alternative" approach, particularly when applied during periods of widespread testing. We show that the inclusion of asymptomatic individuals can potentially lead to collider stratification bias, uncontrolled confounding by health and healthcare-seeking behaviors (HSBs), and differential outcome misclassification. While our focus is on the COVID-19 setting, the issues discussed here may also be relevant in the context of other infectious diseases. This may be particularly true in scenarios where there is either a high baseline prevalence of infection, a strong correlation between HSBs and vaccination, different testing practices for vaccinated and unvaccinated individuals, or settings where both the vaccine under study attenuates symptoms of infection and diagnostic accuracy is modified by the presence of symptoms.
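A small sketch of the collider mechanism described follows: with a vaccine that truly has no effect on infection, restricting the analysis to all tested individuals (symptomatic or screened) induces a spurious vaccine-infection association through health-seeking behaviour, whereas restricting to symptomatic tested individuals does not in this simplified setup. All probabilities below are arbitrary assumptions, not estimates from the article.

# Sketch: collider bias in an "alternative" TND that includes asymptomatic tested people.
import numpy as np

rng = np.random.default_rng(8)
n = 500_000
hsb = rng.random(n) < 0.5                            # health/healthcare-seeking behaviour
vacc = rng.random(n) < np.where(hsb, 0.8, 0.3)       # HSB -> vaccination
infect = rng.random(n) < 0.10                        # true VE against infection = 0
sympt = np.where(infect, rng.random(n) < 0.6,        # symptoms from the target infection
                 rng.random(n) < 0.05)               # or from other causes
# Testing: symptomatic people seek a test (more often if HSB); asymptomatic screening driven by HSB.
tested = np.where(sympt, rng.random(n) < np.where(hsb, 0.9, 0.5),
                  rng.random(n) < np.where(hsb, 0.3, 0.02))

def odds_ratio(mask):
    a = np.sum(mask & vacc & infect); b = np.sum(mask & vacc & ~infect)
    c = np.sum(mask & ~vacc & infect); d = np.sum(mask & ~vacc & ~infect)
    return (a * d) / (b * c)

print(f"OR among all tested (alternative TND): {odds_ratio(tested):.2f}")
print(f"OR among symptomatic tested (classical TND): {odds_ratio(tested & sympt):.2f}")
# With a null vaccine, an OR away from 1 in the first line reflects collider/selection bias.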