Results 1 - 20 of 626
1.
Genet Epidemiol ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38504141

ABSTRACT

Young breast and bowel cancers (e.g., those diagnosed before age 40 or 50 years) have far greater morbidity and mortality in terms of years of life lost, and are increasing in incidence, but have been less studied. For breast and bowel cancers, the familial relative risks, and therefore the familial variances in age-specific log(incidence), are much greater at younger ages, but little of these familial variances has been explained. Studies of families and twins can address questions not easily answered by studies of unrelated individuals alone. We describe existing and emerging family and twin data that can provide special opportunities for discovery. We present designs and statistical analyses, including novel ideas such as the VALID (Variance in Age-specific Log Incidence Decomposition) model for causes of variation in risk, the DEPTH (DEPendency of association on the number of Top Hits) and other approaches to analyse genome-wide association study data, and the within-pair, ICE FALCON (Inference about Causation from Examining FAmiliaL CONfounding) and ICE CRISTAL (Inference about Causation from Examining Changes in Regression coefficients and Innovative STatistical AnaLysis) approaches to causation and familial confounding. Example applications to breast and colorectal cancer are presented. Motivated by the availability of the resources of the Breast and Colon Cancer Family Registries, we also present some ideas for future studies that could be applied to, and compared with, cancers diagnosed at older ages and address the challenges posed by young breast and bowel cancers.

2.
Trends Genet ; 37(11): 995-1011, 2021 11.
Article in English | MEDLINE | ID: mdl-34243982

ABSTRACT

Accurate genetic prediction of complex traits can facilitate disease screening, improve early intervention, and aid in the development of personalized medicine. Genetic prediction of complex traits requires the development of statistical methods that can properly model polygenic architecture and construct a polygenic score (PGS). We present a comprehensive review of 46 methods for PGS construction. We connect the majority of these methods through a multiple linear regression framework which can be instrumental for understanding their prediction performance for traits with distinct genetic architectures. We discuss the practical considerations of PGS analysis as well as challenges and future directions of PGS method development. We hope our review serves as a useful reference both for statistical geneticists who develop PGS methods and for data analysts who perform PGS analysis.
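The additive construction at the heart of most PGS methods can be sketched in a few lines; the genotypes and effect sizes below are hypothetical, and the methods the review connects differ mainly in how the per-SNP weights are estimated:

```python
import numpy as np

def polygenic_score(genotypes, effect_sizes):
    """Compute a polygenic score as the weighted sum of allele counts.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    effect_sizes: (n_snps,) per-allele weights (e.g., marginal GWAS betas,
    or weights re-estimated jointly in a multiple linear regression framework).
    """
    return genotypes @ effect_sizes

# Hypothetical example: 4 individuals, 3 SNPs.
X = np.array([[0, 1, 2],
              [1, 1, 0],
              [2, 0, 1],
              [0, 2, 2]])
beta = np.array([0.10, -0.05, 0.20])
print(polygenic_score(X, beta))
```

Shrinkage-based methods (e.g., ridge-type estimators) would replace `beta` with regularized joint estimates, but the final scoring step is the same matrix product.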


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Phenotype
3.
Biostatistics ; 24(4): 901-921, 2023 10 18.
Article in English | MEDLINE | ID: mdl-35277956

ABSTRACT

Pharmacogenomic experiments allow for the systematic testing of drugs, at varying dosage concentrations, to study how genomic markers correlate with cell sensitivity to treatment. The first step in the analysis is to quantify the response of cell lines to variable dosage concentrations of the drugs being tested. The signal to noise in these measurements can be low due to biological and experimental variability. However, the increasing availability of pharmacogenomic studies provides replicated data sets that can be leveraged to gain power. To do this, we formulate a hierarchical mixture model to estimate drug-specific mixture distributions, both for estimating cell sensitivity and for classifying a drug's effect type as broad or targeted. We use this formulation to propose a unified approach that can yield the posterior probability that a cell is susceptible to a drug (conditional on the drug having a targeted effect) or relative effect sizes (conditional on the drug having a broad effect). We demonstrate the usefulness of our approach via case studies. First, we assess pairwise agreements for cell lines/drugs within the intersection of two data sets and confirm the moderate pairwise agreement between many publicly available pharmacogenomic data sets. We then present an analysis that identifies sensitivity to the drug crizotinib for cells harboring EML4-ALK or NPM1-ALK gene fusions, as well as significantly down-regulated cell-matrix pathways associated with crizotinib sensitivity.
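As a loose illustration of the mixture idea (not the hierarchical Bayesian model of the paper), a two-component Gaussian mixture fitted by EM can assign each cell line a posterior probability of belonging to the sensitive component; the sensitivity scores below are simulated:

```python
import numpy as np

def two_component_em(x, n_iter=100):
    """Fit a two-component Gaussian mixture by EM and return posterior
    probabilities of membership in the higher-mean ("sensitive") component.
    A drastically simplified, non-hierarchical sketch."""
    x = np.asarray(x, float)
    mu = np.percentile(x, [10, 90]).astype(float)   # crude initialisation
    sd = np.array([x.std(), x.std()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: component responsibilities for each observation
        dens = np.stack([pi[k] * np.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2) / sd[k]
                         for k in range(2)])
        r = dens / dens.sum(axis=0)
        # M-step: update weights, means, and standard deviations
        nk = r.sum(axis=1)
        pi = nk / nk.sum()
        mu = (r * x).sum(axis=1) / nk
        sd = np.sqrt((r * (x - mu[:, None]) ** 2).sum(axis=1) / nk) + 1e-6
    hi = np.argmax(mu)   # component with the larger mean
    return r[hi]

rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.2, 0.05, 80),   # resistant bulk
                         rng.normal(0.7, 0.05, 20)])  # sensitive tail
post = two_component_em(scores)
# responsibilities should be near 0 for the bulk, near 1 for the tail
print(post[:80].mean(), post[80:].mean())
```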


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Crizotinib/therapeutic use , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Lung Neoplasms/genetics , Pharmacogenetics , Models, Statistical , Receptor Protein-Tyrosine Kinases/genetics , Receptor Protein-Tyrosine Kinases/therapeutic use
4.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34586372

ABSTRACT

MOTIVATION: m6A methylation is a highly prevalent post-transcriptional modification in eukaryotes. MeRIP-seq or m6A-seq, which comprises immunoprecipitation of methylated fragments, is the most common method for measuring methylation signals. Existing computational tools for analyzing MeRIP-seq data sets and identifying differentially methylated genes/regions are suboptimal: they ignore either the sparsity or the dependence structure of the methylation signals within a gene/region. Modeling the methylation signals using univariate distributions can also lead to high type I error rates and low sensitivity. In this paper, we propose using mean vector testing (MVT) procedures for testing differential methylation of RNA at the gene level. MVTs use a distribution-free test statistic with proven ability to control type I error even for extremely small sample sizes. We performed a comprehensive simulation study comparing the MVTs to existing MeRIP-seq data analysis tools. Comparative analysis of existing MeRIP-seq data sets is presented to illustrate the advantage of using MVTs. RESULTS: Mean vector testing procedures are observed to control the type I error rate and achieve high power for detecting differential RNA methylation using m6A-seq data. Results from two data sets indicate that the genes identified as having different m6A methylation patterns have high functional relevance to the study conditions. AVAILABILITY: The DIMER software package for differential RNA methylation analysis is freely available at https://github.com/ouyang-lab/DIMER. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.
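A minimal distribution-free analogue of such a mean vector test, using a permutation-calibrated statistic on simulated per-bin signals (not the MVT statistics evaluated in the paper):

```python
import numpy as np

def mean_vector_permutation_test(a, b, n_perm=2000, seed=0):
    """Distribution-free test that two groups of multivariate observations
    (e.g., per-bin methylation signals along one gene) share a mean vector.
    Statistic: Euclidean norm of the difference in mean vectors,
    calibrated by permuting group labels."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([a, b])
    n_a = len(a)
    obs = np.linalg.norm(a.mean(0) - b.mean(0))
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        pa, pb = pooled[idx[:n_a]], pooled[idx[n_a:]]
        if np.linalg.norm(pa.mean(0) - pb.mean(0)) >= obs:
            count += 1
    return (count + 1) / (n_perm + 1)   # permutation p-value

rng = np.random.default_rng(1)
ctrl = rng.normal(0.0, 1.0, size=(6, 20))    # 6 replicates, 20 bins
treat = rng.normal(0.8, 1.0, size=(6, 20))   # shifted mean vector
p = mean_vector_permutation_test(ctrl, treat)
print(p)   # a small p-value is expected here
```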


Subject(s)
RNA , Computer Simulation , Immunoprecipitation , Methylation , RNA/chemistry , RNA/genetics , Sequence Analysis, RNA/methods
5.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35037015

ABSTRACT

Direct coupling analysis (DCA) has been widely used to infer evolutionarily coupled residue pairs from the multiple sequence alignment (MSA) of homologous sequences. However, effectively selecting residue pairs with significant evolutionary couplings according to the result of DCA is a non-trivial task. In this study, we developed a general statistical framework for significant evolutionary coupling detection, referred to as irreproducible discovery rate (IDR)-DCA, which is based on reproducibility analysis of the coupling scores obtained from DCA on manually created MSA replicates. IDR-DCA was applied to select residue pairs for contact prediction for monomeric proteins, protein-protein interactions and monomeric RNAs, in which three different versions of DCA were applied. We demonstrated that with the application of IDR-DCA, the residue pairs selected using a universal threshold always yielded stable performance for contact prediction. Compared with carefully tuned coupling score cutoffs, IDR-DCA always showed better performance. The robustness of IDR-DCA was also supported through the MSA downsampling analysis. We further demonstrated the effectiveness of applying constraints obtained from residue pairs selected by IDR-DCA to assist RNA secondary structure prediction.
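A crude stand-in for the reproducibility-filtering idea (the actual IDR-DCA method models an irreproducible discovery rate across replicates, not a simple rank intersection) is to keep only pairs that score highly in every replicate; the coupling scores below are simulated:

```python
import numpy as np

def reproducible_pairs(score_replicates, top_k=50):
    """Keep candidate residue pairs whose coupling scores rank in the
    top_k of every MSA replicate. A crude sketch of reproducibility-based
    selection, not the IDR model itself."""
    keep = None
    for scores in score_replicates:
        top = set(np.argsort(scores)[::-1][:top_k])
        keep = top if keep is None else keep & top
    return sorted(keep)

# toy data: 500 candidate pairs, 20 truly coupled (indices 0-19),
# scored in three noisy "replicates"
rng = np.random.default_rng(0)
signal = np.zeros(500)
signal[:20] = 3.0
reps = [signal + rng.normal(0, 1, 500) for _ in range(3)]
sel = reproducible_pairs(reps, top_k=50)
print(sel)   # mostly indices below 20 should survive the intersection
```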


Subject(s)
Algorithms , Proteins , Protein Structure, Secondary , Proteins/chemistry , RNA , Reproducibility of Results , Sequence Alignment
6.
Mass Spectrom Rev ; 2023 May 04.
Article in English | MEDLINE | ID: mdl-37143314

ABSTRACT

With urinary proteomics profiling (UPP) as exemplary omics technology, this review describes a workflow for the analysis of omics data in large study populations. The proposed workflow includes: (i) planning omics studies and sample size considerations; (ii) preparing the data for analysis; (iii) preprocessing the UPP data; (iv) the basic statistical steps required for data curation; (v) the selection of covariables; (vi) relating continuously distributed or categorical outcomes to a series of single markers (e.g., sequenced urinary peptide fragments identifying the parental proteins); (vii) showing the added diagnostic or prognostic value of the UPP markers over and beyond classical risk factors; and (viii) pathway analysis to identify targets for personalized intervention in disease prevention or treatment. Additionally, two short sections respectively address multiomics studies and machine learning. In conclusion, the analysis of adverse health outcomes in relation to omics biomarkers rests on the same statistical principles as any other data collected in large population or patient cohorts. The large number of biomarkers, which have to be considered simultaneously, requires planning ahead for how the study database will be structured and curated, how it will be imported into statistical software packages, and how analysis results will be triaged for clinical relevance and presented.

7.
Neuroepidemiology ; 58(5): 369-382, 2024.
Article in English | MEDLINE | ID: mdl-38560977

ABSTRACT

INTRODUCTION: Hippocampal atrophy is an established biomarker for conversion from the normal ageing process to cognitive impairment and dementia. This study used a novel hypothesis-free machine-learning approach to uncover potential risk factors of lower hippocampal volume using information from the world's largest brain imaging study. METHODS: A combination of machine learning and conventional statistical methods was used to identify predictors of low hippocampal volume. We ran gradient-boosted decision tree modelling including 2,891 input features measured before magnetic resonance imaging assessments (median 9.2 years, range 4.2-13.8 years) using data from 42,152 dementia-free UK Biobank participants. Logistic regression analyses were run on 87 factors identified as important for prediction based on Shapley values. A false discovery rate-adjusted p value <0.05 was used to declare statistical significance. RESULTS: Older age, male sex, greater height, and whole-body fat-free mass were the main predictors of low hippocampal volume, with the model also identifying associations with lung function and lifestyle factors including smoking, physical activity, and coffee intake (corrected p < 0.05 for all). Red blood cell count and several red blood cell indices such as haemoglobin concentration, mean corpuscular haemoglobin, mean corpuscular volume, mean reticulocyte volume, mean sphered cell volume, and red blood cell distribution width were among many biomarkers associated with low hippocampal volume. CONCLUSION: Lifestyles, physical measures, and biomarkers may affect hippocampal volume, with many of the characteristics potentially reflecting oxygen supply to the brain. Further studies are required to establish causality and clinical relevance of these findings.
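The screen-then-confirm pipeline described here can be sketched with scikit-learn on simulated data; the importance measure below is the tree ensemble's built-in impurity-based importance rather than the Shapley values used in the study:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 10))           # 10 candidate predictors (simulated)
# binary outcome driven by features 0 and 1 only
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# stage 1: hypothesis-free screen with a gradient-boosted tree model
gbt = GradientBoostingClassifier(random_state=0).fit(X, y)
ranking = np.argsort(gbt.feature_importances_)[::-1]
print(ranking[:3])   # the informative features should rank near the top

# stage 2 (as in the study) would re-test the top-ranked features
# in conventional logistic regression with multiplicity adjustment.
```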


Subject(s)
Biological Specimen Banks , Hippocampus , Machine Learning , Magnetic Resonance Imaging , Humans , Hippocampus/diagnostic imaging , Hippocampus/pathology , Male , Female , Aged , Middle Aged , United Kingdom , Atrophy/pathology , Risk Factors , UK Biobank
8.
J Anim Ecol ; 93(3): 267-280, 2024 03.
Article in English | MEDLINE | ID: mdl-38167802

ABSTRACT

Individual body size distributions (ISD) within communities are remarkably consistent across habitats and spatiotemporal scales and can be represented by size spectra, which are described by a power law. The focus of size spectra analysis is to estimate the exponent (λ) of the power law. A common application of size spectra studies is to detect anthropogenic pressures. Many methods have been proposed for estimating λ, most of which involve binning the data, counting the abundance within bins, and then fitting an ordinary least squares regression in log-log space. However, recent work has shown that binning procedures return biased estimates of λ compared to procedures that directly estimate λ using maximum likelihood estimation (MLE). While it is clear that MLE produces less biased estimates of site-specific λ's, it is less clear how this bias affects the ability to test for changes in λ across space and time, a common question in the ecological literature. Here, we used simulation to compare the ability of two normalised binning methods (equal logarithmic and log2 bins) and MLE to (1) recapture known values of λ, and (2) recapture parameters in a linear regression measuring the change in λ across a hypothetical environmental gradient. We also compared the methods using two previously published body size datasets across a natural temperature gradient and an anthropogenic pollution gradient. Maximum likelihood methods always performed better than common binning methods, which demonstrated consistent bias depending on the simulated values of λ. This bias carried over to the regressions, which were more accurate when λ was estimated using MLE compared to the binning procedures. Additionally, the variance in estimates using MLE methods is markedly reduced when compared to binning methods. The error induced by binning methods can be of similar magnitude to the variation previously published in experimental and observational studies, bringing into question the effect sizes of previously published results. However, while the methods produced different regression slope estimates, they were in qualitative agreement on the sign of those slopes (i.e. all negative or all positive). Our results provide further support for the direct estimation of λ and its relative variation across environmental gradients using MLE over the more common methods of binning.
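The direct MLE the authors advocate has a closed form for a continuous power law; a sketch with simulated body sizes (sign conventions for λ vary across the literature):

```python
import numpy as np

def powerlaw_mle(x, xmin):
    """Maximum likelihood estimate of the size-spectrum exponent lambda
    for a continuous power law f(x) ~ x**lambda, x >= xmin (lambda < -1).
    Equivalent to the Clauset-style estimator alpha_hat = 1 + n / sum(ln(x/xmin))
    with lambda = -alpha."""
    x = np.asarray(x, float)
    return -1.0 - len(x) / np.log(x / xmin).sum()

# simulate body sizes from a known power law via inverse-transform sampling:
# F(x) = 1 - (x/xmin)**(lambda+1), so x = xmin * (1-u)**(1/(lambda+1))
rng = np.random.default_rng(0)
lam_true, xmin = -2.0, 1.0
u = rng.random(50_000)
x = xmin * (1.0 - u) ** (1.0 / (lam_true + 1.0))
print(powerlaw_mle(x, xmin))   # should land very close to -2.0
```

No binning is involved, which is precisely why this estimator avoids the bin-induced bias the paper documents.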


Subject(s)
Ecosystem , Animals , Computer Simulation , Likelihood Functions
9.
Psychophysiology ; 61(4): e14471, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37937737

ABSTRACT

Cannabis use disorder (CUD) is increasing in the United States, yet specific neural mechanisms of CUD are not well understood. Disordered substance use is characterized by heightened drug cue incentive salience, which can be measured using the late positive potential (LPP), an event-related potential (ERP) evoked by motivationally significant stimuli. The drug cue LPP is typically quantified by averaging the slow wave's scalp-recorded amplitude across its entire time course, which may obscure distinct underlying factors with differential predictive validity; however, no study to date has examined this possibility. In a sample of 105 cannabis users, temporo-spatial Principal Component Analysis was used to decompose cannabis cue modulation of the LPP into its underlying factors. Acute stress was also induced to allow for identification of specific cannabis LPP factors sensitive to stress. Factor associations with CUD severity were also explored. Eight factors showed significantly increased amplitudes to cannabis images relative to neutral images. These factors spanned early (~372 ms), middle (~824 ms), and late (>1000 ms) windows across frontal, central, and parietal-occipital sites. Individual differences in CUD phenotype were primarily associated with frontal, middle/late-latency factor amplitudes. Acute stress effects were limited to one early central and one late frontal factor. Taken together, results suggest that the cannabis LPP can be decomposed into distinct temporo-spatial factors with differential responsivity to acute stress and CUD phenotype variability. Future individual difference studies examining drug cue modulation of the LPP should consider (1) frontal-central poolings in addition to conventional central-parietal sites, and (2) later LPP time windows.


Subject(s)
Cannabis , Cues , Humans , Electroencephalography/methods , Principal Component Analysis , Evoked Potentials/physiology
10.
Psychophysiology ; 61(4): e14475, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37947235

ABSTRACT

Machine learning techniques have proven to be a useful tool in cognitive neuroscience. However, their implementation in scalp-recorded electroencephalography (EEG) is relatively limited. To address this, we present three analyses using data from a previous study that examined event-related potential (ERP) responses to a wide range of naturally-produced speech sounds. First, we explore which features of the EEG signal best maximize machine learning accuracy for a voicing distinction, using a support vector machine (SVM). We manipulate three dimensions of the EEG signal as input to the SVM: number of trials averaged, number of time points averaged, and polynomial fit. We discuss the trade-offs in using different feature sets and offer some recommendations for researchers using machine learning. Next, we use SVMs to classify specific pairs of phonemes, finding that we can detect differences in the EEG signal that are not otherwise detectable using conventional ERP analyses. Finally, we characterize the timecourse of phonetic feature decoding across three phonological dimensions (voicing, manner of articulation, and place of articulation), and find that voicing and manner are decodable from neural activity, whereas place of articulation is not. This set of analyses addresses both practical considerations in the application of machine learning to EEG, particularly for speech studies, and also sheds light on current issues regarding the nature of perceptual representations of speech.
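The trial-averaging trade-off examined in the first analysis can be illustrated with simulated ERP-like data, using a nearest-class-mean rule as a simple stand-in for the SVM:

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_train, noise_sd = 50, 200, 3.0
t = np.linspace(0, 1, n_time)
# two hypothetical phoneme classes: same waveform, small phase shift
templates = {0: np.sin(2 * np.pi * 3 * t), 1: np.sin(2 * np.pi * 3 * t + 0.6)}

def make_trials(label, n):
    """Simulate n noisy single-trial EEG epochs for one class."""
    return templates[label] + rng.normal(0, noise_sd, size=(n, n_time))

def nearest_mean_accuracy(n_avg):
    """Classify averages of n_avg trials against class means estimated
    from training data (a crude proxy for the SVM feature-averaging step)."""
    train = {k: make_trials(k, n_train).mean(axis=0) for k in (0, 1)}
    correct = total = 0
    for k in (0, 1):
        for _ in range(100):
            obs = make_trials(k, n_avg).mean(axis=0)   # average n_avg trials
            pred = min((np.linalg.norm(obs - m), lab)
                       for lab, m in train.items())[1]
            correct += (pred == k)
            total += 1
    return correct / total

acc1, acc16 = nearest_mean_accuracy(1), nearest_mean_accuracy(16)
print(acc1, acc16)   # averaging trials should raise accuracy markedly
```

This mirrors the paper's practical point: averaging more trials boosts signal-to-noise and classification accuracy, at the cost of fewer independent observations.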


Subject(s)
Phonetics , Speech Perception , Humans , Speech Perception/physiology , Speech/physiology , Evoked Potentials , Electroencephalography/methods
11.
Psychophysiology ; 61(7): e14562, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38459627

ABSTRACT

Recent evidence indicates that event-related potentials (ERPs) as measured on the electroencephalogram (EEG) are more closely related to transdiagnostic, dimensional measures of psychopathology (TDP) than to diagnostic categories. A comprehensive examination of correlations between well-studied ERPs and measures of TDP is called for. In this study, we recruited 50 patients with emotional disorders undergoing 14 weeks of transdiagnostic group psychotherapy as well as 37 healthy comparison subjects (HC) matched in age and sex. HCs were assessed once and patients three times throughout treatment (N = 172 data sets) with a battery of well-studied ERPs and psychopathology measures consistent with the TDP framework of the Hierarchical Taxonomy of Psychopathology (HiTOP). ERPs were quantified using robust single-trial analysis (RSTA) methods, and TDP correlations were assessed with linear regression models as implemented in the EEGLAB toolbox LIMO EEG. We found correlations at several levels of the HiTOP hierarchy. Among these, a reduced P3b was associated with the general p-factor. A reduced error-related negativity correlated strongly with worse symptomatology across the Internalizing spectrum. Increases in the correct-related negativity correlated with symptoms loading onto the Distress subfactor in the HiTOP. The Flanker N2 was related to specific symptoms of Intrusive Cognitions and Traumatic Re-experiencing, and the mismatch negativity to maladaptive personality traits at the lowest levels of the HiTOP hierarchy. Our study highlights the advantages of RSTA methods and of using validated TDP constructs within a consistent framework. Future studies could utilize machine learning methods to predict TDP from a set of ERP features at the subject level.


Subject(s)
Electroencephalography , Evoked Potentials , Humans , Female , Male , Adult , Evoked Potentials/physiology , Young Adult , Middle Aged
12.
Psychophysiology ; 61(5): e14500, 2024 May.
Article in English | MEDLINE | ID: mdl-38073133

ABSTRACT

Recent evidence indicates that measures of brain functioning as indexed by event-related potentials (ERPs) on the electroencephalogram align more closely to transdiagnostic measures of psychopathology than to categorical taxonomies. The Hierarchical Taxonomy of Psychopathology (HiTOP) is a transdiagnostic, dimensional framework aiming to solve issues of comorbidity, symptom heterogeneity, and arbitrary diagnostic boundaries. Based on shared features, the emotional disorders are allocated into subfactors Distress and Fear. Evidence indicates that disorders that are close in the HiTOP hierarchy share etiology, symptom profiles, and treatment outcomes. However, further studies testing the biological underpinnings of the HiTOP are called for. In this study, we assessed differences between Distress and Fear in a range of well-studied ERP components. In total, 50 patients with emotional disorders were divided into two groups (Distress, N = 25; Fear, N = 25) according to HiTOP criteria and compared against 37 healthy comparison (HC) subjects. Addressing issues in traditional ERP preprocessing and analysis methods, we applied robust single-trial analysis as implemented in the EEGLAB toolbox LIMO EEG. Several ERP components were found to differ between the groups. Surprisingly, we found no difference between Fear and HC for any of the ERPs. This suggests that some well-established results from the literature, e.g., increased error-related negativity in OCD, are not a shared neurobiological correlate of the Fear subfactor. Conversely, for Distress, we found reductions compared to Fear and HC in several ERP components across paradigms. Future studies could utilize HiTOP-validated psychopathology measures to more precisely define subfactor groups.


Subject(s)
Mental Disorders , Psychopathology , Humans , Fear , Mood Disorders , Evoked Potentials , Comorbidity , Mental Disorders/psychology
13.
BMC Med Res Methodol ; 24(1): 31, 2024 Feb 10.
Article in English | MEDLINE | ID: mdl-38341540

ABSTRACT

BACKGROUND: The Interrupted Time Series (ITS) is a robust design for evaluating public health and policy interventions or exposures when randomisation may be infeasible. Several statistical methods are available for the analysis and meta-analysis of ITS studies. We sought to empirically compare available methods when applied to real-world ITS data. METHODS: We sourced ITS data from published meta-analyses to create an online data repository. Each dataset was re-analysed using two ITS estimation methods. The level- and slope-change effect estimates (and standard errors) were calculated and combined using fixed-effect and four random-effects meta-analysis methods. We examined differences in meta-analytic level- and slope-change estimates, their 95% confidence intervals, p-values, and estimates of heterogeneity across the statistical methods. RESULTS: Of 40 eligible meta-analyses, data from 17 meta-analyses including 282 ITS studies were obtained (predominantly investigating the effects of public health interventions (88%)) and analysed. We found that on average, the meta-analytic effect estimates, their standard errors and between-study variances were not sensitive to meta-analysis method choice, irrespective of the ITS analysis method. However, across ITS analysis methods, for any given meta-analysis, there could be small to moderate differences in meta-analytic effect estimates, and important differences in the meta-analytic standard errors. Furthermore, the confidence interval widths and p-values for the meta-analytic effect estimates varied depending on the choice of confidence interval method and ITS analysis method. CONCLUSIONS: Our empirical study showed that meta-analysis effect estimates, their standard errors, confidence interval widths and p-values can be affected by statistical method choice.
These differences may importantly impact interpretations and conclusions of a meta-analysis and suggest that the statistical methods are not interchangeable in practice.
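One common ITS estimation method is segmented OLS regression with level- and slope-change terms; a sketch on simulated data (the paper compares several such estimators, and real analyses must also handle autocorrelation):

```python
import numpy as np

def its_level_slope_change(y, interruption):
    """Segmented OLS for a single interrupted time series:
    y_t = b0 + b1*t + b2*post_t + b3*(t - T0)*post_t + e_t,
    returning the estimated level change (b2) and slope change (b3)."""
    t = np.arange(len(y), dtype=float)
    post = (t >= interruption).astype(float)
    X = np.column_stack([np.ones_like(t), t, post, (t - interruption) * post])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[2], beta[3]

# simulated series: baseline slope 0.5, then a +5 level jump
# and +1.0 extra slope after the interruption at t = 20
rng = np.random.default_rng(0)
t = np.arange(40)
y = (2 + 0.5 * t + 5.0 * (t >= 20) + 1.0 * (t - 20) * (t >= 20)
     + rng.normal(0, 0.5, 40))
level, slope = its_level_slope_change(y, 20)
print(level, slope)   # should recover roughly +5 and +1.0
```

The level- and slope-change estimates (with standard errors) from each study are what the meta-analysis methods in the paper then combine.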


Subject(s)
Public Health , Humans , Interrupted Time Series Analysis
14.
Epidemiol Infect ; 152: e57, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38506229

ABSTRACT

Current World Health Organization (WHO) reports claim a decline in COVID-19 testing and reporting of new infections. To discuss the consequences of ignoring severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infection, we compare the endemic characteristics of the disease in 2023 with those estimated earlier using 2022 data sets. The accumulated numbers of cases and deaths reported to the WHO by the 10 most infected countries, along with global figures, were used to calculate the average daily numbers of cases (DCC) and deaths (DDC) per capita and case fatality rates (CFR = DDC/DCC) for two periods in 2023. In some countries, the DDC values can be higher than the upper 2022 limit and exceed seasonal influenza mortality. The increase in CFR in 2023 shows that SARS-CoV-2 infection is still dangerous. The numbers of COVID-19 cases and deaths per capita in 2022 and 2023 do not demonstrate downward trends with the increase in the percentages of fully vaccinated people and boosters. The reasons may be both rapid mutations of the coronavirus, which reduced the effectiveness of vaccines and led to a large number of re-infections, and inappropriate management.
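The quantities defined in the abstract reduce to simple arithmetic; the figures below are illustrative, not WHO data:

```python
def endemic_rates(cases, deaths, population, days):
    """Average daily cases (DCC) and deaths (DDC) per capita over a period,
    and the case fatality rate CFR = DDC / DCC, as defined in the abstract."""
    dcc = cases / (population * days)
    ddc = deaths / (population * days)
    return dcc, ddc, ddc / dcc

# hypothetical country: 1.2M cases and 6,000 deaths over 180 days, pop. 50M
dcc, ddc, cfr = endemic_rates(1_200_000, 6_000, 50_000_000, 180)
print(dcc, ddc, cfr)
```

Note that CFR simplifies to deaths/cases over the same period; the per-capita daily rates matter when comparing DDC against benchmarks such as seasonal influenza mortality.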


Subject(s)
COVID-19 , Influenza Vaccines , Humans , SARS-CoV-2 , COVID-19 Testing , World Health Organization
15.
Transpl Infect Dis ; 26(2): e14231, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38375954

ABSTRACT

Observational studies of coronavirus disease 2019 (COVID-19) among transplant candidates and recipients remain important as immunocompromised patients formed a very small proportion of patients included in COVID-19 trials and large database analyses. We discuss methods that have been used in such analyses to evaluate the impact of vaccination on the risk of symptomatic COVID-19 in such patients and on the probability of developing post-acute sequelae of severe acute respiratory syndrome coronavirus 2 after the onset of infection. We also propose future directions for research and discuss the methods that will be useful to conduct such investigations. The study design and analytical issues that we consider have the potential to be helpful not only for COVID-19 research but also for other infections as well.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , COVID-19/complications , Databases, Factual , Disease Progression , Immunocompromised Host , Observational Studies as Topic
16.
J Biomed Inform ; 152: 104629, 2024 04.
Article in English | MEDLINE | ID: mdl-38552994

ABSTRACT

BACKGROUND: In health research, multimodal omics data analysis is widely used to address important clinical and biological questions. Traditional statistical methods rely on strong distributional assumptions. Statistical approaches such as hypothesis testing and differential expression analysis are commonly used in omics analysis. Deep learning, on the other hand, is an advanced computer science technique that is powerful for mining high-dimensional omics data in prediction tasks. Recently, integrative frameworks or methods that combine statistical models and deep learning algorithms have been developed for omics studies. METHODS AND RESULTS: The aim of these integrative frameworks is to combine the strengths of both statistical methods and deep learning algorithms to improve prediction accuracy while also providing interpretability and explainability. This review discusses the current state-of-the-art integrative frameworks, their limitations, and potential future directions in survival and time-to-event longitudinal analysis, dimension reduction and clustering, regression and classification, feature selection, and causal and transfer learning.


Subject(s)
Deep Learning , Genomics , Genomics/methods , Computational Biology/methods , Algorithms , Models, Statistical
17.
Pediatr Nephrol ; 39(7): 2139-2145, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38396091

ABSTRACT

BACKGROUND: Assessing bias (estimated - measured) is key to evaluating glomerular filtration rate (GFR). Stratification by subgroups can indicate where equations perform differently. However, there is a fallacy in the assessment of two instruments (e.g., eGFR and mGFR) when stratifying on the level of only one of those instruments. Here, we present statistical aspects of the problem and a solution for GFR stratification along with an empirical investigation using data from the CKiD study. METHODS: We compared and contrasted biases (eGFR relative to mGFR) with 95% confidence intervals within strata defined by mGFR only, eGFR only, and the average of mGFR and eGFR, using data from the Chronic Kidney Disease in Children (CKiD) study. RESULTS: A total of 304 participants contributed 843 GFR studies with a mean mGFR of 48.46 (SD = 22.72), mean eGFR of 48.67 (SD = 22.32), and correlation of 0.904. Despite strong agreement, eGFR significantly overestimated mGFR when mGFR < 30 (+ 6.2%; 95%CI + 2.9%, + 9.7%) and significantly underestimated when mGFR > 90 (-12.2%; 95%CI - 17.3%, - 7.0%). Significant biases in the opposite direction were present when stratifying by eGFR only. In contrast, when stratifying by the average of eGFR and mGFR, biases were not significant (+ 1.3% and - 1.0%, respectively), congruent with strong agreement. CONCLUSIONS: Stratifying by either mGFR or eGFR only to assess eGFR biases is ubiquitous but can lead to inappropriate inference due to intrinsic statistical issues that we characterize and empirically illustrate using data from the CKiD study. Using the average of eGFR and mGFR is recommended for valid inferences in evaluations of eGFR biases.
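The statistical fallacy can be reproduced in a short simulation: when both instruments measure a latent true GFR with error, stratifying on one instrument alone manufactures bias at the extremes (regression to the mean), even though the estimate is unbiased overall. All values below are simulated, not CKiD data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
true = rng.normal(60, 20, n)          # latent true GFR
mGFR = true + rng.normal(0, 8, n)     # measured GFR, with measurement error
eGFR = true + rng.normal(0, 8, n)     # estimated GFR, unbiased, independent error

def mean_bias(strata_var, lo, hi):
    """Mean bias (eGFR - mGFR) within one stratum of strata_var."""
    sel = (strata_var >= lo) & (strata_var < hi)
    return (eGFR[sel] - mGFR[sel]).mean()

# stratifying on mGFR alone: spurious overestimation at low mGFR,
# spurious underestimation at high mGFR
b_lo, b_hi = mean_bias(mGFR, 0, 30), mean_bias(mGFR, 90, 200)
# stratifying on the average of the two methods: bias near zero
avg = (eGFR + mGFR) / 2
a_lo, a_hi = mean_bias(avg, 0, 30), mean_bias(avg, 90, 200)
print(b_lo, b_hi)
print(a_lo, a_hi)
```

The direction of the spurious biases matches the paper's empirical finding, and averaging the two instruments removes them, which is the Bland-Altman-style remedy the authors recommend.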


Subject(s)
Bias , Glomerular Filtration Rate , Renal Insufficiency, Chronic , Humans , Female , Child , Male , Renal Insufficiency, Chronic/physiopathology , Renal Insufficiency, Chronic/diagnosis , Adolescent , Creatinine/blood , Kidney/physiopathology , Reproducibility of Results
18.
Acta Obstet Gynecol Scand ; 103(3): 611-620, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38140844

ABSTRACT

INTRODUCTION: Obstetric care is a highly active area in the development and application of prognostic prediction models. Developing and validating these models often requires advanced statistical techniques, and failure to adhere to rigorous methodological standards can greatly undermine the reliability and trustworthiness of the resulting models. The aim of our study was therefore to examine the statistical practices currently employed in obstetric care and to offer recommendations for improving the use of statistical methods in the development of prognostic prediction models. MATERIAL AND METHODS: We conducted a cross-sectional survey of studies developing or validating prognostic prediction models for obstetric care published over a 10-year span (2011-2020). A structured questionnaire was developed to investigate statistical issues in five domains: model derivation (predictor selection and algorithm development), model validation (internal and external), model performance, model presentation, and risk threshold setting. Based on the survey results and existing guidelines, a list of recommendations for statistical methods in prognostic models was developed. RESULTS: A total of 112 eligible studies were included: 107 reported model development and five exclusively reported external validation. Among the model-development studies, 58.9% did not include any form of validation, 46.4% used stepwise regression in a crude manner for predictor selection, and two-thirds decided whether to retain or drop candidate predictors solely on the basis of p-values. Additionally, 26.2% transformed continuous predictors into categorical variables, 80.4% did not consider nonlinear relationships between predictors and outcomes, and 94.4% did not examine correlations between predictors. Moreover, 47.1% of the studies did not compare population characteristics between the development and external-validation datasets, and only one-fifth evaluated both discrimination and calibration. Furthermore, 53.6% did not clearly present the model, and fewer than half established a risk threshold to define risk categories. In light of these findings, 10 recommendations were formulated to promote the appropriate use of statistical methods. CONCLUSIONS: The use of statistical methods in this field is not yet optimal. We offer 10 recommendations to improve the statistical methodology of prognostic prediction models in obstetric care.
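The survey's finding that only one-fifth of studies evaluated both discrimination and calibration can be illustrated with a minimal sketch on simulated data (not drawn from any of the surveyed studies; all names and numbers below are illustrative assumptions): the c-statistic for a binary outcome is the ROC AUC of the model's linear predictor, and a calibration slope can be estimated by logistic recalibration of that predictor on validation data.

```python
# Minimal sketch of evaluating BOTH discrimination (c-statistic / ROC AUC)
# and calibration (slope via logistic recalibration), the two performance
# domains the survey found were rarely assessed together.
# Simulated data only; not from any cited study.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Simulated validation data: linear predictor (log-odds) and binary outcome
# generated from the corresponding true risk.
n = 2000
lp = rng.normal(0.0, 1.5, n)
y = rng.binomial(1, 1 / (1 + np.exp(-lp)))

# Discrimination: the c-statistic equals the ROC AUC for binary outcomes.
auc = roc_auc_score(y, lp)

# Calibration slope: regress the outcome on the linear predictor with
# effectively no regularization (large C). A slope near 1 suggests the
# predictions are neither over- nor under-fitted on this data.
slope = LogisticRegression(C=1e6).fit(lp.reshape(-1, 1), y).coef_[0, 0]

print(f"c-statistic = {auc:.3f}, calibration slope = {slope:.3f}")
```

A calibration slope well below 1 on external data would suggest the original model was overfitted, which neither the AUC alone nor a p-value-driven predictor selection would reveal.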


Subject(s)
Algorithms , Models, Statistical , Pregnancy , Female , Humans , Prognosis , Cross-Sectional Studies , Reproducibility of Results , Surveys and Questionnaires
19.
Am J Respir Crit Care Med ; 208(9): 983-989, 2023 11 01.
Article in English | MEDLINE | ID: mdl-37771035

ABSTRACT

Rationale: U.S. lung transplant mortality risk models do not account for patients' disease progression as time accrues between mandated clinical parameter updates. Objectives: To investigate the effects of accrued waitlist (WL) time on mortality in lung transplant candidates and recipients beyond those expressed by worsening clinical status, and to present a new framework for conceptualizing mortality risk in end-stage lung disease. Methods: Using Scientific Registry of Transplant Recipients data (2015-2020, N = 12,616), we modeled transitions among multiple clinical states over time: WL, posttransplant, and death. Using cause-specific and ordinary Cox regression to estimate trajectories of composite 1-year mortality risk as a function of time from waitlisting to transplantation, we quantified the predictive accuracy of these estimates. We compared multistate model-derived candidate rankings against composite allocation score (CAS) rankings. Measurements and Main Results: For 11.5% of candidates, predicted 1-year mortality risk increased by >10% by day 30 on the WL. The multistate model ascribed lower numerical rankings (i.e., higher priority) than CAS for those who died while on the WL (multistate: mean ranking at death 227, median [interquartile range] 154 [57-334]; CAS: mean 329, median [interquartile range] 162 [11-668]). Patients with interstitial lung disease were more likely than those with other lung diagnoses to have increasing risk trajectories as a function of time accrued on the WL. Conclusions: Incorporating the effects of time accrued on the WL for lung transplant candidates and recipients into donor lung allocation systems may improve the survival of patients with end-stage lung disease at both the individual and population levels.
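The multistate idea behind this study, that a waitlisted candidate can transition either to transplant or to death, is the classic competing-risks setting. The following numpy-only toy (simulated times and event labels, not the registry analysis) sketches the Aalen-Johansen form of the cause-specific cumulative incidence, CIF_k(t) = Σ over event times t_i ≤ t of S(t_i−) · d_k(t_i) / n(t_i), where S is overall survival across all causes.

```python
# Toy competing-risks cumulative incidence for waitlist candidates:
# event 0 = censored, event 1 = death on waitlist, event 2 = transplant.
# Simulated data only; illustrative of the multistate framework, not the
# SRTR analysis itself.
import numpy as np

def cumulative_incidence(time, event, cause):
    """Aalen-Johansen cumulative incidence of `cause` (event == 0: censored)."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    at_risk = len(time)
    surv = 1.0        # overall survival S(t-) across all causes
    cif = 0.0
    times, cifs = [], []
    for t in np.unique(time):
        mask = time == t
        d_all = np.count_nonzero(event[mask] > 0)   # events of any cause at t
        d_k = np.count_nonzero(event[mask] == cause)
        cif += surv * d_k / at_risk
        surv *= 1 - d_all / at_risk
        at_risk -= np.count_nonzero(mask)           # remove events and censorings
        times.append(t)
        cifs.append(cif)
    return np.array(times), np.array(cifs)

rng = np.random.default_rng(1)
t = rng.exponential(1.0, 500)                       # hypothetical waitlist times
e = rng.choice([0, 1, 2], 500, p=[0.2, 0.3, 0.5])   # censor / death / transplant
_, cif_death = cumulative_incidence(t, e, cause=1)
_, cif_tx = cumulative_incidence(t, e, cause=2)
print(f"CIF(death) at last observed time: {cif_death[-1]:.3f}")
```

By construction the two cause-specific incidences never sum above one, which is what distinguishes this estimator from naively applying one-minus-Kaplan-Meier to each cause separately.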


Subject(s)
Lung Transplantation , Tissue and Organ Procurement , Humans , Waiting Lists , Tissue Donors
20.
Int J Qual Health Care ; 36(3)2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39120969

ABSTRACT

Urban-rural disparities in medical care, including home healthcare, persist globally. With aging populations and medical advancements, demand for home health services is rising, warranting investigation into home healthcare disparities. Our study aimed to (i) investigate the impact of rurality on home healthcare quality and (ii) assess temporal disparities, and changes in those disparities, in home healthcare quality between urban and rural home health agencies (HHAs), incorporating an analysis of geospatial distribution to visualize the underlying patterns. We analyzed data from HHAs listed on the Centers for Medicare and Medicaid Services website, covering the period from 2010 to 2022, classifying each HHA as urban or rural. We employed panel data analysis to examine the impact of rurality on home healthcare quality, focusing on hospital admission and emergency room (ER) visit rates. Disparities between urban and rural HHAs were assessed using the Wilcoxon test, with results visualized through line and dot plots and heat maps to illustrate trends and differences comprehensively. In the panel data analysis, rurality was the most significant predictor of hospital admission and ER visit rates. Urban HHAs exhibited consistently and significantly lower hospital admission and ER visit rates than rural HHAs from 2010 to 2022. Longitudinally, the gap in hospital admission rates between urban and rural HHAs is shrinking, while the gap in ER visit rates is widening. In 2022, HHAs in Mountain areas, which have a higher proportion of rural regions, exhibited higher hospital admission and ER visit rates than those in other areas. This study underscores the persistent urban-rural disparities in home healthcare quality.
The analysis emphasizes the ongoing need for targeted interventions to address disparities in home healthcare delivery and ensure equitable access to quality care across urban and rural regions. Our findings have the potential to inform policy and practice, promoting equity and efficiency in the long-term care system, for better health outcomes throughout the USA.
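The urban-versus-rural comparison described above can be sketched in a few lines. The abstract says only "Wilcoxon test", so this sketch assumes a rank-sum (Mann-Whitney U) comparison of independent groups; the group sizes and rate distributions are invented for illustration, not taken from the CMS data.

```python
# Illustrative Wilcoxon rank-sum (Mann-Whitney U) comparison of hospital
# admission rates between urban and rural HHAs. Simulated data only.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(42)
urban = rng.normal(15.0, 2.0, 300)   # hypothetical admission rates (%)
rural = rng.normal(17.0, 2.5, 200)   # rural rates assumed higher on average

stat, p = mannwhitneyu(urban, rural, alternative="two-sided")
print(f"U = {stat:.0f}, p = {p:.2e}")
```

A rank-based test like this makes no normality assumption about the rate distributions, which is one reason it is a common choice for skewed utilization measures.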


Subject(s)
Healthcare Disparities , Quality of Health Care , Rural Health Services , Rural Population , Humans , United States , Rural Population/statistics & numerical data , Rural Health Services/statistics & numerical data , Healthcare Disparities/statistics & numerical data , Urban Population/statistics & numerical data , Home Care Services/statistics & numerical data , Home Care Services/standards , Home Care Agencies , Emergency Service, Hospital/statistics & numerical data , Hospitalization/statistics & numerical data