ABSTRACT
Bayesian empirical likelihood (BEL) models are becoming increasingly popular as an attractive alternative to fully parametric models. However, they have only recently been applied to spatial data analysis for small area estimation. This study considers the development of spatial BEL models using two popular conditional autoregressive (CAR) priors, namely the BYM and Leroux priors. The performance of the proposed models is compared with their parametric counterparts and with existing spatial BEL models using independent Gaussian priors and generalised Moran basis priors. The models are applied to two benchmark spatial datasets, a simulation study, and COVID-19 data. The results indicate promising opportunities for these models to capture new insights into spatial data. Specifically, the spatial BEL models outperform the parametric spatial models when the underlying distributional assumptions about the data appear to be violated.
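As a rough illustration of the two CAR priors named above: the BYM prior adds a spatially structured (intrinsic CAR) effect to an unstructured Gaussian effect, while the Leroux prior uses a single effect whose precision matrix mixes the two through a parameter rho. The sketch below builds the Leroux precision matrix under one common parameterisation, Q = tau * (rho * (D - W) + (1 - rho) * I). The function name, the symbols `rho` and `tau`, and the four-area toy map are assumptions made for illustration; the abstract does not give the authors' parameterisation.

```python
def leroux_precision(adjacency, rho, tau):
    """Return Q = tau * (rho * (D - W) + (1 - rho) * I) as a list of lists.

    adjacency: symmetric 0/1 neighbourhood matrix W (list of lists).
    rho: spatial mixing parameter in [0, 1] (assumed symbol).
    tau: overall precision, tau > 0 (assumed symbol).
    """
    n = len(adjacency)
    precision = []
    for i in range(n):
        n_neighbours = sum(adjacency[i])  # D_ii, the number of neighbours of area i
        row = []
        for j in range(n):
            if i == j:
                row.append(tau * (rho * n_neighbours + (1.0 - rho)))
            else:
                row.append(-tau * rho * adjacency[i][j])
        precision.append(row)
    return precision


# Hypothetical toy map: four areas arranged in a line (1-2-3-4).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
Q = leroux_precision(W, rho=0.5, tau=2.0)
```

Under this parameterisation, rho = 0 gives independent Gaussian effects (Q = tau * I), and rho = 1 gives the intrinsic CAR structure that forms the spatial component of BYM.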
Subject(s)
COVID-19 , Bayes Theorem , COVID-19/epidemiology , Humans , Likelihood Functions , Normal Distribution , Spatial Analysis
ABSTRACT
Peer-grouping is used in many sectors for organisational learning, policy implementation, and benchmarking. Clustering provides a statistical, data-driven method for constructing meaningful peer groups, but peer groups must be compatible with business constraints such as size and stability considerations. Additionally, statistical peer groups are constructed from many different variables and can be difficult to understand, especially for non-statistical audiences. We developed a methodology to apply business constraints to clustering solutions and allow the decision-maker to choose the balance between statistical goodness-of-fit and conformity to business constraints. Several tools were used to identify complex distinguishing features in peer groups, and a number of visualisations were developed to explain high-dimensional clusters to non-statistical audiences. In a case study where peer group size was required to be small (≤ 100 members), we applied constrained clustering to a noisy high-dimensional dataset over two consecutive years, ensuring that the clusters were sufficiently stable between years. Our approach not only satisfied clustering constraints on the test data, but also maintained an almost monotonic negative relationship between goodness-of-fit and stability between consecutive years. We demonstrated in the context of the case study how distinguishing features between clusters can be communicated clearly to stakeholders with either substantial or limited statistical knowledge.
Subject(s)
Learning , Peer Group , Benchmarking , Cluster Analysis , Humans
ABSTRACT
Conventional genome-wide association studies (GWASs) of complex traits, such as Multiple Sclerosis (MS), are reliant on per-SNP p-values and are therefore heavily burdened by multiple testing correction. Thus, in order to detect more subtle alterations, ever-increasing sample sizes are required, while potentially valuable information that is readily available in existing datasets is ignored. To overcome this, we used penalised regression incorporating elastic net with a stability selection method by iterative subsampling to detect the potential interaction of loci with MS risk. Through re-analysis of the ANZgene dataset (1617 cases and 1988 controls) and an IMSGC dataset as a replication cohort (1313 cases and 1458 controls), we identified new association signals for MS predisposition, including SNPs above and below conventional significance thresholds, while targeting two natural killer receptor loci and the well-established HLA loci. For example, rs2844482 (selected in 98.1% of iterations), otherwise ignored by conventional statistics (p = 0.673) in the same dataset, was independently strongly associated with MS in another GWAS that required more than 40 times the number of cases (~45 K). Further comparison of our hits to those present in a large-scale meta-analysis confirmed that the majority of SNPs identified by the elastic net model reached conventional statistical GWAS thresholds (p < 5 × 10⁻⁸) in this much larger dataset. Moreover, we found that gene variants involved in oxidative stress, in addition to innate immunity, were associated with MS. Overall, this study highlights the benefit of using more advanced statistical methods to (re-)analyse subtle genetic variation among loci that have a biological basis for their contribution to disease risk.
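The general idea of stability selection by iterative subsampling can be sketched as follows: refit a penalised model on many random half-samples and record how often each predictor receives a non-zero coefficient, which is the kind of quantity a figure such as "98.1% of iterations" summarises. This is a minimal sketch and not the authors' pipeline. It assumes a squared-error loss fitted by coordinate descent rather than a case/control (logistic) model, and the function names, the half-sample size, and the `alpha`, `l1_ratio`, `n_sweeps`, and `n_iter` settings are all hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_sweeps=50):
    """Coordinate descent for the elastic net objective
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2."""
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    weights = [0.0] * p
    residual = list(y)  # equals y - Xw because all weights start at zero
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # constant-zero column carries no information
            rho = sum(c * (r + c * weights[j]) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (scale + alpha * (1.0 - l1_ratio))
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                weights[j] = new_w
    return weights


def selection_frequencies(X, y, n_iter=50, seed=0, alpha=0.1, l1_ratio=0.5):
    """Fraction of half-sample refits in which each predictor is non-zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_iter):
        rows = rng.sample(range(n), n // 2)  # subsample without replacement
        w = fit_elastic_net([X[i] for i in rows], [y[i] for i in rows], alpha, l1_ratio)
        for j, w_j in enumerate(w):
            if w_j != 0.0:
                counts[j] += 1
    return [c / n_iter for c in counts]
```

Ranking predictors by selection frequency rather than by a per-predictor p-value is what lets a variable with an unremarkable marginal p-value still stand out if it is chosen consistently across subsamples.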
Subject(s)
HLA Antigens/genetics , Multiple Sclerosis/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci , Natural Killer Cell Receptors/genetics , Case-Control Studies , Cohort Studies , Female , Genome-Wide Association Study , Humans , Male , Multiple Sclerosis/pathology , Regression Analysis
ABSTRACT
PURPOSE: To retrospectively evaluate the performance of computed tomography (CT) angiography in the detection and localization of clinically active gastrointestinal (GI) hemorrhage of an unknown source. MATERIALS AND METHODS: Eighty-six CT angiograms were obtained in 74 patients with the clinical diagnosis of acute GI hemorrhage of an unknown source. Results of CT angiography were recorded, and the patients' electronic medical records were reviewed for documentation of subsequent interventional procedures performed within 24 hours of the reference CT angiogram to diagnose or control ongoing GI hemorrhage. Surgical, endoscopic, and final pathologic reports, if available, were reviewed. RESULTS: Twenty-two of the 86 CT angiograms (26%) were positive for active hemorrhage, with findings confirmed in 19 of the 22 cases (86%). Thirteen cases were confirmed with angiography, five cases were confirmed with surgery, and one case was confirmed with autopsy. Sixty-four of the 86 CT angiograms were negative, and 59 (92%) of the CT angiograms required no further intervention. These patients were discharged without incident. There were no cases in which CT angiography was negative and subsequent angiography within 24 hours was positive. The overall sensitivity, specificity, accuracy, and positive and negative predictive value of CT angiography in the detection of active GI hemorrhage within this study population were 79%, 95%, 91%, 86%, and 92%, respectively. CONCLUSIONS: CT angiography provides valuable information that can be used to determine the appropriateness of catheter angiography and guide mesenteric catheterization if a bleeding source is localized. The authors' experience with this study cohort supports its use before angiography in those patients with acute GI bleeding of an unknown source who are being considered for catheter-directed intervention.
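The reported percentages can be checked with simple arithmetic. The sketch below assumes a 2×2 table inferred from the abstract rather than stated in it: 19 true positives and 3 false positives among the 22 positive scans, and 59 true negatives and 5 false negatives among the 64 negative scans (treating the 5 negative scans that did not avoid further intervention as false negatives is an assumption). The function and variable names are illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard diagnostic accuracy measures as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts inferred from the abstract (an assumption, see the lead-in above).
metrics = diagnostic_metrics(tp=19, fp=3, fn=5, tn=59)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.0f}%")
```

With these counts the rounded values match the 79%, 95%, 91%, 86%, and 92% given in the abstract.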
Subject(s)
Angiography/methods , Gastrointestinal Hemorrhage/diagnostic imaging , X-Ray Computed Tomography/methods , Adult , Aged , Aged 80 and over , Female , Humans , Longitudinal Studies , Male , Middle Aged , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
Epigenome-wide association studies (EWASs) seek to identify DNA methylation sites associated with clinical outcomes. Differences in observed methylation between specific cell-subtypes are often of interest; however, available samples often comprise a mixture of cells. To date, cell-subtype estimates have been obtained from mixed-cell DNA data using linear regression models, but the accuracy of such estimates has not been critically assessed. We evaluated linear regression performance for cell-subtype specific methylation estimation using a 450K methylation array dataset of both mixed-cell and cell-subtype sorted samples from six healthy males. CpGs associated with each cell-subtype were first identified using t-tests between groups of cell-subtype sorted samples. Subsequent reduced panels of reliably accurate CpGs were identified from mixed-cell samples using an accuracy heuristic (D). Performance was assessed by comparing cell-subtype specific estimates from mixed cells with the corresponding cell-sorted means using the mean absolute error (MAE) and the coefficient of determination (R²). At the cell-subtype level, methylation levels at 3272 CpGs could be estimated to within an MAE of 5% of the expected value. The cell-subtypes with the highest accuracy were CD56+ NK (R² = 0.56) and CD8+ T (R² = 0.48), where 23% of sites were accurately estimated. Hierarchical clustering and pathway enrichment analysis confirmed the biological relevance of the panels. Our results suggest that linear regression for cell-subtype specific methylation estimation is accurate only for some cell-subtypes at a small fraction of cell-associated sites but may be applicable to EWASs of disease traits with a blood-based pathology. Although sample size was a limitation in this study, we suggest that alternative statistical methods will provide the greatest performance improvements.
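The two performance measures named above can be written out in a few lines. This is a minimal sketch using the textbook definitions, with R² computed as one minus the ratio of residual to total sum of squares; the abstract does not say which R² variant the authors used, and the function names and any input values are hypothetical.

```python
def mean_absolute_error(estimates, references):
    """Average absolute difference between estimated and reference values."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


def r_squared(estimates, references):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_ref = sum(references) / len(references)
    ss_res = sum((r - e) ** 2 for e, r in zip(estimates, references))
    ss_tot = sum((r - mean_ref) ** 2 for r in references)
    if ss_tot == 0.0:
        raise ValueError("reference values are constant, so R^2 is undefined")
    return 1.0 - ss_res / ss_tot
```

Here `estimates` would hold the cell-subtype specific methylation values derived from mixed-cell samples and `references` the corresponding means from cell-sorted samples, both on the same scale.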