Results 1 - 20 of 220
1.
Methods Mol Biol ; 2812: 155-168, 2024.
Article in English | MEDLINE | ID: mdl-39068361

ABSTRACT

This chapter shows how to apply the Asymmetric Within-Sample Transformation to single-cell RNA-Seq data combined with a previous dropout imputation. The asymmetric transformation is a special winsorization that flattens low-expressed intensities and preserves highly expressed gene levels. Before a standard hierarchical clustering algorithm is run, an intermediate step removes noninformative genes according to a threshold applied to a per-gene entropy estimate. Following the clustering, a time-intensive algorithm is shown that uncovers the molecular features associated with each cluster. This step implements a resampling algorithm to generate a random baseline against which significantly up- and downregulated genes are measured. To this end, we adopt a GLM as implemented in the DESeq2 package. We render the results graphically. While the tools are standard heat maps, we introduce some data scaling to clarify the reliability of the results.
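
The entropy-based gene filter mentioned in the abstract can be illustrated with a minimal sketch. The binning scheme, the threshold value, and the names `shannon_entropy` and `filter_genes` are assumptions made for illustration; the abstract does not say how the chapter estimates per-gene entropy.

```python
import math

def shannon_entropy(values, n_bins=4):
    """Entropy (in bits) of one gene's expression values after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant gene carries no information
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the maximum value falls in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def filter_genes(expr, threshold):
    """Keep genes whose entropy is strictly above the (assumed) threshold."""
    return {g: v for g, v in expr.items() if shannon_entropy(v) > threshold}

# Hypothetical expression values across eight cells.
expr = {
    "flat_gene": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    "bimodal_gene": [0.0, 0.0, 0.0, 0.0, 8.0, 8.0, 8.0, 8.0],
    "spread_gene": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
}
kept = filter_genes(expr, threshold=0.5)
```

In this toy input the constant gene is dropped because its entropy is zero, while the two genes that vary across cells are retained for clustering.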


Subject(s)
Algorithms, Single-Cell Analysis, Single-Cell Analysis/methods, Cluster Analysis, Humans, Gene Expression Profiling/methods, Software, Computational Biology/methods, RNA-Seq/methods
2.
Biom J ; 66(5): e202300197, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38953619

ABSTRACT

In biomedical research, the simultaneous inference of multiple binary endpoints may be of interest. In such cases, an appropriate multiplicity adjustment is required that controls the family-wise error rate, which is the probability of making at least one false rejection. In this paper, we investigate two approaches that perform single-step $p$-value adjustments that also take into account the possible correlation between endpoints. A rather novel and flexible approach known as multiple marginal models is considered, which is based on stacking of the parameter estimates of the marginal models and deriving their joint asymptotic distribution. We also investigate a nonparametric vector-based resampling approach, and we compare both approaches with the Bonferroni method by examining the family-wise error rate and power for different parameter settings, including low proportions and small sample sizes. The results show that the resampling-based approach consistently outperforms the other methods in terms of power, while still controlling the family-wise error rate. The multiple marginal models approach, on the other hand, shows a more conservative behavior. However, it offers more versatility in application, allowing for more complex models or straightforward computation of simultaneous confidence intervals. The practical application of the methods is demonstrated using a toxicological dataset from the National Toxicology Program.
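
To make the contrast between Bonferroni and a correlation-aware single-step adjustment concrete, here is a small sketch. The `single_step_maxt` function follows the generic max-statistic resampling idea and is an assumption for illustration; the abstract does not state which resampling statistic the paper uses, and all numbers are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Single-step Bonferroni: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

def single_step_maxt(observed, resampled):
    """Adjusted p-value per endpoint: share of resamples whose largest
    absolute statistic is at least as large as the observed absolute statistic."""
    max_per_resample = [max(abs(t) for t in row) for row in resampled]
    b = len(resampled)
    return [sum(m >= abs(t) for m in max_per_resample) / b for t in observed]

# Hypothetical raw p-values for three binary endpoints.
raw = [0.004, 0.020, 0.300]
adjusted = bonferroni_adjust(raw)
rejected = [p_adj <= 0.05 for p_adj in adjusted]

# Hypothetical observed statistics and four resampled statistic vectors
# (each row keeps the endpoints together, preserving their correlation).
observed = [2.5, 1.0]
resampled = [[0.5, -0.3], [1.2, 0.8], [-2.7, 0.1], [0.2, -1.1]]
maxt_adjusted = single_step_maxt(observed, resampled)
```

Because every resample keeps the endpoint vector intact, the reference distribution of the maximum reflects the dependence between endpoints, which Bonferroni ignores.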


Subject(s)
Biomedical Research, Biometry, Statistical Models, Biometry/methods, Biomedical Research/methods, Sample Size, Endpoint Determination, Humans
3.
Int J Mol Sci ; 25(13)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-39000413

ABSTRACT

Our study aims to address the methodological challenges frequently encountered in RNA-Seq data analysis within cancer studies. Specifically, it enhances the identification of key genes involved in axillary lymph node metastasis (ALNM) in breast cancer. We employ Generalized Linear Models with Quasi-Likelihood (GLMQL) to manage the inherently discrete and overdispersed nature of RNA-Seq data, marking a significant improvement over conventional methods such as the t-test, which assumes a normal distribution and equal variances across samples. We utilize the Trimmed Mean of M-values (TMM) method for normalization to address library-specific compositional differences effectively. Our study focuses on a distinct cohort of 104 untreated patients from the TCGA Breast Invasive Carcinoma (BRCA) dataset to maintain an untainted genetic profile, thereby providing more accurate insights into the genetic underpinnings of lymph node metastasis. This strategic selection paves the way for developing early intervention strategies and targeted therapies. Our analysis is exclusively dedicated to protein-coding genes, enriched by the Magnitude Altitude Scoring (MAS) system, which rigorously identifies key genes that could serve as predictors in developing an ALNM predictive model. Our novel approach has pinpointed several genes significantly linked to ALNM in breast cancer, offering vital insights into the molecular dynamics of cancer development and metastasis. These genes, including ERBB2, CCNA1, FOXC2, LEFTY2, VTN, ACKR3, and PTGS2, are involved in key processes like apoptosis, epithelial-mesenchymal transition, angiogenesis, response to hypoxia, and KRAS signaling pathways, which are crucial for tumor virulence and the spread of metastases.
Moreover, the approach has also emphasized the importance of the small proline-rich protein family (SPRR), including SPRR2B, SPRR2E, and SPRR2D, recognized for their significant involvement in cancer-related pathways and their potential as therapeutic targets. Important transcripts such as H3C10, H1-2, PADI4, and others have been highlighted as critical in modulating the chromatin structure and gene expression, fundamental for the progression and spread of cancer.


Subject(s)
Breast Neoplasms, Neoplastic Gene Expression Regulation, Lymphatic Metastasis, Humans, Breast Neoplasms/genetics, Breast Neoplasms/pathology, Lymphatic Metastasis/genetics, Female, RNA-Seq/methods, Gene Expression Profiling/methods, Lymph Nodes/pathology, Axilla, Tumor Biomarkers/genetics, RNA Sequence Analysis/methods
4.
Ecol Evol ; 14(7): e11387, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38994210

ABSTRACT

Generalized linear models (GLMs) are an integral tool in ecology. Like general linear models, GLMs assume linearity, which entails a linear relationship between independent and dependent variables. However, because this assumption acts on the link rather than the natural scale in GLMs, it is more easily overlooked. We reviewed recent ecological literature to quantify the use of linearity. We then used two case studies to confront the linearity assumption via two GLMs fit to empirical data. In the first case study we compared GLMs to generalized additive models (GAMs) fit to mammal relative abundance data. In the second case study we tested for linearity in occupancy models using passerine point-count data. We reviewed 162 studies published in the last 5 years in five leading ecology journals and found that fewer than 15% reported testing for linearity. These studies used transformations and GAMs more often than they reported a linearity test. In the first case study, GAMs strongly out-performed GLMs as measured by AIC in modeling relative abundance, and GAMs helped uncover nonlinear responses of carnivore species to landscape development. In the second case study, 14% of species-specific models failed a formal statistical test for linearity. We also found that differences between linear and nonlinear (i.e., those with a transformed independent variable) model predictions were similar for some species but not for others, with implications for inference and conservation decision-making. Our review suggests that reporting tests for linearity is rare in recent studies employing GLMs. Our case studies show how formally comparing models that allow for nonlinear relationships between the dependent and independent variables has the potential to impact inference, generate new hypotheses, and alter conservation implications. We conclude by suggesting that ecological studies report tests for linearity and use formal methods to address linearity assumption violations in GLMs.
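
The point that linearity acts on the link scale rather than the natural scale can be seen in a few lines. This is a toy illustration with a log link and made-up coefficients (`intercept`, `slope`), not a model from the paper.

```python
import math

def poisson_mean(x, intercept=0.5, slope=0.3):
    """Expected count under a log-link GLM: linear predictor, then inverse link."""
    eta = intercept + slope * x   # linear in x on the link (log) scale
    return math.exp(eta)          # curved in x on the natural (count) scale

xs = [0, 1, 2, 3]
means = [poisson_mean(x) for x in xs]

# Equal steps in x give growing differences on the natural scale...
differences = [b - a for a, b in zip(means, means[1:])]
# ...but a constant ratio, equal to exp(slope).
ratios = [b / a for a, b in zip(means, means[1:])]
```

A covariate whose true effect is not linear on the log scale would violate the assumption even if a plot of raw counts against the covariate looks unremarkable, which is one reason the assumption is easy to overlook.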

5.
Sci Total Environ ; 946: 174324, 2024 Oct 10.
Article in English | MEDLINE | ID: mdl-38960195

ABSTRACT

Development of effective prevention and mitigation strategies for marine plastic pollution requires a better understanding of the pathways and transport mechanisms of plastic waste. Yet the role of estuaries as a key interface between riverine inputs of plastic pollution and delivery to receiving marine environments remains poorly understood. This study quantified the concentration and distribution of microplastics (MPs) (50-3200 µm) in surface waters of the St. Lawrence Estuary (SLE) in eastern Canada. Microplastics were identified and enumerated based on particle morphology, colour, and size class. Fourier Transform Infrared (FTIR) spectroscopy was used on a subset of particles to identify polymers. Generalized linear models (Gamma distribution with log-link) examined the relationship between MP concentrations and oceanographic variables and anthropogenic sources. Finally, a risk assessment model, using MP concentrations and chemical hazards based on polymer types, estimated the MP pollution risk to ecosystem health. Mean surface MP concentration in the SLE was 120 ± 42 SD particles m⁻³; MP concentrations were highest in the fluvial section and lowest in the Northwest Gulf of St. Lawrence. However, MP concentrations exhibited high heterogeneity along the length and width of the SLE. Microplastics were elevated at stations located closer to wastewater treatment plant outflows and downstream sites with more agricultural land. Black, blue, and transparent fibers and fragments ≤250 µm were most commonly encountered. Predominant polymer types included polyethylene terephthalate, regenerated cellulose, polyethylene, and alkyds. While the overall risk to ecosystem health in the entire estuary was considered low, several stations, particularly near urban centres, were at high or very high risk.
This study provides new insights into the quantification and distribution of MPs and first estimates of the risk of MP pollution to ecosystem health in one of the world's largest estuaries.
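
A Gamma GLM with a log link, as named in the abstract, can be fit by iteratively reweighted least squares. The sketch below handles a single covariate in plain Python; the data, the covariate (`distance_km`), and the function name are hypothetical and are not taken from the study.

```python
import math

def fit_gamma_log_glm(x, y, n_iter=25):
    """Fit log(E[y]) = b0 + b1 * x by IRLS and return (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    eta = [math.log(yi) for yi in y]   # start from the observed responses
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response. With variance proportional to mu^2 and a log link,
        # the IRLS weights are all equal, so each step is ordinary least squares.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        z_bar = sum(z) / n
        b1 = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
        b0 = z_bar - b1 * x_bar
        eta = [b0 + b1 * xi for xi in x]
    return b0, b1

# Hypothetical concentrations (particles per cubic metre) at increasing
# distance from a hypothetical outflow.
distance_km = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
concentration = [310.0, 240.0, 150.0, 120.0, 70.0, 60.0]
b0, b1 = fit_gamma_log_glm(distance_km, concentration)
# exp(b1) is the multiplicative change in expected concentration per kilometre.
```

The log link keeps fitted concentrations positive, and the Gamma variance function lets the spread grow with the mean, which suits right-skewed concentration data.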

6.
IISE Trans Healthc Syst Eng ; 14(2): 130-140, 2024.
Article in English | MEDLINE | ID: mdl-39055377

ABSTRACT

Radiation therapy (RT) is a frontline approach to treating cancer. While the target of radiation dose delivery is the tumor, there is an inevitable spill of dose to nearby normal organs causing complications. This phenomenon is known as radiotherapy toxicity. To predict the outcome of the toxicity, statistical models can be built based on dosimetric variables received by the normal organ at risk (OAR), known as Normal Tissue Complication Probability (NTCP) models. To tackle the challenge of the high dimensionality of dosimetric variables and limited clinical sample sizes, statistical models with variable selection techniques are viable choices. However, existing variable selection techniques are data-driven and do not integrate medical domain knowledge into the model formulation. We propose a knowledge-constrained generalized linear model (KC-GLM). KC-GLM includes a new mathematical formulation to translate three pieces of domain knowledge into non-negativity, monotonicity, and adjacent similarity constraints on the model coefficients. We further propose an equivalent transformation of the KC-GLM formulation, which makes it possible to solve the model coefficients using existing optimization solvers. Furthermore, we compare KC-GLM and several well-known variable selection techniques via a simulation study and on two real datasets of prostate cancer and lung cancer, respectively. These experiments show that KC-GLM selects variables with better interpretability, avoids producing counter-intuitive and misleading results, and has better prediction accuracy.
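
The three constraint types named in the abstract can be pictured as simple checks on a coefficient vector. The exact mathematical form used by KC-GLM is not given here, so the direction of the monotonicity constraint, the gap bound `max_gap`, and the function name are all assumptions for illustration only.

```python
def check_constraints(beta, max_gap):
    """Report which of three illustrative constraints a coefficient vector meets.

    beta is assumed to be ordered by dose level (an assumption, not the paper's
    stated layout); max_gap is a hypothetical bound on adjacent differences.
    """
    pairs = list(zip(beta, beta[1:]))
    return {
        "non_negative": all(b >= 0 for b in beta),
        "monotone": all(a <= b for a, b in pairs),
        "adjacent_similar": all(abs(b - a) <= max_gap for a, b in pairs),
    }

# Hypothetical coefficients for four adjacent dose levels.
ok = check_constraints([0.0, 0.1, 0.15, 0.3], max_gap=0.2)
bad = check_constraints([0.2, -0.1, 0.4], max_gap=0.2)
```

In a constrained fit these conditions would be imposed inside the optimization rather than checked afterwards; the check only shows what each constraint rules out, such as a negative coefficient between two positive neighbours.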

7.
BMC Med Res Methodol ; 24(1): 81, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38561661

ABSTRACT

BACKGROUND: Epidemiological studies in refugee settings are often challenged by the denominator problem, i.e. lack of population at risk data. We develop an empirical approach to address this problem by assessing relationships between occupancy data in refugee centres, number of refugee patients in walk-in clinics, and diseases of the digestive system. METHODS: Individual-level patient data from a primary care surveillance system (PriCarenet) was matched with occupancy data retrieved from immigration authorities. The three relationships were analysed using regression models, considering age, sex, and type of centre. Then predictions for the respective data category not available in each of the relationships were made. Twenty-one German on-site health care facilities in state-level registration and reception centres participated in the study, covering the time period from November 2017 to July 2021. RESULTS: 445 observations ("centre-months") for patient data from electronic health records (EHR, a mean of 230 refugee patients visiting walk-in clinics per month and centre; standard deviation sd: 202) of a total of 47,617 refugee patients were available, 215 for occupancy data (OCC, mean occupancy of 348 residents, sd: 287), 147 for both (matched), leaving 270 observations without occupancy (EHR-unmatched) and 40 without patient data (OCC-unmatched). The incidence of diseases of the digestive system, using patients as denominators in the different sub-data sets, was 9.2% (sd: 5.9) in EHR, 8.8% (sd: 5.1) when matched, 9.6% (sd: 6.4) in EHR- and 12% (sd: 2.9) in OCC-unmatched. Using the available or predicted occupancy as denominator yielded average incidence estimates (per centre and month) of 4.7% (sd: 3.2) in matched data, 4.8% (sd: 3.3) in EHR- and 7.4% (sd: 2.7) in OCC-unmatched.
CONCLUSIONS: By modelling the ratio between patient and occupancy numbers in refugee centres depending on sex and age, as well as on the total number of patients or occupancy, the denominator problem in health monitoring systems could be mitigated. The approach helped to estimate the missing component of the denominator, and to compare disease frequency across time and refugee centres more accurately using an empirically grounded prediction of disease frequency based on demographic and centre typology. This avoided over-estimation of disease frequency as opposed to the use of patients as denominators.
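
The effect of the denominator choice can be shown with simple arithmetic. The sketch below replaces the study's regression models (which account for sex, age, and centre type) with a single pooled patients-to-residents ratio, purely as a simplification; all counts are hypothetical.

```python
def incidence_percent(cases, denominator):
    """Cases per 100 units of the chosen denominator."""
    return 100.0 * cases / denominator

def pooled_ratio(matched):
    """Patients-to-residents ratio pooled over matched centre-months."""
    total_patients = sum(p for p, _ in matched)
    total_residents = sum(r for _, r in matched)
    return total_patients / total_residents

def predict_occupancy(patients, ratio):
    """Estimate the missing occupancy from the patient count."""
    return patients / ratio

# Hypothetical matched centre-months: (patients, residents).
matched = [(200, 400), (150, 300), (50, 100)]
ratio = pooled_ratio(matched)

# A hypothetical centre-month with patient data but no occupancy data.
patients, cases = 120, 12
predicted_residents = predict_occupancy(patients, ratio)

per_patients = incidence_percent(cases, patients)
per_residents = incidence_percent(cases, predicted_residents)
```

Because only some residents visit the clinic, dividing by patients gives a larger figure than dividing by the estimated population at risk, which matches the direction of the over-estimation described in the conclusions.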


Subject(s)
Refugees, Humans, Electronic Health Records, Emigration and Immigration, Risk Factors, Electronics
8.
Front Microbiol ; 15: 1342328, 2024.
Article in English | MEDLINE | ID: mdl-38655085

ABSTRACT

Introduction: Our study undertakes a detailed exploration of gene expression dynamics within human lung organ tissue equivalents (OTEs) in response to Influenza A virus (IAV), Human metapneumovirus (MPV), and Parainfluenza virus type 3 (PIV3) infections. Through the analysis of RNA-Seq data from 19,671 genes, we aim to identify differentially expressed genes under various infection conditions, elucidating the complexities of virus-host interactions. Methods: We employ Generalized Linear Models (GLMs) with Quasi-Likelihood (QL) F-tests (GLMQL) and introduce the novel Magnitude-Altitude Score (MAS) and Relaxed Magnitude-Altitude Score (RMAS) algorithms to navigate the intricate landscape of RNA-Seq data. This approach facilitates the precise identification of potential biomarkers, highlighting the host's reliance on innate immune mechanisms. Our comprehensive methodological framework includes RNA extraction, library preparation, sequencing, and Gene Ontology (GO) enrichment analysis to interpret the biological significance of our findings. Results: The differential expression analysis unveils significant changes in gene expression triggered by IAV, MPV, and PIV3 infections. The MAS and RMAS algorithms enable focused identification of biomarkers, revealing a consistent activation of interferon-stimulated genes (e.g., IFIT1, IFIT2, IFIT3, OAS1) across all viruses. Our GO analysis provides deep insights into the host's defense mechanisms and viral strategies exploiting host cellular functions. Notably, changes in cellular structures, such as cilium assembly and mitochondrial ribosome assembly, indicate a strategic shift in cellular priorities. The precision of our methodology is validated by a 92% mean accuracy in classifying respiratory virus infections using multinomial logistic regression, demonstrating the superior efficacy of our approach over traditional methods. 
Discussion: This study highlights the intricate interplay between viral infections and host gene expression, underscoring the need for targeted therapeutic interventions. The stability and reliability of the MAS/RMAS ranking method, even under stringent statistical corrections, and the critical importance of adequate sample size for biomarker reliability are significant findings. Our comprehensive analysis not only advances our understanding of the host's response to viral infections but also sets a new benchmark for the identification of biomarkers, paving the way for the development of effective diagnostic and therapeutic strategies.

9.
Neuroimage ; 290: 120557, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38423264

ABSTRACT

BACKGROUND: Time series analysis is critical for understanding brain signals and their relationship to behavior and cognition. Cluster-based permutation tests (CBPT) are commonly used to analyze a variety of electrophysiological signals including EEG, MEG, ECoG, and sEEG data without a priori assumptions about specific temporal effects. However, two major limitations of CBPT include the inability to directly analyze experiments with multiple fixed effects and the inability to account for random effects (e.g. variability across subjects). Here, we propose a flexible multi-step hypothesis testing strategy using CBPT with Linear Mixed Effects Models (LMEs) and Generalized Linear Mixed Effects Models (GLMEs) that can be applied to a wide range of experimental designs and data types. METHODS: We first evaluate the statistical robustness of LMEs and GLMEs using simulated data distributions. Second, we apply a multi-step hypothesis testing strategy to analyze ERPs and broadband power signals extracted from human ECoG recordings collected during a simple image viewing experiment with image category and novelty as fixed effects. Third, we assess the statistical power differences between analyzing signals with CBPT using LMEs compared to CBPT using separate t-tests run on each fixed effect through simulations that emulate broadband power signals. Finally, we apply CBPT using GLMEs to high-gamma burst data to demonstrate the extension of the proposed method to the analysis of nonlinear data. RESULTS: First, we found that LMEs and GLMEs are robust statistical models. In simple simulations LMEs produced highly congruent results with other appropriately applied linear statistical models, but LMEs outperformed many linear statistical models in the analysis of "suboptimal" data and maintained power better than analyzing individual fixed effects with separate t-tests. GLMEs also performed similarly to other nonlinear statistical models. 
Second, in real world human ECoG data, LMEs performed at least as well as separate t-tests when applied to predefined time windows or when used in conjunction with CBPT. Additionally, fixed effects time courses extracted with CBPT using LMEs from group-level models of pseudo-populations replicated latency effects found in individual category-selective channels. Third, analysis of simulated broadband power signals demonstrated that CBPT using LMEs was superior to CBPT using separate t-tests in identifying time windows with significant fixed effects especially for small effect sizes. Lastly, the analysis of high-gamma burst data using CBPT with GLMEs produced results consistent with CBPT using LMEs applied to broadband power data. CONCLUSIONS: We propose a general approach for statistical analysis of electrophysiological data using CBPT in conjunction with LMEs and GLMEs. We demonstrate that this method is robust for experiments with multiple fixed effects and applicable to the analysis of linear and nonlinear data. Our methodology maximizes the statistical power available in a dataset across multiple experimental variables while accounting for hierarchical random effects and controlling FWER across fixed effects. This approach substantially improves power leading to better reproducibility. Additionally, CBPT using LMEs and GLMEs can be used to analyze individual channels or pseudo-population data for the comparison of functional or anatomical groups of data.
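
For orientation, the classical cluster-based permutation test that the paper builds on can be sketched as follows. This version uses a one-sample t statistic per time point with sign-flip permutations; it does not implement the LME or GLME variant the authors propose, and the threshold, permutation count, and simulated data are assumptions.

```python
import math
import random

def t_stats(data):
    """One-sample t statistic at each time point; data holds one series per subject."""
    n = len(data)
    stats = []
    for t in range(len(data[0])):
        col = [row[t] for row in data]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        stats.append(mean / math.sqrt(var / n))
    return stats

def max_cluster_mass(stats, threshold):
    """Largest summed |t| over a run of adjacent supra-threshold points of one sign."""
    best = current = 0.0
    prev_sign = 0
    for s in stats:
        sign = (s > threshold) - (s < -threshold)
        if sign != 0 and sign == prev_sign:
            current += abs(s)      # extend the current cluster
        elif sign != 0:
            current = abs(s)       # start a new cluster
        else:
            current = 0.0          # below threshold: no cluster
        prev_sign = sign
        best = max(best, current)
    return best

def cluster_permutation_p(data, threshold=2.0, n_perm=500, seed=0):
    """Return the observed maximum cluster mass and its permutation p-value."""
    rng = random.Random(seed)
    observed = max_cluster_mass(t_stats(data), threshold)
    exceed = 0
    for _ in range(n_perm):
        # Flip the sign of each subject's whole series to build the null distribution.
        flipped = []
        for row in data:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * v for v in row])
        if max_cluster_mass(t_stats(flipped), threshold) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated data: 12 subjects, 20 time points, an effect injected at points 8..12.
rng = random.Random(1)
data = []
for _ in range(12):
    row = [rng.gauss(0.0, 1.0) for _ in range(20)]
    for t in range(8, 13):
        row[t] += 1.5
    data.append(row)
mass, p_value = cluster_permutation_p(data)
```

Comparing only the maximum cluster mass against its permutation distribution is what controls the family-wise error rate across time points; the paper's extension swaps the per-point t-test for a mixed-effects model so that several fixed effects and random effects can be handled.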


Subject(s)
Brain, Research Design, Humans, Reproducibility of Results, Brain/physiology, Statistical Models, Linear Models
10.
Genome Biol ; 25(1): 37, 2024 01 30.
Article in English | MEDLINE | ID: mdl-38291503

ABSTRACT

Sample multiplexing enables pooled analysis during single-cell RNA sequencing workflows, thereby increasing throughput and reducing batch effects. A challenge for all multiplexing techniques is to link sample-specific barcodes with cell-specific barcodes, then demultiplex sample identity post-sequencing. However, existing demultiplexing tools fail under many real-world conditions where barcode cross-contamination is an issue. We therefore developed deMULTIplex2, an algorithm inspired by a mechanistic model of barcode cross-contamination. deMULTIplex2 employs generalized linear models and expectation-maximization to probabilistically determine the sample identity of each cell. Benchmarking reveals superior performance across various experimental conditions, particularly on large or noisy datasets with unbalanced sample compositions.


Subject(s)
Single-Cell Analysis, Single-Cell Gene Expression Analysis, Single-Cell Analysis/methods, Algorithms, RNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods
11.
Trop Anim Health Prod ; 56(1): 42, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38214742

ABSTRACT

Cattle weight development is highly correlated with some body measurements. Based on the relationship between morphometric measurements and body mass, our aim was to develop regression equations to estimate the body weight of Curraleiro Pé-Duro (CPD) cattle to be used in farms that lack access to weighing scales. Data from 1023 animals from four farms on withers height (WH), body length (BL), body score (BS), heart girth (HG), permanent teeth (PT), scrotal perimeter (SP), and live weight were used. The animals were classified into five categories depending on age and/or sex: newborns (NB), calves, weaned animals, cows, and bulls. The best models are GLMs with Gamma, Gamma, inverse Gaussian, Gaussian, and Gamma distributions for NB, calves, weaned animals, cows, and bulls, respectively. Predictive modeling for bulls was the best performing overall, with a correlation of 0.97 between the weight estimated by the model and the weight obtained with a weighing scale. For NB, calves, weaned animals, and cows, the correlation (r) was 0.85, 0.90, 0.95, and 0.87, respectively. The evaluated models are adequate to be used as a technical solution to estimate weight in a cattle production system.


Subject(s)
Birth Weight, Female, Animals, Cattle, Male, Farms, Weaning, Body Weight
12.
Curr Probl Diagn Radiol ; 53(2): 192-200, 2024.
Article in English | MEDLINE | ID: mdl-37951726

ABSTRACT

Magnetic Resonance Imaging (MRI) is an important diagnostic scanning tool for the detection and monitoring of specific diseases and conditions. However, the equipment cost, maintenance and specialty training of the technologists make the examination expensive. Consequently, unnecessary scanner time caused by poor scheduling, repeated sequences, aborted sequences, scanner idleness, or capture of non-diagnostic or low-value sequences is an opportunity to reduce costs and increase efficiency. This paper analyzes data collected from log files on 29 scanners over several years. 'Wasted' time is defined and key performance indicators (KPIs) are identified. A decrease in exam duration results when actively modifying and monitoring the number of sequences that comprise the exam card for a protocol.


Subject(s)
Efficiency, Magnetic Resonance Imaging, Humans, Workflow, Magnetic Resonance Imaging/methods
13.
Acta Trop ; 249: 107071, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37956820

ABSTRACT

Beak and feather disease virus (BFDV) is globally distributed in psittacine birds. BFDV is considered a key threat to biodiversity because it has the ability to transmit and shift between host species. Data from captive psittacine birds can help to identify potential risk factors for viral transmission management. Generalized Linear Models (GLM) were used to examine the association of sample type, species, and season with the prevalence of BFDV in captive exotic birds in Thailand. In this study, the overall prevalence of BFDV was 8.2%, with 346 of 4243 birds being positive. The prevalence in feather samples (12.1%) and pooled (dried blood and feather) samples (15.4%) was higher than that in the dried blood samples (4.8%). A GLM test revealed that the sample type, species, and season were significant factors influencing the prevalence of BFDV. Based on the model, two species (blue-eyed cockatoo; Cacatua ophthalmica, and ring-necked parakeet; Psittacula krameri) were associated with higher BFDV prevalence. By studying the seasonal BFDV prevalence, we can gather important insights into the environmental factors that contribute to its spread. The higher prevalence observed during the wet season suggests a possible association between BFDV prevalence and environmental factors such as heavy rainfall and humidity. In conclusion, our analysis of the trends in BFDV prevalence offers valuable insights into the prevalence or distribution of BFDV in the studied population. By monitoring BFDV prevalence, identifying high-risk species, and understanding seasonal patterns, we can develop targeted management approaches to control the spread of the virus. This information is crucial for mitigating the impact of BFDV on aviculture.
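
The overall prevalence can be reproduced from the counts given in the abstract. The Wilson interval below is an addition for illustration; the abstract does not report a confidence interval, and the function name is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

positives, tested = 346, 4243        # counts reported in the abstract
prevalence = positives / tested      # rounds to the reported 8.2%
low, high = wilson_interval(positives, tested)
```

A binomial GLM of the kind the study describes models this same proportion as a function of sample type, species, and season instead of reporting a single pooled value.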


Subject(s)
Bird Diseases, Circoviridae Infections, Circovirus, Parrots, Animals, Circovirus/genetics, Prevalence, Circoviridae Infections/epidemiology, Circoviridae Infections/veterinary, Bird Diseases/epidemiology, Viral DNA, Polymerase Chain Reaction/veterinary, Phylogeny
14.
Stat Med ; 43(3): 534-547, 2024 02 10.
Article in English | MEDLINE | ID: mdl-38096856

ABSTRACT

There are now many options for doubly robust estimation; however, there is a concerning trend in the applied literature to believe that the combination of a propensity score and an adjusted outcome model automatically results in a doubly robust estimator and/or to misuse more complex established doubly robust estimators. A simple alternative, canonical link generalized linear models (GLM) fit via inverse probability of treatment (propensity score) weighted maximum likelihood estimation followed by standardization (the $g$-formula) for the average causal effect, is a doubly robust estimation method. Our aim is for the reader not just to be able to use this method, which we refer to as IPTW GLM, for doubly robust estimation, but to fully understand why it has the doubly robust property. For this reason, we define clearly, and in multiple ways, all concepts needed to understand the method and why it is doubly robust. In addition, we want to make very clear that the mere combination of propensity score weighting and an adjusted outcome model does not generally result in a doubly robust estimator. Finally, we hope to dispel the misconception that one can adjust for residual confounding remaining after propensity score weighting by adjusting in the outcome model for what remains 'unbalanced' even when using doubly robust estimators. We provide R code for our simulations and real open-source data examples that can be followed step-by-step to use and hopefully understand the IPTW GLM method. We also compare to a much better-known but still simple doubly robust estimator.
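
The three ingredients described above (propensity weights, a weighted canonical-link GLM, and standardization) can be written out as a worked sketch. The notation is an assumption introduced here: $A$ is a binary treatment, $L$ the covariates, $Y$ the outcome, $g$ the canonical link, and $\hat\pi(L)$ an estimated propensity score; the paper's own notation may differ.

```latex
% Inverse probability of treatment weights from the estimated propensity score
W_i = \frac{A_i}{\hat\pi(L_i)} + \frac{1 - A_i}{1 - \hat\pi(L_i)},
\qquad \hat\pi(L) = \hat{P}(A = 1 \mid L)

% Outcome GLM with canonical link g, fit by weighted maximum likelihood
g\{E(Y \mid A, L)\} = \beta_0 + \beta_A A + \beta_L^{\top} L,
\qquad
\sum_{i=1}^{n} W_i
\begin{pmatrix} 1 \\ A_i \\ L_i \end{pmatrix}
\left\{ Y_i - g^{-1}\!\left(\hat\beta_0 + \hat\beta_A A_i + \hat\beta_L^{\top} L_i\right) \right\} = 0

% Standardization (g-formula) over the observed covariate distribution
\hat\psi = \frac{1}{n} \sum_{i=1}^{n}
\left[ g^{-1}\!\left(\hat\beta_0 + \hat\beta_A + \hat\beta_L^{\top} L_i\right)
     - g^{-1}\!\left(\hat\beta_0 + \hat\beta_L^{\top} L_i\right) \right]
```

Here $\hat\psi$ estimates the average causal effect as a difference in means. The canonical link matters because it makes the weighted score equations take the simple residual form shown in the second display.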


Subject(s)
Statistical Models, Humans, Computer Simulation, Statistical Data Interpretation, Probability, Propensity Score, Linear Models
15.
Rev. biol. trop ; 71(1)Dec. 2023.
Article in Spanish | LILACS-Express | LILACS | ID: biblio-1449523

ABSTRACT

Introduction: The coronavirus disease (COVID-19) has spread among the population of Costa Rica and has had a great global impact. However, there are important geographic differences in mortality from COVID-19 among world regions and within Costa Rica. Objective: To explore the effect of some sociodemographic factors on COVID-19 mortality in the small geographic divisions or cantons of Costa Rica. Methods: We used official records and applied a classical epidemiological Poisson regression model and a geographically weighted regression model. Results: We obtained a lower Akaike Information Criterion (AIC) with the weighted regression (927.1 in Poisson regression versus 358.4 in weighted regression). The cantons with a higher risk of mortality from COVID-19 had denser populations, higher material well-being, and fewer inhabitants per health service unit, and were located near the Pacific coast. Conclusions: A specific COVID-19 intervention strategy should concentrate on Pacific coast areas with denser populations, higher material well-being, and fewer inhabitants per health service unit.
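
The model comparison above rests on the Akaike Information Criterion, where a lower value indicates a better trade-off between fit and complexity. A minimal sketch with hypothetical counts and fitted means follows; note that for geographically weighted regression the parameter count is usually an effective number of parameters, which this toy example does not attempt to compute.

```python
import math

def poisson_log_likelihood(counts, means):
    """Log-likelihood of observed counts under Poisson means."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical death counts in three cantons.
deaths = [2, 5, 9]
fit_global = [16 / 3] * 3        # one shared mean for all cantons (1 parameter)
fit_local = [2.2, 4.9, 8.9]      # hypothetical fitted means from a 2-parameter model

aic_global = aic(poisson_log_likelihood(deaths, fit_global), n_params=1)
aic_local = aic(poisson_log_likelihood(deaths, fit_local), n_params=2)
```

In this toy case the second model earns a lower AIC despite its extra parameter, which is the same kind of comparison the abstract reports between the Poisson and the geographically weighted regressions.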

16.
Sensors (Basel) ; 23(24)2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38139509

ABSTRACT

The i-DREAMS project established a 'Safety Tolerance Zone (STZ)' to maintain operators within safe boundaries through real-time and post-trip interventions, based on the crucial role of the human element in driving behavior. This paper aims to model the inter-relationship among driving task complexity, operator and vehicle coping capacity, and crash risk. Towards that aim, data from 80 drivers, who participated in a naturalistic driving experiment carried out in three countries (i.e., Belgium, Germany, and Portugal), resulting in a dataset of approximately 19,000 trips, were collected and analyzed. The exploratory analysis included the development of Generalized Linear Models (GLMs) and the choice of the most appropriate variables associated with the latent variables "task complexity" and "coping capacity" that are to be estimated from the various indicators. In addition, Structural Equation Models (SEMs) were used to explore how the model variables were interrelated, allowing for both direct and indirect relationships to be modeled. Comparisons of the performance of such models, as well as a discussion on behaviors and driving patterns across different countries and transport modes, were also provided. The findings revealed a positive relationship between task complexity and coping capacity, indicating that as the difficulty of the driving task increased, the driver's coping capacity increased accordingly (i.e., a higher ability to manage and adapt to the challenges posed by more complex tasks). The integrated treatment of task complexity, coping capacity, and risk can improve the behavior and safety of all travelers, through the unobtrusive and seamless monitoring of behavior. Thus, authorities should utilize a data system oriented towards collecting key driving insights at the population level to plan mobility and safety interventions, develop incentives for road users, optimize enforcement, and enhance community building for safe traveling.


Subject(s)
Automobile Driving , Humans , Accidents, Traffic/prevention & control , Coping Skills , Travel , Linear Models
17.
BMC Med Res Methodol ; 23(1): 298, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38102539

ABSTRACT

BACKGROUND: The Maximum Likelihood Estimator (MLE) for parameters of the gamma distribution is commonly used to estimate models of right-skewed variables such as costs, hospital length of stay, and appointment wait times in Economics and Healthcare research. The common specification for this estimator assumes the variance is proportional to the square of the mean, which underlies estimation and specification tests. We present a specification in which the variance is directly proportional to the mean. METHODS: We used simulation experiments to investigate finite sample results, and we used United States Department of Veterans Affairs (VA) healthcare cost data as an empirical example comparing the fit and predictive ability of the models. RESULTS: Simulation showed the MLE based on a correctly specified alternative has less parameter bias, lower standard errors, and less skewness in distribution than a misspecified standard model. The application to VA healthcare cost data showed the alternative specification can have better R square, smaller root mean squared error, and smaller mean residuals within deciles of predicted values. CONCLUSIONS: The alternative gamma specification can be a useful alternative to the standard specification for estimating models of right-skewed continuous variables.
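The two variance assumptions contrasted in this abstract can be illustrated with a small sketch. The abstract does not give the authors' exact parameterization, so the shape/scale mappings below (and the names `alpha` and `phi`) are assumptions chosen only to reproduce the two stated mean-variance relationships: a constant shape gives a variance proportional to the square of the mean, while a constant scale gives a variance directly proportional to the mean.

```python
import numpy as np

def gamma_params_var_prop_mean_sq(mu, alpha):
    """Standard specification (assumed form): constant shape alpha,
    so Var = mu**2 / alpha. Returns (shape, scale)."""
    return alpha, mu / alpha

def gamma_params_var_prop_mean(mu, phi):
    """Alternative specification (assumed form): constant scale phi,
    so Var = phi * mu. Returns (shape, scale)."""
    return mu / phi, phi

def gamma_mean_var(shape, scale):
    """Mean and variance of a gamma distribution with the given shape and scale."""
    return shape * scale, shape * scale**2

# Hypothetical mean costs, used for illustration only.
mu = np.array([100.0, 1000.0, 10000.0])

std_mean, std_var = gamma_mean_var(*gamma_params_var_prop_mean_sq(mu, alpha=2.0))
alt_mean, alt_var = gamma_mean_var(*gamma_params_var_prop_mean(mu, phi=50.0))

print(std_var / mu**2)  # constant ratio: variance proportional to mean squared
print(alt_var / mu)     # constant ratio: variance proportional to mean
```

Both mappings keep the same mean, so the two specifications differ only in how dispersion grows with the mean.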


Subject(s)
Health Care Costs , Health Services Research , Humans , Computer Simulation
18.
J Appl Stat ; 50(16): 3199-3228, 2023.
Article in English | MEDLINE | ID: mdl-37969896

ABSTRACT

This article presents a novel stochastic removal mechanism under Type-II progressive random censoring in which removal probabilities are allowed to be dependent on the lifetime conditions through Generalized Linear Models (GLM). These conditions potentially include failure distances (the time required to observe the next failure) or other covariate information available in the experiment. The proposed GLM-based random removal mechanism includes a set of tuning parameters that are determined by the researcher according to the possible failure distance category. These parameters allow flexible determination of the removal probabilities leading to necessary experimental cost and time reductions. To establish the proposed mechanism, the Proportional Hazard Rate (PHR) family of distributions is considered. Also, the maximum likelihood estimators of parameters and their asymptotic variances are derived for the Weibull distributed lifetime data. A simple simulation algorithm for generating Type-II progressive censoring samples with GLM-based dependent removal probabilities is also presented. The expected experiment time required to complete the life test under this censoring scheme is also investigated using the Monte Carlo integration method. Several simulation studies are conducted to evaluate and compare the performance of the proposed mechanism. A sensitivity analysis is also considered to study the effect of misspecification of removal mechanism coefficients. Finally, two real data sets are analyzed for illustrative purposes.
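A rough sketch of how a Type-II progressively censored sample with distance-dependent removals might be simulated is given below. It is not the authors' algorithm: the logistic link, the coefficients `b0` and `b1`, the binomial draw for the number of removals, and the function name are all assumptions made for illustration. The only features taken from the abstract are that lifetimes are Weibull and that the removal probability depends on the failure distance (the time since the previous failure).

```python
import numpy as np

def simulate_progressive_censoring(n, m, shape, scale, b0, b1, rng):
    """Simulate m observed failures out of n units (hypothetical scheme).

    After each failure, surviving units are withdrawn at random; the removal
    probability follows an assumed logistic model in the failure distance.
    """
    alive = np.sort(scale * rng.weibull(shape, size=n))  # latent lifetimes
    failures, removals = [], []
    budget = n - m      # total number of units that must be removed
    prev = 0.0
    for i in range(m):
        x, alive = alive[0], alive[1:]   # next observed failure
        failures.append(x)
        if i == m - 1:
            r = budget                   # last failure: remove all survivors
        else:
            distance = x - prev
            p = 1.0 / (1.0 + np.exp(-(b0 + b1 * distance)))
            r = int(rng.binomial(budget, p))
        if r > 0:
            drop = rng.choice(alive.size, size=r, replace=False)
            alive = np.delete(alive, drop)
        removals.append(r)
        budget -= r
        prev = x
    return np.array(failures), np.array(removals)

rng = np.random.default_rng(42)
failures, removals = simulate_progressive_censoring(
    n=30, m=10, shape=1.5, scale=2.0, b0=-1.0, b1=0.5, rng=rng
)
print(failures)
print(removals)
```

Drawing each removal count from the remaining budget `n - m` guarantees that enough units stay on test for all `m` failures to be observed.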

19.
J Comput Graph Stat ; 32(3): 950-960, 2023.
Article in English | MEDLINE | ID: mdl-38013849

ABSTRACT

Elastic net penalization is widely used in high-dimensional prediction and variable selection settings. Auxiliary information on the variables, for example, groups of variables, is often available. Group-adaptive elastic net penalization exploits this information to potentially improve performance by estimating group penalties, thereby penalizing important groups of variables less than other groups. Estimating these group penalties is, however, hard due to the high dimension of the data. Existing methods are computationally expensive or not generic in the type of response. Here we present a fast method for estimation of group-adaptive elastic net penalties for generalized linear models. We first derive a low-dimensional representation of the Taylor approximation of the marginal likelihood for group-adaptive ridge penalties, to efficiently estimate these penalties. Then we show by using asymptotic normality of the linear predictors that this marginal likelihood approximates that of elastic net models. The ridge group penalties are then transformed to elastic net group penalties by matching the ridge prior variance to the elastic net prior variance as function of the group penalties. The method allows for overlapping groups and unpenalized variables, and is easily extended to other penalties. For a model-based simulation study and two cancer genomics applications we demonstrate a substantially decreased computation time and improved or matching performance compared to other methods. Supplementary materials for this article are available online.
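The idea of penalizing some groups of variables less than others can be sketched with a plain group-wise ridge fit for a linear response. This is a simplification, not the method of the paper: the group penalties are fixed by hand rather than estimated from a marginal likelihood, the penalty is ridge rather than elastic net, and the function name `group_ridge` and the simulated data are hypothetical.

```python
import numpy as np

def group_ridge(X, y, groups, group_penalties):
    """Ridge regression with one penalty per group of variables.

    groups[j] is the group index of column j; group_penalties[g] is the
    ridge penalty applied to every coefficient in group g.
    """
    lam = np.asarray(group_penalties, dtype=float)[np.asarray(groups)]
    # Closed-form solution of min ||y - X b||^2 + sum_j lam_j * b_j^2
    return np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
beta_true = np.array([2.0, -1.5, 1.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=100)

groups = np.array([0, 0, 0, 1, 1, 1])   # group 0 is informative, group 1 is noise
beta_hat = group_ridge(X, y, groups, group_penalties=[0.1, 1e8])
print(np.round(beta_hat, 3))
```

With a small penalty on the informative group and a very large one on the noise group, the first three coefficients stay close to their true values while the last three are shrunk essentially to zero.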

20.
J Appl Stat ; 50(13): 2701-2716, 2023.
Article in English | MEDLINE | ID: mdl-37720247

ABSTRACT

The American Community Survey (ACS) is an ongoing program conducted by the US Census Bureau that publishes estimates of important demographic statistics over pre-specified administrative areas. ACS provides spatially referenced count-valued outcomes that are paired with finite populations. For example, the number of people below the poverty line and the total population for each county are estimated by ACS. One common assumption is that the spatially referenced count-valued outcome given the finite population is binomially distributed. This conditionally specified (CS) model does not define the joint relationship between the count-valued outcome and the finite population. Thus, we consider a joint model for the count-valued outcome and the finite population. When cross-dependence in our joint model can be leveraged to improve spatial prediction, we say that the finite population is 'informative.' We model the count given the finite population as binomial and the finite population as negative binomial, and use multivariate logit-beta prior distributions. This leads to closed-form expressions of the full-conditional distributions for an efficient Gibbs sampler. We illustrate our model through simulations and our motivating application of ACS poverty estimates. These empirical analyses show the benefits of using our proposed model over the more traditional CS binomial model.
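The data-generating part of such a joint model can be sketched as a two-stage draw: a negative binomial population for each area, then a binomial count given that population. The parameter values below are hypothetical, and the sketch omits the spatial dependence, the multivariate logit-beta priors, and the Gibbs sampler described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(1)

n_areas = 5
r, q = 20, 0.02    # hypothetical negative binomial parameters for the population
p = 0.15           # hypothetical probability of being below the poverty line

# Stage 1: finite population of each area (negative binomial).
N = rng.negative_binomial(r, q, size=n_areas)

# Stage 2: count below the poverty line given the population (binomial).
y = rng.binomial(N, p)

print(N)
print(y)
```

In the conditionally specified model only the second stage is modeled, with `N` treated as fixed; the joint model also assigns a distribution to `N`.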
