1.
Am J Epidemiol ; 2024 May 06.
Article in English | MEDLINE | ID: mdl-38717330

ABSTRACT

Quantitative bias analysis (QBA) permits assessment of the expected impact of various imperfections of the available data on the results and conclusions of a particular real-world study. This article extends QBA methodology to multivariable time-to-event analyses with right-censored endpoints, possibly including time-varying exposures or covariates. The proposed approach employs data-driven simulations, which preserve important features of the data at hand while offering flexibility in controlling the parameters and assumptions that may affect the results. First, the steps required to perform data-driven simulations are described, and then two examples of real-world time-to-event analyses illustrate their implementation and the insights they may offer. The first example focuses on the omission of an important time-invariant predictor of the outcome in a prognostic study of cancer mortality, and permits separating the expected impact of confounding bias from non-collapsibility. The second example assesses how imprecise timing of an interval-censored event - ascertained only at sparse times of clinic visits - affects its estimated association with a time-varying drug exposure. The simulation results also provide a basis for comparing the performance of two alternative strategies for imputing the unknown event times in this setting. The R scripts that permit the reproduction of our examples are provided.

2.
Lancet Oncol ; 24(5): e197-e206, 2023 05.
Article in English | MEDLINE | ID: mdl-37142381

ABSTRACT

Patient-reported outcomes (PROs) are increasingly used in single-arm cancer studies. We reviewed 60 papers published between 2018 and 2021 of single-arm studies of cancer treatment with PRO data for current practice on design, analysis, reporting, and interpretation. We further examined the studies' handling of potential bias and how they informed decision making. Most studies (58; 97%) analysed PROs without stating a predefined research hypothesis. 13 (22%) of the 60 studies used a PRO as a primary or co-primary endpoint. Definitions of PRO objectives, study population, endpoints, and missing data strategies varied widely. 23 studies (38%) compared the PRO data with external information, most often by using a clinically important difference value; one study used a historical control group. The appropriateness of methods to handle missing data and intercurrent events (including death) was seldom discussed. Most studies (51; 85%) concluded that PRO results supported treatment. Conducting and reporting of PROs in cancer single-arm studies need standards and a critical discussion of statistical methods and possible biases. These findings will guide the Setting International Standards in Analysing Patient-Reported Outcomes and Quality of Life Data in Cancer Clinical Trials-Innovative Medicines Initiative (SISAQOL-IMI) in developing recommendations for the use of PRO-measures in single-arm studies.


Subjects
Neoplasms; Quality of Life; Humans; Patient Reported Outcome Measures; Neoplasms/therapy; Medical Oncology; Research Design
3.
Br J Cancer ; 128(3): 443-445, 2023 02.
Article in English | MEDLINE | ID: mdl-36476656

ABSTRACT

In 2005, several experts in tumor biomarker research published the REporting recommendations for Tumor MARKer prognostic studies (REMARK) criteria. Coupled with the subsequent Biospecimen Reporting for Improved Study Quality (BRISQ) criteria, these initiatives provide a framework for transparent reporting of the methods of study conduct and analyses.


Subjects
Biomarkers, Tumor; Research Design; Humans; Prognosis
4.
BMC Med ; 21(1): 182, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37189125

ABSTRACT

BACKGROUND: In high-dimensional data (HDD) settings, the number of variables associated with each observation is very large. Prominent examples of HDD in biomedical research include omics data with a large number of variables such as many measurements across the genome, proteome, or metabolome, as well as electronic health records data that have large numbers of variables recorded for each patient. The statistical analysis of such data requires knowledge and experience, sometimes of complex methods adapted to the respective research questions. METHODS: Advances in statistical methodology and machine learning methods offer new opportunities for innovative analyses of HDD, but at the same time require a deeper understanding of some fundamental statistical concepts. Topic group TG9 "High-dimensional data" of the STRATOS (STRengthening Analytical Thinking for Observational Studies) initiative provides guidance for the analysis of observational studies, addressing particular statistical challenges and opportunities for the analysis of studies involving HDD. In this overview, we discuss key aspects of HDD analysis to provide a gentle introduction for non-statisticians and for classically trained statisticians with little experience specific to HDD. RESULTS: The paper is organized with respect to subtopics that are most relevant for the analysis of HDD, in particular initial data analysis, exploratory data analysis, multiple testing, and prediction. For each subtopic, main analytical goals in HDD settings are outlined. For each of these goals, basic explanations for some commonly used analysis methods are provided. Situations are identified where traditional statistical methods cannot, or should not, be used in the HDD setting, or where adequate analytic tools are still lacking. Many key references are provided. 
CONCLUSIONS: This review aims to provide a solid statistical foundation for researchers, including statisticians and non-statisticians, who are new to research with HDD or simply want to better evaluate and understand the results of HDD analyses.


Subjects
Biomedical Research; Goals; Humans; Research Design
5.
BMC Med ; 20(1): 184, 2022 05 12.
Article in English | MEDLINE | ID: mdl-35546237

ABSTRACT

BACKGROUND: Factors contributing to the lack of understanding of research studies include poor reporting practices, such as selective reporting of statistically significant findings or insufficient methodological details. Systematic reviews have shown that prognostic factor studies continue to be poorly reported, even for important aspects, such as the effective sample size. The REMARK reporting guidelines support researchers in reporting key aspects of tumor marker prognostic studies. The REMARK profile was proposed to augment these guidelines to aid in structured reporting with an emphasis on including all aspects of analyses conducted. METHODS: A systematic search of prognostic factor studies was conducted, and fifteen studies published in 2015 were selected, three from each of five oncology journals. A paper was eligible for selection if it included survival outcomes and multivariable models were used in the statistical analyses. For each study, we summarized the key information in a REMARK profile consisting of details about the patient population with available variables and follow-up data, and a list of all analyses conducted. RESULTS: Structured profiles allow an easy assessment if reporting of a study only has weaknesses or if it is poor because many relevant details are missing. Studies had incomplete reporting of exclusion of patients, missing information about the number of events, or lacked details about statistical analyses, e.g., subgroup analyses in small populations without any information about the number of events. Profiles exhibit severe weaknesses in the reporting of more than 50% of the studies. The quality of analyses was not assessed, but some profiles exhibit several deficits at a glance. CONCLUSIONS: A substantial part of prognostic factor studies is poorly reported and analyzed, with severe consequences for related systematic reviews and meta-analyses. 
We consider inadequate reporting of single studies as one of the most important reasons that the clinical relevance of most markers is still unclear after years of research and dozens of publications. We conclude that structured reporting is an important step to improve the quality of prognostic marker research and discuss its role in the context of selective reporting, meta-analysis, study registration, predefined statistical analysis plans, and improvement of marker research.


Subjects
Biomarkers, Tumor; Research Design; Biomarkers, Tumor/analysis; Humans; Prognosis
6.
Brief Bioinform ; 21(6): 1904-1919, 2020 12 01.
Article in English | MEDLINE | ID: mdl-31750518

ABSTRACT

Data integration, i.e. the use of different sources of information for data analysis, is becoming one of the most important topics in modern statistics. Especially in, but not limited to, biomedical applications, a relevant issue is the combination of low-dimensional (e.g. clinical data) and high-dimensional (e.g. molecular data such as gene expressions) data sources in a prediction model. Not only the different characteristics of the data, but also the complex correlation structure within and between the two data sources, pose challenging issues. In this paper, we investigate these issues via simulations, providing some useful insight into strategies to combine low- and high-dimensional data in a regression prediction model. In particular, we focus on the effect of the correlation structure on the results, while accounting for the influence of our specific choices in the design of the simulation study.


Subjects
Computational Biology; Computer Simulation; Models, Statistical
7.
BMC Med Res Methodol ; 22(1): 98, 2022 04 06.
Article in English | MEDLINE | ID: mdl-35382744

ABSTRACT

BACKGROUND: In clinical trials, there is considerable interest in investigating whether a treatment effect is similar in all patients, or whether one or more prognostic variables indicate a differential response to treatment. To examine this, a continuous predictor is usually categorised into groups according to one or more cutpoints. Several weaknesses of categorization are well known. To avoid the disadvantages of cutpoints and to retain full information, it is preferable to keep continuous variables continuous in the analysis. To handle this issue, the Subpopulation Treatment Effect Pattern Plot (STEPP) was proposed about two decades ago, followed by the multivariable fractional polynomial interaction (MFPI) approach. Provided individual patient data (IPD) from several studies are available, it is possible to investigate treatment heterogeneity with meta-analysis techniques. Meta-STEPP was recently proposed, and in patients with primary breast cancer an interaction of estrogen receptors with chemotherapy was investigated in eight randomized controlled trials (RCTs). METHODS: We use data from eight randomized controlled trials in breast cancer to illustrate issues from two main tasks. The first task is to derive a treatment effect function (TEF), that is, a measure of the treatment effect on the continuous scale of the covariate in the individual studies. The second is to conduct a meta-analysis of the continuous TEFs from the eight studies by applying pointwise averaging to obtain a mean function. We denote the method metaTEF. To improve reporting of available data and all steps of the analysis we introduce a three-part profile called MethProf-MA. 
RESULTS: Although there are considerable differences between the studies (populations with large differences in prognosis, sample size, effective sample size, length of follow up, proportion of patients with very low estrogen receptor values) our results provide clear evidence of an interaction, irrespective of the choice of the FP function and random or fixed effect models. CONCLUSIONS: In contrast to cutpoint-based analyses, metaTEF retains the full information from continuous covariates and avoids several critical issues when performing IPD meta-analyses of continuous effect modifiers in randomised trials. Early experience suggests it is a promising approach. TRIAL REGISTRATION: Not applicable.
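
The pointwise averaging step described in the METHODS can be sketched as follows. This is only an illustration, not the authors' metaTEF implementation: it assumes each study's TEF has already been evaluated on a common grid of covariate values, and it uses fixed-effect (inverse-variance) weights as one possible weighting. The function name `pointwise_mean` and all numbers are hypothetical.

```python
from math import sqrt

def pointwise_mean(tefs, variances):
    """Inverse-variance weighted mean of study-specific curves, point by point.

    tefs[s][k]      -- estimated treatment effect of study s at grid point k
    variances[s][k] -- variance of that estimate
    Returns the mean curve and its standard error on the same grid.
    """
    n_points = len(tefs[0])
    mean, se = [], []
    for k in range(n_points):
        weights = [1.0 / v[k] for v in variances]  # assumed fixed-effect weights
        total = sum(weights)
        mean.append(sum(w * t[k] for w, t in zip(weights, tefs)) / total)
        se.append(sqrt(1.0 / total))
    return mean, se

# Hypothetical log hazard ratios from two studies on a three-point grid.
tefs = [[-0.40, -0.20, 0.00], [-0.60, -0.30, -0.10]]
variances = [[0.04, 0.02, 0.04], [0.04, 0.06, 0.04]]
mean_curve, se_curve = pointwise_mean(tefs, variances)
```

A random-effects variant would add a between-study variance component to each weight's denominator; the abstract reports that its conclusions held under both choices.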


Subjects
Algorithms; Breast Neoplasms; Breast Neoplasms/drug therapy; Female; Humans; Sample Size
8.
BMC Med Res Methodol ; 21(1): 63, 2021 04 02.
Article in English | MEDLINE | ID: mdl-33810787

ABSTRACT

BACKGROUND: No standards exist for the handling and reporting of data quality in health research. This work introduces a data quality framework for observational health research data collections with supporting software implementations to facilitate harmonized data quality assessments. METHODS: Developments were guided by the evaluation of an existing data quality framework and literature reviews. Functions for the computation of data quality indicators were written in R. The concept and implementations are illustrated based on data from the population-based Study of Health in Pomerania (SHIP). RESULTS: The data quality framework comprises 34 data quality indicators. These target four aspects of data quality: compliance with pre-specified structural and technical requirements (integrity); presence of data values (completeness); inadmissible or uncertain data values and contradictions (consistency); unexpected distributions and associations (accuracy). R functions calculate data quality metrics based on the provided study data and metadata and R Markdown reports are generated. Guidance on the concept and tools is available through a dedicated website. CONCLUSIONS: The presented data quality framework is the first of its kind for observational health research data collections that links a formal concept to implementations in R. The framework and tools facilitate harmonized data quality assessments in pursuit of transparent and reproducible research. Application scenarios comprise data quality monitoring while a study is carried out as well as performing an initial data analysis before starting substantive scientific analyses, but the developments are also of relevance beyond research.


Subjects
Data Accuracy; Software; Humans
9.
Biom J ; 63(2): 226-246, 2021 02.
Article in English | MEDLINE | ID: mdl-32639065

ABSTRACT

Doug Altman was a visionary leader and one of the most influential medical statisticians of the last 40 years. Based on a presentation in the "Invited session in memory of Doug Altman" at the 40th Annual Conference of the International Society for Clinical Biostatistics (ISCB) in Leuven, Belgium and our long-standing collaborations with Doug, we discuss his contributions to regression modeling, reporting, prognosis research, as well as some more general issues while acknowledging that we cannot cover the whole spectrum of Doug's considerable methodological output. His statement "To maximize the benefit to society, you need to not just do research but do it well" should be a driver for all researchers. To improve current and future research, we aim to summarize Doug's messages for these three topics.


Subjects
Biomedical Research; Belgium; Biostatistics
10.
CMAJ ; 192(32): E901-E906, 2020 Aug 10.
Article in English | MEDLINE | ID: mdl-32778601

ABSTRACT

BACKGROUND: Most randomized controlled trials (RCTs) and meta-analyses of RCTs examine effect modification (also called a subgroup effect or interaction), in which the effect of an intervention varies by another variable (e.g., age or disease severity). Assessing the credibility of an apparent effect modification presents challenges; therefore, we developed the Instrument for assessing the Credibility of Effect Modification Analyses (ICEMAN). METHODS: To develop ICEMAN, we established a detailed concept; identified candidate credibility considerations in a systematic survey of the literature; together with experts, performed a consensus study to identify key considerations and develop them into instrument items; and refined the instrument based on feedback from trial investigators, systematic review authors and journal editors, who applied drafts of ICEMAN to published claims of effect modification. RESULTS: The final instrument consists of a set of preliminary considerations, core questions (5 for RCTs, 8 for meta-analyses) with 4 response options, 1 optional item for additional considerations and a rating of credibility on a visual analogue scale ranging from very low to high. An accompanying manual provides rationales, detailed instructions and examples from the literature. Seventeen potential users tested ICEMAN; their suggestions improved the user-friendliness of the instrument. INTERPRETATION: The Instrument for assessing the Credibility of Effect Modification Analyses offers explicit guidance for investigators, systematic reviewers, journal editors and others considering making a claim of effect modification or interpreting a claim made by others.


Subjects
Meta-Analysis as Topic; Randomized Controlled Trials as Topic; Research Design/standards; Consensus; Humans
11.
Stat Med ; 38(3): 326-338, 2019 02 10.
Article in English | MEDLINE | ID: mdl-30284314

ABSTRACT

Non-linear exposure-outcome relationships such as between body mass index (BMI) and mortality are common. They are best explored as continuous functions using individual participant data from multiple studies. We explore two two-stage methods for meta-analysis of such relationships, where the confounder-adjusted relationship is first estimated in a non-linear regression model in each study, then combined across studies. The "metacurve" approach combines the estimated curves using multiple meta-analyses of the relative effect between a given exposure level and a reference level. The "mvmeta" approach combines the estimated model parameters in a single multivariate meta-analysis. Both methods allow the exposure-outcome relationship to differ across studies. Using theoretical arguments, we show that the methods differ most when covariate distributions differ across studies; using simulated data, we show that mvmeta gains precision but metacurve is more robust to model mis-specification. We then compare the two methods using data from the Emerging Risk Factors Collaboration on BMI, coronary heart disease events, and all-cause mortality (>80 cohorts, >18 000 events). For each outcome, we model BMI using fractional polynomials of degree 2 in each study, with adjustment for confounders. For metacurve, the powers defining the fractional polynomials may be study-specific or common across studies. For coronary heart disease, metacurve with common powers and mvmeta correctly identify a small increase in risk in the lowest levels of BMI, but metacurve with study-specific powers does not. For all-cause mortality, all methods identify a steep U-shape. The metacurve and mvmeta methods perform well in combining complex exposure-disease relationships across studies.
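
For readers unfamiliar with fractional polynomials of degree 2 (FP2), the sketch below evaluates an FP2 curve and the relative effect between an exposure level and a reference level, which is the quantity the "metacurve" approach combines across studies. It follows the usual FP convention (powers drawn from a fixed set, power 0 meaning log x, and a repeated power adding a log x multiplier). The function names, the BMI scaling, and the coefficients are hypothetical and are not taken from the paper.

```python
from math import log

# Conventional FP power set; power 0 denotes log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_term(x, p):
    """Single FP transformation of a positive value x."""
    return log(x) if p == 0 else x ** p

def fp2(x, p1, p2, b0, b1, b2):
    """FP2 function b0 + b1*x^p1 + b2*x^p2, using x^p1*log(x) when p1 == p2."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    t1 = fp_term(x, p1)
    t2 = t1 * log(x) if p1 == p2 else fp_term(x, p2)
    return b0 + b1 * t1 + b2 * t2

def relative_effect(x, x_ref, p1, p2, b1, b2):
    """Difference in the FP2 linear predictor between x and a reference level."""
    return fp2(x, p1, p2, 0.0, b1, b2) - fp2(x_ref, p1, p2, 0.0, b1, b2)

# Hypothetical coefficients; BMI is divided by 10 purely for numerical scaling.
curve = [relative_effect(bmi / 10, 2.5, -2, 1, 20.0, 2.56) for bmi in (18, 25, 32)]
```

With these made-up coefficients the curve is zero at the reference BMI of 25 and positive on both sides, i.e. U-shaped.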


Subjects
Meta-Analysis as Topic; Nonlinear Dynamics; Body Mass Index; Coronary Disease/etiology; Coronary Disease/mortality; Female; Humans; Male; Middle Aged; Models, Statistical; Mortality; Risk Factors
12.
BMC Med Res Methodol ; 19(1): 46, 2019 03 06.
Article in English | MEDLINE | ID: mdl-30841848

ABSTRACT

BACKGROUND: With progress on both the theoretical and the computational fronts, the use of spline modelling has become an established tool in statistical regression analysis. An important issue in spline modelling is the availability of user-friendly, well-documented software packages. Following the idea of the STRengthening Analytical Thinking for Observational Studies initiative to provide users with guidance documents on the application of statistical methods in observational research, the aim of this article is to provide an overview of the most widely used spline-based techniques and their implementation in R. METHODS: In this work, we focus on the R Language for Statistical Computing, which has become hugely popular statistical software. We identified a set of packages that include functions for spline modelling within a regression framework. Using simulated and real data we provide an introduction to spline modelling and an overview of the most popular spline functions. RESULTS: We present a series of simple scenarios of univariate data, where different basis functions are used to identify the correct functional form of an independent variable. Even in simple data, using routines from different packages would lead to different results. CONCLUSIONS: This work illustrates challenges that an analyst faces when working with data. Most differences can be attributed to the choice of hyper-parameters rather than the basis used. In fact, an experienced user will know how to obtain a reasonable outcome, regardless of the type of spline used. However, many analysts do not have sufficient knowledge to use these powerful tools adequately and will need more guidance.
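
To make the role of hyper-parameters concrete, here is a minimal sketch of one classical spline basis, the truncated power basis. The paper itself works with R packages; this Python function is a hypothetical illustration only, and the knot positions are arbitrary.

```python
def truncated_power_basis(x, knots, degree=3):
    """Basis row for one observation: x, x^2, ..., x^degree, then (x - k)_+^degree per knot.

    The intercept column is omitted; a regression routine would normally add it.
    """
    row = [x ** d for d in range(1, degree + 1)]
    row += [max(x - k, 0.0) ** degree for k in knots]
    return row

# The same value of x yields different design rows depending on the
# chosen degree and knots, i.e. on the hyper-parameters.
row_linear = truncated_power_basis(2.0, knots=[1.0, 3.0], degree=1)
row_cubic = truncated_power_basis(2.0, knots=[1.0, 3.0], degree=3)
```

Changing the number or placement of knots changes the number of columns and therefore the flexibility of the fitted curve, which is one reason routines with different defaults can give different results on the same data.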


Subjects
Algorithms; Biostatistics/methods; Models, Theoretical; Programming Languages; Research Design; Data Interpretation, Statistical; Humans; Mathematical Computing
13.
BMC Med Res Methodol ; 19(1): 162, 2019 07 24.
Article in English | MEDLINE | ID: mdl-31340753

ABSTRACT

BACKGROUND: Omics data can be very informative in survival analysis and may improve the prognostic ability of classical models based on clinical risk factors for various diseases, for example breast cancer. Recent research has focused on integrating omics and clinical data, yet has often ignored the need for appropriate model building for clinical variables. Medical literature on classical prognostic scores, as well as biostatistical literature on appropriate model selection strategies for low dimensional (clinical) data, are often ignored in the context of omics research. The goal of this paper is to fill this methodological gap by investigating the added predictive value of gene expression data for models using varying amounts of clinical information. METHODS: We analyze two data sets from the field of survival prognosis of breast cancer patients. First, we construct several proportional hazards prediction models using varying amounts of clinical information based on established medical knowledge. These models are then used as a starting point (i.e. included as a clinical offset) for identifying informative gene expression variables using resampling procedures and penalized regression approaches (model based boosting and the LASSO). In order to assess the added predictive value of the gene signatures, measures of prediction accuracy and separation are examined on a validation data set for the clinical models and the models that combine the two sources of information. RESULTS: For one data set, we do not find any substantial added predictive value of the omics data when compared to clinical models. On the second data set, we identify a noticeable added predictive value, however only for scenarios where little or no clinical information is included in the modeling process. We find that including more clinical information can lead to a smaller number of selected omics predictors. 
CONCLUSIONS: New research using omics data should include all available established medical knowledge in order to allow an adequate evaluation of the added predictive value of omics data. Including all relevant clinical information in the analysis might also lead to more parsimonious models. The developed procedure to assess the predictive value of the omics data can be readily applied to other scenarios.


Subjects
Breast Neoplasms/genetics; Breast Neoplasms/mortality; Genomics/statistics & numerical data; Models, Statistical; Survival Analysis; Datasets as Topic; Female; Gene Expression; Humans; Risk Factors
15.
Br J Cancer ; 119(10): 1288-1296, 2018 11.
Article in English | MEDLINE | ID: mdl-30353050

ABSTRACT

BACKGROUND: Cancer prognostic biomarkers have shown disappointing clinical applicability. The objective of this study was to classify and estimate how study results are overinterpreted and misreported in prognostic factor studies in oncology. METHODS: This systematic review focused on 17 oncology journals with an impact factor above 7. PubMed was searched for primary clinical studies published in 2015, evaluating prognostic factors. We developed a classification system, focusing on three domains: misleading reporting (selective, incomplete reporting, misreporting), misleading interpretation (unreliable statistical analysis, spin) and misleading extrapolation of the results (claiming irrelevant clinical applicability, ignoring uncertainty). RESULTS: Our search identified 10,844 articles. The 98 studies included investigated a median of two prognostic factors (Q1-Q3, 1-7). The prognostic factors' effects were selectively and incompletely reported in 35/98 and 24/98 full texts, respectively. Twenty-nine articles used linguistic spin in the form of strong statements. Linguistic spin rejecting non-significant results was found in 34 full-text results and 15 abstract results sections. One in five articles had discussion and/or abstract conclusions that were inconsistent with the study findings. Sixteen reports had discrepancies between their full-text and abstract conclusions. CONCLUSIONS: Our study provides evidence of frequent overinterpretation of findings of prognostic factor assessment in high-impact medical oncology journals.


Subjects
Biomarkers, Tumor/metabolism; Medical Oncology; Neoplasms/metabolism; Humans; Neoplasms/pathology; Prognosis
16.
Am J Epidemiol ; 185(8): 650-660, 2017 04 15.
Article in English | MEDLINE | ID: mdl-28369154

ABSTRACT

In most epidemiologic studies and in clinical research generally, there are variables with a spike at zero, namely variables for which a proportion of individuals have zero exposure (e.g., never smokers) and among those exposed the variable has a continuous distribution. Different options exist for modeling such variables, such as categorization where the nonexposed form the reference group, or ignoring the spike by including the variable in the regression model with or without some transformation or modeling procedures. It has been shown that such situations can be analyzed by adding a binary indicator (exposed/nonexposed) to the regression model, and a method based on fractional polynomials with which to estimate a suitable functional form for the positive portion of the spike-at-zero variable distribution has been developed. In this paper, we compare different approaches using data from 3 case-control studies carried out in Germany: the Mammary Carcinoma Risk Factor Investigation (MARIE), a breast cancer study conducted in 2002-2005 (Flesch-Janys et al., Int J Cancer. 2008;123(4):933-941); the Rhein-Neckar Larynx Study, a study of laryngeal cancer conducted in 1998-2000 (Dietz et al., Int J Cancer. 2004;108(6):907-911); and a lung cancer study conducted in 1988-1993 (Jöckel et al., Int J Epidemiol. 1998;27(4):549-560). Strengths and limitations of different procedures are demonstrated, and some recommendations for practical use are given.
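
The indicator-plus-function idea can be sketched as follows: a variable with a spike at zero enters the model through a binary exposed/nonexposed indicator and a transformation applied only to the positive values. The function names, the choice of a single FP-style power, and the coefficients are hypothetical; the paper estimates the functional form from the data rather than fixing it in advance.

```python
from math import log

def spike_at_zero_terms(x, power=0):
    """Return (exposed indicator, transformed positive part) for one value.

    power follows the FP convention: 0 means log(x), otherwise x ** power.
    Nonexposed individuals (x == 0) get 0 for both terms.
    """
    if x < 0:
        raise ValueError("exposure must be non-negative")
    if x == 0:
        return (0, 0.0)
    return (1, log(x) if power == 0 else x ** power)

def linear_predictor(x, b0, b_indicator, b_fp, power=0):
    """b0 + b_indicator * 1[x > 0] + b_fp * f(x), with f applied to x > 0 only."""
    exposed, fx = spike_at_zero_terms(x, power)
    return b0 + b_indicator * exposed + b_fp * fx

# Hypothetical coefficients for a never smoker and a smoker of 4 units.
never_smoker = linear_predictor(0.0, -1.0, 0.5, 0.25, power=0.5)
smoker = linear_predictor(4.0, -1.0, 0.5, 0.25, power=0.5)
```

The indicator lets the nonexposed group differ from what the continuous curve would predict at very low exposure, so the curve for the exposed is not forced through the nonexposed group's level.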


Subjects
Data Interpretation, Statistical; Models, Statistical; Aged; Asbestos/toxicity; Breast Neoplasms/etiology; Case-Control Studies; Construction Materials/adverse effects; Dose-Response Relationship, Drug; Dust; Estrogen Replacement Therapy/adverse effects; Female; Humans; Laryngeal Neoplasms/chemically induced; Lung Neoplasms/chemically induced; Male; Middle Aged; Occupational Exposure/adverse effects; Regression Analysis; Risk Factors
17.
Biometrics ; 72(1): 272-80, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26288150

ABSTRACT

In recent years, increasing attention has been devoted to the problem of the stability of multivariable regression models, understood as the resistance of the model to small changes in the data on which it has been fitted. Resampling techniques, mainly based on the bootstrap, have been developed to address this issue. In particular, the approaches based on the idea of "inclusion frequency" consider the repeated implementation of a variable selection procedure, for example backward elimination, on several bootstrap samples. The analysis of the variables selected in each iteration provides useful information on the model stability and on the variables' importance. Recent findings, nevertheless, show possible pitfalls in the use of the bootstrap, and alternatives such as subsampling have begun to be taken into consideration in the literature. Using model selection frequencies and variable inclusion frequencies, we empirically compare these two different resampling techniques, investigating the effect of their use in selected classical model selection procedures for multivariable regression. We conduct our investigations by analyzing two real data examples and by performing a simulation study. Our results reveal some advantages in using a subsampling technique rather than the bootstrap in this context.
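
The comparison of inclusion frequencies under the two resampling schemes can be sketched as below. Two simplifications are assumptions of this sketch, not the paper's procedure: the selection rule is a simple correlation threshold standing in for backward elimination, and the subsample fraction of 0.632 is an arbitrary illustrative choice. All names and the simulated data are hypothetical.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5 if saa > 0 and sbb > 0 else 0.0

def select(rows, threshold=0.3):
    """Stand-in selection rule: keep predictors with |correlation with y| > threshold."""
    y = [r[-1] for r in rows]
    n_pred = len(rows[0]) - 1
    return {j for j in range(n_pred)
            if abs(pearson([r[j] for r in rows], y)) > threshold}

def inclusion_frequencies(rows, n_rep, rng, scheme, fraction=0.632):
    """Share of resamples in which each predictor is selected."""
    n, n_pred = len(rows), len(rows[0]) - 1
    counts = [0] * n_pred
    for _ in range(n_rep):
        if scheme == "bootstrap":      # n draws with replacement
            sample = [rows[rng.randrange(n)] for _ in range(n)]
        elif scheme == "subsample":    # fraction * n draws without replacement
            sample = rng.sample(rows, int(fraction * n))
        else:
            raise ValueError("scheme must be 'bootstrap' or 'subsample'")
        for j in select(sample):
            counts[j] += 1
    return [c / n_rep for c in counts]

# Simulated data: x0 drives the outcome, x1 is pure noise.
rng = random.Random(1)
rows = []
for _ in range(200):
    x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append([x0, x1, 2.0 * x0 + rng.gauss(0, 1)])

boot = inclusion_frequencies(rows, 200, rng, "bootstrap")
sub = inclusion_frequencies(rows, 200, rng, "subsample")
```

Comparing `boot` and `sub` predictor by predictor shows how often each scheme retains the true predictor and how often it lets the noise variable in.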


Subjects
Algorithms; Models, Statistical; Multivariate Analysis; Regression Analysis; Sample Size; Computer Simulation; Data Interpretation, Statistical; Reproducibility of Results; Sensitivity and Specificity
18.
Stata J ; 16(1): 72-87, 2016 Jan.
Article in English | MEDLINE | ID: mdl-29398977

ABSTRACT

In a recent article, Royston (2015, Stata Journal 15: 275-291) introduced the approximate cumulative distribution (acd) transformation of a continuous covariate x as a route toward modeling a sigmoid relationship between x and an outcome variable. In this article, we extend the approach to multivariable modeling by modifying the standard Stata program mfp. The result is a new program, mfpa, that has all the features of mfp plus the ability to fit a new model for user-selected covariates that we call fp1(p1, p2). The fp1(p1, p2) model comprises the best-fitting combination of a dimension-one fractional polynomial (fp1) function of x and an fp1 function of acd (x). We describe a new model-selection algorithm called function-selection procedure with acd transformation, which uses significance testing to attempt to simplify an fp1(p1, p2) model to a submodel, an fp1 or linear model in x or in acd (x). The function-selection procedure with acd transformation is related in concept to the fsp (fp function-selection procedure), which is an integral part of mfp and which is used to simplify a dimension-two (fp2) function. We describe the mfpa command and give univariable and multivariable examples with real data to demonstrate its use.

19.
Biom J ; 58(4): 783-96, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27072783

ABSTRACT

In epidemiology and clinical research, predictors often take the value zero for a large number of observations while the distribution of the remaining observations is continuous. These predictors are called variables with a spike at zero. Examples include smoking or alcohol consumption. Recently, an extension of the fractional polynomial (FP) procedure, a technique for modeling nonlinear relationships, was proposed to deal with such situations. To indicate whether or not a value is zero, a binary variable is added to the model. In a two-stage procedure, called FP-spike, the necessity of the binary variable and/or the continuous FP function for the positive part is assessed for a suitable fit. In univariate analyses, the FP-spike procedure usually leads to functional relationships that are easy to interpret. This paper introduces four approaches for dealing with two variables with a spike at zero (SAZ). The methods depend on the bivariate distribution of zero and nonzero values. Bi-Sep is the simplest of the four bivariate approaches. It uses the univariate FP-spike procedure separately for the two SAZ variables. In Bi-D3, Bi-D1, and Bi-Sub, proportions of zeros in both variables are considered simultaneously in the binary indicators. Therefore, these strategies can account for correlated variables. The methods can be used for arbitrary distributions of the covariates. For illustration and comparison of results, data from a case-control study on laryngeal cancer, with smoking and alcohol intake as two SAZ variables, are considered. In addition, a possible extension to three or more SAZ variables is outlined. Combining log-linear models for the analysis of the correlation with the bivariate approaches is proposed.


Subjects
Data Interpretation, Statistical; Models, Statistical; Alcohol Drinking; Algorithms; Case-Control Studies; Humans; Nonlinear Dynamics; Statistics as Topic