Results 1 - 20 of 95
1.
Am J Epidemiol ; 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39214647

ABSTRACT

To optimize colorectal cancer (CRC) surveillance, accurate information on the risk of developing CRC from premalignant lesions is essential. However, directly observing this risk is challenging since precursor lesions, i.e., advanced adenomas (AAs), are removed upon detection. Statistical methods for multistate models can estimate risks, but estimation is challenging due to low CRC incidence. We propose an outcome-dependent sampling (ODS) design for this problem in which we oversample CRCs. More specifically, we propose a three-state model for jointly estimating the time distributions from baseline colonoscopy to AA and from AA onset to CRC, accounting for the ODS design using a weighted likelihood approach. We applied the methodology to a sample from a Norwegian adenoma cohort (1993-2007), comprising 1,495 individuals (median follow-up 6.8 years [IQR: 1.1 - 12.8 years]) of whom 648 did and 847 did not develop CRC. We observed a 5-year AA risk of 13% and 34% for individuals having non-advanced adenoma (NAA) and AA removed at baseline colonoscopy, respectively. Upon AA development, the subsequent 5-year risk of developing CRC was 17% and age-dependent. These estimates provide a basis for optimizing surveillance intensity and determining the optimal trade-off between CRC prevention, costs, and use of colonoscopy resources.
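
The weighted likelihood idea in this abstract can be illustrated with a much simpler model than the three-state one the authors describe. The sketch below is an illustration only, not the authors' method: it assumes a single constant hazard, right-censored data, and known sampling probabilities, and all names and numbers are hypothetical. Each sampled individual is weighted by the inverse of the probability of having been sampled, so that oversampled cases do not inflate the estimated rate.

```python
import math

def weighted_exponential_rate(times, events, weights):
    """Inverse-probability-weighted MLE of a constant hazard rate.

    times   -- follow-up time for each sampled individual
    events  -- 1 if the event was observed, 0 if right-censored
    weights -- 1 / (probability that the individual was sampled)
    """
    weighted_events = sum(w * d for w, d in zip(weights, events))
    weighted_time = sum(w * t for w, t in zip(weights, times))
    return weighted_events / weighted_time

def weighted_log_likelihood(rate, times, events, weights):
    """Weighted log-likelihood: sum_i w_i * (d_i * log(rate) - rate * t_i)."""
    return sum(w * (d * math.log(rate) - rate * t)
               for t, d, w in zip(times, events, weights))

# Hypothetical toy data: cases sampled with probability 1.0, non-cases with 0.1.
times = [2.0, 3.5, 1.0, 6.0, 8.0, 7.5]
events = [1, 1, 1, 0, 0, 0]
sampling_prob = [1.0 if d == 1 else 0.1 for d in events]
weights = [1.0 / p for p in sampling_prob]

rate_weighted = weighted_exponential_rate(times, events, weights)
rate_naive = weighted_exponential_rate(times, events, [1.0] * len(times))
```

In this toy setting the unweighted estimate is much larger than the weighted one, because cases are overrepresented in the sample relative to the cohort.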

2.
J Appl Stat ; 51(11): 2139-2156, 2024.
Article in English | MEDLINE | ID: mdl-39157272

ABSTRACT

The transformation model with partly interval-censored data offers a highly flexible modeling framework that can simultaneously support multiple common survival models and a wide variety of censored data types. However, the real data may contain unexplained heterogeneity that cannot be entirely explained by covariates and may be brought on by a variety of unmeasured regional characteristics. Due to this, we introduce the conditionally autoregressive prior into the transformation model with partly interval-censored data and take the spatial frailty into account. An efficient Markov chain Monte Carlo method is proposed to handle the posterior sampling and model inference. The approach is simple to use and does not include any challenging Metropolis steps owing to four-stage data augmentation. Through several simulations, the suggested method's empirical performance is assessed and then the method is used in a leukemia study.

3.
Commun Stat Theory Methods ; 53(17): 6038-6054, 2024.
Article in English | MEDLINE | ID: mdl-39100716

ABSTRACT

Phase IV clinical trials are designed to monitor long-term side effects of medical treatment. For instance, childhood cancer survivors treated with chest radiation and/or anthracycline are often at risk of developing cardiotoxicity during their adulthood. Often the primary focus of a study could be on estimating the cumulative incidence of a particular outcome of interest such as cardiotoxicity. However, it is challenging to evaluate patients continuously, and usually this information is collected through cross-sectional surveys by following patients longitudinally. This leads to interval-censored data since the exact time of the onset of the toxicity is unknown. Rai et al. computed the transition intensity rate using a parametric model and estimated parameters using a maximum likelihood approach in an illness-death model. However, such an approach may not be suitable if the underlying parametric assumptions do not hold. This manuscript proposes a semi-parametric model, with a logit relationship for the treatment intensities in two groups, to estimate the transition intensity rates within the context of an illness-death model. The estimation of the parameters is done using an EM algorithm with profile likelihood. Results from the simulation studies suggest that the proposed approach is easy to implement and yields comparable results to the parametric model.

4.
Stat Med ; 43(20): 3921-3942, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-38951867

ABSTRACT

For survival analysis applications we propose a novel procedure for identifying subgroups with large treatment effects, with focus on subgroups where treatment is potentially detrimental. The approach, termed forest search, is relatively simple and flexible. All possible subgroups are screened and selected based on hazard ratio thresholds indicative of harm with assessment according to the standard Cox model. By reversing the role of treatment one can seek to identify substantial benefit. We apply a splitting consistency criterion to identify a subgroup considered "maximally consistent with harm." The type-1 error and power for subgroup identification can be quickly approximated by numerical integration. To aid inference we describe a bootstrap bias-corrected Cox model estimator with variance estimated by a jackknife approximation. We provide a detailed evaluation of operating characteristics in simulations and compare to virtual twins and generalized random forests where we find the proposal to have favorable performance. In particular, in our simulation setting, we find the proposed approach favorably controls the type-1 error for falsely identifying heterogeneity with higher power and classification accuracy for substantial heterogeneous effects. Two real data applications are provided for publicly available datasets from clinical trials in oncology and HIV.


Subjects
Computer Simulation, HIV Infections, Proportional Hazards Models, Humans, Survival Analysis
5.
Comput Med Imaging Graph ; 115: 102395, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38729092

ABSTRACT

In this paper, we hypothesize that it is possible to localize image regions of preclinical tumors in a Chest X-ray (CXR) image by a weakly-supervised training of a survival prediction model using a dataset containing CXR images of healthy patients and their time-to-death label. These visual explanations can empower clinicians in early lung cancer detection and increase patient awareness of their susceptibility to the disease. To test this hypothesis, we train a censor-aware multi-class survival prediction deep learning classifier that is robust to imbalanced training, where classes represent a quantized number of days for time-to-death prediction. Such a multi-class model allows us to use post-hoc interpretability methods, such as Grad-CAM, to localize image regions of preclinical tumors. For the experiments, we propose a new benchmark based on the National Lung Cancer Screening Trial (NLST) dataset to test weakly-supervised preclinical tumor localization and survival prediction models, and results suggest that our proposed method shows state-of-the-art C-index survival prediction and weakly-supervised preclinical tumor localization results. To our knowledge, this constitutes a pioneering approach in the field that is able to produce visual explanations of preclinical events associated with survival prediction results.


Subjects
Early Detection of Cancer, Lung Neoplasms, Humans, Lung Neoplasms/diagnostic imaging, Lung Neoplasms/mortality, Early Detection of Cancer/methods, Thoracic Radiography, Deep Learning, Survival Analysis
6.
Stat Methods Med Res ; 33(4): 681-701, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38444377

ABSTRACT

Relative survival represents the preferred framework for the analysis of population cancer survival data. The aim is to model the survival probability associated with cancer in the absence of information about the cause of death. Recent data linkage developments have allowed for incorporating the place of residence into the population cancer databases; however, modeling this spatial information has received little attention in the relative survival setting. We propose a flexible parametric class of spatial excess hazard models (along with inference tools), named "Relative Survival Spatial General Hazard," that allows for the inclusion of fixed and spatial effects in both time-level and hazard-level components. We illustrate the performance of the proposed model using an extensive simulation study, and provide guidelines about the interplay of sample size, censoring, and model misspecification. We present a case study using real data from colon cancer patients in England. This case study illustrates how a spatial model can be used to identify geographical areas with low cancer survival, as well as how to summarize such a model through marginal survival quantities and spatial effects.


Subjects
Colonic Neoplasms, Humans, Proportional Hazards Models, Survival Analysis, Computer Simulation, Sample Size, Statistical Models
7.
Stat Methods Med Res ; 33(5): 794-806, 2024 May.
Article in English | MEDLINE | ID: mdl-38502008

ABSTRACT

Observational data (e.g. electronic health records) have become increasingly important in evidence-based research on dynamic treatment regimes, which tailor treatments over time to patients based on their characteristics and evolving clinical history. It is of great interest for clinicians and statisticians to identify an optimal dynamic treatment regime that can produce the best expected clinical outcome for each individual and thus maximize the treatment benefit over the population. Observational data impose various challenges for using statistical tools to estimate optimal dynamic treatment regimes. Notably, the task becomes more sophisticated when the clinical outcome of primary interest is time-to-event. Here, we propose a matching-based machine learning method to identify the optimal dynamic treatment regime with time-to-event outcomes subject to right-censoring using electronic health record data. In contrast to the established inverse probability weighting-based dynamic treatment regime methods, our proposed approach provides better protection against model misspecification and extreme weights in the context of treatment sequences, effectively addressing a prevalent challenge in the longitudinal analysis of electronic health record data. In simulations, the proposed method demonstrates robust performance across a range of scenarios. In addition, we illustrate the method with an application to estimate optimal dynamic treatment regimes for patients with advanced non-small cell lung cancer using a real-world, nationwide electronic health record database from Flatiron Health.


Subjects
Electronic Health Records, Machine Learning, Humans, Electronic Health Records/statistics & numerical data, Statistical Models, Lung Neoplasms/drug therapy, Treatment Outcome, Non-Small-Cell Lung Carcinoma/drug therapy
8.
Sci Total Environ ; 921: 171155, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38387591

ABSTRACT

The occurrence and distribution of 1,4-dioxane were investigated in 280 source and finished drinking water samples from 31 Chinese cities, based on which its ecological and health risks were systematically evaluated. The findings demonstrated that 1,4-dioxane was detected in about 80.0% of samples, with values ranging from n.d. to 7757 ng/L in source water and from n.d. to 2918 ng/L in drinking water. 1,4-Dioxane showed limited removal efficiency using conventional coagulation-sedimentation-filtration processes (14% ± 48%), and a removal efficiency of 35% ± 44% using ozonation-biological activated carbon advanced treatment processes. Relatively higher concentrations, detection frequency and environmental risk were observed in Taihu Lake, Yellow River, Yangtze River, Zhujiang River, and Huaihe River, mainly in the eastern and southern regions, where there are considerable industrial activities and comparatively high population densities. Its widespread presence as a by-product in the manufacture of consumer products, e.g., ethoxylated surfactants, suggested that municipal wastewater discharges were the dominant source for the ubiquitous occurrence of 1,4-dioxane, while industrial activities, e.g., resin manufacturing, also contributed considerably to the elevated concentrations of 1,4-dioxane. The estimated risk quotients were in the range of <1.5 × 10^-4 for ecological risk, and <5.0 × 10^-3 by oral exposure and <5.0 × 10^-2 by inhalation exposure for health risk, illustrating limited ecological harm to the water environment and limited chronic toxicity to human health. For carcinogenic risk, 1,4-dioxane presented a mean risk of 1.8 × 10^-6 by oral exposure, which slightly surpassed the acceptable level recommended by the U.S. EPA (<10^-6), while the risk from inhalation exposure was negligible. The pervasiveness in drinking water, low removal efficiencies during water treatment processes, and suspected health impacts highlight the necessity of setting water quality standards for 1,4-dioxane in order to improve the water environment in China.


Subjects
Dioxanes, Drinking Water, Chemical Water Pollutants, Humans, Chemical Water Pollutants/analysis, Water Quality, China, Rivers, Environmental Monitoring
9.
Biom J ; 66(2): e2200165, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38403463

ABSTRACT

Clinical trials involving novel immuno-oncology therapies frequently exhibit survival profiles which violate the proportional hazards assumption due to a delay in treatment effect, and, in such settings, the survival curves in the two treatment arms may have a crossing before the two curves eventually separate. To flexibly model such scenarios, we describe a nonparametric approach for estimating the treatment arm-specific survival functions which constrains these two survival functions to cross at most once without making any additional assumptions about how the survival curves are related. A main advantage of our approach is that it provides an estimate of a crossing time if such a crossing exists, and, moreover, our method generates interpretable measures of treatment benefit including crossing-conditional survival probabilities and crossing-conditional estimates of restricted residual mean life. Our estimates of these measures may be used together with efficacy measures from a primary analysis to provide further insight into differences in survival across treatment arms. We demonstrate the use and effectiveness of our approach with a large simulation study and an analysis of reconstructed outcomes from a recent combination therapy trial.


Subjects
Treatment Delay, Humans, Survival Analysis, Proportional Hazards Models, Computer Simulation
10.
Asian Pac J Cancer Prev ; 24(12): 4167-4177, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38156852

ABSTRACT

OBJECTIVE: Cure models are frequently used in survival analysis to account for a cured fraction in the data. When a cure rate is present, researchers often prefer cure models over parametric models to analyse the survival data. These models make it possible to define the probability distribution of survival durations for patients who are at risk. Various distributions can be considered for the survival times, such as the Exponentiated Weibull Exponential (EWE), Exponential Exponential (EE), Weibull and lognormal distributions. The objective of this research is to choose the most appropriate distribution that accurately represents the survival times of patients who have not been cured. This will be accomplished by comparing various non-mixture cure models that are based on the EWE distribution, its sub-distributions, and distributions distinct from those belonging to the EWE distribution family. MATERIAL AND METHODS: A sample of 85 patients diagnosed with superficial bladder tumours was selected for fitting the non-mixture cure model. To estimate the parameters of the suggested model, which takes into account the presence of a cure rate, censored data, and covariates, we used maximum likelihood estimation in R software version 3.5.7. RESULTS: In a comparison of various parametric models fitted to the data, both with and without considering the cure fraction and without incorporating any predictors, the EE distribution yields the lowest AIC, BIC, and HQIC values among all the distributions considered in this study (1191.921/1198.502, 1201.692/1203.387, 1195.851/1200.467). Furthermore, for a non-mixture cure model using the EE distribution along with covariates, the estimated ratio of the probabilities of being cured for the placebo and thiotepa groups (with its 95% confidence interval) was 0.76130 (0.13914, 6.81863). CONCLUSION: The findings of this study indicate that the EE distribution is the optimal selection for determining the duration of survival in individuals diagnosed with bladder cancer.


Subjects
Statistical Models, Urinary Bladder Neoplasms, Humans, Survival Analysis, Urinary Bladder Neoplasms/therapy
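
As a rough illustration of the quantities named in this abstract, the sketch below writes out the usual non-mixture cure survival function, S(t) = exp(-theta * F(t)), and the standard definitions of AIC, BIC, and HQIC. It is a minimal sketch under stated assumptions: a plain Weibull distribution stands in for F (the EWE and EE families are not implemented here), and `theta`, the shape and scale values, and the log-likelihood are hypothetical rather than taken from the study. Only the sample size of 85 comes from the abstract.

```python
import math

def non_mixture_survival(t, theta, shape, scale):
    """Population survival S(t) = exp(-theta * F(t)) with a Weibull F."""
    weibull_cdf = 1.0 - math.exp(-((t / scale) ** shape))
    return math.exp(-theta * weibull_cdf)

def cure_fraction(theta):
    """Limit of S(t) as t -> infinity, i.e. the cured proportion."""
    return math.exp(-theta)

def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, BIC, HQIC); lower values indicate a better trade-off."""
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * math.log(n_obs) - 2 * log_lik
    hqic = 2 * n_params * math.log(math.log(n_obs)) - 2 * log_lik
    return aic, bic, hqic

# Hypothetical parameter values, not taken from the study.
theta = 1.2
plateau = cure_fraction(theta)
late_survival = non_mixture_survival(1e6, theta, shape=1.5, scale=10.0)

# n_obs=85 matches the sample size in the abstract; log_lik is made up.
aic, bic, hqic = information_criteria(log_lik=-100.0, n_params=3, n_obs=85)
```

The survival curve levels off at `exp(-theta)` instead of falling to zero, which is what distinguishes a cure model from an ordinary parametric survival model.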
11.
Front Pharmacol ; 14: 1255021, 2023.
Article in English | MEDLINE | ID: mdl-37964874

ABSTRACT

Background: Although several strategies for modelling competing events in discrete event simulation (DES) exist, a methodological gap for the event-specific probabilities and distributions (ESPD) approach when dealing with censored data remains. This study defines and illustrates the ESPD strategy for censored data. Methods: The ESPD approach assumes that events are generated through a two-step process. First, the type of event is selected according to some (unknown) mixture proportions. Next, the times of occurrence of the events are sampled from a corresponding survival distribution. Both of these steps can be modelled based on covariates. Performance was evaluated through a simulation study, considering sample size and levels of censoring. Additionally, an oncology-related case study was conducted to assess the ability to produce realistic results, and to demonstrate its implementation using both frequentist and Bayesian frameworks in R. Results: The simulation study showed good performance of the ESPD approach, with accuracy decreasing as sample sizes decreased and censoring levels increased. The average relative absolute error of the event probability (95%-confidence interval) ranged from 0.04 (0.00; 0.10) to 0.23 (0.01; 0.66) for 60% censoring and sample size 50, showing that increased censoring and decreased sample size resulted in lower accuracy. The approach yielded realistic results in the case study. Discussion: The ESPD approach can be used to model competing events in DES based on censored data. Further research is warranted to compare the approach to other modelling approaches for DES, and to evaluate its usefulness in estimating cumulative event incidences in a broader context.
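
The two-step generation process described in the Methods can be sketched as follows. This shows only the simulation side (drawing an event type, then a time), not the estimation from censored data that the study addresses. The event names, mixture proportions, and distribution parameters are hypothetical, and the abstract reports an implementation in R, whereas Python is used here purely for illustration.

```python
import random

def simulate_event(rng, event_probs, time_samplers):
    """Draw one (event_type, time) pair using the two-step scheme.

    event_probs   -- dict mapping event type to its mixture proportion
    time_samplers -- dict mapping event type to a function rng -> time
    """
    types = list(event_probs)
    # Step 1: select the event type according to the mixture proportions.
    event_type = rng.choices(types, weights=[event_probs[k] for k in types])[0]
    # Step 2: sample the time from that event's own distribution.
    return event_type, time_samplers[event_type](rng)

# Hypothetical parameters for two competing events.
event_probs = {"progression": 0.7, "death": 0.3}
time_samplers = {
    "progression": lambda rng: rng.weibullvariate(12.0, 1.3),  # scale, shape
    "death": lambda rng: rng.expovariate(1 / 20.0),            # mean 20
}

rng = random.Random(42)
draws = [simulate_event(rng, event_probs, time_samplers) for _ in range(10_000)]
share_progression = sum(kind == "progression" for kind, _ in draws) / len(draws)
```

With covariates, both the mixture proportions and the distribution parameters would become functions of patient characteristics, as the abstract notes.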

12.
Stat Methods Med Res ; 32(11): 2083-2095, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37559549

ABSTRACT

Contemporary works in change-point survival models mainly focus on an unknown universal change-point shared by the whole study population. However, in some situations, the change-point is plausibly individual-specific, such as when it corresponds to the telomere length or menopausal age. Also, maximum-likelihood-based inference for the fixed change-point parameter is notoriously complicated. The asymptotic distribution of the maximum-likelihood estimator is non-standard, and computationally intensive bootstrap techniques are commonly used to retrieve its sampling distribution. This article is motivated by a breast cancer study, where the disease-free survival time of the patients is postulated to be regulated by the menopausal age, which is unobserved. As menopausal age varies across patients, a fixed change-point survival model may be inadequate. Therefore, we propose a novel proportional hazards model with a random change-point. We develop a nonparametric maximum-likelihood estimation approach and devise a stable expectation-maximization algorithm to compute the estimators. Because the model is regular, we employ conventional likelihood theory for inference based on the asymptotic normality of the Euclidean parameter estimators, and the variance of the asymptotic distribution can be consistently estimated by a profile-likelihood approach. A simulation study demonstrates the satisfactory finite-sample performance of the proposed methods, which yield small bias and proper coverage probabilities. The methods are applied to the motivating breast cancer study.


Subjects
Breast Neoplasms, Humans, Female, Likelihood Functions, Survival Analysis, Proportional Hazards Models, Computer Simulation
13.
Stat Methods Med Res ; 32(8): 1527-1542, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37338958

ABSTRACT

Censored data frequently appear in applications across a variety of areas, such as epidemiology and medical research. Traditionally, statistical inference on this data mechanism has been based on pre-assigned models, which suffer from the risk of model misspecification. This article proposes a two-fold shrinkage procedure for simultaneous structure identification and variable selection of the semiparametric accelerated failure additive model with right-censored data, in which the nonparametric functions are addressed by spline approximation. Under some regularity conditions, the consistency of model structure identification is theoretically established in the sense that the proposed method can automatically separate the linear and zero components from the nonlinear ones with probability approaching one. Detailed issues in computation and tuning parameter selection are also discussed. Finally, we illustrate the proposed method by some simulation studies and two real data applications to the primary biliary cirrhosis data and skin cutaneous melanoma data.


Subjects
Melanoma, Skin Neoplasms, Humans, Statistical Models, Computer Simulation, Probability
14.
Environ Sci Pollut Res Int ; 30(31): 77299-77317, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37253915

ABSTRACT

This study details the occurrence and concentrations of organic micropollutants (OMPs) in stormwater collected from a highway bridge catchment in Sweden. The prioritized OMPs were bisphenol-A (BPA), eight alkylphenols, sixteen polycyclic aromatic hydrocarbons (PAHs), and four fractions of petroleum hydrocarbons (PHCs), along with other global parameters, namely, total organic carbon (TOC), total suspended solids (TSS), turbidity, and conductivity (EC). A Monte Carlo (MC) simulation was applied to estimate the event mean concentrations (EMC) of OMPs based on intra-event subsamples during eight rain events, and to analyze the associated uncertainties. Assessing the occurrence of all OMPs in the catchment and comparing the EMC values with corresponding environmental quality standards (EQSs) revealed that BPA, octylphenol (OP), nonylphenol (NP), five carcinogenic and four non-carcinogenic PAHs, and C16-C40 fractions of PHCs can be problematic for freshwater. On the other hand, alkylphenol ethoxylates (OPnEO and NPnEO), six low-molecular-weight PAHs, and lighter fractions of PHCs (C10-C16) do not occur at levels that are expected to pose an environmental risk. Our data analysis revealed that turbidity has a strong correlation with PAHs, PHCs, and TSS, and that TOC and EC are highly associated with BPA concentrations. Furthermore, the EMC error analysis showed that high uncertainty in OMP data can influence the final interpretation of EMC values. As such, some of the challenges that were experienced in the presented research yielded suggestions for future monitoring programs to obtain more reliable data acquisition and analysis.


Subjects
Polycyclic Aromatic Hydrocarbons, Chemical Water Pollutants, Chemical Water Pollutants/analysis, Environmental Monitoring, Sweden, Rain, Polycyclic Aromatic Hydrocarbons/analysis
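
One common way to set up a Monte Carlo estimate of an event mean concentration is sketched below. It is not the procedure used in the study, whose details the abstract does not give. The sketch assumes the EMC is the volume-weighted mean of the subsample concentrations and that each measurement carries an independent Gaussian error with a fixed relative standard deviation. The data and the 20% error level are hypothetical.

```python
import random
import statistics

def event_mean_concentration(concs, volumes):
    """Flow-weighted mean: sum(C_i * V_i) / sum(V_i)."""
    return sum(c * v for c, v in zip(concs, volumes)) / sum(volumes)

def monte_carlo_emc(concs, volumes, rel_sd, n_draws, seed=0):
    """Propagate an assumed relative measurement error into the EMC."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Perturb each subsample; truncate at zero since concentrations
        # cannot be negative.
        noisy = [max(0.0, rng.gauss(c, rel_sd * c)) for c in concs]
        draws.append(event_mean_concentration(noisy, volumes))
    return draws

# Hypothetical subsamples from one rain event (concentration in ng/L,
# runoff volume in L).
concs = [120.0, 80.0, 40.0]
volumes = [1.0, 2.0, 1.0]

emc_point = event_mean_concentration(concs, volumes)
emc_draws = monte_carlo_emc(concs, volumes, rel_sd=0.2, n_draws=5000)
emc_mean = statistics.mean(emc_draws)
emc_sd = statistics.stdev(emc_draws)
```

The spread of `emc_draws` gives a sense of how measurement uncertainty in the subsamples carries through to the EMC, which is the kind of error analysis the abstract refers to.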
15.
Stat Med ; 42(12): 1981-1994, 2023 May 30.
Article in English | MEDLINE | ID: mdl-37002623

ABSTRACT

Immunotherapy cancer clinical trials routinely feature an initial period during which the treatment is given without evident therapeutic benefit, which may be followed by a period during which an effective therapy reduces the hazard for event occurrence. The nature of this treatment effect is incompatible with the proportional hazards assumption, which has prompted much work on the development of alternative effect measures or frameworks for testing. We consider tests based on individual and combined early- and late-emphasis infimum and supremum logrank statistics, describe how they can be implemented, and evaluate their performance in simulation studies. Through this work and illustrative applications we conclude that this class of test statistics offers a new and powerful framework for assessing treatment effects in cancer clinical trials involving immunotherapies.


Subjects
Neoplasms, Humans, Proportional Hazards Models, Computer Simulation, Neoplasms/drug therapy, Medical Oncology, Survival Analysis
16.
Lifetime Data Anal ; 29(1): 34-65, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36125666

ABSTRACT

An important complexity in censored data is that only partial information on the variables of interest is observed. In recent years, a large family of asymmetric distributions and maximum likelihood estimation for the parameters in that family has been studied, in the complete data case. In this paper, we exploit the appealing family of quantile-based asymmetric distributions to obtain flexible distributions for modelling right censored survival data. The flexible distributions can be generated using a variety of symmetric distributions and monotonic link functions. The interesting feature of this family is that the location parameter coincides with an index-parameter quantile of the distribution. This family is also suitable to characterize different shapes of the hazard function (constant, increasing, decreasing, bathtub and upside-down bathtub or unimodal shapes). Statistical inference is done for the whole family of distributions. The parameter estimation is carried out by optimizing a non-differentiable likelihood function. The asymptotic properties of the estimators are established. The finite-sample performance of the proposed method and the impact of censorship are investigated via simulations. Finally, the methodology is illustrated on two real data examples (times to weaning in breast-fed data and German Breast Cancer data).


Subjects
Statistical Models, Humans, Likelihood Functions
17.
Lifetime Data Anal ; 29(1): 188-212, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36208362

ABSTRACT

The proportional hazards (PH) model is, arguably, the most popular model for the analysis of lifetime data arising from epidemiological studies, among many others. In such applications, analysts may be faced with censored outcomes and/or studies which institute enrollment criteria leading to left truncation. Censored outcomes arise when the event of interest is not observed exactly but is only known relative to one or more observation times. Left truncated data occur in studies that exclude participants who have experienced the event prior to being enrolled in the study. If not accounted for, both of these features can lead to inaccurate inferences about the population under study. Thus, to overcome this challenge, herein we propose a novel unified PH model that can be used to accommodate both of these features. In particular, our approach can seamlessly analyze exactly observed failure times along with interval-censored observations, while aptly accounting for left truncation. To facilitate model fitting, an expectation-maximization algorithm is developed through the introduction of carefully structured latent random variables. To provide modeling flexibility, a monotone spline representation is used to approximate the cumulative baseline hazard function. The performance of our methodology is evaluated through a simulation study and is further illustrated through the analysis of two motivating data sets: one that involves child mortality in Nigeria and the other prostate cancer.


Subjects
Algorithms, Male, Child, Humans, Proportional Hazards Models, Computer Simulation
18.
Stat Med ; 42(3): 264-280, 2023 Feb 10.
Article in English | MEDLINE | ID: mdl-36437483

ABSTRACT

The mean residual life (MRL) function is an important and attractive alternative to the hazard function for characterizing the distribution of a time-to-event variable. In this article, we study the modeling and inference of a family of generalized MRL models for right-censored survival data with censoring indicators missing at random. To estimate the model parameters, augmented inverse probability weighted estimating equation approaches are developed, in which the non-missingness probability and the conditional probability of an uncensored observation are estimated by parametric methods or nonparametric kernel smoothing techniques. Asymptotic properties of the proposed estimators are established and finite sample performance is evaluated by extensive simulation studies. An application to brain cancer data is presented to illustrate the proposed methods.


Subjects
Brain Neoplasms, Humans, Computer Simulation, Probability, Statistical Models
19.
Entropy (Basel) ; 24(12), 2022 Dec 15.
Article in English | MEDLINE | ID: mdl-36554238

ABSTRACT

This study aims to propose modified semiparametric estimators based on six different penalty and shrinkage strategies for the estimation of a right-censored semiparametric regression model. In this context, the methods used to obtain the estimators are the ridge, lasso, adaptive lasso, SCAD, MCP, and elastic net penalty functions. The most important contribution that distinguishes this article from its peers is that it uses the local polynomial method as a smoothing method. The theoretical estimation procedures for the obtained estimators are explained. In addition, a simulation study is performed to see the behavior of the estimators and make a detailed comparison, and hepatocellular carcinoma data are analyzed as a real data example. As a result of the study, the estimators based on adaptive lasso and SCAD were more resistant to censorship and outperformed the other four estimators.

20.
J Comput Aided Mol Des ; 36(12): 837-849, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36305984

ABSTRACT

In an earlier study (Didziapetris R & Lanevskij K (2016). J Comput Aided Mol Des. 30:1175-1188) we collected a database of publicly available hERG inhibition data for almost 6700 drug-like molecules and built a probabilistic Gradient Boosting classifier with a minimal set of physicochemical descriptors (log P, pKa, molecular size and topology parameters). This approach favored interpretability over statistical performance but still achieved an overall classification accuracy of 75%. In the current follow-up work we expanded the database (provided in Supplementary Information) to almost 9400 molecules and performed temporal validation of the model on a set of novel chemicals from recently published lead optimization projects. Validation results showed almost no performance degradation compared to the original study. Additionally, we rebuilt the model using the AFT (Accelerated Failure Time) learning objective in XGBoost, which accepts both quantitative and censored data often reported in protein inhibition studies. The new model achieved a similar level of accuracy of discerning hERG blockers from non-blockers at a 10 µM threshold, which can be conceived as close to the performance ceiling for methods aiming to describe only non-specific ligand interactions with hERG. Yet, this model outputs quantitative potency values (IC50) and is not tied to a particular classification cut-off. pIC50 from patch-clamp measurements can be predicted with R² ≈ 0.4 and MAE < 0.5, which enables ligand ranking according to their expected potency levels. The employed approach can be valuable for quantitative modeling of various ADME and drug safety endpoints with a high prevalence of censored data.


Subjects
Ether-A-Go-Go Potassium Channels, Quantitative Structure-Activity Relationship, Ether-A-Go-Go Potassium Channels/chemistry, Ether-A-Go-Go Potassium Channels/metabolism, Potassium Channel Blockers/pharmacology, Potassium Channel Blockers/chemistry, Ligands, Factual Databases
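
To show why an AFT-style objective can accept both quantitative and censored labels, the sketch below writes out the per-observation negative log-likelihood for a normal error model on the pIC50 scale. It is a generic illustration, not XGBoost's implementation or the authors' configuration. The function names, the fixed `sigma=1.0`, and the example values are assumptions. An exact measurement contributes a density term, while a censored report such as "IC50 > 10 µM" (that is, pIC50 < 5) contributes the probability mass of the allowed interval.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def aft_neg_log_likelihood(pred, lower, upper, sigma=1.0):
    """Negative log-likelihood of one label under a normal-error model.

    pred  -- predicted value on the log scale (here, a pIC50)
    lower -- lower bound of the label (-inf if unbounded below)
    upper -- upper bound of the label (+inf if unbounded above)
    An exact measurement has lower == upper.
    """
    if lower == upper:
        z = (lower - pred) / sigma
        return -math.log(normal_pdf(z) / sigma)
    cdf_upper = 1.0 if upper == math.inf else normal_cdf((upper - pred) / sigma)
    cdf_lower = 0.0 if lower == -math.inf else normal_cdf((lower - pred) / sigma)
    return -math.log(cdf_upper - cdf_lower)

# Exact label: measured pIC50 of 5.0, predicted 5.0.
exact = aft_neg_log_likelihood(pred=5.0, lower=5.0, upper=5.0)
# Censored label "IC50 > 10 uM", i.e. pIC50 < 5.0.
censored_ok = aft_neg_log_likelihood(pred=4.0, lower=-math.inf, upper=5.0)
censored_bad = aft_neg_log_likelihood(pred=7.0, lower=-math.inf, upper=5.0)
```

A prediction that falls inside the censored range is penalized only lightly, while one far outside it is penalized heavily, so censored reports still inform the fit without being treated as exact values.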