Results 1 - 20 of 152
1.
J Hepatol ; 80(2): 268-281, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37939855

ABSTRACT

BACKGROUND & AIMS: Cholemic nephropathy (CN) is a severe complication of cholestatic liver diseases for which there is no specific treatment. We revisited its pathophysiology with the aim of identifying novel therapeutic strategies. METHODS: Cholestasis was induced by bile duct ligation (BDL) in mice. Bile flux in kidneys and livers was visualized by intravital imaging, supported by MALDI mass spectrometry imaging and liquid chromatography-tandem mass spectrometry. The effect of AS0369, a systemically bioavailable apical sodium-dependent bile acid transporter (ASBT) inhibitor, was evaluated by intravital imaging, RNA-sequencing, histological, blood, and urine analyses. Translational relevance was assessed in kidney biopsies from patients with CN, mice with a humanized bile acid (BA) spectrum, and via analysis of serum BAs and KIM-1 (kidney injury molecule 1) in patients with liver disease and hyperbilirubinemia. RESULTS: Proximal tubular epithelial cells (TECs) reabsorbed and enriched BAs, leading to oxidative stress and death of proximal TECs, casts in distal tubules and collecting ducts, peritubular capillary leakiness, and glomerular cysts. Renal ASBT inhibition by AS0369 blocked BA uptake into TECs and prevented kidney injury up to 6 weeks after BDL. Similar results were obtained in mice with humanized BA composition. In patients with advanced liver disease, serum BAs were the main determinant of KIM-1 levels. ASBT expression in TECs was preserved in biopsies from patients with CN, further highlighting the translational potential of targeting ASBT to treat CN. CONCLUSIONS: BA enrichment in proximal TECs followed by oxidative stress and cell death is a key early event in CN. Inhibiting renal ASBT and consequently BA enrichment in TECs prevents CN and systemically decreases BA concentrations. IMPACT AND IMPLICATIONS: Cholemic nephropathy (CN) is a severe complication of cholestasis and an unmet clinical need. We demonstrate that CN is triggered by the renal accumulation of bile acids (BAs) that are considerably increased in the systemic blood. Specifically, the proximal tubular epithelial cells of the kidney take up BAs via the apical sodium-dependent bile acid transporter (ASBT). We developed a therapeutic compound that blocks ASBT in the kidneys, prevents BA overload in tubular epithelial cells, and almost completely abolished all disease hallmarks in a CN mouse model. Renal ASBT inhibition represents a potential therapeutic strategy for patients with CN.


Subject(s)
Carrier Proteins, Cholestasis, Kidney Diseases, Liver Diseases, Membrane Glycoproteins, Sodium-Dependent Organic Anion Transporters, Symporters, Humans, Mice, Animals, Cholestasis/complications, Cholestasis/metabolism, Kidney/metabolism, Symporters/metabolism, Bile Acids and Salts/metabolism, Liver/metabolism, Bile Ducts/metabolism, Liver Diseases/metabolism, Sodium
2.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34498681

ABSTRACT

Feature selection is crucial for the analysis of high-dimensional data, but benchmark studies for data with a survival outcome are rare. We compare 14 filter methods for feature selection based on 11 high-dimensional gene expression survival data sets. The aim is to provide guidance on the choice of filter methods for other researchers and practitioners. We analyze the accuracy of predictive models that employ the features selected by the filter methods. Also, we consider the run time, the number of selected features for fitting models with high predictive accuracy as well as the feature selection stability. We conclude that the simple variance filter outperforms all other considered filter methods. This filter selects the features with the largest variance and does not take into account the survival outcome. Also, we identify the correlation-adjusted regression scores filter as a more elaborate alternative that allows fitting models with similar predictive accuracy. Additionally, we investigate the filter methods based on feature rankings, finding groups of similar filters.
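A minimal sketch of the variance filter that this benchmark found to perform best (not the authors' code): features are ranked by variance alone, ignoring the survival outcome, and the top k are handed to a downstream survival model. The feature count and toy data below are placeholders.

```python
import numpy as np

def variance_filter(X, k):
    """Rank features by variance (ignoring the survival outcome) and
    return the column indices of the k most variable features."""
    return np.argsort(X.var(axis=0))[::-1][:k]

# Toy usage: 100 samples x 5000 genes; keep the 200 most variable genes
# and pass them to any downstream survival model (e.g. a Cox fit).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5000))
X_reduced = X[:, variance_filter(X, k=200)]
print(X_reduced.shape)
```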


Subject(s)
Algorithms, Benchmarking, Gene Expression
3.
Regul Toxicol Pharmacol ; 148: 105583, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38401761

ABSTRACT

The alkaline comet assay is frequently used as an in vivo follow-up test within different regulatory environments to characterize the DNA-damaging potential of test items. The corresponding OECD Test Guideline 489 highlights the importance of statistical analyses and historical control data (HCD) but does not provide detailed procedures. Therefore, the working group "Statistics" of the German-speaking Society for Environmental Mutation Research (GUM) collected HCD from five laboratories and more than 200 comet assay studies and performed several statistical analyses. Key results were that (I) the large inter-laboratory effects observed argue against the use of absolute quality thresholds, (II) more than 50% zero values on a slide are considered problematic because of their influence on slide- or animal-level summary statistics, and (III) the type of summarizing measure for single-cell data (e.g., median, arithmetic or geometric mean) may lead to extreme differences in the resulting animal tail intensities and study outcomes in the HCD. These summarizing values increase the reliability of analysis results by better meeting statistical model assumptions, but at the cost of information loss. Furthermore, the relation between negative and positive control groups in the data set was always satisfactory, based on ratio, difference and quantile analyses.
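The influence of the summarizing measure can be illustrated with a small sketch (an assumption for illustration, not the working group's analysis code): per-slide median, arithmetic mean, and geometric mean of single-cell % tail intensities, with the geometric mean taken over positive values only, which is one common convention and one reason slides with many zero values are problematic.

```python
import numpy as np

def slide_summaries(tail_intensities):
    """Per-slide summaries of single-cell % tail intensities. The geometric
    mean is taken over positive values only (one common convention), which
    is one reason slides with many zero values are problematic."""
    x = np.asarray(tail_intensities, dtype=float)
    pos = x[x > 0]
    return {
        "median": float(np.median(x)),
        "arithmetic_mean": float(x.mean()),
        "geometric_mean": float(np.exp(np.log(pos).mean())) if pos.size else 0.0,
        "fraction_zero": float(np.mean(x == 0)),
    }

print(slide_summaries([0, 0, 0, 0.8, 1.2, 3.4, 0, 25.0]))
```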


Subject(s)
DNA Damage, Research Design, Animals, Comet Assay/methods, Reproducibility of Results, Mutation
4.
BMC Bioinformatics ; 24(1): 393, 2023 Oct 19.
Article in English | MEDLINE | ID: mdl-37858091

ABSTRACT

BACKGROUND: An important problem in toxicology in the context of gene expression data is the simultaneous inference of a large number of concentration-response relationships. The quality of the inference substantially depends on the choice of design of the experiments, in particular, on the set of different concentrations, at which observations are taken for the different genes under consideration. As this set has to be the same for all genes, the efficient planning of such experiments is very challenging. We address this problem by determining efficient designs for the simultaneous inference of a large number of concentration-response models. For that purpose, we both construct a D-optimality criterion for simultaneous inference and a K-means procedure which clusters the support points of the locally D-optimal designs of the individual models. RESULTS: We show that a planning of experiments that addresses the simultaneous inference of a large number of concentration-response relationships yields a substantially more accurate statistical analysis. In particular, we compare the performance of the constructed designs to the ones of other commonly used designs in terms of D-efficiencies and in terms of the quality of the resulting model fits using a real data example dealing with valproic acid. For the quality comparison we perform an extensive simulation study. CONCLUSIONS: The design maximizing the D-optimality criterion for simultaneous inference improves the inference of the different concentration-response relationships substantially. The design based on the K-means procedure also performs well, whereas a log-equidistant design, which was also included in the analysis, performs poorly in terms of the quality of the simultaneous inference. Based on our findings, the D-optimal design for simultaneous inference should be used for upcoming analyses dealing with high-dimensional gene expression data.
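A rough sketch of the clustering step described above (the pooled support points, the cluster count, and the use of scikit-learn are assumptions for illustration; the D-optimality computation itself is not shown): support points collected from the gene-wise locally D-optimal designs are clustered, and the sorted cluster centres serve as the shared set of concentrations for all genes.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical pool of support points (log10 concentrations) taken from the
# locally D-optimal designs of many individual concentration-response models.
rng = np.random.default_rng(1)
support_points = np.concatenate([
    rng.normal(-1.0, 0.2, 300),   # points near the lower bend of the curves
    rng.normal(0.5, 0.3, 300),    # points near the half-maximal region
    rng.normal(1.5, 0.2, 300),    # points near the upper bend
]).reshape(-1, 1)

# Cluster into the number of distinct concentrations the shared design may
# use; the cluster centres become the common concentrations.
k = 6  # placeholder for the affordable number of concentrations
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(support_points)
print(np.sort(km.cluster_centers_.ravel()))
```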


Subject(s)
Research Design, Computer Simulation
5.
BMC Med ; 21(1): 182, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37189125

ABSTRACT

BACKGROUND: In high-dimensional data (HDD) settings, the number of variables associated with each observation is very large. Prominent examples of HDD in biomedical research include omics data with a large number of variables such as many measurements across the genome, proteome, or metabolome, as well as electronic health records data that have large numbers of variables recorded for each patient. The statistical analysis of such data requires knowledge and experience, sometimes of complex methods adapted to the respective research questions. METHODS: Advances in statistical methodology and machine learning methods offer new opportunities for innovative analyses of HDD, but at the same time require a deeper understanding of some fundamental statistical concepts. Topic group TG9 "High-dimensional data" of the STRATOS (STRengthening Analytical Thinking for Observational Studies) initiative provides guidance for the analysis of observational studies, addressing particular statistical challenges and opportunities for the analysis of studies involving HDD. In this overview, we discuss key aspects of HDD analysis to provide a gentle introduction for non-statisticians and for classically trained statisticians with little experience specific to HDD. RESULTS: The paper is organized with respect to subtopics that are most relevant for the analysis of HDD, in particular initial data analysis, exploratory data analysis, multiple testing, and prediction. For each subtopic, main analytical goals in HDD settings are outlined. For each of these goals, basic explanations for some commonly used analysis methods are provided. Situations are identified where traditional statistical methods cannot, or should not, be used in the HDD setting, or where adequate analytic tools are still lacking. Many key references are provided. CONCLUSIONS: This review aims to provide a solid statistical foundation for researchers, including statisticians and non-statisticians, who are new to research with HDD or simply want to better evaluate and understand the results of HDD analyses.


Subject(s)
Biomedical Research, Goals, Humans, Research Design
6.
Liver Int ; 43(8): 1699-1713, 2023 08.
Article in English | MEDLINE | ID: mdl-37073116

ABSTRACT

BACKGROUND & AIMS: Nonalcoholic fatty liver disease (NAFLD) is a major health burden associated with the metabolic syndrome, leading to liver fibrosis, cirrhosis and ultimately liver cancer. In humans, the I148M polymorphism of patatin-like phospholipase domain-containing protein 3 (PNPLA3) has a well-documented impact on metabolic liver disease. In this study, we used a mouse model mimicking the human PNPLA3 I148M polymorphism in a long-term high-fat diet (HFD) experiment to better define its role in NAFLD progression. METHODS: Male mice bearing wild-type Pnpla3 (Pnpla3WT) or the human polymorphism PNPLA3 I148M (Pnpla3148M/M) were subjected to HFD feeding for 24 and 52 weeks. Further analyses concerning basic phenotype, inflammation, proliferation and cell death, fibrosis and microbiota were performed at each time point. RESULTS: After 52 weeks of HFD, Pnpla3148M/M animals had more liver fibrosis, increased numbers of inflammatory cells and increased Kupffer cell activity. Increased hepatocyte turnover and ductular proliferation were evident in HFD-fed Pnpla3148M/M livers. Microbiome diversity was decreased after HFD feeding; the changes were influenced by HFD feeding (36%) and the PNPLA3 I148M genotype (12%). Pnpla3148M/M mice had more faecal bile acids. RNA-sequencing of liver tissue defined an HFD-associated signature and a Pnpla3148M/M-specific pattern, which suggests Kupffer cells and monocyte-derived macrophages as significant drivers of liver disease progression in Pnpla3148M/M animals. CONCLUSION: With long-term HFD feeding, mice with the PNPLA3 I148M genotype show exacerbated NAFLD. This finding is linked to PNPLA3 I148M-specific changes in microbiota composition and liver gene expression, reflecting a stronger inflammatory response that leads to enhanced liver fibrosis progression.


Subject(s)
Metabolic Diseases, Non-alcoholic Fatty Liver Disease, Animals, Male, Mice, Acyltransferases/genetics, Diet, Genetic Predisposition to Disease, Genotype, Liver/pathology, Liver Cirrhosis/genetics, Liver Cirrhosis/metabolism, Non-alcoholic Fatty Liver Disease/genetics, Non-alcoholic Fatty Liver Disease/metabolism, Calcium-Independent Phospholipases A2/genetics, Calcium-Independent Phospholipases A2/metabolism
7.
Arch Toxicol ; 97(10): 2741-2761, 2023 10.
Article in English | MEDLINE | ID: mdl-37572131

ABSTRACT

The analysis of dose-response, concentration-response, and time-response relationships is a central component of toxicological research. A major decision with respect to the statistical analysis is whether to consider only the actually measured concentrations or to assume an underlying (parametric) model that allows extrapolation. Recent research suggests the application of modelling approaches for various types of toxicological assays. However, there is a discrepancy between the state of the art in statistical methodological research and published analyses in the toxicological literature. The extent of this gap is quantified in this work using an extensive literature review that considered all dose-response analyses published in three major toxicological journals in 2021. The aspects of the review include biological considerations (type of assay and of exposure), statistical design considerations (number of measured conditions, design, and sample sizes), and statistical analysis considerations (display, analysis goal, statistical testing or modelling method, and alert concentration). Based on the results of this review and the critical assessment of three selected issues in the context of statistical research, concrete guidance for planning, execution, and analysis of dose-response studies from a statistical viewpoint is proposed.

8.
J Hepatol ; 77(5): 1386-1398, 2022 11.
Article in English | MEDLINE | ID: mdl-35863491

ABSTRACT

BACKGROUND & AIMS: Pluripotent stem cell (PSC)-derived hepatocyte-like cells (HLC) have enormous potential as a replacement for primary hepatocytes in drug screening, toxicology and cell replacement therapy, but their genome-wide expression patterns differ strongly from primary human hepatocytes (PHH). METHODS: We differentiated human induced pluripotent stem cells (hiPSC) via definitive endoderm to HLC and characterized the cells by single-cell and bulk RNA-seq, with complementary epigenetic analyses. We then compared HLC to PHH and publicly available data on human fetal hepatocytes (FH) ex vivo; we performed bioinformatics-guided interventions to improve HLC differentiation via lentiviral transduction of the nuclear receptor FXR and agonist exposure. RESULTS: Single-cell RNA-seq revealed that transcriptomes of individual HLC display a hybrid state, where hepatocyte-associated genes are expressed in concert with genes that are not expressed in PHH - mostly intestinal genes - within the same cell. Bulk-level overrepresentation analysis, as well as regulon analysis at the single-cell level, identified sets of regulatory factors discriminating HLC, FH, and PHH, hinting at a central role for the nuclear receptor FXR in the functional maturation of HLC. Combined FXR expression plus agonist exposure enhanced the expression of hepatocyte-associated genes and increased the ability of bile canalicular secretion as well as lipid droplet formation, thereby increasing HLCs' similarity to PHH. The undesired non-liver gene expression was reproducibly decreased, although only by a moderate degree. CONCLUSION: In contrast to physiological hepatocyte precursor cells and mature hepatocytes, HLC co-express liver and hybrid genes in the same cell. Targeted modification of the FXR gene regulatory network improves their differentiation by suppressing intestinal traits whilst inducing hepatocyte features. LAY SUMMARY: Generation of human hepatocytes from stem cells represents an active research field but its success is hampered by the fact that the stem cell-derived 'hepatocytes' still show major differences to hepatocytes obtained from a liver. Here, we identified an important reason for the difference, specifically that the stem cell-derived 'hepatocyte' represents a hybrid cell with features of hepatocytes and intestinal cells. We show that a specific protein (FXR) suppresses intestinal and induces liver features, thus bringing the stem cell-derived cells closer to hepatocytes derived from human livers.


Subject(s)
Induced Pluripotent Stem Cells, Pluripotent Stem Cells, Cell Differentiation, Hepatocytes/metabolism, Humans, Intestines
9.
J Hepatol ; 77(1): 71-83, 2022 07.
Article in English | MEDLINE | ID: mdl-35131407

ABSTRACT

BACKGROUND & AIMS: Acetaminophen (APAP) overdose remains a frequent cause of acute liver failure, which is generally accompanied by increased levels of serum bile acids (BAs). However, the pathophysiological role of BAs remains elusive. Herein, we investigated the role of BAs in APAP-induced hepatotoxicity. METHODS: We performed intravital imaging to investigate BA transport in mice, quantified endogenous BA concentrations in the serum of mice and patients with APAP overdose, analyzed liver tissue and bile by mass spectrometry and MALDI-mass spectrometry imaging, assessed the integrity of the blood-bile barrier and the role of oxidative stress by immunostaining of tight junction proteins and intravital imaging of fluorescent markers, identified the intracellular cytotoxic concentrations of BAs, and performed interventions to block BA uptake from blood into hepatocytes. RESULTS: Prior to the onset of cell death, APAP overdose causes massive oxidative stress in the pericentral lobular zone, which coincided with a breach of the blood-bile barrier. Consequently, BAs leak from the bile canaliculi into the sinusoidal blood, which is then followed by their uptake into hepatocytes via the basolateral membrane, their secretion into canaliculi and repeated cycling. This, what we termed 'futile cycling' of BAs, led to increased intracellular BA concentrations that were high enough to cause hepatocyte death. Importantly, however, the interruption of BA re-uptake by pharmacological NTCP blockage using Myrcludex B and Oatp knockout strongly reduced APAP-induced hepatotoxicity. CONCLUSIONS: APAP overdose induces a breach of the blood-bile barrier which leads to futile BA cycling that causes hepatocyte death. Prevention of BA cycling may represent a therapeutic option after APAP intoxication. LAY SUMMARY: Only one drug, N-acetylcysteine, is approved for the treatment of acetaminophen overdose and it is only effective when given within ∼8 hours after ingestion. We identified a mechanism by which acetaminophen overdose causes an increase in bile acid concentrations (to above toxic thresholds) in hepatocytes. Blocking this mechanism prevented acetaminophen-induced hepatotoxicity in mice and evidence from patients suggests that this therapy may be effective for longer periods after ingestion compared to N-acetylcysteine.


Subject(s)
Chemical and Drug Induced Liver Injury, Drug Overdose, Acetaminophen/metabolism, Acetylcysteine/pharmacology, Animals, Bile Acids and Salts/metabolism, Chemical and Drug Induced Liver Injury/drug therapy, Chemical and Drug Induced Liver Injury/metabolism, Chemical and Drug Induced Liver Injury/prevention & control, Hepatocytes/metabolism, Humans, Liver/metabolism, Mice, Inbred C57BL Mice
10.
Bioinformatics ; 2021 Jan 30.
Article in English | MEDLINE | ID: mdl-33515236

ABSTRACT

MOTIVATION: An important goal of concentration-response studies in toxicology is to determine an 'alert' concentration where a critical level of the response variable is exceeded. In a classical observation-based approach, only measured concentrations are considered as potential alert concentrations. Alternatively, a parametric curve is fitted to the data that describes the relationship between concentration and response. For a prespecified effect level, both an absolute estimate of the alert concentration and an estimate of the lowest concentration where the effect level is exceeded significantly are of interest. RESULTS: In a simulation study for gene expression data, we compared the observation-based and the model-based approach for both absolute and significant exceedance of the prespecified effect level. Results show that, compared to the observation-based approach, the model-based approach overestimates the true alert concentration less often and more frequently leads to a valid estimate, especially for genes with large variance. AVAILABILITY AND IMPLEMENTATION: The code used for the simulation studies is available via the GitHub repository: https://github.com/FKappenberg/Paper-IdentificationAlertConcentrations. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
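A minimal sketch of the two estimators compared above, restricted to absolute exceedance of the effect level (the significance-based variant and the gene expression simulation setting are not reproduced; the 4pLL parameterization, toy data, and starting values are assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit, brentq

def four_pll(conc, b, c, d, e):
    """Four-parameter log-logistic curve (lower asymptote c, upper asymptote d)."""
    return c + (d - c) / (1 + np.exp(b * (np.log(conc) - np.log(e))))

# Hypothetical responses (e.g. fold changes) at the measured concentrations.
conc = np.array([0.1, 0.3, 1, 3, 10, 30])
resp = np.array([1.0, 1.05, 1.2, 1.8, 2.4, 2.6])
effect_level = 1.5  # prespecified critical effect level

params, _ = curve_fit(four_pll, conc, resp, p0=[-1, 1, 2.5, 3], maxfev=10000)

# Observation-based alert: first measured concentration exceeding the level.
obs_alert = conc[np.argmax(resp > effect_level)]

# Model-based alert: concentration where the fitted curve crosses the level,
# searched between the lowest and highest tested concentrations.
model_alert = brentq(lambda x: four_pll(x, *params) - effect_level,
                     conc.min(), conc.max())
print(obs_alert, round(model_alert, 3))
```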

11.
Chem Res Toxicol ; 35(5): 760-773, 2022 05 16.
Article in English | MEDLINE | ID: mdl-35416653

ABSTRACT

Despite the progress made in developmental toxicology, there is a great need for in vitro tests that identify developmental toxicants in relation to human oral doses and blood concentrations. In the present study, we established the hiPSC-based UKK2 in vitro test and analyzed genome-wide expression profiles of 23 known teratogens and 16 non-teratogens. Compounds were analyzed at the maximal plasma concentration (Cmax) and at 20-fold Cmax for a 24 h incubation period in three independent experiments. Based on the 1000 probe sets with the highest variance and including information on cytotoxicity, penalized logistic regression with leave-one-out cross-validation was used to classify the compounds as test-positive or test-negative, reaching an area under the curve (AUC), accuracy, sensitivity, and specificity of 0.96, 0.92, 0.96, and 0.88, respectively. Omitting the cytotoxicity information reduced the test performance to an AUC of 0.94, an accuracy of 0.79, and a sensitivity of 0.74. A second method, which used the number of significantly deregulated probe sets to classify the compounds, resulted in a specificity of 1; however, the AUC (0.90), accuracy (0.90), and sensitivity (0.83) were inferior compared to those of the logistic regression-based procedure. Finally, no increased performance was achieved when the high test concentrations (20-fold Cmax) were used, in comparison to testing within the realistic clinical range (1-fold Cmax). In conclusion, although further optimization is required, for example, by including additional readouts and cell systems that model different developmental processes, the UKK2-test in its present form can support the early discovery-phase detection of human developmental toxicants.
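A schematic of the classification step as described (penalized logistic regression with leave-one-out cross-validation on the 1000 highest-variance probe sets); the penalty type, regularization strength, and random toy data are assumptions, and the cytotoxicity feature used in the full test is omitted here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import roc_auc_score

# Hypothetical data: 39 compounds (23 teratogens + 16 non-teratogens) with
# expression values for 5000 probe sets; y = 1 for teratogen, 0 otherwise.
rng = np.random.default_rng(0)
X = rng.normal(size=(39, 5000))
y = rng.integers(0, 2, size=39)

top = np.argsort(X.var(axis=0))[::-1][:1000]          # highest-variance probe sets
clf = LogisticRegression(penalty="l2", C=0.1, max_iter=5000)  # penalized model (stand-in)

# Leave-one-compound-out cross-validation, scored by AUC as in the abstract.
proba = cross_val_predict(clf, X[:, top], y, cv=LeaveOneOut(),
                          method="predict_proba")[:, 1]
print("LOO-CV AUC:", round(roc_auc_score(y, proba), 2))
```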


Subject(s)
Induced Pluripotent Stem Cells, Transcriptome, Hazardous Substances, Humans, In Vitro Techniques, Teratogens
12.
Nucleic Acids Res ; 48(22): 12577-12592, 2020 12 16.
Article in English | MEDLINE | ID: mdl-33245762

ABSTRACT

Thousands of transcriptome data sets are available, but approaches for their use in dynamic cell response modelling are few, especially for processes affected simultaneously by two orthogonal influencing variables. We approached this problem for neuroepithelial development of human pluripotent stem cells (differentiation variable), in the presence or absence of valproic acid (signaling variable). Using few basic assumptions (sequential differentiation states of cells; discrete on/off states for individual genes in these states), and time-resolved transcriptome data, a comprehensive model of spontaneous and perturbed gene expression dynamics was developed. The model made reliable predictions (average correlation of 0.85 between predicted and subsequently tested expression values). Even regulations predicted to be non-monotonic were successfully validated by PCR in new sets of experiments. Transient patterns of gene regulation were identified from model predictions. They pointed towards activation of Wnt signaling as a candidate pathway leading to a redirection of differentiation away from neuroepithelial cells towards neural crest. Intervention experiments, using a Wnt/beta-catenin antagonist, led to a phenotypic rescue of this disturbed differentiation. Thus, our broadly applicable model allows the analysis of transcriptome changes in complex time/perturbation matrices.


Subject(s)
Cell Differentiation/genetics, Pluripotent Stem Cells/cytology, Transcriptome/genetics, Developmental Gene Expression Regulation/genetics, Humans, Wnt Signaling Pathway/genetics
13.
Biom J ; 64(5): 948-963, 2022 06.
Article in English | MEDLINE | ID: mdl-35212423

ABSTRACT

We propose to use Bayesian optimization (BO) to improve the efficiency of the design selection process in clinical trials. BO is a method to optimize expensive black-box functions, by using a regression as a surrogate to guide the search. In clinical trials, planning test procedures and sample sizes is a crucial task. A common goal is to maximize the test power, given a set of treatments, corresponding effect sizes, and a total number of samples. From a wide range of possible designs, we aim to select the best one in a short time to allow quick decisions. The standard approach to simulate the power for each single design can become too time consuming. When the number of possible designs becomes very large, either large computational resources are required or an exhaustive exploration of all possible designs takes too long. Here, we propose to use BO to quickly find a clinical trial design with high power from a large number of candidate designs. We demonstrate the effectiveness of our approach by optimizing the power of adaptive seamless designs for different sets of treatment effect sizes. Comparing BO with an exhaustive evaluation of all candidate designs shows that BO finds competitive designs in a fraction of the time.
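As an illustration of the general idea only, and not of the adaptive seamless designs studied in the paper, the sketch below uses Bayesian optimization (via scikit-optimize, an assumed stand-in for the authors' tooling) to maximize Monte-Carlo-estimated power over a single design parameter, here the allocation fraction of a fixed total sample size; effect size, variances and sample size are placeholders.

```python
import numpy as np
from scipy import stats
from skopt import gp_minimize
from skopt.space import Real

rng = np.random.default_rng(0)
N = 120                                   # total sample size (placeholder)
effect, sd_trt, sd_ctl = 0.5, 2.0, 1.0    # assumed effect size and unequal SDs

def neg_power(params, n_sim=400):
    """Monte-Carlo estimate of (negative) power of a Welch t-test as a
    function of the fraction of patients allocated to the treatment arm."""
    frac = params[0]
    n1 = max(2, int(round(frac * N)))
    n0 = max(2, N - n1)
    rejections = 0
    for _ in range(n_sim):
        x1 = rng.normal(effect, sd_trt, n1)
        x0 = rng.normal(0.0, sd_ctl, n0)
        _, p = stats.ttest_ind(x1, x0, equal_var=False)
        rejections += p < 0.05
    return -rejections / n_sim            # minimize the negative power

res = gp_minimize(neg_power, [Real(0.2, 0.8)], n_calls=25, random_state=0)
print("best allocation fraction:", round(res.x[0], 2), "estimated power:", -res.fun)
```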


Subject(s)
Research Design, Bayes Theorem, Sample Size
14.
Biom J ; 64(5): 883-897, 2022 06.
Article in English | MEDLINE | ID: mdl-35187701

ABSTRACT

We extend the scope of application of MCP-Mod (Multiple Comparison Procedure and Modeling) to in vitro gene expression data and assess its characteristics regarding model selection for concentration gene expression curves. Specifically, we apply MCP-Mod to single genes of a high-dimensional gene expression data set, where human embryonic stem cells were exposed to eight concentration levels of the compound valproic acid (VPA). As candidate models we consider the sigmoid Emax (four-parameter log-logistic), linear, quadratic, Emax, exponential, and beta models. Through simulations we investigate the impact of omitting one or more models from the candidate model set to uncover possibly superfluous models and to evaluate the precision and recall rates of the selected models. Each model is selected according to the Akaike information criterion (AIC) for a considerable number of genes. For less noisy cases the popular sigmoid Emax model is frequently selected. For noisier data, simpler models like the linear model are often selected, but mostly without a relevant performance advantage compared to the second-best model. Also, the commonly used standard Emax model shows unexpectedly low performance.
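A simplified sketch of the AIC-based selection step for one gene (the MCP contrast-testing step and the beta and exponential models are omitted; concentrations, responses, starting values, and the Gaussian-error AIC are assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

# One gene: hypothetical log2 fold changes at eight VPA concentrations.
conc = np.array([0, 25, 150, 350, 450, 550, 800, 1000], dtype=float)
expr = np.array([0.0, 0.1, 0.3, 0.9, 1.4, 1.6, 1.7, 1.8])

def sigmoid_emax(x, e0, emax, ed50, h):
    return e0 + emax * x**h / (ed50**h + x**h)

candidates = {                        # model -> (function, starting values)
    "linear":       (lambda x, e0, d: e0 + d * x,                          [0, 1e-3]),
    "quadratic":    (lambda x, e0, b1, b2: e0 + b1 * x + b2 * x**2,        [0, 1e-3, 0]),
    "emax":         (lambda x, e0, emax, ed50: e0 + emax * x / (ed50 + x), [0, 2, 300]),
    "sigmoid_emax": (sigmoid_emax,                                         [0, 2, 300, 2]),
}

def aic(y, yhat, n_par):
    n = len(y)
    rss = np.sum((y - yhat) ** 2)
    return n * np.log(rss / n) + 2 * (n_par + 1)  # Gaussian errors; +1 for sigma

scores = {}
for name, (f, p0) in candidates.items():
    try:
        popt, _ = curve_fit(f, conc, expr, p0=p0, maxfev=20000)
        scores[name] = aic(expr, f(conc, *popt), len(popt))
    except (RuntimeError, ValueError):            # candidate did not converge
        scores[name] = np.inf
print("selected:", min(scores, key=scores.get), scores)
```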


Subject(s)
Linear Models, Gene Expression, Humans
15.
BMC Bioinformatics ; 22(1): 586, 2021 Dec 11.
Article in English | MEDLINE | ID: mdl-34895139

ABSTRACT

BACKGROUND: Important objectives in cancer research are the prediction of a patient's risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. However, this can lead to a loss of power when the sample size is small. Simple pooling of all cohorts, on the other hand, can lead to biased results, especially when the cohorts are heterogeneous. RESULTS: We propose a new Bayesian approach suitable for continuous molecular measurements and survival outcome that identifies the important predictors and provides a separate risk prediction model for each cohort. It allows sharing information between cohorts to increase power by assuming a graph linking predictors within and across different cohorts. The graph helps to identify pathways of functionally related genes and genes that are simultaneously prognostic in different cohorts. CONCLUSIONS: Results demonstrate that our proposed approach is superior to the standard approaches in terms of prediction performance and increased power in variable selection when the sample size is small.


Subject(s)
Bayes Theorem, Cohort Studies, Gene Expression, Humans, Sample Size
16.
BMC Med Inform Decis Mak ; 21(1): 342, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34876106

ABSTRACT

BACKGROUND: An important task in clinical medicine is the construction of risk prediction models for specific subgroups of patients based on high-dimensional molecular measurements such as gene expression data. Major objectives in modeling high-dimensional data are good prediction performance and feature selection to find a subset of predictors that are truly associated with a clinical outcome such as a time-to-event endpoint. In clinical practice, this task is challenging since patient cohorts are typically small and can be heterogeneous with regard to their relationship between predictors and outcome. When data of several subgroups of patients with the same or similar disease are available, it is tempting to combine them to increase sample size, such as in multicenter studies. However, heterogeneity between subgroups can lead to biased results and subgroup-specific effects may remain undetected. METHODS: For this situation, we propose a penalized Cox regression model with a weighted version of the Cox partial likelihood that includes patients of all subgroups but assigns them individual weights based on their subgroup affiliation. The weights are estimated from the data such that patients who are likely to belong to the subgroup of interest obtain higher weights in the subgroup-specific model. RESULTS: Our proposed approach is evaluated through simulations and application to real lung cancer cohorts, and compared to existing approaches. Simulation results demonstrate that our proposed model is superior to standard approaches in terms of prediction performance and variable selection accuracy when the sample size is small. CONCLUSIONS: The results suggest that sharing information between subgroups by incorporating appropriate weights into the likelihood can increase power to identify the prognostic covariates and improve risk prediction.
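One common form of such a weighted Cox partial log-likelihood is shown below as an illustration; w_i denotes the estimated weight of patient i (higher when the patient likely belongs to the subgroup of interest), delta_i the event indicator, x_i the covariate vector, and R(t_i) the risk set at event time t_i. The paper's weight-estimation scheme and penalty are not reproduced here.

```latex
% Weighted Cox partial log-likelihood (one common form; the paper's exact
% weighting and estimation procedure is not shown).
\ell_w(\beta) \;=\; \sum_{i:\,\delta_i = 1} w_i \left[ x_i^{\top}\beta
  \;-\; \log \sum_{j \in R(t_i)} w_j \exp\!\big(x_j^{\top}\beta\big) \right]
```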


Subject(s)
Lung Neoplasms, Computer Simulation, Gene Expression, Humans, Lung Neoplasms/diagnosis, Proportional Hazards Models, Sample Size
17.
BMC Bioinformatics ; 21(1): 26, 2020 Jan 28.
Article in English | MEDLINE | ID: mdl-31992203

ABSTRACT

BACKGROUND: With modern methods in biotechnology, the search for biomarkers has advanced to a challenging statistical task exploring high dimensional data sets. Feature selection is a widely researched preprocessing step to handle huge numbers of biomarker candidates and has special importance for the analysis of biomedical data. Such data sets often include many input features not related to the diagnostic or therapeutic target variable. A less researched, but also relevant aspect for medical applications are costs of different biomarker candidates. These costs are often financial costs, but can also refer to other aspects, for example the decision between a painful biopsy marker and a simple urine test. In this paper, we propose extensions to two feature selection methods to control the total amount of such costs: greedy forward selection and genetic algorithms. In comprehensive simulation studies of binary classification tasks, we compare the predictive performance, the run-time and the detection rate of relevant features for the new proposed methods and five baseline alternatives to handle budget constraints. RESULTS: In simulations with a predefined budget constraint, our proposed methods outperform the baseline alternatives, with just minor differences between them. Only in the scenario without an actual budget constraint, our adapted greedy forward selection approach showed a clear drop in performance compared to the other methods. However, introducing a hyperparameter to adapt the benefit-cost trade-off in this method could overcome this weakness. CONCLUSIONS: In feature cost scenarios, where a total budget has to be met, common feature selection algorithms are often not suitable to identify well performing subsets for a modelling task. Adaptations of these algorithms such as the ones proposed in this paper can help to tackle this problem.
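A compact sketch of the budget-constrained greedy forward selection idea (not the paper's implementation, which also covers genetic algorithms and a benefit-cost trade-off hyperparameter); the costs, budget, classifier, and cross-validated accuracy criterion below are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def budget_greedy_forward(X, y, costs, budget, cv=5):
    """Greedy forward selection that only considers candidate features whose
    cost still fits into the remaining budget; each step adds the affordable
    feature with the best cross-validated accuracy, stopping when no
    affordable feature improves the score."""
    selected, spent, best_score = [], 0.0, 0.0
    while True:
        affordable = [j for j in range(X.shape[1])
                      if j not in selected and spent + costs[j] <= budget]
        if not affordable:
            break
        trials = {}
        for j in affordable:
            clf = LogisticRegression(max_iter=2000)
            trials[j] = cross_val_score(clf, X[:, selected + [j]], y, cv=cv).mean()
        j_best = max(trials, key=trials.get)
        if trials[j_best] <= best_score:
            break
        selected.append(j_best)
        spent += costs[j_best]
        best_score = trials[j_best]
    return selected, spent, best_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
y = (X[:, 0] + X[:, 3] + rng.normal(size=200) > 0).astype(int)
costs = rng.uniform(1, 10, size=30)
print(budget_greedy_forward(X, y, costs, budget=15.0))
```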


Subject(s)
Algorithms, Biomarkers, Computational Biology
18.
Bioinformatics ; 35(14): i484-i491, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510644

ABSTRACT

MOTIVATION: To obtain a reliable prediction model for a specific cancer subgroup or cohort is often difficult due to limited sample size and, in survival analysis, due to potentially high censoring rates. Sometimes similar data from other patient subgroups are available, e.g. from other clinical centers. Simple pooling of all subgroups can decrease the variance of the predicted parameters of the prediction models, but also increase the bias due to heterogeneity between the cohorts. A promising compromise is to identify those subgroups with a similar relationship between covariates and target variable and then include only these for model building. RESULTS: We propose a subgroup-based weighted likelihood approach for survival prediction with high-dimensional genetic covariates. When predicting survival for a specific subgroup, for every other subgroup an individual weight determines the strength with which its observations enter into model building. MBO (model-based optimization) can be used to quickly find a good prediction model in the presence of a large number of hyperparameters. We use MBO to identify the best model for survival prediction of a specific subgroup by optimizing the weights for additional subgroups for a Cox model. The approach is evaluated on a set of lung cancer cohorts with gene expression measurements. The resulting models have competitive prediction quality, and they reflect the similarity of the corresponding cancer subgroups, with both weights close to 0 and close to 1 and medium weights. AVAILABILITY AND IMPLEMENTATION: mlrMBO is implemented as an R-package and is freely available at http://github.com/mlr-org/mlrMBO.


Subject(s)
Gene Expression, Lung Neoplasms, Survival Analysis, Female, Humans, Likelihood Functions, Lung Neoplasms/genetics, Lung Neoplasms/mortality, Male, Sample Size
19.
Arch Toxicol ; 94(11): 3787-3798, 2020 11.
Article in English | MEDLINE | ID: mdl-32965549

ABSTRACT

In cell biology, pharmacology and toxicology, dose-response and concentration-response curves are frequently fitted to data with statistical methods. Such fits are used to derive quantitative measures (e.g. EC50 values) describing the relationship between the concentration of a compound, or the strength of an intervention applied to cells, and its effect on the viability or function of these cells. Often, a reference, called the negative control (or solvent control), is used to normalize the data. The negative control data sometimes deviate from the values measured at low (ineffective) test compound concentrations. In such cases, normalization of the data with respect to control values leads to biased estimates of the parameters of the concentration-response curve, and low-quality estimates of effective concentrations can be the consequence. In a literature study, we found that this problem occurs in a large percentage of toxicological publications. We propose different strategies to tackle the problem, including complete omission of the controls. Data from a controlled simulation study indicate the best-suited solution for different data structure scenarios. This was further exemplified by a real concentration-response study. We provide the following recommendations on how to handle deviating controls: (1) The log-logistic 4pLL model is a good default option. (2) When there are at least two concentrations in the no-effect range, low variances of the replicate measurements, and deviating controls, control values should be omitted before fitting the model. (3) When data are missing in the no-effect range, the Brain-Cousens model sometimes leads to better results than the default model.
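For reference, one common parameterization of the two models named in the recommendations is given below (an assumption for illustration; the abstract itself does not state the formulas): the four-parameter log-logistic (4pLL) curve and the Brain-Cousens extension, which adds a linear hormesis term fx to the numerator.

```latex
% Four-parameter log-logistic (4pLL) model and Brain-Cousens extension,
% in one common parameterization (lower asymptote c, upper asymptote d,
% slope b, inflection point e, hormesis coefficient f).
f_{\mathrm{4pLL}}(x) = c + \frac{d - c}{1 + \exp\{b(\log x - \log e)\}},
\qquad
f_{\mathrm{BC}}(x) = c + \frac{d - c + f x}{1 + \exp\{b(\log x - \log e)\}}
```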


Subject(s)
Algorithms, In Vitro Techniques, Statistical Models, Cell Line, Computer Simulation, Hep G2 Cells, Humans, Biological Models, Normal Distribution, Research Design, Valproic Acid/analysis, Valproic Acid/toxicity
20.
Arch Toxicol ; 94(1): 151-171, 2020 01.
Article in English | MEDLINE | ID: mdl-31712839

ABSTRACT

The first in vitro tests for developmental toxicity made use of rodent cells. Newer teratology tests, e.g. developed during the ESNATS project, use human cells and measure mechanistic endpoints (such as transcriptome changes). However, the toxicological implications of mechanistic parameters are hard to judge, without functional/morphological endpoints. To address this issue, we developed a new version of the human stem cell-based test STOP-tox(UKN). For this purpose, the capacity of the cells to self-organize to neural rosettes was assessed as functional endpoint: pluripotent stem cells were allowed to differentiate into neuroepithelial cells for 6 days in the presence or absence of toxicants. Then, both transcriptome changes were measured (standard STOP-tox(UKN)) and cells were allowed to form rosettes. After optimization of staining methods, an imaging algorithm for rosette quantification was implemented and used for an automated rosette formation assay (RoFA). Neural tube toxicants (like valproic acid), which are known to disturb human development at stages when rosette-forming cells are present, were used as positive controls. Established toxicants led to distinctly different tissue organization and differentiation stages. RoFA outcome and transcript changes largely correlated concerning (1) the concentration-dependence, (2) the time dependence, and (3) the set of positive hits identified amongst 24 potential toxicants. Using such comparative data, a prediction model for the RoFA was developed. The comparative analysis was also used to identify gene dysregulations that are particularly predictive for disturbed rosette formation. This 'RoFA predictor gene set' may be used for a simplified and less costly setup of the STOP-tox(UKN) assay.


Subject(s)
Neural Stem Cells/drug effects, Neurodevelopmental Disorders/chemically induced, Neurotoxins/pharmacology, Rosette Formation/methods, Toxicity Tests/methods, Cell Differentiation/drug effects, Gene Expression Regulation/drug effects, Humans, Neural Stem Cells/cytology, Neural Stem Cells/physiology, Oligonucleotide Array Sequence Analysis, Time Factors