1.
Nat Methods ; 20(8): 1159-1169, 2023 08.
Article in English | MEDLINE | ID: mdl-37443337

ABSTRACT

The detection of circular RNA molecules (circRNAs) is typically based on short-read RNA sequencing data processed using computational tools. Numerous such tools have been developed, but a systematic comparison with orthogonal validation is missing. Here, we set up a circRNA detection tool benchmarking study, in which 16 tools detected more than 315,000 unique circRNAs in three deeply sequenced human cell types. Next, 1,516 predicted circRNAs were validated using three orthogonal methods. Generally, tool-specific precision is high and similar (median of 98.8%, 96.3% and 95.5% for qPCR, RNase R and amplicon sequencing, respectively) whereas the sensitivity and number of predicted circRNAs (ranging from 1,372 to 58,032) are the most significant differentiators. Of note, precision values are lower when evaluating low-abundance circRNAs. We also show that the tools can be used complementarily to increase detection sensitivity. Finally, we offer recommendations for future circRNA detection and validation.


Subjects
Benchmarking, Circular RNA, Humans, Circular RNA/genetics, RNA/genetics, RNA/metabolism, RNA Sequence Analysis/methods
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38555473

ABSTRACT

Digital PCR (dPCR) is a highly accurate technique for the quantification of target nucleic acid(s). It has shown great potential in clinical applications, like tumor liquid biopsy and validation of biomarkers. Accurate classification of partitions based on end-point fluorescence intensities is crucial to avoid biased estimators of the concentration of the target molecules. We have evaluated many clustering methods, from general-purpose methods to specific methods for dPCR and flow cytometry, on both simulated and real-life data. Clustering method performance was evaluated by simulating various scenarios. Based on our extensive comparison of clustering methods, we describe the limits of these methods, and formulate guidelines for choosing an appropriate method. In addition, we have developed a novel method for simulating realistic dPCR data. The method is based on a mixture distribution of a Poisson point process and a skew-$t$ distribution, which enables the generation of irregularities of cluster shapes and randomness of partitions between clusters ('rain') as commonly observed in dPCR data. Users can fine-tune the model parameters and generate labeled datasets, using their own data as a template. Moreover, the database of experimental dPCR data augmented with the labeled simulated data can serve as training and testing data for new clustering methods. The simulation method is available as an R Shiny app.


Subjects
Neoplasms, Nucleic Acids, Humans, Polymerase Chain Reaction/methods, Benchmarking, Liquid Biopsy
3.
Stat Med ; 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38963080

ABSTRACT

Semiparametric probabilistic index models allow for the comparison of two groups of observations, whilst adjusting for covariates, thereby fitting nicely within the framework of generalized pairwise comparisons (GPC). As with most regression approaches in this setting, the limited amount of data results in invalid inference as the asymptotic normality assumption is not met. In addition, separation issues might arise when considering small samples. In this article, we show that the parameters of the probabilistic index model can be estimated using generalized estimating equations, for which adjustments exist that lead to estimators of the sandwich variance-covariance matrix with improved finite sample properties and that can deal with bias due to separation. In this way, appropriate inference can be performed as is shown through extensive simulation studies. The known relationships between the probabilistic index and other GPC statistics also make it possible to provide valid inference for, for example, the net treatment benefit or the success odds.
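
The relationships mentioned at the end of this abstract can be illustrated with a minimal Python sketch. It assumes two unadjusted, uncensored samples where larger values are better, and it uses the usual definitions: the probabilistic index is P(A < B) + 0.5 P(A = B), the net benefit is 2 PI - 1, and the success odds are PI / (1 - PI). The function names are hypothetical, and the sketch does not implement the covariate-adjusted model or the generalized estimating equations described in the article.

```python
from itertools import product


def probabilistic_index(group_a, group_b):
    """Estimate P(A < B) + 0.5 * P(A = B) over all pairs (ties count half)."""
    score = 0.0
    for a, b in product(group_a, group_b):
        if a < b:
            score += 1.0
        elif a == b:
            score += 0.5
    return score / (len(group_a) * len(group_b))


def net_benefit_from_pi(pi):
    """Net benefit P(A < B) - P(A > B), expressed through the probabilistic index."""
    return 2.0 * pi - 1.0


def success_odds_from_pi(pi):
    """Success odds PI / (1 - PI); undefined when PI equals 1."""
    return pi / (1.0 - pi)


# Illustrative values only (not data from the article).
control = [1, 2, 3]
treated = [2, 3, 4]
pi_hat = probabilistic_index(control, treated)
```

Because all three quantities are monotone transformations of one another, inference for the probabilistic index carries over to the other two, which is the point the abstract makes.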

4.
Pharm Stat ; 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38562060

ABSTRACT

Combination treatments have been of increasing importance in drug development across therapeutic areas to improve treatment response, minimize the development of resistance, and/or minimize adverse events. Pre-clinical in-vitro combination experiments aim to explore the potential of such drug combinations during drug discovery by comparing the observed effect of the combination with the expected treatment effect under the assumption of no interaction (i.e., null model). This tutorial will address important design aspects of such experiments to allow proper statistical evaluation. Additionally, it will highlight the Biochemically Intuitive Generalized Loewe methodology (BIGL R package available on CRAN) to statistically detect deviations from the expectation under different null models. A clear advantage of the methodology is the quantification of the effect sizes, together with confidence intervals, while controlling the directional false coverage rate. Finally, a case study will showcase the workflow in analyzing combination experiments.

5.
Biom J ; 66(1): e2200237, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38285404

ABSTRACT

The two-sample problem is one of the earliest problems in statistics: given two samples, the question is whether or not the observations were sampled from the same distribution. Many statistical tests have been developed for this problem, and many tests have been evaluated in simulation studies, but hardly any study has tried to set up a neutral comparison study. In this paper, we introduce an open science initiative that potentially allows for neutral comparisons of two-sample tests. It is designed as an open-source R package, a repository, and an online R Shiny app. This paper describes the principles, the design of the system and illustrates the use of the system.


Subjects
Computer Simulation
6.
Clin Chem ; 69(9): 976-990, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37401391

ABSTRACT

BACKGROUND: Partition classification is a critical step in the digital PCR data analysis pipeline. A range of partition classification methods have been developed, many motivated by specific experimental setups. An overview of these partition classification methods is lacking and their comparative properties are often unclear, likely impacting the proper application of these methods. CONTENT: This review provides a summary of all available digital PCR partition classification approaches and the challenges they aim to overcome, serving as a guide for the digital PCR practitioner wishing to apply them. We additionally discuss strengths and weaknesses of these methods, which can further guide practitioners in vigilant application of these existing methods. This review provides method developers with ideas for improving methods or designing new ones. The latter is further stimulated by our identification and discussion of application gaps in the literature, for which there are currently no or few methods available. SUMMARY: This review provides an overview of digital PCR partition classification methods, their properties, and potential applications. Ideas for further advances are presented and may bolster method development.


Subjects
Polymerase Chain Reaction, Polymerase Chain Reaction/methods
7.
Biom J ; 65(2): e2100354, 2023 02.
Article in English | MEDLINE | ID: mdl-36127290

ABSTRACT

The method of generalized pairwise comparisons (GPC) is an extension of the well-known nonparametric Wilcoxon-Mann-Whitney test for comparing two groups of observations. Multiple generalizations of the Wilcoxon-Mann-Whitney test and other GPC methods have been proposed over the years to handle censored data. These methods apply different approaches to handling loss of information due to censoring: ignoring noninformative pairwise comparisons due to censoring (Gehan, Harrell, and Buyse); imputation using estimates of the survival distribution (Efron, Péron, and Latta); or inverse probability of censoring weighting (IPCW, Datta and Dong). Based on the GPC statistic, a measure of treatment effect, the "net benefit," can be defined. It quantifies the difference between the probabilities that a randomly selected individual from one group is doing better than an individual from the other group. This paper aims at evaluating GPC methods for censored data, both in the context of hypothesis testing and estimation, and providing recommendations related to their choice in various situations. The methods that ignore uninformative pairs have comparable power to more complex and computationally demanding methods in situations of low censoring, and are slightly superior for high proportions (>40%) of censoring. If one is interested in estimation of the net benefit, Harrell's c index is an unbiased estimator if the proportional hazards assumption holds. Otherwise, the imputation (Efron or Péron) or IPCW (Datta, Dong) methods provide unbiased estimators in case of proportions of drop-out censoring up to 60%.


Subjects
Research Design, Probability, Computer Simulation, Survival Analysis
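
As an illustration of the first family of methods named in this abstract (ignoring uninformative pairs), the following Python sketch computes a Gehan-style net benefit for right-censored survival times. It is a simplified sketch under stated assumptions: each observation is a hypothetical `(time, event_observed)` tuple, longer survival is better, and pairs with equal times are treated as uninformative. It is not the implementation evaluated in the article.

```python
def gehan_net_benefit(treatment, control):
    """Net benefit with Gehan-style scoring of right-censored pairs.

    Each observation is a (time, event_observed) tuple. A pair is a win for
    the treated subject when the control subject's event was observed at an
    earlier time, a loss in the mirrored case, and uninformative otherwise.
    """
    wins = 0
    losses = 0
    for t_time, t_event in treatment:
        for c_time, c_event in control:
            if t_time > c_time and c_event:
                wins += 1    # control failed first, treated known to outlast it
            elif c_time > t_time and t_event:
                losses += 1  # treated failed first, control known to outlast it
            # Otherwise the earlier time is censored (or times are equal):
            # the pair is uninformative and contributes zero.
    return (wins - losses) / (len(treatment) * len(control))


# Illustrative values only (not data from the article).
treated = [(5, True), (8, False)]
controls = [(3, True), (6, True)]
nb = gehan_net_benefit(treated, controls)
```

Uninformative pairs stay in the denominator, so heavy censoring pulls this estimate toward zero; that is one reason the imputation and IPCW alternatives listed in the abstract exist.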
8.
J Biomed Inform ; 131: 104111, 2022 07.
Article in English | MEDLINE | ID: mdl-35671939

ABSTRACT

The Population Reference Interval (PRI) refers to the range of outcomes that are expected in a healthy population for a clinical or a diagnostic measurement. It is widely used in daily clinical practice and is essential for assisting clinical decision-making in diagnostics and treatment. In this manuscript, we start from the observation that each healthy individual has its own range for a given variable, depending on personal biological traits. This Individual Reference Interval (IRI) can be calculated and be utilised in clinical practice, in combination with the PRI for improved decision making. Nonparametric estimation of IRIs would require quite long time series. To circumvent this problem, we propose methods based on quantile models in combination with penalised parameter estimation methods that allow for information-sharing among the subjects. Our approach considers the calculation of an IRI as a prediction problem rather than an estimation problem. We perform a simulation study designed to benchmark the methods under different assumptions. From the simulation study we conclude that the new methods are robust and provide empirical coverages close to the nominal level. Finally, we evaluate the methods on real-life data consisting of eleven clinical tests and metabolomics measurements from the VITO IAM Frontier study.


Subjects
Clinical Decision-Making, Metabolomics, Computer Simulation, Humans, Reference Values
9.
Pharm Stat ; 21(2): 345-360, 2022 03.
Article in English | MEDLINE | ID: mdl-34608741

ABSTRACT

Combination therapies are increasingly adopted as the standard of care for various diseases to improve treatment response, minimise the development of resistance and/or minimise adverse events. Therefore, synergistic combinations are screened early in the drug discovery process, in which their potential is evaluated by comparing the observed combination effect to that expected under a null model. Such methodology is implemented in the BIGL R-package which allows for a quick screening of drug combinations. We extend the meanR and maxR tests from this package by allowing non-constant variance of the responses and by extending the list of null models (Loewe, Loewe2, HSA, Bliss). These new tests are evaluated in a comprehensive simulation study under various models for additivity and synergy, various monotherapeutic dose-response models (complete, partial and incomplete responders) and various types of deviation from the constant variance assumption. In addition, the BIGL package is extended with bootstrap confidence intervals for the individual off-axis points and for the overall synergy strength, which were demonstrated to have reliable coverage and can complement the existing tests. We conclude that the differences in performance between the different null models are small and depend on the simulation scenario. As a result, the choice of null model should be driven by expert knowledge on the particular problem. Finally, we demonstrate the new features of the BIGL package and the difference between the synergy models on a real dataset from drug discovery. The BIGL package is available at CRAN (https://CRAN.R-project.org/package=BIGL) and as a Shiny app (https://synergy.openanalytics.eu/app).


Subjects
Drug Discovery, Computer Simulation, Drug Combinations, Drug Discovery/methods, Drug Synergism, Humans
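
Two of the null models listed in this abstract have simple closed forms, which the following Python sketch illustrates. It assumes that effects are expressed as fractional inhibition between 0 and 1, and the function names are hypothetical. This is not the BIGL API: the package fits dose-response surfaces in R, and the Loewe models are omitted here because they require the full monotherapy dose-response curves.

```python
def bliss_expected(effect_a, effect_b):
    """Expected combined effect under Bliss independence.

    Effects are fractions in [0, 1]; the two drugs are assumed to act
    independently, so the unaffected fractions multiply.
    """
    return effect_a + effect_b - effect_a * effect_b


def hsa_expected(effect_a, effect_b):
    """Expected combined effect under the highest single agent (HSA) model."""
    return max(effect_a, effect_b)


def excess_over_null(observed, expected):
    """Positive values suggest synergy, negative values antagonism."""
    return observed - expected


# Illustrative values only (not data from the article).
mono_a, mono_b, observed_combo = 0.30, 0.40, 0.70
bliss_excess = excess_over_null(observed_combo, bliss_expected(mono_a, mono_b))
hsa_excess = excess_over_null(observed_combo, hsa_expected(mono_a, mono_b))
```

The same observed effect yields a smaller excess under Bliss than under HSA, which shows why the abstract recommends choosing the null model based on expert knowledge of the problem.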
10.
Brief Bioinform ; 20(1): 210-221, 2019 01 18.
Article in English | MEDLINE | ID: mdl-28968702

ABSTRACT

High-throughput sequencing technologies allow easy characterization of the human microbiome, but the statistical methods to analyze microbiome data are still in their infancy. Differential abundance methods aim at detecting associations between the abundances of bacterial species and subject grouping factors. The results of such methods are important to identify the microbiome as a prognostic or diagnostic biomarker or to demonstrate efficacy of prodrug or antibiotic drugs. Because of a lack of benchmarking studies in the microbiome field, no consensus exists on the performance of the statistical methods. We have compared a large number of popular methods through extensive parametric and nonparametric simulation as well as real data shuffling algorithms. The results are consistent over the different approaches and all point to an alarming excess of false discoveries. This raises great doubts about the reliability of discoveries in past studies and imperils reproducibility of microbiome experiments. To further improve method benchmarking, we introduce a new simulation tool that allows the generation of correlated count data following any univariate count distribution; the correlation structure may be inferred from real data. Most simulation studies discard the correlation between species, but our results indicate that this correlation can negatively affect the performance of statistical methods.


Subjects
Microbiota, Algorithms, Biodiversity, Computational Biology/methods, Computer Simulation, Genetic Databases/statistics & numerical data, High-Throughput Nucleotide Sequencing/statistics & numerical data, Humans, Microbiota/genetics, Nonparametric Statistics
11.
Bioinformatics ; 36(10): 3276-3278, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32065619

ABSTRACT

SUMMARY: SPsimSeq is a semi-parametric simulation method to generate bulk and single-cell RNA-sequencing data. It is designed to simulate gene expression data with maximal retention of the characteristics of real data. It is reasonably flexible to accommodate a wide range of experimental scenarios, including different sample sizes, biological signals (differential expression) and confounding batch effects. AVAILABILITY AND IMPLEMENTATION: The R package and associated documentation are available from https://github.com/CenterForStatistics-UGent/SPsimSeq. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA, Software, Base Sequence, RNA/genetics, RNA Sequence Analysis, Exome Sequencing
12.
Anal Biochem ; 626: 114217, 2021 08 01.
Article in English | MEDLINE | ID: mdl-33939972

ABSTRACT

Accurate tools to measure RNA integrity are essential to obtain reliable gene expression data. The reverse transcription quantitative PCR (RT-qPCR) based 3':5' assay permits a direct determination of messenger RNA (mRNA) integrity. However, the use of standard curves and the possible effect of PCR inhibitors make this method cumbersome and prone to variation, especially in small samples. Here we developed a triplex digital PCR (dPCR) 3':5' assay for assessing RNA integrity in equine samples as a rapid and simple alternative to RT-qPCR. This dPCR assay not only provides a straightforward analysis of the mRNA integrity, but also of its quantity.


Subjects
RNA Stability, Messenger RNA/chemistry, RNA/analysis, Animals, Horses, RNA/genetics, Messenger RNA/analysis, Messenger RNA/genetics, Real-Time Polymerase Chain Reaction
13.
BMC Bioinformatics ; 21(1): 58, 2020 Feb 17.
Article in English | MEDLINE | ID: mdl-32066370

ABSTRACT

BACKGROUND: To understand biology and differences among various tissues or cell types, one typically searches for molecular features that display characteristic abundance patterns. Several specificity metrics have been introduced to identify tissue-specific molecular features, but these either require an equal number of replicates per tissue or they can't handle replicates at all. RESULTS: We describe a non-parametric specificity score that is compatible with unequal sample group sizes. To demonstrate its usefulness, the specificity score was calculated on all GTEx samples, detecting known and novel tissue-specific genes. A webtool was developed to browse these results for genes or tissues of interest. An example python implementation of SPECS is available at https://github.com/celineeveraert/SPECS. The precalculated SPECS results on the GTEx data are available through a user-friendly browser at specs.cmgg.be. CONCLUSIONS: SPECS is a non-parametric method that identifies known and novel specific-expressed genes. In addition, SPECS could be adopted for other features and applications.


Subjects
Software, Gene Expression Profiling, Sample Size, RNA Sequence Analysis, Nonparametric Statistics
14.
BMC Genomics ; 21(1): 384, 2020 06 03.
Article in English | MEDLINE | ID: mdl-32493350

ABSTRACT

An amendment to this paper has been published and can be accessed via the original article.

15.
BMC Genomics ; 21(1): 312, 2020 Apr 19.
Article in English | MEDLINE | ID: mdl-32306892

ABSTRACT

BACKGROUND: In gene expression studies, RNA sample pooling is sometimes considered because of budget constraints or lack of sufficient input material. Using microarray technology, RNA sample pooling strategies have been reported to optimize both the cost of data generation as well as the statistical power for differential gene expression (DGE) analysis. For RNA sequencing, with its different quantitative output in terms of counts and tunable dynamic range, the adequacy and empirical validation of RNA sample pooling strategies have not yet been evaluated. In this study, we comprehensively assessed the utility of pooling strategies in RNA-seq experiments using empirical and simulated RNA-seq datasets. RESULT: The data generating model in pooled experiments is defined mathematically to evaluate the mean and variability of gene expression estimates. The model is further used to examine the trade-off between the statistical power of testing for DGE and the data generating costs. Empirical assessment of pooling strategies is done through analysis of RNA-seq datasets under various pooling and non-pooling experimental settings. Simulation study is also used to rank experimental scenarios with respect to the rate of false and true discoveries in DGE analysis. The results demonstrate that pooling strategies in RNA-seq studies can be both cost-effective and powerful when the number of pools, pool size and sequencing depth are optimally defined. CONCLUSION: For high within-group gene expression variability, small RNA sample pools are effective to reduce the variability and compensate for the loss of the number of replicates. Unlike the typical cost-saving strategies, such as reducing sequencing depth or number of RNA samples (replicates), an adequate pooling strategy is effective in maintaining the power of testing DGE for genes with low to medium abundance levels, along with a substantial reduction of the total cost of the experiment. 
In general, pooling RNA samples or pooling RNA samples in conjunction with moderate reduction of the sequencing depth can be good options to optimize the cost and maintain the power.


Subjects
RNA-Seq/economics, RNA-Seq/statistics & numerical data, Base Sequence, Computer Simulation, Costs and Cost Analysis, Gene Expression Profiling/methods, Research Design, Sample Size, Exome Sequencing
16.
Stat Med ; 38(8): 1484-1501, 2019 04 15.
Article in English | MEDLINE | ID: mdl-30609115

ABSTRACT

Semiparametric linear transformation models form a versatile class of regression models with the Cox proportional hazards model being the most well-known member. These models are well studied for right censored outcomes and are typically used in survival analysis. We consider transformation models as a tool for situations with uncensored continuous outcomes where linear regression is not appropriate. We introduce the probabilistic index as a uniform effect measure for the class of transformation models. We discuss and compare three estimators using a working Cox regression model: the partial likelihood estimator, an estimator based on binary generalized linear models and one based on probabilistic index model estimating equations. The latter has a superior performance in terms of bias and variance when the working model is misspecified. For the purpose of illustration, we analyze data that were collected at an urban alcohol and drug detoxification unit.


Subjects
Linear Models, Survival Analysis, Algorithms, Bias, Proportional Hazards Models
17.
Anal Chem ; 90(11): 6540-6547, 2018 06 05.
Article in English | MEDLINE | ID: mdl-29739189

ABSTRACT

Multiple simulation studies have shown that volume variability of partition sizes in digital PCR (dPCR) causes bias in the resulting concentration estimates and their associated standard errors. These biases are especially apparent when the volume variability is large, and the targeted nucleic acid concentration is high. Currently, only a single method for the elimination or reduction of these biases is available, and it assumes a fixed class of models for the volume variability. We show that the form in which volumetric variability occurs in empirical data is variable and cannot be modeled by a single distribution. We propose a new volume-modeling method, NPVolMod, which takes volume variability of an arbitrary form into account and is applicable to both absolute and relative quantification. The method is nonparametric in the sense that no distributional assumption is needed. Moreover, the volumes of each of the individual partitions are not needed. We empirically demonstrate by simulation that NPVolMod nearly eliminates the biases caused by volumetric variability and that it often outperforms the existing method. The possibility of the proper modeling of volume variability may have implications for platform design and may increase the performance of existing dPCR platforms in terms of, for example, their trueness and linear dynamic range.


Subjects
Genetic Models, Nucleic Acids/genetics, Polymerase Chain Reaction
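
The bias described in this abstract can be seen with the standard Poisson estimator, which assumes that all partitions have the same volume. The Python sketch below is an illustration under stated assumptions, not the NPVolMod method: the function names are hypothetical, the volumes are in arbitrary units, and the effect of unequal volumes is computed from expected fractions rather than from simulated partitions.

```python
import math


def poisson_concentration(n_positive, n_total, partition_volume):
    """Standard dPCR estimate: -ln(fraction of negative partitions) / volume."""
    negative_fraction = 1.0 - n_positive / n_total
    return -math.log(negative_fraction) / partition_volume


def expected_negative_fraction(concentration, volumes):
    """Expected share of empty partitions when partition volumes differ."""
    return sum(math.exp(-concentration * v) for v in volumes) / len(volumes)


def naive_estimate(true_concentration, volumes):
    """What the standard estimator returns, in expectation, if it plugs in
    the mean volume while the true partition volumes vary."""
    mean_volume = sum(volumes) / len(volumes)
    negatives = expected_negative_fraction(true_concentration, volumes)
    return -math.log(negatives) / mean_volume


# Illustrative values only: same mean volume, with and without variability.
equal = naive_estimate(2.0, [1.0, 1.0])
unequal = naive_estimate(2.0, [0.5, 1.5])
```

With equal volumes the estimator recovers the true concentration, whereas with variable volumes of the same mean it underestimates it, and the gap widens as concentration or variability grows. This matches the abstract's statement that the bias is most apparent at high concentration and large volume variability.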
18.
Acta Oncol ; 57(5): 604-612, 2018 May.
Article in English | MEDLINE | ID: mdl-29299946

ABSTRACT

INTRODUCTION: Evaluation of patient characteristics inducing toxicity in breast radiotherapy, using simultaneous modeling of multiple endpoints. METHODS AND MATERIALS: In 269 early-stage breast cancer patients treated with whole-breast irradiation (WBI) after breast-conserving surgery, toxicity was scored, based on five dichotomized endpoints. Five logistic regression models were fitted, one for each endpoint and the effect sizes of all variables were estimated using maximum likelihood (MLE). The MLEs are improved with James-Stein estimates (JSEs). The method combines all the MLEs, obtained for the same variable but from different endpoints. Misclassification errors were computed using MLE- and JSE-based prediction models. For associations, p-values from the sum of squares of MLEs were compared with p-values from the Standardized Total Average Toxicity (STAT) Score. RESULTS: With JSEs, 19 highest ranked variables were predictive of the five different endpoints. Important variables increasing radiation-induced toxicity were chemotherapy, age, SATB2 rs2881208 SNP and nodal irradiation. Treatment position (prone position) was most protective and ranked eighth. Overall, the misclassification errors were 45% and 34% for the MLE- and JSE-based models, respectively. p-Values from the sum of squares of MLEs and p-values from STAT score led to very similar conclusions, except for the variables nodal irradiation and treatment position, for which STAT p-values suggested an association with radiosensitivity, whereas p-values from the sum of squares indicated no association. Breast volume was ranked as the most significant variable in both strategies. DISCUSSION: The James-Stein estimator was used for selecting variables that are predictive for multiple toxicity endpoints. With this estimator, 19 variables were predictive for all toxicities of which four were significantly associated with overall radiosensitivity. 
JSEs led to almost 25% reduction in the misclassification error rate compared to conventional MLEs. Finally, patient characteristics that are associated with radiosensitivity were identified without explicitly quantifying radiosensitivity.


Subjects
Breast Neoplasms/radiotherapy, Statistical Models, Radiation Tolerance, Radiotherapy/adverse effects, Female, Humans, Radiotherapy/methods
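
The idea of combining estimates of the same variable across endpoints can be sketched with a textbook positive-part James-Stein estimator that shrinks each estimate toward their common mean. This Python sketch assumes a known, equal sampling variance for all estimates and a hypothetical function name; the estimator actually used in the article, which works with maximum likelihood estimates from five logistic regression models, may differ in its details.

```python
def james_stein_toward_mean(estimates, variance):
    """Positive-part James-Stein shrinkage of estimates toward their mean.

    Assumes every estimate has the same known sampling variance. Shrinking
    toward the mean requires at least four estimates.
    """
    p = len(estimates)
    if p < 4:
        raise ValueError("shrinkage toward the mean needs at least 4 estimates")
    mean = sum(estimates) / p
    spread = sum((y - mean) ** 2 for y in estimates)
    if spread == 0.0:
        return list(estimates)
    # Shrinkage factor, truncated at zero (the "positive-part" variant).
    factor = max(0.0, 1.0 - (p - 3) * variance / spread)
    return [mean + factor * (y - mean) for y in estimates]


# Illustrative values only: one variable's effect size on five endpoints.
mle_effects = [1.0, 2.0, 3.0, 4.0, 5.0]
jse_effects = james_stein_toward_mean(mle_effects, variance=1.0)
```

Noisier estimates (larger `variance` relative to the spread) are pulled more strongly toward the shared mean, which is how information is borrowed across endpoints.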
19.
Anal Bioanal Chem ; 410(23): 5731-5739, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29961092

ABSTRACT

The experimental design that will be carried out to evaluate a nucleic acid quantification hypothesis determines the cost and feasibility of digital polymerase chain reaction (digital PCR) studies. Experiment design involves the calculation of the number of technical measurement replicates and the determination of the characteristics of those replicates, in accordance with the capabilities of the available digital PCR platform. Available digital PCR power analyses suffer from one or more of the following limitations: narrow scope, unrealistic assumptions, insufficient detail for replication, lack of source code and user-friendly software. Here, we discuss the nature of six parameters that affect the statistical power, i.e., desired effect size, total number of partitions, fraction of positive partitions, number of replicate measurements, between-replicate variance, and significance level. We also show to what extent these parameters affect power, and argue that careful design of experiments is needed to achieve the desired power. A web tool, dPowerCalcR, that allows interactive calculation of statistical power and optimization of the experimental design is available.


Subjects
Polymerase Chain Reaction/methods, Algorithms, Statistical Data Interpretation, Statistical Models, Research Design, Software
20.
Eur Respir J ; 50(6)2017 12.
Article in English | MEDLINE | ID: mdl-29269578

ABSTRACT

Malignant pleural mesothelioma (MPM) is predominantly caused by asbestos exposure and has a poor prognosis. Breath contains volatile organic compounds (VOCs) and can be explored as an early detection tool. Previously, we used multicapillary column/ion mobility spectrometry (MCC/IMS) to discriminate between patients with MPM and asymptomatic high-risk persons with a high rate of accuracy. Here, we aim to validate these findings in different control groups. Breath and background samples were obtained from 52 patients with MPM, 52 healthy controls without asbestos exposure (HC), 59 asymptomatic former asbestos workers (AEx), 41 patients with benign asbestos-related diseases (ARD), 70 patients with benign non-asbestos-related lung diseases (BLD) and 56 patients with lung cancer (LC). After background correction, logistic lasso regression and receiver operating characteristic (ROC) analysis, the MPM group was discriminated from the HC, AEx, ARD, BLD and LC groups with 65%, 88%, 82%, 80% and 72% accuracy, respectively. Combining AEx and ARD patients resulted in 94% sensitivity and 96% negative predictive value (NPV). The most important VOCs selected were P1, P3, P7, P9, P21 and P26. We discriminated MPM patients from at-risk subjects with great accuracy. The high sensitivity and NPV allow breath analysis to be used as a screening tool for ruling out MPM.


Subjects
Breath Tests, Lung Neoplasms/diagnosis, Mesothelioma/diagnosis, Pleural Neoplasms/diagnosis, Adult, Aged, Asbestos/adverse effects, Belgium, Case-Control Studies, Cross-Sectional Studies, Exhalation, Female, Humans, Logistic Models, Male, Middle Aged, ROC Curve, Volatile Organic Compounds/analysis
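
The two screening metrics quoted in this abstract follow directly from a 2x2 classification table, as the short Python sketch below shows. The counts are hypothetical and chosen only to reproduce the reported percentages; they are not the study's confusion matrix, which the abstract does not give.

```python
def sensitivity(true_positives, false_negatives):
    """Share of diseased subjects that the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def negative_predictive_value(true_negatives, false_negatives):
    """Share of negative test results that are truly disease-free."""
    return true_negatives / (true_negatives + false_negatives)


# Hypothetical counts for illustration only (not taken from the study).
tp, fn, tn = 47, 3, 72
sens = sensitivity(tp, fn)
npv = negative_predictive_value(tn, fn)
```

A rule-out test needs both values to be high, because every false negative lowers the sensitivity and the negative predictive value at the same time.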