Probability binning comparison: a metric for quantitating univariate distribution differences.

Roederer, M; Treister, A; Moore, W; Herzenberg, L A

Affiliation

Roederer M; Vaccine Research Center, NIH, Bethesda, Maryland 20892-3015, USA. Roederer@drmr.com

Cytometry ; 45(1): 37-46, 2001 Sep 01.

Article in English | MEDLINE | ID: mdl-11598945

ABSTRACT

BACKGROUND:

Comparing distributions of data is an important goal in many applications. For example, determining whether two samples (e.g., a control and test sample) are statistically significantly different is useful to detect a response, or to provide feedback regarding instrument stability by detecting when collected data varies significantly over time.

METHODS:

We apply a variant of the chi-squared statistic to comparing univariate distributions. In this variant, a control distribution is divided such that an equal number of events fall into each of the divisions, or bins. This approach is thereby a mini-max algorithm, in that it minimizes the maximum expected variance for the control distribution. The control-derived bins are then applied to test sample distributions, and a normalized chi-squared value is computed. We term this algorithm Probability Binning.
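The binning step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and because the abstract does not give the exact normalization, the sketch assumes the statistic is the sum over bins of (c_i - s_i)^2 / (c_i + s_i), where c_i and s_i are the fractions of control and test events in bin i.

```python
import bisect


def control_bin_edges(control, n_bins):
    """Interior cut points that place (roughly) equal numbers of control events in each bin."""
    ordered = sorted(control)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def bin_fractions(sample, edges):
    """Fraction of the sample's events that fall into each control-derived bin."""
    counts = [0] * (len(edges) + 1)
    for x in sample:
        counts[bisect.bisect_right(edges, x)] += 1
    return [c / len(sample) for c in counts]


def normalized_chi2(control, test, n_bins=10):
    """Assumed normalization: chi-squared computed on bin fractions rather than raw counts."""
    edges = control_bin_edges(control, n_bins)
    c = bin_fractions(control, edges)
    s = bin_fractions(test, edges)
    return sum((ci - si) ** 2 / (ci + si) for ci, si in zip(c, s) if ci + si > 0)
```

Because the cut points are taken from the sorted control events, narrow bins arise where the control data are dense and wide bins where they are sparse, so every bin carries about the same expected count. Working with fractions rather than counts is one way to keep the value comparable when the control and test samples contain different numbers of events.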

RESULTS:

Using a Monte-Carlo simulation, we determined the distribution of chi-squared values obtained by comparing sets of events derived from the same distribution. Based on this distribution, we derive a conversion of any given chi-squared value into a metric that is analogous to a t-score, i.e., it can be used to estimate the probability that a test distribution is different from a control distribution. We demonstrate that this metric scales with the difference between two distributions, and can be used to rank samples according to similarity to a control. Finally, we demonstrate the applicability of this metric to ranking immunophenotyping distributions to suggest that it indeed can be used to objectively determine the relative distance of distributions compared to a single control.
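The Monte Carlo step can be illustrated with a short sketch. The abstract does not state the conversion formula, so the code below makes two assumptions: the statistic is the fraction-based chi-squared sum of (c_i - s_i)^2 / (c_i + s_i) over control-derived equal-frequency bins, and the t-score-like metric is an empirical standardization against the simulated null distribution, floored at zero. All function names and parameters are hypothetical.

```python
import bisect
import random
import statistics


def pb_chi2(control, test, n_bins):
    """Assumed statistic: chi-squared on bin fractions, with equal-frequency bins from the control."""
    ordered = sorted(control)
    n = len(ordered)
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]

    def fractions(sample):
        counts = [0] * n_bins
        for x in sample:
            counts[bisect.bisect_right(edges, x)] += 1
        return [c / len(sample) for c in counts]

    c, s = fractions(control), fractions(test)
    return sum((ci - si) ** 2 / (ci + si) for ci, si in zip(c, s) if ci + si > 0)


def null_distribution(draw, n_events, n_bins, n_trials, rng):
    """Chi-squared values for pairs of samples drawn from the SAME distribution."""
    values = []
    for _ in range(n_trials):
        a = [draw(rng) for _ in range(n_events)]
        b = [draw(rng) for _ in range(n_events)]
        values.append(pb_chi2(a, b, n_bins))
    return values


def standardized_score(chi2, null_values):
    """Empirical z-like score: how many null standard deviations chi2 lies above the null mean."""
    mu = statistics.mean(null_values)
    sd = statistics.stdev(null_values)
    return max(0.0, (chi2 - mu) / sd)


if __name__ == "__main__":
    rng = random.Random(1)
    null = null_distribution(lambda r: r.gauss(0.0, 1.0), 500, 10, 200, rng)
    control = [rng.gauss(0.0, 1.0) for _ in range(500)]
    for shift in (0.0, 0.5, 1.0):
        test = [rng.gauss(shift, 1.0) for _ in range(500)]
        print(shift, standardized_score(pb_chi2(control, test, 10), null))
```

In this sketch, test samples drawn further from the control should receive larger scores, which is the property that allows samples to be ranked by similarity to a single control. The null distribution depends on the number of bins and events, so it would need to be re-simulated whenever either changes.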

CONCLUSION:

Probability Binning, as shown here, provides a useful metric for determining the probability that two or more flow cytometric data distributions are different. This metric can also be used to rank distributions to identify which are most similar or dissimilar. In addition, the algorithm can be used to quantitate contamination of even highly-overlapping populations. Finally, as demonstrated in an accompanying paper, Probability Binning can be used to gate on events that represent significantly different subsets from a control sample. Published 2001 Wiley-Liss, Inc.

Subjects

Algorithms; Chi-Square Distribution; Flow Cytometry/methods; HIV Infections/blood; Humans; Immunophenotyping; Lymphocytes/immunology; Monocytes/immunology; Monte Carlo Method; Probability

Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Chi-Square Distribution / Flow Cytometry | Study type: Health_economic_evaluation / Prognostic_studies | Limit: Humans | Language: En | Journal: Cytometry | Publication year: 2001 | Document type: Article | Affiliation country: United States | Publication country: United States