ABSTRACT
The analysis of large amounts of data is important for the development of machine learning (ML) models. flowSim is the first algorithm designed to visualize, detect and remove highly redundant information in flow cytometry (FCM) training sets to decrease the computational time for training and increase the performance of ML algorithms by reducing overfitting. flowSim performs near duplicate image detection by combining community detection algorithms with the density analysis of the marker expression values. flowSim clustering compared to consensus manual clustering on a dataset composed of 160 images of bivariate FCM data had a mean Adjusted Rand Index of 0.90, demonstrating its efficiency in identifying similar patterns. flowSim selectively discarded near duplicate files in datasets constructed with known redundancy, and removed 92.6% of FCM images in a dataset of over 500,000 drawn from public repositories.
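The Adjusted Rand Index quoted above measures agreement between two clusterings of the same items, corrected for chance. The following is a minimal sketch of the standard formula, not code from flowSim; the function name and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    # Contingency counts: how many items fall in each (cluster_a, cluster_b) pair.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of item pairs grouped together in both, in A only, and in B only.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    total = comb(n, 2)

    expected = sum_a * sum_b / total if total else 0.0
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every item in one cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical example: manual consensus labels vs. algorithm labels for six images.
manual = [0, 0, 0, 1, 1, 2]
automatic = [1, 1, 1, 0, 0, 2]
score = adjusted_rand_index(manual, automatic)
```

Cluster names do not need to match between the two labelings; only the grouping of items matters, so the example above scores as perfect agreement.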
Subject(s)
Algorithms, Machine Learning, Flow Cytometry/methods, Cluster Analysis
ABSTRACT
We introduce a new cell population score called SpecEnr (specific enrichment) and describe a method that discovers robust and accurate candidate biomarkers from flow cytometry data. Our approach identifies a new class of candidate biomarkers we define as driver cell populations, whose abundance is associated with a sample class (e.g., disease), but not as a result of a change in a related population. We show that the driver cell populations we find are also easily interpretable using a lattice-based visualization tool. Our method is implemented in the R package flowGraph, freely available on GitHub (github.com/aya49/flowGraph) and on BioConductor.
Subject(s)
Software, Biomarkers, Flow Cytometry/methods
ABSTRACT
These guidelines are a consensus work of a considerable number of members of the immunology and flow cytometry community. They provide the theory and key practical aspects of flow cytometry enabling immunologists to avoid the common errors that often undermine immunological data. Notably, there are comprehensive sections of all major immune cell types with helpful Tables detailing phenotypes in murine and human cells. The latest flow cytometry techniques and applications are also described, featuring examples of the data that can be generated and, importantly, how the data can be analysed. Furthermore, there are sections detailing tips, tricks and pitfalls to avoid, all written and peer-reviewed by leading experts in the field, making this an essential research companion.
Subject(s)
Allergy and Immunology/standards, Cell Separation/methods, Cell Separation/standards, Flow Cytometry/methods, Flow Cytometry/standards, Consensus, Humans, Phenotype
ABSTRACT
Diffuse large B-cell lymphoma (DLBCL) is the most common histologic subtype of non-Hodgkin lymphoma and is notorious for its clinical heterogeneity. Patient outcomes can be predicted by cell-of-origin (COO) classification, demonstrating that the underlying transcriptional signature of malignant B-cells informs biological behavior in the context of standard combination chemotherapy regimens. In the current study, we used mass cytometry (CyTOF) to examine tumor phenotypes at the protein level with single cell resolution in a collection of 27 diagnostic DLBCL biopsy specimens from treatment naïve patients. We found that malignant B-cells from each patient occupied unique regions in 37-dimensional phenotypic space with no apparent clustering of samples into discrete subtypes. Interestingly, variable MHC class II expression was found to be the greatest contributor to phenotypic diversity. Within individual tumors, a subset of cases showed multiple phenotypic subpopulations, and in one case, we were able to demonstrate direct correspondence between protein-level phenotypic subsets and DNA mutation-defined subclones. In summary, CyTOF analysis can resolve both intertumoral and intratumoral heterogeneity among primary samples and reveals that each case of DLBCL is unique and may be comprised of multiple, genetically distinct subclones. © 2019 International Society for Advancement of Cytometry.
Subject(s)
Diffuse Large B-Cell Lymphoma, Humans, Diffuse Large B-Cell Lymphoma/genetics, Mutation
ABSTRACT
Defining responses of the structural and immune cells in biologic systems is critically important to understanding disease states and responses to injury. This requires accurate and sensitive methods to define cell types in organ systems. The principal method to delineate the cell populations involved in these processes is flow cytometry. Although researchers increasingly use flow cytometry, technical challenges can affect its accuracy and reproducibility, thus significantly limiting scientific advancements. This challenge is particularly critical to lung immunology, as the lung is readily accessible and therefore used in preclinical and clinical studies to define potential therapeutics. Given the importance of flow cytometry in pulmonary research, the American Thoracic Society convened a working group to highlight issues and technical challenges to the performance of high-quality pulmonary flow cytometry, with a goal of improving its quality and reproducibility.
Subject(s)
Flow Cytometry/methods, Flow Cytometry/standards, Lung Diseases/diagnosis, Lung Diseases/genetics, Lung/cytology, Animals, Apoptosis, Cell Separation, Congresses as Topic, Humans, Lung/immunology, Lung/pathology, Myeloid Cells/cytology, Phenotype, Practice Guidelines as Topic, Reproducibility of Results, Medical Societies, United States
ABSTRACT
Motivation: Droplet digital PCR (ddPCR) is an emerging technology for quantifying DNA. By partitioning the target DNA into ~20 000 droplets, each serving as its own PCR reaction compartment, a very high sensitivity of DNA quantification can be achieved. However, manual analysis of the data is time consuming and algorithms for automated analysis of non-orthogonal, multiplexed ddPCR data are unavailable, presenting a major bottleneck for the advancement of ddPCR transitioning from low-throughput to high-throughput. Results: ddPCRclust is an R package for automated analysis of data from Bio-Rad's droplet digital PCR systems (QX100 and QX200). It can automatically analyze and visualize multiplexed ddPCR experiments with up to four targets per reaction. Results are on par with manual analysis, but only take minutes to compute instead of hours. The accompanying Shiny app ddPCRvis provides easy access to the functionalities of ddPCRclust through a web-browser based GUI. Availability and implementation: R package: https://github.com/bgbrink/ddPCRclust; Interface: https://github.com/bgbrink/ddPCRvis/; Web: https://bibiserv.cebitec.uni-bielefeld.de/ddPCRvis/. Supplementary information: Supplementary data are available at Bioinformatics online.
Subject(s)
Computational Biology/methods, DNA/analysis, Polymerase Chain Reaction/methods, Software, Algorithms
ABSTRACT
Automated reagent preparation, sample processing, and data acquisition have increased the rate at which flow cytometry data can be generated. Furthermore, advances in technology and flow cytometry instrumentation continually increase the complexity and dimensionality of this data. Together, this leads to increased pressure on manual data analysis, which has inherent limitations including subjectivity of the analyst and the length of time needed for data processing. These issues can create bottlenecks in the data processing workflow and potentially compromise data quality. To address these issues, as well as the challenges associated with manual gating in a high-volume human immune profiling laboratory, we sought to implement an automated analysis pipeline. In this report, we discuss considerations for selecting an automated analysis method, the process of implementing an automated pipeline, and detail our successful incorporation of an automated gating strategy with flowDensity into our analysis workflow. This validated pipeline augments our laboratory's ability to provide rapid high-throughput immune profiling for patients participating in cancer immunotherapy clinical trials. © International Society for Advancement of Cytometry.
Subject(s)
Laboratory Automation/methods, Flow Cytometry/methods, Statistical Data Interpretation, Humans
Subject(s)
Biomarkers/analysis, Computational Biology/methods, Flow Cytometry/statistics & numerical data, Single-Cell Analysis/methods, Software, Computational Biology/instrumentation, Flow Cytometry/instrumentation, Humans, Immunity, Lymphocytes/immunology, Single-Cell Analysis/instrumentation
ABSTRACT
The rapid expansion of flow cytometry applications has outpaced the functionality of traditional manual analysis tools used to interpret flow cytometry data. Scientists are faced with the daunting prospect of manually identifying interesting cell populations in 50-dimensional datasets, equalling the complexity previously only reached in mass cytometry. Data can no longer be analyzed or interpreted fully by manual approaches. While automated gating has been the focus of intense efforts, there are many significant additional steps to the analytical pipeline (e.g., cleaning the raw files, event outlier detection, extracting immunophenotypes). We review the components of a customized automated analysis pipeline that can be generally applied to large scale flow cytometry data. We demonstrate these methodologies on data collected by the International Mouse Phenotyping Consortium (IMPC).
Subject(s)
Computational Biology, Flow Cytometry/methods, Immunophenotyping/methods, Algorithms, Animals, Flow Cytometry/statistics & numerical data, Humans, Immunophenotyping/statistics & numerical data, Mice, Software
ABSTRACT
We demonstrate improved methods for making valid and accurate comparisons of fluorescence measurement capabilities among instruments tested at different sites and times. We designed a suite of measurements and automated data processing methods to obtain consistent objective results and applied them to a selection of 23 instruments at nine sites to provide a range of instruments as well as multiple instances of similar instruments. As far as we know, this study represents the most accurate methods and results so far demonstrated for this purpose. The first component of the study, reporting improved methods for photoelectron scale (Spe) evaluations, was published previously (Parks, El Khettabi, Chase, Hoffman, Perfetto, Spidlen, Wood, Moore, and Brinkman: Cytometry A 91 (2017) 232-249). Those results are not, in themselves, sufficient for instrument comparisons, so here we use the Spe scale results for the 23 cytometers and combine them with additional information from the analysis suite to obtain the metrics actually needed for instrument evaluations and comparisons. We adopted what we call the 2+2SD limit of resolution as a maximally informative metric for evaluating and comparing dye measurement sensitivity among different instruments and measurement channels. Our results demonstrate substantial differences among different classes of instruments in both dye response and detection sensitivity and some surprisingly large differences among similar instruments, even among instruments with nominally identical configurations. On some instruments, we detected defective measurement channels needing service. The system can be applied in shared resource laboratories and other facilities as an aspect of quality assurance, and accurate instrument comparisons can be valuable for selecting instruments for particular purposes and for making informed instrument acquisition decisions. An institutionally supported program could serve the cytometry community by facilitating access to materials and analysis and by maintaining an archive of results. © 2018 International Society for Advancement of Cytometry.
Subject(s)
Flow Cytometry/instrumentation, Flow Cytometry/methods, Calibration, Humans
ABSTRACT
We developed a fully automated procedure for analyzing data from LED pulses and multilevel bead sets to evaluate backgrounds and photoelectron scales of cytometer fluorescence channels. The method improves on previous formulations by fitting a full quadratic model with appropriate weighting and by providing standard errors and peak residuals as well as the fitted parameters themselves. Here we describe the details of the methods and procedures involved and present a set of illustrations and test cases that demonstrate the consistency and reliability of the results. The automated analysis and fitting procedure is generally quite successful in providing good estimates of the Spe (statistical photoelectron) scales and backgrounds for all the fluorescence channels on instruments with good linearity. The precision of the results obtained from LED data is almost always better than that from multilevel bead data, but the bead procedure is easy to carry out and provides results good enough for most purposes. Including standard errors on the fitted parameters is important for understanding the uncertainty in the values of interest. The weighted residuals give information about how well the data fits the model, and particularly high residuals indicate bad data points. Known photoelectron scales and measurement channel backgrounds make it possible to estimate the precision of measurements at different signal levels and the effects of compensated spectral overlap on measurement quality. Combining this information with measurements of standard samples carrying dyes of biological interest, we can make accurate comparisons of dye sensitivity among different instruments. Our method is freely available through the R/Bioconductor package flowQB. © 2017 International Society for Advancement of Cytometry.
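To make the "full quadratic model with appropriate weighting" concrete, here is a minimal sketch of a weighted least-squares fit of y ≈ c0 + c1·x + c2·x². It is not the flowQB implementation: the function name is hypothetical, the reading of x as mean signal and y as signal variance is an assumption, and the standard errors and residuals described in the abstract are omitted.

```python
def weighted_quadratic_fit(x, y, w):
    """Fit y ~ c0 + c1*x + c2*x**2 by weighted least squares.

    w[i] is the weight of point i (for example 1 / variance of y[i]).
    Returns (c0, c1, c2).
    """
    # Build the 3x3 normal equations (A^T W A) c = A^T W y, with rows [1, x, x^2].
    m = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for xi, yi, wi in zip(x, y, w):
        basis = (1.0, xi, xi * xi)
        for r in range(3):
            rhs[r] += wi * basis[r] * yi
            for c in range(3):
                m[r][c] += wi * basis[r] * basis[c]

    # Solve by Gaussian elimination with partial pivoting.
    aug = [m[r] + [rhs[r]] for r in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution.
    coeffs = [0.0] * 3
    for r in (2, 1, 0):
        tail = sum(aug[r][c] * coeffs[c] for c in range(r + 1, 3))
        coeffs[r] = (aug[r][3] - tail) / aug[r][r]
    return tuple(coeffs)


# Hypothetical noise-free data generated from known coefficients.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 0.5 * v + 0.01 * v * v for v in xs]
c0, c1, c2 = weighted_quadratic_fit(xs, ys, [1.0, 2.0, 1.0, 0.5, 1.0])
```

Weights let well-measured levels dominate the fit; with noise-free input the known coefficients are recovered regardless of the weights chosen.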
Subject(s)
Flow Cytometry/methods, Theoretical Models, Optical Imaging/methods, Calibration, Flow Cytometry/statistics & numerical data, Least-Squares Analysis
ABSTRACT
SUMMARY: flowDensity facilitates reproducible, high-throughput analysis of flow cytometry data by automating a predefined manual gating approach. The algorithm is based on a sequential bivariate gating approach that generates a set of predefined cell populations. It chooses the best cut-off for individual markers using characteristics of the density distribution. The Supplementary Material is linked to the online version of the manuscript. AVAILABILITY AND IMPLEMENTATION: R source code freely available through BioConductor (http://master.bioconductor.org/packages/devel/bioc/html/flowDensity.html.). Data available from FlowRepository.org (dataset FR-FCM-ZZBW). CONTACT: rbrinkman@bccrc.ca SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
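One way to picture a density-based cut-off for a single marker is to place the threshold at the lowest-density point between the two tallest peaks. The sketch below illustrates that idea only; it is not the flowDensity algorithm or its R API, and the function name, the histogram smoothing, and the toy data are all assumptions.

```python
def density_valley_cutoff(values, bins=50, smooth=2):
    """Return a cut-off at the lowest-density point between the two tallest peaks.

    Returns None when the smoothed histogram has fewer than two interior peaks.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return None
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1

    # Moving-average smoothing as a crude stand-in for a kernel density estimate.
    dens = []
    for i in range(bins):
        window = counts[max(0, i - smooth): i + smooth + 1]
        dens.append(sum(window) / len(window))

    # Interior local maxima of the smoothed histogram.
    peaks = [i for i in range(1, bins - 1)
             if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]]
    if len(peaks) < 2:
        return None

    # Two tallest peaks, ordered left to right, then the first minimum between them.
    left, right = sorted(sorted(peaks, key=lambda i: dens[i])[-2:])
    valley = min(range(left, right + 1), key=lambda i: dens[i])
    return lo + (valley + 0.5) * width


# Hypothetical bimodal marker: triangular modes centred at 20 and 60.
marker = []
for centre in (20, 60):
    for k in range(-10, 11):
        marker.extend([float(centre + k)] * (11 - abs(k)))
cutoff = density_valley_cutoff(marker, bins=60)
```

Applied marker by marker, a threshold like this splits events into negative and positive populations, which is the building block of a sequential bivariate gating strategy.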
Subject(s)
Algorithms, Cell Physiological Phenomena, Computational Biology/methods, Flow Cytometry/methods, Software, Biomarkers, Cluster Analysis, Factual Databases, Humans
ABSTRACT
MOTIVATION: Deep profiling the phenotypic landscape of tissues using high-throughput flow cytometry (FCM) can provide important new insights into the interplay of cells in both healthy and diseased tissue. But often, especially in clinical settings, the cytometer cannot measure all the desired markers in a single aliquot. In these cases, tissue is separated into independently analysed samples, leaving a need to electronically recombine these to increase dimensionality. Nearest-neighbour (NN) based imputation fulfils this need but can produce artificial subpopulations. Clustering-based NNs can reduce these, but requires prior domain knowledge to be able to parameterize the clustering, so is unsuited to discovery settings. RESULTS: We present flowBin, a parameterization-free method for combining multitube FCM data into a higher-dimensional form suitable for deep profiling and discovery. FlowBin allocates cells to bins defined by the common markers across tubes in a multitube experiment, then computes aggregate expression for each bin within each tube, to create a matrix of expression of all markers assayed in each tube. We show, using simulated multitube data, that flowType analysis of flowBin output reproduces the results of that same analysis on the original data for cell types of >10% abundance. We used flowBin in conjunction with classifiers to distinguish normal from cancerous cells. We used flowBin together with flowType and RchyOptimyx to profile the immunophenotypic landscape of NPM1-mutated acute myeloid leukemia, and present a series of novel cell types associated with that mutation.
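The binning step described above can be sketched as follows: cells are assigned to bins using only the markers shared by every tube, and each tube then contributes an aggregate value per bin for its own markers. This is an illustration under stated assumptions, not the flowBin implementation; the fixed-width grid binning, the median as the aggregate, and all names and marker values are hypothetical.

```python
from statistics import median


def bin_key(cell, common_markers, bin_width):
    """Grid-bin index of a cell, computed from the common markers only."""
    return tuple(int(cell[m] // bin_width) for m in common_markers)


def combine_tubes(tubes, common_markers, bin_width=1.0):
    """Return {bin: {marker: median expression}} merged across tubes.

    tubes: list of tubes; each tube is a list of cells (dict marker -> value).
    """
    combined = {}
    for tube in tubes:
        per_bin = {}
        for cell in tube:
            key = bin_key(cell, common_markers, bin_width)
            for marker, value in cell.items():
                if marker not in common_markers:
                    per_bin.setdefault(key, {}).setdefault(marker, []).append(value)
        # Aggregate this tube's markers within each bin.
        for key, markers in per_bin.items():
            row = combined.setdefault(key, {})
            for marker, vals in markers.items():
                row[marker] = median(vals)
    return combined


# Hypothetical two-tube experiment sharing CD45; CD3 and CD19 are tube-specific.
tube_a = [{"CD45": 0.5, "CD3": 1.0}, {"CD45": 0.7, "CD3": 3.0}, {"CD45": 2.5, "CD3": 10.0}]
tube_b = [{"CD45": 0.2, "CD19": 4.0}, {"CD45": 2.1, "CD19": 6.0}]
matrix = combine_tubes([tube_a, tube_b], common_markers=["CD45"])
```

Each row of the result describes one bin with every marker from every tube, which is the higher-dimensional form that downstream analysis can consume.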
Subject(s)
Tumor Biomarkers/genetics, Flow Cytometry/methods, Acute Myeloid Leukemia/genetics, Mononuclear Leukocytes/metabolism, Mutation/genetics, Software, Case-Control Studies, Cell Lineage, Cell Separation, Humans, Immunophenotyping, Acute Myeloid Leukemia/pathology, Mononuclear Leukocytes/cytology, Nuclear Proteins/genetics, Nucleophosmin
ABSTRACT
MOTIVATION: Finding one or more cell populations of interest, such as those correlating to a specific disease, is critical when analysing flow cytometry data. However, labelling of cell populations is not well defined, making it difficult to integrate the output of algorithms to external knowledge sources. RESULTS: We developed flowCL, a software package that performs semantic labelling of cell populations based on their surface markers and applied it to labelling of the Federation of Clinical Immunology Societies Human Immunology Project Consortium lyoplate populations as a use case. CONCLUSION: By providing automated labelling of cell populations based on their immunophenotype, flowCL allows for unambiguous and reproducible identification of standardized cell types. AVAILABILITY AND IMPLEMENTATION: Code, R script and documentation are available under the Artistic 2.0 license through Bioconductor (http://www.bioconductor.org/packages/devel/bioc/html/flowCL.html). CONTACT: rbrinkman@bccrc.ca SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Algorithms, Cell Physiological Phenomena, Flow Cytometry/methods, Gene Ontology, Immunophenotyping/methods, Software, Humans, Leukocyte Common Antigens/analysis, CCR7 Receptors/analysis
ABSTRACT
Modern flow cytometry systems can be coupled to plate readers for high-throughput acquisition. These systems allow hundreds of samples to be analyzed in a single day. Quality control of the data remains challenging, however, and is further complicated when a large number of parameters is measured in an experiment. Our examination of 29,228 publicly available FCS files from laboratories worldwide indicates 13.7% have a fluorescence anomaly. In particular, fluorescence measurements for a sample over the collection time may not remain stable due to fluctuations in fluid dynamics; the impact of instabilities may differ between samples and among parameters. Therefore, we hypothesized that tracking cell populations (which represent a summary of all parameters) in centered log ratio space would provide a sensitive and consistent method of quality control. Here, we present flowClean, an algorithm to track subset frequency changes within a sample during acquisition, and flag time periods with fluorescence perturbations leading to the emergence of false populations. Aberrant time periods are reported as a new parameter and added to a revised data file, allowing users to easily review and exclude those events from further analysis. We apply this method to proof-of-concept datasets and also to a subset of data from a recent vaccine trial. The algorithm flags events that are suspicious by visual inspection, as well as those showing more subtle effects that might not be consistently flagged by investigators reviewing the data manually, and out-performs the current state-of-the-art. flowClean is available as an R package on Bioconductor, as a module on the free-to-use GenePattern web server, and as a plugin for FlowJo X. © 2016 International Society for Advancement of Cytometry.
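The centered log ratio (CLR) space mentioned above is a standard transform for compositional data such as population frequencies that sum to one. A minimal sketch follows; it is not taken from flowClean, and the pseudocount used to guard against empty populations, the function name, and the example frequencies are assumptions.

```python
from math import log


def centered_log_ratio(proportions, pseudocount=1e-6):
    """CLR transform: log of each part minus the mean log over all parts."""
    logs = [log(p + pseudocount) for p in proportions]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical population frequencies in two consecutive time slices of one file.
stable_slice = centered_log_ratio([0.50, 0.30, 0.20])
shifted_slice = centered_log_ratio([0.20, 0.30, 0.50])
# Per-population change between the slices, measured in CLR space.
change = [abs(a - b) for a, b in zip(stable_slice, shifted_slice)]
```

Because CLR values always sum to zero, a shift in one population is expressed relative to all others, so large changes between time slices can point to acquisition periods worth reviewing.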
Subject(s)
Algorithms, Flow Cytometry/standards, Cell Tracking/instrumentation, Cell Tracking/methods, Datasets as Topic, Fluorescence, Humans, Quality Control
ABSTRACT
The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of computational methods for identifying cell populations in multidimensional flow cytometry data. Here we report the results of FlowCAP-IV where algorithms from seven different research groups predicted the time to progression to AIDS among a cohort of 384 HIV+ subjects, using antigen-stimulated peripheral blood mononuclear cell (PBMC) samples analyzed with a 14-color staining panel. Two approaches (FlowReMi.1 and flowDensity-flowType-RchyOptimyx) provided statistically significant predictive value in the blinded test set. Manual validation of submitted results indicated that unbiased analysis of single cell phenotypes could reveal unexpected cell types that correlated with outcomes of interest in high dimensional flow cytometry datasets.
Subject(s)
Acquired Immunodeficiency Syndrome/pathology, Benchmarking, Computational Biology/methods, Disease Progression, Flow Cytometry/methods, T-Lymphocytes/cytology, Acquired Immunodeficiency Syndrome/diagnosis, Algorithms, Statistical Data Interpretation, HIV Seropositivity, Humans, Staining and Labeling
ABSTRACT
Previous studies demonstrated that imatinib mesylate (IM) induces autophagy in chronic myeloid leukemia (CML) and that this process is critical to cell survival upon therapy. However, it is not known if the autophagic process differs at basal levels between CML patients and healthy individuals and if pretreatment CML cells harbor unique autophagy characteristics that could predict patients' clinical outcomes. We now demonstrate that several key autophagy genes are differentially expressed in CD34(+) hematopoietic stem/progenitor cells, with the highest transcript levels detected for ATG4B, and that the transcript and protein expression levels of ATG4 family members, ATG5 and BECLIN-1 are significantly increased in CD34(+) cells from chronic-phase CML patients (P < .05). Importantly, ATG4B is differentially expressed in pretreatment CML stem/progenitor cells from subsequent IM responders vs IM nonresponders (P < .05). Knockdown of ATG4B suppresses autophagy, impairs the survival of CML stem/progenitor cells and sensitizes them to IM treatment. Moreover, deregulated expression of ATG4B in CD34(+) CML cells inversely correlates with transcript levels of miR-34a, and ATG4B is shown to be a direct target of miR-34a. This study identifies ATG4B as a potential biomarker for predicting therapeutic response in treatment-naïve CML stem/progenitor cells and uncovers ATG4B as a possible drug target in these cells.
Subject(s)
Pharmacological Biomarkers, Tumor Biomarkers/metabolism, Cysteine Endopeptidases/metabolism, BCR-ABL Positive Chronic Myelogenous Leukemia/diagnosis, Neoplastic Stem Cells/metabolism, Adult, CD34 Antigens/metabolism, Autophagy/genetics, Autophagy-Related Proteins, Pharmacological Biomarkers/metabolism, Cultured Cells, Humans, K562 Cells, BCR-ABL Positive Chronic Myelogenous Leukemia/pathology, BCR-ABL Positive Chronic Myelogenous Leukemia/therapy, Molecular Targeted Therapy, Neoplastic Stem Cells/pathology, Prognosis, Treatment Outcome
ABSTRACT
We present a significantly improved version of the flowType and RchyOptimyx BioConductor-based pipeline that is both 14 times faster and can accommodate multiple levels of biomarker expression for up to 96 markers. With these improvements, the pipeline is positioned to be an integral part of data analysis for high-throughput experiments on high-dimensional single-cell assay platforms, including flow cytometry, mass cytometry and single-cell RT-qPCR.
Subject(s)
Flow Cytometry/methods, CD Antigens/analysis, Biomarkers/analysis, Software
ABSTRACT
Identifying homogeneous sets of cell populations in flow cytometry is an important process for sorting and selecting populations of interest for further data acquisition and analysis. Many computational methods are now available to automate this process, with several algorithms partitioning cells based on high-dimensional separation rather than the traditional pairwise two-dimensional visualization approach of manual gating. ISAC's classification results file format was developed to exchange the results of both manual gating and algorithmic classification approaches in a standardized way based on per-event classifications, including the potential for soft classifications expressed as the probability of an event being a member of a class. © 2014 International Society for Advancement of Cytometry.