ABSTRACT
A key challenge in understanding subcellular organization is obtaining interpretable, quantitative measurements of intracellular structures with complex multi-piece morphologies in an objective, robust and generalizable manner. Here we introduce a morphology-appropriate representation learning framework that uses 3D rotation-invariant autoencoders and point clouds. This framework is used to learn representations of complex multi-piece morphologies that are independent of orientation, compact, and easy to interpret. We apply our framework to intracellular structures with punctate morphologies (e.g. DNA replication foci) and polymorphic morphologies (e.g. nucleoli). We systematically compare our framework to image-based autoencoders across several intracellular structure datasets, including a synthetic dataset with pre-defined rules of organization. We explore the trade-offs in the performance of different models by performing multi-metric benchmarking across efficiency, generative capability, and representation expressivity metrics. We find that our framework, which embraces the underlying morphology of multi-piece structures, facilitates the unsupervised discovery of sub-clusters for each structure. We show how our approach can also be applied to phenotypic profiling using a dataset of nucleolar images following drug perturbations. We implement and provide all representation learning models using CytoDL, a Python package for flexible and configurable deep learning experiments.
ABSTRACT
Understanding how a subset of expressed genes dictates cellular phenotype is a considerable challenge owing to the large numbers of molecules involved, their combinatorics and the plethora of cellular behaviours that they determine [1,2]. Here we reduced this complexity by focusing on cellular organization (a key readout and driver of cell behaviour [3,4]) at the level of major cellular structures that represent distinct organelles and functional machines, and generated the WTC-11 hiPSC Single-Cell Image Dataset v1, which contains more than 200,000 live cells in 3D, spanning 25 key cellular structures. The scale and quality of this dataset permitted the creation of a generalizable analysis framework to convert raw image data of cells and their structures into dimensionally reduced, quantitative measurements that can be interpreted by humans, and to facilitate data exploration. This framework embraces the vast cell-to-cell variability that is observed within a normal population, facilitates the integration of cell-by-cell structural data and allows quantitative analyses of distinct, separable aspects of organization within and across different cell populations. We found that the integrated intracellular organization of interphase cells was robust to the wide range of variation in cell shape in the population; that the average locations of some structures became polarized in cells at the edges of colonies while maintaining the 'wiring' of their interactions with other structures; and that, by contrast, changes in the location of structures during early mitotic reorganization were accompanied by changes in their wiring.