ABSTRACT
Deep learning excels at cryo-tomographic image restoration and segmentation tasks but is hindered by a lack of training data. Here we introduce cryo-TomoSim (CTS), a MATLAB-based software package that builds coarse-grained models of macromolecular complexes embedded in vitreous ice and then simulates transmitted electron tilt series for tomographic reconstruction. We then demonstrate the effectiveness of these simulated datasets in training different deep learning models for use on real cryo-tomographic reconstructions. Computer-generated ground truth datasets provide the means for training models with voxel-level precision, allowing for unprecedented denoising and precise molecular segmentation of datasets. By modeling phenomena such as a three-dimensional contrast transfer function, probabilistic detection events, and radiation-induced damage, the simulated cryo-electron tomograms can cover a large range of imaging content and conditions to optimize training sets. When paired with small amounts of training data from real tomograms, networks become highly accurate at segmenting in situ macromolecular assemblies across a wide range of biological contexts.