ABSTRACT
Hi-C is a widely applied chromosome conformation capture (3C)-based technique that has produced a large number of genomic contact maps at high sequencing depth for a wide range of cell types, enabling comprehensive analyses of the relationships between biological functions (e.g. gene regulation and expression) and three-dimensional genome structure. Comparative analyses, which compare Hi-C contact maps with one another, play a significant role in Hi-C studies: they evaluate the consistency of replicate Hi-C experiments (i.e. reproducibility measurement) and detect statistically differential interacting regions with biological significance (i.e. differential chromatin interaction detection). However, because Hi-C contact maps are complex and hierarchical, it remains challenging to conduct systematic and reliable comparative analyses of Hi-C data. Here, we propose sslHiC, a contrastive self-supervised representation learning framework that precisely models the multi-level features of chromosome conformation and automatically produces informative feature embeddings for genomic loci and their interactions to facilitate comparative analyses of Hi-C contact maps. Comprehensive computational experiments on both simulated and real datasets demonstrated that our method consistently outperformed state-of-the-art baseline methods in providing reliable measurements of reproducibility and in detecting biologically meaningful differential interactions.
Subjects
Chromatin , Chromosomes , Reproducibility of Results , Chromatin/genetics , Chromosomes/genetics , Genomics/methods , Supervised Machine Learning
ABSTRACT
Analytical performance specifications (APS) are usually compared to the intermediate reproducibility uncertainty of measuring a particular measurand using a single in vitro diagnostic medical device (IVD MD). In healthcare systems that combine multiple laboratories, each with several IVD MDs, and that care for patients with long-term disease conditions, samples from a given patient are analyzed using a few IVD MDs, sometimes from different manufacturers, but rarely all IVD MDs in the healthcare system. The reproducibility uncertainty for results of a measurand measured within a healthcare system, and the components of this measurement uncertainty, are useful in strategies to minimize bias and overall measurement uncertainty within the healthcare system. The root mean square deviation (RMSD), calculated as the sample standard deviation (SD) and relative SD, includes both imprecision and bias and is appropriate for expressing such uncertainties. Results from commutable stabilized internal and external control samples, from split natural patient samples, or from big-data techniques are essential in monitoring bias and measurement uncertainties in healthcare systems. Variance component analysis (VCA) can be employed to quantify the relative contributions of the most influential factors causing measurement uncertainty. Such results represent invaluable information for minimizing measurement uncertainty in the interest of the healthcare system's patients.
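The two quantities named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the use of a single reference value, and the balanced one-way random-effects design (one factor, the device, with equal replicates per device) are all assumptions made for illustration. The first part shows that the squared RMSD about a reference value equals squared bias plus the variance of the results (population variance is used here so the identity holds exactly); the second estimates between-device and within-device variance components from ANOVA mean squares.

```python
import math
from statistics import mean, pvariance


def rmsd(results, reference):
    """Root mean square deviation of results from a reference value."""
    return math.sqrt(mean((x - reference) ** 2 for x in results))


def bias_and_imprecision(results, reference):
    """Return (bias, SD) such that rmsd**2 == bias**2 + SD**2.

    SD is the population standard deviation (an assumption of this sketch).
    """
    bias = mean(results) - reference
    return bias, math.sqrt(pvariance(results))


def variance_components(groups):
    """One-way random-effects estimates for a balanced design.

    `groups` maps a hypothetical device label to equally many replicate
    results. Returns (between-device variance, within-device variance).
    """
    data = list(groups.values())
    k, n = len(data), len(data[0])
    if k < 2 or n < 2 or any(len(g) != n for g in data):
        raise ValueError("need at least 2 groups of equal size >= 2")
    group_means = [mean(g) for g in data]
    grand_mean = mean(group_means)
    # Mean square within groups: pooled replicate variance.
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(data, group_means) for x in g
    ) / (k * (n - 1))
    # Mean square between groups: spread of group means, scaled by n.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    # Negative estimates are truncated to zero, a common convention.
    return max(0.0, (ms_between - ms_within) / n), ms_within
```

Calling `variance_components` with replicate results keyed by device gives the share of total variance attributable to differences between devices versus replicate imprecision. A real healthcare-system analysis would typically involve more factors (laboratory, manufacturer, reagent lot) and unbalanced data, which this sketch does not handle.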