ABSTRACT
Simulations of deoxyribonucleic acid (DNA) damage that use an atomic-level geometric model rely on a traversal algorithm, which is time-consuming, converges slowly, and requires high-performance computing resources. This work therefore presents an algorithm based on density-based spatial clustering of applications with noise (DBSCAN) that operates on the spatial distributions of energy depositions and hydroxyl radicals (·OH). Combined with probabilistic and statistical methods, the algorithm quickly estimates DNA strand break yields and helps to study the variation pattern of clustered DNA damage. First, we used Geant4-DNA, a Monte Carlo simulation toolkit for radiobiology, to simulate the transport of protons and secondary particles through the nucleus, together with the ionization and excitation of water molecules, and obtained the distributions of energy depositions and hydroxyl radicals. We then applied damage probability functions to obtain the spatial distribution of DNA damage points in a simplified geometric model. The DBSCAN algorithm, which clusters by damage point density, was used to determine the single-strand break (SSB) and double-strand break (DSB) yields. Finally, we analyzed how the strand break yields vary with particle linear energy transfer (LET) and summarized the variation pattern of damage clusters. The simulation results show that the new algorithm runs faster than the traversal algorithm while maintaining good precision, and that its results are consistent with other experiments and simulations. This work provides more precise information on clustered DNA damage induced by proton radiation at the molecular level at high speed, and thus offers an essential and powerful method for studying the mechanisms of radiation-induced biological damage.
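
The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dbscan` and `count_breaks`, the tuple layout `(x, y, z, strand)`, the parameter values `eps=3.4` (nm, roughly the commonly used 10 base-pair separation) and `min_pts=2`, and the rule that a cluster touching both strands counts as one DSB while any other cluster or isolated point counts as one SSB are all assumptions made for illustration.

```python
import math
from collections import deque


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: return a cluster id (0, 1, ...) per point, or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Brute-force neighborhood query; the point itself is included.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep growing
    return labels


def count_breaks(damage_points, eps=3.4, min_pts=2):
    """Count (SSB, DSB) from damage points given as (x, y, z, strand) tuples.

    Assumed rule (hypothetical, for illustration): a cluster with damage on
    both strands is one DSB; a single-strand cluster or a noise point is one SSB.
    """
    coords = [p[:3] for p in damage_points]
    labels = dbscan(coords, eps, min_pts)
    strands_by_cluster = {}
    ssb = 0
    for (_, _, _, strand), label in zip(damage_points, labels):
        if label == -1:
            ssb += 1
        else:
            strands_by_cluster.setdefault(label, set()).add(strand)
    dsb = 0
    for strands in strands_by_cluster.values():
        if len(strands) == 2:
            dsb += 1
        else:
            ssb += 1
    return ssb, dsb


# Toy data in nm: one two-strand cluster, one single-strand cluster, one isolated point.
toy_points = [
    (0.0, 0.0, 0.0, 1), (1.0, 0.0, 0.0, 2),
    (20.0, 0.0, 0.0, 1), (21.0, 0.0, 0.0, 1),
    (50.0, 0.0, 0.0, 2),
]
print(count_breaks(toy_points))  # (2, 1)
```

The neighborhood query here is brute force, which is fine for a toy dataset; for the large point sets produced by a track-structure simulation, a spatial index would normally replace it.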