ABSTRACT
In most countries, a government agency or collaborating organization gathers information on occupational accidents. Comparisons based on a single factor, such as autonomous community or activity sector, often lead to contradictory conclusions. Using this information for comparison is not straightforward, because the different characteristics considered give rise to different possible comparisons. This work addresses the construction of a single baseline for each set of characteristics. The proposed method was developed from the data available in Spain but could be applied to other cases. It consists of: (1) selecting factors (here age, sex, autonomous community and activity); (2) generating a synthetic population from survey data and general proportions by applying Optimal Representative Sample Weighting (rsw); and (3) predicting the accident ratio for each set of characteristics with an XGBoost ensemble of decision trees. The results confirm the appropriateness of the method.
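As an illustration of step (2), the sketch below reweights a small survey so that its weighted proportions match known population proportions. It is not the Optimal Representative Sample Weighting procedure itself: it uses raking (iterative proportional fitting), a simpler and well-known technique, as a stand-in. The function names, the survey records and the target proportions are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def rake(records, targets, iterations=50):
    """Raking (iterative proportional fitting), a stand-in for RSW.

    Rescales one weight per record until the weighted share of every
    level of every factor matches the target proportions.
    """
    weights = [1.0] * len(records)
    for _ in range(iterations):
        for factor, shares in targets.items():
            # Current weighted total for each level of this factor.
            totals = defaultdict(float)
            for rec, w in zip(records, weights):
                totals[rec[factor]] += w
            grand = sum(totals.values())
            # Scale each record so its level reaches the target share.
            for i, rec in enumerate(records):
                level = rec[factor]
                weights[i] *= shares[level] * grand / totals[level]
    total = sum(weights)
    return [w / total for w in weights]


def weighted_share(records, weights, factor, level):
    """Weighted proportion of records having the given level of a factor."""
    return sum(w for rec, w in zip(records, weights) if rec[factor] == level)


# Hypothetical survey respondents (illustrative data, not from the study).
survey = [
    {"sex": "F", "age": "16-34"},
    {"sex": "F", "age": "35+"},
    {"sex": "M", "age": "16-34"},
    {"sex": "M", "age": "35+"},
    {"sex": "M", "age": "35+"},
]

# Hypothetical general proportions for the whole working population.
targets = {
    "sex": {"F": 0.45, "M": 0.55},
    "age": {"16-34": 0.30, "35+": 0.70},
}

weights = rake(survey, targets)
for rec, w in zip(survey, weights):
    print(rec, round(w, 4))
```

The reweighted survey can then play the role of a synthetic population: each record stands for a share of workers with that combination of characteristics, which is the kind of input a predictive model in step (3) would need.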