ABSTRACT
Read-across is a well-established data-gap filling technique used within analogue or category approaches. Acceptance remains an issue, mainly because residual uncertainties associated with a read-across prediction are difficult to address and because assessments are expert-driven. Frameworks to develop, assess and document read-across may help reduce variability in read-across results. Data-driven read-across approaches such as Generalised Read-Across (GenRA) include quantification of uncertainties and performance. GenRA also affords opportunities to systematically incorporate New Approach Method (NAM) data to support the read-across hypothesis. Herein, a systematic investigation of the differences between expert-driven and data-driven read-across approaches was pursued with a view to establishing scientific confidence in the use of read-across. A dataset of expert-driven read-across assessments was compiled from registration data disseminated in the public International Uniform Chemical Information Database (IUCLID) (version 6) of Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) Study Results. A dataset of ~5000 read-across cases pertaining to repeated dose and developmental toxicity was extracted and mapped to content within EPA's Distributed Structure Searchable Toxicity database (DSSTox) to retrieve chemical name and structural identification information. Content could be mapped for ~3600 cases which, when filtered for unique cases with curated quantitative structure-activity relationship (QSAR)-ready SMILES, resulted in 389 target-source analogue pairs. The similarity between target and source analogues was evaluated in different contexts, ranging from structural similarity using chemical fingerprints to metabolic similarity using predicted metabolic information.
An attempt was also made to quantify the relative contribution of each similarity context to the selection of target-source analogue pairs by deriving a model that predicted known analogue pairs. Finally, point of departure (POD) values were predicted using the GenRA approach, underpinned by data extracted from the EPA's Toxicity Values Database (ToxValDB). The GenRA-predicted PODs were compared with those reported within the REACH dossiers themselves. This study offers generalisable insights into how read-across is already applied in regulatory submissions and into expectations on the levels of similarity necessary to make decisions.
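
To make the idea of a similarity-weighted read-across prediction concrete, the following is a minimal illustrative sketch. It is not the GenRA implementation and does not reproduce the study's workflow. It assumes that fingerprints are represented as sets of "on" bit indices, that similarity is measured with the Tanimoto (Jaccard) index, and that the POD is predicted as a similarity-weighted mean of log10 POD values over the k most similar source analogues. The function names, the value of k and the example data are all hypothetical.

```python
import math


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def read_across_pod(target_bits, sources, k=2):
    """Similarity-weighted mean of log10 POD over the k most similar sources.

    sources: list of (fingerprint_bits, pod) tuples, with pod > 0
             (e.g. in mg/kg/day; units are an assumption for illustration).
    Returns the predicted POD in the same units, or None when none of the
    selected sources has any similarity to the target.
    """
    # Score every source analogue against the target and keep the top k.
    scored = sorted(
        ((tanimoto(target_bits, bits), pod) for bits, pod in sources),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total_weight = sum(sim for sim, _ in scored)
    if total_weight == 0:
        return None
    # Average in log10 space, since POD values span orders of magnitude.
    log_pod = sum(sim * math.log10(pod) for sim, pod in scored) / total_weight
    return 10 ** log_pod


# Hypothetical target and source analogues (bit sets and PODs are made up).
target = {1, 2, 3, 4}
sources = [
    ({1, 2, 3, 5}, 100.0),  # closest analogue
    ({1, 2, 6, 7}, 10.0),   # weaker analogue
    ({8, 9}, 1.0),          # dissimilar, excluded when k=2
]
predicted = read_across_pod(target, sources, k=2)
```

Averaging in log space is a design choice made here because POD values typically span several orders of magnitude. In practice, the set-based fingerprints would be replaced by fingerprints generated with a cheminformatics toolkit, and the structural similarity could be combined with other similarity contexts such as metabolic similarity.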