ABSTRACT
By conducting hierarchical clustering along a sliding window, we generated haplotypes across hundreds of re-sequenced genomes in a few hours. We used this method to define cryptic introgressions underlying disease resistance in tomato (Solanum lycopersicum L.) and to discover resistant germplasm in the tomato seed bank. The genomes of 9 accessions with resistance to early blight (Alternaria linariae) were newly sequenced and analyzed together with published sequences for 770 tomato and wild species accessions, most of which are available in germplasm collections. Identifying common ancestral haplotypes among resistant germplasm enabled rapid fine mapping of recently discovered quantitative trait loci (QTL) conferring resistance and the identification of possible causal variants. The source of the early blight QTL EB-9 was traced to a vintage tomato named 'Devon Surprise'. Another QTL, EB-5, as well as resistance to bacterial spot disease (Xanthomonas spp.), was traced to Hawaii 7998. A genomic survey of all accessions predicted EB-9-derived resistance in several heirloom tomatoes, accessions of S. lycopersicum var. cerasiforme, and S. pimpinellifolium PI 37009. These haplotype-based predictions were validated by screening the accessions against the causal pathogen. We found little evidence of EB-5 in the contemporary germplasm surveyed, which presents an opportunity to bolster tomato disease resistance by adding this rare locus. Our work demonstrates the practical insights that can be derived from efficient processing of large genome-scale datasets, including rapid functional prediction of disease resistance QTL in diverse genetic backgrounds. It also identifies more efficient ways to leverage public genetic resources for crop improvement.