ABSTRACT
Around a third of stroke survivors suffer from acquired language disorders (aphasia), but current medicine cannot predict whether or when they might recover. Prognostic research in this area increasingly draws on datasets associating structural brain imaging data with outcome scores for ever-larger samples of stroke patients. The aim is to learn brain-behaviour trends from these data, and to generalize those trends to predict outcomes for new patients. The practical significance of this work depends on the expected breadth of that generalization. Here, we show that these models can generalize across countries and native languages (from British patients tested in English to Chilean patients tested in Spanish), across neuroimaging technology (from MRI to CT), and from scans collected months or years after stroke for research purposes to scans collected days or weeks after stroke for clinical purposes. Our results suggest that one important confound in generalizing from research data to clinical data is the delay between scan acquisition and language assessment. This delay is typically small for research data, where scans and assessments are often acquired contemporaneously. But the most natural clinical application of these predictions will employ acute prognostic factors to predict much longer-term outcomes. We mitigated this confound by projecting the clinical patients' lesions from the time when their scans were acquired to the time when their language abilities were assessed; with this projection in place, there was strong evidence that prognoses derived from research data generalized equally well to research and clinical data. These results encourage attention to the confounding role that lesion growth may play in other types of lesion-symptom analysis.