ABSTRACT
OBJECTIVE: To systematically evaluate which lesion-based imaging features and methods allow for the best statistical prediction of poststroke deficits across independent datasets. METHODS: We utilized imaging and clinical data from three independent datasets of patients experiencing acute stroke (N1 = 109, N2 = 638, N3 = 794) to statistically predict acute stroke severity (NIHSS) based on lesion volume, lesion location, and structural and functional disconnection with the lesion location using normative connectomes. RESULTS: We found that prediction models trained on small single-center datasets could perform well using within-dataset cross-validation, but results did not generalize to independent datasets (median R² for N1 = 0.2%). Performance across independent datasets improved using large single-center training data (R² for N2 = 15.8%) and improved further using multicenter training data (R² for N3 = 24.4%). These results were consistent across lesion attributes and prediction models. Including either structural or functional disconnection in the models outperformed prediction based on volume or location alone (P < 0.001, FDR-corrected). INTERPRETATION: We conclude that (1) prediction performance in independent datasets of patients with acute stroke cannot be inferred from cross-validated results within a dataset, as performance results obtained via these two methods differed consistently, (2) prediction performance can be improved by training on large and, importantly, multicenter datasets, and (3) structural and functional disconnection allow for improved prediction of acute stroke severity.
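The distinction between within-dataset cross-validation and evaluation on an independent dataset can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the model is ordinary least squares rather than any model from the study, and the helper names (`cross_validated_r2`, `external_r2`, `make_dataset`) are hypothetical. The two synthetic datasets are deliberately given different feature-outcome relationships, so the numbers say nothing about real stroke data.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r2(y_true, y_pred):
    """Explained variance; negative when worse than predicting the mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def cross_validated_r2(X, y, k=5, seed=0):
    """Median R2 over k folds drawn from a single dataset."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(X)), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        coef = fit_linear(X[train_idx], y[train_idx])
        scores.append(r2(y[test_idx], predict(coef, X[test_idx])))
    return float(np.median(scores))

def external_r2(X_train, y_train, X_ext, y_ext):
    """R2 on an independent dataset after training on all of another one."""
    coef = fit_linear(X_train, y_train)
    return r2(y_ext, predict(coef, X_ext))

def make_dataset(n, weights, rng):
    """Synthetic stand-in: Gaussian features, linear outcome, unit noise."""
    X = rng.normal(size=(n, len(weights)))
    y = X @ np.asarray(weights) + rng.normal(size=n)
    return X, y

rng = np.random.default_rng(0)
# Assumed dataset shift: the two "centers" have different true weights.
X_a, y_a = make_dataset(109, [1.0, -0.5, 0.8, 0.0, 0.3], rng)
X_b, y_b = make_dataset(638, [0.2, -0.5, 0.0, 0.6, 0.3], rng)

cv_score = cross_validated_r2(X_a, y_a)
ext_score = external_r2(X_a, y_a, X_b, y_b)
print(f"within-dataset CV R2:   {cv_score:.3f}")
print(f"independent-dataset R2: {ext_score:.3f}")
```

The cross-validated score only measures how well the model fits the relationship present in its own dataset, so it cannot reveal a shift that appears only in the independent data.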
ABSTRACT
Deep learning has allowed for remarkable progress in many medical scenarios. Deep learning prediction models often require 10⁵ to 10⁷ examples. It is currently unknown whether deep learning can also enhance predictions of symptoms post-stroke in real-world samples of stroke patients, which are often several orders of magnitude smaller. Such stroke outcome predictions, however, could be particularly instrumental in guiding acute clinical and rehabilitation care decisions. Here, we compared the capacities of classically used linear and novel deep learning algorithms in their prediction of stroke severity. Our analyses relied on a total of 1430 patients assembled from the MRI-Genetics Interface Exploration collaboration and a Massachusetts General Hospital-based study. The outcome of interest was National Institutes of Health Stroke Scale-based stroke severity in the acute phase after ischaemic stroke onset, which we predicted by means of MRI-derived lesion location. We automatically derived lesion segmentations from diffusion-weighted clinical MRI scans, performed spatial normalization and included a principal component analysis step, retaining 95% of the variance of the original data. We then repeatedly split the data into training, validation and test sets. To investigate the effects of sample size, we subsampled the training set to 100, 300 and 900 patients and, for each sample size, trained regularized linear regression and an eight-layered neural network to predict the stroke severity score. We selected hyperparameters on the validation set. We evaluated model performance based on the explained variance (R²) in the test set. While linear regression performed significantly better for a sample size of 100 patients, deep learning started to significantly outperform linear regression when trained on 900 patients.
Average prediction performance improved by ~20% when increasing the sample size 9× [maximum for 100 patients: 0.279 ± 0.005 (R², 95% confidence interval), 900 patients: 0.337 ± 0.006]. In summary, for sample sizes of 900 patients, deep learning showed a higher prediction performance than typically employed linear methods. These findings suggest the existence of non-linear relationships between lesion location and stroke severity that can be utilized for an improved prediction performance for larger sample sizes.
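The evaluation protocol described in this abstract (PCA retaining 95% of the variance, training-set subsampling, hyperparameter selection on a validation set, R² on a test set) can be sketched as follows for the linear baseline only. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, closed-form ridge regression stands in for the unspecified regularized linear regression, the split sizes of 200 validation and 200 test patients are invented, PCA is fitted on the training subsample only, and the eight-layered neural network is omitted. The names `pca_fit`, `ridge_fit` and `run_once` are hypothetical.

```python
import numpy as np

def pca_fit(X, var_kept=0.95):
    """Return (mean, components) with enough PCs to explain `var_kept` of the variance."""
    mean = X.mean(axis=0)
    # Squared singular values of the centered data are proportional to PC variances.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
    return mean, vt[:k]

def ridge_fit(Z, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc = Z - z_mean
    w = np.linalg.solve(Zc.T @ Zc + alpha * np.eye(Z.shape[1]), Zc.T @ (y - y_mean))
    return w, y_mean - z_mean @ w

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def run_once(X, y, n_train, rng, alphas=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """One random split: subsample training data, tune alpha on validation, score on test."""
    idx = rng.permutation(len(X))
    test, val, pool = idx[:200], idx[200:400], idx[400:]  # assumed split sizes
    train = pool[:n_train]
    mean, comps = pca_fit(X[train])  # assumption: PCA fitted on training data only
    Z_train, Z_val, Z_test = ((X[part] - mean) @ comps.T for part in (train, val, test))
    best = None
    for alpha in alphas:
        w, b = ridge_fit(Z_train, y[train], alpha)
        score = r2(y[val], Z_val @ w + b)
        if best is None or score > best[0]:
            best = (score, w, b)
    _, w, b = best
    return r2(y[test], Z_test @ w + b), comps.shape[0]

# Synthetic stand-in for 1430 patients with 50 lesion-derived features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1430, 10))
X = latent @ rng.normal(size=(10, 50)) + 0.1 * rng.normal(size=(1430, 50))
y = latent @ rng.normal(size=10) + 0.5 * rng.normal(size=1430)

results = {n: run_once(X, y, n, rng) for n in (100, 300, 900)}
for n, (test_r2, n_components) in results.items():
    print(f"n_train={n}: test R2={test_r2:.3f}, PCs kept={n_components}")
```

In the study the split is repeated many times and the scores are averaged; calling `run_once` in a loop with the same generator would mimic that.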
ABSTRACT
Photoswitchable molecules display two or more isomeric forms that may be accessed using light. Separating the electronic absorption bands of these isomers is key to selectively addressing a specific isomer and achieving high photostationary states, whilst overall red-shifting of the absorption bands serves to limit material damage due to UV exposure and increases penetration depth in photopharmacological applications. Engineering these properties into a system through synthetic design, however, remains a challenge. Here, we present a data-driven discovery pipeline for molecular photoswitches underpinned by dataset curation and multitask learning with Gaussian processes. In the prediction of electronic transition wavelengths, we demonstrate that a multioutput Gaussian process (MOGP) trained using labels from four photoswitch transition wavelengths yields the strongest predictive performance relative to single-task models and operationally outperforms time-dependent density functional theory (TD-DFT) in terms of the wall-clock time for prediction. We validate our proposed approach experimentally by screening a library of commercially available photoswitchable molecules. Through this screen, we identified several motifs that displayed separated electronic absorption bands of their isomers, exhibited red-shifted absorptions, and are suited for information transfer and photopharmacological applications. Our curated dataset, code, and all models are available at https://github.com/Ryan-Rhys/The-Photoswitch-Dataset.
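To make the multitask idea concrete, the sketch below shows how a multioutput Gaussian process can share information between outputs. It is only an illustration under assumptions that do not come from the abstract: it uses the intrinsic coregionalization model (one common MOGP construction), an RBF kernel on a toy one-dimensional input instead of molecular descriptors, two tasks instead of four, and fixed rather than learned hyperparameters. The function names and the task-covariance matrices are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

def icm_posterior_mean(X, tasks, y, X_new, tasks_new, B, lengthscale=1.0, noise=1e-2):
    """Posterior mean of a zero-mean GP with covariance B[t, t'] * k(x, x').

    `tasks` and `tasks_new` hold the integer task index of each observation,
    so different inputs may carry labels for different tasks.
    """
    K = B[np.ix_(tasks, tasks)] * rbf_kernel(X, X, lengthscale)
    K = K + noise * np.eye(len(X))
    K_star = B[np.ix_(tasks_new, tasks)] * rbf_kernel(X_new, X, lengthscale)
    return K_star @ np.linalg.solve(K, y)

# Toy data: task 0 is observed at six inputs, task 1 has no labels at all.
X = np.arange(6.0).reshape(-1, 1)
tasks = np.zeros(6, dtype=int)
y = np.sin(X[:, 0])

X_new = np.array([[1.5]])
tasks_new = np.array([1])  # ask for a prediction of the unlabeled task

B_corr = np.array([[1.0, 0.9], [0.9, 1.0]])  # assumed strong task correlation
B_indep = np.eye(2)                          # independent single-task models

pred_corr = icm_posterior_mean(X, tasks, y, X_new, tasks_new, B_corr)
pred_indep = icm_posterior_mean(X, tasks, y, X_new, tasks_new, B_indep)
print("correlated tasks: ", pred_corr)
print("independent tasks:", pred_indep)
```

With independent tasks, the prediction for the unlabeled task falls back to the prior mean of zero, whereas a correlated task covariance lets the labels of the other task inform it. In practice the task covariance and kernel hyperparameters would be learned from data rather than fixed.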
ABSTRACT
The task of predicting human motion is complicated by the natural heterogeneity and compositionality of actions, necessitating robustness to distributional shifts as far as out-of-distribution (OoD). Here, we formulate a new OoD benchmark based on the Human3.6M and Carnegie Mellon University (CMU) motion capture datasets, and introduce a hybrid framework for hardening discriminative architectures against OoD failure by augmenting them with a generative model. When applied to current state-of-the-art discriminative models, we show that the proposed approach improves OoD robustness without sacrificing in-distribution performance, and can theoretically facilitate model interpretability. We suggest that human motion predictors ought to be constructed with OoD challenges in mind, and provide an extensible general framework for hardening diverse discriminative architectures against extreme distributional shift. The code is available at: https://github.com/bouracha/OoDMotion.