ABSTRACT
In recent years, machine learning approaches have been successfully applied to the analysis of patient symptom data for disease diagnosis, at least where such data is well codified. However, much of the data present in Electronic Health Records (EHRs) is unlikely to prove suitable for classic machine learning approaches. In particular, the use of free (or unstructured) text for clinical notes presents significant analytical opportunities, but also unique difficulties. Furthermore, the wide dispersal of health data relating to individuals necessitates the development of decentralized solutions. In this paper, we provide an overview of our approach to developing a neural network framework for patient classification in an EHR environment where data may be heterogeneous, incomplete (containing missing values), and noisy. We describe our system, which predicts outlier cases that are likely to relate to frequent attender patients and achieves an Area-Under-the-Curve score of up to 0.92.