Relating Biological and Clinical Features of Alzheimer's Patients With Predictive Clustering Trees

Martin Breskvar1,2 martin.breskvar@ijs.si
Bernard Ženko1 bernard.zenko@ijs.si
Sašo Džeroski1,2 saso.dzeroski@ijs.si

1Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia
2Jožef Stefan Postgraduate School, Jamova cesta 39, 1000 Ljubljana, Slovenia

ABSTRACT
This paper presents experiments with Predictive Clustering Trees that uncover several subpopulations of Alzheimer's disease patients. Our experiments are based on previous research that identified everyday cognition as one of the most important testing domains in the clinical diagnostic process for Alzheimer's disease. We investigate which biological features play a role in the progression of the disease by observing the behavioral responses of the patients and their study partners. Our dataset includes 342 male and 317 female patients from the ADNI database, described with 243 clinical and biological attributes. The resulting clusters, described in terms of biological features, show behavioral and gender-specific differences between clusters of patients with progressed disease. These findings suggest that Alzheimer's disease may manifest through different biological pathways.

1. INTRODUCTION
Alzheimer's disease (AD) is a form of dementia that accounts for a large portion of all dementia cases. It is a neurodegenerative disease affecting many aspects of the patient's life, including physical, psychological and social wellbeing. This inevitably leads to a severe decrease in quality of life. Currently about 47.5 million people worldwide suffer from dementia,1 and its incidence is expected to triple by the year 2050. Diagnosing AD with certainty requires a histopathologic examination, which is the main reason why, in practice, AD diagnosis is mainly based on clinical criteria that can be subjective.
Finding links between the clinical and biological characteristics of the disease is therefore an important research topic: its advancement could improve the understanding of the disease pathophysiology and enable detection of the disease at earlier stages.

In this work, we address the problem of finding possible connections between biological and clinical features of AD patients with the use of Predictive Clustering Trees (PCTs). Our goal is not to provide a model for diagnosing the disease, but rather to cluster patients into homogeneous groups that share biological features. This way we should be able to investigate the traits of the grouped patients in more detail. One of the most distinctive properties of PCTs is their ability to learn models for predicting structured or complex variables, e.g., vectors, time series or hierarchies. By using PCTs, we were able to construct clusters that are homogeneous with respect to several clinical variables simultaneously, and not just a single one as with standard decision trees. We use a dataset of Alzheimer's patients obtained from the ADNI database2.

1Source: World Health Organization (March 2015).

The remainder of the paper is structured as follows. Section 2 presents the dataset, methodology and the experimental design. Section 3 describes the results. Finally, in Section 4 we analyze the results and present our conclusions.

2. DATA AND METHODOLOGY

2.1 The Data
All data used come from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database2. ADNI is an international observational study of healthy, cognitively normal elders, people with mild cognitive impairment (MCI) and people with Alzheimer's disease. It collects a wide range of clinical and biological data for each patient at multiple time points. We used the ADNIMERGE table, which is a joined dataset from multiple ADNI data collection domains. The dataset includes information on 659 patients (342 male, 317 female).
Each patient is described with 56 biological and 187 clinical attributes. Some numerical values have been transformed in order to make them more linear. Out of 243 attributes, 74 contain missing values. Biological attributes include ABETA peptides, APOE4 genetic variations, intracranial volume (ICV), results from many laboratory measurements such as glucose and protein levels, red and white blood cell counts, MRI volumetric data (Ventricles, Hippocampus, WholeBrain, Entorhinal gyrus, Fusiform gyrus, Middle temporal gyrus), TAU and PTAU proteins, and PET imaging results (FDG-PET and AV45).

Clinical attributes include the Alzheimer's Disease Assessment Scale (ADAS13), Mini Mental State Examination (MMSE), Rey Auditory Verbal Learning Test (RAVLT), which is divided into several different stages (immediate, learning, forgetting and percentage of forgetting), Functional Assessment Questionnaire (FAQ), Montreal Cognitive Assessment (MOCA) and Everyday Cognition, which consists of questions that are answered by the patients themselves (ECogPt) and their study partners3 (ECogSP). This cognitive evaluation also consists of several domains (Memory, Language, Organization, Planning, Visuospatial abilities, Divided attention and Total score). The Neuropsychiatric Inventory Examination, Neurological Exam, Modified Hachinski Ischemia Scale, Geriatric Depression Scale, baseline symptoms (nausea, vomiting, diarrhea, sweating, etc.), Clinical Dementia Rating (CDR), medical history, patient gender and handedness have also been included.

2The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies and non-profit organizations. More information can be found at http://www.adni-info.org and http://adni.loni.usc.edu.

Figure 1: Predictive Clustering Tree, showing 10 clusters (cluster IDs and numbers of patients in each cluster).
The diagnosis (DX) given by the physician at the first examination is included in the data. The possible values for the DX attribute are Cognitively Normal (CN), Significant Memory Concern (SMC), Early Mild Cognitive Impairment (EMCI), Late Mild Cognitive Impairment (LMCI) and Alzheimer's Disease (AD). The diagnosis distribution is the following: CN=173, SMC=94, EMCI=148, LMCI=134, AD=110. We use only the baseline data (i.e., data gathered when patients enrolled in the ADNI study and were examined and tested for the first time).

2.2 Experimental design
In our study we are especially interested in the everyday cognition of patients. We therefore give a brief overview of everyday cognition as it is understood and evaluated within the ADNI database. Everyday cognition (ECog) is a questionnaire that requires the cooperation of both the patient and his or her study partner3. It assesses the patient's capability to perform normal, everyday tasks. Patients and their corresponding study partners must individually compare the patient's current activity levels and capabilities with the levels from 10 years prior to the examination. The domains of memory, language and executive functioning are assessed. Answers are evaluated on a 5-point scale: (1) no change or performing better, (2) occasionally performs worse, (3) consistently performs worse, (4) performs much worse, (5) does not know.

3Each patient must have a study partner, a person who is in frequent contact with the patient, provides information about the patient and is able to independently evaluate the patient's functioning.

According to Farias et al. [4], everyday cognition shows promise as a tool for measuring general and domain-specific everyday functions in the elderly. We have designed our experiment on that assumption, and we aim to connect existing biological and clinical features in order to observe differences of predicted values between clusters.
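As an illustration of the answer scale described above, the following minimal Python sketch aggregates the answers of one ECog domain into a single score. The aggregation rule (the mean of the rated items, with "does not know" answers excluded), the function name and the example answers are assumptions made for illustration only; they are not taken from the ADNI documentation or from the scoring used in this study.

```python
from statistics import mean
from typing import Iterable, Optional

# Answer code (5) "does not know", as listed in the scale above.
DOES_NOT_KNOW = 5


def domain_score(answers: Iterable[int]) -> Optional[float]:
    """Mean of the rated items (codes 1-4); None if no item was rated.

    Assumption: "does not know" answers are skipped rather than counted.
    """
    rated = []
    for answer in answers:
        if answer == DOES_NOT_KNOW:
            continue
        if answer not in (1, 2, 3, 4):
            raise ValueError(f"unexpected ECog answer code: {answer}")
        rated.append(answer)
    return mean(rated) if rated else None


# Hypothetical answers for one domain, given by a patient and a study partner.
patient_answers = [2, 3, 5, 2]
partner_answers = [1, 2, 2, 1]
print(domain_score(patient_answers))
print(domain_score(partner_answers))
```

Under these assumptions, a higher score means a larger perceived decline, so the patient and study partner scores of the same domain can be compared directly.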
We have used Predictive Clustering Trees for the task of multi-target prediction. Our targets were all the ECog components and the diagnosis itself. The descriptive space was defined by all the laboratory measurements, neuropathology, medical history and gender. We have included medical history in the descriptive space because we wanted to observe whether pre-existing conditions such as allergies play a role in the disease progression. Additionally, we included gender because, according to Barnes et al. [1], gender-specific differences do exist. We have pre-pruned our clustering tree with the constraint of a minimum of 50 examples per leaf.

2.3 Predictive Clustering Trees
The concept of predictive clustering was introduced in 1998 by H. Blockeel [2] and can be seen as a generalization of supervised and unsupervised learning. Even though predictive modeling and clustering are usually viewed as two separate tasks, they are connected by methods that partition the instance space into subsets. Such methods can also be considered clustering methods. Decision trees are an example. If we consider a decision tree in the predictive clustering paradigm, the tree is a hierarchy of clusters. We refer to those trees as predictive clustering trees (PCTs). An obvious benefit of PCTs is that, in addition to predictions, they also provide symbolic descriptions of the clusters. Each node in the clustering tree represents a cluster and has a symbolic description (except for the root node) in the form of a conjunction of conditions on the path from the root node to the selected cluster node. In the case of the PCT in Figure 1, the examples in the root node are split according to the condition FDG > 5.999. Examples whose value of the FDG attribute is greater than 5.999 go to the left branch, the others to the right branch. On the next level of the clustering tree, the nodes AV45 > 1.15 and ABETA_upennbiomk5 > 167.7 are split again, and so on iteratively until we reach the leaf nodes (such as C1). Examples in cluster C1, for example, are those that satisfy the condition: FDG > 5.999 & ABETA_upennbiomk5 > 167.7 & RCT19 > 158.

PCTs support multi-target prediction, which means we can learn a model with respect to not only one target variable but many variables simultaneously. This gives us the tool needed to predict complex structures that can also be interconnected. Several different predictive clustering methods [3, 5, 6, 7, 8, 9] are implemented in the software package CLUS (available at http://sourceforge.net/projects/clus/).

Figure 1 (node labels recovered from the figure): the internal nodes test FDG > 5.999 (root), ABETA_upennbiomk5 > 167.7, AV45 > 1.15, Fusiform > 17166, Hippocampus > 6711, RCT19 > 158, FDG > 6.616, BAT126 > 626 and MH13ALLE > 0; the leaf sizes are C1: 87, C2: 70, C3: 50, C4: 56, C5: 64, C6: 54, C7: 67, C8: 64, C9: 87 and C10: 60.

Figure 2: Normalized distribution of original diagnoses with respect to the clusters modeled by the PCT in Figure 1.

Figure 3: Normalized Everyday cognition (ECog) predictions for clusters 7 (3a), 8 (3b) and 9 (3c). (a) Patients evaluate their cognition as worse than 10 years ago. Study partners evaluate the same behavior as approximately half as bad. (b) Patients evaluate their cognition as much worse than patients in cluster 7. Study partners also evaluate it as worse. (c) Patients evaluate their behavior more mildly than their study partners, who consistently give the worst scores.

3. RESULTS
The result of our analysis is the PCT presented in Figure 1. We have investigated all ten clusters in the leaf nodes, and Figure 2 shows the relative distribution of original diagnoses (DX) in all the clusters. Clusters 1 to 6 are relatively diverse, and we can state that the presence of Alzheimer's patients in these clusters is unlikely. With the exception of cluster 6, cognitively normal patients are dominant. Cluster 6 also contains patients in the early stage of the disease (EMCI) as well as some LMCI patients.
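To make the symbolic cluster descriptions from Section 2.3 concrete, the following minimal Python sketch encodes the path to cluster C1 quoted there as a conjunction of threshold tests and checks whether a patient record satisfies it. The function names, the two patient records and the treatment of missing values (a missing attribute fails the test) are assumptions made for illustration; CLUS may handle missing values differently.

```python
from typing import Dict, List, Optional, Tuple

# One test on the path from the root to a leaf: (attribute, threshold),
# read as "attribute > threshold".
Condition = Tuple[str, float]

# Path to cluster C1 as quoted in Section 2.3.
C1_PATH: List[Condition] = [
    ("FDG", 5.999),
    ("ABETA_upennbiomk5", 167.7),
    ("RCT19", 158.0),
]


def describe(path: List[Condition]) -> str:
    """Render the conjunction of conditions as a readable cluster description."""
    return " & ".join(f"{attribute} > {threshold:g}" for attribute, threshold in path)


def satisfies(patient: Dict[str, Optional[float]], path: List[Condition]) -> bool:
    """True if the patient meets every condition on the path.

    Assumption: a missing value (absent key or None) fails the condition.
    """
    for attribute, threshold in path:
        value = patient.get(attribute)
        if value is None or not value > threshold:
            return False
    return True


# Hypothetical patient records, not taken from the ADNI data.
patient_a = {"FDG": 6.4, "ABETA_upennbiomk5": 210.0, "RCT19": 171.0}
patient_b = {"FDG": 6.4, "ABETA_upennbiomk5": 150.0, "RCT19": 171.0}

print(describe(C1_PATH))
print(satisfies(patient_a, C1_PATH), satisfies(patient_b, C1_PATH))
```

The same representation extends to any other leaf once its path of conditions is read off the tree.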
We have focused our attention on clusters 7, 8 and 9 because they mainly consist of patients in the stages of late MCI or already developed AD.

We have examined the profiles of predicted ECog features. The normalized predictions are shown in Figure 3. Cluster 10 is interesting in the sense that it includes two extremes, healthy patients and heavily affected patients. We assume that this cluster should be further split into two more homogeneous clusters. The exploration of this cluster is planned for further work.

Patients in cluster 7 (Fig. 3a) evaluate their cognition as worse than 10 years ago. Their study partners evaluate the same behavior as approximately half as bad. The majority of patients have early and late MCI, and the predictions for this cluster correspond to the distribution in Figure 2 quite well. In cluster 8 (Fig. 3b), where the majority classes are AD and LMCI, patients evaluate their behavior as worse than those in cluster 7. Study partners in this cluster also see the situation as worse than study partners in cluster 7. In both clusters 7 and 8, the patients always evaluate their behavior as worse than their study partners do.

In cluster 9 (Fig. 3c) we observe a change in this perception. Study partners evaluate the patients' behavior as worse than the patients themselves do. We see that the study partners consistently give the worst scores for every testing sub-domain, which differs from the evaluations in clusters 7 and 8. We can speculate that this observation is a direct result of the disease progression and medication. Given that clinical depression is very common in AD patients, it is possible that the switch in perception is simply caused by medication for easing depression. On the other hand, it could indicate a new disease signature, where patients have a different view of the world that manifests in different cognition. A more pessimistic explanation could be that the ECog test is not suitable for this kind of analysis, since it only uses subjective scores, and evaluations from study partners cannot be considered ground truth but only a general indication of the patients' functioning.

We have also analyzed the gender distribution within clusters 7, 8 and 9. We discovered that cluster 7 is gender balanced. Cluster 8 contains more male patients, and this dominance is exhibited for all diagnoses, as shown in Figure 4. Cluster 9, on the other hand, contains more women. Specifically, differences occur in classes EMCI and LMCI.

Figure 4: Gender difference in cluster 8.

In addition to identifying a cluster of severely affected males and establishing a difference of perception between the patients and their study partners, we have also identified some important features that show potential for discovering specialized clusters. Our results show that AV45, FDG, hippocampal and fusiform volumes and ABETA_upennbiomk5 play an important role in the description of our clusters. As we already mentioned in Section 2.2, we have pre-pruned our clustering tree. The unpruned tree reveals additional important features such as the volume of the entorhinal cortex and several laboratory measurements, including glucose level, PTAU_upennbiomk5 and white blood cell count.

4. CONCLUSIONS
This work presents an application of predictive clustering trees to the problem of discovering connections between biological and clinical features of patients with Alzheimer's disease. The result is a PCT with ten clusters, three of which are of particular interest. We have analyzed all three and discovered indications that biological features have an impact on the observed clinical behavior of the patients. We have also discovered gender-specific differences, as we initially expected in the design of the experiment. We have identified several biological features that might be connected with the progression of Alzheimer's disease. The results are promising and in line with other studies, but additional research is needed to validate the results presented here.

5. ACKNOWLEDGMENTS
We would like to acknowledge the support of the Slovenian Research Agency (through a young researcher grant to MB and the programme grant Knowledge Technologies) and the European Commission (through the projects MAESTRA - Learning from Massive, Incompletely annotated, and Structured Data, grant FP7-ICT-612944, and HBP - The Human Brain Project, grant FP7-ICT-604102).

6. REFERENCES
[1] L. L. Barnes, R. S. Wilson, J. L. Bienias, J. A. Schneider, D. A. Evans, and D. A. Bennett. Sex differences in the clinical manifestations of Alzheimer disease pathology. Archives of General Psychiatry, 62(6):685-691, 2005.
[2] H. Blockeel. Top-Down Induction of Clustering Trees. PhD thesis, Katholieke Universiteit Leuven, Department of Computer Science, 1998.
[3] H. Blockeel and J. Struyf. Efficient algorithms for decision tree cross-validation. The Journal of Machine Learning Research, 3:621-650, 2003.
[4] S. T. Farias, D. Mungas, B. R. Reed, D. Cahn-Weiner, W. Jagust, K. Baynes, and C. DeCarli. The measurement of everyday cognition (ECog): scale development and psychometric properties. Neuropsychology, 22(4):531, 2008.
[5] D. Kocev, C. Vens, J. Struyf, and S. Džeroski. Ensembles of multi-objective decision trees. Machine Learning: ECML 2007, pages 624-631, 2007.
[6] I. Slavkov, V. Gjorgjioski, J. Struyf, and S. Džeroski. Finding explained groups of time-course gene expression profiles with predictive clustering trees. Molecular BioSystems, 6(4):729-740, 2010.
[7] J. Struyf and S. Džeroski. Constraint based induction of multi-objective regression trees. Springer, 2006.
[8] C. Vens, J. Struyf, L. Schietgat, S. Džeroski, and H. Blockeel. Decision trees for hierarchical multi-label classification. Machine Learning, 73(2):185-214, 2008.
[9] B. Ženko. Learning Predictive Clustering Rules. PhD thesis, University of Ljubljana, Faculty of Computer and Information Science, 2007.