1 Introduction
One of the more pronounced issues in late adulthood is social isolation, caused by such factors as retirement, children living in different places, or the loss of a spouse. Social isolation is defined as the lack of contact and interaction with others 1,2. An early diagnosis of this condition significantly reduces the risk of depression, cognitive impairment, decreased food intake, reduced physical exercise, or impoverishment of the social network 3. This risk underscores the importance of knowing when an older adult stops socializing, so that interventions can be carried out to help him/her overcome this condition and remain socially active. Currently, several psychological scales are used to assess the level of social isolation in older adults 3,4. Unfortunately, applying these instruments is tedious because older adults need to visit assistance centers or specialized professionals in order to be assessed. This motivates the development of novel approaches that enable automatic monitoring of significant changes in their social interactions. In this context, Ambient Intelligence (AmI) provides widely accepted computational mechanisms that would help older people in their daily lives in a manner that is simple, unintrusive, ubiquitous, and proactive at the same time 6,7. In addition, the increasing participation of older adults in Social Networking Sites (SNSs) opens a range of opportunities to monitor social interactions through these virtual communication channels 8. Therefore, this paper describes a predictive model developed to serve as a baseline for determining social isolation levels. This model receives as input quantitative values from indoor/outdoor social interactions performed by older adults. The proposed model will benefit institutes interested in developing systems to improve the quality of life of older adults.
The paper is structured as follows: Section 2 describes related work on the detection of social isolation. Section 3 presents the proposed predictive model of social isolation. Section 4 describes the design of the experimental test used to evaluate the performance of the predictive model. Section 5 discusses the experimental test. Finally, Section 6 presents conclusions and future work.
2 Related Work
Recent research has studied the impact of Ambient Intelligence and Social Networking Sites on socialization. This body of research has addressed how these technologies help i) to reduce the level of social isolation and increase independent life at home 7,8, ii) to keep seniors in touch with friends through natural ways of interaction 9,10, iii) to encourage physical exercise 11,12, and iv) to monitor the state of health and to keep caregivers and relatives informed 15. However, monitoring social isolation through computational mechanisms has not been addressed.
3 Predictive Model
A predictive model examines an attribute set and produces an outcome class. Our research work focuses on identifying attributes that have a correlation with social isolation. These attributes correspond to social activities performed by older adults that can be monitored by AmI and SNSs. Table 1 shows a summary of social interaction activities grouped by technological resource and the location where the activity is performed. Previous studies have demonstrated that these activities are correlated with subjective social isolation (loneliness), for instance, time spent inside home 16, time spent out of home 17, and communication through mobile phones 18. Also, previous research suggests that the use of SNSs could help prevent social isolation in older adults 8. For this reason, such activities were considered as the attribute set for the predictive model development.
3.1 Data Collection
Data collection consisted of a non-probability sampling through a questionnaire applied to 144 older adults, including both men and women between 60 and 89 years of age (68.2 ±8.9) with full physical and cognitive abilities, without mobility impairment, who owned a mobile phone and were able to use it to make calls or send text messages. In addition, these subjects had a profile on Facebook, had no difficulty understanding the questions, and signed an informed consent indicating that they were willing to take part in the research. The sample was collected in the city of Cuernavaca, Mexico. The interviews took place in public parks and malls. The questionnaire comprises two parts. The first one is the LSNS-6 in its Spanish version 19 to determine the level of social isolation of older adults.
The second part of the questionnaire collected data concerning demographic information as well as social interaction activities described in Table 1. The questions formulated by the LSNS-6 request information about the frequency of social interactions during one month previous to the interview, which is often difficult to remember accurately. Figure 1 shows a summary of the sample's general information. In this chart we can observe 48 severe cases of social isolation, 93 moderate cases of isolation, and 3 cases where no social isolation was detected.
3.2 Attribute Selection
Attribute selection is the process of identifying and removing irrelevant and redundant information. Most machine learning algorithms were designed to identify the most appropriate attributes for classification. Decision tree methods choose the most promising attribute to split at each point and should in theory never select irrelevant or unhelpful attributes 19. In order to obtain the first subset of relevant attributes, the J48 classification algorithm was applied to the full dataset. Then, the subset obtained was assessed in two ways: individual attributes were evaluated using the Chi-Squared and InfoGain methods 20 with the Ranker method, and sets of attributes were evaluated using the Correlation-based Feature Selection method with BestFirst and Greedy Stepwise 21. All the tests were performed with ten times 10-fold cross-validation as the standard evaluation technique 19.
The resulting relevant attributes were gender, the number of different places visited, the number of times when a person initiates conversation with the family by chat, the number of incoming calls from the family, the number of incoming calls from friends, the duration of incoming calls from the family in minutes, the duration of outgoing calls from family in minutes, the number of incoming messages from the family, the number of outgoing messages to friends, time spent in the bedroom, time spent in the living room, time spent in the dining room, time spent in the garden, and time spent in other area inside home.
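The InfoGain ranking mentioned above can be illustrated with a minimal Python sketch. It is not the WEKA implementation used in the study: it handles nominal attributes only, and the function names (`entropy`, `info_gain`, `rank_attributes`) and the toy data are assumptions introduced for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def info_gain(values, labels):
    """Information gain of one nominal attribute with respect to the class."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy remaining after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_attributes(table, labels):
    """Return (attribute, gain) pairs sorted from most to least informative."""
    gains = {name: info_gain(column, labels) for name, column in table.items()}
    return sorted(gains.items(), key=lambda pair: pair[1], reverse=True)

# Toy data, invented for illustration only (not taken from the study).
labels = ["isolated", "isolated", "not", "not"]
table = {
    "gender": ["m", "m", "f", "f"],      # separates the two classes perfectly
    "owns_pet": ["y", "n", "y", "n"],    # carries no information about the class
}
ranking = rank_attributes(table, labels)
```

In this toy example the attribute that separates the classes is ranked first with a gain of one bit, while the uninformative attribute receives a gain of zero. Numeric attributes such as call durations would need to be discretized before this kind of ranking.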
3.3 Classification
In order to develop the most suitable model for predicting social isolation, a range of classifier algorithms were assessed 22. This process was carried out using WEKA 23. ZeroR (ZR) algorithm was used as a baseline. The other classifier algorithms used were NaiveBayes (NB), Simple Logistic (SL), Support Vector Machine (SVM), k-Nearest-Neighbor (kNN), AdaBoost (AB), OneR (OR), J48, and SimpleCart (SC). The stratified ten times ten-fold cross-validation technique was used because it is the standard evaluation technique in situations where only limited data is available 19.
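The stratified ten times ten-fold procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the WEKA routine used in the study: the helper names, the round-robin stratification, and the stand-in majority-class learner are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, rng):
    """Split instance indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for idx in indices:
            # Deal instances of each class round-robin across the folds.
            folds[position % k].append(idx)
            position += 1
    return folds

def repeated_cv_accuracy(X, y, train_fn, k=10, repeats=10, seed=0):
    """Mean accuracy over `repeats` runs of stratified k-fold cross-validation."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            predict = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
            hits = sum(predict(X[i]) == y[i] for i in test_idx)
            scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

def train_majority(X_train, y_train):
    """Hypothetical stand-in learner: always predicts the most frequent class."""
    majority = max(set(y_train), key=y_train.count)
    return lambda row: majority

# Toy data: 30 instances with two classes in a 2:1 ratio (illustrative only).
y = ["a"] * 20 + ["b"] * 10
X = [[i] for i in range(30)]
mean_acc = repeated_cv_accuracy(X, y, train_majority)
```

Any learner that returns a prediction function can be passed as `train_fn`, so the same loop could be used to compare several classifiers on identical folds.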
3.4 Balancing the Dataset
A dataset is imbalanced if the classification categories are not equally represented. The imbalance between such class data could have an impact on some classification algorithms, typically with a bias toward the majority class prediction. Therefore, applying a dataset balancing technique is required. In order to handle the imbalance, the dataset was resampled by applying the synthetic minority oversampling technique (SMOTE) 24.
Each derived model is denoted by the name of the classifier algorithm plus "_S" when SMOTE is applied. For example, a model derived using kNN classification and SMOTE for data resampling is denoted as "kNN_S" and the one without data resampling is denoted as "kNN". Table 2 shows the dataset before and after applying SMOTE.
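The core idea of SMOTE, creating synthetic minority instances by interpolating between a minority instance and one of its nearest minority neighbors, can be sketched as follows. This is a simplified illustration, not the filter used in the study: the function name `smote`, the parameter values, and the sample points are assumptions, and the number of neighbors and the oversampling percentage actually used are not stated in the paper.

```python
import math
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating between neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Candidate neighbors: the other minority points, nearest first.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic

# Illustrative minority-class points with two numeric attributes each.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.0]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point lies on a segment between two real minority instances, the new instances stay inside the region already occupied by the minority class rather than duplicating existing records.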
3.5 Model Evaluation
The performance of the predictive models was evaluated in terms of accuracy 25, sensitivity, specificity, positive and negative predictive values 26, and error types I and II. In order to corroborate the results of the predictive models, a reference standard was needed to provide the real diagnosis against which the predictions are compared. The reference standard used was the LSNS-6.
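The relationship between these criteria can be made explicit with a short sketch that derives them from the four cells of a confusion matrix. It assumes a binary outcome with "isolated" as the positive class; the function name and the counts are hypothetical and are not results from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Evaluation criteria for a binary test; 'isolated' is the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),    # true positive rate
        "specificity": tn / (tn + fp),    # true negative rate
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (tn + fn),            # negative predictive value
        "type_I_rate": fp / (fp + tn),    # false positive rate
        "type_II_rate": fn / (fn + tp),   # false negative rate
    }

# Hypothetical counts, chosen only to illustrate the formulas.
m = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
```

Under these definitions the type II error rate is the complement of sensitivity, and the type I error rate is the complement of specificity, so minimizing missed cases of isolation is equivalent to maximizing sensitivity.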
3.6 Suitable Model Selection
In selecting a suitable model we focus on reducing type II errors (FN rate), that is, we are more concerned with not detecting actual cases of isolation than predicting isolation when not actually present (type I error or FP rate), particularly, if the older adult might have unknowingly been under a serious risk of depression and suicide 1. The first model performance evaluation was carried out with ZR as the baseline. This algorithm is the simplest classification method which relies on the target and ignores all predictors. The ZR classifier simply predicts the majority class. Although there is no predictability power in ZR, it is useful for determining the baseline performance as a benchmark for other classification methods 19. All the models obtained from the dataset with and without using SMOTE produced a significantly higher accuracy than the baseline.
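The behavior of a majority-class baseline such as ZR can be sketched in a few lines. The class counts below are those reported for the questionnaire sample; treating them as a three-class training set and scoring the baseline on the same data (rather than by cross-validation, and before any resampling) are assumptions made only for this illustration.

```python
from collections import Counter

def zero_r(train_labels):
    """Return a classifier that ignores its input and predicts the majority class."""
    majority, _count = Counter(train_labels).most_common(1)[0]
    return lambda _features=None: majority

# Class counts reported for the questionnaire sample before resampling.
labels = ["severe"] * 48 + ["moderate"] * 93 + ["none"] * 3
predict = zero_r(labels)

# Share of instances the baseline labels correctly on this same data.
baseline_accuracy = sum(predict() == label for label in labels) / len(labels)
```

Under these assumptions the baseline always answers "moderate" and is right for 93 of the 144 instances, which is the reference point any informative classifier has to exceed.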
The second model performance evaluation was carried out in terms of accuracy. The classification algorithms were applied to the data using the relevant attribute subsets with and without using SMOTE. All the models obtained from the dataset using SMOTE produced higher accuracy scores. The accuracy of the baseline and the classification algorithms applied to the dataset with and without using SMOTE is shown in Figure 2.
Once the performance of the models was compared based on their accuracy, the best models in terms of sensitivity, specificity, positive and negative predictive values were examined. Nevertheless, type II error was weighted heavier than the other criteria since this type of error could lead to most adverse effects in older adults. The summary of all criteria is presented in Table 3.
The AB_S model obtained the best accuracy score (85%) and also the lowest type II error rate (15%). It outperformed the rest of the models and was therefore selected as the most suitable one.
4 Experiment
In order to evaluate the AB_S model, an experiment was conducted. A comparison of the model's results with the real condition of the older adults was carried out.
4.1 Materials
4.2 Procedure
First, the participants were asked to sign an informed consent form where they agreed to participate in the experiment. Then, each participant was monitored for one month, since this period is required from LSNS-6 in order to obtain the social isolation level. The monitoring of all participants lasted four months.
The number of incoming calls from the family, the number of incoming calls from friends, the duration of incoming calls from the family, the duration of outgoing calls from the family, the number of incoming messages from the family, and the number of outgoing messages to friends were obtained by retrieving the call log of each participant's mobile phone at the end of the month. The time spent in the living room, in the garden, in the dining room, in the bedroom, and in other area inside home was obtained from two IP cameras strategically installed in the homes. Each camera recorded 12 hours per day. About 5760 hours of video were recorded. The transcription 27 of the videos was done every day. The number of times that the older person initiates a conversation chat with the family was obtained from the Facebook message history in each participant's personal account at the end of the month. Gender and the number of places visited were obtained by participants' self-report using a printed form on which they recorded the number of places visited every day. At the end of each participant's monitoring period, the LSNS-6 was administered in order to obtain their real social isolation level. From the data collected during the monitoring phase, each participant's social isolation level was obtained through the AB_S model. Finally, the social isolation levels obtained with the LSNS-6 and with the AB_S model were compared.
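The reported volume of video is consistent with the setup described, as the following arithmetic sketch suggests. It assumes that all 8 participants were recorded, that the two cameras are per home, and that one month is taken as 30 days; the variable names are illustrative.

```python
participants = 8        # size of the experimental group reported in the results
cameras_per_home = 2    # assumption: the two IP cameras are per home
hours_per_day = 12      # recording time of each camera
days_monitored = 30     # assumption: one month taken as 30 days

total_hours = participants * cameras_per_home * hours_per_day * days_monitored
```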
4.3 Results
Table 4 summarizes the data of the 8 older adults in the experimental group. Column A corresponds to gender, Column B corresponds to the number of different places visited, Column C corresponds to the number of times that the older person initiates a conversation chat with the family, Column D corresponds to the number of incoming calls from the family, Column E corresponds to the number of incoming calls from friends, Column F corresponds to the duration of incoming calls from the family (in minutes), Column G corresponds to the duration of outgoing calls from the family (in minutes), Column H corresponds to the number of incoming messages from the family, Column I corresponds to the number of outgoing messages to friends, Column J corresponds to time spent in the bedroom (in minutes, excluding sleep time), Column K corresponds to time spent in the living room (in minutes), Column L corresponds to time spent in the dining room (in minutes), Column M corresponds to time spent in the garden (in minutes), and Column N corresponds to time spent in other area inside home (in minutes).
The comparison between the social isolation level results obtained by LSNS-6 and AB_S model is shown in Figure 3.
The AB_S model correctly classified 7 of 8 participants producing an accuracy of 87.5% and a type II error rate of 12.5%.
5 Discussion
This research focused on inferring the older adults' social isolation level through activities that can be monitored by AmI and SNSs. From the collected sample, which gave rise to the predictive model, the attributes that have a correlation with social isolation were identified. Also, we made some findings during the development of the predictive model.
5.1 Relevant Attributes
Of all the demographic attributes, only gender proved to be relevant. Thirty-four percent of the sample had a severe level of social isolation. Of this 34%, 69.8% were male and 30.2% were female. As we can observe, men run a greater risk of social isolation than women. Of the 30.2% of women with a severe level of social isolation, 60% live alone.
One possible interpretation of this finding is that in Mexico men's life expectancy is lower than that of women 28, so women become widows and live alone. Another relevant attribute was the number of different places visited. One possible interpretation of this finding is that older adults need to perform activities outside their homes in order to encourage social interactions. Attributes such as Facebook posts and messages did not prove to be relevant; what turned out to be relevant is the number of times that an older person initiates a conversation chat with the family. One possible interpretation of this finding is that older adults use only private messages and avoid posting on the Facebook wall due to security concerns. Another possible interpretation is that older adults have begun to use Facebook recently, so they have not yet developed enough abilities. Among the attributes concerning the use of the mobile phone, the relevant ones were the number of incoming calls from the family, the number of incoming calls from friends, the duration of incoming calls from the family, the duration of outgoing calls from the family, the number of incoming messages from the family, and the number of outgoing messages to friends.
As we can observe, most attributes refer to communication with the family. One possible interpretation of this finding is that older adults currently use their mobile phones more frequently to communicate with the family than with others. Another finding is that the use of SMS by older adults is increasing. This increase could be explained by the fact that new technologies take older adults' limitations into account, which results in more appropriate interfaces.
Finally, relevant attributes referring to indoor location were time spent in the bedroom, time spent in the living room, time spent in the dining room, time spent in the garden, and time spent in other area inside home. One possible interpretation of this finding is that it is important to older adults to move inside their homes since it encourages social interactions with the people they live with, thus avoiding being isolated in one area within the home.
5.2 Predictive Models
In order to handle the imbalanced data, an oversampling technique (SMOTE) [24] was applied. The predictive model performance was better using SMOTE. The AB model obtained an accuracy of 65% and a type II error rate of 32%. The AB_S model obtained an accuracy of 85% and a type II error rate of 15%. In this case, the synthetic instances created with SMOTE enhanced the learning of the AdaBoost classifier algorithm. In the experiment, the AB_S model produced a higher accuracy than that produced by the cross-validation test (87.5% versus 85%). The type II error rate was lower than the one for the cross-validation test (12.5% versus 15%).
Both the accuracy and the type II error rate therefore improved in the experiment. A lower type II error rate means that fewer older adults who are actually isolated would go undetected and thus remain unknowingly at serious risk of other diseases [1]. Nevertheless, new experiments with a larger sample are required.
6 Conclusions and Future Work
Social isolation is considered to be one of the possible factors that cause such disorders as depression, cognitive impairment, or impoverishment of the social network 3,1. Therefore, an early diagnosis and suitable interventions from relatives and caregivers would allow older adults to cope with this health condition. In order to infer social isolation in older adults, an evaluation of a number of activities that can be monitored through AmI and SNSs was carried out. From these activities, relevant attributes were identified through attribute evaluation methods. Using such attributes, a number of predictive models were developed by implementing a range of classifier algorithms. Each model went through performance evaluation, and a technique was applied to handle imbalanced data and bias. The AB_S model was selected due to its performance (accuracy: 85%, sensitivity: 85%, specificity: 92%, PPV: 91%, NPV: 85%) and its lower type II error rate (15%). In order to evaluate the selected model, an experiment with 8 older adults was carried out. The experiment compared the social isolation level of each participant obtained by the AB_S model with that given by the reference standard, the LSNS-6. The experiment results showed that the AB_S model correctly classified 7 of 8 participants, producing an accuracy of 87.5% and a type II error rate of 12.5%.
A limitation of our work is the amount of available data. It was both expensive and time-consuming to collect such data from older adults. Nevertheless, a collaborative project with geriatric institutions is currently underway, which will allow our current approach to be extended to a larger sample size. As future work, an implementation of the AB_S model in a computer system is being considered. This system will be capable of monitoring the older adults' activities in a ubiquitous manner and of inferring their social isolation levels. Also, older adults will be able to share their social isolation level with previously authorized caregivers and relatives in order to alert them to a risk situation. Finally, some older adults make use of and enjoy their time alone, which does not imply a risk situation. In order to adapt the model to individual requirements, an implementation of statistical learning algorithms is planned.