
Reality Mining: Complex Social Systems

User Behavior Modeling and Prediction

It is commonly known that although humans have the potential for an extremely high degree of randomness, there exist easily identifiable patterns in every person's life. These patterns can be found on a range of timescales: from daily routines such as getting out of bed, eating lunch, and driving home from work, to weekly patterns such as Saturday afternoon softball games, and even to yearly patterns such as seeing family during the holidays in late December. While our ultimate goal is to create a predictive classifier that can learn aspects of a user's life better than a human observer (including the actual user), we begin by building simple mechanisms that can recognize many of the common structures in the user's routine. Learning the structure of an individual's routine has already been demonstrated using other modalities; our contribution will be to demonstrate learning of social structures.

We begin with a simple model of behavior in three states: home, work, and elsewhere. The data are obtained from Bluetooth, cell tower, and temporal information collected from the phones. We then incorporate information from static Bluetooth devices (class 1, such as desktop computers), using them as 'cell towers' to identify significant locations and localize the user to within a ten-meter radius. We show that most users spend a significant amount of time in the presence of static Bluetooth devices, particularly when they don't have cell tower reception (e.g., inside the office building). This makes static Bluetooth devices an ideal supplement to cell towers for location classification.

The Entropy of Life

Human life is inherently imbued with routine across all temporal scales, from minute-to-minute actions to monthly or yearly patterns. Many of these patterns in behavior are easy to recognize; others are more subtle. We attempt to quantify the amount of predictable structure in an individual's life using an entropy metric. People who live high-entropy lives tend to be more variable and harder to predict, while low-entropy lives are characterized by strong patterns across all time scales. The figure below depicts the patterns in cell tower transitions and the total number of Bluetooth devices encountered each hour during the month of January for Subject 9, a 'low entropy' subject.
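The text does not specify which entropy metric is used. As a minimal sketch, assuming the metric is the Shannon entropy of the empirical distribution over a subject's hourly location labels, the idea could be illustrated as follows. The function name and the two label sequences are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter

def shannon_entropy(observations):
    """Shannon entropy (in bits) of the empirical distribution of the observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical hourly location labels for one day (assumed data, for illustration only).
regular = ["home"] * 8 + ["work"] * 9 + ["home"] * 7       # strong daily routine
varied = ["home", "elsewhere", "work", "elsewhere"] * 6    # frequent location changes

print("regular:", shannon_entropy(regular))
print("varied: ", shannon_entropy(varied))
```

Under this assumption, the more variable sequence yields the higher value. Note that this marginal entropy ignores the order of the observations; a metric that captures structure across time scales would also need to account for temporal dependence, for example by computing the entropy of transitions rather than of individual labels.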

It is clear that the subject is typically at home during the evening and night until 8:00, when he commutes in to work, and then stays at work until 17:00, when he returns home. We can see that almost all of the Bluetooth devices are detected during these regular office hours, Monday through Friday. This is certainly not the case for many of the subjects. The figure below displays a different set of behaviors for Subject 8. This subject has far less regular location patterns and, in the evenings, has other mobile devices in close proximity. We will use contextualized information about proximity with other mobile devices to infer relationships.

One similarity between the two different behaviors above is the clear role time plays in determining user behavior. To account for this, we have developed a simple Hidden Markov Model conditioned on both the hour of day and whether the day is a weekday or a weekend. A straightforward Expectation-Maximization inference engine was used to learn the parameters of the model and to perform clustering, for which we defined the dimensionality of the state space. After training our model with one month of data from several subjects, we were able to provide a good separation of ({office}, {home}, {elsewhere}) clusters, typically with greater than 95% accuracy. Examination of the data shows that non-linear techniques will be required to obtain significantly higher accuracy. However, for the purposes of the next two sections, this accuracy has proven sufficient. In future work we hope to leverage the information within LifeNet [Singh and Williams, 2003] to create more specific inferences about activity.
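The role of time conditioning can be illustrated with a much simpler model than the one described above. The following is a sketch of a fully observed first-order Markov chain whose transition counts are conditioned on hour of day and a weekend flag. It is not the Hidden Markov Model or the Expectation-Maximization procedure used in this work: the states here are assumed to be already labeled, and all names and data are hypothetical.

```python
from collections import Counter, defaultdict

def fit_transitions(samples):
    """Count state transitions conditioned on time context.

    samples: chronological list of (hour, is_weekend, state) tuples.
    Returns a dict mapping (hour, is_weekend, prev_state) -> Counter of next states.
    """
    counts = defaultdict(Counter)
    for (hour, weekend, prev), (_, _, nxt) in zip(samples, samples[1:]):
        counts[(hour, weekend, prev)][nxt] += 1
    return counts

def predict_next(counts, hour, weekend, state):
    """Most frequent next state seen in this context; falls back to the current state."""
    ctx = counts.get((hour, weekend, state))
    if not ctx:
        return state
    return ctx.most_common(1)[0][0]

def weekday_state(hour):
    # Assumed routine for illustration: at work from 8:00 until 17:00, otherwise at home.
    return "work" if 8 <= hour < 17 else "home"

# Five hypothetical weekdays of hourly labels.
samples = [(h, False, weekday_state(h)) for _ in range(5) for h in range(24)]
model = fit_transitions(samples)

print(predict_next(model, 7, False, "home"))
print(predict_next(model, 16, False, "work"))
```

For a subject with a regular routine, the counts in each time context concentrate on a single next state, which is what makes the prediction easy. In a hidden-state version of this idea, the location labels would not be given and would instead be inferred from the Bluetooth and cell tower observations.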

LifeLog

In collaboration with Push Singh and Bo Morgan, we have created an interactive, automatically generated diary application that allows users not only to query their own life (e.g., "When was the last time I had lunch with Mike? Where were we? Who else was there? What did I do next?") but also, after a few months of training data, to visualize the model's predictions about upcoming behavior in the immediate future.




© 2009 Nathan Eagle / Massachusetts Institute of Technology