Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data

Juhi Kulshrestha, Marcos Oliveira, Orkut Karacalik, Denis Bonnay, Claudia Wagner

Abstract: Understanding human activities and movements on the Web is not only important for computational social scientists but can also offer valuable guidance for the design of online systems for recommendations, caching, advertising, and personalization. In this work, we demonstrate that people tend to follow routines on the Web, and these repetitive patterns of web visits increase their browsing behavior's achievable predictability. We present an information-theoretic framework for measuring the uncertainty and theoretical limits of predictability of human mobility on the Web. We systematically assess the impact of different design decisions on the measurement. We apply the framework to a web tracking dataset of German internet users. Our empirical results highlight that individuals' routines on the Web make their browsing behavior predictable to 85% on average, though the value varies across individuals. We observe that these differences in the users' predictabilities can be explained to some extent by their demographic and behavioral attributes.

Check out the paper here.

Data

The dataset needed to reproduce the results of the paper with the notebooks in this repository:

A web tracking data set of online browsing behavior of 2,148 users. https://doi.org/10.5281/zenodo.4383164.

Jupyter Notebooks:

The Zenodo repository contains both pre-processed data and raw data.

Pre-processing:

The pre-processing consists of adding categories to the websites, creating users' trajectories, and filtering out users.
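As a rough illustration of these three steps, the sketch below builds per-user trajectories from raw visit records. It is not the code from the notebooks: the record layout `(user_id, timestamp, domain)`, the `categories` lookup, the function `build_trajectories`, and the `min_visits` threshold are all hypothetical names and values chosen for the example.

```python
from collections import defaultdict

# Hypothetical raw records: (user_id, unix_timestamp, domain).
visits = [
    ("u1", 30, "news.example"),
    ("u1", 10, "mail.example"),
    ("u1", 20, "news.example"),
    ("u2", 5, "shop.example"),
]

# Hypothetical domain -> category lookup.
categories = {
    "news.example": "news",
    "mail.example": "email",
    "shop.example": "shopping",
}


def build_trajectories(visits, categories, min_visits=3):
    """Return {user: [(domain, category), ...]} ordered by visit time.

    Users with fewer than `min_visits` records are dropped
    (an assumed filtering rule, used here only for illustration).
    """
    by_user = defaultdict(list)
    for user, timestamp, domain in visits:
        by_user[user].append((timestamp, domain))

    trajectories = {}
    for user, events in by_user.items():
        if len(events) < min_visits:
            continue  # filter out users with too little data
        events.sort()  # chronological order
        trajectories[user] = [
            (domain, categories.get(domain, "unknown")) for _, domain in events
        ]
    return trajectories


trajectories = build_trajectories(visits, categories)
```

Keeping both the domain and its category in each step makes it possible to derive either a domain-level or a category-level trajectory from the same structure.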

Analyses

With the processed data, the analyses follow:

  • [Analysis 1] Basic statistics about the data set.
  • [Analysis 2] The predictability measurement framework.
  • [Analysis 3] Comparing predictability of different types of trajectories and confidence intervals.
  • [Analysis 4] Examining the relationship between predictability and demographics.
  • [Analysis 5] Examining the relationship between predictability and browsing behavior.
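
To give a sense of what a predictability measurement can look like, here is a minimal sketch of one common information-theoretic approach: estimate the entropy rate of a trajectory with a Lempel-Ziv style estimator, then solve Fano's inequality for the upper bound on predictability. This is an assumption about the general technique, not a copy of the notebooks, and the function names `lz_entropy_rate` and `max_predictability` are hypothetical. Estimator variants differ in details such as how matches near the end of the sequence are handled.

```python
from math import log2


def _occurs(haystack, needle):
    """True if `needle` appears as a contiguous run inside `haystack`."""
    m = len(needle)
    return any(haystack[j:j + m] == needle for j in range(len(haystack) - m + 1))


def lz_entropy_rate(trajectory):
    """Lempel-Ziv style entropy-rate estimate in bits per step.

    For each position i, find the length of the shortest run starting at i
    that has not appeared in the history before i, then combine the lengths
    as n * log2(n) / sum(lengths). Cubic time: fine for a sketch only.
    """
    seq = tuple(trajectory)
    n = len(seq)
    if n < 2:
        return 0.0
    total = 0
    for i in range(n):
        k = 1
        while i + k <= n and _occurs(seq[:i], seq[i:i + k]):
            k += 1
        total += k
    return n * log2(n) / total


def max_predictability(entropy, n_states, tol=1e-9):
    """Solve Fano's inequality S = H(p) + (1 - p) * log2(N - 1) for p."""
    if n_states < 2 or entropy <= 0:
        return 1.0
    if entropy >= log2(n_states):
        return 1.0 / n_states  # no better than uniform guessing

    def fano_gap(p):
        binary_entropy = -p * log2(p) - (1 - p) * log2(1 - p)
        return binary_entropy + (1 - p) * log2(n_states - 1) - entropy

    # fano_gap is decreasing on [1/N, 1), so bisection finds the root.
    lo, hi = 1.0 / n_states, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# A strictly alternating (highly routine) toy trajectory.
routine = ["news", "email"] * 50
pi_max = max_predictability(lz_entropy_rate(routine), len(set(routine)))
```

For the alternating toy trajectory the estimated entropy rate is low, so the resulting bound `pi_max` is close to 1; a trajectory with no repeated structure would push the bound towards `1 / n_states`.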