Webserver logs access time by region/language

As anyone with a popular website knows, there’s a big difference in the resources required for peak vs. off-peak hours and you typically have to pay for peak usage even if you don’t always use it (e.g. 95th percentile bandwidth billing)

Frugal as I am, I was curious to see if I could increase traffic during what are off-peak hours. Seemed sensible that people in different regions of the world might be accessing during off-hours.

So I aggregated data by country code/language and 10-minute time segment. Applied a Daniell smoothing kernel (a sliding window) of 6 segments (1 hour) and plotted a a row-scaled heatmap in R. Rows are clustered so similar access patterns are next to one another, with the left-hand-side dendrogram indicating dissimilarity between rows. Yellow-white is a traffic burst. I’ll post the code and data later for how I made this.

access times by country/language

As it turns out, the main off-peak trough corresponds to the middle of the Pacific ocean. Kinda watery for people to live there. Oh well, I tried.