Kinja Data Team

it's A/B test time, my dudes

One of my jobs here at Gawker is understanding and communicating how users are interacting with our site. A stat I get asked about a lot is time-on-site or engaged-time-on-site: though the ability to approximate time-on-site has been around for years, Upworthy and Chartbeat deserve a lot of credit for re-imagining the concept and raising its profile through better tracking of engagement.

Here at Gawker, we love Google Analytics and we love Chartbeat, and we use them across all our properties. Both of them track time-on-site (well, time-on-page or engaged-time-on-page, which we can convert to time-on-site), though in different ways (Chartbeat does a good job laying out the difference here). I believe that Chartbeat's data better accounts for engagement, but I know Google's API is much more robust, so I'd rather grab the data from Google if I can.


I wanted to understand how different the two sources were, so I did a deep dive into it. I took the last 30 days of engaged time (Chartbeat) and session duration (Google) on each of our nine sites, and divided each daily total by the number of sessions that day (according to Google, since Chartbeat doesn't make that available).
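A minimal sketch of that per-session calculation, assuming the daily totals have already been pulled from both services. The `DailyTotals` record, its field names, and the example numbers are hypothetical and only illustrate the arithmetic; they are not real Gawker data or either vendor's API.

```python
from dataclasses import dataclass


@dataclass
class DailyTotals:
    """One site's totals for one day (hypothetical record layout)."""
    site: str
    chartbeat_engaged_seconds: float  # total engaged time reported by Chartbeat
    google_session_seconds: float     # total session duration reported by Google
    google_sessions: int              # session count reported by Google


def per_session_averages(day: DailyTotals) -> tuple[float, float]:
    """Divide both daily totals by Google's session count."""
    if day.google_sessions <= 0:
        raise ValueError("need at least one session to compute an average")
    chartbeat_avg = day.chartbeat_engaged_seconds / day.google_sessions
    google_avg = day.google_session_seconds / day.google_sessions
    return chartbeat_avg, google_avg


def relative_gap(chartbeat_avg: float, google_avg: float) -> float:
    """How far Chartbeat's average sits above Google's, as a fraction of Google's."""
    return (chartbeat_avg - google_avg) / google_avg


# Made-up totals, chosen only to show the arithmetic.
example = DailyTotals("example-site", 150_000.0, 100_000.0, 1_000)
cb_avg, ga_avg = per_session_averages(example)
gap = relative_gap(cb_avg, ga_avg)
print(f"Chartbeat: {cb_avg:.0f}s, Google: {ga_avg:.0f}s, gap: {gap:.0%}")
```

Both averages share Google's session count as the denominator, so any gap between them comes from the time totals alone, provided the two tools really are seeing the same sessions.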

Here is Average Session Duration, by day. Can you guess the outlier?

As you can see, for most of our blogs, there's very little difference between using Google and Chartbeat. However, for a couple of blogs, Chartbeat is reporting massively higher time-on-site. Since Chartbeat is only tracking 'engaged time', I expected it to be lower.

The biggest outlier, in the middle, is Lifehacker. Chartbeat is showing us engaged times about 50% above Google.


So which one is right? I started investigating this discrepancy and came up with a few hypotheses for why it could be happening:

- Lifehacker publishes articles on how to block tracking: maybe its readers' blockers work on Google's tags but not on Chartbeat's, so Google's session count is the wrong denominator for the traffic Chartbeat is actually tracking.


- We might be missing tags on parts of Lifehacker.

- Lifehacker users might be keeping certain pages up a really long time (like when they're on a DIY project) which Chartbeat is capturing but Google isn't.


- Lifehacker might have more single-page visits than other sites, and Google is treating those visits as bounces (no time on page).

After looking for evidence for the first three hypotheses, I wasn't coming up with anything that seemed likely:

- If blockers were hiding users from Google, real-time views as reported by Google and Chartbeat would be different. That wasn't the case, so I figured that was enough contradictory evidence to throw that out.


- I could not find any highly trafficked pages without tags, so I threw that out too: even if a few pages are untagged, the effect isn't meaningful.

- I know I shouldn't fall in love with hypotheses, but I was really hoping that there would be some section of Lifehacker (like DIY) where people had a single page open for really long periods of time. This would explain the discrepancy, since Chartbeat would be tracking users' activity over that whole long while. Sadly, while that's true for some pages, it wasn't a large part of traffic and wasn't causing the discrepancy.


Instead, the real issue was in single-page visits. This is a hard stat to identify, since I can't isolate these users with Chartbeat's API. So I decided to use data from our other sites to see if there was a statistical relationship between the Google-reported 'bounce-rate' and the difference between Google and Chartbeat time-on-site.

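There was. Bounce rate and the amount by which Google under-reports time-on-site (compared to Chartbeat) are strongly correlated (Pearson's r of over 0.75). Google measures time from the gap between successive hits, so a single-page visit with no further hits registers as zero time. While we do use Google's event tracking API, we don't have an event specifically to track time-on-page.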
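For readers who want to run the same kind of check, here is a minimal sketch of a Pearson correlation between per-site bounce rate and the Chartbeat-over-Google gap. The input numbers are invented for illustration and are not the data behind the result above; `pearson_r` is a plain textbook implementation, not necessarily the tool used for this analysis.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-site values: Google-reported bounce rate, and the
# fraction by which Chartbeat's time-on-site exceeds Google's.
bounce_rates = [0.40, 0.50, 0.60, 0.70]
time_gaps = [0.05, 0.10, 0.30, 0.45]

r = pearson_r(bounce_rates, time_gaps)
print(f"Pearson's r = {r:.2f}")
```

With only a handful of sites, a coefficient like this is suggestive rather than conclusive, so it is best read alongside a plausible mechanism for the relationship.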


This makes sense to me: lots of users come to Lifehacker with a specific question, spend time on the page finding the answer, then leave. Chartbeat captures the engaged time these users spend on the page: Google doesn't.
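The effect can be sketched with a simplified model of the two measurement styles. This assumes a timestamp-based tool computes session duration as the time between the first and last hit it receives, and an engagement-based tool sums the active seconds it observes on each page. The function names and visit data are hypothetical and do not reproduce either vendor's actual implementation.

```python
def timestamp_session_duration(hit_times: list[float]) -> float:
    """Seconds between the first and last recorded hit.

    With a single hit there is nothing to subtract from, so the
    visit counts as zero seconds however long the reader stayed.
    """
    if not hit_times:
        raise ValueError("a session needs at least one hit")
    return max(hit_times) - min(hit_times)


def engaged_session_duration(engaged_seconds_per_page: list[float]) -> float:
    """Total active time observed across every page in the visit."""
    return sum(engaged_seconds_per_page)


# A reader lands on one how-to page, reads for four minutes, then leaves.
bounce_hits = [0.0]
bounce_engaged = [240.0]

# A reader views three pages; hits arrive at 0s, 60s and 200s.
multi_hits = [0.0, 60.0, 200.0]
multi_engaged = [50.0, 120.0, 90.0]

for label, hits, engaged in [
    ("single-page", bounce_hits, bounce_engaged),
    ("multi-page", multi_hits, multi_engaged),
]:
    by_timestamps = timestamp_session_duration(hits)
    by_engagement = engaged_session_duration(engaged)
    print(f"{label}: timestamps={by_timestamps:.0f}s, engaged={by_engagement:.0f}s")
```

In this model the single-page visit contributes nothing to the timestamp-based total, and even the multi-page visit loses the time spent on its last page, so a site with a high share of single-page visits would show the widest gap between the two tools.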


Sadly for me, this means that I'm going to have to either implement additional events through the Google API, or grab attention time from Chartbeat and mash it up with the Google traffic numbers, with all the munging that it entails. But at least I feel confident about the numbers now.
