Kinja Data Team


Occasionally, I need to find how closely two things are correlated. This generally isn't too difficult. I just plug a few vectors into my favorite statistical software and boom: correlations.

However, once the math is done, I need to communicate this information to my intelligent, non-quantitatively oriented co-workers. They all "get" the concept of correlation, but if I say two variables have an r-squared of .57 (like here), they don't have any idea what that means.


Usually I say something like 'r-squared is a measure of how well variation in X explains variation in Y'. That's true, but it doesn't tell anyone whether .57 is high or low.
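For anyone who wants to see what the software is doing under the hood, here is a minimal sketch in plain Python. It assumes the simplest case, one X and one Y, where r-squared is just the squared Pearson correlation. The function name `r_squared` and the sample numbers are made up for illustration; they are not from any of the analyses discussed here.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations, and sums of squared deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r-squared is undefined when a variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Made-up numbers, purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(r_squared(x, y), 2))  # prints 0.64
```

In this toy data, 64% of the variation in y lines up with variation in x. Which is exactly the kind of number that needs context.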

To give these numbers some context, I put together this table of empirical r-squared values. Now I have real-world relationships between variables to compare my findings against. In the example I linked to above (bounce rate vs difference in reporting), the r-squared of .57 tells me those variables are almost as closely related as an athlete's height and weight: a pretty strong relationship.

Very Weak Relationships

.07 - Job tenure and earnings (US 79-94) (source)

.11 - Height and earnings (US, meta-study) (source)

.14 - Fathers' & sons' IQs (Norway 60s-80s) (source)

Weak Relationships

.22 - Brothers' IQs (Sweden) (Bjorklund et al 2010)

.24 - Net worth and BMI (in young baby boomers) (Zagorsky 2004)

.30 - Parents' years of schooling and child's years of schooling (source)

.38 - A country's life expectancy and GDP per capita (source)

Strong Relationships

.42 - Twin brothers' IQs (Sweden, includes both fraternal & identical twins) (Bjorklund et al 2010)


.49 - Parent and child wealth (US 68-99) (source)*

.51 - Death rate and % population without a diploma (US states) (source)

Very Strong Relationships

.61 - Athletes' height and weight (Olympic athletes) (source)
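If you want to do the comparison programmatically, a rough sketch might look like this. The values and category labels are copied from the table above; the names `BENCHMARKS` and `closest_benchmark` are my own invention, and "closest" here just means the smallest absolute difference in r-squared, which is an assumption about how you'd want to match.

```python
# Benchmarks copied from the table above: (r-squared, category, description).
BENCHMARKS = [
    (0.07, "very weak", "Job tenure and earnings"),
    (0.11, "very weak", "Height and earnings"),
    (0.14, "very weak", "Fathers' & sons' IQs"),
    (0.22, "weak", "Brothers' IQs"),
    (0.24, "weak", "Net worth and BMI"),
    (0.30, "weak", "Parents' and child's years of schooling"),
    (0.38, "weak", "A country's life expectancy and GDP per capita"),
    (0.42, "strong", "Twin brothers' IQs"),
    (0.49, "strong", "Parent and child wealth"),
    (0.51, "strong", "Death rate and % population without a diploma"),
    (0.61, "very strong", "Athletes' height and weight"),
]


def closest_benchmark(r2):
    """Return the table row whose r-squared is nearest to r2."""
    return min(BENCHMARKS, key=lambda row: abs(row[0] - r2))


value, category, description = closest_benchmark(0.57)
print(f"An r-squared of 0.57 is closest to: {description} ({value:.2f}, {category})")
```

For the .57 from my bounce rate example, this picks out athletes' height and weight, the same comparison I made by eye.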

The normal caveats of r-squared apply: it doesn't tell you anything about bias in your model, patterns the model is missing, multicollinearity, statistical significance, or data quality issues, any of which might be leading you to an incorrect conclusion. But if you're comfortable with your model, this will hopefully help you explain how well it's working.
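To make the "patterns" caveat concrete, here is a small sketch with made-up data where Y is completely determined by X, yet the linear r-squared comes out as zero because the relationship is a curve rather than a line. The helper `r_squared` is a hypothetical hand-rolled function (squared Pearson correlation), not any particular library's API.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Made-up data: y is exactly x squared, so x fully determines y.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [v * v for v in xs]

# The positive and negative halves cancel out, so a straight line explains nothing.
print(r_squared(xs, ys))
```

A near-zero r-squared here would look "very weak" next to the table, even though the two variables could not be more tightly linked. Plot your data before you trust the number.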


If you have any other examples of r-squareds (with citations), please send them to me or share them below. I hope to add to this over time.


*These are sources where I'm a little confused about what is going on. If anyone has corrections, please put them in the comments.
