A quick bug report: today we discovered that we had been preventing search engines from indexing our pages for the past few days.
I wish I could say that we (on the tech & product team) caught this, but editorial was all over it after some major drops in traffic yesterday. By 11AM today I had 4 emails reporting traffic issues.
Late last week, we made some changes to how our Content Delivery Network (Fastly) is set up. While the new architecture is performing really well, a bug was introduced that set an HTTP header on our responses to X-Robots-Tag: noindex. Effectively, we were telling search engines not to show our pages in any search results.
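For reference, this is how a crawler reads that header: any comma-separated directive list containing "noindex" on X-Robots-Tag tells the engine to drop the page from results. A minimal sketch in Python (the helper name is hypothetical, not our production code):

```python
# Sketch: detect a "noindex" directive in an X-Robots-Tag response
# header, roughly the way a crawler would. `has_noindex` is a
# hypothetical helper, shown only to illustrate the directive.

def has_noindex(headers):
    """Return True if any X-Robots-Tag header carries a noindex directive.

    `headers` maps header name -> value; the value may hold several
    comma-separated directives, e.g. "noindex, nofollow".
    """
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives = [d.strip().lower() for d in value.split(",")]
            if "noindex" in directives:
                return True
    return False

# The misconfigured responses looked roughly like this:
print(has_noindex({"Content-Type": "text/html",
                   "X-Robots-Tag": "noindex"}))   # True
print(has_noindex({"Content-Type": "text/html"}))  # False
```

A check like this against a handful of homepages would have caught the regression on day one instead of day four.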
How We Found & Fixed It
Google, to their eternal credit, flagged that the issue was in our HTTP headers and sent us an alert through email and Webmaster Tools. Once we knew the issue was in the HTTP headers (not the first or second place I would have looked), Chris and Ben found the offending line of code (a misconfiguration in our Varnish configuration) in 4 minutes, according to Slack.
Once the change was made, I used Fetch as Google in Google Webmaster Tools and resubmitted our homepages to try to speed up re-indexing. As of now (about 90 minutes after the bug fix), all sites are showing up in Google Search with the exception of Gawker.com, which I'm expecting to come back shortly.
How Will We Prevent This in the Future
Bugs will always happen, but there's more we can do to prevent them. This bug was missed in part because it lives in code that isn't reviewed as closely as our primary code repos. There's already been discussion about how we can do a better job of versioning this configuration code, which will make issues like this easier to identify in the future.
In terms of detection and response, we have been building out a number of dashboards and alerts to track things like signups on the Kinja platform and high-traffic posts. However, we haven't done anything with regard to monitoring our traffic sources. This is a wake-up call about the importance of that type of monitoring.
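As a sketch of what that kind of monitoring could look like (the numbers and threshold here are hypothetical, not our actual alerting setup), assume we log daily pageviews per traffic source and flag any source whose latest day falls sharply below its trailing average:

```python
# Sketch of a traffic-source drop alert. Threshold and sample data
# are hypothetical; the idea is to compare each source's latest day
# against its trailing average and flag steep drops.

def drop_alerts(daily_by_source, threshold=0.5):
    """daily_by_source maps source -> list of daily pageviews, oldest
    first. Returns sources whose latest day fell below
    (1 - threshold) times the trailing average."""
    alerts = []
    for source, days in daily_by_source.items():
        *history, latest = days
        if not history:
            continue  # need at least one prior day for a baseline
        baseline = sum(history) / len(history)
        if baseline > 0 and latest < (1 - threshold) * baseline:
            alerts.append(source)
    return alerts

# A noindex bug shows up as a collapse in search referrals while
# direct traffic holds steady:
traffic = {
    "google-search": [120_000, 118_000, 121_000, 30_000],
    "direct":        [80_000, 82_000, 79_000, 81_000],
}
print(drop_alerts(traffic))  # ['google-search']
```

Run on a schedule against our analytics data, a check like this would have paged us on the first day of the traffic drop instead of waiting for editorial's inboxes to fill up.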