What caused the Facebook outage?

On Monday 4 October 2021 Facebook, Instagram and WhatsApp suddenly went down. Kate Bevan explains what happened to the tech giant and what we can learn.

What were you doing during the Great Facebook Outage of 2021? While it wasn’t quite up there with some of history’s defining talking points, the absence of Facebook and its sister services Instagram and WhatsApp was a stark reminder not only of how connected we all are, but how much we rely on those connections. 

And it’s also a wake-up call about the dangers of putting too many eggs into a single basket – for us as individuals, for businesses that rely on these platforms, and for Facebook itself.

What happened to Facebook?

It seems that something went wrong during a configuration tweak at Facebook. It’s important to note that this almost certainly wasn’t a cyber-attack on Facebook. We don’t know – and may never know – if it was a genuine error, or something internal, but I am going with the principle of Hanlon’s Razor, which states ‘never attribute to malice that which is adequately explained by stupidity’.

The effect of this error was that Facebook stopped telling the rest of the internet that it exists. It turns out that Facebook doesn’t outsource any of its networking to third-party cloud providers like Amazon: it does everything itself.

Because Facebook handles all its networking this way, once it had vanished from the internet, none of its internal systems worked. That meant engineers couldn’t remote in to fix the error, and when employees began showing up at Facebook offices, their entry cards didn’t work and they couldn’t get into the buildings. Facebook’s internal email and messaging systems weren’t working, either.

You can read Facebook’s own explanation of the incident in its blog post here.

The nerdy bit

You can skip this bit if you’re not inclined to be geeky, but for those who are, this was a BGP routing failure. BGP stands for ‘border gateway protocol’ – it’s the system by which big standalone networks like Facebook connect to the rest of the internet via big routers. Those big routers hold maps of the routes that packets of information can use to make their way around the internet.

Without those maps – which are constantly updated – none of the standalone networks, or ‘autonomous systems’ as they are known, can communicate with each other.

For some reason, what should have been a routine update from Facebook to those maps went wrong. Because it couldn’t announce its presence to the internet, Facebook in effect vanished from the internet. 
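For the properly geeky, here is a minimal Python sketch of the idea. It is a toy, not how real routers work: the `RoutingTable` class, the network name `ExampleSocial` and the address range (one reserved for documentation) are all made up for illustration, and real BGP involves a great deal more than a simple lookup table.

```python
import ipaddress


class RoutingTable:
    """Toy 'map' of announced address ranges to the network that announced them."""

    def __init__(self):
        # Maps an ip_network object to the (hypothetical) name of its origin network.
        self.routes = {}

    def announce(self, prefix, origin):
        """Record that `origin` says it can be reached at `prefix`."""
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        """Remove an announcement; does nothing if the prefix was never announced."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin of the most specific matching route, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the destination has 'vanished'
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


# Illustrative values only: 203.0.113.0/24 is a documentation range.
internet = RoutingTable()
internet.announce("203.0.113.0/24", "ExampleSocial")
before = internet.lookup("203.0.113.10")

# A faulty update withdraws the announcement.
internet.withdraw("203.0.113.0/24")
after = internet.lookup("203.0.113.10")

print(before, after)
```

Before the withdrawal the lookup finds `ExampleSocial`; afterwards it returns `None`. The servers at that address could be running perfectly well, but with no entry in the map nobody else knows how to reach them.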

My personal favourite explanation of what happened came via Twitter last night:

The knock-on effect

Of course, it wasn’t just Facebook that went down: Instagram and WhatsApp, both of which are owned by Facebook, also went dark. I first noticed this when I was chatting with Harry Rose on WhatsApp.

There were two big fallouts: first, an awful lot of people use their Facebook profile to log in to other websites and services, and so couldn’t connect to those.

Using a social platform to sign in to other sites is considered a bad idea by security folk: first, it means you’re sharing all that data of your shopping and browsing with Facebook (or Google, or Twitter if you use those logins). Second, it means that if your Facebook password is stolen or compromised, cyber-criminals then can get in to all those other sites you’ve connected to Facebook.

And third, as we discovered yesterday, if Facebook goes down, you’re locked out of everything else, too.

The second big fallout was that other services started creaking under the strain. Twitter was the most obvious victim of this: as people couldn’t connect on Facebook, they turned to Twitter to find out what was going on. Additionally, people started hitting the refresh button on Facebook/Instagram/WhatsApp, which meant networks were swamped with traffic and other parts of the internet slowed down.

What does it all mean?

The Facebook outage reveals that it’s not the best idea to put all your eggs into one basket. Facebook engineers will be grappling with that lesson today, while the rest of us should think about having other systems in place for our personal comms, and for our work comms too. 

I spent a fair bit of the evening on Twitter and Signal discussing the outage, and today I’m thinking about replicating some of my WhatsApp groups on Signal as a back-up option.

What apps do you use to communicate with your friends, family and colleagues? Did you even notice that all these services were down last night? And if so, did you go elsewhere?

Let us know your thoughts on what happened in the comments below. 

Comments

Call me cynical, but the previous day’s whistleblower revelations played a huge part in it, I believe. You said “for some reason”, well I reckon that could well be it; they possibly engaged in a damage limitation exercise either to remove something or prevent further access to something, by completely locking down until everything was secure.* I don’t imagine for a moment that everybody was locked out. Of course they would claim that to be the case. I quite like the idea of Mark’s response to critics being simply stating “let’s see how they manage without my platforms for a while then..” 😂

*totally not a conspiracy nut, though, don’t worry 😂

I’m assuming Which? read The Register’s explanation? But missed the bit where it thought it was Facebook’s automatic networking program that came up with a faulty routing table. The worst bit is that that was developed to save 8 hours’ work a week by a human!

To answer the three questions at the end of the Intro —
:: I use a telephone to communicate with our family and friends if and when necessary.
:: I had no idea Facebook et al were down until I saw it in a news report.
:: I was not affected in any way by the outage.

I don’t know why people are so keen to discover the cause of the fault; it’s one of those “if it can happen it will happen” incidents, and now it has... and it will probably do so again one day.

I admit to finding it funny that Facebook had to take to its great rival Twitter to tell the world about its problem.

I’m surprised you didn’t mention DNS. Facebook’s servers were inaccessible because there was no DNS mapping of Facebook’s hostnames (e.g. facebook.com) to the IP addresses of Facebook’s servers (e.g. 31.13.83.36 or 2a03:2880:f130:83:face:b00c:0:25de).

He’s a good guy 🙂

I only use Facebook to help promote a small society, but that has helped increase our membership in the past couple of years despite the pandemic. I don’t look at FB often and the outage had passed me by.

Kate has made this important point in her introduction:
“Using a social platform to sign in to other sites is considered a bad idea by security folk: first, it means you’re sharing all that data of your shopping and browsing with Facebook (or Google, or Twitter if you use those logins). Second, it means that if your Facebook password is stolen or compromised, cyber-criminals then can get in to all those other sites you’ve connected to Facebook.”

Knowing that it is not a good idea to use the same password for different websites in case one is compromised I have not used my FB password to access other online services, despite being invited to. Perhaps it’s time to stop FB and other companies doing this.

Starcloud says:
8 October 2021

Recently I have found that I could log on to a few websites only via either FB or Google, with no option to log in directly. Needless to say I have only done this once, when I really needed to.

I am on FB but don’t like it or use it, not my bag. However I do use Whatsapp although I am refusing to accept the new FB terms, like many others I know. Will decide what to do if and when they close my account. And not being a constant user I didn’t notice the outage.

Kevin says:
7 October 2021

I’d like to thank the Facebook network team for giving the world a break, if only for a few hours, from this cancerous Internet parasite.

I’m sure their data security and ethics teams are as well versed in the important aspects of their work as the network team, who couldn’t possibly have predicted what effect messing up BGP would have. After all, it was, in the political meaning of the word, unprecedented.

I would like to thank Kate for a relatively easy explanation of what occurred and for advising caution on “Using a social platform to sign in to other sites”. Although I was one of those who attempted to reload Facebook a few times over a few minutes, the outage resulted in some peace and quiet.

The family all use group WhatsApp but then I also have their phone numbers to call if needed – so that wasn’t missed.

What I did notice was how many people I am in contact with via messenger, for which I don’t have any other contact details.

Just signed up to this conversation and find Which? is inviting me to sign through Facebook or Twitter.
Kate, isn’t this exactly what you say in your piece “is considered a bad idea by security folk”?
Explain please!

I am a geek but I enjoyed your explanation.
However, you made an excellent point about using, or rather not using, social network logins to log in to other sites. Then I was immediately offered a “log in with Facebook” link.

WHY?

You asked “What were you doing during the Great Facebook Outage of 2021?”

Answer: I was blissfully unaware of any ‘problem’ until it came up on Radio 4’s news at 6pm.
Get a life!

BrianL says:
8 October 2021

Never used FB, so didn’t know that there was a problem until, like stevegs, I heard about it on the Radio 4 News. And I don’t live in a cave. I actually mentor & coach people in IT.

Like a few of your respondents, I do not use Facebook, Twitter or WhatsApp – cannot see any need. I only heard about the outage on BBC news. I do use email though, normally to pass on amusing comments or stories from my friends. My mobile phone is about as ‘unsmart’ as you can get – probably generation -1. And I do not suffer as a result.

There is discussion of DNS being down. Well, I went in – or tried to, purely for diagnostic purposes – to the IP address that was in my last cached DNS – from memory 157. somethingorother – and I got a pretty bland, but nonetheless very clearly Facebook-liveried page explaining the error!

I am a regular user of WhatsApp, whose facility to confirm that messages have arrived and been viewed is extremely useful. My ultimate back-up is SMS text: this requires only a minimal mobile connection (i.e. not the internet) and can hence often get through when there is no data/internet available and even phoning is unreliable.

John H says:
8 October 2021

The first I knew about the facebook crash was reading this article tonight (8/10/21). I have an elderly phone and an elderly PC, I carry the phone and can usually get a signal when smart phones fail and text messages seem to go from anywhere. Why do I need facebook? I can never find anything on it that is useful.

“Why do I need facebook? I can never find anything on it that is useful.”

It’s the other way round: FB needs you – or more precisely – Facebook needs the billions it has become so adept at monetising.

This is one reason why.

Richard says:
8 October 2021

DNS relies on the routes that BGP announces, and as it was those that were down, nobody could reach Facebook’s DNS servers to look up its IP addresses.