Facebook May Be Abusing Your Website With Its facebookexternalhit Bot

It's pretty normal for online services to show up in your hosting account's bandwidth logs, thanks to "bots" (automated software) that request information from your site. Every website has a normal background noise of this kind of traffic, but in this case, the facebookexternalhit bot is sucking up gigabytes of bandwidth that it shouldn't be.
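If your host gives you raw access logs, a rough way to see how much of your bandwidth this bot accounts for is to total the response sizes by user agent. The sketch below assumes logs in the common Apache/Nginx "combined" format. The function name and the sample log lines are made up for illustration, so adjust the pattern if your host logs differently.

```python
import re
from collections import defaultdict

# Assumed layout (combined log format):
#   ... "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

def bytes_by_agent(lines, needle="facebookexternalhit"):
    """Return (bot_bytes, other_bytes) summed over the given log lines."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        size = 0 if match["bytes"] == "-" else int(match["bytes"])
        key = "bot" if needle in match["agent"].lower() else "other"
        totals[key] += size
    return totals["bot"], totals["other"]

# Hypothetical sample lines, not taken from a real log.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /img/a.jpg HTTP/1.1" '
    '200 2048000 "-" "facebookexternalhit/1.1 '
    '(+http://www.facebook.com/externalhit_uatext.php)"',
    '198.51.100.7 - - [10/Oct/2024:13:55:40 +0000] "GET / HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0"',
]

bot_bytes, other_bytes = bytes_by_agent(sample)
print(f"bot: {bot_bytes} bytes, everything else: {other_bytes} bytes")
```

To run it against a real log, pass an open file instead of the sample list, for example `bytes_by_agent(open("access.log"))`, where the file name is whatever your host uses.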

There are two possible explanations for how this is happening: 1) third parties are using Facebook's infrastructure to attack a site, or 2) Facebook is pulling in the content for a shared link directly from your site every time it's viewed, instead of caching or storing that preview on its own servers (skipping the cache saves them server costs).

In either case, from what I've read online so far, Facebook has been slow to respond to the issue, since it doesn't directly affect them. Plenty of site owners are talking about it, though.

Why is this a problem exactly?

The two main costs of web hosting are storage space and monthly bandwidth. In this case, the problem is the bandwidth. Any particular hosting plan allows only a limited amount of data to be transferred from your site each month, and every time a file is accessed, its size counts against that limit. Think of it like a cell phone minutes plan. The limit can be used up by a single file being accessed an unusually high number of times, or by a larger number of files each being accessed a smaller number of times (it adds up).
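To put rough numbers on "it adds up," here is a small back-of-the-envelope calculation. The file sizes and hit counts are invented for illustration, and it treats 1 GB as 1000 MB, which may not match how your host measures bandwidth.

```python
def monthly_transfer_gb(file_size_mb, hits_per_day, days=30):
    """Estimate monthly transfer in GB for one file (1 GB = 1000 MB here)."""
    return file_size_mb * hits_per_day * days / 1000

# Case 1: one 2 MB image fetched 500 times a day.
single_file = monthly_transfer_gb(2, 500)

# Case 2: forty 0.5 MB files, each fetched 50 times a day.
many_files = 40 * monthly_transfer_gb(0.5, 50)

print(single_file, many_files)  # both come to 30.0 GB per month
```

Either pattern would eat 30 GB of a monthly allowance, which is why a bot that re-downloads the same preview files over and over can matter more than it first appears.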

With the Facebook bot, what happens is that when a link is shared on Facebook, a "preview" is displayed, typically consisting of an image and some text. So their bot scrapes this content from your website (downloading the files, and typically many more than it needs) in order to decide what it wants to use. Every time that shared link is viewed, the process triggers again. Facebook COULD mitigate the issue by caching that data on their servers, but it would cost them a lot of money to do this for all the links that are shared, so it's in their interests to pass that cost on.

What can be done?

Some possible ways to mitigate the problem include caching plugins, firewalls, and rate-limiting plugins, though no one solution is bulletproof. If your website is on our maintenance plan, the plan includes some of these options, which we can implement for you.

If your website is a WordPress website, you can also try manually installing this plugin: https://github.com/nadimtuhin/Facebook-Request-Throttle-WordPress-Plugin
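For the curious, the general idea behind rate limiting a bot like this can be sketched in a few lines. This is a minimal illustration, not the code of the plugin linked above: the class name, the two-second interval, and the in-memory state are all assumptions, and whether the crawler actually backs off when it receives a 429 response is not guaranteed.

```python
import math
import time

class BotThrottle:
    """Allow at most one request per `min_interval` seconds from a matching bot."""

    def __init__(self, needle="facebookexternalhit", min_interval=2.0,
                 clock=time.monotonic):
        self.needle = needle              # substring to look for in the user agent
        self.min_interval = min_interval  # assumed value, tune for your site
        self.clock = clock                # injectable so the logic can be tested
        self.last_allowed = None          # time of the last bot request we served

    def check(self, user_agent):
        """Return (status, headers): 200 to serve the request, 429 to reject it."""
        if self.needle not in user_agent.lower():
            return 200, {}  # not the bot, never throttled here
        now = self.clock()
        if self.last_allowed is not None:
            elapsed = now - self.last_allowed
            if elapsed < self.min_interval:
                wait = math.ceil(self.min_interval - elapsed)
                return 429, {"Retry-After": str(wait)}
        self.last_allowed = now
        return 200, {}

throttle = BotThrottle()
print(throttle.check("facebookexternalhit/1.1"))  # first bot request is served
print(throttle.check("facebookexternalhit/1.1"))  # an immediate repeat gets a 429
print(throttle.check("Mozilla/5.0"))              # regular visitors are unaffected
```

A real deployment would need to keep the timestamp somewhere shared (a database row or cache entry, for instance), since most web servers handle requests across several processes.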

Posted by Nathan Lyle in Security, Social Media.

About Nathan Lyle

Nathan is a father of four, an amateur musician, and an aspiring photographer. He started programming in 4th grade on an Apple II+ and many years later spent much of his college years freelancing website design for college departments. Nathan is a veteran of the Browser Wars, and will gladly talk at length about the changes he has seen in Web technology if you accidentally ask him.
