Nitter is an alternate UI for Twitter that is simpler, faster, and free of
tracking. I mainly set it up so that I could share Twitter links with my
friends, without them having to visit Twitter proper. It's obvious in hindsight,
but a Nitter instance is basically a proxy for the entirety of Twitter, so any
bots that start crawling it are effectively trying to suck all of Twitter
through my little server.

Nitter doesn't have a `robots.txt` by default, so the first thing I did was fork
it and [add one that blocked all robots][robots.txt]. Unsurprisingly, this
didn't have an immediate impact, and I was concerned that if the traffic kept up
I'd hit my bandwidth limit and have to start paying per GB thereafter.
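
For reference, a `robots.txt` that asks every crawler to stay away from the
whole site generally looks like the following. This is the generic form, not
necessarily byte-for-byte what the linked commit contains, and it only works
for crawlers that choose to honour it:

```text
# Applies to every crawler that respects robots.txt
User-agent: *
# Disallow every path on the site
Disallow: /
```
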

I concluded the best option for now was to block most traffic to the instance
until I could work out what to do. Since [Varnish] fronts most of my web
traffic, I used it to filter requests and return a very basic 404 page for all
but a handful of routes. A day later, the impact of this change is obvious in
the usage graphs.

{{ figure(image="posts/2021/nitter-bandwidth/daily-bandwidth.png", link="posts/2021/nitter-bandwidth/daily-bandwidth.png", alt="Chart showing daily bytes sent and received for the last 30 days. The last day shows a significant drop", caption="Daily data sent and received for the last 30 days") }}
{{ figure(image="posts/2021/nitter-bandwidth/network-usage.png", link="posts/2021/nitter-bandwidth/network-usage.png", alt="Chart showing network activity for the last week with a significant drop in the last two days", caption="Network activity, last 7 days") }}
{{ figure(image="posts/2021/nitter-bandwidth/cpu-usage.png", link="posts/2021/nitter-bandwidth/cpu-usage.png", alt="Chart showing CPU usage for the last week with a significant drop in the last two days", caption="CPU usage, last 7 days") }}
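
The Varnish filtering described above could be sketched roughly as follows.
This is a minimal illustration and not my exact configuration: the hostname
`nitter.example.com`, the backend address, and the list of allowed routes are
all placeholders, and it assumes VCL 4.0 syntax.

```vcl
vcl 4.0;

# Assumed backend: a Nitter instance listening locally (address is a placeholder)
backend nitter {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Only filter requests for the Nitter vhost (hypothetical hostname)
    if (req.http.host == "nitter.example.com") {
        # Illustrative allow-list: robots.txt, static assets, individual tweets.
        # Anything else gets a synthetic 404 without touching the backend.
        if (req.url !~ "^/(robots\.txt$|css/|js/|pic/|[^/]+/status/[0-9]+)") {
            return (synth(404, "Not Found"));
        }
    }
}

sub vcl_synth {
    # Keep the error page tiny so blocked requests cost almost no bandwidth
    if (resp.status == 404) {
        set resp.http.Content-Type = "text/plain; charset=utf-8";
        synthetic("Not found");
        return (deliver);
    }
}
```

Answering from `vcl_synth` means blocked requests never reach Nitter at all,
which is why both bandwidth and CPU usage fall together in the graphs.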