Format nitter post

Wesley Moore 2021-08-26 19:23:35 +10:00
parent 0f85f2165f
commit b86b79c000
GPG key ID: BF67766C0BC2D0EE


@@ -6,9 +6,10 @@ date = 2021-08-26T09:10:54+10:00
 #updated = 2021-05-15T10:15:08+10:00
 +++

-On 24 August I received an email from Vultr saying that my server had used 78% of its 3Tb
-bandwidth allocation for the month. This was surprising as last time I looked I only used
-a small fraction of this allocation across the various [things I host][alpine-docker].
+On 24 August I received an email from Vultr saying that my server had used 78%
+of its 3Tb bandwidth allocation for the month. This was surprising as last time
+I looked I only used a small fraction of this allocation across the various
+[things I host][alpine-docker].

 After some investigation I noticed that the [Nitter] instance I [set up six
 months ago][nitter-instance] at `nitter.decentralised.social` seemed to be
@@ -16,6 +17,8 @@ getting a lot of traffic. In particular it seemed that there were several
 crawlers including Googlebot and bingbot attempting to index the whole site and
 all its media.

+<!-- more -->
+
 Nitter is an alternate UI for Twitter that is simpler, faster, and free of
 tracking. I mainly set it up so that I could share Twitter links with my
 friends, without them having to visit Twitter proper. It's obvious in hindsight
@@ -40,13 +43,14 @@ the usage graphs.
 {{ figure(image="posts/2021/nitter-bandwidth/cpu-usage.png", link="posts/2021/nitter-bandwidth/cpu-usage.png", alt="Chart showing CPU usage for the last week with a significant drop in the last two days", caption="CPU usage, last 7 days") }}

-After letting the changes sit overnight I was still seeing a lot of requests from user-agents
-that appear to be Chinese bots of some sort. They almost exactly matched the user-agents
-in this blog post:
-[Blocking aggressive Chinese crawlers/scrapers/bots](https://www.johnlarge.co.uk/blocking-aggressive-chinese-crawlers-scrapers-bots/).
+After letting the changes sit overnight I was still seeing a lot of requests
+from user-agents that appear to be Chinese bots of some sort. They almost
+exactly matched the user-agents in this blog post: [Blocking aggressive Chinese
+crawlers/scrapers/bots](https://www.johnlarge.co.uk/blocking-aggressive-chinese-crawlers-scrapers-bots/).

-As a result I added some additional configuration to Varnish to block requests from these
-user-agents, as they were clearly not honouring the `robots.txt` I added:
+As a result I added some additional configuration to Varnish to block requests
+from these user-agents, as they were clearly not honouring the `robots.txt` I
+added:

 ```c
 sub vcl_recv {
@@ -60,10 +64,11 @@ sub vcl_recv {
 ### What Now?

-I liked having the Nitter instance for sharing links but now I'm not sure how to run it in
-a way that only proxies the things I'm sharing. I don't really want to be responsible for
-all of the content posted to Twitter flowing through my server. Perhaps there's a project
-idea lurking there, or perhaps I just make my peace with linking to Twitter.
+I liked having the Nitter instance for sharing links but now I'm not sure how
+to run it in a way that only proxies the things I'm sharing. I don't really
+want to be responsible for all of the content posted to Twitter flowing through
+my server. Perhaps there's a project idea lurking there, or perhaps I just make
+my peace with linking to Twitter.

 [alpine-docker]: https://www.wezm.net/technical/2019/02/alpine-linux-docker-infrastructure/
 [Nitter]: https://github.com/zedeus/nitter