wezm.net/v2/content/posts/2024/pleroma-archive.md

2.6 KiB

+++ title = "Generating a Static Website From a Pleroma Archive" date = 2024-11-25T20:21:44+10:00

#[extra] #updated = 2024-06-06T08:24:45+10:00 +++

Almost two years ago, in Jan 2023 I migrated from my Fediverse presence from my self-hosted Pleroma instance to a single user Mastodon instance hosted by masto.host. Since then I've wanted to retire the Pleroma instance, but I didn't want to just take it offline. I wanted to preserve my posts and links to them. That became a priority over the weekend so I built a tool, pleroma-archive to do it.

A few months after the switch to Mastodon I tried pulling my posts via RSS hit a runtime error. I reported it to the project but nothing came of it.

I ignored it for another 18 months until this weekend when I tried to upgrade my PostgreSQL server from version 12 to 16 and migrate it to a new host. I chose the dump and load method of doing this but when restoring the Pleroma database it appeared to get stuck building one of the indexes. I eventually concluded that it was going to take something like 30 hours to complete. I'm not the first one to hit this problem either.

Retiring Pleroma had now become a priority. I discovered that there was now an account backup option in the import/export section of the settings. I downloaded my archive and set about building a tool that could generate a website from it.

As usual I built the tool in Rust, my scripting language of choice. It's imaginatively called pleroma-archive. It generates an index page of all posts as well as a page for each individual post. The public URLs that Pleroma uses are not part of the archive, so for each post the tool does a HEAD request with the post id to determine the public URL of the post. The results of this are cached so it only needs to do it once for each post.

With a little bit of help from Nginx try_files a page like notice/ARQGKLTJNiP8Lu2gT2.html can be served at /notice/ARQGKLTJNiP8Lu2gT2, matching the URL it had when served by Pleroma:

location / {
    try_files $uri $uri/index.html $uri.html =404;
}

The end result is at https://decentralised.social/.

Source code and instructions for using pleroma-archive is available at https://forge.wezm.net/wezm/pleroma-archive.