@bclindner I’m seeing instance slowdown. iotop showed a Postgres DELETE query using a lot of disk IO and PGHero indicates there are *nine* long-running queries currently being executed. Some have been active for almost 40 minutes (whaaaat).

My theory is that a user with many statuses has deleted their account and Postgres is chewing through the work to make that happen. I’m monitoring, and if need be we can kill the queries in PGHero.

@bclindner ah, forgot to say that one of the queries I found was this:

DELETE FROM "accounts" WHERE "accounts"."id" = $1

It only took a few minutes to complete though 😅

@bclindner haha wait, I’m seeing mdszy in the logs when I grep for delete... maybe @skelly’s auto-delete script is running?

@bclindner okay so, all the queries I killed were identical, a complex SELECT (I took a screenshot). So far there’s one that’s come back. I’m going to set a two-minute timeout on our Postgres install to prevent this from happening again. Two minutes is still very generous, and longer than our nginx timeout, so the API should be unaffected. Let me know if you (or, indeed, any curious watchers) have questions!

@ashfurrow @bclindner oh yeah

it's still running

had that been causing problems??? oh dear

@skelly @bclindner I have no idea! I mean, I saw the old username in the logs. When a post is deleted, we also send deletion requests to every server we federate with, and those sometimes fail and get retried with exponential backoff.

To be abundantly clear: there’s nothing wrong with your script or with you running it. The database should be able to handle this, but it isn’t. I think I know why but I’m not looking forward to fixing it...
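
For anyone curious what "retried with exponential backoff" means for those deletion requests, here is a rough Python sketch. It is not Mastodon's actual retry code; `backoff_delay`, `deliver_with_retries`, the one-second base delay, the 60-second cap, and the five-attempt limit are all made-up values for illustration. The idea is just that each failed delivery waits about twice as long as the previous one before trying again.

```python
import time
from typing import Callable


def backoff_delay(attempt: int, base_seconds: float = 1.0, cap_seconds: float = 60.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (0-based).

    The delay doubles each time (base * 2**attempt) and is capped so it
    never grows without bound. Base and cap are illustrative assumptions.
    """
    return min(cap_seconds, base_seconds * (2 ** attempt))


def deliver_with_retries(
    send: Callable[[], bool],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `send` until it returns True or the attempts run out.

    `send` is a hypothetical stand-in for "deliver one deletion request
    to one remote server"; it returns True on success.
    """
    for attempt in range(max_attempts):
        if send():
            return True
        # Only wait if another attempt is still coming.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return False
```

With these assumed defaults, a remote server that keeps failing is retried after waits of 1, 2, 4, and 8 seconds before the delivery gives up. Passing `sleep` as a parameter makes the function easy to exercise without actually waiting.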
