> What I had missed is that we deployed a new internal service last week that sent less than three GetPostRecord requests per second, but it did sometimes send batches of 15-20 thousand URIs at a time. Typically, we'd probably be doing between 1-50 post lookups per request.
That’ll do it.
tapoxi
I don't really understand this architecture, but I thought Bluesky was distributed like Mastodon? How can it have an outage?
goekjclo
> The timing of these log spikes lined up with drops in user-facing traffic, which makes sense. Our data plane heavily uses memcached to keep load off our main Scylla database, and if we're exhausting ports, that's a huge problem.
I expect this is common.
mwkaufma
Tell us more about this buggy "new internal service" that's scraping batch data :P
drewg123
Golang's use of a potentially unbounded number of threads is just insane. I used to be fairly bullish on golang, but this, combined with the fact that it's garbage collected, makes me feel it's just unsuitable for production use.
pembrook
Distributed social media goes down? hrmmm.
Email and the internet don't have "downtime." Certain key infra providers do of course. ISPs can go down. DNS providers can go down. But the internet and email itself can't go down absent a global electricity outage.
You haven't built a decentralized network until you reach that standard imo. Otherwise it's just "distributed protocol" cosplay. Nice costume. Kind of like how everybody has been amnesia'd into thinking Obsidian is open source when it really isn't.
gsibble
Did all 3 users notice?
electrondood
Great write up... curious about the RCA. Thanks!
rvz
Thank you for the post mortem on this outage.
jonstaab
nostr never goes down
jmclnx
Light blue on a dark blue background. That is a new one; I have seen grey text on light grey, but blue on blue?
The article does work in lynx, at least I can read it.