League of Legends players worldwide couldn’t login for hours because Riot forgot to renew the client’s SSL certificate—just like it did 10 years ago

AmbiguousProps

I work in DevOps, and this is one of the easier things to automate. It’s common for certs to be issued on a 90-day basis these days; there’s no way that would be maintainable without automation.

Limerance

The problem sometimes is the automation failing for some reason.

@[email protected]

Future generations using AI to automate this kind of thing will probably make it even worse.

@[email protected]

Have you had Certbot or LE fail on prod for you before?

I’m sure stuff happens, but I usually view them as one of the most robust moving parts on a server.

E: I don’t mean to express disbelief at all; just curious to learn about possible footguns.

@[email protected]

Certbot / LE has to be running on some machine, and that machine can be accidentally turned off, have its bill go unpaid, get migrated to a new instance that doesn’t work, sit behind a gateway whose configuration changed, etc.

Automation requires maintenance, and that introduces human error.

@[email protected]

Certbot/LE should typically be running on the box that’s terminating TLS for you, right? If the box handling your traffic is down, shouldn’t that be a self-evident problem?

I’ve been running Caddy and certbot for nearly a decade and never found a way for them to break without it being 100% my fault. They’re more or less self-healing too. I’m with AmbiguousProps; cert renewals have been pretty damn reliable to automate compared to any other piece of tech, IME.

AmbiguousProps

Like dgdft said, if you’re using certbot, it should typically be running on the machine that your endpoints are hosted on. Enterprise solutions don’t require this, but they have other means of deploying certificates automatically and alarming, before the certs expire, if they are unable to. My organization has dashboards showing which certs expire and when, and they trigger alarms at least a month before anything goes wrong.

High-stakes automation should always have alarms on error, and since certs have set expiration dates baked into them, you can alarm long before anything goes wrong. Apparently, Riot didn’t have that.
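For anyone wondering what an expiry alarm like that looks like in practice, here is a minimal sketch using only the Python standard library. The 30-day threshold and the function names are assumptions for illustration, not anything Riot or a particular vendor uses; a real setup would feed the result into whatever paging or dashboard system is already in place.

```python
import socket
import ssl
from datetime import datetime, timezone

# Assumption: alarm a month ahead, mirroring the lead time mentioned above.
ALARM_THRESHOLD_DAYS = 30


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a notAfter string such as
    'Jan  5 09:34:43 2018 GMT' (the format Python's ssl module reports)."""
    expires_at = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires_at - now).total_seconds() / 86400


def should_alarm(not_after: str, now: datetime,
                 threshold_days: int = ALARM_THRESHOLD_DAYS) -> bool:
    """True when the certificate is inside the alarm window (or already expired)."""
    return days_until_expiry(not_after, now) < threshold_days


def fetch_not_after(host: str, port: int = 443) -> str:
    """Connect to host:port and return the leaf certificate's notAfter field.

    With the default context, an already-expired certificate makes the
    handshake fail with a verification error, which should itself be
    treated as an alarm by the caller.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Run from a scheduler, something like `should_alarm(fetch_not_after("example.com"), datetime.now(timezone.utc))` is enough to drive a page or a dashboard tile. The key point is that the check runs somewhere other than the renewal job, so a dead renewal box still gets noticed.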

Also, more frequent renewals make it so that people are less likely to forget it exists. Because of that, along with the possible security ramifications, 2 to 10 year certs should never be used, in my opinion. A 10 year cert will always get kicked on to the next team and it’s very possible for things to fall through the cracks.

@[email protected]

Yeah, I’ve had certbot mess up a few times, though more often it was the scripts that actually shuttle the updated certs to their proper locations and restart services after updating.
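As a rough illustration of that "shuttle and restart" step, here is a sketch of a copy-and-reload helper that fails loudly instead of silently. Certbot does support `--deploy-hook` and sets `RENEWED_LINEAGE` for the hook; everything else here (the function name, destination directory, and reload command) is hypothetical.

```python
import os
import shutil
import subprocess
from pathlib import Path

# File names certbot keeps in a lineage directory; we copy the pair a
# typical TLS-terminating service needs.
CERT_FILES = ("fullchain.pem", "privkey.pem")


def install_renewed_cert(lineage: Path, dest: Path, reload_cmd: list[str]) -> None:
    """Copy the renewed cert and key into `dest`, then reload the service.

    Raises on any failure so the hook exits non-zero and monitoring can see it.
    """
    # Check everything up front so we never install half of a cert/key pair.
    missing = [name for name in CERT_FILES if not (lineage / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {lineage}: {', '.join(missing)}")

    for name in CERT_FILES:
        tmp = dest / (name + ".tmp")
        shutil.copy2(lineage / name, tmp)   # follows certbot's symlinks
        os.replace(tmp, dest / name)        # atomic swap on the same filesystem

    # check=True turns a failed reload into an exception instead of a silent no-op.
    subprocess.run(reload_cmd, check=True)
```

A hook script registered via `certbot renew --deploy-hook` could then call `install_renewed_cert(Path(os.environ["RENEWED_LINEAGE"]), Path("/etc/myservice/tls"), ["systemctl", "reload", "myservice"])`, where the path and service name are placeholders.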

The issue here is that this is a client certificate, issued within the League client, for seemingly local<->local traffic. This ain’t no typical HTTPS certificate; it’s bundled into the client build. From the source: “League client’s hard-coded certificate meant someone at Riot would’ve needed to remember it required updating before its expiration date.” So, not quite as easy as configuring an ACME cron job, but something that’d need to be remembered or have some kind of internal reminder.

AmbiguousProps

I’m aware, but it should have been part of their build system and they should have, at the very least, had alarms for this.
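To make the build-system idea concrete, here is a hedged sketch of a build gate that refuses to ship a client whose bundled certificate is close to expiry. It shells out to the `openssl x509 -enddate -noout` command and parses its `notAfter=` line; the 60-day policy, the function names, and the idea that the certificate lives in a PEM file in the repo are all assumptions, since nothing public describes how Riot's build actually works.

```python
import ssl
import subprocess
from datetime import datetime, timezone

# Hypothetical policy: never ship a build whose cert has under 60 days left.
MIN_DAYS_REMAINING = 60


def parse_enddate(openssl_output: str) -> datetime:
    """Parse a line like 'notAfter=Jan  5 09:34:43 2018 GMT' into a UTC datetime."""
    label, _, value = openssl_output.strip().partition("=")
    if label != "notAfter" or not value:
        raise ValueError(f"unexpected openssl output: {openssl_output!r}")
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(value), tz=timezone.utc)


def bundled_cert_expiry(pem_path: str) -> datetime:
    """Ask the openssl CLI for the expiry date of the PEM certificate at pem_path."""
    result = subprocess.run(
        ["openssl", "x509", "-enddate", "-noout", "-in", pem_path],
        check=True, capture_output=True, text=True,
    )
    return parse_enddate(result.stdout)


def check_build(expires_at: datetime, now: datetime,
                min_days: int = MIN_DAYS_REMAINING) -> None:
    """Raise if the bundled certificate expires too soon to ship."""
    remaining = (expires_at - now).days
    if remaining < min_days:
        raise RuntimeError(
            f"bundled certificate expires in {remaining} days "
            f"(minimum {min_days}); refusing to build"
        )
```

Calling `check_build(bundled_cert_expiry("client/cert.pem"), datetime.now(timezone.utc))` as a CI step (the path is a placeholder) turns "someone has to remember" into a red build weeks before players are affected.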

@[email protected]

Even the simplest things fail sometimes

AmbiguousProps

That’s what alarming is for.

@[email protected]

Cool story if everything you have has an API or is code-based. Try doing it on hundreds of switches and other embedded devices. The whole 47-day thing they’re floating is gonna be a massive nightmare because they don’t realize all the other things out there that use certificates.

AmbiguousProps

What makes you think I don’t do this on embedded devices? I’m not about to dox myself with specifics, but I do this exclusively for embedded hardware as my job. We even do it for devices not directly attached to our network. It’s really not difficult so long as you have control of your enterprise hardware (which you should, unless your management is terrible at their jobs). Hell, even the routers we use have this functionality built in, failure alarms and all.

If this is a problem for you, it’s probably at an organizational level, and not a technical issue.
