The collective wisdom goes that 99% site reliability might be the standard, but Alex Hidalgo (https://www.alex-hidalgo.com/), a principal reliability advocate at Nobl9 (https://www.nobl9.com/), says such high standards are often unnecessary. When is it worth pulling out of the race for site reliability of even 99.9%?
The idea that service-level objectives (SLOs, https://thenewstack.io/automate-user-satisfaction-with-this-gitops-friendly-spec-for-service-level-objectives/), which are a vital number for site reliability engineering (SRE, https://thenewstack.io/usenix-the-3-measures-of-successful-site-reliability-engineering/), require 99+% site reliability is a myth.
There are many instances where 99% isn’t necessary, and offering services with such high reliability will quickly burn through the budget.
Add in the SLO target and the request morphs into something more like “99% of requests to be good every 30 days.”
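To make the arithmetic behind a target like "99% of requests to be good every 30 days" concrete, here is a minimal Python sketch. The function names (`error_budget`, `is_slo_met`, `budget_remaining`) and the request counts are hypothetical, chosen only for illustration; they do not come from Nobl9 or any particular SLO tool. The sketch assumes the 30-day window has already been applied when counting requests.

```python
import math
from fractions import Fraction


def error_budget(target: str, total_requests: int) -> int:
    """Number of bad requests the SLO tolerates in the window.

    `target` is passed as a string such as "0.99" so that Fraction
    can represent it exactly, avoiding float rounding surprises.
    """
    return math.floor((1 - Fraction(target)) * total_requests)


def is_slo_met(target: str, good_requests: int, total_requests: int) -> bool:
    """True if the share of good requests is at or above the target."""
    if total_requests == 0:
        return True  # assumption: no traffic means nothing was violated
    return Fraction(good_requests, total_requests) >= Fraction(target)


def budget_remaining(target: str, good_requests: int, total_requests: int) -> int:
    """Bad requests still allowed before the SLO is breached (negative if over)."""
    bad_requests = total_requests - good_requests
    return error_budget(target, total_requests) - bad_requests


if __name__ == "__main__":
    # Hypothetical traffic: 1,000,000 requests counted over the 30-day window.
    total = 1_000_000
    good = 995_000
    print(error_budget("0.99", total))            # bad requests a 99% target tolerates
    print(is_slo_met("0.99", good, total))        # whether the target is currently met
    print(budget_remaining("0.99", good, total))  # how much budget is left
```

Under these assumed numbers, a 99% target tolerates 10,000 bad requests per window, while a 99.9% target tolerates only 1,000. That tenfold difference in headroom is one way to see why each extra nine gets expensive.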