Declare, Respond, Mitigate, Learn: How Kintaba Tackles Incident Response

5 years ago thenewstack.io

Summary: This is a summary of an article originally published by The New Stack.

There is established tooling for alert routing: Something breaks and it gets directed to an on-call dev or ops person to fix it. Ninety-nine percent of the time, that’s sufficient.

However, when you are dealing with unforeseeable circumstances, you don’t often have the wits about you to quickly make these decisions.

Egan said, “In major outage world, you are really responding to a symptom. You need systems to go in and to where those problems are instead of making it an HR problem.”