Tuesday, November 12, 2019

Site Reliability Engineering


It is often assumed that every project has adopted the DevOps philosophy in its own way. The true implementation of DevOps, however, lies in SRE (Site Reliability Engineering).

It seems every organization already has its own SRE team in a fragmented form. Whenever there is an issue, we all jump in and bring the business back on track as per the SLA. SRE adds two more layers, SLI and SLO, which act as filters beneath the SLA. At any point in time, a particular metric says Yes or No about system health; such metrics are Service Level Indicators (SLIs). The binding target set for an SLI is a Service Level Objective (SLO). An SLO never promises 100% availability of the site. Service Level Agreements are then prepared transparently on the basis of these SLOs.
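
To make the SLI/SLO relationship concrete, here is a minimal sketch in Python. The function names, the request counts and the 99.9% target are illustrative assumptions, not values from any particular system; the point is only that the SLI is a measured number and the SLO is the target it is compared against.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Return the fraction of requests served successfully (the indicator)."""
    if total_requests == 0:
        # Assumption for this sketch: no traffic is treated as healthy.
        return 1.0
    return good_requests / total_requests


def meets_slo(sli: float, slo_target: float) -> bool:
    """Yes/No answer about system health: is the SLI at or above the target?"""
    return sli >= slo_target


# Hypothetical numbers for one measurement window.
sli = availability_sli(good_requests=99_950, total_requests=100_000)
healthy = meets_slo(sli, slo_target=0.999)
print(f"SLI = {sli:.4%}, within SLO: {healthy}")
```

An SLA would then typically be set a little looser than the internal SLO, so the team notices trouble before the customer-facing promise is broken.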

Transparently, because SRE accepts an acceptable level of risk: the amount of failure we can have while staying within our SLO. It is nearly impossible to assure 100% availability, even if we provide the service through our own fiber network, backbone and customized secure software. Because of the least reliable component in the system, we cannot guarantee 100% availability all the time. The Error Budget clearly shows the maximum permissible loss beforehand. SRE expects that failure is normal and determines how much failure we can tolerate. The Error Budget helps to decide whether delivering a new product quickly is more important or releasing a reliable product/feature is our prime goal.
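
The error budget is simple arithmetic: whatever the SLO does not promise is the budget. The sketch below assumes a time-based availability SLO over a 30-day window; the function names, the window length and the downtime figures are hypothetical and chosen only for illustration. With a 99.9% target, the budget works out to roughly 43.2 minutes per 30 days.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime allowed in the window: (1 - SLO) x total minutes."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Budget left after subtracting the downtime already consumed."""
    return error_budget_minutes(slo_target, window_days) - downtime_minutes


def can_ship_features(slo_target: float, downtime_minutes: float,
                      window_days: int = 30) -> bool:
    """Simple policy for this sketch: keep releasing while budget remains."""
    return budget_remaining(slo_target, downtime_minutes, window_days) > 0


budget = error_budget_minutes(0.999)           # about 43.2 minutes
ship_now = can_ship_features(0.999, 30.0)      # 30 minutes used so far
ship_later = can_ship_features(0.999, 50.0)    # budget exhausted
print(f"budget = {budget:.1f} min, ship: {ship_now}, after overrun: {ship_later}")
```

While budget remains, the team can favor speed of delivery; once it is spent, reliability work takes priority until the window resets.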

SRE also clearly defines how to avoid Toil, or operational overhead, by eliminating manual tasks as far as possible. Work that is manual, repetitive, automatable, tactical and devoid of long-term value is characteristic of toil. Doing such work by hand in front of a computer is not an intelligent decision. At the same time, investing 20 hours to automate a single task that is done manually once a month in 20 minutes is not a wise idea either.
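
The trade-off in that last example can be checked with a quick break-even calculation. This is a rough sketch: the function name is made up for illustration, and it ignores maintenance of the automation itself and the cost of human error in the manual run.

```python
def break_even_months(automation_hours: float,
                      manual_minutes_per_run: float,
                      runs_per_month: float) -> float:
    """Months until time spent automating equals manual time saved."""
    saved_per_month = manual_minutes_per_run * runs_per_month
    if saved_per_month <= 0:
        raise ValueError("the task must take some manual time each month")
    return automation_hours * 60 / saved_per_month


# The example from the text: 20 hours to automate, 20 minutes once a month.
months = break_even_months(automation_hours=20,
                           manual_minutes_per_run=20,
                           runs_per_month=1)
print(f"Break-even after {months:.0f} months")
```

For the numbers in the text the automation pays for itself only after 60 months, that is five years, which is why it is not worth doing. The same task run daily would break even in about two months.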

Altogether, it seems that the later a service organization adopts SRE, the sooner it will disappear from the market. Therefore, every organization should have a defined SRE framework/model, if nothing of the sort is ready yet! Experts say SRE is the class that implements the interface of DevOps. A case study of existing DevOps projects, with SRE implemented on top of them, can be presented as a POC.
