Site Reliability Engineering

Back: 07 - Reliability and Operations MOC

Site Reliability Engineering (SRE) applies software engineering principles to operations problems. Pioneered at Google, SRE provides a prescriptive framework for managing service reliability through service level objectives (SLOs), error budgets, toil reduction, and automation. Its goal is to balance the tension between feature velocity and system stability.
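The relationship between an SLO and its error budget is simple arithmetic, so a small sketch may help. The example below assumes a request-based availability SLO measured over a fixed window; the function names (`error_budget`, `budget_remaining`) and the sample numbers are hypothetical, chosen only for illustration.

```python
def error_budget(slo_target: float, total_requests: int) -> float:
    """Return how many failed requests the SLO tolerates in the window.

    slo_target is a fraction such as 0.999 (i.e. "three nines").
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests < 0:
        raise ValueError("total_requests must be non-negative")
    # The budget is the share of requests that are allowed to fail.
    return (1.0 - slo_target) * total_requests


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been violated.
    """
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        raise ValueError("no traffic in the window, budget is undefined")
    return 1.0 - failed_requests / budget


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests against a 99.9% SLO.
    allowed = error_budget(0.999, 1_000_000)
    left = budget_remaining(0.999, 1_000_000, 250)
    print(f"allowed failures: {allowed:.0f}, budget remaining: {left:.0%}")
```

Under this sketch, a team with budget remaining can spend it on riskier releases, while a value at or below zero is the usual signal to slow feature work and prioritize stability. The exact policy is a team decision and is not prescribed by the arithmetic.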

Categories


sre reliability operations