Site Reliability Engineering
Site Reliability Engineering (SRE) applies software engineering principles to operations problems. Pioneered at Google, SRE provides a prescriptive framework for managing service reliability through service level objectives (SLOs), error budgets, toil reduction, and automation. Its aim is to balance feature velocity against system stability.
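To make the relationship between an SLO and its error budget concrete, here is a minimal sketch of the arithmetic. It assumes a request-based availability SLO measured over a fixed window. The function names (`error_budget`, `budget_remaining`, `budget_consumed`) and the sample figures are illustrative, not part of any standard SRE tooling.

```python
def error_budget(slo_target: float, total_requests: int) -> int:
    """Requests allowed to fail in the window without breaching the SLO.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The budget is the complement of the SLO, applied to the traffic volume.
    return round((1.0 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> int:
    """Failures still tolerable in this window (negative means the SLO is breached)."""
    return error_budget(slo_target, total_requests) - failed_requests


def budget_consumed(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget already spent (can exceed 1.0)."""
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        raise ValueError("error budget is zero for this window")
    return failed_requests / budget


if __name__ == "__main__":
    # Assumed example: 99.9% SLO, one million requests, 250 failures so far.
    print(error_budget(0.999, 1_000_000))
    print(budget_remaining(0.999, 1_000_000, 250))
    print(budget_consumed(0.999, 1_000_000, 250))
```

In this framing, a team might slow feature releases as the consumed fraction approaches 1.0 and resume them once the budget recovers in the next window. The exact policy is a team decision rather than something the arithmetic dictates.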