What is Site Reliability Engineering? SRE Versus DevOps

Site reliability engineering versus DevOps

Site reliability engineers (SREs) work between development and operations, but not necessarily within DevOps proper. The concept of SRE has been around since 2003, which means that it’s older than DevOps. The term was made popular by Ben Treynor, who created Google’s Site Reliability Team. According to Treynor, SRE is “what happens when a software engineer is tasked with what used to be called operations.”

Both disciplines, DevOps and SRE, aim to enhance the release cycle by helping dev and ops see each other’s side of the process throughout the application lifecycle. They also advocate automation and monitoring, reducing the time from when a developer commits a change to when it’s deployed to production. SREs and DevOps aim for this result without compromising the quality of the code or the product itself.

SRE and DevOps ask two different but equally valuable questions:

DevOps asks what needs to be done.

SRE asks how that can be done.

Site reliability engineers day to day

Site reliability engineers track service level indicators (SLIs) against service level objectives (SLOs), while DevOps teams track failure and success rates over time. SREs share responsibilities related to the following DevOps pillars of infrastructure improvement:

Reduce organizational silos

SREs don't debate how many silos exist in a company, but they encourage everyone else to discuss the issue. They prompt this discussion by using the same tools and techniques across the company, which helps spread ownership across all employees.

Accept failure as normal

SREs need to make sure that there aren’t too many errors or failures. To do so, they use a formula composed of SLI and SLO scores. SLIs are quantitative measures of service behavior, such as request latency, throughput in requests per second, or failures per request over time. SLOs combine a threshold with a target percentage and represent how often the SLIs must meet that threshold over a certain amount of time.
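The relationship between SLIs and SLOs can be sketched in a few lines of Python. This is a minimal illustration, not a formula taken from any particular SRE toolchain: the function names, the choice of failure rate as the SLI, and the sample numbers are all assumptions made for the example.

```python
from typing import Sequence


def failure_rate(total_requests: int, failed_requests: int) -> float:
    """SLI (assumed for this sketch): failures per request in one window."""
    if total_requests == 0:
        return 0.0  # assumption: a window with no traffic has no failures
    return failed_requests / total_requests


def slo_met(slis: Sequence[float], threshold: float, target_pct: float) -> bool:
    """SLO check: did enough windows keep the SLI at or below the threshold?

    threshold  -- highest acceptable failure rate for a single window
    target_pct -- percentage of windows that must stay within the threshold
    """
    if not slis:
        raise ValueError("at least one SLI measurement is required")
    good_windows = sum(1 for sli in slis if sli <= threshold)
    return good_windows / len(slis) * 100 >= target_pct


# Hypothetical per-window request counts: (total, failed)
windows = [(1000, 1), (1000, 2), (1000, 50), (1000, 0)]
slis = [failure_rate(total, failed) for total, failed in windows]

# Three of the four windows stay at or below a 1% failure rate.
print(slo_met(slis, threshold=0.01, target_pct=75.0))  # True
print(slo_met(slis, threshold=0.01, target_pct=99.0))  # False
```

The threshold and the target percentage are kept as separate parameters because the SLO depends on both: the threshold decides whether a single window counts as good, and the percentage decides how many good windows are enough.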

Implement gradual change

SREs are all in for change, but in a slow, methodical way. Because companies want to move faster, they demand frequent releases, continually updating the product. So DevOps and SREs must respond quickly but maintain a steady, controlled pace.

Leverage tooling and automation

Automate wherever doing so provides value to developers and operations by removing manual tasks.

Measure everything

SRE teams need to know that everything is moving in the right direction. This can be accomplished by setting up alerts for various scenarios, embracing peer code review, and/or using unit tests.
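As a rough illustration of setting up alerts for various scenarios, the sketch below evaluates a few named rules against a snapshot of metrics. The metric names, the thresholds, and the rule names are hypothetical and chosen only for the example; a real monitoring system would define its own.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]

# Hypothetical alert rules: each maps a rule name to a check over the metrics.
ALERT_RULES: Dict[str, Callable[[Metrics], bool]] = {
    "high_latency": lambda m: m.get("p99_latency_ms", 0.0) > 500.0,
    "high_error_rate": lambda m: m.get("error_rate", 0.0) > 0.01,
}


def firing_alerts(metrics: Metrics) -> List[str]:
    """Return the names of all rules whose condition holds, sorted by name."""
    return sorted(name for name, rule in ALERT_RULES.items() if rule(metrics))


# Latency is above its threshold, the error rate is not.
snapshot = {"p99_latency_ms": 750.0, "error_rate": 0.002}
print(firing_alerts(snapshot))  # ['high_latency']
```

Keeping the rules in one table makes it easy to add a scenario without touching the evaluation logic.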

Future of site reliability engineering (SRE)

Site reliability engineering (SRE) and DevOps are two trending disciplines with quite a bit of overlap; their essential goals are understanding how to measure success or failure and how to gain continuous reliability across every application. Reliability is not just about the infrastructure—it’s relevant every step of the way, from application quality through performance and on up to security. SREs care about every process from source code to deployment; that’s how they earn the reputation of being a true bridge from development to operations.
