SRE Jobs – Remote & On-Site Site Reliability Engineering Roles
Find your next site reliability engineer role. Browse remote SRE jobs and on-site positions focused on system reliability, observability, and incident response.
Site Reliability Engineers keep production systems available, performant, and resilient at scale. Our curated SRE job listings feature roles that use observability tools like Prometheus, Grafana, and Datadog, along with incident management platforms and chaos engineering practices. Whether you're looking for remote SRE jobs, senior site reliability engineer roles, or cloud SRE careers with AWS, Azure, or GCP, CloudOpsJobs connects you with companies committed to reliability. Explore positions where you'll define SLOs, lead incident response, and build the automation that keeps systems running smoothly.
About SRE Careers
Site Reliability Engineering roles focus on building and maintaining highly reliable production systems. Pioneered by Google, SRE combines software engineering with operations expertise to ensure system availability and performance.
Common Skills & Tools
- Observability: Prometheus, Grafana, Datadog, New Relic
- Incident Management: PagerDuty, Opsgenie, incident.io
- SLOs/SLIs: Error budgets, availability targets
- Chaos Engineering: Gremlin, Chaos Monkey, Litmus
- Cloud: AWS, Azure, GCP, Kubernetes
- Languages: Python, Go, Bash, SQL