Senior Site Reliability Engineer - Cloud Systems @ Bolt On Technology

Bolt On Technology
Platform • Remote • Global • $110k • Posted about 3 hours ago

Job Description

Salary: $110,000 per year

Requirements:
  • Over 6 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or similar roles
  • Proficient with cloud service providers (AWS, Azure, or GCP)
  • Strong skills in infrastructure as code (Terraform, CloudFormation, Pulumi, etc.)
  • Experience with container technologies and orchestration (Docker, Kubernetes)
  • Solid foundations in Linux systems administration and networking
  • Background in building and managing CI/CD pipelines
  • Practical knowledge of monitoring and observability platforms (Datadog, Prometheus, Grafana, New Relic, etc.)
  • Strong problem-solving abilities and incident management expertise
  • Experience in automation and scripting (Python, Bash, Go, or related languages)

Responsibilities:
  • Design, develop, and maintain highly available and fault-tolerant systems
  • Lead efforts to enhance reliability across both production and non-production settings
  • Manage and advance monitoring, alerting, and observability systems
  • Facilitate incident response, conduct root cause analyses, and oversee post-incident assessments
  • Implement automation strategies to minimize manual operational efforts
  • Collaborate with Engineering, Security, and Product teams to fulfill platform requirements
  • Define and monitor service-level indicators (SLIs), service-level objectives (SLOs), and error budgets
  • Lead initiatives for capacity planning and performance optimization
  • Refine deployment, CI/CD, and infrastructure-as-code methodologies
  • Identify and mitigate risks to reliability and scalability before they affect users
  • Mentor junior engineers and contribute to technical standards within the team
  • Participate in on-call rotations and enhance on-call processes

Technologies:
  • AWS
  • Azure
  • Bash
  • CI/CD
  • Cloud
  • Datadog
  • DevOps
  • Docker
  • GCP
  • Grafana
  • Support
  • Kubernetes
  • Linux
  • Prometheus
  • Python
  • Security
  • Terraform

More:

We are a dynamic company seeking a Senior Site Reliability Engineer who will focus on the reliability, scalability, performance, and security of our production systems. This position is a perfect blend of software engineering and systems engineering, aimed at constructing resilient infrastructure and automating processes to lower operational risks. We offer competitive salaries, comprehensive medical, dental, and vision benefits, flexible work schedules, unlimited PTO, and support for professional development, among other perks. As a part of our team, you will play a key role in driving technical excellence within our organization.

Last updated: week 10 of 2026
