Cloud Infrastructure Engineer
Metis Consulting Group • Platform • Remote • $95k-105k • Posted about 14 hours ago
Job Description
We are Metis Consulting Group, a fully remote team building and operating custom business applications, including intranets, data integrations, APIs, and workflow automation, for clients in travel, marketing, and related industries. Established in 1997, we are a certified Woman-Owned and Disability-Owned New York State Public Benefit Corporation and a top-scoring Best for the World B Corp. We offer flexible work arrangements, a generous benefits package, personalized mentoring, paid training, conference opportunities, profit-sharing bonuses, wellness and leisure perks, and mobile device and data plan subsidies. We value an equitable, inclusive workplace and strongly encourage candidates from diverse backgrounds to apply. This remote role expects general availability Monday through Friday, 9am to 5pm ET, and the permanent remote work location must be in New York, California, Michigan, or Maryland.
- We need at least 5 years of practical experience in infrastructure, cloud operations, or systems engineering.
- We expect proven ownership of production environments, including capacity planning, resource allocation, and server or service configuration.
- We need experience creating monitoring, alerting, and runbook processes that enable consistent incident resolution.
- You should be comfortable handling ticket-based support workflows for client-reported issues.
- We require hands-on knowledge of DNS, domain, and email settings, including TXT, DKIM, SPF, and DMARC, plus SSL and development certificate administration.
- You should have experience managing software and platform versions, including planning for end-of-life and end-of-support timelines.
- We need experience with user access control and security administration.
- Docker and Swarm experience is required; Kubernetes and broader orchestration experience is strongly preferred.
- You must have strong written communication skills for remote collaboration, documentation, tickets, and runbooks.
- You should be comfortable working with AI-assisted operations tools.
- We need experience evaluating legacy environments and planning and executing cloud migrations.
- Experience managing event buses and messaging systems such as Kafka is required.
- You should have working knowledge of firewall, WAF, and application-level security management.
- We expect comfort working in cloud platforms such as AWS, Azure, or GCP.
- You should have experience with CI/CD pipelines and deployment automation.
- We need capability in cost management and rightsizing across hosting environments.
- Infrastructure automation and scripting experience is required.
- You should bring a security-conscious approach to operations.
- This role may require occasional after-hours support.
- Nice-to-have experience includes Infrastructure-as-Code tools such as Terraform, Pulumi, or Ansible.
- Kubernetes administration experience or certification, such as CKA, is preferred.
- Cloud certifications such as AWS Solutions Architect or Azure Administrator are a plus.
- Consulting or multi-client environment experience would be beneficial.
- Strong technical documentation habits and independent research skills are advantageous.
- Familiarity with Agile ways of working, including sprint planning and iterative delivery, is a plus.
- You will manage client hosting operations end to end, including environments, costs, monitoring, incident response, and cloud migration roadmaps.
- You will work closely with technical leadership to assess hosting requirements.
- You will design, configure, monitor, and operate the infrastructure our applications and clients rely on.
- You will plan resources from the initial sizing stage through incident resolution and migration delivery.
- You will bring structure to hosted environments by improving observability, process, and operational consistency.
- You will help keep client-facing systems reliable and cost-effective.
- You will oversee infrastructure-related support and ensure incidents are handled in a repeatable, organized way.
- You may occasionally provide support outside standard working hours.