Cloud Resiliency Architect - Infrastructure & Security Automation

  • Posted 1 day ago | Updated 4 hours ago

Overview

Remote
Hybrid
Contract - W2
Contract - 3 Month(s)

Skills

Python
Operations
DevOps
Terraform
Best Practices
Metrics
APM
Database
Scripting
Kubernetes
Kafka
Continuous Integration/Delivery
Prometheus
Grafana
Shell Scripting
Disaster Recovery
Mentoring
Cloud Security
Provisioning
Regulatory Compliance
Problem-Solving
Incident Response
Identity and Access Management
AWS CloudWatch
AWS Certified
Reliability Engineering
Business Continuity
Infrastructure Engineering

Job Details

Job Title: Cloud Resiliency Architect - Infrastructure & Security Automation

Location: Remote (US)

Job Summary:

We are seeking an experienced Cloud Resiliency Architect to lead the design and implementation of highly available, secure, and automated cloud infrastructure with a focus on resiliency and operational excellence. You will leverage expertise in SRE principles, Terraform, observability, and security automation to build resilient systems that ensure business continuity and compliance.

This role requires a deep understanding of cloud security controls, including IAM, KMS, and secrets management with Vault, as well as messaging platforms such as Kafka, combined with hands-on experience driving infrastructure automation and monitoring.

Key Responsibilities:
  • Architect and implement resilient cloud infrastructure using Terraform and automation best practices to support high availability and disaster recovery goals.

  • Design and integrate security automation controls across IAM, KMS, and secrets management tools such as HashiCorp Vault to enforce least privilege and compliance.

  • Lead initiatives around resiliency operations, including failure detection, automated recovery, and incident response.

  • Develop and maintain observability frameworks, using tools such as Prometheus, Grafana, and CloudWatch, to proactively monitor the availability and performance of infrastructure and applications.

  • Collaborate with database and messaging platform teams to ensure fault tolerance and secure configurations for databases and Kafka clusters.

  • Drive the adoption of Site Reliability Engineering (SRE) practices and principles to improve operational reliability and automation.

  • Build and maintain CI/CD pipelines to automate infrastructure provisioning, security policy enforcement, and monitoring deployments.

  • Mentor engineering teams on resiliency patterns, security automation, and infrastructure best practices.

  • Stay current with trends in cloud security and resilience, and with emerging technologies, to continuously enhance the cloud environment.

Required Qualifications:
  • 7+ years of experience in cloud infrastructure engineering, site reliability, or resiliency operations.

  • Strong hands-on expertise with Terraform for cloud infrastructure automation.

  • Deep knowledge of cloud security concepts, including IAM, KMS, and secrets management solutions such as HashiCorp Vault.

  • Experience with observability tools (e.g., Prometheus, Grafana, ELK, CloudWatch) and implementing monitoring/alerting frameworks.

  • Familiarity with messaging platforms such as Kafka and with resilient database architectures.

  • Proficiency in scripting or programming languages (e.g., Python, Go, Bash) for automation and tooling.

  • Strong understanding of Site Reliability Engineering (SRE) principles and practices.

  • Excellent problem-solving skills and the ability to collaborate with cross-functional teams.

Preferred Qualifications:
  • Cloud certifications such as AWS Certified Security - Specialty, AWS Certified DevOps Engineer - Professional, or equivalent.

  • Experience with Kubernetes and container orchestration resilience strategies.

  • Knowledge of regulatory compliance frameworks and their impact on cloud security and operations.

  • Experience designing multi-region disaster recovery and failover solutions.
