Role: Senior DevOps Engineer
Location: New York, NY
Mode of Hire: Full-Time
Job Description:
We are seeking a Senior DevOps Engineer with deep expertise in Terraform and Cloud Infrastructure to design, automate, and manage highly available, scalable, and secure cloud environments.
The ideal candidate will play a key role in infrastructure as code (IaC), CI/CD automation, cloud security, and production reliability, while mentoring junior engineers and collaborating with development and architecture teams.
Key Responsibilities
Infrastructure as Code (Terraform)
· Design, develop, and maintain Terraform modules and reusable patterns for cloud infrastructure.
· Implement remote state management, state locking, and workspaces.
· Enforce IaC best practices including versioning, code reviews, tagging, and modular design.
· Integrate Terraform with CI/CD pipelines for automated deployments.
Cloud Engineering
· Architect and manage cloud infrastructure on AWS / Azure / Google Cloud Platform: compute, networking, storage, and managed services.
· Design high-availability, disaster-recovery, and cost-optimized architectures.
· Implement multi-environment setups (dev, QA, staging, prod).
CI/CD & Automation
· Build and maintain CI/CD pipelines using tools like Jenkins, GitHub Actions, GitLab CI, Azure DevOps, or Bitbucket Pipelines.
· Automate infrastructure provisioning, application deployment, and rollback strategies.
· Improve deployment reliability using blue-green, canary, or rolling deployments.
Containerization & Orchestration
· Strong experience with Docker and Kubernetes.
· Manage and automate EKS / AKS / GKE clusters.
· Implement scaling, monitoring, and security best practices for containerized workloads.
· Use Helm or Kustomize for Kubernetes configurations.
Cloud Security & Compliance
· Implement IAM best practices (least privilege, role-based access).
· Secure infrastructure using AWS Secrets Manager / Azure Key Vault / Google Cloud Secret Manager.
· Enforce security policies using tools like OPA, Sentinel, Checkov, tfsec.
· Support compliance requirements (ISO, SOC 2, PCI DSS, if applicable).
Monitoring & Reliability
· Implement monitoring and alerting using Prometheus, Grafana, CloudWatch, Azure Monitor, Google Cloud Monitoring (formerly Stackdriver).
· Handle production incidents, root cause analysis (RCA), and postmortems.
· Improve system reliability and performance through automation.
Collaboration & Leadership
· Work closely with developers, architects, and QA teams.
· Mentor junior DevOps engineers and review IaC and pipeline code.
· Contribute to DevOps standards