Overview
On Site
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 18 Month(s)
No Travel Required
Able to Provide Sponsorship
Skills
Cloud Operations Subject Matter Expert (SME)
AWS Control Tower
EKS (Kubernetes)
EC2
Terraform
Python
Windows
Linux
ServiceNow (incident, change, problem, SLA tracking)
AWS Control Tower & EKS Operations Engineer
Cloud Systems Engineer (AWS)
Job Details
We are seeking a highly skilled Cloud Operations Subject Matter Expert (SME) to manage and operate a 100% AWS-based environment. This is a hands-on operational role (minimal to no DevOps) focused on daily platform management, incident remediation, cloud governance, and maintaining SLAs.
The ideal candidate is an experienced AWS operations professional with strong troubleshooting skills, a deep understanding of cloud infrastructure, and the ability to manage enterprise-scale environments.
Key Responsibilities
- Manage and operate AWS environments leveraging Control Tower, EKS, EC2, S3, and related services
- Triage and resolve ServiceNow tickets, including OS-level and cloud-level troubleshooting, vulnerability remediation, and SLA adherence
- Commission and decommission cloud resources; manage lifecycle activities across environments
- Plan and execute disaster recovery (DR) activities and support RCA follow-ups
- Perform backups, patching, and routine maintenance of cloud resources and compute instances
- Support cloud migration efforts and site externalization initiatives
- Conduct cost cleanup, perform cost optimization analysis, and implement cost-saving strategies
- Build lightweight automations and PoCs to streamline operations
- Create, review, and implement change controls; maintain documentation, operational standards, and runbooks
- Collaborate with Windows and Linux engineering teams for OS-level diagnostic and remediation work
Required Technical Skills
- AWS Control Tower operational governance and landing zone management
- EKS (Kubernetes) cluster administration and troubleshooting
- EC2 lifecycle management, patching, monitoring, and performance tuning
- Terraform infrastructure-as-code for provisioning and change management
- Python scripting and automation for operational workflows
- S3 lifecycle policies, bucket security, and access controls
- Strong OS-level troubleshooting skills across Windows and Linux
- Experience with ServiceNow (incident, change, problem, SLA tracking)
- Familiarity with vulnerability remediation and security patching processes
Nice-to-Have
- Experience with AWS cost-optimization tools and billing insights
- Familiarity with AWS services such as RDS, CloudWatch, IAM, and GuardDuty
- Prior experience leading DR exercises and RCA activities
- Experience developing operational runbooks, SOPs, and playbooks