Job Details
Application Cloud Engineer
100% Remote
Hours: Eastern, Central and Mountain time zones
Security Clearance: Must be able to obtain Public Trust Clearance
W2 ONLY, NO C2C
ALTA IT Services is seeking a skilled and motivated Application Cloud Engineer to join our dynamic team. The ideal candidate will be responsible for maintaining cloud-based applications and infrastructure on AWS. You will work closely with development, operations and security teams to ensure the scalability, performance and security of cloud applications.
Responsibilities:
Provision and manage AWS infrastructure as code (IaC) with tools such as Terraform and CloudFormation
Monitor and troubleshoot production systems using AWS CloudWatch and other observability tools
Collaborate with developers to containerize and deploy applications using ECS and Lambda
Deploy applications across multiple environments (dev, staging, prod) and ensure consistency and stability
Monitor deployments and system health using CloudWatch and other tools
Implement rollback strategies and manage version control during deployments
Troubleshoot and resolve deployment issues and improve pipeline performance and reliability
Proficient with Python, Bash, YAML/JSON, Node.js, Lambda functions
Perform daily health checks using the AWS CLI or scheduled Lambda scripts, and log and report the results
Document deployment processes and infrastructure architecture
Familiarity with image registries like Amazon ECR and CI/CD pipelines for container deployment
Collaborate with development and DevOps teams to ensure applications are stateless and fault-tolerant
Implement enhancements to containerized environments on ECS, focusing on scalability, performance and observability
Enhance container orchestration strategies, including auto-scaling, rolling deployments and upgrades
Support feature branch testing, merge request validation and artifact promotion workflows
Ensure pipeline security and compliance through automated code scanning and approval gates
Remediate OS-level, container and dependency vulnerabilities
Orchestrate failover and restoration of ECS/EKS services, Lambda functions, databases and other infrastructure components
Test and document regional failover playbooks and recovery runbooks
Ensure compliance with RTO (Recovery Time Objective) and RPO (Recovery Point Objective) requirements
Participate in on-call rotations to support 24/7 production systems and respond to incidents as they arise
Diagnose and resolve production issues related to cloud services, container orchestration, databases and CI/CD pipelines
Follow and improve incident response playbooks, escalation procedures and communication workflows
Automate common operational tasks and improve alert accuracy to reduce on-call fatigue
Log incidents, changes, and operational metrics in tracking systems
Required Qualifications:
BA/BS in IT, Computer Science or related field (equivalent work experience may be accepted in lieu of a degree)
2+ years of hands-on experience with AWS and cloud-based deployment strategies
Proficient in scripting languages like Python, Bash and Node.js
Hands-on experience with CI/CD tools (GitHub, GitLab, Kubernetes, DevOps, CI)
Knowledge of disaster recovery planning and implementation
AWS or relevant Cloud certifications (AWS DevOps Engineer, Solutions Architect Associate)
Solid understanding of cloud architecture principles, autoscaling strategies and load balancing
Proficient with monitoring, alerting and logging tools
Strong written and verbal communication skills for technical and non-technical stakeholders
Excellent analytical and problem-solving skills
Must be able to obtain and maintain a Public Trust clearance
Preferred Qualifications:
Familiarity with container orchestration (Docker, ECS, Kubernetes)
Knowledge of ITIL practices or incident management frameworks