Overview
Remote
Depends on Experience
Contract - W2
Contract - Independent
Contract - 5 Year(s)
10% Travel
Skills
CloudWatch
Grafana
Datadog
Prometheus
AWS
Windows
Oracle
Red Hat
SQL Server
EC2
RDS
Lambda
S3
ECS
EKS
CI/CD
GitHub
GitLab
Kubernetes
DevOps
Job Details
Sr. Cloud Engineer
Location: Remote
Eastern, Central and Mountain time zones.
Citizenship is Required - Ability to pass an enhanced background screen (criminal, financial, drug) for Public Trust clearance.
Description:
Client is seeking a detail-oriented and proactive Sr. Cloud Engineer to design, implement, and manage observability solutions across our cloud infrastructure. In this role you’ll be responsible for ensuring system reliability and visibility through best-in-class monitoring, logging and alerting practices across AWS. You’ll work across operations and compliance teams to ensure our AWS workloads meet performance expectations while managing security, regulatory and cost-efficiency standards. This role is key to driving visibility, governance and financial accountability in our cloud environment.
Responsibilities:
- Design and implement health checks and probes for cloud infrastructure and applications across AWS
- Define and deploy readiness and liveness probes for containers running in EKS/ECS
- Write custom scripts for CloudWatch custom metrics and alarms based on application-specific probes
- Implement alerting and remediation automation based on probe outputs
- Document monitoring strategies, probe configurations and operational playbooks
- Define monitoring strategies for cloud resources, microservices and containerized workloads
- Implement automated health checks and uptime monitoring
- Continuously optimize and evolve the observability stack to improve reliability and reduce noise
- Configure and manage monitoring tools (CloudWatch, Grafana, Datadog, Prometheus)
- Set up monitoring thresholds, dashboards, and metrics for application and infrastructure
- Perform root cause analysis and incident correlation using monitoring and performance analysis tools
- Maintain a central inventory of all licensed software deployed in AWS environments (Windows, Oracle, Red Hat, SQL Server)
- Ensure compliance with vendor-specific licensing terms
- Monitor usage patterns and perform license audits and reconciliation
- Identify and remediate latency issues, throughput bottlenecks and underutilized resources
- Recommend and implement right-sizing of compute, memory and storage resources
- Analyze and optimize the performance of AWS resources, including EC2, RDS, Lambda, S3, ECS and EKS
- Conduct performance profiling and benchmarking for applications hosted on AWS
- Contribute to capacity planning, disaster recovery strategies and performance testing initiatives
- Create reports on system performance trends and opportunities, capacity planning and cost-performance trade-offs
Required Qualifications:
- BA/BS in IT, Computer Science or related field (equivalent work experience may be accepted in lieu of the degree)
- 3+ years of experience in cloud infrastructure with emphasis on AWS
- Strong experience with CloudWatch (metrics, logs, alarms), CloudWatch Synthetics (canary scripting), Route 53 health checks and failover strategies
- Proficient in scripting languages like Python, Bash or Node.js
- Hands-on experience with CI/CD tools (GitHub, GitLab, Kubernetes, DevOps)
- Cloud certifications (AWS DevOps Engineer, Solutions Architect Associate)
- Proficient with license management tools and cost optimization platforms
- Solid understanding of cloud architecture principles, autoscaling strategies and load balancing
- Strong written and verbal communication skills for technical and non-technical stakeholders
- Excellent analytical and problem-solving skills
- Must be able to obtain and maintain a Public Trust clearance
Preferred Qualifications:
- Hands-on experience with observability stacks like Grafana, OpenSearch, Datadog
- Familiarity with FinOps practices and cost-performance trade-offs