Senior Cloud Engineer

  • Herndon, VA
  • Posted 15 hours ago | Updated 3 hours ago

Overview

On Site
Contract - Independent
Contract - W2

Skills

Government Contracts
IT Service Management
Attention To Detail
Reliability Engineering
Collaboration
Finance
Accountability
Microservices
Management
Dashboard
Root Cause Analysis
Performance Analysis
Inventory
Microsoft Windows
Oracle
Red Hat Linux
Microsoft SQL Server
Regulatory Compliance
Licensing
Auditing
Storage
Amazon EC2
Remote Desktop Services
Amazon RDS
Amazon S3
Benchmarking
Disaster Recovery
Performance Testing
Capacity Management
Business Analytics
Business Analysis
Computer Science
IaaS
Amazon Route 53
Failover
Scripting
Python
Bash
Node.js
Continuous Integration
Continuous Delivery
GitLab
Cloud Computing
Amazon Web Services
DevOps
Licensing Management
Optimization
Cloud Architecture
Load Balancing
Communication
Analytical Skill
Conflict Resolution
Problem Solving
Security Clearance
Stacks Blockchain
Grafana

Job Details

Senior Cloud Engineer
100% Remote
Hours: Eastern, Central and Mountain time zones
Security Clearance: Must be able to obtain Public Trust Clearance
ship required per government contract
W2 ONLY, NO C2C


ALTA IT Services is seeking a detail-oriented and proactive Sr. Cloud Engineer to design, implement, and manage observability solutions across our cloud infrastructure. In this role you'll be responsible for ensuring system reliability and visibility through best-in-class monitoring, logging and alerting practices across AWS. You'll work across operations and compliance teams to ensure our AWS workloads meet performance expectations while managing security, regulatory and cost-efficiency standards. This role is key to driving visibility, governance and financial accountability in our cloud environment.

Responsibilities:
Design and implement health checks and probes for cloud infrastructure and applications across AWS
Define and deploy readiness and liveness probes for containers running in EKS/ECS
Write custom scripts for CloudWatch custom metrics and alarms based on application specific probes
Implement alerting and remediation automation based on probe outputs
Document monitoring strategies, probe configurations and operational playbooks
Define monitoring strategies for cloud resources, microservices and containerized workloads
Implement automated health checks and uptime monitoring
Continuously optimize and evolve the observability stack to improve reliability and reduce noise
Configure and manage monitoring tools (CloudWatch, Grafana, Datadog, Prometheus)
Set up monitoring thresholds, dashboards, and metrics for application and infrastructure
Perform root cause analysis and incident correlation using monitoring and performance analysis tools
Maintain a central inventory of all licensed software deployed in AWS environments (Windows, Oracle, Red Hat, SQL Server)
Ensure compliance with vendor-specific licensing terms
Monitor usage patterns and perform license audits and reconciliation
Identify and remediate latency issues, throughput bottlenecks and underutilized resources
Recommend and implement right-sizing of compute, memory and storage resources
Analyze and optimize the performance of AWS resources, including EC2, RDS, Lambda, S3, ECS and EKS
Conduct performance profiling and benchmarking for applications hosted on AWS
Contribute to capacity planning, disaster recovery strategies and performance testing initiatives
Create reports on system performance trends and opportunities, capacity planning and cost-performance trade-offs

Required Qualifications:
BA/BS in IT, Computer Science or related field (equivalent work experience may be accepted in lieu of a degree)
3+ years of experience in cloud infrastructure with emphasis on AWS
Strong experience with CloudWatch (metrics, logs, alarms), CloudWatch Synthetics (canary scripting), Route 53 health checks and failover strategies
Proficient in scripting languages such as Python, Bash or Node.js
Hands-on experience with CI/CD tools (GitHub, GitLab, Kubernetes, DevOps)
Cloud certifications (AWS DevOps Engineer, Solutions Architect Associate)
Proficient with license management tools and cost optimization platforms
Solid understanding of cloud architecture principles, autoscaling strategies and load balancing
Strong written and verbal communication skills for technical and non-technical stakeholders
Excellent analytical and problem-solving skills

Must be able to obtain and maintain a Public Trust clearance

Preferred Qualifications:
Hands-on experience with observability stacks like Grafana, OpenSearch, Datadog
Familiarity with FinOps practices and cost-performance trade-offs

System One and its subsidiaries, including Joul, ALTA IT Services, and Mountain Ltd., are leaders in delivering outsourced services and workforce solutions across North America. We help clients get work done more efficiently and economically, without compromising quality. System One not only serves as a valued partner for our clients but also offers eligible employees health and welfare benefits coverage options, including medical, dental, vision, spending accounts, life insurance, and voluntary plans, as well as participation in a 401(k) plan.

System One is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, age, national origin, disability, family care or medical leave status, genetic information, veteran status, marital status, or any other characteristic protected by applicable federal, state, or local law.

Ref: #850-Rockville (ALTA IT)