JOB TITLE: Reliability Engineer (Mid-Level)
JOB LOCATION: Occasional travel to Hanscom Air Force Base, MA or Huntsville, AL
WAGE RANGE*: $120k to $125k
JOB NUMBER: 26-00598
REQUIRED EXPERIENCE:
- Active Secret security clearance required
- DoD 8570 / 8140 compliant certification (IAT Level II required)
- One or more cloud certifications (AWS, Azure, Google Cloud Platform, or OCI)
- U.S. Citizenship required
- Bachelor's degree in Computer Science, Engineering, Information Technology, or related field
- Minimum 8 years of experience in cloud engineering, systems engineering, or reliability engineering
- Experience supporting cloud-based systems and distributed environments
- Strong understanding of system monitoring, performance tuning, and incident response
JOB DESCRIPTION
We are seeking a Reliability Engineer to support cloud platform stability, performance, and operational resilience within a federal program.
This role focuses on ensuring the availability and reliability of cloud-based systems through proactive monitoring, incident response, and performance optimization. The Reliability Engineer will support production environments, improve system resilience, and help drive operational excellence across distributed cloud services.
Key Responsibilities
- Ensure availability, performance, and reliability of cloud platforms and services
- Monitor systems and respond to incidents, outages, and performance degradation
- Develop and maintain monitoring, logging, and alerting strategies across cloud environments
- Support implementation of high availability, backup, and disaster recovery solutions
- Analyze system performance and identify areas for optimization and improvement
- Troubleshoot issues including hardware degradation, network latency, and resource constraints
- Support production readiness by validating system requirements including dependencies, diagrams, and monitoring plans
- Utilize operational metrics such as MTTR (Mean Time to Recovery) and MTTF (Mean Time to Failure) to improve system performance
- Collaborate with engineering and DevOps teams to support system integration and deployment activities
- Develop technical solutions to complex system reliability challenges
Required Qualifications
- Active Secret security clearance required
- U.S. Citizenship required
- Bachelor's degree in Computer Science, Engineering, Information Technology, or related field
- Minimum 8 years of experience in cloud engineering, systems engineering, or reliability engineering
- Experience supporting cloud-based systems and distributed environments
- Strong understanding of system monitoring, performance tuning, and incident response
Certifications
- DoD 8570 / 8140 compliant certification (IAT Level II required)
- One or more cloud certifications (AWS, Azure, Google Cloud Platform, or OCI)
Equal opportunity employer as to all protected groups, including protected veterans and individuals with disabilities.
* While a wage range is posted for this position, the eventual rate is determined by a comprehensive salary analysis which considers multiple factors including but not limited to: job-related knowledge, skills and qualifications, education and experience as compared to others in the organization doing substantially similar work, if applicable, and market and business considerations.
Benefits offered include medical, dental and vision benefits; dependent care flexible spending account; 401(k) plan; voluntary life/short term disability/whole life/term life/accident and critical illness coverage; employee assistance program; sick leave in accordance with regulation. Benefits may be subject to generally applicable eligibility, waiting period, contribution, and other requirements and conditions. Benefits offered are in accordance with applicable federal, state, and local laws and subject to change at TCM's discretion.