Overview
The Systems Operations and O&M Lead is responsible for overseeing the day-to-day operations, maintenance, and reliability of systems and infrastructure. This role ensures that all systems are operating efficiently, securely, and with minimal downtime, while leading a team of engineers and technicians to support ongoing operations and continuous improvement initiatives. The lead also oversees software release cycles that deliver bug fixes, performance enhancements, and security remediations.
Key Responsibilities:
- Operations Management
  - Lead and manage daily system operations to ensure high availability and performance.
  - Monitor system health, performance metrics, and service levels (SLAs).
  - Coordinate incident response, troubleshooting, and root cause analysis.
- Maintenance, Reliability & Systems Monitoring
  - Develop and implement preventive and corrective maintenance plans.
  - Ensure timely patching, upgrades, and lifecycle management of systems.
  - Drive reliability engineering practices to reduce system failures.
  - Monitor system performance using enterprise monitoring tools such as Splunk and New Relic.
- Team Leadership
  - Supervise and mentor operations and maintenance staff.
  - Assign tasks, manage workload, and ensure team productivity.
  - Foster a culture of accountability, safety, and continuous improvement.
- Process Improvement
  - Establish and refine operational procedures, runbooks, and documentation.
  - Identify opportunities for automation and efficiency gains.
  - Implement best practices such as ITIL, DevOps, or SRE methodologies.
- Stakeholder Coordination
  - Act as the primary point of contact for system operations.
  - Collaborate with engineering, security, and business teams.
  - Provide regular status reports and communicate risks/issues.
- Compliance & Security
  - Ensure systems comply with organizational policies and regulatory requirements.
  - Support audits, security assessments, and risk mitigation efforts.
Required Qualifications:
- Bachelor’s degree in Information Technology, Engineering, or a related field (or equivalent experience).
- 7–10 years of experience in systems operations, infrastructure, or O&M roles.
- Proven experience leading technical teams.
- Strong knowledge of system administration, networking, and cloud/on-prem environments.
- Experience with Splunk and New Relic.
- Experience with ServiceNow (SNOW) for ticket management.
- Experience with AWS infrastructure management and AWS Managed Services configuration.
- Experience with FISMA.
Desired Qualifications:
- Experience in O&M activities for CMS systems.
- Experience in CMS AWS Cloud.
- Experience in CMS ATO activities.
- Experience in Agile (Scrum/Kanban) methodology.
Residency Requirement:
Candidates must be able to obtain a Public Trust clearance and must have lived in the United States for at least three (3) of the last five (5) years.
Salary & Benefits Information:
The actual salary offer will carefully consider a wide range of factors, including your skills, qualifications, experience, and location.
C-HIT offers Healthcare Benefits, Remote Working Options, Paid Time Off, PTO cash-out, Training/Certification opportunities, Health Savings Account & Flexible Spending Account, Paid Life Insurance, Short-term & Long-term Disability, 401(k) Match, Employee Assistance Program, Paid Holidays, and many more perks and voluntary benefits!
Employees of C-HIT shall, as an enduring obligation throughout their term of employment, adhere to all information security requirements as documented in company policies and procedures.
C-HIT, a CMMI Maturity Level 5 company, focuses on delivering information technology and professional services to Federal and State agencies.
C-HIT is an EOE, including disability and veterans.