NO 3RD PARTIES
NO SUB VENDORS
100% REMOTE, EAST COAST HOURS
We are seeking a Lead Cloud Operations Engineer to ensure the stability, reliability, and operational performance of Azure-based platform services. This role serves as the primary L3/L4 escalation point, driving rapid incident resolution, root cause analysis, and continuous improvement of production environments.
Key Responsibilities
- Own availability, uptime, and performance of Azure production environments
- Act as L3/L4 escalation lead for major incidents
- Lead incident triage, coordination, and resolution
- Drive root cause analysis and eliminate repeat issues
- Improve MTTR and operational processes
- Own monitoring and alerting strategy (Azure Monitor, Log Analytics, Datadog, Application Insights)
- Review and approve production changes and ensure readiness
- Support security incident response and compliance
- Partner with engineering and application teams
Required Qualifications
- 7–10+ years of IT infrastructure experience (operations-focused)
- 3–5+ years supporting Azure production environments
- Strong L3/L4 incident escalation experience
- Deep troubleshooting and root cause analysis skills
- Experience with monitoring tools and ITSM processes
- Experience supporting high-availability environments
Preferred Qualifications
- Azure certifications (AZ-104 or similar)
- Experience with Azure Virtual Desktop (AVD), hybrid cloud, and migrations
- Experience with Ivanti or other ITSM tools
- Familiarity with ITIL