Role Summary:
The Cloud Administrator SME (Onshore) is a critical, customer-facing technical expert responsible for the 24x7 operational health, effective L2 incident management, and onsite support of our hybrid, multi-cloud infrastructure supporting mission-critical healthcare systems such as EPIC Electronic Health Record (EHR) workloads across Azure, Google Cloud Platform, and OCI. This role requires hands-on expertise in IaC and container orchestration (GKE), strict adherence to HIPAA and other healthcare regulations, and a strong focus on high-touch coordination during major hospital events.
Key Responsibilities:
Serve as the primary L2/L3 Onshore technical expert for all cloud infrastructure and EPIC workload incidents, ensuring rapid diagnosis and resolution to minimize impact on clinical operations.
Lead communications during active incidents, providing clear, concise, and professional updates to hospital IT teams, clinical stakeholders, and internal leadership.
Perform onsite, hands-on support during critical periods, including EPIC go-lives, major version upgrades, patching windows, and disaster recovery drills.
Coordinate directly with EPIC technical staff, hospital IT teams (network, security, application), and vendors to resolve complex cross-functional dependencies.
Perform daily administration, monitoring, and proactive remediation of IaaS/PaaS services across Azure, Google Cloud Platform, and Oracle Cloud Infrastructure (OCI) supporting EPIC environments.
Operational expertise with Google Kubernetes Engine (GKE): Manage the deployment, scaling, health, and operational troubleshooting of containerized EPIC components or supporting microservices.
Perform resource provisioning, capacity monitoring, and infrastructure tagging for cost management and chargebacks across all multi-cloud tenants.
Maintain and validate IAM, backup, and geo-redundant Disaster Recovery (DR)/Business Continuity Planning (BCP) mechanisms for EPIC and integrated hospital systems.
Implement and enforce security baselines and compliance controls (e.g., HIPAA, SOC 2, CIS, NIST) at the infrastructure layer across Azure, Google Cloud Platform, and OCI.
Drive automation efforts for routine operational tasks and provisioning using Terraform for IaC and scripting (PowerShell, Python) to ensure operational consistency and auditability.
Manage log monitoring, correlation, and initial incident response for security and performance events within EPIC cloud environments.
Tools: Terraform, GitHub, Ansible, Tanium, PowerShell, YAML, Bash, Python.
Qualifications & Technical Skills:
Experience & Environment: 7-10 years in infrastructure operations, with at least 5 years dedicated to Cloud Administration and 5+ years supporting EPIC EHR or similar mission-critical clinical systems in a hybrid environment.
Multi-Cloud Hands-on Expertise: Proven expertise in the administration, configuration, and operational support of Microsoft Azure, Google Cloud Platform (GCP), and Oracle Cloud Infrastructure (OCI).
Containerization: Strong practical experience in the operational management, troubleshooting, and administration of Google Kubernetes Engine (GKE) clusters.
EPIC Systems Knowledge: Solid understanding of EPIC infrastructure requirements, deployment topologies (e.g., Clarity, Chronicles), and operational dependencies.
Automation & Scripting: Highly proficient with:
IaC: Terraform for multi-cloud deployments.
Scripting: PowerShell, Python, YAML, and Bash.
Compliance: Deep knowledge of HIPAA requirements and operational procedures necessary to maintain compliance in a cloud environment.