Job Details
Key Responsibilities:
Support the design, implementation, and sustainment of CI/CD pipelines with embedded, auditable deployment processes.
Promote infrastructure-as-code using Terraform, Helm, and Ansible, incorporating HITRUST and GxP controls into reusable modules.
Architect and maintain highly available, scalable, and compliant systems leveraging Kubernetes and cloud platforms (AWS, Google Cloud Platform, Azure).
Apply SRE principles by defining, measuring, and improving reliability metrics (SLIs/SLOs/SLAs) in regulated healthcare environments.
Lead capacity planning, performance tuning, and infrastructure optimization initiatives focused on regulatory and privacy requirements.
Manage the full incident lifecycle (detection, triage, resolution, postmortem), documenting as required for FDA compliance and audit readiness.
Develop and maintain incident response playbooks, including IT and regulatory escalation protocols.
Implement and manage monitoring solutions (Datadog, Prometheus, Grafana, Elasticsearch) to support rapid issue identification in compliance with healthcare mandates.
Integrate and manage SIEM tools (Splunk, Datadog Security, Elastic Security) for log aggregation, threat detection, and support of regulatory audits (HITRUST, GxP).
Collaborate with security, quality assurance, and regulatory teams to monitor and respond to production security incidents.
Ensure logging, auditing, and reporting meet FDA, HITRUST, ISO 27001, and healthcare industry standards, including data retention, traceability, and privacy safeguards.
Document and communicate infrastructure processes clearly to facilitate internal knowledge transfer and external audit readiness.
Plan and manage resource utilization to meet both performance goals and regulatory efficiency standards.
Troubleshoot and support cloud/network issues, ensuring secure handling of protected health information (PHI) and device data.
Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
7+ years in Production Engineering, DevOps, or SRE roles within healthcare, medical device, or life sciences industries.
Expertise in containerization (Kubernetes, Docker), cloud platforms, and infrastructure-as-code.
Direct experience supporting systems subject to FDA GxP and HITRUST compliance; familiarity with HIPAA, SOC 2, and ISO 27001 frameworks.
Strong skills in scripting/automation (Python, Bash, Go).
Proven track record managing SIEM and monitoring platforms in regulated environments.
In-depth knowledge of incident response and reliability engineering in healthcare/medical device settings.
Certifications in cloud security, DevOps, and/or healthcare compliance (e.g., HITRUST, AWS Security) strongly preferred.
Preferred Skills:
Experience deploying and supporting medical device software under FDA regulations.
Familiarity with quality management systems, validation procedures, and documentation for regulatory audits and FDA submissions.
Strong communication and leadership skills for cross-functional collaboration in a regulated setting.
Ability to innovate while working within strict compliance constraints.