Role: OpenShift Platform Engineer
Location: Atlanta, GA (Locals)
POSITION GENERAL DUTIES AND TASKS:
Cluster & Infrastructure Management
• Manage OpenShift clusters across bare metal, VMs, and containerized environments to meet diverse application needs.
• Administer OpenShift environments using GitOps tools, including ArgoCD and Azure DevOps (ADO).
• Maintain cluster-wide RBAC, access policies, and operational readiness for production workloads.
• Perform performance tuning and troubleshooting of CPU, memory, and I/O issues.
• Manage and maintain Istio, OSSM3 operators, Kiali, and Grafana, and be familiar with their upgrade processes.
Infrastructure as Code & Automation
• Design, develop and maintain automation using Red Hat Ansible Automation Platform (AAP) for infrastructure and application deployment.
• Knowledge of Terraform and of Infrastructure as Code (IaC) practices using Terraform.
• Integrate AAP with ServiceNow and Git for automated change management and version control.
• Automate OpenShift lifecycle operations, including cluster upgrades, via GitLab CI/CD pipelines.
• Build Infrastructure-as-Code solutions enabling cluster autoscaling, certificate rotation, service mesh deployment, and end-to-end OpenShift provisioning.
• Provision clusters for the service mesh migration.
Security, Networking & Service Mesh
• Implement security solutions including Aqua Security, WIZ, and cluster-wide log monitoring tools.
• Configure Horizontal Pod Autoscaler (HPA) to support autoscaling of Kubernetes workloads.
• Configure Vertical Pod Autoscaler (VPA) and Kubernetes Event-driven Autoscaling (KEDA) to support autoscaling.
• Remediate Kubernetes security vulnerabilities and present remediation plans to management and external security auditors.
• Strong understanding of security posture and the principle of least privilege for maintaining platform-wide firewall rules.
Monitoring, High Availability & Operations
• Implement monitoring and alerting for VMs and nodes using Prometheus, Grafana, and the OpenShift monitoring stack.
• Improve uptime through enhanced observability and proactive remediation, supporting critical production systems.
• Reduce incident response time through improved on-call strategies and incident playbooks.
• Migrate OpenShift 4.14 clusters to service mesh-enabled architectures for enhanced reliability and traffic management.
• Implement Dynatrace for observability.
• Maintain the support schedule and on-call rotation, with a weekly call to review issues.
Collaboration & Governance
• Partner with senior engineers, architects, and cross-functional teams to deliver secure and scalable cloud solutions.
• Develop automated workflows for ServiceNow-driven namespace management, user onboarding, cluster patching, and Red Hat case creation.
• Maintain cloud documentation, architecture diagrams, and best practices in internal knowledge repositories.
Skillsets:
Operating System/Platforms:
Google Kubernetes Engine, OpenShift, Kubernetes, Azure, AWS, IaC, Apache, ActiveMQ, VMware ESXi, vCenter, RHEL.
Application/Software:
Red Hat OpenShift, Kubernetes, AAP, Docker, Quay, Ansible, AWX Tower, Terraform, Jenkins, Artifact Registry, Prometheus, Grafana, APM, GitHub/GitLab, IAM & Admin, Billing, VPC, Subnets, Firewall, Load Balancer, HashiCorp Vault, Certificate Manager, Secret Manager, App Engine, Cloud Run Functions, Cronjobs, Filestore, NFS, S3, AMQ, MySQL, PostgreSQL, Microsoft SQL Server, IIS, Tufin, VMware vSphere, ESXi, vCenter, vMotion, DRS, HA, vSAN, Splunk, Kafka, Apache, WebLogic, WebSphere, Dynatrace, WIZ, PuTTY.