- Strong expertise in Microservices Architecture, including: designing distributed systems, deploying microservices, production support, and maintaining reliability in distributed environments.
- Deep hands-on experience with Kubernetes, including: deployment management, scaling, upgrades, troubleshooting, and cluster operations.
- Working proficiency with API Gateway platforms, including: Azure API Management (APIM), Kong, IBM API Connect (APIC), traffic management, routing, rate limiting, and API observability.
- Experience with Observability and Monitoring tools, including: Splunk, AppDynamics, Instana, log analytics, metrics, distributed tracing, dashboards, alerting, and monitoring.
- Strong expertise in Production Issue Resolution, including: complex incident troubleshooting, root cause analysis (RCA), problem resolution, and preventative measures.
- Familiarity with Site Reliability Engineering (SRE) practices, including: SLIs, SLOs, error budgets, incident response, post-mortems, automation, and continuous improvement.
- Experience in Production / Operations Support environments, including: system reliability management, platform performance monitoring, operational support, and service stability.
Production Support Engineer


United Technology
Job Details
Skills
- Kubernetes Production Support
- Microservices & API Gateway Support
- Observability / Incident Management / RCA
Job Summary
We are looking for Production Support Engineers with strong experience in supporting modern distributed platforms in production. The role requires hands-on expertise in microservices, Kubernetes, API Gateway platforms, and observability tools. The candidate must be able to manage critical production incidents, perform root cause analysis, improve system reliability, and drive operational excellence using SRE practices.
Key Responsibilities
- Provide production support for microservices-based applications running in distributed environments.
- Monitor, troubleshoot, and resolve production issues across applications, infrastructure, APIs, and platform services.
- Manage and support Kubernetes environments including deployments, scaling, upgrades, cluster health, and troubleshooting.
- Work with API Gateway platforms such as Azure API Management (APIM), Kong, and IBM API Connect (APIC) for routing, rate limiting, traffic control, and API monitoring.
- Use observability and monitoring tools such as Splunk, AppDynamics, Instana, or similar tools for logs, metrics, traces, dashboards, and alerting.
- Perform root cause analysis (RCA) for incidents and implement corrective and preventive actions.
- Support incident management, problem management, and post-mortem activities.
- Apply SRE best practices including SLIs, SLOs, error budgets, automation, and continuous improvement.
- Improve platform reliability, resilience, and performance through proactive monitoring and operational enhancements.
- Collaborate with development, infrastructure, cloud, and support teams to ensure stable production operations.
Required Skills
Preferred Background
- 8–10 years of relevant experience in production support / operations support roles.
- Administrative support experience can also be considered if the candidate has strong platform support and troubleshooting exposure.
- Experience supporting high-availability, enterprise-scale distributed systems is preferred.
- Dice Id: 91165420
- Position Id: 171-38949-
- Posted 2 days ago
Company Info
About United Technology
At United Technology, we specialize in connecting businesses with exceptional tech talent that drives innovation and accelerates growth. With an extensive network of highly skilled professionals in fields such as software development, cybersecurity, data science, cloud computing, and IT infrastructure, we match the right individuals to the right opportunity.