Site Reliability Engineer

Overview

Hybrid
Depends on Experience
Contract - W2
Contract - 12 Month(s)
No Travel Required

Skills

API
GCP
PYTHON
LINUX
SRE
DOCKER
KUBERNETES

Job Details

Experience: 5+ years in SRE/DevOps, with proven implementation experience in JVM, Apigee, Google Cloud Platform observability, the Grafana stack, GKE, OpenTelemetry, and UI instrumentation

Required Skills:
Technical: Python, Linux, Prometheus, Grafana, Kubernetes, Docker, Loki, Tempo
JVM Metrics: Java application monitoring, JVM performance tuning, heap analysis, garbage collection optimization for portal applications
Logging & Tracing: Splunk, distributed tracing, log aggregation standards, correlation IDs across portal systems
API Management: Apigee experience, API monitoring, rate limiting, security, performance tracking for portal APIs
Infrastructure: CI/CD pipelines; AI tools such as GitHub Copilot, Cursor, etc.
Observability Tools & Query Languages: PromQL and InfluxQL for querying metrics in Grafana
Strong experience with Kubernetes (GKE), including namespace management, RBAC, and deploying/maintaining SRE tools via code (Python, Bash, YAML, Helm).

About Technocrafts, Inc.