Hi,
I hope this message finds you well. We have an urgent full-time opening with our client for an Observability Engineer. If you are interested, please share your updated resume.
Role: Observability Engineer (20728-1)
Location: Remote
Duration: 12+ Months
Must have:
Dynatrace
Prometheus
Data Modeling
Responsibilities:
Design and implement observability frameworks across cloud-native and on-premises environments.
Integrate and manage monitoring, logging, and distributed tracing systems.
Build scalable data ingestion pipelines to collect data from diverse sources, including APIs.
Manage large-scale time-series data with efficient modeling, storage, and retrieval strategies.
Develop custom dashboards, alerts, and reporting views for actionable performance insights.
Build and maintain user-friendly status pages to communicate system health to stakeholders.
Utilize Prometheus and Grafana for time-series data collection, visualization, and alerting.
Use Dynatrace for deep-dive application and infrastructure monitoring and analysis.
Collaborate with development, operations, and SRE teams to define and refine key performance metrics (latency, errors, saturation, throughput).
Promote observability best practices and evangelize data-driven engineering across teams.
Use performance insights to improve system reliability, reduce MTTR, and optimize resource usage.
Implement dashboards-as-code and automate observability components using IaC tools (e.g., Terraform).
Requirements:
Proficiency with Prometheus, Grafana, and Dynatrace.
Strong experience with time-series data and PromQL.
Strong understanding of API design, data modeling, and data pipeline architecture.
Expertise in scripting and backend development using Python, Go, or Java.
Proven experience designing and scaling observability stacks for production workloads.
Hands-on experience with cloud platforms, especially AWS and Azure.
Experience with containerization and Kubernetes environments.
Familiarity with dashboards-as-code, Git workflows, and Infrastructure as Code tools such as Terraform.