Overview
On Site
Depends on Experience
Contract - Independent
Contract - 12 Month(s)
No Travel Required
Unable to Provide Sponsorship
Skills
Integrate OpenTelemetry data pipelines with Grafana and other observability platforms. Develop and maintain OpenTelemetry collectors and exporters for various environments. Develop and implement monitoring solutions for applications and infrastructure to ensure high availability and performance.
Job Details
Grafana Dashboard Specialist & Automation SME (DevOps)
Irving, Dallas, Plano – Texas
12+ Months
Hybrid
Job Summary:
We are seeking a highly skilled and motivated Grafana Dashboard Specialist with deep expertise in DevOps automation to join our team. The ideal candidate will be responsible for designing, developing, and maintaining advanced Grafana dashboards that provide actionable insights into system performance, application metrics, and business KPIs. Additionally, the candidate will be a Subject Matter Expert (SME) in automation, developing and contributing to CI/CD pipelines, infrastructure as code (IaC), and cloud-native operations for Grafana.
Key Responsibilities:
Grafana & Observability:
- Design and implement visually compelling and data-rich Grafana dashboards for Observability.
- Integrate Grafana Cloud with data sources such as Prometheus, Loki, ServiceNow, PagerDuty, Snowflake, and AWS.
- Integrate telemetry data sources such as Tomcat, Liberty, Ping, Linux, Windows, databases (Oracle, PostgreSQL), and REST APIs.
- Create alerting mechanisms for SLA breaches, latency spikes, and transaction anomalies.
- Develop custom panels and alerts to monitor infrastructure, applications, and business metrics.
- Collaborate with stakeholders to understand monitoring needs and translate them into KPIs and visualization requirements.
- Optimize dashboard performance and usability across teams.
- Implement and manage OpenTelemetry instrumentation across services to collect distributed traces, metrics, and logs.
- Integrate OpenTelemetry data pipelines with Grafana and other observability platforms.
- Develop and maintain OpenTelemetry collectors and exporters for various environments.
- Develop and implement monitoring solutions for applications and infrastructure to ensure high availability and performance.
- Collaborate with development, operations, and other IT teams to ensure monitoring solutions are integrated and aligned with business needs.