Observability Engineer - Remote / Telecommute

  • Alpharetta, GA
  • Posted 2 days ago | Updated 2 days ago

Overview

On Site
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - 1 Month(s)

Skills

API
Amazon Web Services
Backend Development
Cloud Computing
Collaboration
Communication
Conflict Resolution
Dashboard
Data Collection

Job Details

We are looking for an Observability Engineer (Remote / Telecommute) for our client in Alpharetta, GA.
Job Title: Observability Engineer - Remote / Telecommute
Job Location: Alpharetta, GA
Job Type: Contract
Job Description:
Responsibilities:
  • Design and implement observability frameworks: Integrate and manage monitoring, logging, and tracing for cloud-native and on-premises systems.
  • Data integration and ingestion: Build systems to collect and ingest data from various sources, often through APIs, and manage time-series data at scale.
  • Develop custom solutions: Create custom dashboards, alerts, and reporting views to provide clear, actionable insights into system performance.
  • Build status pages: Develop and maintain status pages that provide a user-friendly web experience to communicate system health to stakeholders.
  • Tooling expertise: Leverage expertise in tools like Prometheus and Grafana for time-series data collection and visualization, and Dynatrace for deep-dive analysis and monitoring.
  • Collaboration and evangelism: Work with development, operations, and SRE teams to identify performance metrics (e.g., latency, errors).
  • Performance and reliability: Use data-driven insights to improve system reliability, reduce Mean Time To Resolution (MTTR), and optimize resource usage.
Required Skills and Qualifications:
Technical Skills:
  • Proficiency in Prometheus, Grafana, and Dynatrace.
  • Experience with time-series data, including PromQL.
  • Strong knowledge of API design, data modeling, and data pipelines.
  • Expertise in scripting and backend development (e.g., Python, Go, Java).
Experience:
  • Proven experience designing and scaling observability stacks for production systems.
  • Hands-on experience with cloud platforms (AWS, Azure).
  • Experience with containerization and container orchestration (e.g., Kubernetes).
  • Familiarity with dashboards as code and Terraform.
Soft Skills:
  • Excellent problem-solving, communication, and stakeholder management skills.
  • Ability to translate complex data into actionable recommendations for technical and non-technical audiences.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.