Datadog SME

Overview

Remote
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 6 Month(s)

Skills

Datadog
Observability
Infrastructure
APM
Security
SIEM
Kubernetes
Docker
AWS
Azure
GCP
ITIL
SRE
DevOps
Python
Bash
Terraform
Ansible
Monitoring

Job Details

Overview:

We are seeking a highly skilled Datadog Subject Matter Expert (SME) to lead the design, implementation, optimization, and ongoing management of monitoring, observability, and alerting solutions across our technology landscape. The SME will act as a trusted advisor and technical authority, enabling teams to maximize the value of Datadog for performance monitoring, application observability, infrastructure health, security insights, and operational excellence.

Key Responsibilities:

  • Design & Architecture
    • Develop scalable observability strategies using Datadog for applications, infrastructure, cloud services, and security monitoring.
    • Architect dashboards, monitors, and alerting frameworks tailored to business and operational requirements.
  • Implementation & Integration
    • Lead deployment and configuration of Datadog agents, integrations, and APIs across hybrid/multi-cloud environments (AWS, Azure, Google Cloud Platform, on-prem).
    • Integrate Datadog with CI/CD pipelines, logging systems, and collaboration tools (e.g., Slack, ServiceNow, Jira).
  • Optimization & Governance
    • Establish best practices for metric collection, log ingestion, tracing, and anomaly detection.
    • Manage Datadog costs and improve the usage efficiency of licenses and features.
    • Ensure alerting policies reduce noise while providing actionable insights.
  • Collaboration & Enablement
    • Partner with DevOps, SRE, Cloud, Security, and Application teams to embed observability into daily operations.
    • Deliver training and workshops, and produce documentation, to upskill engineering teams in Datadog usage.
  • Troubleshooting & Support
    • Serve as the escalation point for Datadog-related performance or monitoring issues.
    • Perform root-cause analysis using Datadog dashboards, traces, and logs to identify and resolve system issues quickly.

Qualifications:

  • Proven experience as a Datadog SME, Consultant, or Senior Engineer with hands-on deployment and scaling expertise.
  • Strong background in observability, monitoring, and APM practices across distributed systems and microservices.
  • Proficiency with Datadog products (APM, Infrastructure Monitoring, Log Management, RUM, Synthetic Monitoring, Cloud Security Monitoring).
  • Experience integrating Datadog with cloud platforms (AWS, Azure, Google Cloud Platform) and container orchestration (Kubernetes, Docker).
  • Solid scripting and automation skills (Python, Bash, Terraform, Ansible, or similar).
  • Familiarity with ITIL/DevOps/SRE practices and incident response workflows.
  • Excellent communication skills with the ability to influence and train cross-functional teams.
  • Must have at least 6 years of experience with Datadog.

Preferred:

  • Datadog Certifications (Datadog Observability, Infrastructure, APM, or Security).
  • Experience with security monitoring, compliance dashboards, or SIEM integrations.
  • Prior consulting or advisory experience implementing monitoring solutions at enterprise scale.