Datadog Systems Engineer - Onsite/Remote

Overview

Remote
Depends on Experience
Contract - Independent
Contract - W2
Contract - 6 Month(s)

Skills

Datadog

Job Details

Title: Datadog Systems Engineer

Location: Telecommuter - DC/MD/VA is preferred, but EST/CST is ok (MST would be ok for A++ candidate). If in DC/MD/VA, will be required to come onsite for random meetings in Reston, VA.

Contract Length: 12 months+

Rate: $60 - 70 C2C

Summary of Must haves:

Backend administration of the Datadog tool (backend administrators only, not frontend monitoring users). Knowledge of all Datadog functionalities is required.

AWS Cloud environment

Big Plus: BigPanda, ServiceNow (CMDB), Healthcare Insurance, Linux OS

Datadog Systems Engineer

We are seeking a seasoned Lead Systems Engineer with deep expertise in Datadog backend engineering within an AWS environment.

In this role, you will oversee the design, implementation, and maintenance of comprehensive monitoring, observability, and incident management solutions for cloud-based infrastructure and applications. You will play a key role in guiding the team to ensure operational excellence, system reliability, and seamless collaboration across IT and engineering teams.

Responsibilities

Lead the architecture, design, and implementation of end-to-end monitoring solutions using Datadog, ensuring high availability and performance of cloud-based services. Backend administration and engineering support of the Datadog tool is a primary responsibility.

Oversee the deployment and management of AWS resources (EC2, RDS, Lambda, ECS/EKS, S3, etc.), ensuring adherence to best practices for scalability, security, and cost optimization.

Define monitoring strategies and best practices, including Datadog dashboards, monitors, alerts, and custom metrics for comprehensive observability.

Architect and manage the integration of Datadog with ServiceNow to automate incident management workflows, event correlation, and CMDB synchronization.

Provide technical leadership and mentorship to junior engineers on best practices for monitoring, logging, and observability.

Collaborate with cross-functional teams to integrate monitoring and logging into CI/CD pipelines and cloud infrastructure.

Drive continuous improvement in system reliability, including SLO/SLI definitions, synthetic monitoring, and anomaly detection.

Contribute to and enforce Infrastructure as Code (IaC) standards using Terraform, CloudFormation, or similar tools.

Participate in high-severity incident management, root cause analysis, and the implementation of corrective actions to prevent future occurrences.

Requirements

Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).

5+ years of experience with AWS cloud services, including deployment, management, and optimization of cloud infrastructure.

3+ years of hands-on experience supporting the Datadog monitoring tool. Seeking candidates with backend administration/engineering support experience with Datadog (not frontend monitoring users). Must know all functionalities of the Datadog tool, including complex dashboards, integrations, and custom metrics.

Demonstrated experience leading teams or projects in a cloud operations or DevOps environment.

Strong proficiency in scripting and automation (Python, Bash, or similar).

Solid understanding of networking, security best practices, distributed systems, and troubleshooting complex cloud architectures.

Preferred Skills (Nice to Have)

Experience with Infrastructure as Code (Terraform, CloudFormation).

AWS certifications (e.g., AWS Certified Solutions Architect, DevOps Engineer).

Experience with Kubernetes monitoring and log aggregation solutions (Fluentd, ELK stack).

Familiarity with other observability tools like Prometheus or Grafana.

ServiceNow certifications or experience with ServiceNow ITOM modules (Discovery, Event Management, CMDB).

Experience integrating Datadog with ServiceNow, including incident management workflows, event management, and CMDB integration.

Additional Notes from the Intake Call:

Backfill positions.

Location: 99% remote. Office meetings occur only when needed for local candidates, and not often. DMV is a BIG PLUS; as close to the DMV as possible is plan B. EST/CST only. Will consider MST for an A+++ candidate.

This group supports Datadog for the monitoring teams; it does NOT use the tool for monitoring. This is a backend administration support position for the Datadog tool.

Datadog is their primary monitoring tool.

BigPanda is another tool they use. Within Datadog, there are a lot of functionalities (infrastructure monitoring, log management, network performance monitoring, software monitoring).

Must know all of them due to the diverse nature of the environment.

They are moving away from GovCloud to Commercial/AWS, migrating all FEPOC claims and enrollment to AWS Cloud. Datadog will be the primary monitoring tool for this cloud environment as well.

Datadog in an AWS Cloud environment is required (must have).

A ServiceNow implementation project for Incident/Problem Management is coming. Migrating ServiceNow into Datadog: BIG PLUS if someone has done this in the past. Any ServiceNow CMDB experience is a BIG PLUS.

BigPanda - Nice to have

DevOps candidates are not what we are looking for here. We are looking for administrators who support the tool on an ongoing basis. DevOps only does the installation and command codes, then doesn't work on anything backend after the installation.

They run Datadog on Linux OS.
