Job Details
Job Title: Reliability Engineer - Datadog
Type: Contract
Location: Minneapolis, MN
About Us
At Vilwaa, we focus on delivering IT services and solutions in Data Engineering, Cloud Infrastructure, Digital Application Development, and the Internet of Things (IoT).
We are seeking an experienced Reliability Engineer with deep Datadog expertise to lead and deliver end-to-end enterprise transformation. You'll work directly with client stakeholders to guide them through requirements, configuration, integration, and post-go-live support.
Job Description:
Implement full-stack monitoring using Datadog (APM, Logs, Dashboards, Alerts).
Work with development teams to define key application health metrics.
Analyze incident trends and noise levels to fine-tune alerting rules.
Use AppDynamics and Splunk for supplemental monitoring and log analysis.
Review incidents and problem records in ServiceNow; contribute to incident trend reports.
Create and maintain technical documentation related to monitoring setup and standards.
Apply analytical skills to research and reduce incident volume.
Required Skills:
Strong experience with Datadog AIOps (including setting up APM, logs, dashboards, and alerts).
Working knowledge of AppDynamics and Splunk logging.
Familiarity with ServiceNow (incident/problem management and reporting).
Ability to collaborate with developers and reliability teams.
Strong troubleshooting and analytical skills.
Good documentation skills.