Overview
Remote
Depends on Experience
Contract - W2
Skills
DataDog
AWS
Linux
RedHat
SAFe
ITIL
F5
SSL
Shell Scripting
Unix
Training
User Guides
ServiceNow
Job Details
Position Title: Lead Systems Engineer
Location: Washington, DC / Remote
Duration: 9 Months
W2 Only
Job Overview:
We are seeking a Lead Systems Engineer to support the Client's Systems Monitoring initiatives for several Statements of Work (SOWs) in 2025 and beyond. The ideal candidate will bring deep expertise in monitoring tools, particularly DataDog, and strong experience on the Linux platform. This role covers the full lifecycle of monitoring tools administration, including implementation, scripting, dashboard creation, and cross-functional collaboration.
Key Responsibilities:
- Administer and maintain monitoring tools, primarily DataDog, on Linux platforms.
- Configure infrastructure, network, and application monitoring, including centralized logging and SNMP-based monitoring.
- Instrument Java-based applications (e.g., running on Tomcat) with DataDog for Application Performance Monitoring (APM).
- Create and manage dashboards and visualizations in DataDog.
- Administer related monitoring platforms such as ELK Stack (Elasticsearch, Logstash, Kibana) and CloudBeat for synthetic monitoring.
- Write automation scripts using Shell, Python, or Ansible.
- Support logging configurations from various platforms including WebSphere, Tomcat, and AIX.
- Set up Browser Real User Monitoring (RUM) and Synthetic Monitoring using Selenium and CloudBeat.
- Troubleshoot production performance issues, correlate cross-platform data, and provide root cause analysis.
- Collaborate with architecture and development teams to integrate monitoring early in the SDLC.
- Document tool usage, configurations, and procedures, and provide internal training as needed.
Required Skills and Experience:
- 5-8 years of IT experience in distributed environments (Windows, Linux/Unix, VMware, SQL Server, network infrastructure).
- Minimum 3 years of hands-on experience with DataDog administration or equivalent experience with ELK Stack.
- Proficient in Shell scripting, Python, and Selenium; VuGen is a plus.
- Experience configuring SSL certs and encryption on Linux systems.
- Understanding of F5 Load Balancers, WebSEAL, SNMP, Palo Alto, Gigamon, and network monitoring tools.
- Comfortable with setting up monitoring in cloud and hybrid environments.
- Experience with alerting, dashboarding, and reporting in DataDog or similar platforms.
- Strong documentation skills, including SOPs, training material, and user guides.
- Familiarity with service level management (SLAs, SLRs, etc.).
- Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent experience).
- Experience with Agile/SAFe methodologies.
- Exposure to both Waterfall and Agile SDLC environments.
Preferred Certifications:
- ITIL Foundations v3 (must be obtained within 180 days if not currently held)
- SAFe Certification