Overview
Full Time
Contract - W2
Contract - LONG TERM
Skills
Splunk
Datadog
AppDynamics
Elastic
PagerDuty
OpenTelemetry
Job Details
Job Title: Observability Specialist (Application Performance Monitoring)
Location: Remote in Canada
Job Type: Contract
Job Summary:
We are seeking Senior Software Developer contractors with expertise in observability tools and practices to join our team. The contractors will work alongside senior developer employees (with 10+ years of software development experience) to design and implement a world-class observability suite. This initiative aims to improve system reliability and performance and to reduce MTTA/MTTR by adopting standard SRE patterns and integrating industry-leading observability tools.
Key Responsibilities:
- Collaborate with internal senior developers to build and implement an integrated observability suite.
- Design and implement monitoring, logging, and alerting solutions using tools such as AppDynamics, Datadog, Splunk, Elastic, and PagerDuty.
- Adopt and promote standard SRE patterns for monitoring, on-call management, and alerting.
- Integrate OpenTelemetry for distributed tracing and observability across services.
- Consolidate application and SIEM logging into a single aggregation platform.
- Provide expertise in observability best practices and mentor team members without prior observability experience.
- Ensure observability solutions are automated, intuitive, and democratized for use across teams.
- Collaborate with stakeholders to define KPIs and SLIs for system reliability and performance.
Required Skills and Experience:
- 8+ years of software development experience, with at least 3 years focused on observability and SRE practices.
- Hands-on experience with Datadog for monitoring and observability.
- Hands-on experience with AppDynamics for application performance monitoring, business transaction insights, and end-user experience.
- Expertise in log aggregation tools such as Splunk or Elastic.
- Experience with PagerDuty for alerting and incident management.
- Proficiency in OpenTelemetry for distributed tracing and observability.
- Strong understanding of SRE principles, including monitoring, on-call management, and alerting.
- Experience with automation and integration of observability tools into CI/CD pipelines.
- Excellent communication and collaboration skills to work effectively with cross-functional teams.
- Ability to mentor and upskill team members in observability practices.
Preferred Qualifications:
- Experience in building observability solutions for large-scale distributed systems.
- Familiarity with SIEM tools and security logging practices.
- Knowledge of cloud platforms (AWS, Azure, or Google Cloud Platform) and their observability services.
- Certification in observability tools (e.g., Datadog, Splunk) or SRE practices.