Overview
On Site
$45 - $50
Contract - Independent
Contract - W2
Contract - 6 Month(s)
Skills
Budget
Collaboration
DevOps
Reliability Engineering
Management
Operational Risk
Knowledge Base
Workflow
Splunk
Scripting
Regulatory Compliance
SQL
Onboarding
Continuous Integration
Continuous Improvement
Quality Assurance
Continuous Delivery
Root Cause Analysis
Documentation
Communication
Dynatrace
Incident Management
Knowledge Sharing
Operational Efficiency
Job Details
Position: Platform Engineer/DevOps (candidates will take a technical call)
PV W2
Location: Onsite at any of the following: Berkeley Heights, NJ; Frisco, TX; Alpharetta, GA; or Wilmington, DE
Job Description:
- Own and manage deployment pipelines tailored to individual applications. This includes coordinating release schedules, validating deployment artifacts, and ensuring smooth rollout processes across environments. Collaborate with development and QA teams to ensure deployments meet quality and compliance standards.
- Establish and operate anomaly detection mechanisms to identify and respond to incidents quickly. Develop triage workflows that streamline root cause analysis and resolution. Work closely with support and engineering teams to minimize downtime and improve incident response times.
- Create and maintain automation scripts and tools that enhance operational efficiency and reduce manual effort. Integrate automation into CI/CD pipelines, monitoring systems, and incident response processes. Continuously evaluate and improve tooling to support evolving application needs.
- Develop comprehensive runbooks, operational guides, and knowledge base articles for supported applications. Ensure documentation is up-to-date, accessible, and aligned with best practices. Promote knowledge sharing across teams to improve onboarding and reduce operational risk.
- Manage the lifecycle of application certificates, including issuance, renewal, and monitoring. Ensure certificates are properly configured to support secure communication and meet compliance requirements. Automate certificate processes where possible to reduce risk and overhead.
Qualifications:
- At least 5 years of broad, relevant engineering experience in application operations, DevOps, or site reliability engineering.
- Hands-on experience with operations monitoring and observability.
- Knowledge of tools such as Splunk, Dynatrace, or similar platforms. Ability to design and implement telemetry solutions that provide actionable insights into application performance and reliability.
- Experience triaging and investigating alerts.
- A strong passion for automation and continuous improvement. Proven ability to develop scripts, tools, and integrations that reduce manual effort and improve operational efficiency.
- Excellent written and verbal communication skills. Capable of producing clear documentation, runbooks, and reports for both technical and non-technical audiences.
- Experience applying SRE principles such as SLOs, error budgets, and resilience engineering to improve system reliability and performance.
- Basic SQL knowledge.