SRE Platform Engineer

Remote • Posted 3 hours ago • Updated 3 hours ago

Contract W2

No Travel Required

Remote

$85 - $90/hr

Job Details

Skills

.NET
BMC
Cloud Computing
Communication
C#
DevSecOps
DevOps
Dynatrace
Microsoft SCOM
Agile
Microsoft Azure
Data Collection
Analytics

Summary

Eclaro's client is searching for a Lead SRE Platform Engineer to drive reliability engineering strategy and execution across critical IT Business Solutions platforms.

**This is a remote role working EST hours. No third parties; unable to subcontract.**

What You’ll Do

Reliability & Observability Leadership

Define and mature SRE best practices across cloud and on-prem environments.
Design and implement comprehensive monitoring strategies using tools such as:

Dynatrace
Datadog
Microsoft SCOM

Develop dashboards, alerts, synthetic testing, and proactive monitoring capabilities.
Establish and evolve a MELT data strategy to improve service reliability.
Provide data-driven RCA investigations and implement preventative solutions.

Platform & Application Reliability

Support and enhance reliability across:

Cloud & Infrastructure

Microsoft Azure (software, storage, Azure Local)
Hyper-V and legacy VMware environments
NetApp and Pure storage platforms
Azure Log Analytics
Infrastructure as Code using Terraform
Migration from Azure DevOps to GitHub (strong GitHub experience required)

Order Management Systems

Azure-based, internally developed .NET/C# applications
Internal message queuing systems
Logging, analytics, and synthetic testing post-patching
API-based integrations

Workforce & Payroll Platforms

Workday (Payroll)
ADP Vantage (Timekeeping)

Warehouse & Distribution Systems

Blue Yonder Warehouse Management System (WMS)
Vocollect handheld voice picking devices
Network analytics for identifying dead zones and connectivity issues
Barcode scanners and device connectivity troubleshooting

DevSecOps & Automation

Lead CI/CD reliability improvements (Azure DevOps → GitHub transition critical).
Enhance pipeline automation with embedded security controls.
Advance Infrastructure-as-Code standards (Terraform).
Improve configuration management and change governance.
Drive automation to reduce manual intervention and operational risk.

ITSM & Incident Management

Work within BMC ecosystem including:

BMC Helix
BMC Remedy
BMC Server Automation

Optimize automated incident generation (SCOM → BMC workflows).
Improve triage, escalation, and impact modeling across services.
Monitor vendor performance and escalate appropriately.
Participate in off-hour escalation support when required.

Strategic Impact

Develop predictive reliability models using statistical techniques.
Identify systemic risk across production systems.
Guide tooling decisions (e.g., Dynatrace vs. Datadog or other observability platforms).
Ensure regulatory and operational compliance standards are met.
Facilitate cross-functional collaboration and document SRE procedures and planning artifacts.

Required Qualifications

5–7+ years of Software Engineering and Infrastructure/Database Engineering experience.
Deep expertise in:

DevSecOps practices
Observability platforms
API integrations
Performance management tools
ITIL principles
ITSM data analytics
MELT data collection and analysis

Experience in Azure cloud environments.
Strong analytical and problem-solving skills.
Demonstrated ability to influence technical direction.
Excellent communication and cross-team collaboration skills.
Continuous improvement mindset focused on reliability engineering.

Preferred Qualifications

Strong programming experience in:

.NET / C#
Python
SQL

Experience with MSSQL (primary) and Oracle (limited).
Experience with GitHub (critical for upcoming transition).
Agile/Scrum experience.
Knowledge of Reliability-Centered Engineering and maintenance strategies.
Experience with synthetic testing and proactive validation post-deployment.
Bachelor’s degree in a related technical field.

What Success Looks Like

Improved uptime and measurable reduction in production incidents.
Faster, data-driven RCA resolution and prevention of repeat issues.
Clearly defined observability strategy with actionable dashboards.
Automated, secure, and reliable deployment pipelines.
Enhanced visibility into warehouse, order management, and payroll systems.
A scalable SRE foundation built for long-term growth.

Dice Id: ndi
Position Id: 8949091