DevOps & Site Reliability Lead

Deerfield, IL, US • Posted 1 day ago • Updated 1 day ago
Full Time
On-site
Depends on Experience

Job Details

Skills

  • Apache Hadoop
  • Big Data
  • Capacity Management
  • Apache Spark
  • Data Processing
  • Change Management
  • Continuous Improvement
  • ITIL
  • Offshoring
  • Management
  • Root Cause Analysis
  • DevOps
  • Extract, Transform, Load (ETL)
  • Talend

Summary

Job Title: DevOps & Site Reliability Lead
Location: Deerfield, IL

Type: Full-time position

Job description:

Must Have Technical/Functional Skills
  • We are seeking a Site Reliability Engineer (SRE) with strong expertise in Talend and Big Data platforms to support and operate large-scale data processing environments.
  • The role requires close collaboration with customers, application teams, and offshore delivery teams to ensure platform reliability, incident management, and operational excellence. Experience with Databricks is a strong plus.
Key Responsibilities
  • Act as an SRE for Big Data and ETL platforms, ensuring high availability, performance, and reliability of data pipelines and applications.
  • Provide operational support and Major Incident Management (MIM), including triage, root cause analysis, and resolution of production issues.
  • Serve as a primary point of contact for customers, providing timely updates, issue resolution, and operational insights.
  • Collaborate closely with application teams to support ETL jobs, data processing workflows, and platform enhancements.
  • Coordinate with offshore teams for day-to-day operations, incident resolution, and continuous improvement initiatives.
  • Monitor, troubleshoot, and optimize Talend, Hadoop, Spark, and Big Data ecosystems.
  • Implement and support monitoring, alerting, runbooks, and automation to improve platform stability and reduce manual effort.
  • Participate in problem management, change management, and post-incident reviews to drive preventive measures.
  • Support capacity planning, performance tuning, and reliability improvements across the data landscape.
Required Skills & Qualifications
  • Strong hands-on experience with Talend (development, support, and troubleshooting).
  • Solid understanding of Big Data technologies, including:
      o Hadoop ecosystem
      o Apache Spark
  • Proven experience handling Major Incident Management (MIM) and production support in a 24x7 or on-call environment.
  • Experience working directly with customers, business stakeholders, and cross-functional teams.
  • Strong coordination skills to manage and guide offshore teams.
  • Knowledge of ITIL processes, especially Incident, Problem, and Change Management.
  • Excellent communication, documentation, and stakeholder management skills.


  • Dice Id: 91137892
  • Position Id: 8978620
Contact the job poster
Prathamesh Mane

Recruiter @ iPeople Infosystems LLC