Overview
On Site
Full Time
Skills
Pivotal
Collaboration
MIT
Computer Science
Computer Networking
Innovation
Systems Architecture
Analytical Skill
Problem Solving
Conflict Resolution
Communication
LDAP
Virtualization
Configuration Management
Linux
Reliability Engineering
HPC
Research
Network
Job Details
Posting Description
LEAD SITE RELIABILITY ENGINEER, Office of Research Computing and Data (ORCD), to build and advance SRE functions in collaboration with a diverse team of systems engineers; play a pivotal part in the strategic transformation of infrastructure planning, design, delivery, and operations in support of ORCD's continued growth; and build and foster cross-functional collaboration between engineering and operations teams across MIT, ensuring alignment with institutional objectives and long-term strategic initiatives.
Job Requirements
REQUIRED: Bachelor's degree in engineering, computer science, or a related field, or equivalent industry experience; a minimum of seven years of experience in site reliability engineering or a related field; deep and broad expertise across multiple technical domains, including Linux, networking, and virtualization; ability to drive innovation in system architecture and lead transformative design initiatives from the ground up; robust analytical and structured problem-solving skills, coupled with excellent communication and interpersonal abilities; deep understanding of Linux, LDAP, virtualization, and configuration management in a large Linux-based engineering environment.
PREFERRED: 10+ years of experience in site reliability engineering; experience working within an HPC/research computing environment; ability to analyze network traffic to identify technical issues and suspicious activities.
Job #24909-11
4/8/2025