Site Reliability Engineer Architect

Overview

Remote
$70+
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 12 Month(s)

Skills

SRE
Site Reliability Engineer Architect

Job Details

Hello,
I hope you are safe and healthy.
My name is Kundan Mishra. I am a Technical Recruiter with Altitude Technology Solutions (ATS) INC.
I am writing to invite you to discuss the opportunity below with our customer. I think you or your candidate would be a great fit for it.
ROLE - Site Reliability Engineer Architect
LOCATION - Boston, MA (Remote)
EXPERIENCE - 15+ Years
  • Looking for an architect or someone with program manager experience.
  • Looking for hands-on technical knowledge. The candidate must know how to drive the program.
  • They are not looking for a DevOps-side candidate.
  • Must know how to implement an SRE framework
  • Integrating multiple applications
  • Must understand how data flows from one application to another
  • Must have: Azure Framework
Generic JD (give priority to the details highlighted above):
  • Design, implement, and/or refine Service Management processes (Monitoring, Incident, Problem, Capacity, Change & Release, and Service Level Management).
  • Track system health, performance, and reliability via monitoring and observability platforms; implement proactive alerting mechanisms to detect anomalies and respond swiftly to incidents.
  • Act as a point of escalation for complex incidents, collaborating with senior engineers and management to ensure effective resolution.
  • Establish and enforce change control and release management processes to ensure smooth and controlled deployment of system changes.
  • Conduct post-incident analyses to identify root causes and implement actions to prevent recurrence and improve system resilience.
  • Perform regular system testing to identify vulnerabilities and validate disaster recovery plans.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Participate in system design consulting, platform management, and capacity planning.
  • Integrate reliability practices into CI/CD pipelines to automate testing, quality assurance, and deployment processes.
  • Foster a culture of collaboration between development and operations teams, promoting shared ownership and accountability for system reliability.
  • Create sustainable systems and services through automation and uplifts.
  • Balance feature development speed and reliability with well-defined service-level objectives.
  • Continuously evaluate and enhance system reliability, scalability and performance. Identify areas for improvement and implement solutions to optimize processes and reduce manual toil.
  • Define, track, and monitor SLAs/SLOs to measure and improve system reliability.
  • Collaborate with cross-functional teams to ensure scalable and adequate resource allocations and optimize cost efficiency.
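As a rough illustration of the service-level objective items above, the short Python sketch below computes how much of an error budget is left for a request-based SLO. The function name, the 99.9% target, and the request counts are illustrative assumptions, not figures from the customer.

```python
def error_budget_remaining(slo_target: float,
                           total_requests: int,
                           failed_requests: int) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means the budget is untouched, 0.0 means it is fully spent,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The error budget is the number of failures the SLO tolerates.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests, 250 failures, 99.9% SLO.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"Error budget remaining: {remaining:.1%}")
```

A team might slow feature releases as the remaining budget approaches zero and speed them up while plenty is left, which is one common way to balance development speed against reliability.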
Required skills and qualifications
  • Bachelor's degree (or equivalent) in computer science or a related discipline
  • Proven process definition and implementation experience, leveraging ITIL best practices
  • ITIL V3 Intermediate / Expert certification at minimum (mandatory)
  • Implementation experience of ITSM / ESM tools (e.g., SNOW, Remedy, JIRA)
  • Strong DevSecOps skills with implementation experience; Foundation / Practitioner certification will be an advantage.
  • Coding experience beyond simple scripts (Python, Java, C/C++, and JavaScript)
  • Knowledge of Linux/Unix systems administration and troubleshooting skills
  • Knowledge of relational and NoSQL databases and distributed storage systems; proficiency in database administration, query optimization, and data replication.
  • Familiarity with Incident management and collaboration tools such as JIRA, PagerDuty, Slack, or ServiceNow.
  • Expertise in performance monitoring and analysis tools such as New Relic, AppDynamics, or Datadog.
  • Knowledge of observability platforms (e.g., Dynatrace, SolarWinds), monitoring systems (e.g., Prometheus, Nagios), and log management tools (e.g., ELK stack, Splunk).
  • Strong analytical thinking and problem-solving abilities to identify patterns, troubleshoot issues, and propose effective solutions.
  • Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
  • Previous success in technical engineering

Thanks & Regards

Kundan Mishra
Sr. Technical Recruiter
