DevOps Engineer (Site Reliability Engineer)

The Judge Group, Inc.
Full Time
Work from home not available. Travel not required.

Job Description

Location: El Segundo, CA
Description:

Position: DevOps Engineer (Site Reliability Engineer)

Direct hire/Full-time

NOTE: For a quicker response, please apply directly to LWadhwa@judge.com and put "DevOps" in the subject line. If you have the required experience, I assure you an opportunity to interview with our client.

Thank you!

Key Skills:

a. DevOps Engineer is a Build/Release Engineer
b. Skillset is Windows, .NET
c. Troubleshooting issues in build/deployment as well as production issues
d. TFS is good to have, plus experience migrating from TFS to Git/Jenkins
e. Knowledge of AWS/Cloud very useful

DevOps engineers are responsible for ensuring services are highly available, reliable, secure, and scalable.

This is a very senior level position. The ideal candidates are fluent in systems programming and/or automation, and can leverage their experience to solve complex problems associated with running production environments at massive scale in multi-tenant environments.

Responsibilities

  • Employ deep troubleshooting and scripting skills to improve the availability, performance, and security of services
  • Coding and automation of applications on the Cloud Platform
  • Implement automated tests, automated deployments, and operational tools
  • Collaborate with Product and Support teams to plan and deploy product releases
  • Set strategic and operational goals for the team, and work with the team to deliver on those goals
  • Work with Cloud Platform and Operations leaders to develop narratives, backlog grooming, epic planning, and overall sprint planning processes
  • Work with Engineering leadership to build shared services that meet the requirements and needs of the platform and application teams
  • Ensure services are designed with 24/7 availability and operational readiness and rigor
  • Mentor engineers and work with them on career planning and goals
  • Implement proactive monitoring, alerting, trend analysis, and self-healing systems
  • Participate in on-call rotations, driving restoration and repair of service-impacting issues
  • Define non-functional requirements as part of the product lifecycle to influence new designs, standards, and methods for scalable, highly available distributed systems
  • Contribute to product development/engineering as needed to ensure quality of service of highly available services

Requirements:

  • 10+ years of systems/applications automation in 24x7 production services environments
  • BS in Computer Science, Computer Engineering, Math, or equivalent professional experience
  • Fluency with at least one current-generation scripting language used by DevOps professionals (Python, Perl, PHP, Ruby)
  • Java development and/or .NET
  • Excellent troubleshooter, utilizing a systematic problem-solving approach spanning code, systems, and network theory and protocols (TCP/IP, UDP, ICMP); ability to read a packet capture/tcpdump, etc.
  • Demonstrated experience in designing, analyzing, and diagnosing large-scale distributed systems
  • Windows Server and/or Linux systems internals (system libraries, file systems, client/server protocols)
  • Experience with elastically scalable, fault-tolerant, and other cloud architecture patterns
  • Experience operating on AWS (both PaaS and IaaS offerings)
  • Experience in both Windows (2k8R2+) and Linux (CentOS)
  • Security triage and forensic analysis
  • Experience with Continuous Integration and Continuous Delivery concepts, including infrastructure as code utilizing tools like Terraform, CloudFormation, and SaltStack
  • Familiarity with containerization concepts like Docker, and PaaS services on AWS
  • NoSQL/Docker/microservices/forensic-analysis experience is a big plus


Thanks

SHARMISHTHA RAWAT

Staffing Specialist, The Judge Group

Contact: srawat@judge.com

This job and many more are available through The Judge Group. Find us on the web at www.judge.com


Company Information

The Judge Group is a privately-owned, leading professional services firm. What does that mean? It means we provide technology, talent and learning solutions to businesses around the globe, and we're great at it. Our expertise is positioned at the crossroads of people and technology—two of the most important aspects of successful business today.
Dice Id : cxjudgpa
Position Id : 634280
Originally Posted : 3 months ago
