Site Reliability Engineer

The Judge Group, Inc.
QA, Linux, Development, Java, Python, Apache, IIS, SQL, Oracle, MySQL, SDLC, Computer
Full Time
Work from home not available
Travel not required

Job Description

Location: Columbus, OH


6-month contract-to-hire (CTH)

  • Hands-on design, analysis, development and troubleshooting of highly distributed, large-scale production systems and event-driven services spanning on-prem and AWS-based hosting
  • Ownership of reliability, uptime, system security, cost, operations, capacity and performance analysis
  • Share a 24x7 on-call rotation with your team and respond to incidents; lead triage bridges during incidents and provide needed status updates
  • Create and maintain monitoring, alerting and dashboarding solutions that improve visibility into our applications' performance and business metrics and keep operational workload in check
  • Use automation technologies to ensure repeatability, eliminate toil, reduce time to action and repair services
  • Participate in technical training events and game day scenarios
  • Partner with engineering, security, performance, QA and product management teams to improve the availability and quality of service of our products

Required Skills:

  • Strong Linux administration/build/management skills
  • Development experience in at least one of the following languages: Java, Go, C# or Python, with strong skills in reading, understanding and writing code in that language
  • Demonstrated expertise building and managing highly scaled production infrastructure in on-prem and AWS-based environments
  • Extensive experience troubleshooting n-tier architectures with diverse sets of technologies (e.g., load balancers, web/app/caching/database servers, queues, threading, memory, CPU, heap, storage, network, OS) strongly desired
  • Strong experience using application and infrastructure monitoring systems (like Splunk, CloudWatch, Datadog, New Relic, Sumo Logic, ELK)
  • Excellent presentation and communication skills
  • Mastery of infrastructure automation technologies (like Terraform, Puppet, Ansible, Chef)
  • Expertise with continuous-deployment-based software development lifecycles (e.g., CI/CD)
  • Experience with common middleware (e.g., Apache, NGINX, IIS, Tomcat, JBoss)
  • Experience with SQL databases (e.g., PostgreSQL, Oracle, MySQL)
  • Expertise with SDLC branching, SCM and code deployment systems (Git/Gitflow, Jenkins, CircleCI, Travis CI, etc.)
  • Expertise in container/container-fleet-orchestration technologies (like Docker, Vagrant, Mesosphere)
  • BS degree in Computer Science or a related technical field, or equivalent industry experience


This job and many more are available through The Judge Group.

Company Information

The Judge Group is a privately owned, leading professional services firm. What does that mean? It means we provide technology, talent and learning solutions to businesses around the globe, and we're great at it. Our expertise is positioned at the crossroads of people and technology, two of the most important aspects of successful business today.
Dice ID: cxjudgpa
Position ID: 632523
Originally posted: 1 month ago
