NOC Engineer / Service Reliability

  • Odesus
  • West Los Angeles, CA
  • 3 weeks ago
NOC, network technician, ITIL, Linux, data center, Monitoring, ServiceNow, JIRA, NewRelic, Nagios, Elastic Search, Grafana, Systems Admin, Python, Automation, Incident management
Contract W2, 12 Months
Depends on Experience
Work from home available; travel not required

Job Description

A large eSports gaming company in West Los Angeles is looking to hire a NOC Technician/Service Reliability Admin for its Network Operations Center (NOC) with experience in the following:

    1. Monitoring tools: NewRelic is a huge plus, but any other monitoring tool is fine
    2. Automation tools: Python, Bash, Ruby, or JavaScript
    3. ServiceNow for Incident management
    4. Linux Systems Administration  
    5. Kubernetes / cloud is a huge plus

The Network Operations Center (NOC) manages the 24x7 monitoring and response components of the company's player-facing services. We are the first line of defense when things go wrong with any of our live services and many of our internal services as well. We combine technical familiarity with best-practice processes to rapidly remediate incidents. The team is staffed with Administrators and Specialists who provide reliable triage services across many levels of technical and process operations. The team also helps other teams establish best practices in alerting, monitoring, and operational processes, and mentors them along the way.

As a Service Reliability Administrator, you will work closely with the Live Operations team and the company globally to establish and maintain a high-performing and highly available game service for players. You will monitor and support all aspects of production environments, development environments, and general system needs. Your technical skills and grasp of system integration will help you diagnose and communicate potential issues to employees and the community, improving the quality of the player experience. You will be an expert in operational and triaging skills. The team can rely on you as a proactive individual, focused on solving day-to-day problems that affect any aspect of running live games.

Responsibilities:

  • Act as first responder, triage agent, or escalation point from the NOC to external teams.
  • Self-organize with the team around live incidents.
  • Execute technical runbooks in a fast-paced environment.
  • Multitask rapidly to address issues affecting our players and services.
  • Work in a fast-paced, constantly changing environment.

Required Qualifications:

  • 1 year of experience as a NOC Technician or in an equivalent role (Analyst, Live Operations, etc.)
  • Familiarity with the core concepts of operating systems, networking, and software life cycles
  • Enthusiasm around operations and technology
  • Highly driven and self-motivated

Desired Qualifications:

  • Scripting/programming proficiency is highly desired
  • Experience working on deployments in a live environment is a plus
  • Multiple language proficiency is a plus, especially Mandarin
  • Linux+ and Network+ certifications, or equivalents
  • Experience with the following:
      • Monitoring solutions, e.g. NewRelic, Nagios, Elasticsearch, Grafana
      • Event management tools, e.g. BigPanda, Moogsoft
      • ITIL-based ticketing systems, e.g. ServiceNow, JIRA
Dice Id : 10106335
Position Id : 6497471
Originally Posted : 2 years ago