NOC Lead

Overview

On Site
$80,000 - $100,000
Full Time

Skills

NOC
Kubernetes
scripting
Monitor

Job Details

IT Ops / NOC Lead

  • A minimum of 10 years of experience in technical support, a NOC, or a similar position, including batch job processing and job scheduling tools.
  • Experience working with Windows/Linux systems, specifically in managing or supporting production environments
  • Experience working with current technologies such as Kubernetes, Prometheus, Splunk, and Datadog
  • Experience working with the industry's major cloud providers, such as Azure
  • Experience in managing production downtime incidents
  • High level of English proficiency
  • Independent in your work, self-motivated
  • Be a problem solver and have excellent troubleshooting skills
  • Be able to handle a stressful and dynamic working environment
  • Excellent communication skills
  • Be able to work shifts, including weekends and holidays
  • Have scripting skills (Shell/Bash, Perl, Python, Ruby, PowerShell)
  • Excellent working relationships with interfacing teams

As a Lead for Network Operations, you will be responsible for overseeing and managing the day-to-day operations of the network infrastructure within the organization. You will lead a team of network professionals, ensuring the availability, stability, and performance of the network environment. Your role will involve collaborating with cross-functional teams, implementing network best practices, and driving continuous improvement initiatives.

  • Monitor applications and performance of production servers
  • Determine the root cause of production issues
  • Production crisis management
  • Procedures and reports management
  • Communication and escalation of complex issues to development and system teams
  • Perform fault detection, isolation, resolution, and root cause analysis
  • Responsible for managing and coordinating the NOC team.
  • Point of contact for all NOC escalations, both external and internal.
  • Drives incident management, monitoring, tracking, and ensuring that SLAs are met.
  • Develops and implements new solutions, strategies, and processes to support the NOC's standard operating procedures.
  • Sets work schedules for 24x7x365 coverage.
  • Oversees the work of the team to ensure that system requirements have been properly implemented and procedures carefully followed.
  • Provides input to improve stability, security, efficiency, and scalability of systems.
  • Responsible for monitoring production, staging and development environments for a large number of applications in an agile, fast-paced organization.
  • Monitoring the performance and capacity of computer systems in real time and responding to alerts in near real-time.
  • Server build and installs, application upgrades, network equipment build and installation.
  • Maintaining hardware audits.
  • Performing regular checks on the production stack to ensure the systems and services are running in an optimal fashion.
  • Analyzing production network issues to suggest corrective action.
  • Provide critical service outage notifications and escalate issues for timely resolution; notify the appropriate Company personnel
  • Effectively communicates with team members and trains them in technical aspects.
  • Maintenance of the NOC wiki and technical documentation of the processes and procedures used in normal operations.

About Stanley David and Associates