Key Responsibilities
Operational support of network environments:
Switching and routing (underlay and overlay)
Firewall, Traffic Management, Content Inspection, and DNS
Identify service impact and interpret monitors, dashboards, traffic captures, and logs using:
Splunk, SevOne, IBM Watson AI Ops, Wireshark, NetScout, and Gigamon
Cisco Nexus / ACI, Arista CloudVision, VMware vSphere
Identify possible production failure scenarios through eyes-on-glass monitoring of IT infrastructure services
React to failures according to business impact, and communicate with management and technical escalation teams
Initiate production support triage efforts for network infrastructure incidents, manage bridge line troubleshooting and appropriate team engagement, engage in technical research and troubleshooting, and escalate to the next level of leadership as needed
Provide status updates and technical detail for awareness communications, ensure accuracy of all communications sent, and ensure any necessary follow-ups are scheduled
Own data quality and completion of incident tickets, ensuring all impacts are accurately recorded and documented in the system of record
Work on ad hoc reports and offline incidents at the direction of senior team members or leadership
Promote and enforce production governance during triage, testing, and fix efforts, and exercise judgment within defined procedures and practices to determine appropriate action
Adhere to design standards and global design authority processes and procedures
Assemble professional documents based on existing templates, and provide accurate work descriptions with assumptions and caveats
Skills:
Splunk, SevOne, IBM Watson AI Ops, Wireshark, NetScout, and Gigamon
Cisco Nexus / ACI, Arista CloudVision, VMware vSphere
Understanding of enterprise network infrastructure (routing, switching, wireless, SD-WAN).
Hands-on experience with Splunk for data ingestion, searches, dashboards, and alerts.
Experience working with network telemetry sources such as SNMP, syslog, telemetry streams, APIs, and device metrics.
Ability to translate raw telemetry into actionable insights for Operations teams.
Strong analytical skills and attention to data quality.
Good communication skills to work effectively with Engineering and Operations teams.