Role: NDR & Platform Observability Architect
Duration: 12+ Months
Location: Minneapolis, MN or Hartford, CT (onsite; local candidates only)
Position Summary
We are seeking a highly skilled Senior NDR & Platform Observability Engineer to support the operational health, visibility, and performance of the enterprise Network Detection & Response (NDR) environment, with a primary focus on the Corelight platform and associated telemetry pipelines.
This role combines deep expertise in security operations, network monitoring, and observability engineering to design and maintain a modern monitoring architecture leveraging APIs, automation, time-series databases, and visualization platforms. The ideal candidate will help ensure platform reliability, improve detection efficacy, reduce alert noise, and enhance enterprise-wide security visibility.
Key Responsibilities
NDR Operations
- Manage daily operations of NDR sensors, appliances, and Zeek-based detection pipelines.
- Monitor sensor health, packet throughput, ingestion rates, and packet drop metrics.
- Perform triage and escalation support for NDR alerts in collaboration with SOC and Incident Response teams.
- Tune Zeek scripts, Suricata rules, and Corelight detection packs to improve detection efficacy and reduce alert noise.
- Identify and resolve data gaps, ingest delays, and visibility issues.
- Troubleshoot packet broker integrations, SPAN/TAP feeds, and network visibility paths.
Observability & Monitoring Architecture
- Design and implement an enterprise-grade observability framework for NDR platforms and telemetry systems.
- Develop Python-based metrics collectors leveraging REST APIs.
- Integrate monitoring data into Prometheus, InfluxDB, and other time-series platforms.
- Configure and maintain Telegraf pipelines for data collection, parsing, tagging, and forwarding.
- Build real-time and historical dashboards in Grafana.
- Establish SLIs/SLOs for platform reliability, ingest freshness, sensor uptime, and pipeline availability.
Automation & API Integration
- Develop Python automation scripts for health checks, reporting, and data validation.
- Integrate with SIEM platforms, packet brokers, and telemetry APIs to collect operational metrics.
- Build custom Prometheus exporters and collectors when native integrations are unavailable.
- Automate repetitive operational workflows including:
  - Sensor health validation
  - Alert verification
  - Data integrity checks
  - Status reporting
Required Qualifications
- 5+ years of experience in:
  - Security Operations
  - NDR Engineering
  - Network Engineering
  - Observability Engineering
- Hands-on experience with:
  - Corelight
  - Zeek
  - Suricata
  - Endace
  - cPacket or related NDR technologies
- Strong Python scripting and automation skills.
- Experience with observability and monitoring tools:
  - Grafana
  - Prometheus
  - InfluxDB
  - Telegraf
- Strong understanding of:
  - Network traffic analysis
  - Packet capture technologies
  - Troubleshooting methodologies
- Experience building dashboards, alerts, and monitoring pipelines at scale.
- Background supporting SOC and Incident Response operations.