Overview
Hybrid, 2 days onsite
$150,000 - $200,000
Full Time
No Travel Required
Skills
Grafana
Azure
GCP
Network Monitoring
Network Performance
Terraform
Python
Azure Monitor
Observability
Dashboard
Prometheus
OpenTelemetry
Network Performance Metrics
Firewall
BGP
Kubernetes
Azure AZ-120
Job Details
You must be local to Pleasanton, CA; Phoenix, AZ; or Plano, TX. Do not apply if you are not in one of these three locations.
W2 and C2C Accepted
Sr. Network Observability Engineer SME (Azure/Google Cloud Platform/OCI, Grafana, MSFT/Google Cloud Platform/OCI tooling)
Key Responsibilities:
- Design and deploy scalable network observability frameworks for multi-cloud and hybrid-cloud environments (Azure, Google Cloud Platform, OCI) using Grafana, Prometheus, OpenTelemetry, and cloud-native tools.
- Implement custom dashboards, alerts, and log analytics for network performance metrics (latency, packet drops, BGP routing health, throughput) and security telemetry (firewall logs, flow logs, IDS/IPS).
- Integrate observability tools with cloud networking services:
  - Azure: Monitor ExpressRoute/VNet Gateway metrics, NSG Flow Logs, Traffic Analytics.
  - Google Cloud Platform: Stackdriver/Operations Suite for VPC flow logs, Firewall Insights, Network Intelligence Center.
  - OCI: VCN Flow Logs, Network Visualizer, Service Connector Hub.
- Automate observability pipelines using Terraform, Python, or PowerShell to ingest, correlate, and visualize telemetry data.
- Troubleshoot network anomalies by analyzing packet captures (PCAP), NetFlow/sFlow, and distributed tracing data.
- Collaborate with SRE and DevOps teams to reduce MTTR via AI/ML-driven anomaly detection (e.g., Azure Sentinel, Google Cloud Platform Chronicle, OCI AI Anomaly Detection).
- Optimize costs by right-sizing monitoring tools and eliminating redundant telemetry data.
Required Skills & Experience:
- 8+ years of experience in network observability, monitoring, or cloud operations, with expertise in Azure/Google Cloud Platform/OCI.
- Hands-on experience with:
  - Grafana (dashboarding, Loki for logs, Mimir for metrics).
  - Cloud-native tools: Azure Monitor, Google Cloud Platform Cloud Logging/Monitoring, OCI Observability & Management.
  - Telemetry protocols: SNMP, gNMI, NetFlow/IPFIX, eBPF.
  - Network diagnostics: Wireshark, tcpdump, traceroute, BGP route analytics.
  - Automation/scripting: Python, Terraform, or equivalent IaC tools.
- Certifications (Preferred):
  - Azure: AZ-120 (Monitoring), AZ-700 (Networking).
  - Google Cloud Platform: Professional Cloud Network Engineer.
  - OCI: Oracle Cloud Infrastructure Certified Architect.
  - Grafana: Grafana Certified Associate (or higher).
Nice-to-Have:
- Experience with AIOps platforms (Dynatrace, New Relic, Splunk ITSI).
- Knowledge of Kubernetes networking observability (Calico, Cilium, Istio).
- Familiarity with compliance frameworks (ISO 27001, NIST CSF) for audit logging.
Resumes to be sent on