Position Overview
We are seeking a highly experienced Senior Network Architect to lead the design, architecture, and evolution of large-scale AI/ML, data center, and backbone network infrastructure. The ideal candidate will have deep expertise in high-performance networking, multi-terabit WAN architectures, EVPN/VXLAN fabrics, network automation, and cloud-scale infrastructure supporting AI workloads.
Key Responsibilities
· Design and architect large-scale AI/ML data center networks and high-capacity WAN infrastructure.
· Lead deployment of EVPN/VXLAN fabrics supporting GPU clusters and AI training environments.
· Drive network scalability, reliability, performance, and automation initiatives across global infrastructure.
· Design and optimize low-latency, high-throughput networks supporting RDMA/RoCE workloads.
· Develop network automation solutions using Python, Ansible, Terraform/OpenTofu, and CI/CD pipelines.
· Define network standards, operational processes, observability frameworks, and reliability best practices.
· Collaborate with infrastructure, cloud, systems, and AI engineering teams on strategic architecture initiatives.
· Lead troubleshooting and performance optimization for large-scale production environments.
· Mentor engineers and contribute to technical leadership, documentation, and architecture reviews.
Required Qualifications
· 15+ years of experience in Network Architecture, Network Engineering, or Network Reliability Engineering.
· Deep expertise with:
o BGP, OSPF, IS-IS, MPLS
o EVPN/VXLAN
o Data Center Networking
o WAN and Backbone Architecture
o AI/ML Infrastructure Networking
o Network Performance and Capacity Planning
· Strong experience with Juniper, Arista, and Cisco platforms in multi-vendor environments.
· Hands-on experience with Linux administration and network automation.
· Strong scripting/programming skills in Python, Go, Bash, or similar languages.
· Experience with Infrastructure-as-Code and automation frameworks (Ansible, Terraform/OpenTofu, Pulumi).
· Experience building highly available, scalable cloud and data center networks.
Preferred Qualifications
· Experience supporting AI training clusters, GPU fabrics, or HPC environments.
· Knowledge of PTP, RDMA, RoCEv2, and low-latency networking technologies.
· Experience with network observability platforms such as Kentik, ThousandEyes, Zabbix, Nagios, or similar.
· Exposure to AWS, Google Cloud Platform, and hybrid cloud networking architectures.
· Experience leading architecture reviews and cross-functional infrastructure programs.
Nice to Have
· Experience in hyperscaler environments.
· Participation in industry organizations such as NANOG, RIPE, or the Internet Society.
· Background supporting multi-terabit AI or research infrastructure environments.
Ideal Candidate: strong network architecture background, AI/ML data center networking experience, EVPN/VXLAN expertise, automation-first mindset, and proven success operating in hyperscale environments at organizations such as Meta, Microsoft, Adobe, or similar.