Skills required
If you have 10+ years of experience with global enterprise networking operations, data center management, and infrastructure services in AWS and VMware, you could be a great fit for this role. Strong experience with Python scripting.
Relevant certifications such as an AWS networking certification or VMware network and security certifications (equivalent to CISSP, CISM, SANS GIAC, or related).
10+ years of experience in designing network and workload isolation, network segmentation, network security policy definition, and network standards (DNS and subdomains, routing, etc.).
10+ years of experience with compute, network, storage, and security services in both AWS and VMware environments.
10+ years of experience in developing and executing strategies for improving security and reliability across all systems and services.
8+ years of experience in setting up Kubernetes (K8s) using Rancher, AWS EKS, or similar services. Ability to deploy CIS, CSI, ingress controllers, reverse proxies, and other instrumentation around Kubernetes clusters.
8+ years of experience in business continuity planning, including strategies, implementation, game days, and total cost estimation. Able to clearly explain the differences between a business continuity plan (BCP), high availability (HA), backup and restore, disaster recovery (DR), and archiving.
10+ years of consulting/pre-sales experience building relationships with senior technical executives, and interacting easily with and giving guidance to software developers, IT operations staff, and system architects.
10+ years of experience in making overall recommendations (or proposals) based on customer needs and communicating them effectively through formal presentations, whiteboarding, and large and small group presentations in the areas of network systems, security engineering, infrastructure, and automation.
7+ years of experience in security engineering and/or site reliability engineering, with at least 3 years in a leadership role.
5+ years of experience in shared infrastructure services in AWS and VMware environments, such as Kafka Streams, data pipelines (Flink/Spark/Kinesis), Redis cache, and Apigee (API gateway).
Strong understanding of security principles, practices, and technologies, including encryption, authentication, access control, and network security. Proven experience with reliability engineering practices such as monitoring, alerting, incident response, and performance tuning.
Proven experience with DevOps practices such as CI/CD and infrastructure as code.
Nice to have: Proficiency in scripting and automation tools, such as Python, Bash, Ansible, or Terraform.
Nice to have: Experience in implementing network and infrastructure compliance with financial industry standards and regulatory requirements.
Previous experience in implementing and maintaining monitoring, alerting, and incident response processes, optimizing system performance, and automating repetitive tasks to improve efficiency.
Experience with DevOps practices and tools, such as CI/CD pipelines, GitOps, and infrastructure as code.
Experience with cloud platforms (AWS, Azure, Google Cloud Platform) and with containerization and container orchestration technologies (Docker, Kubernetes).
Excellent problem-solving skills and the ability to work under pressure in a fast-paced environment.
Strong communication and interpersonal skills, with the ability to influence and inspire teams.
Knowledge of compliance frameworks such as GDPR, HIPAA, or SOC 2.