- The hiring manager is open to any SREs with cloud operations experience, especially those who have worked at companies like Nutanix
- vSphere, ESXi, or vCenter knowledge/experience is a must-have
- He is open to considering people with any of the following three backgrounds:
  - SREs with cloud exposure (any cloud) and vSphere/vCenter/ESXi experience
  - SREs familiar with data center operations who hold vSphere certifications
  - Manual testing engineers who have done vSphere/ESXi testing (not vCenter), know basic scripting, and can troubleshoot
- Would work the PST daytime shift (9-5), with the Bangalore team covering the other two shifts, so he prefers candidates based in the Pacific time zone (Mountain Time at the furthest)
- Would have pager-duty
- Would occasionally need to work on weekends, which would be compensated with Wednesday/Thursday off
JD:
Role Responsibility
As a member of our team, you will play a key role in understanding customer use cases and defining the end-to-end behavior of the solution. This solution will involve developing a highly scalable IaaS service, based on VMware products: vSphere, vCenter, ESXi, vSAN, and NSX.
You will also be required to enable other product engineering teams to drive towards automated problem resolution. On the observability side, we help service owners define and instrument SLOs and alerts that follow best practices, build tools and dashboards, facilitate postmortems, and continuously enhance our existing systems and processes to improve the reliability of the IaaS offering.
The SRE role is a great fit for engineers who want to own production solutions while getting hands-on with a wide variety of the latest open-source technologies, and who love to push the boundaries of what cloud infrastructure software, observability, and tooling can achieve.
Required Skills:
- 5-8 years of industry experience, including 3+ years of relevant hands-on experience managing large-scale virtualized data center environments.
- Solid background in troubleshooting distributed environments and cloud production environments
- Comfortable using Python or other scripting languages.
- Experience with VMware technologies, such as vSphere, vCenter, ESX, vSAN and/or NSX, in a data center environment
- Worked in large-scale distributed environments and resolved issues with automation.
- Handled customer escalations, production outages/incidents, and on-call responsibilities in support of 4x9 (99.99%) uptime of the services.
- Conduct post-mortems to analyze and prevent repeat failures
- Work closely with software engineering teams to improve availability of services
- Identify, gather, analyze, and automate responses to key performance metrics, logs, and alerts
- Participate in product roadmap planning and drive team initiatives
- Handle seamless upgrades of infrastructure through automation
- Train and mentor junior engineers by providing technical guidance and direction
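
For a sense of scale on the 4x9 uptime requirement listed above, the following is a minimal sketch of the downtime budget arithmetic. It assumes "4x9" means 99.99% availability; the function name and the 30-day and 365-day measurement windows are illustrative choices, not something the JD specifies.

```python
def downtime_budget_minutes(availability: float, period_days: float) -> float:
    """Return the downtime (in minutes) allowed by an availability target
    over a measurement window of period_days days."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    minutes_in_period = period_days * 24 * 60
    return (1.0 - availability) * minutes_in_period


# Assumption: "4x9" is read as four nines, i.e. 99.99% availability.
FOUR_NINES = 0.9999

# Assumption: 30-day and 365-day windows, chosen only for illustration.
for label, days in (("30-day month", 30), ("365-day year", 365)):
    budget = downtime_budget_minutes(FOUR_NINES, days)
    print(f"{label}: {budget:.2f} minutes of allowed downtime")
```

Under these assumptions the budget works out to roughly 4.3 minutes per 30-day month and about 52.6 minutes per year, which is why the role stresses pager duty and automated responses to alerts.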