Overview
Hybrid
$120,000 - $140,000
Full Time
Skills
Operations Analyst
Python
troubleshooting
network engineering
Google Cloud
automation tools
Linux
Unix
Data Center
Job Details
Our client is at the forefront of cloud-based graphics technology, revolutionizing the digital asset landscape through innovative 3D internet solutions. Their cutting-edge products have been enabling transformation across various industries globally for over a decade.
This Role Offers:
- Cutting-edge 3D infrastructure development
- Collaborative environment with talented professionals
- Growth opportunities for career advancement
- Remote workplace
- Competitive salary plus commissions
- Health insurance (including vision and dental)
- Retirement plan (401(k))
- Paid time off (including parental leave)
- Flexible schedule and spending account options
Focus:
- Manage and maintain data center operations focusing on GPU setups for rendering and AI/ML workloads across various platforms.
- Collaborate with IT and network engineering teams to ensure seamless infrastructure performance, including the installation and maintenance of server hardware and software.
- Implement and oversee advanced system configurations for high availability, including troubleshooting and resolving hardware and software issues.
- Develop strategies for effective problem detection, server health monitoring, and performance enhancement.
- Offer expert-level technical support for data center operations, including emergency response outside of standard business hours.
- Enhance infrastructure capabilities using automation tools and integrating third-party services for optimized functionality.
- Document operational procedures meticulously to facilitate knowledge sharing and replication among team members.
- Engage actively in coordination and communication with teams across different time zones to ensure project alignment and efficiency.
Skill Set:
- Proficiency in managing Linux/Unix and Windows Server environments.
- Expertise in GPU configuration and troubleshooting, including knowledge of power requirements, driver installation (NVIDIA Tesla, GeForce), and VBIOS settings.
- Experience with infrastructure automation tools such as Ansible, Puppet, Chef, and Terraform.
- A solid understanding of network protocols and services (DNS, HTTP, SSH, FTP, SMTP).
- Familiarity with various virtualization technologies (VMware, Hyper-V, KVM, Proxmox).
- Proficiency in scripting languages (Python, Bash, PowerShell) for automating routine tasks.
- Knowledge of cloud services (AWS, Azure, Google Cloud) and their deployment strategies.
- Skilled in the use of monitoring tools (Nagios, Zabbix, Prometheus, Grafana).
- Proficient in version control systems and platforms (Git, GitHub, GitLab).
- Local to the Northridge/DTLA area, with the ability to drive and physically install hardware at data center locations.
About Blue Signal:
Blue Signal is an award-winning executive search firm serving a range of specialties. Our recruiters have a proven track record of placing top-tier talent across industry verticals, with deep expertise in numerous professional services. Learn more at bit.ly/46Gs4yS