Linux & GPU Specialist with Large Language Model (LLM) Hosting Expertise
6+ Months
San Jose, CA 95113 (Hybrid)
Job Summary:
We are seeking an experienced Linux & GPU Specialist with expertise in hosting and operating Large Language Models (LLMs) to join our dynamic team. The ideal candidate is highly skilled in managing GPU-accelerated environments and in building and maintaining systems that can train on, manage, and serve inference from large datasets efficiently.
Key Qualifications:
- 3 to 5 years of professional experience in a related field.
- Proficient in deploying and maintaining GPU-powered infrastructures.
- Proven track record of system architecture development for handling high-volume data processing.
- Competent in deploying and administering LLMs on GPU-accelerated platforms and complementary specialized hardware.
- Skilled in designing scalable systems optimized for both high-volume data training applications and rapid-response inferencing.
- Experience with modern orchestration tools such as Kubernetes, along with a practiced understanding of Infrastructure as Code (IaC) methodologies.
Responsibilities:
- Oversee the configuration and support of GPU-oriented infrastructure designed for robustness and efficiency.
- Lead the design of system architectures that handle large volumes of data effectively.
- Deploy and operate Large Language Models, ensuring an environment optimized for GPUs and specialized hardware.
- Architect robust systems with a focus on high-capacity data handling capabilities, enabling proficient model training and expedited inference.
- Utilize orchestration platforms like Kubernetes to automate deployment, scaling, and operations of application containers.
- Employ Infrastructure as Code practices to manage and provision infrastructure through code and automation tools.
The successful candidate will have a proven record of managing similar workloads and will demonstrate the ability to innovate and maintain high-performance computing environments tailored to running Large Language Models.