L1 Support Engineer (Ray)

Overview

Hybrid

$60 - $70

Accepts corp to corp applications

Contract - Independent

Contract - 6 Months

10% Travel

Skills

Apache Flink

Conflict Resolution

Data Link Layer

Python

Shell Scripting

Scripting

Data Science

Communication

Machine Learning (ML)

DevOps

Machine Learning Operations (ML Ops)

Job Details

Location: Austin, TX / Sunnyvale, CA (Hybrid, 3 days onsite)

Key Responsibilities
Provide Level-1 support for distributed ML workloads running on Ray and related frameworks.
Monitor, troubleshoot, and resolve issues in MLOps pipelines and distributed systems.
Assist in performance tuning of ML models and infrastructure for optimized execution.
Support Flink workloads and ensure smooth integration with data/ML pipelines.
Write and maintain automation scripts using Python or Shell scripting to streamline operational workflows.
Perform ML tuning to enhance training efficiency and inference performance.
Work closely with L2/L3 Support, DevOps, and Data Science teams to escalate and resolve complex issues.
Document troubleshooting steps and standard procedures, and create runbooks for repetitive support tasks.

Required Skills & Qualifications
Hands-on experience with Ray for distributed ML workloads.
Knowledge of MLOps workflows and pipeline orchestration.
Understanding of Flink (or similar distributed data frameworks).
Proficiency in Python or Shell scripting for automation.
Familiarity with performance tuning and ML tuning techniques.
Strong troubleshooting and problem-solving skills in production support environments.
Good communication skills and ability to work collaboratively across teams.



About Teknikoz
