L1 Support Engineer (Ray)

  • Austin, TX
  • Posted 1 day ago | Updated 1 day ago

Overview

Hybrid
$60 - $70
Accepts corp to corp applications
Contract - Independent
Contract - 6 Month(s)
10% Travel

Skills

Apache Flink
Conflict Resolution
Data Link Layer
Python
Shell Scripting
Scripting
Data Science
Communication
Machine Learning (ML)
DevOps
Machine Learning Operations (ML Ops)

Job Details

Location: Austin, TX / Sunnyvale, CA (Hybrid, 3 days onsite)

Key Responsibilities
  • Provide Level-1 support for distributed ML workloads running on Ray and related frameworks.
  • Monitor, troubleshoot, and resolve issues in MLOps pipelines and distributed systems.
  • Assist in performance tuning of ML models and infrastructure for optimized execution.
  • Support Flink workloads and ensure smooth integration with data/ML pipelines.
  • Write and maintain automation scripts in Python or Shell to streamline operational workflows.
  • Perform ML tuning to enhance training efficiency and inference performance.
  • Work closely with L2/L3 Support, DevOps, and Data Science teams to escalate and resolve complex issues.
  • Document troubleshooting steps and standard procedures, and create runbooks for repetitive support tasks.

Required Skills & Qualifications
  • Hands-on experience with Ray for distributed ML workloads.
  • Knowledge of MLOps workflows and pipeline orchestration.
  • Understanding of Flink (or similar distributed data frameworks).
  • Proficiency in Python or Shell scripting for automation.
  • Familiarity with performance tuning and ML tuning techniques.
  • Strong troubleshooting and problem-solving skills in production support environments.
  • Good communication skills and ability to work collaboratively across teams.

About Teknikoz