ML Operations Analyst

Remote • Posted 2 hours ago • Updated 2 hours ago
Full Time
No Travel Required
Up to $55/hr

Job Details

Skills

  • ML
  • Operations
  • Support
  • SQL
  • Program Management

Summary

Do something big and innovative! Stretch your creative muscles and work on big issues. Since 1989, we have developed technology environments, applications, and tools by providing experienced teams to implement, enhance, and maintain our clients' essential systems and applications. Come join the Scalence team!

Job Title: ML Serving Operations Analyst
Duration: 12+ months
Location: 100% Remote - Pacific work hours (must be local to the Bay Area)
Pay rate: up to $55/hr. W2 with benefits

Job Summary:
The Resource Management team is responsible for end-to-end resource planning and provisioning on our client's infrastructure, including budgeting, compute, storage, accelerators, network, and data center infrastructure resources, in support of Engineering (Eng) and Site Reliability Engineering (SRE) service-related requests. The team handles tactical execution tasks that cannot yet be automated in order to improve service response times and reduce risk to the client's infrastructure. Additionally, the team supports data-driven decision-making and leverages machine learning (ML) techniques to enhance forecasting, automation, and operational efficiency.

The ideal candidate will have an engineering degree, such as a computer science major, experience running terminal commands, and a strong understanding of SQL, machine learning fundamentals, and computer hardware terminology.

Requirements:

  1. Respond to Pool Minding Alerts to proactively keep production service pools healthy and reduce reliability risk, leveraging ML-based alerting and anomaly detection where applicable.
  2. Manage Resource Requests from SRE/Eng to FTE team for all Infrastructure services, incorporating predictive insights from ML models where available.
  3. Manage Supply Planning Operations, including ordering weekly resources (Machine Orders), writing the weekly health reports, monitoring in-progress orders, and escalating in case of SLO slippage for critical growth dependencies, with support from ML-based forecasting models.
  4. Establish migration execution plans to move services between locations to mitigate data center constraints, using data analysis and ML-driven capacity planning insights.
  5. Execute replacement plans for large-scale infrastructure projects, such as cluster turndowns, cluster migrations due to limited data center space, and service rebalancing due to resource constraints, potentially guided by ML-based optimization models.
  6. Assist in Special Projects (e.g. building data pipelines for automated reporting & metrics management, and supporting ML model data pipelines).
  7. Update vendor playbooks as processes change, subject to FTE review and approval, including documentation of ML-enabled workflows where applicable.

Other requirements:

  1. Required to attend weekly meetings with the client stakeholders and any additional meetings that the client feels are necessary.
  2. Required to provide written reports such as: Weekly Supply/Demand fulfillment status report; Weekly Flexpool low inventory alert report; Weekly Operation ticket queue report on aging tickets and reasons; and Operational project status report, incorporating insights derived from data analysis and ML models where relevant.
  3. Respond to resource ticket requests;
  4. Manage resource pool alerts and machine orders, including ML-assisted alert prioritization;
  5. Support pool migrations; and
  6. Perform data analysis to measure operational performance, including applying machine learning techniques for trend analysis, forecasting, and anomaly detection.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: artech
  • Position Id: 8945994