RTML Engineer // ML Ops Engineer

Overview

Hybrid

Depends on Experience

Accepts corp to corp applications

Contract - W2

Contract - 12 Month(s)

Skills

RTML Framework

Kubernetes

Kubeflow

ML Ops

Job Details

Title: RTML Engineer // ML Ops Engineer

Location: Dallas, TX or NJ

What you will be doing:

You will join our critical Real Time ML Service team working on our RTML Model Serving Framework.

This is a foundational team in our AI Center. The RTML Framework serves all of our real-time AI models in production, enabling our business organizations to maximize the benefits of AI-driven solutions for our customers.

As a Principal Engineer, you will be:

Functioning as a domain expert in RTML model serving technology, familiar with industry trends in RTML, common RTML architectures, leading third-party RTML serving products, and evaluation criteria.
Working closely with other teams to define technical strategy, architecture, and development choices, and to ensure the overall growth of the Jarvis Framework to meet our internal customers' needs.
Leading the Jarvis development activities through phased releases, ensuring the framework is architecturally sound, implemented correctly and efficiently, and delivered on time.
Supporting internal customers with major framework issues and coordinating triage efforts to solve them.
Leading and mentoring junior developers on the team and always pushing for team success.
Adhering to industry standards and best practices and tracking emerging RTML technologies and trends to continuously improve the Jarvis framework.

You'll need to have:

Bachelor's degree or above in Computer Science/Engineering or a related area.
Four or more years of work experience in software development roles.
At least two of those years in AI/ML engineering, with a reasonably good understanding of data science and AI/ML practices and workflows.
Strong expertise in RTML model serving and/or large-scale, cloud-based real-time (RT) framework development.
Experience with Kubernetes. The candidate should be comfortable with kubectl and Helm.
Experience in creating, deploying, and maintaining centralized Kubeflow infrastructure on top of one or more Kubernetes clusters.
Experience with cloud infrastructures and ML Ops in clouds.
Familiarity with CI/CD processes and common tools such as Argo CD.
Experience with programming languages such as Python and Java.
Experience developing large applications in cloud environments (AWS, Google Cloud Platform) and on on-prem clusters.
Experience with K8s architecture and principles of operation, with hands-on skills in deploying large applications to production K8s clusters, configuring K8s properly, and troubleshooting application issues.
Good understanding of RT system stats collection and performance monitoring methods.
Basic understanding of RT feature engineering methodology and practices.
Understanding of basic data science concepts and the common needs of data scientists.

Raj Vemula

Director Resource Development
