MLOps Engineer - Machine Learning Platform - New York

Jersey City, NJ, US • Posted 30+ days ago • Updated 2 hours ago

Full Time

On-site

Job Details

Skills

Computational Finance
Finance
Strategic Thinking
Team Building
FOCUS
Large Language Models (LLMs)
Continuous Integration
Continuous Delivery
Documentation
Artificial Intelligence
Generative Artificial Intelligence (AI)
Docker
Unix
Storage
Database
SQL
NoSQL
Debugging
Testing
Conflict Resolution
Problem Solving
Software Engineering
Machine Learning Operations (ML Ops)
Python
Terraform
Kubernetes
Orchestration
Cloud Computing
Amazon Web Services
Google Cloud
Google Cloud Platform
Machine Learning (ML)
PyTorch
TensorFlow
Communication
Articulate

Summary

Job Description

What We Do

At Goldman Sachs, our Engineers don't just make things - we make things possible. Change the world by connecting people and capital with ideas. Solve the most challenging and pressing engineering problems for our clients. Join our engineering teams that build massively scalable software and systems, architect low-latency infrastructure solutions, proactively guard against cyber threats, and leverage machine learning alongside financial engineering to continuously turn data into action. Create new businesses, transform finance, and explore a world of opportunity at the speed of markets.

Engineering, which comprises our Technology Division and global strategists' groups, is at the critical center of our business, and our dynamic environment requires innovative strategic thinking and immediate, real solutions. Want to push the limit of digital possibilities? Start here.

Who We Look For

We are seeking a skilled and motivated engineer to join our Artificial Intelligence Platforms organization as an MLOps Engineer on our Machine Learning Services team.

You will be part of an expert team building and operating production-grade platform and backend systems used by ML engineers and application teams across the entire firm. A key focus of this role is enabling reliable, scalable, and observable deployment of machine learning models and Large Language Models (LLMs).

This role is best suited for engineers who enjoy working on infrastructure, backend services, and distributed systems, rather than primarily on model experimentation and development.

Key Responsibilities:

Deliver scalable, efficient, secure, and automated processes for building, deploying, and monitoring Machine Learning models.

Enable solutions that provide business customers with the ability to leverage the latest and greatest AI/ML infrastructure, frameworks, and tooling to deliver high-impact outcomes.

Develop and demonstrate deep subject matter expertise on how to optimize machine learning model deployments to scale to the specific needs of each business customer.

Deliver high-quality, production-ready code leveraging CI/CD best practices.

Author and maintain high-quality documentation for both the engineering team and business customers.

Participate in on-call and support rotations, helping diagnose and resolve production issues.

Continuously expand knowledge of platform architecture with a goal to take ownership of individual components.

Stay up to date with advancements in AI/ML frameworks, model serving technologies, and GenAI infrastructure.

Basic Qualifications:

2 years of experience in software engineering (backend, platform, or infrastructure).

2 years of experience in Python or a similar backend programming language.

1 year of experience supporting production ML systems (MLOps, platform, or inference-related work).

Basic understanding of APIs (REST or similar) and service-to-service communication.

Experience working with containers (e.g., Docker).

Familiarity with Unix-based systems.

Exposure to public cloud environments (e.g., AWS or Google Cloud Platform), including core concepts such as compute, storage, and basic IAM.

Experience working with databases (SQL or NoSQL).

Solid grasp of software engineering fundamentals, including debugging, testing, and maintainable code design.

Strong problem-solving skills and the ability to work effectively in a fast-paced, collaborative environment.

Curiosity and a strong desire to keep learning, especially in the model inference and LLM platform space.

Preferred Qualifications:

4 years of experience in software engineering (backend, platform, or infrastructure).

4 years of experience supporting production ML systems (MLOps, platform, or inference-related work).

4 years of experience in Python or a similar backend programming language.

Strong understanding of the end-to-end Model Development Lifecycle (MDLC).

Basic understanding of distributed systems concepts and exposure to observability concepts (logging, metrics, tracing).

Experience building containerized runtime environments for model serving (e.g., vLLM, SGLang, TensorRT, Triton, AWS Multi Model Server).

Experience with infrastructure-as-code tools, such as Terraform or CloudFormation.

Experience with Kubernetes and other container orchestration platforms in the public cloud (e.g., AWS, Google Cloud Platform).

Experience building Machine Learning models with frameworks such as PyTorch and TensorFlow.

Excellent communication skills and the ability to articulate complex technical concepts to both technical and non-technical stakeholders.

What Success Looks Like in This Role:

Can take a well-defined task and drive it to completion with minimal hand-holding.

Asks thoughtful questions instead of getting blocked.

Understands basic trade-offs (e.g., performance vs. simplicity, flexibility vs. reliability).

Writes code that is readable, testable, and easy for others to extend.

Shows curiosity about how the entire system works end-to-end, not just their assigned ticket.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 10121118
Position Id: dcbd8dd2c18ede07c7a1473139a74b7a
