AI Engineer

Overview

On Site

Full Time

Skills

Optimization

Research

Management

GPU

CPU

Computer Science

Algorithms

Python

Software Development

Machine Learning Operations (MLOps)

Workflow

TensorFlow

PyTorch

Large Language Models (LLMs)

Storage

Artificial Intelligence

Machine Learning (ML)

Conflict Resolution

Problem Solving

Communication

Job Details

Role Overview

We are seeking an experienced AI Infrastructure Engineer to spearhead the development, deployment, and ongoing optimization of machine learning and artificial intelligence systems. This individual will work cross-functionally to design scalable solutions that bridge cutting-edge research with critical business applications.

What You'll Do

Lead the design, implementation, and maintenance of AI systems from prototype to production.
Partner closely with engineers, quantitative researchers, traders, and data scientists to identify high-value opportunities for AI and ML integration across the organization.
Build automated pipelines for model retraining, validation, and monitoring to keep systems stable and minimize operational disruption.
Help select and integrate AI/ML frameworks, and optimize their use across diverse compute environments.
Architect and manage robust systems for feature engineering, including the development of feature stores and model registries.
Develop internal platforms that support efficient, reproducible machine learning experimentation at scale.
Diagnose and resolve computational inefficiencies related to GPU and CPU resource utilization.
Stay at the forefront of AI advancements and bring innovative techniques into the technology stack.

What We're Looking For

Degree in Computer Science, Artificial Intelligence, Machine Learning, or a closely related field; advanced degrees are a plus.
Minimum of 3 years of professional experience building AI/ML-driven applications.
Strong foundation in machine learning principles, algorithms, and real-world applications.
Expertise in Python and familiarity with best practices in large-scale software development.
Proven experience delivering and maintaining machine learning models in live production environments.
Deep familiarity with MLOps workflows, including model versioning, deployment automation, and monitoring.
Proficiency with ML frameworks and inference tooling such as TensorFlow, PyTorch, ONNX, or TensorRT.
Hands-on experience with Large Language Models (LLMs), including techniques such as retrieval-augmented generation (RAG) and model fine-tuning.
Solid understanding of the compute and storage architectures necessary to support AI/ML initiatives.
Strong problem-solving mindset, with the ability to independently troubleshoot and optimize complex systems.
Excellent communication skills and a collaborative approach to working across technical and business teams.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
