ML Engineer (new role) - Hybrid @ NYC, NY - Local candidates only

Hybrid in New York, NY, US • Posted 2 days ago • Updated 2 days ago

Contract W2

Contract Independent

6 Months

Occasional Travel Required

Hybrid

Depends on Experience

Job Details

Skills

Machine Learning (ML)
Healthcare Information Technology
Extract, Transform, Load
Data Deduplication
Continuous Delivery
CPU
Affinity Propagation
Amazon Web Services
Amazon SageMaker
Cloud Computing
Deep Learning
FFmpeg
GPU
Machine Learning Operations (ML Ops)
Named-Entity Recognition (NER)
Natural Language Processing
Open Source
Okapi BM25
Kubernetes
PyTorch
TensorFlow
Terraform
Transformer
Workflow
Media

Summary

Hybrid – 3 days onsite.

Location – NYC, NY

The ML Engineer is a new role responsible for shaping how we build and scale machine learning systems at AP and for helping to lay the foundation for our machine learning capabilities. The ML Engineer has hands-on experience building and optimizing ML inference systems that run in production environments. This role will develop and tune pipelines that transform millions of photos, videos, and text documents into searchable representations using a combination of deep learning models (for example, DistilBERT, SBERT, and TransNetV2) and external multimodal APIs. The ideal candidate has experience optimizing inference at scale, orchestrating ML workloads, and working with both PyTorch and TensorFlow in a cloud environment, with a focus on model performance, integration patterns, and inference efficiency.

What You’ll Do:

· Design, build, and scale ML-powered inference systems that process large volumes of text, image, and video data to power news-based intelligence products.

· Productionize and optimize state-of-the-art models and inference pipelines. These models include, but are not limited to:

o DistilBERT for Named Entity Recognition (NER) over hundreds of thousands of search queries/day

o TransNetV2 for video shot boundary detection at scale, for both archival and real-time video

o SBERT for embedding generation from textual descriptions

o External multimodal APIs for image/video captioning

· Support hybrid search architectures by defining embedding/re-ranking interfaces, evaluation metrics, and inference performance requirements; partner with search/platform engineers on index configuration, sharding, and cluster tuning.

· Design and implement scalable data processing pipelines across hybrid CPU/GPU environments to handle millions of media assets.

· Partner with MLOps and platform engineering to deploy and operate ML systems reliably, contributing to:

o Distributed inference architectures

o Cloud-based execution (e.g., AWS EC2, Batch, Lambda, SageMaker)

o Efficient resource utilization across workloads

· Optimize inference latency and throughput across distributed workloads using cloud-based resources (AWS EC2, Batch, Lambda, SageMaker, etc.).

· Build resilient asynchronous processing systems for large-scale workloads, ensuring:

o Reliability (retries, fault tolerance)

o Efficiency (caching, deduplication)

o Observability (metrics, logging, traceability)

· Work closely with data scientists and product teams to iterate on models, improve performance, and deliver measurable impact in production.

Requirements:

· 8+ years of experience building production ML inference systems.

· Demonstrated ownership of deep-learning inference optimization in production (quantization, distillation, compilation, kernel/profile-level performance work) for transformer NLP and/or CV models.

· Experience with both TensorFlow (SavedModel, tf.data, XLA, TFLite) and PyTorch (TorchScript, ONNX, FastAPI/TorchServe).

· Hands-on experience optimizing inference pipelines on AWS infrastructure, ideally across different types of media assets.

· Experience with video frameworks/tools (e.g., FFmpeg) and with large-scale frame-level inference.

· Demonstrated experience monitoring and debugging model latency, memory, and pipeline throughput.

· Experience with hybrid search architectures (BM25 + vector search + cross-encoder reranking).

· Familiarity with OpenAI APIs or other foundation model providers.

· Familiarity with open-source Hugging Face LLMs.

· Experience with data pipeline and workflow orchestration tools (e.g., Airflow).

Who This Role is Not For:

Candidates whose primary background is MLOps platform work (DAG orchestration, Terraform, Kubernetes administration, generic CI/CD pipelines) will not be a fit. We need a senior-level engineer who can profile a transformer, rewrite its serving path for a 2–3x latency reduction, tune an HNSW index, and tell us which SageMaker instance type will hit our p95 target at the lowest cost.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 10121769
Position Id: 8977233

Contact the job poster

Imran khan

Recruiter @ Central Business Solutions
