Vision-Language Model (VLM) Engineer

Hybrid • Posted 1 day ago • Updated 1 hour ago
Contract W2
Contract Independent
Remote
Fitment

Job Details

Skills

  • VLM

Summary

Job Title: Vision-Language Model (VLM) Engineer

Location: Remote

Duration: 6+ Month Contract

Job Description:
We are looking for a Vision-Language Model (VLM) Engineer / Applied Scientist who can design, fine-tune, and deploy multimodal models that understand images/videos and text together. You will own the path from prototype to production, including AWS-based deployment for scalable, secure inference.

Key Responsibilities:

Build and adapt vision-language models (VLMs) for enterprise use cases (visual inspection, safety monitoring, document/image understanding, workflow automation)

Fine-tune pretrained models for custom datasets using LoRA / QLoRA / adapters

Create pipelines for image/video ingestion → model inference → structured outputs (JSON, labels, alerts, summaries)

Deploy inference services on AWS with monitoring, scaling, and cost control

Optimize for performance and reliability (batching, quantization, caching, GPU utilization)

Run evaluation, error analysis, and continuous improvement using task-specific metrics

Partner with product and engineering teams to integrate VLM services into applications/APIs

Required Skills:

Strong hands-on experience with multimodal AI / Vision-Language Models

Proficiency in Python and PyTorch (or equivalent deep learning framework)

Real-world experience with fine-tuning and model adaptation (LoRA/QLoRA, prompt tuning)

Experience deploying ML services on AWS, such as:

  • Amazon SageMaker (endpoints, model hosting, pipelines)
  • Amazon EC2 + GPU, Auto Scaling, Load Balancers
  • Amazon ECR (container registry) + Docker
  • AWS Lambda / API Gateway (where suitable), CloudWatch (logs/metrics)

Strong understanding of computer vision fundamentals (classification, detection, embeddings)

Preferred / Nice to Have:

Hugging Face Transformers, OpenCV, ONNX/TensorRT

ECS / EKS (Kubernetes) for container orchestration

Infrastructure as Code: Terraform / AWS CDK / CloudFormation

Security best practices: IAM roles, VPC setup, secrets management

Multimodal RAG (Retrieval-Augmented Generation) with vector databases

Experience with dataset labeling workflows and MLOps practices

  • Dice Id: 10121915
  • Position Id: 2026-4341/21276