AI/ML Engineer Speech to Text ML Engineer (Whisper Focus)

  • Posted 3 hours ago | Updated 3 hours ago

Overview

Remote
Depends on Experience
Accepts corp to corp applications
Contract - Independent
Contract - W2

Skills

Whisper
NLP
AI/ML
Python

Job Details

Job Title: Speech to Text ML Engineer (Whisper Focus)

Location: 100% Remote

Duration: 6+ Months

Notes:

AI/ML Engineer (Contractor)

Mission: To work on speech-to-text finetuning and inference in multiple languages.

Responsibilities

Train and fine-tune speech-to-text (ASR) models such as Whisper.

Implement training/eval loops (PyTorch + Transformers).

Add data augmentations (noise/reverb/law), VAD, diarization (e.g., Whisper).
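As an illustration of the noise augmentation mentioned above, here is a minimal sketch of mixing noise into speech at a target signal-to-noise ratio. It uses plain Python lists of float samples; the function names `mean_power` and `mix_at_snr` and the synthetic tone and noise are assumptions made for the example, not part of any stated codebase.

```python
import math
import random


def mean_power(samples):
    """Average squared amplitude of a sequence of float samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB."""
    if not speech:
        raise ValueError("speech is empty")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]  # trim noise to the speech length
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise is silent")
    # SNR_dB = 10 * log10(P_speech / P_noise), so solve for the noise gain.
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


# Hypothetical input: a 440 Hz tone at 16 kHz stands in for one second of speech.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

In practice the same scaling would likely be applied to tensors or NumPy arrays, with the SNR sampled per utterance.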

Package inference (ONNX / TensorRT if needed), expose REST/gRPC endpoints, and support batch and streaming modes.

Optimize RTF (real-time factor) and memory usage.
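For reference, real-time factor is commonly defined as processing time divided by audio duration, so values below 1.0 mean faster than real time. The sketch below measures it for any callable; `real_time_factor` is a hypothetical helper, and the `time.sleep` call merely stands in for a model's transcription step.

```python
import time
from typing import Callable


def real_time_factor(process: Callable[[], object], audio_seconds: float) -> float:
    """Return wall-clock processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    process()  # run the workload being measured
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


# Placeholder workload: sleeping 50 ms stands in for transcribing 1 s of audio.
rtf = real_time_factor(lambda: time.sleep(0.05), audio_seconds=1.0)
print(f"RTF = {rtf:.3f}")
```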

Design training curricula, LR schedules, frozen vs full finetune, LoRA, data mixing.

Build robust test suites: clean/noisy/telephony, accents, domain terminology.

Analyze error types (sub/del/ins), OOV handling, bias; propose targeted fixes.
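The substitution/deletion/insertion breakdown above is typically obtained from a word-level edit-distance alignment. The following is a minimal sketch under that assumption; `ErrorCounts` and `count_errors` are hypothetical names, whitespace tokenization is a simplification, and when several alignments tie in cost the per-type split depends on the tie-breaking rule used here (substitution first).

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        """Word error rate: (S + D + I) / number of reference words."""
        total = self.substitutions + self.deletions + self.insertions
        return total / self.reference_length


def count_errors(reference: str, hypothesis: str) -> ErrorCounts:
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # table[i][j] = (edits, subs, dels, ins) for aligning ref[:i] with hyp[:j]
    table = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        table[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        table[0][j] = (j, 0, 0, j)  # every hypothesis word inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                table[i][j] = table[i - 1][j - 1]  # match costs nothing
                continue
            c, s, d, n = table[i - 1][j - 1]
            sub = (c + 1, s + 1, d, n)
            c, s, d, n = table[i - 1][j]
            dele = (c + 1, s, d + 1, n)
            c, s, d, n = table[i][j - 1]
            ins = (c + 1, s, d, n + 1)
            table[i][j] = min(sub, dele, ins, key=lambda t: t[0])
    _, s, d, n = table[-1][-1]
    return ErrorCounts(s, d, n, len(ref))


counts = count_errors("a b c d", "a x c")
print(counts, counts.wer)
```

Text normalization (casing, punctuation, number formats) would normally be applied to both strings before counting, since it can change the totals substantially.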

Experience in ASR, LM fusion, tokenization/normalization, data curation, PyTorch, Hugging Face, CUDA; experience finetuning seq2seq/CTC/RNNT; SoX/FFmpeg; metrics (WER/CER, latency), NeMo, Triton, Torch/TensorRT, LoRA/PEFT.

Overview:
LanguageLine Solutions is seeking a highly specialized Machine Learning Engineer with hands-on experience working with Whisper, OpenAI's speech-to-text model. This role will focus on fine-tuning and training Whisper to improve transcription accuracy and performance across multilingual datasets, supporting LanguageLine's mission to deliver world-class translation and interpretation services.

Key Responsibilities:

  • Fine-tune and optimize Whisper for specific use cases and languages
  • Analyze model performance and identify areas for improvement
  • Collaborate with engineering teams to integrate Whisper into production workflows
  • Apply broader ML techniques to enhance speech recognition and translation pipelines
  • Recommend tools, frameworks, and best practices for scalable deployment

Qualifications:

  • Proven experience with Whisper or similar speech-to-text models
  • Strong background in machine learning, deep learning, and NLP
  • Familiarity with audio processing and multilingual datasets
  • Ability to work independently and communicate findings clearly
  • Experience with Python, PyTorch, and ML frameworks preferred
