AI/ML Engineer: Speech to Text ML Engineer (Whisper Focus)

Overview

Remote

Depends on Experience

Accepts corp to corp applications

Contract - Independent

Contract - W2

Skills

Whisper

NLP

AI/ML

Python

Job Details

Job Title: Speech to Text ML Engineer (Whisper Focus)

Location: 100% Remote

Duration: 6+ Months

Notes:

AI/ML Engineer (Contractor)

Mission: To work on speech-to-text finetuning and inference in multiple languages.

Responsibilities

Train and fine-tune speech-to-text (ASR) models such as Whisper.

Implement training/eval loops (PyTorch + Transformers).

Add data augmentations (noise/reverb/μ-law), VAD, diarization (e.g., Whisper).

Package inference (ONNX / TensorRT if needed), expose REST/gRPC endpoints, and support batch and streaming modes.

Optimize RTF (real-time factor) and memory usage.

Design training curricula, LR schedules, frozen vs full finetune, LoRA, data mixing.

Build robust test suites: clean/noisy/telephony, accents, domain terminology.

Analyze error types (sub/del/ins), OOV handling, bias; propose targeted fixes.

Experience in ASR, LM fusion, tokenization/normalization, data curation, PyTorch, Hugging Face, CUDA; experience finetuning seq2seq/CTC/RNNT; SoX/FFmpeg; metrics (WER/CER, latency), NeMo, Triton, Torch/TensorRT, LoRA/PEFT.
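For a sense of the evaluation work described above, here is a minimal sketch of two of the named metrics: word error rate with a substitution/deletion/insertion breakdown, and real-time factor. It assumes whitespace tokenization with no text normalization, and the names `count_errors`, `ErrorCounts`, and `real_time_factor` are illustrative only, not part of any team tooling or library.

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    """Word-level error breakdown for one reference/hypothesis pair."""
    substitutions: int
    deletions: int
    insertions: int
    ref_len: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, where N is the number of reference words.
        errors = self.substitutions + self.deletions + self.insertions
        return errors / max(self.ref_len, 1)


def count_errors(reference: str, hypothesis: str) -> ErrorCounts:
    """Align two transcripts by Levenshtein distance and count error types.

    Assumption: words are split on whitespace and compared exactly;
    a real pipeline would normalize casing, punctuation, and numerals first.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = (cost, subs, dels, ins) for aligning ref[:i] with hyp[:j]
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # all reference words deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # all hypothesis words inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no added cost
                continue
            c, s, d, n = dp[i - 1][j - 1]
            sub = (c + 1, s + 1, d, n)
            c, s, d, n = dp[i - 1][j]
            dele = (c + 1, s, d + 1, n)
            c, s, d, n = dp[i][j - 1]
            ins = (c + 1, s, d, n + 1)
            # Ties are broken in the order substitution, deletion, insertion.
            dp[i][j] = min(sub, dele, ins, key=lambda t: t[0])
    _, subs, dels, inss = dp[len(ref)][len(hyp)]
    return ErrorCounts(subs, dels, inss, len(ref))


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    counts = count_errors("the cat sat on the mat", "the cat sit on mat")
    print(counts, round(counts.wer, 3))
    print(real_time_factor(5.0, 20.0))
```

When several alignments have the same total cost, the split between error types depends on the tie-breaking order, so the per-type counts are best read as one valid alignment and not a unique answer.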

Overview:
LanguageLine Solutions is seeking a highly specialized Machine Learning Engineer with hands-on experience working with Whisper, OpenAI's speech-to-text model. This role will focus on fine-tuning and training Whisper to improve transcription accuracy and performance across multilingual datasets, supporting LanguageLine's mission to deliver world-class translation and interpretation services.

Key Responsibilities:

Fine-tune and optimize Whisper for specific use cases and languages
Analyze model performance and identify areas for improvement
Collaborate with engineering teams to integrate Whisper into production workflows
Apply broader ML techniques to enhance speech recognition and translation pipelines
Recommend tools, frameworks, and best practices for scalable deployment

Qualifications:

Proven experience with Whisper or similar speech-to-text models
Strong background in machine learning, deep learning, and NLP
Familiarity with audio processing and multilingual datasets
Ability to work independently and communicate findings clearly
Experience with Python, PyTorch, and ML frameworks preferred

