Machine Learning Engineer

Overview

Hybrid
Depends on Experience
Contract - Independent
Contract - W2
Contract - 12 Month(s)

Skills

TensorFlow or PyTorch
Speech recognition (ASR) and text-to-speech (TTS)
Agile Scrum
Python

Job Details

We are looking for a Machine Learning Engineer who will use machine learning and statistical techniques to help us create state-of-the-art solutions for non-trivial and arguably unsolved problems. If you are results-driven, deeply technical, and highly innovative, are interested in applying advanced machine learning techniques, would love to work with voice and text, and want the opportunity to build solutions for challenging problems that directly impact the company's bottom line, we want to talk to you.

Responsibilities:

Design, implement and optimize an end-to-end conversational speech LLM-based virtual assistant

Evaluate and benchmark speech-native models (Moshi, SesameAI, etc.) for in-vehicle applications

Design and execute model fine-tuning strategies for automotive domain adaptation

Implement tool API frameworks for vehicle system control

Implement hardware-specific optimizations for the Qualcomm SA8255P platform

Develop and maintain Python code for audio preprocessing, model integration and hardware optimization

Document architecture analyses, benchmarking results and optimization approaches

Execute full modeling life cycle including data cleansing, feature creation and iterative model selection

Work in a fast-paced Agile Scrum environment to assist in prototyping, designing, and implementing predictive models and algorithms to create real-world solutions

Required Qualifications:

3+ years of hands-on experience in Machine Learning in a corporate environment

Deep understanding of voice-to-voice (Voice2Voice) architectures and speech-native models

Experience with model quantization techniques and optimization for edge devices.

Strong background in audio processing including feature extraction, noise handling and acoustic modeling.

Experience working with TensorFlow or PyTorch

Experience with speech recognition (ASR) and text-to-speech (TTS) technologies (speech encoders, transformer variants used in ASR and TTS)

Knowledge of multimodal learning and techniques for fusing audio and text information (attention mechanisms, cross-modal attention)

Solid understanding of audio processing concepts, including audio feature extraction, signal processing, and acoustic modeling (TorchAudio, Librosa)

Experience with fine-tuning transformer models and developing training pipelines (Hugging Face Transformers, PyTorch/TensorFlow, distributed training pipelines)

Research experience in academia or industry around transformer and multimodal technologies

Familiarity with algorithm design and complexity analysis

Strong decision-making skills with the ability to analyze data, assess risks, and implement effective solutions in a fast-paced environment

Problem-solving skills with the ability to identify challenges, develop creative solutions, and implement effective strategies

Proven ability to learn and apply new technologies, programming practices, patterns, and methods

Experience collaborating effectively with cross-functional teams, including developers, designers, and product owners

Experience taking ownership of assigned projects and tasks, proactively driving them to completion while ensuring accountability for quality and deadlines.

Results-driven with a strong track record of setting goals, executing strategies, and delivering measurable outcomes
