Overview
Remote
$100,000 - $110,000
Full Time
No Travel Required
Unable to Provide Sponsorship
Skills
Animation
Artificial Intelligence
C++
Cloud Computing
Collaboration
Fluency
Machine Learning (ML)
Human-computer Interaction
Python
Performance Tuning
Real-time
Mapping
Scripting
Software Engineering
Visualization
Replication
Speech Processing
User Experience
Multilingual
Job Details
Lip Sync Engineer – Real-Time Avatar Synchronization
Work location | Remote, full time (client is in St. Louis, MO, Central time) |
Term | 6 months, can extend to 1 year+ |
Work Status | |
Skills for this job
- Not a developer role, but the client wants someone "technically fluent" who can write scripts and who has built lip-sync code.
- Integrate text, audio, and facial animation, ensuring avatars speak convincingly across languages and platforms.
- Work at the intersection of AI, animation, and engineering to optimize performance and integration.
- Phoneme/viseme mapping and facial animation integration.
- Multi-language speech synchronization and phoneme accuracy across diverse languages.
- Python, C++, or related languages to achieve 1–3 second lip-sync generation times.
- ML/AI frameworks applied to speech-to-animation synchronization.
- Design and optimization of pipelines for real-time face animation and voice integration.
Submit the following filled out, to be reviewed:
Candidate Name | C-Phone | C-Email | Lip Sync Engineer/Coding Years | Real-Time Avatar Synchronization Years | Python Years | C++ Years | C-Location | Salary | Rate W2 | +5 Fee | Your email
10/28/25 | $ | $5 |
This is not a developer role, but the client wants someone "technically fluent" who can write scripts and who has built lip-sync code. We’re seeking a technically skilled Lip Sync Engineer to bring our avatars to life with natural, real-time speech synchronization. This role focuses on building systems that seamlessly integrate text, audio, and facial animation, ensuring avatars speak convincingly across languages and platforms. You’ll work at the intersection of AI, animation, and engineering to deliver lip-sync generation in under 3 seconds, enabling immersive experiences for clients and partners.
1. Position Summary
The Lip Sync Engineer role focuses on developing real-time, high-accuracy lip synchronization systems for digital avatars, enabling natural speech and facial movement across multiple languages and platforms. The candidate will work at the intersection of AI, animation, and engineering to optimize performance and integration. This position emphasizes technical fluency and collaboration rather than pure development, aiming to deliver seamless, immersive user experiences.
2. Required Skills and Experience
- Strong technical knowledge of lip-sync technologies, phoneme/viseme mapping, and facial animation integration.
- Experience working with multi-language speech synchronization and ensuring phoneme accuracy across diverse languages.
- Proficiency in scripting and system optimization using Python, C++, or related languages to achieve 1–3 second lip-sync generation times.
- Familiarity with ML/AI frameworks applied to speech-to-animation synchronization.
- Ability to design and optimize pipelines for real-time face animation and voice integration.
- Demonstrable experience in system performance tuning to meet speed and accuracy benchmarks.
- Experience liaising with product, creative, or client teams to align technical outputs with user experience goals.
3. Desired Skills and Experience
- Knowledge of avatar expression and emotion blending techniques.
- Familiarity with multilingual phoneme/viseme mapping challenges and solutions.
- Experience with animation/visualization pipelines including image, audio, and video input formats.
- Certifications or coursework in AI/Machine Learning applied to animation or speech processing.
- Exposure to cloud-based or distributed systems supporting real-time avatars.
4. Preferred Qualifications
- Strong collaboration skills with creative teams and stakeholders.
- Ability to adapt to modular and integrated system environments.
- Passion for innovative avatar and speech technologies, with a focus on language inclusivity.
5. Required Education
- Bachelor’s degree in Computer Science, Software Engineering, Human-Computer Interaction, or relevant technical discipline.
6. Skills Glossary
- Lip-sync technologies: Systems that generate synchronized mouth movements matching speech audio in real time for avatars.
- Phoneme/viseme mapping: Assigning speech sound units to visual mouth shapes, crucial for accurate lip movement replication (see the sketch after this glossary).
- Facial animation integration: Combining lip movements with facial expressions and gestures for realistic avatar communication.
- Real-time processing: Executing speech and facial animation generation within 1–3 seconds to ensure seamless interaction.
- ML/AI frameworks: Machine learning models and tools used to improve speech-to-animation synchronization accuracy and speed.
- Performance optimization: Techniques to enhance system speed and accuracy, critical for real-time avatar speech.
- Multilingual support: Developing lip-sync solutions that accurately handle diverse languages with unique phonetic characteristics.
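For candidates new to these terms, the short Python sketch below illustrates the basic idea behind phoneme/viseme mapping and the real-time budget referenced in this posting. It is a minimal, hypothetical example: the phoneme labels, viseme names, the PHONEME_TO_VISEME table, and the phonemes_to_keyframes function are illustrative assumptions, not the client's actual pipeline, which would typically rely on forced alignment and ML-driven viseme prediction.

```python
# Illustrative sketch only: a toy phoneme-to-viseme lookup and keyframe pass.
# Phoneme labels, viseme names, and timings below are assumptions for the demo;
# production systems use forced alignment and ML models, not a static table.
import time

# Hypothetical many-to-one mapping from ARPAbet-style phonemes to viseme labels.
PHONEME_TO_VISEME = {
    "AA": "open", "AE": "open", "AH": "open",
    "B": "closed", "P": "closed", "M": "closed",
    "F": "teeth_on_lip", "V": "teeth_on_lip",
    "IY": "smile", "EH": "smile",
    "UW": "round", "OW": "round",
    "SIL": "rest",
}

def phonemes_to_keyframes(timed_phonemes):
    """Convert (phoneme, start_sec, end_sec) tuples into viseme keyframes."""
    keyframes = []
    for phoneme, start, end in timed_phonemes:
        # Unknown phonemes fall back to a neutral "rest" pose.
        viseme = PHONEME_TO_VISEME.get(phoneme, "rest")
        keyframes.append({"viseme": viseme, "start": start, "end": end})
    return keyframes

if __name__ == "__main__":
    # Timed phonemes for a short utterance (timings are made up for the demo).
    utterance = [("SIL", 0.00, 0.05), ("HH", 0.05, 0.12), ("AH", 0.12, 0.25),
                 ("L", 0.25, 0.32), ("OW", 0.32, 0.50), ("SIL", 0.50, 0.60)]
    t0 = time.perf_counter()
    frames = phonemes_to_keyframes(utterance)
    elapsed = time.perf_counter() - t0
    print(frames)
    # The posting's target is end-to-end generation in under ~3 seconds; this
    # toy lookup is effectively instantaneous, so in a real pipeline the budget
    # is spent on audio analysis, ML inference, and rendering.
    assert elapsed < 3.0
```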