3 days ONSITE
Job Overview: We are seeking a skilled and motivated Mid-Level Research Scientist to join our team. The ideal candidate will focus on developing and deploying multimodal machine learning models specifically for speaker identification and verification tasks. This role involves designing and refining neural architectures that encompass various features, training and evaluating deep learning models, and enhancing the robustness of these systems for real-world applications in voice authentication and behavioral analysis.
Key Responsibilities:
•Model Development: Design innovative neural architectures that integrate speech, acoustic, and linguistic features for speaker identification and verification tasks.
•Data Handling: Train deep learning models on large-scale datasets, including participation in the construction and annotation of specialized datasets, such as the American Dream Dataset.
•Evaluation & Benchmarking: Benchmark age prediction and speaker verification models, leveraging datasets to enhance model performance and demonstrate superior generalization.
•Research Prototyping: Conduct research initiatives focused on cross-modal representation learning and predictive modeling of political career advancement using voice quality and prosodic features.
•Optimization: Optimize existing models, including the development of lightweight architectures for resource-constrained environments, such as real-time image captioning systems.
•Architecture Design: Evaluate and benchmark diverse adapter architectures for vision-text alignment, while achieving state-of-the-art performance metrics on established datasets (e.g., COCO dataset).
•Collaboration: Collaborate with cross-functional teams to translate research findings into scalable solutions and real-world applications.
Required:
•Master’s or PhD in Computer Science, Electrical Engineering, or a related field.
•3-5 years of experience in machine learning and deep learning, with a proven track record of developing multimodal models.
•Strong proficiency in programming languages such as Python and frameworks including TensorFlow and PyTorch.
•Experience with acoustic and linguistic feature extraction and understanding of speaker identification and verification systems.
•Familiarity with natural language processing (NLP) and computer vision integrations, particularly in real-time applications.
•Strong analytical and problem-solving skills, with the ability to work independently and as part of a team.
•Excellent communication skills to present complex technical concepts to diverse audiences.
Preferred Skills:
•Publications in relevant conferences or journals.
•Experience in research involving behavioral analysis and authentication systems.
•Understanding of model efficiency and optimization strategies for deploying machine learning models in production.