Data Engineer IV

Overview

On Site
Accepts corp to corp applications
Contract - W2
Contract - 12 Month(s)

Skills

Data Integrity
Workflow
Multimedia
Audiovisual
Sensors
Data Processing
Extraction
Management
Training
Quality Assurance
Prototyping
Regulatory Compliance
Data Governance
Computer Science
Data Science
Data Engineering
NumPy
Pandas
Linux
Shell Scripting
Video
PyTorch
Python
Machine Learning (ML)
NoSQL
Database
Research
Health Insurance
Insurance
Team Building
Collaboration
Wiki
Knowledge Base
Status Reports
Account Management
IT Consulting
Managed Services
Recruiting
Artificial Intelligence
Cyber Security
Enterprise Architecture
Value Engineering
Lean Methodology

Job Details

Are you passionate about transforming complex multimedia data into powerful machine learning datasets? We're on the lookout for a skilled Data Engineer to operate at the fascinating intersection of data engineering and applied machine learning. In this role, you'll design and implement robust data-processing pipelines, turning raw, intricate data into clean, research-ready datasets. If you thrive on building scalable solutions and ensuring data integrity for cutting-edge ML workflows, we'd love to collaborate with you!
Responsibilities:
  • Design, develop, and maintain scalable data-processing pipelines for large volumes of multimedia (audio, video) and sensor data (e.g., IMU), ensuring reliability and reproducibility.
  • Gather and interpret processing requirements from stakeholders, translating them into practical technical solutions and devising novel approaches where needed.
  • Perform diverse data-processing operations, from mathematical transformations and filtering to feature extraction, synchronisation, and inference through ML models.
  • Interface with internal tooling, such as dataset management systems and training frameworks, to prepare raw data for machine learning, including validation, transformation, and quality assurance.
  • Collaborate with machine learning researchers to integrate research prototypes into production pipelines.
  • Ensure compliance with data governance, security, and relevant standards.
Minimum requirements:
  • Bachelor's degree in a relevant technical field (e.g., Computer Science, Data Science) with a minimum of 3 years of industry experience in machine learning or data engineering; or equivalent combination of education and experience.
  • Demonstrable programming experience in Python using common ML and data libraries, e.g., NumPy, SciPy, pandas.
  • Proficiency in Linux and shell scripting.
  • Working knowledge of audio, image, and video formats.
Preferred experience:
  • Experience using PyTorch or other Python machine-learning frameworks.
  • Experience with relational and graph / NoSQL databases.
  • Experience using REST APIs for data interactions.
  • Experience working in a research environment.
  • Strong mathematical background.
Benefits:
  • 401(k).
  • Dental insurance.
  • Health insurance.
  • Vision insurance.
  • We are an equal-opportunity employer and value diversity, equality, inclusion, and respect for people.
  • The salary will be determined based on several factors, including, but not limited to, location, relevant education, qualifications, experience, technical skills, and business needs.
Additional responsibilities:
  • Participate in OP monthly team meetings and team-building efforts.
  • Contribute to OP technical discussions, peer reviews, etc.
  • Contribute content and collaborate via the OP-Wiki/Knowledge Base.
  • Provide status reports to OP Account Management as requested.
About us:
At OP, we help you harness the power of technology for maximum impact. A technology consulting and solutions company, we offer advisory and managed services, innovative platforms, and staffing solutions across a wide range of fields including AI, cyber security, enterprise architecture, and beyond. For nearly two decades, we've been challenging the status quo of the consulting industry, serving up fresh, ingenious thinking through a radically lean structure. This strategy delivers unprecedented performance at an unparalleled pace, for faster results that propel your business forward.