Our client is a global leader in the IoT space for retail loss prevention, operations management, and analytics, with headquarters in South Orange County, California. They maintain a strong presence across the globe, with offices in the UK, Australia, China, Hong Kong, Germany, France, and Canada.
They are urgently seeking a mid-to-senior-level Data & Applied ML Engineer with strong Python, SQL, ETL/ELT, and ML skills. This role bridges core data engineering and practical machine learning applications. Primarily, you will be a data platform engineer responsible for owning the core data pipelines, data models, and quality controls that power Gatekeeper analytics and future data products.
Secondarily, you will drive the production lifecycle of our shopping cart computer-vision feature. You will orchestrate the data workflows that interface with our machine learning models to ensure accurate cart classification, while leveraging the FaceFirst ML team for deeper capacity. You will collaborate with BI Analysts, software engineers, and product teams to transform raw data into actionable insights.
This position is onsite 5 days a week in South Orange County.
Responsibilities
- Pipeline Design & Operation: Design, build, and operate scalable ELT/ETL pipelines that ingest data from IoT/smart-cart telemetry, video events, operational systems, and external partners into our cloud data lake/warehouse.
- Infrastructure Management: Build and maintain robust data infrastructure, including databases (SQL and NoSQL), data warehouses, and data integration solutions.
- Data Modeling: Establish canonical data models and definitions (schemas, event taxonomy, metrics) so teams can trust and reuse the same data across products, BI, and analytics.
- Data Quality Assurance: Own data quality end-to-end by implementing validation rules, automated tests, anomaly detection, and monitoring/alerting to prevent and quickly detect regressions.
- Consistency & Governance: Drive data consistency improvements across systems (naming, identifiers, timestamps, joins, deduplication) and document data contract expectations with producing teams.
- Root Cause Analysis: Troubleshoot pipeline and data issues, perform root-cause analysis, and implement durable fixes that improve reliability and reduce operational load.
- Collaboration & Analytics: Partner with BI Analysts and Product teams to create curated datasets and self-serve analytics foundations (e.g., marts/semantic layer), and support internal-facing dashboards that communicate system health.
Required Skills / Must Haves
- 5+ years of relevant work experience.
- Core Engineering: Strong experience building and operating production ELT/ETL pipelines and data warehouses.
- Programming: Fluency in SQL and Python (or similar) for data transformation, validation, and automation.
- Cloud Platforms: Experience with cloud data platforms (Azure and/or Google Cloud Platform), including object storage, security/access controls, and cost-aware design.
- Tooling: Hands-on experience with orchestration and transformation tooling (e.g., Airflow/Prefect) and batch processing frameworks (e.g., Spark/Databricks).
- Quality Practices: Practical experience implementing data quality practices (tests, monitoring/alerting, lineage/documentation) and improving data consistency across systems.
- Operations: Experience collaborating with operational teams to identify, diagnose, and remediate in-field system issues.
- Bachelor's Degree in Computer Science, Software Engineering, Information Systems, Mathematics, Statistics, or a related technical field.
Nice-to-Have Skills
- Lifecycle Management: Own the production lifecycle for the cart classification capability, including data collection/labeling workflows, evaluation, threshold tuning, and safe release/rollback processes.
- Pipeline Implementation: Implement and optimize machine learning pipelines, from feature engineering and model training to deployment and monitoring in production.
- Evaluation & Monitoring: Build and maintain an evaluation harness (offline metrics + repeatable test sets) and ongoing monitoring (accuracy drift, data drift, false positive/negative analysis).
- Cross-Team Collaboration: Collaborate with the FaceFirst ML team to incorporate improvements (model updates, feature changes) while keeping Gatekeeper's production integration stable.
- Integration: Work with software engineers to ensure the classifier integrates cleanly into the product workflow with robust telemetry, logging, and operational runbooks.
Education and/or Experience
- BSEE, MSEE, BSCS, or MSCS
The Offer
- Attractive total compensation package of $150k-$180k
- Comprehensive healthcare benefits including medical, dental, and vision coverage; Life/AD&D/LTD insurance; FSA/HSA options
- 401(k) Plan with employer match
- Generous paid time off policy
- Observance of 11 paid company holidays