AI/ML Architect with Databricks, AWS
Los Angeles, CA (Hybrid)
Contract Role Overview
We are seeking an experienced AI/ML Architect with deep hands-on expertise in Databricks on AWS to lead the design and implementation of scalable, high-performance data and machine learning platforms. The ideal candidate combines architectural thinking with strong engineering execution, demonstrating the ability to build modern lakehouse systems, optimize large-scale pipelines, and drive analytical and ML capabilities across the organization.
This role involves working with large, multi-terabyte datasets, advanced analytics, and end-to-end ML lifecycle management using Databricks, Python, PySpark, and AWS-native services.
Must Demonstrate (Critical Competencies)
Designing Databricks-based lakehouse architectures on AWS (Delta Lake + S3 + Unity Catalog).
Clear separation of compute vs. serving layers in distributed architectures.
Low-latency API strategy where Spark is insufficient (e.g., leveraging optimized services or caching).
Caching strategies to accelerate reads and reduce compute cost.
Data partitioning, file size tuning, and optimization strategies for large-scale pipelines.
Experience handling multi-terabyte structured time-series workloads.
Ability to distill architectural significance from ambiguous business requirements.
Strong curiosity, questioning, and requirement-probing mindset.
Player-coach approach: hands-on technical depth + ability to guide design.
Skills & Qualifications
Required
Bachelor's or Master's in Computer Science, Data Science, Engineering, Statistics, or a related field.
10+ years of experience in data engineering, ML engineering, or AI/ML architecture roles.
Deep expertise in Databricks on AWS, including:
PySpark / Spark SQL
Databricks Notebooks
Delta Lake
Unity Catalog
MLflow
Databricks Jobs & Workflows
Strong programming ability in Python (pandas, NumPy, scikit-learn).
Demonstrated experience with large-scale, multi-terabyte data processing.
Strong understanding of ML algorithms, distributed systems, and data optimization.