Data Literacy & Architecture
- Understanding of data and the business purposes it serves.
- Proficiency in lakehouse and data warehouse architecture patterns.
- Knowledge of common ETL/ELT and source system patterns.
- Competency in data modeling, data quality checks, and monitoring.
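The data quality checks mentioned above can be sketched as small, composable rule functions. This is a minimal illustration in plain Python; the function names, rules, and sample data are hypothetical, not a specific framework's API.

```python
# Hypothetical rule-based data quality checks over rows as dicts.

def check_not_null(rows, column):
    """Return the rows where `column` is missing or None."""
    return [r for r in rows if r.get(column) is None]

def check_unique(rows, column):
    """Return values of `column` that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        v = r.get(column)
        if v in seen:
            dupes.add(v)
        seen.add(v)
    return sorted(dupes)

orders = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},   # fails the not-null rule
    {"order_id": 2, "amount": 5.0},    # duplicate order_id
]

null_violations = check_not_null(orders, "amount")
duplicate_ids = check_unique(orders, "order_id")
```

In practice the same shape of check would run against Spark DataFrames and feed a monitoring dashboard, but the idea is identical: each rule returns the violating records rather than just a pass/fail flag.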
Databricks & Spark
- Experience with Databricks (notebooks, Unity Catalog, orchestration, and connectors).
- Knowledge of the Apache Iceberg and Delta Lake table formats.
- Strong general Spark skills, including distributed workloads, troubleshooting, and table/compute optimization.
- Ability to judge when a workload needs Spark and when a single-node approach is faster.
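The Spark-versus-single-node judgment above often reduces to data volume: small inputs are frequently faster in pandas on the driver, because Spark adds scheduling and shuffle overhead, while large inputs need distributed executors. A hedged sketch of that heuristic, with a purely illustrative threshold and function name (not a Databricks API):

```python
# Illustrative heuristic only; the 10 GiB cutoff is a placeholder to be
# tuned against the driver node's available memory.
SINGLE_NODE_LIMIT_BYTES = 10 * 1024**3

def choose_engine(input_size_bytes: int) -> str:
    """Pick an execution strategy from the input data volume.

    Below the limit, a single-node tool (e.g. pandas) avoids Spark's
    scheduling and shuffle overhead; above it, distributed Spark
    executors are the reasonable default.
    """
    if input_size_bytes <= SINGLE_NODE_LIMIT_BYTES:
        return "single-node"
    return "spark"
```

Real decisions also weigh partition counts, join sizes, and whether downstream steps are already in Spark, but a size-based default like this is a common starting point.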
Software Engineering & Development Practices
- Application of SOLID and DRY principles.
- Experience with metadata frameworks and repeatable patterns.
- Ability to parameterize notebooks, use orchestration flows, and define efficiency-improving frameworks.
- Proficiency in Scrum, Agile, and general development practices.
- Experience with Git development practices (GitHub), including pull requests and version control.
- Knowledge of CI/CD approaches and Infrastructure as Code (Terraform).
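The metadata frameworks and repeatable patterns listed above usually mean one generic, parameterized load routine driven by configuration, instead of a hand-written job per table. A minimal sketch, assuming hypothetical table names and config fields throughout:

```python
# Hypothetical metadata describing each load; in Databricks this might
# live in a Delta table or a JSON file rather than inline.
TABLE_CONFIGS = [
    {"source": "crm.customers", "target": "bronze.customers", "mode": "full"},
    {"source": "erp.orders", "target": "bronze.orders", "mode": "incremental"},
]

def load_table(config, reader, writer):
    """Run one load described by `config`. The `reader` and `writer`
    callables are injected, so the same pattern works with stub I/O in
    tests or with Spark/JDBC in production."""
    rows = reader(config["source"], config["mode"])
    writer(config["target"], rows)
    return len(rows)

def run_pipeline(configs, reader, writer):
    """Apply the one generic load to every config entry."""
    return {c["target"]: load_table(c, reader, writer) for c in configs}

# Usage with stub I/O standing in for real source and target systems:
fake_store = {"crm.customers": [{"id": 1}], "erp.orders": [{"id": 7}, {"id": 8}]}
written = {}
counts = run_pipeline(
    TABLE_CONFIGS,
    reader=lambda src, mode: fake_store[src],
    writer=lambda tgt, rows: written.setdefault(tgt, rows),
)
```

Adding a table then becomes a config change rather than new code, which is the efficiency gain these frameworks are after; the injected I/O also keeps the pattern unit-testable in CI.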
Programming & Querying
- Python/PySpark: Proficiency in Python best practices, package management, and common data libraries (pandas, Spark DataFrames).
- SQL: Ability to write effective queries with efficient filtering, joins, and workload optimization.
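The SQL skills above center on filtering early and joining on keyed columns so the engine can prune work before aggregating. A self-contained sketch using Python's stdlib sqlite3 as a stand-in engine; the schema and data are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO orders VALUES (10, 1, 99.5), (11, 2, 20.0), (12, 1, 0.5);
""")

# The join runs on the indexed customer_id key, and the region predicate
# filters rows before the aggregation, keeping the aggregated set small.
rows = conn.execute("""
    SELECT c.id, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE c.region = 'EMEA'
    GROUP BY c.id
""").fetchall()
```

On a warehouse engine like Databricks SQL the same principles apply at larger scale: selective predicates enable partition and file pruning, and joining on well-chosen keys keeps shuffles manageable.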