Scala / Spark — production experience writing Spark applications in Scala (not just notebooks); comfortable with the DataFrame API, joins, window functions, partitioning, and performance tuning
Databricks — Serverless compute, Unity Catalog, Asset Bundles, Databricks CLI
SQL fluency — comfortable writing and analyzing complex SQL scripts, and extracting requirements from them
Snowflake — schema design, performance, Spark-Snowflake connector
Azure — ADLS, networking basics, secrets/identity (Entra ID / managed identities)
Orchestration — Airflow (DAG authoring, sensors, retries, SLAs)
CI/CD — Artifactory, GitHub Actions pipelines: build, sharded test matrices, artifact promotion through dev → QA → UAT → prod
Testing — experience with TDD, unit testing (ScalaTest with AnyFlatSpec), and BDD (Concordion or equivalent)
Data quality & reconciliation — building automated parity checks against legacy outputs, drift detection, row-level reconciliation tooling
Large-scale migrations — proven track record migrating legacy ETL (AutoSys, Informatica, etc.) to cloud data platforms, including dependency mapping and cutover planning
Modern data engineering practices — medallion architecture (Bronze/Silver/Gold), idempotent pipelines, schema evolution, lineage, observability
Nice-to-have
Financial services / regulatory reporting domain
Python (Databricks utilities, tooling)
Spec-driven development workflows (specs → plans → tasks → implementation)
Gradle (composite builds) and JVM tooling