Job#: 3027392
Job Description: Data Engineer - Databricks & Lakehouse Platform
Permanent/Contract to Hire | Cloud Data Engineering | Analytics Enablement
Overview
We are seeking a skilled Data Engineer to design, build, and maintain batch ETL pipelines on a modern Databricks Lakehouse platform. This role supports enterprise-scale analytics and reporting by delivering high-quality, well-governed data to downstream consumers such as Power BI dashboards and analytical models.
The ideal candidate brings strong technical depth in Apache Spark, Delta Lake, and Python, along with a disciplined approach to documentation, data quality, and operational excellence. This role requires autonomy, strong prioritization skills, and a continuous-improvement mindset.
Key Responsibilities
Data Pipeline Development
- Design, build, and maintain batch ETL pipelines ingesting data from diverse source systems into Databricks
- Own the full pipeline lifecycle: ingestion, transformation, serving, monitoring, and incident resolution
- Ensure pipelines are scalable, maintainable, and production-ready
Data Quality & Integrity
- Implement automated data validation, reconciliation checks, and quality gates
- Ensure data meets standards for accuracy, completeness, timeliness, and consistency
- Maintain historical data to support auditability and compliance requirements
Performance Optimization
- Optimize Spark workloads using efficient Spark SQL, partitioning strategies, and Delta Lake best practices
- Perform Delta Lake maintenance activities such as OPTIMIZE, VACUUM, and Z-ordering
Data Modeling & Analytics Enablement
- Design dimensional data models (star schemas, aggregates) optimized for Power BI and self-service analytics
- Prepare semantic data layers with clean business logic, intuitive naming conventions, and well-documented relationships
- Support AI-enabled analytics use cases through high-quality, analytics-ready data structures
Process, Documentation & Engineering Practices
- Create and maintain runbooks, deployment procedures, coding standards, and operational documentation
- Participate in code reviews and contribute to automated quality checks
- Help establish repeatable processes that enable team scalability
Governance, Security & Compliance
- Adhere to enterprise data governance standards and security best practices
- Implement access controls, encryption, and data lineage tracking for sensitive data
- Support compliance with applicable regulations (e.g., AML, BSA)
Collaboration & Communication
- Partner with data architects, analysts, and business stakeholders to translate requirements into scalable solutions
- Communicate technical tradeoffs, risks, and timelines clearly to non-technical audiences
Mentorship & Continuous Improvement
- Support junior and mid-level engineers through code reviews, pairing, and coaching
- Identify opportunities to improve pipeline reliability, efficiency, and scalability
- Evaluate and adopt new Databricks features and industry best practices
Required Skills & Experience
- Bachelor's degree in Computer Science or a related field (or equivalent practical experience)
- 5+ years of experience as a Data Engineer in complex, large-scale data environments
- Strong hands-on experience with Databricks and Apache Spark (PySpark, Spark SQL)
- Production experience with Delta Lake and Lakehouse architectural patterns
- Proficiency in Python for data processing and SQL for complex transformations
- Strong understanding of dimensional modeling, star schemas, and BI-optimized table design
- Experience designing end-to-end pipeline architectures including orchestration, monitoring, and error handling
- Hands-on experience implementing data quality checks and automated testing
- Familiarity with Git-based version control and CI/CD workflows
- Strong written and verbal communication skills
Preferred / Nice-to-Have Skills
- Experience in financial services or other regulated industries
- Familiarity with regulatory or compliance-driven data pipelines
- Experience with Databricks Unity Catalog
- Power BI development, DAX, or semantic modeling experience
- Exposure to Databricks AI/BI capabilities or AI-ready data layers
- Knowledge of streaming or event-driven architectures (Spark Structured Streaming, Kafka)
- Understanding of DataOps practices
- Relevant certifications (Databricks, cloud data engineering, etc.)
Physical Requirements
- Ability to communicate effectively via written and verbal methods
- Ability to sit for extended periods and perform computer-based work
- Visual acuity sufficient for data analysis, documentation, and system monitoring
- No substantial exposure to adverse environmental conditions
EEO Employer
Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department.
Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico. Apex uses a virtual recruiter as part of the application process.
Apex Benefits Overview: Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401(k) program, which typically allows you to begin contributing within 30 days of starting, with a company match after 12 months of tenure. Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program, and other discounts. In terms of professional development, Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and offers certification discounts and other perks through associations that include CompTIA and IIBA. Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach. You can access a full list of our benefits, programs, support teams, and resources within our 'Welcome Packet' as well, which an Apex team member can provide.