Title: Senior Databricks Engineer (AI-Ready Data Platforms)
Location: Houston, Texas, United States
Job Description
We are seeking a Senior Databricks Engineer to design, build, and operationalize a modern, AI-ready data platform. This role is responsible for preparing data pipelines, lakehouse architectures, and governance frameworks that support AI ingestion, model training, inference, and downstream analytics. You will work closely with the Data Architect, AI/ML teams, and business stakeholders to ensure the platform follows best practices, scales efficiently, and supports both real-time and batch AI workloads. This is a hands-on senior role requiring strong technical depth, architectural judgment, and the ability to influence platform decisions.
Roles & Responsibilities
Databricks Platform & Lakehouse Engineering
- Design, implement, and optimize Databricks Lakehouse architectures for AI and analytics use cases.
- Build and maintain high-performance ETL/ELT pipelines using Spark, Delta Lake, and Databricks Workflows.
- Ensure data is AI-ready (clean, normalized, versioned, feature-ready) for model ingestion and inference.
- Optimize compute, storage, and query performance for large-scale workloads.
AI Enablement & Data Readiness
- Prepare structured and semi-structured data for:
  - Model training
  - Feature engineering
  - Real-time and batch inference
- Implement best practices for:
  - Data versioning
  - Lineage and reproducibility
  - Data quality validation
- Support integration with ML platforms, feature stores, and vector databases where applicable.
Architecture Collaboration & Best Practices
- Partner closely with the Data Architect to:
  - Validate architectural decisions
  - Provide implementation-level feedback
  - Recommend improvements based on real-world performance and scale
- Influence standards around:
  - Data modeling
  - Partitioning strategies
  - Security and access controls
  - Cost optimization
Platform Integration & Analytics Enablement
- Enable downstream consumption across platforms such as:
  - Sigma
  - Grafana
  - Oracle Fusion
  - BI, reporting, and operational dashboards
- Support integration with enterprise systems, APIs, and event-driven architectures.
- Ensure observability, monitoring, and alerting are in place for data pipelines and platform health.
Governance, Security & Reliability
- Implement data governance, privacy, and security controls aligned with enterprise standards.
- Support role-based access control (RBAC), auditability, and compliance requirements.
- Contribute to platform resiliency, backup, and recovery strategies.
Required Qualifications
- 7+ years of experience in data engineering, with deep hands-on Databricks experience
- Strong expertise in:
  - Apache Spark (PySpark / Scala)
  - Delta Lake
  - Databricks Workflows & Jobs
- Experience preparing data platforms for AI/ML ingestion and for consuming model outputs
- Strong SQL skills and data modeling experience
- Experience collaborating with data architects and influencing platform design
- Familiarity with modern cloud data ecosystems (Azure, AWS, or Google Cloud Platform)