DIRECT HIRE ROLE IN HOUSTON, TX
We have a client seeking a Lead Data Engineer on a Direct Hire basis. As a Lead Data Engineer, you will play a critical role in designing, implementing, and managing scalable, high-performance data infrastructure. This role blends systems engineering, data integration, and analytics expertise to support advanced analytics, machine learning initiatives, and real-time data processing.
The ideal candidate brings deep experience with Lakehouse design principles, including layered Medallion Architecture (Bronze, Silver, Gold), to deliver governed, scalable, and high-quality data solutions. This is a highly visible leadership role, responsible for representing the data engineering function and leading the Data Management Community of Practice across the organization.
Key Responsibilities
Design and implement scalable, reliable data pipelines to ingest, process, and store diverse data sets using technologies such as Apache Spark, Hadoop, and Kafka.
Leverage cloud platforms such as AWS or Azure, including services like EC2, RDS, S3, Lambda, and Azure Data Lake, for efficient data processing and storage.
Architect and operationalize Lakehouse solutions using Medallion Architecture best practices, ensuring data quality, lineage, governance, and usability across all layers.
Develop and optimize data models and storage solutions (e.g., Databricks Lakehouse) to support both operational and analytical use cases.
Implement and manage ETL/ELT workflows using tools such as Apache Airflow and Fivetran to ensure reliable, automated data integration.
Lead the Data Management Community of Practice by facilitating collaboration, defining best practices, and representing data engineering across technical and business teams.
Partner with data scientists to enable advanced analytics and machine learning initiatives, supporting data processing in Python or R.
Enforce data governance, security, and compliance standards, including encryption, masking, and access controls within cloud environments.
Monitor, troubleshoot, and optimize data pipelines and databases to ensure performance, reliability, and scalability.
Stay current with emerging data engineering technologies and advocate for continuous improvement across the data ecosystem.
What They’re Looking For
Education & Experience
Bachelor’s degree in Computer Science, MIS, or a related discipline with 10+ years of data engineering experience OR
Master’s degree in a related discipline with 5+ years of data engineering experience
Proven experience designing and operating large-scale data pipelines and architectures
Required Skills & Expertise
Hands-on experience implementing Medallion Architecture within a Databricks Lakehouse environment
Strong expertise in ETL/ELT development and orchestration
In-depth knowledge of Databricks, Dataiku, and cloud-native data platforms
Experience with big data technologies (Apache Spark, Hadoop, Kafka)
Strong AWS experience, including integration of cloud compute and storage with Databricks
Proficiency in SQL and programming languages such as Python, Java, or Scala
Hands-on RDBMS experience, including data modeling, analysis, and stored procedures
Familiarity with machine learning model deployment and lifecycle management
Strong executive presence with the ability to lead communities of practice and communicate effectively with senior leadership
Preferred Certifications
AWS Certified Solutions Architect
Databricks Certified Associate Developer for Apache Spark
DAMA CDMP or other relevant certifications
Physical & Environmental Requirements
The role requires the ability to analyze data, communicate effectively, and remain in a stationary position for extended periods while working at a computer. Occasional movement around the office or campus is required, with the ability to lift up to 10 pounds frequently and up to 25 pounds occasionally.
Travel Requirements
Up to 20% travel, including occasional out-of-state travel, may be required.