DIRECT HIRE ROLE IN HOUSTON, TX
Our client is seeking a Lead Data Engineer on a Direct Hire basis. In this position, you will play a critical role in designing, implementing, and managing scalable, high-performance data infrastructure. The role blends systems engineering, data integration, and analytics expertise to support advanced analytics, machine learning initiatives, and real-time data processing.
The ideal candidate brings deep experience with Lakehouse design principles, including layered Medallion Architecture (Bronze, Silver, Gold), to deliver governed, scalable, and high-quality data solutions. This is a highly visible leadership role, responsible for representing the data engineering function and leading the Data Management Community of Practice across the organization.
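For illustration only, a minimal PySpark sketch of the Bronze, Silver, and Gold layering referenced above, assuming a Databricks/Delta Lake environment; the paths, table layout, and column names are hypothetical and not drawn from the client's actual stack.

```python
# Illustrative Medallion (Bronze -> Silver -> Gold) flow.
# Assumes a Databricks/Delta Lake runtime; all paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land raw source data as-is, adding ingestion metadata.
bronze = (
    spark.read.json("/mnt/raw/orders/")  # hypothetical landing path
    .withColumn("_ingested_at", F.current_timestamp())
)
bronze.write.format("delta").mode("append").save("/mnt/bronze/orders")

# Silver: cleanse and conform -- deduplicate, enforce types, drop bad rows.
silver = (
    spark.read.format("delta").load("/mnt/bronze/orders")
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_total") >= 0)
)
silver.write.format("delta").mode("overwrite").save("/mnt/silver/orders")

# Gold: business-level aggregate ready for analytics and ML consumption.
gold = (
    silver.groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(
        F.sum("order_total").alias("daily_revenue"),
        F.countDistinct("customer_id").alias("unique_customers"),
    )
)
gold.write.format("delta").mode("overwrite").save("/mnt/gold/daily_revenue")
```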
Key Responsibilities
- Design and implement scalable, reliable data pipelines to ingest, process, and store diverse data sets using technologies such as Apache Spark, Hadoop, and Kafka.
- Leverage cloud platforms such as AWS or Azure, utilizing services including EC2, RDS, S3, Lambda, and Azure Data Lake for efficient data processing and storage.
- Architect and operationalize Lakehouse solutions using Medallion Architecture best practices, ensuring data quality, lineage, governance, and usability across all layers.
- Develop and optimize data models and storage solutions (e.g., Databricks, Data Lakehouses) to support both operational and analytical use cases.
- Implement and manage ETL/ELT workflows using tools such as Apache Airflow and Fivetran to ensure reliable, automated data integration (see the orchestration sketch after this list).
- Lead the Data Management Community of Practice by facilitating collaboration, defining best practices, and representing data engineering across technical and business teams.
- Partner with data scientists to enable advanced analytics and machine learning initiatives, supporting data processing using Python or R.
- Enforce data governance, security, and compliance standards, including encryption, masking, and access controls within cloud environments.
- Monitor, troubleshoot, and optimize data pipelines and databases to ensure performance, reliability, and scalability.
- Stay current with emerging data engineering technologies and advocate for continuous improvement across the data ecosystem.
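For illustration only, the orchestration sketch referenced in the ETL/ELT bullet above: a minimal Apache Airflow DAG using the TaskFlow API (a recent Airflow 2.x release is assumed). The DAG id, schedule, and task bodies are placeholders, not the client's actual workflow.

```python
# Minimal ELT orchestration sketch with Airflow's TaskFlow API.
# DAG id, schedule, and task logic are placeholders for illustration only.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_elt():
    @task
    def extract():
        # In practice this would pull from a source system or a Fivetran-landed table.
        return [{"order_id": 1, "order_total": 42.0}]

    @task
    def transform(rows):
        # Basic cleansing step; heavier transforms would typically run in Spark/Databricks.
        return [r for r in rows if r["order_total"] >= 0]

    @task
    def load(rows):
        print(f"Loading {len(rows)} rows into the warehouse")

    load(transform(extract()))


orders_elt()
```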
What They’re Looking For
Education & Experience
- Bachelor's degree in Computer Science, MIS, or a related discipline with 10+ years of data engineering experience, OR
- Master's degree in a related discipline with 5+ years of data engineering experience
- Proven experience designing and operating large-scale data pipelines and architectures
Required Skills & Expertise
- Hands-on experience implementing Medallion Architecture within a Databricks Lakehouse environment
- Strong expertise in ETL/ELT development and orchestration
- In-depth knowledge of Databricks, Dataiku, and cloud-native data platforms
- Experience with big data technologies (Apache Spark, Hadoop, Kafka)
- Strong AWS experience, including integration of cloud compute and storage with Databricks
- Proficiency in SQL and programming languages such as Python, Java, or Scala
- Hands-on RDBMS experience, including data modeling, analysis, and stored procedures
- Familiarity with machine learning model deployment and lifecycle management
- Strong executive presence with the ability to lead communities of practice and communicate effectively with senior leadership
Preferred Certifications
- AWS Certified Solutions Architect
- Databricks Certified Associate Developer for Apache Spark
- DAMA CDMP or other relevant certifications
Physical & Environmental Requirements
The role requires the ability to analyze data, communicate effectively, and remain in a stationary position for extended periods while working at a computer. Occasional movement around the office or campus is required, with the ability to lift up to 10 pounds frequently and up to 25 pounds occasionally.
Travel Requirements
Up to 20% travel, including occasional out-of-state travel, may be required.