job summary:
Fluency in Python for data pipelines, automation, and APIs.
Experience with distributed engines such as Spark, Flink, or PySpark.
Expertise in scalable ETL/ELT pipelines and real-time streaming architectures.
Strong SQL and data modeling expertise.
location: Westlake, Texas
job type: Contract
salary: $72 - 73 per hour
work hours: 8am to 5pm
education: Bachelor's
responsibilities:
1. Lakehouse Architecture (Apache Iceberg)
- Design and build Iceberg-based data lakes with ACID-compliant, versioned datasets.
- Implement Iceberg table evolution (schema evolution, partition spec, snapshot management).
- Develop best practices for Iceberg governance, metadata compaction, and performance tuning.
2. Data Pipelines & Distributed Processing
- Build scalable batch and streaming pipelines using AWS services (S3, EMR, Glue, Lambda, Step Functions).
- Develop ingestion and transformation workflows using Python, Spark, or Flink.
- Implement CDC pipelines using Kafka Connect or equivalent tooling.
- Ensure robust CI/CD integration with GitHub Actions or similar.
3. Streaming Data Engineering (Kafka)
- Design and operate Kafka-based streaming pipelines (Kafka/MSK).
- Build producers/consumers using Python or JVM languages.
- Implement patterns such as topic partitioning, compaction, schema registry, and event versioning.
4. Data Modeling, Quality, and Observability
- Design data models for analytical and operational use cases using Iceberg tables.
- Implement automated data quality checks, validation rules, and anomaly detection.
- Build lineage, monitoring, alerting, and pipeline observability.
5. AWS Architecture & Operations
- Apply best practices for AWS security, cost optimization, and data governance.
- Manage IAM, KMS, S3 object lifecycle management, networking, and data encryption.
- Operationalize EMR/Glue jobs, containerized workloads, or serverless workloads.
6. Cross-Functional Collaboration
- Partner with analytics, platform, and product teams to deliver high-quality data products.
- Participate in design reviews, architecture discussions, and roadmap planning.
- Mentor junior engineers and contribute to engineering standards.
qualifications:
4-10+ years of experience in Data Engineering or similar roles.
Strong hands-on experience with Apache Iceberg (table design, evolution, metadata, partitioning).
Deep experience with the AWS data stack: S3, EMR, Lambda, Glue, IAM, Step Functions, CloudWatch.
Strong proficiency in Kafka (producers/consumers, schema registry, partitioning strategies).
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.