Job Title: PySpark / Python Data Engineer Tech Lead
Lead hands-on PySpark/Python/Java data engineering initiatives as a proven technical leader. Architect scalable AWS data platforms while actively coding, mentoring teams, and driving complex business transformations in the Energy & Utilities domain.
Location: PST timezone (Remote/Hybrid)
Experience: 10-15+ years (Tech Lead with hands-on coding)
Focus: PySpark (primary), Python, Java (hands-on Tech Lead role required)
Domain: Energy & Utilities (advantage)
Role Summary (Tech Lead Emphasis)
Execute as a hands-on Tech Lead architecting PySpark/Python/Java data solutions on AWS. Demonstrate proven leadership through active coding (60%+ hands-on), team mentoring, and translating complex business logic into production pipelines. Palantir Foundry experience is highly desirable.
Key Responsibilities (Hands-On Leadership)
Hands-On Technical Leadership
- Lead by example, developing PySpark transformations, Python ETL pipelines, and Java data integrations (EMR, Glue, Lambda).
- Architect end-to-end data platforms, translating stored procedures, SQL triggers, and source-system rules into scalable Spark implementations (see the sketch after this list).
- Actively code production-grade solutions (60%+ hands-on) while mentoring 4-8 engineers through code reviews and pair programming.
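To give a flavor of the "business logic translation" this role calls for, here is a minimal, purely illustrative sketch of a stored-procedure-style rule ("flag meters whose daily usage exceeds three times their trailing 30-day average") re-expressed as a PySpark transformation. The dataset, S3 paths, and column names (meter_readings, usage_kwh) are hypothetical, and it assumes Delta Lake is configured on the cluster.

```python
# Illustrative only: a stored-procedure-style anomaly rule rewritten as a
# PySpark DataFrame transformation. All names/paths are hypothetical.
from pyspark.sql import SparkSession, functions as F, Window

spark = SparkSession.builder.appName("usage-anomaly-flags").getOrCreate()

readings = spark.read.format("delta").load("s3://example-bucket/meter_readings")

# Trailing 30-day window per meter, expressed in seconds for rangeBetween.
trailing_30d = (
    Window.partitionBy("meter_id")
    .orderBy(F.col("reading_date").cast("timestamp").cast("long"))
    .rangeBetween(-30 * 86400, -1)
)

flagged = (
    readings
    .withColumn("avg_30d_kwh", F.avg("usage_kwh").over(trailing_30d))
    .withColumn("is_anomaly", F.col("usage_kwh") > 3 * F.col("avg_30d_kwh"))
)

flagged.write.format("delta").mode("overwrite").save("s3://example-bucket/usage_flags")
```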
Pipeline Architecture
- Design scalable data pipelines with Spark Streaming, job orchestration, workflow design, and data mapping on AWS EMR/Glue.
- Design and implement APIs and real-time processing using Java Spring Boot or Python FastAPI integrated with AWS services (a minimal sketch follows this list).
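As a minimal sketch of the kind of Python FastAPI + AWS integration referenced above (not an actual service of this role): a read-only endpoint that lists a pipeline's output files in S3 via Boto3. The bucket name and key prefix are hypothetical placeholders.

```python
# Illustrative FastAPI endpoint backed by S3 via Boto3; names are hypothetical.
import boto3
from fastapi import FastAPI, HTTPException

app = FastAPI(title="pipeline-output-api")
s3 = boto3.client("s3")

@app.get("/outputs/{run_date}")
def list_outputs(run_date: str):
    """Return the S3 keys written by the daily pipeline for a given run date."""
    resp = s3.list_objects_v2(
        Bucket="example-data-platform",                # hypothetical bucket
        Prefix=f"curated/usage_flags/dt={run_date}/",  # hypothetical prefix
    )
    contents = resp.get("Contents", [])
    if not contents:
        raise HTTPException(status_code=404, detail="No outputs for that date")
    return {"run_date": run_date, "keys": [obj["Key"] for obj in contents]}
```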
Tech Lead Responsibilities
- Mentor junior developers on PySpark best practices, Python optimization, and Java scalability patterns.
- Lead Agile ceremonies (stand-ups, sprint planning, retrospectives) with hands-on backlog refinement.
- Drive technical excellence through unit/integration tests, schema validations, and health checks (see the illustrative test sketch below).
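An illustrative example of the testing discipline expected here: a pytest unit test that checks both the schema and the values produced by a small PySpark transformation. The function and column names are hypothetical.

```python
# Illustrative pytest unit test for a PySpark transformation; names are hypothetical.
import pytest
from pyspark.sql import SparkSession, functions as F

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()

def add_anomaly_flag(df):
    """Transformation under test: flag readings above a fixed threshold."""
    return df.withColumn("is_anomaly", F.col("usage_kwh") > 100.0)

def test_add_anomaly_flag(spark):
    df = spark.createDataFrame(
        [("m1", 50.0), ("m2", 150.0)], ["meter_id", "usage_kwh"]
    )
    result = add_anomaly_flag(df)

    # Schema validation: the new column must exist and be boolean.
    assert dict(result.dtypes)["is_anomaly"] == "boolean"
    # Value check: only the reading above the threshold is flagged.
    flags = {r["meter_id"]: r["is_anomaly"] for r in result.collect()}
    assert flags == {"m1": False, "m2": True}
```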
Required Technical Expertise (Hands-On Required)
| Primary Skills | Hands-On Requirements |
| --- | --- |
| PySpark | DataFrames, Spark SQL, Streaming, Delta Lake (must code daily) |
| Python | Pandas, NumPy, FastAPI, ETL automation, AWS SDK (Boto3) |
| Java | Spring Boot, data integrations, AWS SDK (Java), microservices |
| AWS | EMR, Glue, Lambda, Athena, Redshift, S3, Step Functions |
| DevOps | CI/CD, GitHub Actions, Docker, Terraform/CloudFormation |
Tech Lead Toolkit: JIRA, Confluence, Agile/Scrum, code review processes, technical mentoring
Tech Lead Experience Profile
- PROVEN HANDS-ON TECH LEAD: 60%+ coding while leading 4-8 developers
- PySpark/Python/Java: Active development across multiple enterprise projects
- Long-term projects: 2+ years each demonstrating sustained leadership
- Energy/Utilities: Grid, SCADA, asset analytics, regulatory reporting (advantage)
- PST timezone: Full availability required
Must Demonstrate: Code samples/walkthroughs of PySpark transformations, Python automation, and Java integrations during the interview.
Keywords: PySpark Tech Lead, Python Data Engineer Lead, Java Data Engineer, hands-on Tech Lead, PySpark DataFrames, Spark Streaming, Spark SQL, Delta Lake, Python Pandas NumPy FastAPI, Java Spring Boot, AWS EMR Glue Lambda Athena Redshift S3, data pipeline orchestration, ETL ELT, job scheduling, workflow design, data mapping, business logic translation, stored procedures SQL triggers, Energy Utilities domain, Palantir Foundry Ontology, CI/CD GitHub Actions Docker Terraform, unit testing integration testing schema validation health checks, Agile, Scrum, JIRA, Confluence
About VDart Group
VDart Group is a global leader in technology, product, and talent solutions, serving Fortune 500 clients in 13 countries. With over 4,000 professionals worldwide, we deliver innovation, operational excellence, and measurable outcomes across industries. Guided by our commitment to People, Purpose, and Planet, VDart is recognized with an EcoVadis Bronze Medal and as a UN Global Compact member, reflecting our dedication to sustainable practices.