This position involves ongoing maintenance and hands-on development work within the existing data pipeline environment. The individual will be responsible for optimizing and maintaining pipelines, converting business requirements into technical solutions, and supporting orchestration and automation efforts across the platform.
The business team may provide SQL-based requirements or requests related to Passport and data movement processes; this individual must translate those into efficient, scalable solutions. In addition to pipeline optimization and maintenance, the person will configure AWS services, including Lambda functions and Step Functions, to support orchestration workflows and operational monitoring.
This role is critical because the individual will effectively own and manage the pipeline environment. Beyond technical development, they must communicate directly with Product Owners and business stakeholders, gather requirements, design and support jobs, and work across multiple business units and data domains. They must independently manage the full lifecycle, from requirement gathering and architecture discussions through implementation, orchestration, monitoring, and ongoing support.
The environment relies heavily on AWS technologies, particularly Step Functions and Lambda functions. While Step Functions may not be commonly used in every environment, they are central to this architecture and are used extensively for orchestration. The workflows currently support approximately 60 jobs running in a combination of sequential and parallel execution patterns. At the conclusion of each workflow, Lambda functions capture logs and auditing details that are used to track job completion status and performance metrics and to support notifications. The selected candidate will need to understand this orchestration model in depth and be comfortable configuring and maintaining these processes.
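To make the orchestration pattern concrete, the following is a minimal sketch of a workflow definition in Amazon States Language, built as a Python dictionary: a few jobs run in sequence, a group of jobs runs in parallel, and a final Lambda task records audit details. The job names, the ARNs, the `build_state_machine` and `audit_handler` functions, and the shape of the audit event are all hypothetical and are not taken from the actual environment.

```python
import json


def build_state_machine(sequential_jobs, parallel_jobs, job_runner_arn, audit_lambda_arn):
    """Build a state machine definition: sequential tasks, a parallel group, then an audit task.

    All job names must be distinct, since state names are unique within a state machine.
    """
    states = {}
    names = [f"Run_{job}" for job in sequential_jobs]

    # Chain the sequential jobs; the last one hands off to the parallel group.
    for i, job in enumerate(sequential_jobs):
        next_state = names[i + 1] if i + 1 < len(names) else "ParallelGroup"
        states[names[i]] = {
            "Type": "Task",
            "Resource": job_runner_arn,  # hypothetical Lambda that starts a job
            "Parameters": {"job": job},
            "Next": next_state,
        }

    # Each parallel job becomes its own single-state branch.
    states["ParallelGroup"] = {
        "Type": "Parallel",
        "Branches": [
            {
                "StartAt": f"Run_{job}",
                "States": {
                    f"Run_{job}": {
                        "Type": "Task",
                        "Resource": job_runner_arn,
                        "Parameters": {"job": job},
                        "End": True,
                    }
                },
            }
            for job in parallel_jobs
        ],
        "Next": "AuditAndNotify",
    }

    # Final step: a Lambda that captures logs and audit details.
    states["AuditAndNotify"] = {"Type": "Task", "Resource": audit_lambda_arn, "End": True}

    return {
        "Comment": "Illustrative sketch only",
        "StartAt": names[0] if names else "ParallelGroup",
        "States": states,
    }


def audit_handler(event, context=None):
    """Hypothetical audit Lambda: summarize job results supplied in the event."""
    results = event.get("results", [])  # assumed event shape, not from the source
    failed = [r["job"] for r in results if r.get("status") != "SUCCEEDED"]
    return {
        "job_count": len(results),
        "failed_jobs": failed,
        "total_duration_seconds": sum(r.get("duration_seconds", 0) for r in results),
        "notify": bool(failed),  # only raise a notification when something failed
    }


definition = build_state_machine(
    ["extract", "stage"],
    ["load_a", "load_b"],
    "arn:aws:lambda:us-east-1:123456789012:function:job-runner",  # placeholder ARN
    "arn:aws:lambda:us-east-1:123456789012:function:audit",       # placeholder ARN
)
definition_json = json.dumps(definition, indent=2)
```

The resulting JSON string is the kind of document that would be supplied as a state machine definition. In a real workflow the task states would also need retry and error-handling settings, which are omitted here for brevity.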
Another major focus area for this role is performance optimization. The current pipeline execution time is approximately three to four hours, and the goal is to reduce that runtime to under two hours. Because of the complexity of the environment, optimization and performance tuning will be the primary way this person adds value. The candidate should have strong experience with Spark optimization, SQL performance tuning, and large-scale data pipeline efficiency improvements.
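One way to reason about the runtime target: when jobs run in a mix of sequential and parallel patterns, the wall-clock time is set by the longest dependency chain rather than by the sum of all job durations. The sketch below computes that chain for a toy workflow. The job names, durations, and dependencies are hypothetical and are not measurements from the actual pipeline.

```python
from functools import lru_cache


def critical_path_minutes(durations, depends_on):
    """Return (total_minutes, path) for the longest dependency chain in the workflow."""

    @lru_cache(maxsize=None)
    def finish(job):
        # Earliest finish time of a job: its own duration plus the slowest upstream chain.
        upstream = depends_on.get(job, [])
        if not upstream:
            return durations[job], (job,)
        best_time, best_path = max(finish(u) for u in upstream)
        return best_time + durations[job], best_path + (job,)

    return max(finish(job) for job in durations)


# Hypothetical durations in minutes.
durations = {"extract": 40, "transform_a": 90, "transform_b": 60, "load": 50}
# transform_a and transform_b run in parallel after extract; load waits for both.
depends_on = {
    "transform_a": ["extract"],
    "transform_b": ["extract"],
    "load": ["transform_a", "transform_b"],
}

total, path = critical_path_minutes(durations, depends_on)
print(total, path)  # 180 ('extract', 'transform_a', 'load')
```

In this toy case the jobs total 240 minutes of work, but the workflow takes 180 minutes because `transform_b` runs alongside `transform_a`. Tuning `transform_b` alone would not shorten the run; getting under two hours would require removing more than 60 minutes from the jobs on the critical chain.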
From a technical screening perspective, the primary areas of focus will likely include:
- SQL and advanced query optimization
- PySpark and Spark performance tuning
- Data pipeline architecture and optimization
- AWS orchestration technologies, including Step Functions and Lambda
- Data modeling and workflow orchestration concepts
AWS Step Functions are a mandatory part of the environment, and deeper orchestration expertise can be evaluated during the interview process. The broader project work is centered on orchestration, optimization, and operational support, so interview discussions will likely include scenario-based questions focused on those areas.
In addition, the person should be capable of acting as a subject matter expert for the broader team, including offshore team members in India. They should be comfortable answering technical questions, providing guidance, and supporting troubleshooting efforts as needed.