Overview
On Site
Contract - W2
Skills
File Systems
Engineering Design
Batch Processing
Management
Migration
Legacy Systems
Documentation
Data Modeling
Technical Writing
Data Structure
File Formats
Apache Avro
Docker
Kubernetes
Java
ETL (Extract, Transform, Load)
PySpark
Amazon S3
Git
Python
Apache Parquet
SQL
JIRA
Apache Kafka
GitHub
Web Services
Microsoft Visual Studio
IntelliJ IDEA
JetBrains
Gradle
Grafana
Agile
Mentorship
Communication
Collaboration
Kanban
Scrum
FOCUS
Regulatory Compliance
Data Governance
Ab Initio
Job Details
Key Responsibilities
Lead Agile Development: Guide and support multiple Agile teams focused on data extraction, ingestion, and transformation.
Modernize Legacy Systems: Migrate data pipelines from Ab Initio and file-system-based storage to modern technologies such as PySpark, S3, Airflow, Parquet, and Iceberg.
Full-Stack Engineering: Design and develop scalable backend services using PySpark and Python.
Data Platform Enablement: Support ingestion of 300+ data feeds into the platform to ensure timely nightly batch processing.
Cross-Functional Collaboration: Partner with business stakeholders and product owners to understand requirements and deliver effective solutions.
Agile Execution: Work with both Kanban and Scrum teams, participate in regular check-ins, and manage tasks via Jira.
Mentorship and Communication: Provide technical guidance, foster collaboration, and proactively seek help when encountering blockers.
Platform Transition Support: Contribute to the migration from legacy systems to a new data platform over the next two years.
BAU and Strategic Support: Balance business-as-usual responsibilities while contributing to long-term platform modernization.
Documentation and Data Modeling: Maintain clear technical documentation and demonstrate a strong understanding of columnar data structures.
Experience with different file formats (Parquet, ORC, Avro).
Experience with containerized deployments using Docker/Kubernetes.
A Java background is a plus.
Good knowledge of large-scale ETL frameworks.
Experience with an ETL tool (Ab Initio).
Required Skills & Experience
Top Technical Skills
10+ years of experience as a Data Engineer
3+ years of experience with PySpark, S3, Iceberg, Git, Python, Airflow, and Parquet
5+ years of experience with SQL
Experience with Agile methodologies and tools like Jira
Familiarity with Kafka
Experience with GitHub Copilot, web services, Visual Studio, IntelliJ, and Gradle
Experience with monitoring tools such as Grafana or Prometheus
Preferred Qualifications
Proven experience leading Agile teams and mentoring junior developers
Strong communication skills and the ability to collaborate with business stakeholders
Comfortable working in both Scrum and Kanban models, with frequent Scrum check-ins
Ability to identify blockers and proactively seek help when needed
Experience working in a regulated environment with a focus on compliance and data governance
2+ years of working with Ab Initio graphs and plans
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.