Job Details
Position Summary
We are seeking a highly skilled Data Engineer with strong expertise in Python, Spark/PySpark, AWS, and SQL to support large-scale data engineering initiatives. The ideal candidate has hands-on experience building, optimizing, and troubleshooting big-data pipelines and will work closely with cross-functional engineering teams. Prior Capital One ("Ex-Cap") experience is preferred, but not required.
Key Responsibilities
Design, develop, and maintain scalable data pipelines using Python and Spark/PySpark.
Work extensively with AWS cloud services to build and optimize data engineering solutions.
Write, optimize, and troubleshoot complex SQL queries against Snowflake databases.
Collaborate with data architects and stakeholders to understand requirements and translate them into technical solutions.
Develop and maintain ETL/ELT workflows for large datasets.
Troubleshoot production issues in data pipelines and ensure high availability and performance.
Build and maintain data ingestion frameworks using Python, Spark, and AWS components.
Preferred: leverage Go (Golang) for backend or data-processing tasks as needed.
Required Skills & Experience
Strong proficiency in Python for data engineering and automation.
Hands-on experience with Spark / PySpark.
Deep knowledge of AWS cloud services (EMR, S3, Lambda, Glue, etc.).
Ability to write complex SQL queries; strong experience with Snowflake is required.
Experience in debugging/troubleshooting large-scale data systems.
Familiarity with Go (preferred, not mandatory).
Excellent communication and collaboration skills.
Prior Capital One ("Ex-Cap") experience preferred, but not required.