Role: Data Engineer/Analyst (ex-Capital One)
Location: Wilmington, DE – Hybrid
Job Description: Data Engineer/Analyst | Control Automation & ETL Testing
Must Have: Ex-Capital One Data Engineer/Analyst with expert Python & SQL, production ETL pipelines, automated data QA/testing, AWS (S3/Glue/Lambda), data governance/lineage, and risk & control automation experience.
Summary
As a Data Engineer/Analyst within the Risk and Controls organization, you will bridge the gap between data engineering and control execution. You will be responsible for developing, scaling, and monitoring the ETL pipelines that power our risk frameworks. Beyond pipeline construction, you will design and implement scripted QA test suites in Python to ensure data integrity, lineage, and compliance. You will work in a high-impact, collaborative environment where your technical expertise in Python and SQL directly ensures the reliability of our first-line controls.
Types of Work
ETL Development & Data Engineering
Design, develop, and maintain robust ETL pipelines to aggregate and transform raw data into actionable datasets for control execution.
Optimize complex SQL queries and Python scripts to improve data processing speed and reliability across various database environments (Postgres, Snowflake, etc.).
Integrate disparate data sources, including unstructured JSON and relational warehouses, into a unified data layer for risk reporting (a minimal sketch follows this list).
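For illustration only, a minimal sketch of the JSON-to-relational integration step described above. It assumes a pandas-based pipeline; the record structure, field names, and warehouse targets are hypothetical.

```python
import pandas as pd

# Hypothetical raw JSON records, e.g. landed in S3 from an upstream API.
raw_records = [
    {"account_id": "A-100", "risk": {"score": 0.82, "tier": "high"},
     "events": [{"ts": "2024-01-05", "type": "override"}]},
    {"account_id": "A-101", "risk": {"score": 0.31, "tier": "low"}, "events": []},
]

# Flatten nested objects into tabular columns for the unified data layer.
accounts = pd.json_normalize(raw_records, sep="_")[["account_id", "risk_score", "risk_tier"]]

# Explode the nested event list into a child table keyed by account_id.
events = pd.json_normalize(raw_records, record_path="events", meta=["account_id"])

# In a real pipeline these frames would be written to the warehouse
# (Postgres, Snowflake, etc.) via to_sql or a bulk COPY/stage step.
print(accounts)
print(events)
```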
Automated Data Validation & Scripted QA
Build and execute automated QA test suites using Python (e.g., PyTest, Great Expectations) to validate data completeness, accuracy, and timeliness (see the sketch after this list).
Develop "Data-as-Code" testing frameworks to catch anomalies or schema drift before they impact downstream control processes.
Perform unit and integration testing on ETL code bases to ensure the logic reflects the underlying business and system rules.
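As an illustration, a minimal pytest-style sketch of the kind of completeness, accuracy, and timeliness checks described above; the table, columns, and thresholds are hypothetical stand-ins, not a prescribed framework.

```python
# test_control_dataset.py -- run with `pytest`
import pandas as pd
import pytest


@pytest.fixture
def control_df() -> pd.DataFrame:
    # In practice this would load the pipeline's output from the warehouse;
    # a small in-memory frame stands in for it here.
    return pd.DataFrame({
        "control_id": ["CTL-1", "CTL-2", "CTL-3"],
        "executed_at": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
        "status": ["pass", "fail", "pass"],
    })


def test_completeness_no_missing_keys(control_df):
    # Completeness: every row must carry a control identifier.
    assert control_df["control_id"].notna().all()


def test_accuracy_status_domain(control_df):
    # Accuracy: status values must stay within the expected domain.
    assert set(control_df["status"].unique()) <= {"pass", "fail"}


def test_timeliness_recent_execution(control_df):
    # Timeliness: the newest record should fall inside the reporting window.
    assert control_df["executed_at"].max() >= pd.Timestamp("2024-01-01")
```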
Data Governance & Lineage
Manage data repositories and CI/CD pipelines to ensure seamless and governed deployment of data assets.
Drive adherence to data quality principles, including automated metadata capture and technical lineage mapping (a minimal sketch follows this list).
Evaluate integration points to ensure SQL logic accurately captures the state of the systems being reported on.
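One way to picture the automated metadata capture mentioned above is a small lineage record emitted for each pipeline run. This sketch uses hypothetical names and is not tied to any specific governance tool's API.

```python
import hashlib
import json
from datetime import datetime, timezone

import pandas as pd


def capture_lineage(df: pd.DataFrame, source: str, target: str) -> dict:
    """Build a simple lineage/metadata record for one transformation step."""
    schema = {col: str(dtype) for col, dtype in df.dtypes.items()}
    return {
        "source": source,
        "target": target,
        "row_count": int(len(df)),
        "schema": schema,
        # Hashing the schema makes silent drift easy to spot between runs.
        "schema_hash": hashlib.sha256(json.dumps(schema, sort_keys=True).encode()).hexdigest(),
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }


# Example: record lineage after loading a transformed frame to the warehouse.
frame = pd.DataFrame({"control_id": ["CTL-1"], "status": ["pass"]})
print(json.dumps(capture_lineage(frame, source="s3://raw/controls/", target="warehouse.controls_daily"), indent=2))
```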
General Responsibilities
Pipeline Optimization: Identify bottlenecks in data delivery and implement Python-based solutions to automate manual data work.
Technical Partnership: Collaborate with Engineering and Ops to translate control requirements into technical specifications for ETL workflows.
Strategic Problem Solving: Apply a quantitative mindset to close data gaps, leveraging Python libraries for deep-dive analysis of data anomalies.
Communication: Clearly articulate technical risks and data discrepancies to non-technical stakeholders to drive remediation.
Basic Qualifications
Master’s Degree in a quantitative or technical field.
Proven experience building and running ETL pipelines in a production environment.
Expert-level proficiency in Python and SQL, specifically for data manipulation and automated testing.
Experience with relational and non-relational databases (Postgres, MySQL, DynamoDB, Cassandra, or similar).
Preferred Qualifications
Experience building automated QA frameworks for data validation.
Hands-on experience with AWS services (S3, Glue, Lambda, IAM) to support serverless data processing (see the sketch after this list).
Familiarity with data orchestration tools (e.g., Airflow, Prefect) and version control (Git).
Experience handling unstructured data (JSON) and transforming it for structured reporting.
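For orientation, a minimal sketch of the serverless pattern referenced above: an S3-triggered AWS Lambda handler that runs a lightweight check on a newly landed file. The bucket layout and the validation step are hypothetical assumptions.

```python
import csv
import io

import boto3

s3 = boto3.client("s3")


def handler(event, context):
    """Triggered by an S3 put event; runs a simple completeness check on the new file."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    rows = list(csv.DictReader(io.StringIO(body)))

    # A trivial check; a real control would write findings to a results table
    # or raise an alert (e.g. via SNS/CloudWatch) instead of just failing.
    if not rows:
        raise ValueError(f"Empty file received: s3://{bucket}/{key}")

    return {"bucket": bucket, "key": key, "row_count": len(rows)}
```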