Job Title: Data Engineer
Location: Remote
Role Overview:
The Data Engineer will design, build, and manage scalable, secure, and high-performance data products and pipelines that power enterprise analytics and decision-making. Data is at the center of our operations, and this role is critical to modernizing how data is managed, accessed, and consumed across the organization.
The Common Data Platform (CDP) is a cloud-based, end-to-end data management platform created to reduce data costs and significantly improve the user experience. It was initially developed in partnership with the Supervision and Regulation business.
You will join a cross-district, enterprise-scale team operating under the Scaled Agile Framework (SAFe), delivering incremental value in a fast-paced and highly collaborative environment. This is a unique opportunity to help shape the future of data management at a national scale.
Key Responsibilities:
- Design, develop, and maintain robust, scalable, and secure data pipelines to ingest, transform, catalog, and deliver curated, high-quality data into the Common Data Platform
- Build and support end-to-end data pipelines for structured, semi-structured, and unstructured data using Spark-based architectures
- Ensure data pipelines are repeatable, reliable, well-tested, and capable of supporting multiple users and use cases
- Proactively identify, troubleshoot, and resolve issues across data pipelines, workflows, and analytical data stores
- Implement monitoring, alerting, and (where possible) auto-remediation to ensure availability and reliability of data systems
- Apply security-first design principles and data engineering best practices, including testing and automation
- Collaborate closely with product managers, data scientists, analysts, and business stakeholders to understand data requirements and deliver the necessary infrastructure and tooling
- Participate fully in Agile ceremonies and follow SAFe Agile practices as defined by the CDP Program team
- Continuously evaluate emerging tools, frameworks, and technologies to improve data engineering processes and efficiencies
Required Qualifications:
- Bachelor’s degree in Computer Science, Information Systems, or a related field, or equivalent professional experience
- 3+ years of experience with Python and PySpark
- 2+ years of experience working with tools such as Databricks, Collibra, and/or Starburst
- Strong SQL skills (experience with Starburst is a plus)
- Hands-on experience with:
  - Databricks (PySpark)
  - Python
  - GitLab
  - CI/CD pipelines
  - Tableau
- Experience using Jupyter Notebooks for development and testing, including unit testing
- Proven experience working with relational and NoSQL data stores, including dimensional modeling techniques (e.g., star schema)
- 2+ years of experience with a modern data stack, including:
  - Object storage (e.g., S3)
  - Spark
  - Airflow
  - Lakehouse architectures
  - Real-time databases
- Experience with cloud data warehouses such as Amazon Redshift or Snowflake
- Broad data engineering experience across traditional ETL and big data platforms, either on-premises or in the cloud
- Data engineering experience in AWS, including familiarity with relevant services and tools (e.g., CFS2/EDS)
Preferred Qualifications:
- Experience supporting enterprise-scale data platforms
- Familiarity with SAFe Agile or large-scale Agile delivery models
- Experience building data products used by diverse stakeholder groups across organizations