Senior Data Engineer - Customer Data Platform
Location: Atlanta, GA or Frisco, TX (hybrid schedule; onsite requirement applies to both locations)
CDP MISSION: Our mission is to be the authoritative source of truth for customer data - delivering timely, high-quality data at scale to power the contextual experiences that drive the company's growth. Every customer profile must be accurate, trusted, and available when it matters, across every touchpoint, for the entire US adult population.
Job Overview
We are seeking a Senior Data Engineer to lead the design and development of the scalable, high-performance data architecture that underpins our Customer Data Platform (CDP) - the authoritative source of truth for customer data across the entire US adult population.
This role owns the data foundations that make CDP trustworthy: robust pipelines, rigorous quality frameworks, and production-grade data architectures that deliver accurate, timely customer profiles at massive scale. You will ensure that every downstream consumer - from personalization engines to AI/ML models to business intelligence - can rely on CDP data without hesitation.
You will architect systems that handle billions of records across hundreds of source systems, while maintaining the data quality, governance, and operational excellence that an authoritative platform demands.
Job Responsibilities
- Architect and develop scalable, resilient batch and streaming data pipelines using ADF, Databricks, and Spark - designed for the throughput and reliability CDP demands at population scale
- Design and implement end-to-end data architecture across ADLS, Snowflake, Cosmos DB, and SQL Server, ensuring optimal storage strategies for different customer data workloads
- Define and enforce data modeling standards, data contracts, and schema evolution strategies that protect the integrity of CDP as the single source of truth
- Build and maintain data quality, reconciliation, and observability frameworks - ensuring every customer profile meets the accuracy and completeness thresholds required for trusted, authoritative data
- Optimize performance, scalability, and cost efficiency of large-scale distributed data systems processing billions of customer records
- Develop and manage data ingestion frameworks for APIs, event streams (Event Hub/Kafka), and enterprise systems - ensuring no customer touchpoint is missed
- Enable downstream AI/ML and GenAI use cases by preparing curated datasets, feature-ready data, and metadata layers that accelerate identity resolution and personalization
- Support RAG and search-based use cases by building reliable data pipelines for document ingestion, indexing, and retrieval (in collaboration with AI/ML teams)
- Implement and standardize CI/CD pipelines, orchestration (Control-M), and DevOps practices for data workflows with production-grade reliability
- Drive improvements in data governance, lineage, and access patterns - ensuring CDP data is not only accurate but also secure and compliant
- Lead design reviews, code reviews, and technical mentoring for junior engineers, raising the bar for engineering quality across the team
- Partner with cross-functional teams to align data platform capabilities with business needs, ensuring CDP delivers the contextual experiences that drive company growth
- Troubleshoot complex production issues and implement long-term, scalable fixes that strengthen platform resilience
Education and Work Experience
- Bachelor's or Master's degree in Computer Science, Engineering, or related field
- 6+ years of experience in data engineering and distributed data systems
- Proven experience designing and implementing enterprise-scale data platforms serving high-volume, high-reliability use cases
- Strong experience in cloud-native architectures (Azure preferred)
Technical Skills
- Advanced programming: Python, Spark (PySpark/Scala), SQL
- Azure ecosystem: ADF, ADLS, Databricks
- Data platforms: Snowflake, Cosmos DB, SQL Server
- Strong expertise in ETL/ELT design, streaming architectures, and data modeling for customer-centric data domains
- Experience with API integrations and event-driven architectures at scale
- Hands-on experience with data quality frameworks, reconciliation techniques, and monitoring - with a focus on trust and accuracy
- Familiarity with data support for AI/ML use cases (e.g., feature pipelines, data preparation, metadata management for identity resolution)
- Basic understanding of RAG, vector search, and LLM-driven applications (supporting role, not primary ownership)
- Experience implementing data governance, lineage, and access controls in regulated environments
- Expertise in performance tuning and cost optimization for population-scale data processing
Knowledge, Skills, and Abilities
- Strong system design and architectural thinking for data platforms that must be both authoritative and performant
- Ability to handle large-scale, complex data ecosystems spanning hundreds of source systems
- Leadership in driving engineering best practices, data quality standards, and operational excellence
- Deep understanding of why data trust, timeliness, and accuracy are non-negotiable for customer-facing experiences
- Strong collaboration with data science, AI/ML, security, and business teams
- Focus on building reliable, production-grade data platforms that the entire organization can depend on
Additional Requirements
- At least 18 years of age
- Legally authorized to work in the United States
Travel
Travel Required: No