JD Data Architect
Location: Plano, TX - Onsite
We are seeking a hands-on Data Architect to design and evolve an AWS-based data platform spanning streaming ingestion (Kafka), API/enterprise integration (MuleSoft), containerized data services (EKS), a data lake on S3, interactive query with Athena, and analytics/reporting on Snowflake and Tableau.
You will set data architecture standards, lead solution design, and guide engineering teams to deliver a scalable, secure, and cost-efficient platform that accelerates product and analytics use cases.
Key Responsibilities
Architecture & Design
Own the end-to-end data architecture across ingestion, storage, processing, serving, and visualization layers.
Define canonical data models and domain data contracts; lead conceptual/logical/physical data modeling and schema design for batch and streaming use cases.
Establish reference architectures and patterns for event-driven and API-led data integration (Kafka, MuleSoft).
Design secure, multi-account AWS topologies (VPC, IAM, KMS) for data workloads; enforce governance, lineage, and cataloging.
Platform Enablement (New Platform Build-out)
Lead the blueprint and incremental rollout of a new AWS data platform, including landing, raw, and curated zones on S3, Athena for ad hoc/interactive SQL, and Snowflake for governed analytics and reporting.
Define platform SLAs/SLOs, cost guardrails, and chargeback/showback models; optimize storage/compute footprints.
Partner with DevOps to run containerized data services on EKS (e.g., stream processors, microservices, connectors) and automate with CI/CD.
Data Integration & Processing
Guide ingestion patterns: Kafka topics/partitions, retention, compaction, schema evolution (Avro/Protobuf), DLQ strategies.
Architect MuleSoft APIs/flows for system-to-system data exchange and orchestration; standardize API contracts and security.
Define Athena query strategies, partitioning, file formats (Parquet/ORC), and table metadata practices for performance/cost.
Set patterns for CDC, bulk/batch ETL/ELT, and stream processing; select fit-for-purpose transformation engines.
Analytics, Reporting & Self-Service
Shape a semantic layer and governed Snowflake models (data vault/star schemas) to serve BI and data science.
Enable business teams with Tableau dashboards, certified data sources, and governance for KPI definitions and refresh cadences.
Security, Governance & Quality
Implement data classification, encryption, access controls (RBAC/ABAC), masking/tokenization, and audit trails.
Establish data quality standards, SLOs, observability (freshness, completeness, accuracy), and automated validation.
Leadership & Collaboration
Provide architecture runway, backlog guidance, and technical mentorship for data engineers, API/streaming engineers, and BI developers.
Partner with Product, Security, and Compliance to align roadmaps, standards, and delivery milestones.
Produce decision records, diagrams, and guidance that make complex designs easy to adopt.
Required Qualifications
8+ years in data architecture/engineering with 3+ years architecting on AWS.
Proven design of S3-based data lakes with robust partitioning, lifecycle policies, and metadata/catalog strategy.
Hands-on experience with Kafka (topic design, schema evolution, consumer groups, throughput/latency tuning).
Practical MuleSoft integration design (API-led connectivity, RAML/OAS, policies, governance).
Production experience with Amazon EKS for data/streaming microservices and connectors.
Strong SQL and performance tuning with Athena; expertise selecting file formats/partitioning for cost/performance.
Data warehousing on Snowflake (ELT, clustering, resource monitors, security) and delivering analytics via Tableau.
Mastery of data modeling (3NF, dimensional/star, data vault), data contracts, and event modeling.
Solid foundations in security, IAM/KMS, networking for data platforms, and cost management.
Preferred Qualifications
Experience with schema registries, stream processing frameworks, and change data capture.
Background in data governance (catalog/lineage), metadata automation, and compliance frameworks.
Familiarity with DevOps practices for data (pipeline CI/CD, environment promotion, GitOps).
Prior work enabling self-service analytics and establishing an enterprise semantic layer.
Tools & Technologies (Environment)
AWS: S3, EKS, Athena, IAM, KMS, CloudWatch, Glue/Lake Formation (as applicable).
Streaming & Integration: Kafka (+ Schema Registry), MuleSoft.
Warehouse & BI: Snowflake, Tableau.
Data Formats: Parquet/ORC/Avro; partitioning/bucketing best practices.
Observability & Quality: Metrics, lineage, DQ checks, and alerting (tooling per org standard).