Data Platform Architect
Location: Plano, TX (onsite)
Key Job Duties
1. Architecture Design & Strategy
Design and develop end-to-end data architecture to support analytics, reporting, and machine learning initiatives.
Define the overall data platform strategy, including data ingestion, storage, processing, and access layers.
Select and implement appropriate data storage solutions (e.g., data lakes, data warehouses, or lakehouses).
Create reference architecture blueprints aligning with enterprise standards and business goals.
2. Data Infrastructure & Engineering
Build and maintain scalable, high-performance data pipelines using technologies such as Apache Spark, Databricks, Kafka, or AWS Glue.
Integrate data from multiple internal and external sources into a unified platform.
Optimize data ingestion, transformation, and ETL/ELT processes for reliability and performance.
Implement data partitioning, indexing, and caching strategies for faster query performance.
3. Data Governance, Security & Compliance
Define and enforce data governance standards, including metadata management and data lineage tracking.
Ensure compliance with data privacy regulations (e.g., GDPR, HIPAA).
Implement robust data security and access control mechanisms across environments.
Collaborate with InfoSec and compliance teams to ensure proper audit trails and access management.
4. Cloud & Platform Management
Design and manage cloud-based data platforms (e.g., AWS, Azure, Google Cloud Platform).
Configure cloud data services such as Amazon Redshift, Snowflake, BigQuery, or Azure Synapse Analytics.
Manage data storage costs and performance through monitoring and optimization.
5. Collaboration & Leadership
Partner with data engineers, analysts, and business stakeholders to translate business needs into data solutions.
Provide technical leadership and mentorship to engineering teams.
Lead data architecture reviews, maintain documentation, and enforce best practices.
Communicate architectural decisions and trade-offs to both technical and non-technical audiences.
6. Performance & Scalability
Continuously monitor system performance and optimize data pipelines for scalability.
Implement data quality frameworks to ensure accurate and consistent data.
Evaluate new technologies and tools for improving data processing efficiency.
Typical Tools & Technologies
Languages: SQL, Python, Scala
Data Processing & Orchestration: Apache Spark, Databricks, Airflow
Databases: Snowflake, Redshift, BigQuery, Azure Synapse, PostgreSQL
Streaming: Kafka, Kinesis
Cloud Platforms: AWS, Azure, Google Cloud Platform
Data Modeling Tools: ER/Studio, dbt, Power BI Dataflows