Role: Data Architect with OpenShift Experience
Location: Dallas, TX or Charlotte, NC (Onsite)
Duration: 12 Months
Identity & Access Management (IAM) Data Modernization
Migration of an on-premises SQL data warehouse to a modern enterprise Data Lake platform, enabling analytics and GenAI use cases. The platform leverages PySpark-based processing, CI/CD pipelines, and containerized deployments on OpenShift (OCP), with Google Cloud Platform as a preferred cloud platform, to deliver scalable, secure, and high-performance data solutions.
About Program/Project
The IAM Data Modernization program focuses on transforming legacy data platforms into a scalable and cloud-compatible architecture.
Key Highlights:
Integration Scope: 30+ source systems with multiple downstream integrations
Capabilities: Metrics, reporting, advanced analytics, and GenAI use cases (natural-language querying, summarisation, cross-domain insights)
Benefits:
Scalable and resilient data platform
High-performance semantic and analytics layer
Single source of truth for enterprise-wide reporting and analytics
Role Summary
We are looking for a Data Architect with strong expertise in OpenShift (OCP), PySpark, and CI/CD pipelines to design and govern scalable data platforms.
The role requires defining end-to-end data architecture, containerised deployment patterns, orchestration strategies (Airflow/Autosys), and platform standards, along with hands-on involvement in implementation.
Key Responsibilities
Data Architecture & Platform Design
Define enterprise data architecture for IAM data lake and analytics platform
Design scalable, modular, and containerised data pipeline architectures on OCP
Establish data models, schema governance, and data lifecycle strategies
Define best practices for data partitioning, performance optimisation, and cost efficiency