Data Architect (with OpenShift - OCP)

Hybrid in Charlotte, NC, US • Posted 5 hours ago • Updated 5 hours ago
Contract Corp To Corp
Contract W2
Hybrid
$80 - $82/hr

Job Details

Skills

  • PySpark
  • Spark
  • OpenShift
  • OCP

Summary

Project: Identity & Access Management (IAM) Data Modernization

The project migrates an on-premises SQL data warehouse to a modern enterprise Data Lake platform, enabling analytics and GenAI use cases. The platform uses PySpark-based processing, CI/CD pipelines, and containerized deployments on OpenShift (OCP), with Google Cloud Platform as the preferred cloud platform, to deliver scalable, secure, and high-performance data solutions.

About Program/Project

The IAM Data Modernization program focuses on transforming legacy data platforms into a scalable and cloud-compatible architecture.

Key Highlights:

  • Integration Scope: 30+ source systems with multiple downstream integrations
  • Capabilities: Metrics, reporting, advanced analytics, and GenAI use cases (NL querying, summarisation, cross-domain insights)
  • Benefits:
    • Scalable and resilient data platform
    • High-performance semantic and analytics layer
    • Single source of truth for enterprise-wide reporting and analytics

Role Summary

We are looking for a Data Architect with strong expertise in OpenShift (OCP), PySpark, and CI/CD pipelines to design and govern scalable data platforms.

The role requires defining end-to-end data architecture, containerised deployment patterns, orchestration strategies (Airflow/Autosys), and platform standards, along with hands-on involvement in implementation.

Key Responsibilities

Data Architecture & Platform Design

  • Define enterprise data architecture for IAM data lake and analytics platform
  • Design scalable, modular, and containerised data pipeline architectures on OCP
  • Establish data models, schema governance, and data lifecycle strategies
  • Define best practices for data partitioning, performance optimisation, and cost efficiency

OpenShift (OCP) & Platform Engineering

  • Architect and govern containerised data workloads on OpenShift (OCP)
  • Define standards for deployment, scaling, and workload isolation
  • Collaborate with DevOps teams for platform engineering and infrastructure alignment

Big Data & Processing (PySpark Focus)

  • Define architecture for PySpark-based batch and near real-time processing pipelines
  • Provide guidance on distributed processing design, optimisation, and performance tuning
  • Establish reusable frameworks for ETL/ELT processing

Data Ingestion & Orchestration

  • Architect data ingestion frameworks (batch, streaming, CDC)
  • Define orchestration strategies using Airflow / Autosys
  • Implement standards for retry, backfills, dependency management, and error handling

DevOps / CI-CD

  • Define and oversee CI/CD strategy for data and platform deployments
  • Enable automation of build, test, and deployment processes
  • Ensure integration of CI/CD pipelines with OCP-based environments

Cloud & Data Platforms (Preferred)

  • Provide architecture guidance for Google Cloud Platform-based data platforms (preferred, not mandatory)
  • Define integration patterns for hybrid environments that combine cloud-native and on-premises systems
  • Guide teams on cloud migration strategies and modern data platform adoption

Data Governance, Quality & Observability

  • Define frameworks for:
    • Data quality, validation, and lineage
    • Metadata management and cataloguing
  • Establish monitoring, logging, alerting, and SLOs for platform reliability
  • Ensure compliance with data security and audit requirements

Stakeholder Collaboration

  • Work closely with client architects, IAM teams, and business stakeholders
  • Translate business requirements into scalable technical architecture
  • Provide architectural guidance and mentorship to engineering teams

Required Skills

Core Skills (Must Have)

  • Strong experience in:
    • OpenShift (OCP) / Kubernetes-based platforms
    • PySpark / Spark ecosystem
    • CI/CD implementation for data platforms
    • Airflow / Autosys orchestration tools
  • Solid understanding of:
    • Data lake architectures (layered models)
    • ETL/ELT design patterns
    • Distributed data processing concepts

Data Engineering & Storage

  • Expertise in:
    • Data formats: Parquet, ORC, Avro
    • Partitioning and performance tuning
    • Large-scale data modelling for analytics

Cloud (Preferred, Not Mandatory)

  • Experience with Google Cloud Platform (GCP) is preferred
  • Exposure to services like BigQuery, Dataproc, Dataflow, GCS is a plus

Observability & Reliability

  • Experience defining:
    • Monitoring, logging, alerting frameworks
    • Dashboards, SLOs, and operational runbooks

Good to Have

  • Experience with IAM domain / cybersecurity data
  • Understanding of data security and access control frameworks
  • Exposure to GenAI-enabled data platforms
  • Experience in Agile delivery and team leadership

Qualifications

  • Experience:
    • 10-14+ years in Data Architecture / Data Engineering
    • Strong experience in OCP, PySpark, CI/CD, and orchestration frameworks
    • Prior experience in data modernisation / migration programs
  • Education:
    • Bachelor's/Master's in Computer Science, Information Systems, or equivalent
  • Certifications (Preferred):
    • OpenShift / Kubernetes certifications
    • Google Cloud Platform certifications (preferred, not mandatory)

  • Dice Id: 91131106
  • Position Id: 8965510

Company Info

About Rivago Infotech Inc

Rivago Infotech Inc has been a leader in IT staffing and software development for over 5 years and is one of the largest diversity and development firms in the industry. We are known for our high-touch, customer-centric approach, offering our clients unmatched quality, responsiveness, and flexibility. Our clients value our streamlined execution, highly efficient service, and exceptional talent management, which go above and beyond traditional staffing services.
