Data Architect (with OpenShift - OCP)

Hybrid in Charlotte, NC, US • Posted 5 hours ago • Updated 5 hours ago

Contract Corp To Corp

Contract W2

Hybrid

$80 - $82/hr

Rivago Infotech Inc

Job Details

Skills

PySpark
Spark
OpenShift
OCP

Summary

Project: Identity & Access Management (IAM) Data Modernization

Migration of an on-premises SQL data warehouse to a modern enterprise Data Lake platform, enabling analytics and GenAI use cases. The platform leverages PySpark-based processing, CI/CD pipelines, and containerized deployments on OpenShift (OCP), with Google Cloud Platform as a preferred cloud platform, to deliver scalable, secure, and high-performance data solutions.

About Program/Project

The IAM Data Modernization program focuses on transforming legacy data platforms into a scalable and cloud-compatible architecture.

Key Highlights:

Integration Scope: 30+ source systems with multiple downstream integrations
Capabilities: Metrics, reporting, advanced analytics, and GenAI use cases (NL querying, summarisation, cross-domain insights)
Benefits:

Scalable and resilient data platform
High-performance semantic and analytics layer
Single source of truth for enterprise-wide reporting and analytics

Role Summary

We are looking for a Data Architect with strong expertise in OpenShift (OCP), PySpark, and CI/CD pipelines to design and govern scalable data platforms.

The role requires defining end-to-end data architecture, containerised deployment patterns, orchestration strategies (Airflow/Autosys), and platform standards, along with hands-on involvement in implementation.

Key Responsibilities

Data Architecture & Platform Design

Define enterprise data architecture for IAM data lake and analytics platform
Design scalable, modular, and containerised data pipeline architectures on OCP
Establish data models, schema governance, and data lifecycle strategies
Define best practices for data partitioning, performance optimisation, and cost efficiency

OpenShift (OCP) & Platform Engineering

Architect and govern containerised data workloads on OpenShift (OCP)
Define standards for deployment, scaling, and workload isolation
Collaborate with DevOps teams for platform engineering and infrastructure alignment

Big Data & Processing (PySpark Focus)

Define architecture for PySpark-based batch and near real-time processing pipelines
Provide guidance on distributed processing design, optimisation, and performance tuning
Establish reusable frameworks for ETL/ELT processing

Data Ingestion & Orchestration

Architect data ingestion frameworks (batch, streaming, CDC)
Define orchestration strategies using Airflow / Autosys
Implement standards for retries, backfills, dependency management, and error handling

DevOps / CI-CD

Define and oversee CI/CD strategy for data and platform deployments
Enable automation of build, test, and deployment processes
Ensure integration of CI/CD pipelines with OCP-based environments

Cloud & Data Platforms (Preferred)

Provide architecture guidance for Google Cloud Platform-based data platforms (preferred, not mandatory)
Define integration patterns for hybrid cloud-native and on-premises environments
Guide teams on cloud migration strategies and modern data platform adoption

Data Governance, Quality & Observability

Define frameworks for:

Data quality, validation, and lineage
Metadata management and cataloguing

Establish monitoring, logging, alerting, and SLOs for platform reliability
Ensure compliance with data security and audit requirements

Stakeholder Collaboration

Work closely with client architects, IAM teams, and business stakeholders
Translate business requirements into scalable technical architecture
Provide architectural guidance and mentorship to engineering teams

Required Skills

Core Skills (Must Have)

Strong experience in:

OpenShift (OCP) / Kubernetes-based platforms
PySpark / Spark ecosystem
CI/CD implementation for data platforms
Airflow / Autosys orchestration tools

Solid understanding of:

Data lake architectures (layered models)
ETL/ELT design patterns
Distributed data processing concepts

Data Engineering & Storage

Expertise in:

Data formats: Parquet, ORC, Avro
Partitioning and performance tuning
Large-scale data modelling for analytics

Cloud (Preferred, Not Mandatory)

Experience with Google Cloud Platform (GCP) (preferred)
Exposure to services such as BigQuery, Dataproc, Dataflow, and GCS is a plus

Observability & Reliability

Experience defining:

Monitoring, logging, alerting frameworks
Dashboards, SLOs, and operational runbooks

Good to Have

Experience with IAM domain / cybersecurity data
Understanding of data security and access control frameworks
Exposure to GenAI-enabled data platforms
Experience in Agile delivery and team leadership

Qualifications

Experience:

10-14+ years in Data Architecture / Data Engineering
Strong experience in OCP, PySpark, CI/CD, and orchestration frameworks
Prior experience in data modernisation / migration programs

Education:

Bachelor's/Master's in Computer Science, Information Systems, or equivalent

Certifications (Preferred):

OpenShift / Kubernetes certifications
Google Cloud Platform certifications (preferred, not mandatory)

Dice Id: 91131106
Position Id: 8965510

Company Info

About Rivago Infotech Inc

Rivago Infotech Inc has been a leader in IT staffing and software development for over 5 years and is one of the largest diversity and development firms in the industry. We are known for our high-touch, customer-centric approach, offering our clients unmatched quality, responsiveness, and flexibility. Our clients appreciate our streamlined execution, highly efficient service, and exceptional talent management, which go above and beyond traditional staffing services.
