Role: Senior Iceberg DBA / Lakehouse Operations Engineer
Location: Remote
Experience & Qualifications
· 4–6 years of experience in Big Data / Data Operations / DBA roles
· 1+ years of experience with Apache Iceberg or a similar table format (Hive, Delta Lake, or Hudi)
· 4+ years of experience with Cloudera ecosystem (CDP)
· Hands-on experience with:
o Iceberg table operations and maintenance
o Spark SQL, Hive, or Impala
· Experience in:
o Production support and incident handling
o Monitoring, troubleshooting, and operational support
· Apply established data modeling and Lakehouse standards in day-to-day operations
· Support:
o Table structuring
o Partition alignment with ingestion patterns
· Assist in maintaining consistency of datasets across Bronze/Silver/Gold layers
Required Skills
· Strong hands-on experience with Apache Iceberg and/or Hive-based data lakes
· Understanding of data modeling concepts (normal forms) and modern Lakehouse patterns (Medallion architecture)
· Expertise in:
o Table-level optimization and performance tuning
o Large-scale data management (TB/PB scale)
· Experience with:
o Spark SQL, Hive, Impala, NiFi, Trino
· Strong understanding of:
o Partitioning strategies
o File formats (Parquet/ORC)
o Distributed query processing
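The partitioning expertise above centers on Iceberg's hidden partitioning, where partition values are derived from column transforms rather than stored as separate columns. A minimal Spark SQL sketch (catalog, schema, and column names are illustrative, not from this posting):

```sql
-- Hypothetical table: partition values are derived from transforms on
-- event_ts and account_id; no explicit partition columns are needed.
CREATE TABLE lakehouse.silver.events (
    event_id   BIGINT,
    account_id BIGINT,
    event_ts   TIMESTAMP,
    payload    STRING
) USING iceberg
PARTITIONED BY (days(event_ts), bucket(32, account_id));
```

Because the partition spec is metadata, queries filtering on `event_ts` get partition pruning automatically, without writers or readers knowing the physical layout.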
Preferred Skills
· Experience with:
o Hive-to-Iceberg or Teradata-to-Iceberg migration
o Cloudera CDP (CDE/CDW)
· Familiarity with:
o Cloud platforms (AWS, Azure)
· Scripting/automation (Python, Shell)
What You’ll Work On
· Enterprise-scale Iceberg Lakehouse platform supporting multiple applications
· Large-scale data modernization initiatives
· Performance optimization and stability of mission-critical analytical workloads
Why This Role Matters
· Ensures data correctness and performance for downstream analytics and business-critical reporting
· Enables successful modernization from legacy platforms to Iceberg
· Maintains high availability and reliability of the enterprise data layer
Job Summary
We are seeking a highly skilled Iceberg DBA / Lakehouse Operations Engineer to own the reliability, performance, and operational integrity of the Iceberg data layer powering enterprise analytics and business-critical applications.
This role operates in a large-scale, multi-engine Lakehouse environment, supporting workloads across Spark, Hive, and Impala, and plays a key role in enterprise data modernization initiatives (Hive and Teradata → Iceberg).
The ideal candidate brings deep expertise in Iceberg table operations, metadata management, and query performance optimization, ensuring consistent, high-performance data access across platforms in a cloud-based environment.
This role is critical to ensuring data accuracy and performance; any degradation directly impacts downstream reporting, analytics, and business-critical decision-making.
Key Responsibilities
Iceberg Data Layer Ownership & Operations
· Own day-to-day operations of Apache Iceberg tables supporting multiple enterprise applications
· Ensure data reliability, consistency, and availability across all Lakehouse workloads
· Maintain operational integrity for datasets at multi-terabyte to petabyte scale
Advanced Table Management & Optimization
· Execute advanced Iceberg table maintenance and optimization strategies:
o Compaction (minor/major) and small file mitigation
o Snapshot expiration and metadata compaction to control metadata growth
o Orphan file cleanup (vacuum) to maintain storage efficiency
· Optimize data layout and performance through:
o File size tuning and distribution strategies
o Partition evolution and pruning optimization
o Clustering and ordering techniques (e.g., Z-ordering or similar patterns)
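The maintenance operations listed above map onto Iceberg's built-in Spark stored procedures. A sketch of the typical calls (catalog and table names are placeholders; sizes and retention values are illustrative, not prescribed by this role):

```sql
-- Compaction: rewrite small files toward a target file size.
CALL catalog.system.rewrite_data_files(
  table => 'db.events',
  options => map('target-file-size-bytes', '536870912'));

-- Clustering: rewrite with a sort strategy (z-order shown).
CALL catalog.system.rewrite_data_files(
  table => 'db.events',
  strategy => 'sort',
  sort_order => 'zorder(account_id, event_ts)');

-- Snapshot expiration: drop snapshots outside the retention window.
CALL catalog.system.expire_snapshots(
  table => 'db.events',
  older_than => TIMESTAMP '2024-01-01 00:00:00',
  retain_last => 10);

-- Orphan file cleanup: remove files no snapshot references.
CALL catalog.system.remove_orphan_files(table => 'db.events');

-- Metadata compaction: rewrite manifests to control metadata growth.
CALL catalog.system.rewrite_manifests('db.events');

-- Partition evolution: change the spec without rewriting old data.
ALTER TABLE catalog.db.events ADD PARTITION FIELD hours(event_ts);
```

In practice these run on a schedule sized to ingestion volume; expiration and orphan cleanup must be coordinated so time-travel windows required by consumers are never shortened unexpectedly.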
Data Modeling Standards & Lakehouse Design Alignment
· Support and enforce data modeling best practices aligned with:
o Normalized data structures (3NF) for source-aligned datasets
o Medallion architecture (Bronze / Silver / Gold layers) for curated data flows
· Ensure Iceberg table design aligns with:
o Data ingestion patterns (raw vs curated layers)
o Downstream consumption and performance requirements
· Assist in structuring datasets to balance:
o Data integrity and normalization
o Query performance and analytical efficiency
· Work with data engineering teams to ensure consistent implementation of layered data architecture across multiple applications
Multi-Engine Query Performance & Consistency
· Ensure consistent and performant query behavior across:
o Spark (CDE)
o Hive / Impala (CDW)
· Troubleshoot and resolve:
o Query performance bottlenecks
o Metadata inconsistencies across engines
o Inefficient execution plans and scan patterns
Hive & Teradata Modernization Support
· Play a key role in enterprise data platform modernization (Hive and Teradata → Iceberg)
· Support:
o Schema alignment and data type mapping
o Data validation and reconciliation
· Troubleshoot migration-related issues and ensure post-migration stability and performance
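For the Hive-side migrations above, Iceberg ships Spark procedures that convert a Hive table in place or create a reviewable copy first (table names below are hypothetical; Teradata paths typically require a separate export/load step not shown here):

```sql
-- Non-destructive first step: create an Iceberg table that snapshots the
-- Hive table's current data files, for validation and reconciliation.
CALL catalog.system.snapshot('db.legacy_sales', 'db.legacy_sales_iceberg');

-- Illustrative reconciliation check against the source before cutover.
SELECT count(*) FROM db.legacy_sales_iceberg;

-- In-place migration: replace the Hive table's metadata with Iceberg
-- metadata, reusing the existing data files.
CALL catalog.system.migrate('db.legacy_sales');
```

Snapshotting first keeps the legacy table untouched while row counts, schemas, and type mappings are reconciled; `migrate` is the cutover step once validation passes.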
Metadata & Data Lifecycle Management
· Manage Iceberg metadata to ensure:
o Efficient scaling and performance
o Consistent table state across engines
· Execute lifecycle operations:
o Data retention and archival policies
o Snapshot lifecycle management and cleanup
o Time-travel optimization and maintenance
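Snapshot lifecycle work like the above leans on Iceberg's metadata tables and time-travel syntax in Spark SQL. A sketch (catalog/table names and the snapshot ID are placeholders):

```sql
-- Inspect snapshot history via the table's metadata tables.
SELECT snapshot_id, committed_at, operation
FROM catalog.db.events.snapshots
ORDER BY committed_at DESC;

-- Time travel: query the table as of a snapshot or a timestamp.
SELECT count(*) FROM catalog.db.events VERSION AS OF 1234567890123456789;
SELECT count(*) FROM catalog.db.events TIMESTAMP AS OF '2024-06-01 00:00:00';
```

The `snapshots` view is what makes retention policies auditable: expiration can be verified against it, and time-travel queries confirm which historical states remain reachable.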
Production Support, Incident Resolution & On-Call
· Provide L2/L3 support for data-related production issues across Iceberg-based Lakehouse workloads
· Participate in on-call rotation to support critical data platforms and ensure timely response to incidents
· Respond to and resolve P1/P2 production incidents within defined SLAs, minimizing impact to downstream applications and reporting
· Troubleshoot:
o Data inconsistencies and reporting discrepancies
o Query failures and performance degradation
· Perform root cause analysis (RCA) and implement preventive measures to avoid recurring issues
· Collaborate with platform and application teams during incident triage and resolution
Security & Data Governance Support
· Support fine-grained access control using:
o Apache Ranger policies and role-based access control (RBAC)
· Own data validation, reconciliation, and accuracy checks between source systems and Iceberg datasets
· Ensure secure and compliant access to data across applications