Cloudera Public Cloud Platform Engineer

Remote • Posted 2 hours ago • Updated 2 hours ago
Contract W2
Contract Corp To Corp
Contract Independent
No Travel Required
Able to Sponsor
Remote
Depends on Experience

Job Details

Skills

  • Apache NiFi
  • Big Data
  • Cloudera
  • CDP
  • Cloud

Summary

Role: Cloudera Public Cloud Platform Engineer
Work location: Remote work accepted from anywhere in US
 
  • 12+ years of experience in Big Data Platform Engineering / Cloud Platform Operations / Infrastructure roles
  • 6+ years of hands-on experience with Cloudera ecosystem (CDH/CDP/Cloudera Public Cloud)
  • Demonstrated ability to quickly learn and adapt to new technologies and evolving platform capabilities, beyond the currently defined CDP stack
  • Strong expertise in:
    • End-to-end CDP platform operations (CDE, CDW, CDF, CDL, CAI)
    • Advanced troubleshooting across multi-cluster, multi-environment deployments
    • Kubernetes-based runtime environments (troubleshooting and diagnostics)
    • Observability frameworks, including SLIs/SLOs, alerting, and performance tuning
  • Proven experience in:
    • Leading P1/P2 incident response, triage, and resolution
    • Managing platform upgrades, patching, and lifecycle events
    • Supporting large-scale environments (TB/PB scale, high-concurrency workloads)
  • Strong understanding of:
    • Cloud infrastructure (IAM, VPC, networking, storage)
    • Security and governance (Ranger, Kerberos, TLS/SSL, SDX)
  • Expected to:
    • Lead complex troubleshooting and drive root cause resolution across platform layers
    • Mentor and guide L2 engineers
    • Coordinate with Cloudera support and infrastructure teams for critical issues
  • Hands-on experience in developing and troubleshooting NiFi (CDF) data flows, including:
    • Flow design and configuration
    • Processor-level debugging and performance tuning
    • Handling backpressure, throughput optimization, and failure recovery
 
Required Skills
  • Strong experience with Cloudera CDP Public Cloud
  • Expertise in:
    • Cloud platforms (AWS/Azure/Google Cloud Platform)
    • Kubernetes concepts (troubleshooting-focused)
  • Hands-on with:
    • CDE, CDW, CDF (NiFi), CAI
  • Knowledge of:
    • IAM, networking, observability tools
  • Experience with platforms operating at multi-terabyte to petabyte scale with high-concurrency workloads
  • Hands-on experience with:
    • Kafka (or similar streaming platforms) including monitoring, troubleshooting, and performance tuning
  • Experience with Cloudera CDP CLI (Command Line Interface) for:
    • Platform operations and administration
    • Job execution and service management (CDE/CDW/CDL)
    • Automation of routine operational tasks
  • Strong working knowledge of:
    • Cloud IAM (AWS IAM / Azure AD) including roles, policies, and cross-service access
    • User and group mapping across CDP, cloud IAM, and Ranger policies
    • Troubleshooting access issues across storage (S3/ADLS), CDP services, and data access layers
 
Preferred Skills
  • Experience with:
    • Modernization of legacy data platforms/applications to Cloudera CDP Public Cloud
    • Migration and onboarding of workloads to CDE, CDW, and CAI environments
    • Supporting hybrid or multi-environment transitions (on-prem → cloud)
  • Familiarity with:
    • Cloud platforms (AWS, Azure, Google Cloud Platform) including storage, IAM, and networking concepts
    • Kubernetes-based runtime environments (troubleshooting-focused)
  • Strong scripting and automation skills (Python, Shell, Terraform) for platform operations
 
What You’ll Work On
  • Enterprise-scale Cloudera CDP platform supporting data engineering, analytics, and AI workloads across multiple applications
  • Modernization of legacy platforms and applications into cloud-native CDP services
  • Operational support and scaling of:
    • Data services (CDE, CDW, CDF, CDL)
    • AI/ML platforms (CAI, inference, workbenches)
  • Platform performance optimization, observability, and reliability engineering for mission-critical workloads
 
Why This Role Matters
  • Ensures availability, stability, and performance of the CDP platform supporting all data and AI workloads
  • Enables successful modernization of legacy applications into scalable, cloud-native services
  • Maintains high availability, observability, and operational excellence across enterprise platforms
  • Acts as the backbone for data engineering, analytics, and AI initiatives
  • This role focuses on platform reliability and infrastructure operations and does not include data-layer ownership (e.g., Iceberg table management or data validation).
 
 
Job Summary
We are seeking a highly skilled Cloudera Public Cloud Platform Engineer to operate and manage the end-to-end CDP platform ecosystem, including data services, NiFi, Kafka, AI/ML platforms, and enterprise observability.
 
This role is responsible for ensuring availability, scalability, security, and performance of all platform services supporting data, analytics, and AI workloads across environments.
 
The ideal candidate brings strong expertise in CDP on-prem, public cloud services, cloud infrastructure, Kubernetes-based runtime environments, and platform observability, supporting high-concurrency, mission-critical workloads at multi-terabyte to petabyte scale.
 
This role is critical to ensuring uninterrupted operation of data, analytics, and AI platforms—any degradation directly impacts downstream business reporting, data pipelines, and model execution.
 
 
Key Responsibilities
 
CDP Platform & Multi-Service Operations
  • Own end-to-end operational responsibility for Cloudera Public Cloud services across Dev / Stage / UAT / Prod:
    • CDE, CDW, COD, CDL, CDF (NiFi), CDV, CAI, Kafka
  • Ensure multi-cluster stability, workload isolation, and SLA adherence
  • Support onboarding and operations of multiple applications across environments
  • Manage and support multi-environment, multi-cluster deployments with strict isolation, governance, and release coordination across Dev/UAT/Prod
 
AI/ML Platform Operations
  • Operate and support Cloudera AI (CAI) environments:
    • AI Workbenches, AI Studios
    • Model training and development environments
    • AI inference endpoints and model serving
  • Troubleshoot:
    • Resource contention (CPU/GPU)
    • Model deployment/runtime failures
 
CDP Runtime & Kubernetes-Aware Operations
  • Operate CDP services running on Cloudera-managed Kubernetes infrastructure
  • Apply strong understanding of containerized workloads and Kubernetes concepts for troubleshooting
  • Diagnose and resolve:
    • Pod failures, restarts, and resource contention
    • Spark job failures in containerized environments (CDE)
    • Service-to-service communication issues
  • Analyze logs and metrics to identify runtime failures and performance issues
  • Collaborate with Cloudera support for managed service-level issues
 
Data Integration & Platform Services
  • Operate and support:
    • CDF (NiFi) for ingestion pipelines
    • CDV (Data Visualization) for reporting workloads
    • Octopai for data lineage and catalog integration
  • Ensure reliability and performance of data pipelines and integrations
  • Monitor and troubleshoot Kafka environments:
    • Topic configurations, partitions, and replication
    • Consumer lag and throughput issues
    • Broker connectivity and performance bottlenecks
 
Security, Governance & SDX Administration
  • Implement and manage:
    • Kerberos, TLS/SSL, Ranger policies
  • Administer SDX for:
    • Centralized security
    • Metadata and policy enforcement
  • Support Atlas and Octopai integration
  • Manage and troubleshoot user access and identity mapping across layers, including:
    • Cloud IAM roles and permissions
    • CDP users/groups and identity providers
    • Ranger policies for fine-grained data access
  • Resolve access-related issues impacting:
    • Data access (S3/ADLS)
    • Query execution (CDW/CDE)
    • Application and service-level permissions
 
Cloud Infrastructure & Networking
  • Troubleshoot:
    • S3 / ADLS storage issues
    • IAM roles and permissions
    • VPC, subnets, routing, security groups
    • Bastion host access and connectivity
  • Ensure secure and reliable connectivity across services
  • Understand and troubleshoot S3-based data lake patterns, including:
    • Bucket structure, prefix design, and access patterns
    • Performance issues related to small files, request rates, and throughput limits
    • Encryption (SSE-S3, SSE-KMS) and access policies
  • Manage and troubleshoot cross-account IAM roles and access patterns for CDP environments
  • Ensure secure access between:
    • CDP environments and cloud resources
    • Multiple AWS accounts (dev/prod separation)
 
Disaster Recovery & Resiliency
  • Support and validate disaster recovery and failover strategies across CDP environments
  • Ensure backup, recovery, and environment resiliency for critical workloads
  • Participate in DR drills and recovery validation
 
Observability, Monitoring & Alerting (Critical)
  • Implement and manage end-to-end observability:
    • Metrics, logs, and alerting
  • Use:
    • Cloudera Observability, Cloudera Manager, Prometheus, Grafana
  • Monitor:
    • Cluster health
    • Workload performance
    • AI inference endpoints
  • Enable proactive issue detection and prevention
  • Define and implement SLIs/SLOs and alerting thresholds to ensure platform reliability and performance
  • Support high-severity (P1/P2) incident response, triage, and resolution within defined SLAs
 
Operational Support & On-Call
  • Participate in on-call rotation to support 24/7 platform operations
  • Respond to production incidents, alerts, and service disruptions within defined SLAs
  • Handle P1/P2 incidents, including triage, troubleshooting, and resolution
  • Perform root cause analysis (RCA) and implement preventive measures
 
Upgrades, Patching & Platform Lifecycle
  • Execute:
    • CDP upgrades and version management
    • Security patches and hotfixes
  • Perform:
    • Rolling upgrades
    • Validation and rollback strategies
 
Performance Optimization & Cost Efficiency
  • Optimize:
    • Platform-level performance (Spark, Hive, Impala workloads)
    • Cluster utilization and workload distribution
  • Drive:
    • Autoscaling strategies
    • Cost optimization (FinOps practices)
 
Automation & Operational Excellence
  • Utilize and support existing automation frameworks for:
    • Platform provisioning
    • Monitoring and alerting
    • Routine operational tasks
  • Work with infrastructure teams that manage Infrastructure-as-Code (Terraform) for environment setup and changes
  • Leverage scripting (Python / Shell) for:
    • Operational support
    • Task automation
    • Troubleshooting and diagnostics
  • Maintain and follow runbooks, SOPs, and operational procedures to ensure consistent platform operations

Dice Id: 91142718
Position Id: 8936897

Company Info

About Pacific Consultancy Services

Pacific Consultancy Services, founded with an ambitious vision in 2013, is a prominent IT Consulting and Service Delivery firm. The company is built upon the pillars of exceptional customer-centric solutions, streamlined processes, and impeccable technical expertise.

With nearly two decades of experience, Pacific Consultancy Services has been at the forefront of delivering intelligent solutions to clients worldwide, including the United States. Its specialized offerings encompass Artificial Intelligence, Machine Learning, Blockchain, Cloud services, IoT, DevOps, IT Staff Augmentation, and Cognitive Analytics, all contributing to achievable and profitable business models. Ensuring quality across all service domains is a priority for Pacific Consultancy Services. Its services span from IT Staff Augmentation to Digital Transformation, IT Consulting, and Emerging Technologies.

The work ethos of Pacific Consultancy Services revolves around its core “Model of Delivery,” aimed at providing clients with the best possible solutions.
