Cerebra Consulting Inc is a System Integrator and IT Services Solution provider with a focus on Big Data, Business Analytics, Cloud Solutions, Amazon Web Services, Salesforce, Oracle EBS, PeopleSoft, Hyperion, Oracle Configurator, Oracle CPQ, Oracle PLM, and Custom Application Development. Utilizing solid business experience, industry-specific expertise, and proven methodologies, we consistently deliver measurable results for our customers. Cerebra has partnered with leading enterprise software companies and cloud providers such as Oracle, Salesforce, and Amazon, and leverages these partner relationships to deliver high-quality, end-to-end solutions targeted to the needs of each customer.
Hello,
I hope you are doing well.
Job Title: Lead Cloudera Consultant (Solution Architect)
Type: Contract through July 2026, potential to extend
Schedule: 100% remote
Required Skills:
Apache NiFi (hands-on)
Apache Kudu
Apache Flink (hands-on development)
Cloudera SQL Stream Builder (SSB)
Cloudera CDP (Streaming / CDF)
SQL-based CDC ingestion
Lead Cloudera Streaming Architect (CDP | NiFi | Kafka | Flink | Kudu | SSB)
About the Role
We are seeking a Lead Cloudera Streaming Architect with deep, hands-on experience across the Cloudera CDP streaming stack, including NiFi, Kafka, Flink, Kudu/Impala, and SQL Stream Builder (SSB). This is a highly technical, architecture-plus-implementation role responsible for designing, delivering, and optimizing mission-critical real-time data pipelines at enterprise scale.
If you have personally built end-to-end CDP/CDF streaming pipelines and can execute complex ingestion, transformation, CDC, and Kudu write-path use cases on day one - this role is for you.
What You'll Do
Streaming Architecture & Implementation
- Architect and build real-time data pipelines using the full Cloudera Data Platform (CDP) streaming suite:
- NiFi → Kafka → Flink → Kudu/Impala → SSB
- Own architectural decisions, patterns, and best practices for streaming, CDC, state management, schema evolution, and exactly-once delivery.
- Develop complex NiFi flows involving controller services (DBCP/JDBC), stateful processors, record processors, schema registry integrations, batch-to-stream conversions, and high-volume ingestion patterns.
- Build and optimize Flink SQL or DataStream API jobs with (see the Flink SQL sketch after this list):
  - Kafka sources/sinks
  - event-time windows
  - watermarks
  - state management
  - checkpointing / savepoints
  - exactly-once guarantees
- Design and tune Kudu tables (PKs, partitioning, distribution, upserts, deletes, merges).
- Build and deploy streaming SQL jobs using Cloudera SQL Stream Builder (SSB).
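For concreteness, here is a minimal sketch of the kind of streaming SQL job SSB runs: a Kafka source with an event-time watermark feeding a tumbling-window aggregation. Topic, field, and broker names are hypothetical, and it assumes the standard Flink Kafka SQL connector.

    -- Hypothetical names throughout; assumes the Flink Kafka SQL connector.
    CREATE TABLE orders (
      order_id STRING,
      amount   DOUBLE,
      order_ts TIMESTAMP(3),
      -- event-time watermark tolerating 5 seconds of out-of-order data
      WATERMARK FOR order_ts AS order_ts - INTERVAL '5' SECOND
    ) WITH (
      'connector' = 'kafka',
      'topic' = 'orders',
      'properties.bootstrap.servers' = 'broker:9092',
      'scan.startup.mode' = 'earliest-offset',
      'format' = 'json'
    );

    -- One-minute tumbling windows computed on event time, not arrival time.
    SELECT window_start, window_end, SUM(amount) AS total_amount
    FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(order_ts), INTERVAL '1' MINUTE))
    GROUP BY window_start, window_end;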
Use Case Delivery
You must be able to deliver the following four core use cases immediately:
- NiFi → Snowflake → Impala/Kudu ingestion pipeline
- Kafka → Flink streaming (real-time processing)
- Flink → Kafka sink with exactly-once semantics (sketch after this list)
- CDC ingestion via NiFi, Flink CDC, or SSB (incremental keys, late events, deletes)
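For the exactly-once sink case, a minimal sketch assuming the Flink Kafka SQL connector (1.15+) with checkpointing enabled; table, topic, and broker names are hypothetical.

    -- Checkpointing must be on: the sink commits a Kafka transaction per checkpoint.
    SET 'execution.checkpointing.interval' = '30s';

    CREATE TABLE enriched_orders (
      order_id STRING,
      total    DOUBLE
    ) WITH (
      'connector' = 'kafka',
      'topic' = 'enriched-orders',
      'properties.bootstrap.servers' = 'broker:9092',
      'format' = 'json',
      -- two-phase-commit sink backed by Kafka transactions
      'sink.delivery-guarantee' = 'exactly-once',
      'sink.transactional-id-prefix' = 'enriched-orders-sink'
    );

    -- Append-only query; an updating aggregate would need the upsert-kafka connector.
    INSERT INTO enriched_orders
    SELECT order_id, amount AS total FROM orders;

Downstream consumers must read with isolation.level=read_committed, and the broker's transaction timeout has to exceed the checkpoint interval, or in-flight transactions get aborted.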
Optimization, Monitoring & Governance
- Tune NiFi, Kafka, and Flink clusters for performance, throughput, and stability.
- Implement schema governance, error handling, back-pressure strategies, and replay mechanisms.
- Work closely with platform engineers to optimize CDP components and CDF deployments.
- Provide architectural guidance, documentation, and mentorship to engineering teams.
Required Experience
You must have hands-on, production-grade experience with ALL of the following:
Cloudera CDP / CDF
- CDP Public Cloud or Private Cloud Base
- Cloudera Flow Management (NiFi + NiFi Registry)
- Cloudera Streams Messaging (Kafka, SMM)
- Cloudera Stream Processing (Flink, SSB)
- Kudu / Impala ecosystem
Apache NiFi (Advanced)
- Building complex flows (not just admin/ops)
- QueryDatabaseTable / GenerateTableFetch / MergeRecord
- Record-based processors & schema registry
- JDBC / DBCP controller services
- Stateful processors & incremental ingestion
- NiFi → Snowflake integration
- NiFi → Kudu ingestion patterns
Apache Kafka
- Kafka brokers, partitions, retention, replication, consumer groups
- Schema registry (Avro/JSON)
- Designing topics for high-throughput streaming
Apache Flink
- Flink SQL + DataStream API
- Event-time processing, watermarks, windows
- Checkpointing, savepoints, state backends
- Kafka source/sink connectors
- Exactly-once semantics
- Flink CDC (a plus)
Apache Kudu
- Table design (PKs, partition strategies); see the Impala DDL sketch after this list
- Upserts, deletes, merge semantics
- Integration with Impala
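As one concrete illustration, a hash-partitioned Kudu table created through Impala, with an upsert against its primary key; table and column names are hypothetical.

    -- Kudu table managed via Impala: explicit PK, hash partitioning for write spread.
    CREATE TABLE events (
      event_id BIGINT,
      event_ts TIMESTAMP,
      payload  STRING,
      PRIMARY KEY (event_id)
    )
    PARTITION BY HASH (event_id) PARTITIONS 16
    STORED AS KUDU;

    -- Upserts are idempotent on the primary key: insert-or-overwrite semantics.
    UPSERT INTO events VALUES (42, now(), 'updated payload');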
SQL Stream Builder (SSB)
- Creating jobs, connectors, materialized views
- Deploying and monitoring Flink SQL jobs in CDP
CDC (Change Data Capture)
- CDC via NiFi, Flink CDC, or SSB (a Flink CDC sketch follows this list)
- Handling late-arriving events
- Handling deletes, updates, schema evolution
- Incremental key tracking
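As a reference point, a minimal Flink CDC source sketch, assuming the open-source flink-cdc 'mysql-cdc' connector is on the classpath; host, database, table, and credential values are placeholders.

    -- Hypothetical connection details; requires the flink-cdc mysql-cdc connector jar.
    CREATE TABLE customers_cdc (
      id         INT,
      name       STRING,
      updated_at TIMESTAMP(3),
      PRIMARY KEY (id) NOT ENFORCED
    ) WITH (
      'connector' = 'mysql-cdc',
      'hostname' = 'db-host',
      'port' = '3306',
      'username' = 'cdc_user',
      'password' = '********',
      'database-name' = 'crm',
      'table-name' = 'customers'
    );
    -- Inserts, updates, and deletes arrive as a changelog stream; sinking into
    -- Kudu upserts/deletes keeps the target table in sync with the source.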
General Requirements
- 8+ years in data engineering / streaming
- 3-5+ years specifically with CDP/CDF streaming
- Strong SQL and distributed system fundamentals
- Experience in financial services, healthcare, telecom, or other high-volume industries preferred
Nice to Have
- Kubernetes experience running NiFi/Kafka/Flink operators
- Snowflake ingestion patterns (staging, COPY INTO); see the example after this list
- Experience with Debezium
- CI/CD for data pipelines
- Security (Kerberos, Ranger, Atlas)
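On the Snowflake point, a small example of the staged-load pattern (stage, table, and format values are hypothetical):

    -- Load staged JSON files into a raw table with a single VARIANT column.
    COPY INTO raw.events
    FROM @raw.events_stage
    FILE_FORMAT = (TYPE = 'JSON')
    ON_ERROR = 'CONTINUE';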
What Success Looks Like
In the first 90 days, you will:
- Deliver at least two of the four required streaming use cases end-to-end
- Establish architectural patterns for NiFi, Flink, and Kudu pipelines
- Optimize one existing pipeline for throughput, latency, and reliability
- Become the subject-matter expert for Data in Motion on CDP
Apply If You Can Demonstrate
- You have personally built NiFi → Kafka → Flink → Kudu pipelines
- You understand event-time processing and exactly-once delivery
- You have designed Kudu tables and worked with Impala
- You have authored and deployed SSB SQL streaming jobs
- You can speak to real-world CDC implementations
Thanks,
Sudhanshu Srivastava
US IT Recruiter
Cerebra Consulting Inc,
Email -
Website|
PARTNERS| Oracle | Amazon | Salesforce | Hortonworks | Cloudera | MapR
AWARDS| Philadelphia 100 | INC5000 | CIO Top 10 Oracle Providers 2018
#Stay Safe Stay Healthy