Role : DevOps Engineer, Cloudera Administration
Location : Scottsdale, AZ (100% Onsite)
Rate : $65
Must have :
- Strong on-prem Cloudera expertise
- Experience with Ozone, Airflow, Ranger configuration, and Atlas
- Strong architectural understanding, with the ability to guide application teams and implement best practices
- Good to know: AWS and Cloudera on AWS
Identified candidates should:
- Ensure the reliability, stability, and maintainability of the infrastructure through ongoing monitoring.
- Architect solutions, support application teams, and make effective use of automation (vs. just executing routine tasks).
Role Summary
We are seeking a DevOps Engineer with strong hands-on experience in Cloudera platform deployment, configuration, and administration. The ideal candidate will manage, automate, and optimize enterprise data platforms built on Cloudera (CDP/CDH), ensuring availability, performance, security, and cost efficiency. The role requires advanced knowledge of HDFS, Hive, HBase, Solr (Cloudera Search), Ozone, Cloudera Data Services, and Ranger, along with solid DevOps and automation practices.
Required Qualifications
- 3-8 years (adjustable) of hands-on experience administering Cloudera (CDH/CDP) in production.
- Strong administration skills with HDFS, Hive, HBase, Solr (Cloudera Search), Ozone, Cloudera Data Services, and Ranger.
- Proficiency in Linux (RHEL/CentOS/Ubuntu) systems administration and shell scripting (Bash).
- Practical experience with Kerberos, TLS, LDAP/AD, and Ranger policy management.
- Automation with Ansible and scripting in Python for operational tasks.
- Experience with monitoring and logging tools (Cloudera Manager, Grafana/Prometheus/Elastic, or equivalents).
- Solid understanding of networking (DNS, load balancing, firewalls), storage, and JVM fundamentals.
- Familiarity with Git-based CI/CD and change management practices in regulated environments.
Key Responsibilities
- Platform Deployment & Configuration
  - Install, configure, and upgrade Cloudera clusters using Cloudera Manager across on-prem and/or cloud environments.
  - Provision and configure services: HDFS, Hive, HBase, Solr (Cloudera Search), Ozone, Data Services (e.g., Data Engineering, Data Warehouse, Machine Learning), and Ranger.
  - Set up and manage Kerberos, TLS/SSL, AD/LDAP integration, and Ranger policies for fine-grained access control.
- Operations & Reliability
  - Own day-to-day cluster administration: capacity planning, quota management, service restarts, rolling upgrades/patching, and backup/restore.
  - Monitor and tune cluster and service performance (NameNode/ResourceManager health, tuning, YARN queues, Hive LLAP/Tez, HBase region servers, Solr cores/collections).
  - Implement SLA/SLO monitoring, alerting, and dashboards; drive root-cause analysis and incident response.
- Security & Governance
  - Maintain a secure environment via Ranger policies, Kerberos principals/keytabs, TLS certificates, and compliance checks.
  - Support data governance and auditing requirements; integrate with enterprise secrets and key management as needed.
- Automation & DevOps
  - Build and maintain Infrastructure as Code (IaC) and configuration automation (e.g., Ansible, Terraform).
  - Develop operational runbooks and automation in Python/Bash for provisioning, patching, and routine admin tasks.
  - Integrate platform workflows with CI/CD (Git, pipelines) for repeatable, version-controlled changes.
- Data Services & Ecosystem Support
  - Administer and optimize Hive metastore, ACID tables, compactions, and query engines (Tez/Spark).
  - Manage HBase schemas, region splitting/balancing, and performance tuning.
  - Operate Solr (Cloudera Search) for indexing, schema management, and query performance.
  - Support Ozone object store operations, storage policies, and migration use cases.
  - Collaborate with data engineering teams on job orchestration, resource management, and troubleshooting.
- Reliability Engineering
  - Implement high availability (HA) for critical components; design DR strategies and execute failover tests.
  - Optimize capacity and cost across compute/storage; recommend right-sizing and lifecycle policies.
- Documentation & Collaboration
  - Create and maintain architecture diagrams, topology, SOPs/runbooks, and security documentation.
  - Partner with platform, security, and data engineering teams; provide L2/L3 support and knowledge transfer.
Thanks
Nikhil
Email id: