Overview
Hybrid
Depends on Experience
Full Time
Accepts corp to corp applications
Able to Provide Sponsorship
Skills
Apache Kafka
Data Flow
Scripting
Job Details
Title: Kafka Administrator
Location: New York, NY (Hybrid, 3 Days Onsite)
Role Overview
We are seeking an experienced Kafka Administrator to manage, optimize, and support enterprise-level Apache Kafka clusters. The ideal candidate will ensure the high availability, scalability, and security of Kafka environments while collaborating closely with development and infrastructure teams to support real-time data streaming and integration initiatives.
Key Responsibilities
- Kafka Cluster Management
  - Install, configure, and maintain Kafka clusters, ensuring high availability and disaster recovery.
  - Manage brokers, topics, partitions, and consumer groups for optimal utilization and data flow.
- Performance Optimization
  - Monitor cluster performance, identify bottlenecks, and fine-tune Kafka configurations to achieve low latency and high throughput.
  - Conduct periodic health checks and implement best practices for scaling and load balancing.
- Security Administration
  - Implement and maintain security policies, including authentication, authorization, and encryption.
  - Manage ACLs, SSL/TLS certificates, and integrate with organizational identity management systems.
- Monitoring and Alerting
  - Set up and manage monitoring, logging, and alerting systems using tools such as Prometheus, Grafana, or Datadog.
  - Proactively identify issues, ensure system stability, and perform root cause analysis for incidents.
- Troubleshooting and Support
  - Diagnose and resolve Kafka ecosystem issues related to performance, data loss, or connectivity.
  - Collaborate with developers and DevOps teams for debugging and incident response.
- Integration and Collaboration
  - Partner with application and data engineering teams to integrate Kafka with other platforms for real-time data streaming and ingestion.
  - Support downstream systems and pipelines that depend on Kafka-based event streaming.
- Automation and Scripting
  - Automate cluster maintenance tasks using Python, Bash, or Ansible.
  - Develop scripts to improve operational efficiency and reduce manual interventions.
- Capacity Planning and Documentation
  - Conduct capacity planning and performance forecasting to ensure scalability for future workloads.
  - Maintain comprehensive documentation on configurations, procedures, and operational workflows.
Required Skills & Qualifications
- 5-10 years of experience in Kafka Administration and distributed messaging systems.
- Strong understanding of Kafka architecture, producers, consumers, brokers, topics, and partitions.
- Experience with Kafka Connect, Schema Registry, and Kafka Streams.
- Proficiency with Linux systems administration, shell scripting, and automation tools.
- Hands-on experience with monitoring and alerting tools (Prometheus, Grafana, ELK, etc.).
- Knowledge of security configurations, including SSL, Kerberos, ACLs, and encryption protocols.
- Familiarity with cloud platforms (AWS MSK, Google Cloud Platform Pub/Sub, or Azure Event Hubs) is a plus.
- Excellent analytical, troubleshooting, and communication skills.