Senior Kafka Engineer @Irvine, CA (Hybrid).

  • Irvine, CA
  • Posted 7 hours ago | Updated 7 hours ago

Overview

Hybrid
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 6 Month(s)

Skills

Apache Kafka
Amazon Web Services
AWS
Confluent
Kubernetes
EKS
API
Kafka
Python
Terraform

Job Details

Job Description:

Client is seeking a Senior Kafka Engineer to manage, enhance, and scale an enterprise-grade Apache Kafka implementation deployed on AWS and the Confluent Platform. This person will be responsible for keeping the system reliable, improving it over time, and expanding it to support new applications.
This role involves performing detailed architectural reviews, monitoring, performance tuning, optimizing existing Kafka pipelines, and partnering with application teams to deliver reliable, secure, and performant streaming solutions.

Qualifications:

  • 8+ years in platform engineering with 3+ years of hands-on experience with Apache Kafka.
  • Expertise with Confluent Platform (Brokers, Schema Registry, Control Center, ksqlDB).
  • Experience deploying and managing Kafka on AWS (including MSK or self-managed EC2-based setups).
  • Solid understanding of Kubernetes, especially EKS, for microservices integration.
  • Familiarity with monitoring and alerting stacks: Prometheus, Grafana, ELK, or similar.
  • Experience managing and supporting cloud and on-premises Kafka platforms.
  • Experience with capacity management.
  • Good understanding of SRE principles and methodologies, and experience learning, adapting to, and automating new developments in the Kafka ecosystem.
  • Experience troubleshooting integration platform issues (e.g., connectivity, schema management, producer/consumer behavior) across Kafka, API gateways, and related components.
  • In-depth understanding of the Kafka producer and consumer client functionality.
  • Experience troubleshooting custom Kafka client applications written in Java (including Spring Boot), .NET, and Python.
  • Hands-on experience with Kafka Connect, Kafka Streams, and Schema Registry, including an understanding of their underlying functionality and implementation.
  • Preferably, at least 3 years' experience working in AWS (specifically EKS, EC2, IAM, and Route 53) and with Terraform.
  • Proficiency in deploying, scaling, and managing Kubernetes clusters, with a strong understanding of security best practices.
  • Familiarity with Docker and Helm.
  • Certification in Kafka, Kubernetes, Docker, or any cloud platform.

Responsibilities:

  • Manage and enhance existing Apache Kafka and Confluent Platform on AWS.
  • Review existing implementations and recommend improvements.
  • Collaborate with internal teams and respective stakeholders to understand user requirements and implement technical solutions.
  • Collaborate with engineering and product teams to integrate new use cases and define scalable streaming patterns.
  • Implement and maintain Kafka producers/consumers, Connectors, and Kafka Streams applications.
  • Enforce governance around topic design, schema evolution, partitioning, and data retention.
  • Monitor, troubleshoot, and tune Kafka clusters using Confluent Control Center, Prometheus, and Grafana.
  • Use Kubernetes and Terraform to automate Kafka infrastructure deployment and scaling.
  • Ensure high availability, security, and disaster recovery, including participation in all DR exercises.
  • Own lifecycle management, including upgrades, maintenance, restarts, and migration projects.
  • Analyze all platforms and ensure the environments are right-sized (i.e., capacity management), along with managing configurations, monitors, and alerts.
  • Create guidelines, procedures, standards, conventions, and best practices for Kafka usage and administration.
  • Develop, maintain, and troubleshoot Terraform IaC modules.
  • Automate repetitive tasks using Terraform scripts.
  • Guide and mentor team members on Terraform implementations.
  • Partner with team members deploying releases to production, and support solution teams.
  • Coordinate with the Compute, Database, and other infrastructure teams to support weekend patching activities.
  • Coordinate vendor support.
  • Provide broad support for troubleshooting and resolving issues, and serve as an escalation point for on-premises and cloud integration platform incidents.
  • Remediate security vulnerabilities reported by VMAST across application runtimes, integration services, and messaging platforms.
  • Support production incidents and outages to speed service restoration, and provide the support needed to ensure application connectivity to API services.
  • Work on service requests from application teams for application integration team services on integration products (Kafka and any other messaging platforms).
  • Log defects when services such as tenant onboarding, resource provisioning, and other customer products do not function as documented or intended.
  • Analyze SN INCs and REQs to identify trends and opportunities for improvement.
  • Consider implementing automation for manual and repetitive operations support functions.
  • Troubleshoot operational issues related to the AWS infrastructure.
