Kubernetes Engineer w/ Spark/Big Data exp

Hybrid in Rockville, MD, US • Posted 4 hours ago • Updated 4 hours ago

Contract W2

6 Months

On-site

Depends on Experience

Job Details

Skills

EMR on EC2 over to EMR on EKS
Helm charts
Spark on K8s
Kubectl
Spark experience
Pyspark data pipelines
Hadoop
Spark
Python
Scala
Amazon EMR on EKS (Elastic Kubernetes Service)
Hive & Trino

Summary

Forsys Inc. is a global Lead-to-Revenue and Enterprise Transformation consulting firm helping enterprises modernize revenue operations and core business platforms. Headquartered in Milpitas, California, Forsys partners with enterprise customers across the Americas to deliver large-scale Oracle, Salesforce, and cloud-led transformations.

Location: Rockville, MD (3 days onsite & 2 days remote)

Duration: 6 months; long-term extensions

Work Authorization: Any

Interview: 2 rounds; Offer

Details:

Must be able to interview onsite in Rockville, MD for final round.
Any work authorization is fine
6-month base contract; long-term, multi-year extensions are likely (our average consultant stays 4.5 years on contract)

Skillset:

Manager called this a Kubernetes Engineer
Transitioning from EMR on EC2 over to EMR on EKS
Experience with Helm charts, Spark on K8s, kubectl, etc.
Must also have Spark experience
Must be willing to create PySpark data pipelines
Certifications are preferred but not mandatory (CKAD, CKA)

Interview:

30-45 minute virtual interview
1.5-2 hour final in-person interview
Offer

Job Description:

Job Description Summary

We are seeking a highly skilled and experienced Big Data Engineer to design, develop, and optimize large-scale data processing systems. In this role, you will work closely with cross-functional teams to architect data pipelines, implement data integration solutions, and ensure the performance, scalability, and reliability of big data platforms. The ideal candidate will have deep expertise in distributed systems, cloud platforms, and modern big data technologies such as Hadoop, Spark, and Kubernetes-based orchestration.

Responsibilities:

Design, develop, and maintain large-scale data processing pipelines using Big Data technologies (e.g., Hadoop, Spark, Python, Scala).

Architect and deploy containerized big data workloads on Amazon EMR on EKS (Elastic Kubernetes Service).

Design and implement Kubernetes-based infrastructure for running Spark applications at scale.

Implement data ingestion, storage, transformation, and analysis solutions that are scalable, efficient, and reliable.

Stay current with industry trends and emerging Big Data technologies to continuously improve the data architecture.

Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.

Optimize and enhance existing data pipelines for performance, scalability, and reliability.

Develop automated testing frameworks and implement continuous testing for data quality assurance.

Conduct unit, integration, and system testing to ensure the robustness and accuracy of data pipelines.

Work with data scientists and analysts to support data-driven decision-making across the organization.

Write and maintain automated unit, integration, and end-to-end tests.

Monitor and troubleshoot data pipelines in production environments to identify and resolve issues.

Manage Kubernetes clusters, pods, services, and deployments for big data workloads.

Essential Technical Skills:

AI Tool Proficiency:

Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)

Big Data Technologies:

Experience with big data technologies such as Hadoop, Spark, Hive, and Trino

Understanding of common issues such as data skew and strategies to mitigate it, working with petabyte-scale data volumes, and troubleshooting job failures caused by resource limitations, bad data, and scalability challenges.

Real-world experience with debugging and mitigation strategies.

Container Orchestration & Kubernetes:

Strong experience with Kubernetes architecture, concepts, and operations (pods, services, deployments, namespaces, ConfigMaps, Secrets)

Hands-on experience with Amazon EMR on EKS (Kubernetes) for running Apache Spark workloads

Experience with Kubernetes resource management, scheduling, and auto-scaling

Knowledge of Helm charts for deploying and managing applications on Kubernetes

Understanding of Kubernetes networking, storage (PVs, PVCs), and security best practices

Experience with kubectl and Kubernetes YAML manifests

Ability to troubleshoot Kubernetes cluster issues, pod failures, and resource constraints

Experience integrating Spark with Kubernetes operators and dynamic allocation

AI Skills:

Prompt Engineering: Proficiency in crafting effective prompts for AI coding assistants and analysis tools

AI Workflow Design: Experience redesigning development processes to leverage AI capabilities

Data Analysis: Ability to interpret AI-generated insights and translate them into actionable team improvements

Change Management: Experience leading teams through AI adoption and workflow transformation

Apache Spark (Development, Internals & Tuning):

Deep understanding of Spark's core architecture - executors, tasks, stages, DAG

Expertise in Spark performance tuning techniques: partitioning, caching, broadcast joins, etc.

Experience troubleshooting slow-running or stuck jobs and resource issues in Spark

Proven ability to optimize Spark jobs for large-scale datasets

Experience running Spark on Kubernetes and understanding Spark-on-K8s architecture

Cloud Technologies:

Experience with AWS services like S3, EMR, EMR on EKS, Glue, Lambda, Athena, etc.

Hands-on experience using S3 with Spark (e.g., dealing with file formats, consistency issues)

Strong experience with Amazon EKS (Elastic Kubernetes Service) architecture and best practices

Experience with AWS IAM roles for service accounts (IRSA) for Kubernetes workloads

Knowledge of AWS networking for EKS (VPC, subnets, security groups)

Experience with AWS monitoring and logging tools (CloudWatch, CloudTrail) for Kubernetes workloads

Serverless knowledge (Lambda, Fargate)

Programming - Python or Scala:

Ability to write clean, modular, and performant code

Experience with functional programming concepts (e.g., immutability, higher-order functions)

Real-world use cases where scalable data processing code was implemented

Strong understanding of collections, concurrency, and memory management

SQL Skills (Window Functions, Joins, Complex Queries):

Proficiency with SQL window functions, multi-table joins, and aggregations

Ability to write and optimize complex SQL queries

Experience handling edge cases like NULLs, duplicates, and ordering

Good to have:

Experience with managing production data pipelines/ETL systems

Experience with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, ArgoCD)

Experience with Infrastructure as Code (Terraform, CloudFormation) for provisioning EKS clusters and EMR on EKS

Experience writing comprehensive test cases and test automation

Experience with Docker and container image optimization

Knowledge of service mesh technologies (Istio, Linkerd)

Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack)

AWS certifications (AI Practitioner, Solutions Architect, Big Data Specialty) or Kubernetes certifications such as CKA/CKAD

Experience with GitOps practices for Kubernetes deployments

Education/Experience Requirements:

Bachelor's degree in Computer Science, Information Systems or related discipline with at least five (5) years of related experience, or equivalent training and/or work experience; Master's degree and past Financial Services industry experience preferred.

Demonstrated technical expertise in Object Oriented and database technologies/concepts which resulted in deployment of enterprise quality solutions.

Extensive knowledge of industry leading software engineering approaches including Test Automation, Build Automation and Configuration Management frameworks.

Strong written and verbal technical communication skills.

Demonstrated ability to develop effective working relationships that improved the quality of work products.

Ability to maintain focus and develop proficiency in new skills rapidly.

Ability to work in a fast-paced environment.

Forsys Is An Equal Opportunity Employer And All Qualified Applicants Will Receive Consideration For Employment Without Regard To Race, Color, Religion, Sex, National Origin, Disability Status, Protected Veteran Status, Or Any Other Characteristic Protected By Law.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 90896916
Position Id: 8978413

Contact the job poster

Amarnath Bhupathi

Sr Lead Recruiter @ Forsys Inc.
