AWS Big Data Engineer

Hybrid in Rockville, MD, US • Posted 6 hours ago • Updated 6 hours ago

Contract W2

6 Months

No Travel Required

Hybrid

Depends on Experience

NGTalentTech Group LLC

Job Details

Skills

Apache Spark
SQL
Python
AWS (EMR, S3, EC2)
Hadoop Ecosystem
Hive
Trino
ETL/Data Pipelines
Batch Processing
Large Data (TB/PB)
Spark Performance Tuning
Agile/Scrum
CI/CD
Unit Testing
AI Tools (Copilot/ChatGPT/Claude)
Financial Services
PySpark

Summary

Role Overview

· High‑impact engineering role supporting the Market Regulation / Surveillance group

· Builds complex algorithms to ensure compliance in financial markets

· Supports surveillance for a new exchange (details confidential)

Technical Environment

· Petabyte-scale big data platform

· Batch processing only (Spark; no streaming)

· Tech stack: Spark, SQL, Python

· AWS: EC2, EMR, S3

Core Responsibilities

· Build scalable, high‑performance data pipelines

· Work with extremely large datasets in high‑volume environments

· Contribute to architectural decisions and performance optimization

· Solve complex data platform and performance challenges

Job Description:

Big Data Engineer

We are seeking a highly skilled and experienced Big Data Engineer to design, develop, and optimize large-scale data processing systems. In this role, you will work closely with cross-functional teams to architect data pipelines, implement data integration solutions, and ensure the performance, scalability, and reliability of big data platforms. The ideal candidate will have deep expertise in distributed systems, cloud platforms, and modern big data technologies such as Hadoop and Spark.

Responsibilities

· Design, develop, and maintain large-scale data processing pipelines using Big Data technologies (e.g., Hadoop, Spark, Python, Scala).

· Implement data ingestion, storage, transformation, and analysis solutions that are scalable, efficient, and reliable.

· Stay current with industry trends and emerging Big Data technologies to continuously improve the data architecture.

· Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.

· Optimize and enhance existing data pipelines for performance, scalability, and reliability.

· Develop automated testing frameworks and implement continuous testing for data quality assurance.

· Conduct unit, integration, and system testing to ensure the robustness and accuracy of data pipelines.

· Work with data scientists and analysts to support data-driven decision-making across the organization.

· Write and maintain automated unit, integration, and end-to-end tests.

· Monitor and troubleshoot data pipelines in production environments to identify and resolve issues.

Education/Experience Requirements

· Bachelor's degree in Computer Science, Information Systems, or a related discipline with at least five (5) years of related experience, or equivalent training and/or work experience. A Master's degree and prior financial services industry experience are preferred.

· Demonstrated technical expertise in object-oriented and database technologies and concepts, resulting in the deployment of enterprise-quality solutions.

· Experience developing enterprise-quality solutions in an iterative or Agile environment.

· Extensive knowledge of industry-leading software engineering approaches, including Test Automation, Build Automation, and Configuration Management frameworks.

· Strong written and verbal technical communication skills.

· Demonstrated ability to develop effective working relationships that improved the quality of work products.

· Should be well organized, thorough, and able to handle competing priorities.

· Ability to maintain focus and develop proficiency in new skills rapidly.

· Ability to work in a fast-paced environment.

· Experience with object-oriented programming languages such as Java, Scala, or Python.

Essential Technical Skills

· AI Tool Proficiency: Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)

· Technical Background: Strong software development background with ability to contribute to technical discussions

· Agile Methodology: Extensive experience with Scrum, Kanban, and continuous improvement practices

Big Data Technologies

· Experience with big data technologies such as Hadoop, Spark, Hive, and Trino

· Evaluate understanding of common issues like:

o Data skew and strategies to mitigate it.

o Working with massive data volumes at petabyte scale.

o Troubleshooting job failures caused by resource limitations, bad data, or scalability challenges.

· Look for real-world debugging and mitigation stories.
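
As one illustration of the data skew topic above, the sketch below shows key salting with a two-stage aggregation in plain Python. It is a toy model under stated assumptions, not the team's actual approach: the `SALT_BUCKETS` value, the helper names `salt` and `unsalt`, and the sample data are all hypothetical. The same idea is commonly applied in Spark by aggregating on a salted key first and on the original key second.

```python
from collections import Counter

SALT_BUCKETS = 4  # hypothetical value; in practice tuned to the observed skew


def salt(key: str, row_index: int) -> str:
    """Append a salt so one hot key is split into SALT_BUCKETS smaller keys."""
    return f"{key}#{row_index % SALT_BUCKETS}"


def unsalt(salted_key: str) -> str:
    """Recover the original key by dropping the salt suffix."""
    return salted_key.rsplit("#", 1)[0]


# Toy input: one key ("HOT") dominates, which is what makes a single
# reduce task far slower than the others in a distributed job.
events = ["HOT"] * 9000 + ["A"] * 500 + ["B"] * 500

# Stage 1: partial counts per salted key (work for "HOT" is split four ways).
partial = Counter(salt(key, i) for i, key in enumerate(events))

# Stage 2: strip the salt and combine the partial counts.
final = Counter()
for salted_key, count in partial.items():
    final[unsalt(salted_key)] += count

largest_group_before = max(Counter(events).values())
largest_group_after = max(partial.values())
print(largest_group_before, largest_group_after, dict(final))
```

The result is unchanged, but the largest group any single task has to process shrinks by roughly the number of salt buckets, at the cost of a second aggregation pass.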

AI Skills

· Prompt Engineering: Proficiency in crafting effective prompts for AI coding assistants and analysis tools

· AI Workflow Design: Experience redesigning development processes to leverage AI capabilities

· Data Analysis: Ability to interpret AI-generated insights and translate them into actionable team improvements

· Change Management: Experience leading teams through AI adoption and workflow transformation

SQL Skills (Window Functions, Joins, Complex Queries)

· Assess comfort with SQL window functions, multi-table joins, aggregations.

· Provide examples or ask them to write/optimize SQL queries on the spot.

· Probe how they handle edge cases like NULLs, duplicates, ordering, etc.
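
A sample exercise along these lines might look like the following sketch, which keeps the latest version of each trade using a window function while handling duplicates and NULLs. The `trades` table, its columns, and the data are hypothetical, and SQLite (via Python's built-in `sqlite3` module, assuming SQLite 3.25 or later for window function support) stands in for whatever engine is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trades (trade_id INTEGER, account TEXT, price REAL, updated_at TEXT);
INSERT INTO trades VALUES
  (1, 'ACC1', 10.0, '2024-01-01'),
  (1, 'ACC1', 10.5, '2024-01-02'),  -- duplicate trade_id, later version
  (2, 'ACC1', NULL, '2024-01-01'),  -- NULL price
  (3, 'ACC2', 7.0,  NULL),          -- NULL timestamp
  (3, 'ACC2', 7.5,  '2024-01-03');
""")

# Rank the versions of each trade. Rows with a NULL timestamp are pushed
# last explicitly, because the default position of NULLs in an ORDER BY
# differs between SQL engines.
query = """
WITH ranked AS (
  SELECT trade_id, account, price,
         ROW_NUMBER() OVER (
           PARTITION BY trade_id
           ORDER BY updated_at IS NULL, updated_at DESC
         ) AS rn
  FROM trades
)
SELECT trade_id, account, price
FROM ranked
WHERE rn = 1
ORDER BY trade_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Follow-up questions could cover why `ROW_NUMBER` is used rather than `RANK` when ties are possible, and what should happen to the trade whose price is NULL.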

Apache Spark (Development, Internals & Tuning)

· Test their understanding of Spark's core architecture: executors, tasks, stages, and the DAG.

· Focus on Spark performance tuning techniques: partitioning, caching, broadcast joins, etc.

· Ask scenario-based questions on troubleshooting slow-running or stuck jobs and resource issues in Spark.

· Explore their experience optimizing Spark jobs for large-scale datasets.

Cloud Technologies

· Check exposure to AWS services like S3, EMR, Glue, Lambda, Athena, etc.

· Ask how they've used S3 with Spark (e.g., dealing with file formats, consistency issues).

· Check knowledge of EKS, serverless services, etc.

Programming - Python or Scala

· Assess ability to write clean, modular, and performant code.

· Look for experience in functional programming concepts (e.g., immutability, higher-order functions).

· Ask about real-world use cases where they wrote scalable data processing code.

· Evaluate understanding of collections, concurrency, and memory management.

Good to Have

· Experience with managing production data pipelines/ETL systems

· Experience with CI/CD

· Experience writing test cases

· AWS certifications

Dice Id: 91171995
Position Id: 9013018

Company Info

About NGTalentTech Group LLC

NGTalentTech Group LLC is a one-stop hub for software development expert support and advisory.

We are an IT consulting and services company that develops solutions for different platforms and provides the best resourcing at all times.

We have a core team dedicated to finding suitable candidates with strong technical skills and a passion for the work.

We offer corporate training in technologies such as Front end, Java UI, Big Data, .Net, Java, iOS/Android, Informatica, WebSphere Admin, WebLogic Admin, Selenium Automation QA, Salesforce, etc., delivered by our well-qualified training team.

Contact the job poster

Madhusudhan Budiga

Recruiter @ NGTalentTech Group LLC
