Job Description
Position: Big Data Engineer
Location: Rockville, MD or McLean, VA (Hybrid: 3 days onsite, 2 days remote)
Duration: 6+ months
/ EAD
MOI: Prescreen, Phone, Onsite panel
Candidates must be local or from a nearby state and hold a valid driver's license (DL).
Submit candidates with their DL, visa copy, and LinkedIn profile.
A technical assessment and ID verification will be conducted before the profile is submitted to the client; the assessment must be completed within 24 hours.
Job Summary:
We are seeking an experienced Big Data Engineer to design, build, and optimize large-scale data processing systems. This role partners closely with engineering, data, and analytics teams to deliver scalable, reliable data pipelines that support data-driven decision-making. The ideal candidate has strong experience with distributed systems, cloud platforms, and modern big data technologies.
Qualifications
- Bachelor's degree in Computer Science or a related field, or equivalent experience.
- 5+ years of experience building enterprise-scale data solutions.
- Strong experience with big data technologies such as Spark, Hadoop, Hive, and Trino.
- Proficiency in Python or Scala, with solid object-oriented and/or functional programming skills.
- Strong SQL skills, including complex queries, joins, and window functions.
- Experience working with large datasets and troubleshooting performance, scalability, and data quality issues.
- Experience with cloud platforms, preferably AWS (e.g., S3, EMR, Glue, Athena, Lambda).
- Familiarity with Agile development practices and CI/CD pipelines.
- Strong communication skills and ability to work effectively in a fast-paced environment.
- Experience managing production ETL or data pipeline systems.
- Exposure to AI-assisted development tools (e.g., Copilot, ChatGPT, Q Developer).
- Experience with Spark performance tuning and optimization.
- AWS certifications or equivalent cloud experience.
Responsibilities
- Design, develop, and maintain scalable data pipelines using technologies such as Spark, Hadoop, Python, and Scala.
- Build and optimize data ingestion, transformation, and storage solutions for large-scale datasets.
- Optimize existing data pipelines for performance, scalability, and reliability.
- Collaborate with cross-functional teams to translate business requirements into technical solutions.
- Monitor, troubleshoot, and resolve data pipeline issues in production environments.
- Implement automated testing and data quality checks across pipelines.
- Stay current with emerging big data and cloud technologies to continuously improve the platform.
- Support data scientists and analysts by enabling reliable access to high-quality data.