Overview
On Site
$45 - $50
Contract - W2
Contract - 6 Month(s)
100% Travel
Skills
PySpark
AWS
Python
Apache Spark
SQL
Hadoop
Job Details
Role: Python PySpark AWS Developer
Location: Plano, TX / Jersey City, NJ (Onsite)
Duration: Long Term (looking for EAD candidates) (W2)
Job Description:
Core responsibilities
- Data pipeline development: Design, develop, and maintain high-performance data pipelines using PySpark.
- Performance optimisation: Optimise and tune existing data processing workflows for better performance and efficiency.
- Data transformation: Implement complex data transformations and integrations, such as reading from external sources, merging data, and loading into target destinations (see the sketch after this list).
- Troubleshooting: Monitor and troubleshoot performance issues, errors, and other problems in data processing systems.
- Collaboration: Work with cross-functional teams such as data scientists, data engineers, and business analysts to understand requirements and deliver solutions.
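To ground these responsibilities, here is a minimal, hypothetical PySpark sketch of such a pipeline: it reads from external sources, merges and transforms the data, and loads the result into a target destination. All bucket paths, table names, and column names are illustrative assumptions, not details from this posting.

```python
# Minimal illustrative pipeline; every path, column, and schema here is a placeholder.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-pipeline").getOrCreate()

# Read from external sources (hypothetical S3 locations).
orders = spark.read.parquet("s3a://example-bucket/raw/orders/")
customers = spark.read.parquet("s3a://example-bucket/raw/customers/")

# Merge and transform: join, filter, aggregate.
daily_totals = (
    orders.join(customers, on="customer_id", how="inner")
          .where(F.col("status") == "COMPLETED")
          .groupBy("order_date", "region")
          .agg(F.sum("amount").alias("total_amount"))
)

# Load into the target destination, partitioned for efficient downstream reads.
(daily_totals.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3a://example-bucket/curated/daily_totals/"))

spark.stop()
```

Partitioning the output by a query-friendly key (here, order_date) is one common lever behind the performance optimisation bullet above, since it lets downstream jobs prune partitions instead of scanning the full dataset.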
Required skills and qualifications
- Technical skills: Strong proficiency in Python and Apache Spark is essential, along with experience in distributed computing concepts.
- Big data ecosystem: Experience with big data technologies such as Hadoop and Hive, and with data storage solutions (e.g., HDFS, AWS S3), is often required.
- SQL: Proficiency in SQL for querying and data modelling is a must (see the sketch after this list).
- Cloud platforms: Familiarity with cloud environments such as AWS, Google Cloud, or Azure is a significant advantage.
- Development tools: Experience with version control (Git) and CI/CD tools such as Jenkins is often expected.
- Other skills: Knowledge of Linux, shell scripting, and agile methodologies is beneficial.
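As a small illustration of the SQL requirement in a PySpark context, the sketch below registers a DataFrame as a temporary view and queries it with Spark SQL; the source path, view name, and columns are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-example").getOrCreate()

# Expose a DataFrame as a temporary view so it can be queried with plain SQL.
events = spark.read.json("s3a://example-bucket/raw/events/")  # placeholder source
events.createOrReplaceTempView("events")

# Standard SQL runs unchanged against the registered view.
top_users = spark.sql("""
    SELECT user_id, COUNT(*) AS event_count
    FROM events
    GROUP BY user_id
    ORDER BY event_count DESC
    LIMIT 10
""")
top_users.show()
```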
Thanks & Regards
--------------------------
Sravan Badugu
Talent Acquisition Specialist
iMINDS Technology Systems, Inc.
Direct: