Job Summary:
The ideal candidate will have extensive experience developing and maintaining Python-based data solutions on AWS, working with PostgreSQL, and managing large-scale data environments using EMR and EC2.
Key Responsibilities:
Design and implement Python scripts to extract, transform, and load (ETL) data from multiple source systems into PostgreSQL databases.
Manage and optimize PostgreSQL databases hosted on Amazon RDS for performance and scalability.
Use AWS EMR (Elastic MapReduce) for big data processing and analytics.
Configure and maintain AWS EC2 instances for various data and application workloads.
Collaborate with data architects, analysts, and DevOps teams to ensure seamless data integration and deployment.
Troubleshoot, debug, and improve data pipelines for efficiency and reliability.
Required Skills:
Strong Python programming skills, including automation and data pipeline development.
PostgreSQL and Amazon RDS database experience (schema design, query tuning, backups, replication).
AWS EMR experience for distributed data processing.
AWS EC2 setup, configuration, and optimization.
Proficiency in ETL processes and working with multiple data sources.
Excellent analytical, problem-solving, and communication skills.
Familiarity with AWS Lambda, S3, Glue, or other AWS services.