Job Details
Mandatory Skills: AWS + Python + Airflow
Position Summary:
We're looking for a data engineer to build and maintain ELT pipelines using Apache Airflow and Snowflake in an AWS cloud environment. Candidates should have experience writing modular Python code and deploying container-based services in AWS, including monitoring setup.
Key Skills & Experience:
- Strong SQL and Snowflake expertise, including performance tuning and data modeling.
- Proficient in Python for scripting, automation, and working with REST APIs.
- Experience with Apache Airflow for orchestration and workflow monitoring.
- Solid experience with AWS services (e.g., S3, Lambda, IAM, CloudWatch) in data engineering workflows.
- Experience integrating and processing data from REST APIs.
- Understanding of data quality, governance, and cloud-native troubleshooting.
Role Responsibilities:
Primary Skill Set (Must Have):
- 10+ years of experience
- Great communicator, client facing
- Individual contributor, 100% hands-on in the mentioned skills
AWS Expertise:
Data storage solutions: In-depth understanding of AWS S3 for data storage, including best practices for organization and security
Experience with Amazon Redshift for data warehousing and performance optimization
Data Integration:
Familiarity with AWS Glue for ETL processes and orchestration (nice to have)
Experience with AWS Lambda for serverless data processing tasks
Workflow Orchestration:
Proficiency in using Apache Airflow on AWS to design, schedule, and monitor complex data flows
Ability to integrate Airflow with AWS services and dbt models, such as triggering a dbt model or an EMR job, or reading from S3 and writing to Redshift
Data Lakes and Data Warehousing:
Understanding of the architecture of data lakes vs. data warehouses and when to use each
Experience with Amazon Athena for querying data directly in S3 using SQL
Monitoring and Logging:
Familiarity with Amazon CloudWatch for monitoring pipelines and setting up alerts for workflow failures
Cloud Security:
Knowledge of AWS security best practices, including IAM roles, encryption, and access configuration for dbt profiles
Programming Skills:
Python:
Proficiency in Pandas and NumPy for data analysis and manipulation
Ability to write scripts for automating ETL processes and scheduling jobs using Airflow
Knowledge of how to implement conditional logic in dbt through Python
SQL:
Advanced SQL skills, including complex joins, window functions, CTEs, and subqueries
Experience in optimizing SQL queries for performance
Secondary Skill Set / Preferred (Good to Have):
React/Streamlit, SQL, Snowflake, AWS cloud, REST APIs via FastAPI or similar