Overview
On Site
Depends on Experience
Contract - W2
Contract - 12 Month(s)
Skills
Amazon DynamoDB
Amazon S3
PySpark
Python
SQL
Glue
Data Lake
Airflow
Job Details
Role: AWS Data Engineer
Location: Fort Mill, SC (Onsite)
Duration: 12+ Month Contract
Face-to-face (F2F) interview highly preferred for local candidates.
Note: The candidate needs to be in the office 5 days every week.
About the Project:
- We are developing a modern Data Lakehouse on AWS, leveraging S3, Glue, Iceberg tables, and Snowflake to enable analytics, governance, and scalability.
- The environment will be provisioned through Terraform, orchestrated via Airflow, and driven by robust data pipelines built on a Glue-based framework.
- We're seeking an experienced AWS Data Engineer with expertise in Glue, Terraform, and Airflow, plus strong SQL development skills, to help design, automate, and scale this cloud-based data platform.
Key Responsibilities:
- Design, develop, and deploy AWS infrastructure using Terraform, including S3, Glue, IAM, Lake Formation, and Athena resources.
- Develop and maintain AWS Glue ETL jobs (PySpark or Python shell) for data ingestion, transformation, and curation across the raw, clean, and curated layers.
- Integrate Airflow (Amazon MWAA or self-managed) to orchestrate Glue jobs, data pipelines, and their dependencies (see the DAG sketch after this list).
- Build and maintain Glue Catalog, manage metadata, and align with Lake Formation security policies.
- Write complex SQL queries for data validation, transformation, and reporting logic, ensuring efficient query performance.
- Manage Terraform state files, backend setup (S3 + DynamoDB), and environment-based deployments.
- Implement data ingestion frameworks for batch and near real-time pipelines.
- Collaborate with Snowflake and BI teams for seamless data consumption.
- Contribute to high availability (multi-AZ) and disaster recovery (multi-region) strategies for core data components.
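To illustrate the orchestration work, here is a minimal sketch of an Airflow DAG that chains two Glue jobs. The DAG id and Glue job names are hypothetical; it assumes Airflow 2.4+ with the apache-airflow-providers-amazon package installed and AWS credentials on the default connection.

```python
# Minimal sketch: an Airflow DAG that runs two Glue jobs in sequence
# (raw ingestion, then curation). DAG id and Glue job names are
# hypothetical and would be defined elsewhere.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="lakehouse_daily_load",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_raw = GlueJobOperator(
        task_id="ingest_raw",
        job_name="raw_ingest_job",  # hypothetical Glue job
        wait_for_completion=True,
    )
    curate = GlueJobOperator(
        task_id="curate",
        job_name="curated_transform_job",  # hypothetical Glue job
        wait_for_completion=True,
    )
    # Curation runs only after raw ingestion succeeds.
    ingest_raw >> curate
```

The same DAG runs unchanged on Amazon MWAA or a self-managed Airflow deployment, which is why the role calls out both.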
Required Skills:
- 6 years of experience as a Data Engineer or Cloud Engineer.
- Strong expertise in AWS Services: S3, Glue, Glue Catalog, Lake Formation, IAM, Athena, CloudWatch, Lambda (preferred).
- Hands-on proficiency in Terraform (HCL) for infrastructure automation.
- Experience with Airflow DAGs for orchestration of Glue, S3, and external data flows.
- Solid understanding of PySpark / Python for ETL scripting (see the Glue job sketch after this list).
- Strong ability to write and optimize complex SQL (joins, window functions, CTEs, and analytical queries).
- Familiarity with data lake formats (Iceberg, Parquet, Delta, etc.).
- Experience with CI/CD pipelines (GitHub Actions, CodePipeline, or Jenkins).
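As context for the PySpark expectation above, here is a minimal sketch of a Glue ETL script that moves data from a raw S3 prefix to a curated Parquet layer. The bucket, paths, and key column are hypothetical; it assumes the standard Glue job runtime, where the awsglue libraries are provided by the service.

```python
# Minimal sketch: a Glue PySpark job that reads a raw S3 prefix, dedupes
# records, stamps a load date, and writes partitioned Parquet to a
# curated prefix. Paths and the order_id key are hypothetical.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the raw layer, drop duplicate orders, and add a load-date column.
raw = spark.read.json("s3://example-lake/raw/orders/")  # hypothetical path
clean = (
    raw.dropDuplicates(["order_id"])  # assumes an order_id key exists
       .withColumn("load_date", F.current_date())
)

# Write the curated layer as Parquet, partitioned for efficient scans.
clean.write.mode("overwrite").partitionBy("load_date").parquet(
    "s3://example-lake/curated/orders/"  # hypothetical path
)

job.commit()
```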
Preferred Skills:
- Exposure to Snowflake or similar cloud data warehouses.
- Knowledge of data modeling, partitioning, and schema evolution (see the Iceberg sketch after this list).
- Familiarity with AWS security best practices, including encryption, IAM roles, and cross-account access.
- Familiarity with Amazon DataZone and other services such as CloudTrail and SNS.
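On the partitioning and schema-evolution point, here is a minimal Spark SQL sketch against an Iceberg table. Catalog and table names are hypothetical; it assumes a Spark session ("spark", as in the Glue sketch above) configured with an Iceberg catalog backed by the Glue Data Catalog.

```python
# Minimal sketch: Iceberg DDL via Spark SQL, showing a partitioned table
# and in-place schema evolution. glue_catalog.curated.orders is a
# hypothetical catalog/table name.
spark.sql("""
    CREATE TABLE IF NOT EXISTS glue_catalog.curated.orders (
        order_id    bigint,
        customer_id bigint,
        amount      decimal(12,2),
        load_date   date
    )
    USING iceberg
    PARTITIONED BY (load_date)
""")

# Iceberg evolves schemas via metadata only; no data files are rewritten.
spark.sql("ALTER TABLE glue_catalog.curated.orders ADD COLUMN channel string")
```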