Overview
Contract (W2)
Skills
SQL
AWS
Python
PySpark
Linux
Job Details
Job Description: Senior AWS Data Engineer (W2)
Location: Reston, VA (primary) or Plano, TX (alternate)
Work Schedule: 3 days onsite, 2 days remote
We are seeking a highly experienced Senior Data Engineer to join our Information Security Data Compliance Engineering group for a contract position. You will be a key contributor to a data governance project, building robust and scalable data pipelines to manage large volumes of enterprise data.
The ideal candidate will have extensive hands-on experience with Amazon EMR and AWS Glue, as these are the primary services for this role. You will be responsible for designing and implementing data solutions that ingest, transform, and manage data from various sources, ensuring data integrity and security.
Responsibilities
Develop Data Pipelines: Design, build, and maintain highly scalable and efficient data pipelines using Amazon EMR and AWS Glue as the core technologies. Secondary services will include AWS Lambda and AWS Step Functions.
Data Processing at Scale: Develop and optimize PySpark programs to run on EMR clusters or as Glue Jobs. You will handle large-scale data ingestion and transformation from sources like APIs, S3, and file systems.
Database Management: Utilize AWS Glue Data Catalog to define and manage data schemas. You will be responsible for extensive data inserts, updates, and management of Glue-based tables.
SQL and Redshift Expertise: Write and optimize complex SQL queries, particularly for large tables within our Amazon Redshift data warehouse.
Infrastructure as Code (IaC): Apply DevOps principles using GitHub for version control and either Terraform or AWS CloudFormation for infrastructure automation and CI/CD pipelines. Familiarity with GitLab is also a plus.
Monitoring and Operations: Implement comprehensive monitoring, logging, and alerting to ensure the reliability and performance of data pipelines. Operate in a Unix/Linux environment for scripting and automation.
Solution Architecture: Actively contribute to high-level solution design and architectural discussions, leveraging your deep expertise in EMR and Glue to shape our data strategy.
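The partition-management work described above can be sketched in plain Python. This is a minimal illustration only, not the team's actual code: the bucket, table, and field names are hypothetical, and in practice the same Hive-style layout would typically be produced by PySpark writes and registered as partitions in the AWS Glue Data Catalog.

```python
# Hypothetical sketch: building Hive-style S3 partition prefixes of the kind
# Glue Data Catalog tables commonly use. All names here are illustrative.
from datetime import date


def partition_prefix(bucket: str, table: str, dt: date) -> str:
    """Build an S3 prefix for a daily Hive-style partition (dt=YYYY-MM-DD)."""
    return f"s3://{bucket}/{table}/dt={dt.isoformat()}/"


def group_by_partition(bucket: str, table: str, records: list) -> dict:
    """Group records by the partition prefix their 'event_date' maps to,
    mirroring how a batch job might route rows before writing them out."""
    grouped = {}
    for rec in records:
        prefix = partition_prefix(bucket, table, rec["event_date"])
        grouped.setdefault(prefix, []).append(rec)
    return grouped


if __name__ == "__main__":
    recs = [
        {"event_date": date(2024, 1, 1), "id": 1},
        {"event_date": date(2024, 1, 1), "id": 2},
        {"event_date": date(2024, 1, 2), "id": 3},
    ]
    for prefix, batch in sorted(group_by_partition("example-bucket", "events", recs).items()):
        print(prefix, len(batch))
```

Keeping partition paths deterministic like this is what lets downstream SQL engines (Redshift Spectrum, Athena) prune partitions instead of scanning the whole table.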
Qualifications
Overall Experience: 10+ years of professional experience in data engineering or a related field.
Technical Expertise:
5+ years of experience with Python for scripting and automation.
2+ years of hands-on experience with Amazon EMR and AWS Glue.
7+ years of expertise in SQL and working with relational databases.
3+ years of experience with Unix/Linux shell scripting.
5+ years of professional experience with AWS services.
Education: A B.S. or M.S. in Computer Science or a related technical field is highly preferred.
DevOps Experience: Strong understanding of version control with GitHub and experience with IaC tools like Terraform or AWS CloudFormation.
AWS Certification: An AWS certification (e.g., AWS Certified Data Analytics - Specialty) is highly desirable.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.