Job Details
Location: Remote (100% remote work)
Work Hours: 7:45 AM - 4:30 PM EST
Overview: IT professionals are needed to serve as Data Engineers on a project focused on implementing and maintaining a scalable, efficient data lake on the AWS and Azure platforms. These resources will be primarily responsible for ingesting data into the bronze and silver layers of the data lake, following a medallion architecture (raw data lands in the bronze layer; cleansed, conformed data is promoted to silver). The successful candidates will design and implement data ingestion workflows, data quality checks, and data transformation pipelines. This role requires close collaboration with technologists, data analysts, and business stakeholders.
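To illustrate the kind of work involved, below is a minimal bronze-to-silver ingestion sketch in PySpark. The bucket paths, dataset name ("workforce"), and the `record_id` column are hypothetical placeholders, not details from this project.

```python
# Minimal medallion-style ingestion sketch (PySpark). All paths, the
# "workforce" dataset, and the record_id column are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-ingest").getOrCreate()

# Bronze: land the raw source data as-is, adding ingestion metadata.
raw = spark.read.option("header", "true").csv("s3://example-raw/workforce/")
bronze = raw.withColumn("_ingested_at", F.current_timestamp())
bronze.write.mode("append").parquet("s3://example-lake/bronze/workforce/")

# Silver: apply basic cleansing and deduplication on top of bronze.
silver = (
    spark.read.parquet("s3://example-lake/bronze/workforce/")
    .filter(F.col("record_id").isNotNull())
    .dropDuplicates(["record_id"])
)
silver.write.mode("overwrite").parquet("s3://example-lake/silver/workforce/")
```

The key design point of the medallion pattern is that bronze preserves the source data unmodified for auditability, while silver applies the quality rules downstream consumers rely on.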
Responsibilities:
- Design and implement data ingestion workflows to ingest data from various sources, including government operations, business data, workforce data, and health data.
- Build robust, scalable, and efficient pipelines following ELT (Extract, Load, Transform) patterns.
- Design data models and manage schemas for the bronze and silver layers of the medallion architecture.
- Implement data quality checks, data lineage tracking, and metadata management to ensure data governance and compliance.
- Automate data pipeline workflows using tools such as AWS Glue, AWS Step Functions, or Apache Airflow (a minimal orchestration sketch follows this list).
- Collaborate closely with technical staff, data analysts, data scientists, and business stakeholders.
- Maintain comprehensive documentation of the data lake architecture, data models, and data pipeline workflows.
- Support existing Azure Data Factory workloads.
- Work with the project team to ensure deliverables meet State quality standards, scope, budget, and timeline.
- Participate as a member of the team, including attending standups or project-specific meetings.
- Comply with all applicable platform functional and nonfunctional requirements, standards, and procedures defined by the State.
- Provide constructive feedback for the ongoing development and enhancement of the platform.
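As a concrete example of the orchestration responsibility above, here is a minimal Apache Airflow DAG sketch. The DAG id, task names, and schedule are hypothetical; the task bodies are placeholders standing in for the actual ingestion logic.

```python
# Hypothetical Airflow DAG sketch for a bronze -> silver ingestion flow.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def land_to_bronze(**context):
    # Placeholder: pull from the source system and write to the bronze layer.
    print("landing raw data to bronze")


def promote_to_silver(**context):
    # Placeholder: cleanse, deduplicate, and conform bronze data into silver.
    print("promoting cleansed data to silver")


with DAG(
    dag_id="workforce_medallion_ingest",  # hypothetical dataset name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
) as dag:
    bronze = PythonOperator(task_id="land_to_bronze", python_callable=land_to_bronze)
    silver = PythonOperator(task_id="promote_to_silver", python_callable=promote_to_silver)

    bronze >> silver  # silver promotion runs only after bronze landing succeeds
```

An AWS Step Functions or Glue workflow would express the same dependency ordering; the tool choice matters less than the explicit bronze-before-silver sequencing.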
Required Technologies, Tools, and Experience:
- Expertise in AWS data services, including but not limited to S3, Glue, Athena, and Redshift.
- Experience with Azure Data Factory.
- Experience with data modeling and optimization.
- Experience with data management, data security, and data governance.
- Experience automating data pipeline workflows using tools like AWS Glue, AWS Step Functions, or Apache Airflow.
- 3-5 years of experience and a proven track record in building and maintaining large-scale, enterprise-grade data lake solutions on the AWS platform.
- Experience with ELT (Extract, Load, Transform) patterns.
- Experience implementing data quality checks, data lineage tracking, and metadata management (an illustrative quality check appears after this list).
- Domain expertise in one or more areas such as government operations, business data, workforce data, and health data is preferred.
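The data quality requirement above might look like the following sketch in PySpark: a simple gate that fails the run when records are missing their key. The path, `record_id` column, and 1% threshold are illustrative assumptions, not State-mandated rules.

```python
# Illustrative data quality gate (PySpark). Path, column name, and
# threshold are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-check").getOrCreate()
df = spark.read.parquet("s3://example-lake/silver/workforce/")

total = df.count()
null_ids = df.filter(F.col("record_id").isNull()).count()

# Fail the pipeline run if more than 1% of records lack a primary key.
if total == 0 or null_ids / total > 0.01:
    raise ValueError(f"Data quality gate failed: {null_ids}/{total} null record_id values")
```

In practice such checks are typically wired into the orchestration layer so a failed gate halts downstream promotion and surfaces in monitoring.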