Job Details
Position: Data Engineer
Contract: W2 Only
Responsibilities
Design, develop, and maintain scalable data pipelines and architectures for ingestion, processing, and storage of large datasets.
Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements and deliver reliable data solutions.
Build, optimize, and manage ETL/ELT workflows to extract, transform, and load data from multiple sources into data lakes or data warehouses.
Ensure data integrity, accuracy, and quality through validation, monitoring, and automated testing.
Work with cloud data platforms (AWS, Azure, or Google Cloud Platform) to build and maintain modern data infrastructure.
Implement best practices for data modeling, partitioning, and performance tuning.
Develop and maintain metadata management, lineage, and documentation for data assets.
Integrate data pipelines with APIs, streaming data sources, and third-party systems.
Collaborate with DevOps teams to automate deployments and monitor data systems using CI/CD pipelines.
Stay up to date with emerging data engineering tools, frameworks, and technologies.
Required Skills
12+ years of experience as a Data Engineer or in a similar data-focused role.
Strong programming skills in Python, Java, or Scala.
Expertise in SQL (complex queries, performance tuning, stored procedures).
Hands-on experience with big data technologies such as Spark, Hadoop, Hive, or Kafka.
Proficiency with cloud data platforms such as AWS (Glue, Redshift, EMR, S3), Azure (Data Factory, Synapse), or Google Cloud Platform (BigQuery, Dataflow).
Experience building and maintaining ETL/ELT pipelines using tools like Informatica, Talend, dbt, or Apache Airflow.
Solid understanding of data warehousing concepts, dimensional modeling, and data architecture principles.
Familiarity with containerization (Docker, Kubernetes) and version control (Git).
Knowledge of CI/CD processes and DevOps practices for data projects.
Strong understanding of data security, governance, and compliance best practices.
Nice-to-Have
Experience with Snowflake, Databricks, or Redshift for modern data warehousing.
Exposure to real-time data streaming using Kafka, Kinesis, or Spark Streaming.
Familiarity with NoSQL databases (MongoDB, Cassandra, DynamoDB).
Experience with data cataloging and metadata management tools.
Knowledge of machine learning data pipelines and MLOps practices.
Cloud certifications (AWS Certified Data Engineer, Azure Data Engineer Associate, or Google Cloud Platform Professional Data Engineer).
Soft Skills
Strong analytical and problem-solving abilities with a data-driven mindset.
Excellent communication and collaboration skills with cross-functional teams.
Ability to work independently and manage multiple data projects simultaneously.
Detail-oriented with a focus on data accuracy, quality, and scalability.
Proactive, innovative, and passionate about modern data engineering practices.