Job Title: Sr. Data Engineer
Contract: Full-Time (W2)
Location: Manassas Park, VA (Hybrid)
Job Summary
We are seeking a highly skilled Senior Data Engineer to join our team in Manassas Park, VA. The ideal candidate will be responsible for designing, developing, and optimizing scalable data pipelines, cloud-based data solutions, and analytical infrastructure to support enterprise data initiatives. This role requires hands-on engineering expertise, strong problem-solving skills, and the ability to collaborate with cross-functional teams.
Responsibilities
· Design and implement scalable and secure data pipelines for batch and real-time processing.
· Architect, build, and maintain data lake and data warehouse solutions across Azure, AWS, or Google Cloud Platform environments.
· Develop ETL/ELT processes using tools such as Apache Spark, Databricks, Airflow, Hadoop, Kafka, and cloud-native services.
· Optimize data storage and retrieval performance to support advanced analytics, BI dashboards, and machine learning workloads.
· Work closely with Data Scientists, Architects, and Business Analysts to understand data requirements and translate them into technical solutions.
· Implement CI/CD pipelines for data engineering solutions using Git, Jenkins, or similar tools.
· Ensure adherence to data quality, governance, security, and compliance best practices.
· Troubleshoot data pipeline issues, perform root cause analysis, and continuously improve system reliability.
Required Skills
· 10+ years of hands-on Data Engineering experience.
· Strong expertise in Python or Scala for data pipeline development.
· Proficiency in SQL and performance tuning for large datasets.
· Deep understanding of distributed computing frameworks: Apache Spark, Hadoop, Hive, Kafka.
· Experience with cloud platforms (Azure preferred, AWS or Google Cloud Platform acceptable).
· Strong knowledge of data warehousing concepts (Snowflake, Redshift, Synapse, BigQuery).
· Experience with orchestration tools such as Airflow, Azure Data Factory, or AWS Glue.
· Hands-on experience with CI/CD, DevOps practices, and containerization (Docker/Kubernetes).
· Experience with Databricks or MLflow.
· Knowledge of streaming technologies (Kafka, Kinesis).
· Exposure to data security frameworks and data catalog tools.
· Experience working in Agile/Scrum environments.
· Familiarity with machine learning pipelines and MLOps concepts.