Role Overview
The client is seeking an experienced Data Engineer with strong expertise in Python, AWS, Kubernetes, and MLOps to build and manage scalable data and machine learning pipelines. The role involves working with modern data and ML tooling such as Spark, Pandas, and NumPy in a cloud-based environment to support advanced analytics and model deployment. The position is based in McLean, VA, and is open to candidates within a 100-mile radius of McLean.
Key Responsibilities
Design and develop scalable data pipelines and ML workflows
Implement and manage the end-to-end MLOps lifecycle, including deployment and monitoring
Work with distributed data processing systems using Spark and AWS
Deploy and manage containerized applications using Kubernetes and Kubeflow
Collaborate with cross-functional teams to productionize ML models
Ensure data quality, performance optimization, and system reliability
Required Skills (with Years)
Python 5+ years
AWS (Amazon Web Services) 5+ years
Kubernetes 5+ years
Kubeflow 5+ years
MLOps 5+ years
Apache Spark 5+ years
Pandas 5+ years
NumPy 5+ years
Prior experience working with the client 5+ years
Preferred Skills
SQL (Data Analysis)
mlplot
Databricks
Preferred Certifications
AWS Certified Solutions Architect
Certified Kubernetes Administrator (CKA)
Databricks Certification