Lead Data Engineer

Overview

On Site
$55-$65
Contract - W2
Contract - Independent
Contract - 6-9 Month(s)

Skills

Data Engineering
FOCUS
Cloud Computing
Big Data
Lifecycle Management
Communication
Python
PySpark
Scala
Performance Tuning
Optimization
Apache Spark
Google Cloud
Google Cloud Platform
Workflow
Orchestration
Machine Learning (ML)
Management
Docker
Continuous Integration
Continuous Delivery
Testing
Version Control
Documentation
Data Structure
Reporting
Extract
Transform
Load
Database
Real-time
Analytical Skill
Data Quality
Accessibility
Java
Apache Hadoop
Apache Hive
Apache Cassandra
Apache Pig
MySQL
NoSQL
PPO
UPS
Legal
Insurance
Collaboration
Teamwork

Job Details


***At this time, we are unable to consider candidates requiring visa sponsorship or third-party recruitment agencies for this role. We thank you for your understanding.***

We are seeking a Lead Data Engineer for a remote contract position.


Location: Remote

Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience
5-20+ years of experience in data engineering with a strong focus on Python, PySpark, and/or Scala
Expertise in Spark performance tuning and optimization in cloud-based environments (preferably Google Cloud Platform)
Hands-on experience with workflow orchestration tools like KFP or Airflow
Proficiency with Docker and container-based deployment strategies
Experience building scalable and maintainable data pipelines
Strong understanding of distributed systems, big data technologies, and data lifecycle management
Excellent communication and collaboration skills

Core Responsibilities Include:
Designing and building large-scale data pipelines using Python, PySpark, and Scala
Performance tuning and optimization of Spark jobs, especially in Google Cloud Platform environments
Developing and maintaining production-ready workflows using orchestration tools such as Kubeflow Pipelines (KFP) or Airflow
Collaborating closely with data scientists, ML engineers, and product teams to support experimentation, model development, and deployment
Managing containerized applications with Docker and integrating them into CI/CD pipelines
Ensuring code quality through testing, version control, and documentation
Developing software that processes, stores, and serves data for use by others
Developing large-scale data structures and pipelines to organize, collect, and standardize data that helps generate insights and address reporting needs
Writing ETL (Extract / Transform / Load) processes, designing database systems, and developing tools for real-time and offline analytic processing
Ensuring that data pipelines are scalable, repeatable, and secure, and troubleshooting software and processes for data consistency and integrity
Integrating data from a variety of sources and ensuring that it adheres to data quality and accessibility standards
Applying in-depth knowledge of large-scale search applications and high-volume data pipelines
Applying in-depth knowledge of Java, Hadoop, Hive, Cassandra, Pig, MySQL, NoSQL, or similar technologies


Benefits:

York Solutions offers a generous benefits package for eligible full-time employees:

  • BCBS Medical with 3 plans to choose from (PPO and high-deductible PPO plans with Health Savings Program)
  • Delta Dental plan with 2 free cleanings and insurance discounts
  • EyeMed Vision with annual check-ups and discounts on lenses
  • Life and Accidental Death Insurance paid by the company
  • John Hancock 401(k) Retirement Plan with discretionary company match up to 5%
  • Voluntary insurance programs such as Hospital Indemnity, Identity Protection, Legal Insurance, Long Term Care, and Pet Insurance
  • Flexible work environment with some remote working opportunities
  • Fun, strongly team-oriented environment
  • Learning, development, and career growth



