Python Data Engineer, $150-200K - Direct Client ($70-120/hr)

Airflow, Docker, Kubernetes, SageMaker, Python, PySpark, Spark, SQL, Data Engineer, Data Scientist, AWS S3, Amazon Redshift, PyCharm
Contract W2, Contract Independent, Contract Corp-To-Corp, 2 Years
$70 - $120
Work from home not available; travel not required

Job Description

Immediate need for a senior data engineer who can also do some data science work. Experience in the biomedical space is desired, as is experience setting up big data and data science infrastructure on AWS. The role involves automating the setup of an environment based on EKS, PySpark, AWS Glue, EMR, and SageMaker, and setting up security for Redshift, S3, and EC2 instances.


  • Assemble large, complex data sets in the format fit for each use case
  • Write generic Python/PySpark modules for processing data from various data sources (XML, Parquet, CSV, relational)
  • Architect, develop, and optimize ETL pipelines using Python, Spark, EMR, Docker, Kubernetes, and Airflow (demonstrable experience required)
  • Develop and optimize big data pipelines for data scientists (requires a basic understanding of data science concepts and ML)
  • Research and recommend new innovative methods and systems to manage data for business improvement
  • Participate in internal governance to drive the data quality business cycle and roadmap

Required Skills

Python, Spark, ETL/data engineering, Docker/Kubernetes, and automation/DevOps experience in AWS. Development and management of Airflow-based data flows.

  • Bachelor's or Master's degree in computer science or software engineering
  • 3+ years of programming experience (including functional programming); must be advanced in Python
  • Experience building and optimizing big data pipelines using Spark
  • Experience with AWS cloud services: S3, EC2, EMR, RDS, Redshift, Glue, Lambda, EKS, SageMaker
  • Experience with relational SQL and NoSQL databases, including Postgres
  • Solid understanding of how to design robust data workflows including optimization and user experience
  • Strong analytical and problem-solving skills
  • Excellent oral and written communication skills
  • Able to work in teams and collaborate with others to clarify requirements
  • Strong coordination and project management skills to handle complex projects
  • Experience developing and working with XML, JSON, and external web services

Preferred Qualifications

  • Clinical drug development domain knowledge
  • Experience working with clinical and biomedical data types (clinical patient data, omics, imaging, etc.)
  • Competencies in applied statistics to solve business needs
  • Knowledge of industry data standards used in drug development, particularly in clinical development
Dice Id : 10126850
Position Id : S-DE-DS