Job Title: AWS Data Engineer – Pharma / Life Sciences
Location: San Francisco, California (Hybrid)
Employment Type: Contract
Job Summary
We are seeking a highly skilled AWS Data Engineer with strong experience in the Pharmaceutical / Life Sciences domain to design, build, and maintain scalable data pipelines and platforms. The ideal candidate will have deep expertise in AWS data services, big data technologies, and a strong understanding of clinical, regulatory, and pharmaceutical data processes.
You will play a critical role in enabling data-driven decision-making across drug development, clinical trials, and commercial operations while ensuring compliance with industry regulations.
Key Responsibilities
Design, develop, and optimize scalable ETL/ELT pipelines using AWS services.
Build and maintain data lakes and data warehouses on AWS (S3, Redshift, Glue, EMR).
Develop robust data ingestion frameworks for structured and unstructured data sources.
Work with large-scale datasets including clinical, research, and real-world evidence (RWE) data.
Implement data transformation and processing using PySpark, Spark, and SQL.
Ensure data quality, integrity, and compliance with pharma regulations (GxP, FDA requirements, 21 CFR Part 11).
Collaborate with data scientists, analysts, and business stakeholders to deliver data solutions.
Optimize performance, scalability, and cost-efficiency of AWS data platforms.
Implement monitoring, logging, and alerting using AWS CloudWatch and related tools.
Maintain documentation for data pipelines, architecture, and data models.
Required Skills & Experience
Technical Skills
Strong experience with AWS services: Glue, S3, Redshift, EMR, Athena, Lambda, Kinesis
Proficiency in Python, PySpark, and SQL
Hands-on experience with big data technologies (Spark, Hadoop, Databricks)
Experience building ETL/ELT pipelines and data lakes
Knowledge of data modeling, data warehousing, and data architecture principles
Familiarity with workflow orchestration tools (Airflow, Step Functions)
Domain Experience
Strong experience in the Pharmaceutical / Life Sciences industry
Understanding of clinical trials, drug development, and regulatory data
Experience working with healthcare/pharma datasets (RWD, RWE, EHR, claims data)
Compliance & Governance
Knowledge of GxP, GMP, FDA regulations, and 21 CFR Part 11
Experience implementing data governance, security, and compliance frameworks
Preferred Qualifications
Experience with Databricks or Snowflake
Exposure to CI/CD pipelines and DevOps practices
AWS Certifications (e.g., AWS Certified Data Engineer / Solutions Architect)
Experience with data cataloging and governance tools
Education
Bachelor’s or Master’s degree in Computer Science, Data Engineering, Life Sciences, or a related field