Job Details
Job Title: AWS Data Architect
Location: Tarrytown, NY 10591 (100% onsite; no remote or hybrid)
Duration: Long-term Contract
MANDATORY REQUIREMENT:
Candidates must have prior experience working in the Healthcare, Pharma, or Life Sciences domain.
Only candidates who are currently in NY/NJ, or who are willing to relocate fully to Tarrytown, NY, will be considered. No exceptions.
Job Summary:
We are seeking an experienced AWS Data Architect to lead the design and implementation of scalable data engineering solutions for a major healthcare/pharma client. The architect will play a key role in defining the data strategy, architecture, and governance for enterprise-scale systems.
Required Skills & Experience:
12+ years of experience in data engineering and architecture for large-scale platforms.
Strong expertise in:
AWS ecosystem: Redshift, EMR, S3
Big Data & ETL tools: PySpark, Apache Spark, Databricks, Apache Airflow
Programming: Python, SQL, PySpark
Data Modeling: Building efficient, scalable models for complex systems
Proficiency in data ingestion/orchestration frameworks and real-time data processing.
Experience with data governance & access control (e.g., Privacera, Apache Ranger).
Familiarity with data virtualization platforms (e.g., Dremio).
Understanding of data security, authentication (e.g., Okta), and compliance frameworks.
Experience with DevOps/CI-CD tools: Jenkins, Git, Docker, Kubernetes.
Familiarity with search, discovery, and BI tools such as Solr, Elasticsearch, or Looker.
Exposure to machine learning pipelines and MLOps integration is a plus.
Responsibilities:
Design and develop enterprise data platforms leveraging AWS cloud services.
Define the platform roadmap, solution architecture, proofs of concept (POCs), and best practices.
Lead data migration and validation strategies for moving from legacy to modern platforms.
Guide development teams in implementation, code review, and performance optimization.
Collaborate with stakeholders to translate business requirements into technical designs.
Ensure compliance with regulatory and data governance standards.
Contribute to cost optimization through scalable and efficient architecture.