Job Details
Job Title: Lead Data Engineer / Lead Data Architect
Location: Remote (Preference for US-based candidates)
Start Date: ASAP
Must have 12+ years of overall IT experience
No OPT or CPT candidates
H-1B transfers are encouraged to apply
Must be authorized to work in the U.S. without sponsorship (unless H-1B transfer)
Job Summary:
We are seeking a highly skilled Lead Data Engineer / Lead Data Architect with strong experience in Python and Databricks, preferably with a background in the pharmaceutical or supply chain domain. The ideal candidate will combine architectural expertise, leadership capability, and a strategic mindset to drive data engineering solutions. The ability to engage with business stakeholders is essential.
Key Responsibilities:
Lead the architecture, design, and development of scalable and efficient data pipelines using Databricks, Python, and cloud platforms (AWS, Azure, or Google Cloud Platform)
Define data strategies, standards, and best practices across various data domains
Act as a bridge between business and technical teams; engage with stakeholders to gather and translate business requirements into technical solutions
Guide the development and performance tuning of robust data models and ETL/ELT pipelines
Ensure data quality, governance, and security compliance across solutions
Collaborate with cross-functional teams to enable data-driven decision-making
Evaluate and implement new data technologies and tools as needed for continuous improvement
Mentor and lead junior engineers and data team members
Required Skills & Experience:
12+ years of experience in data engineering or data architecture roles
Strong hands-on experience with Databricks and Python
Proven experience with at least one cloud platform: AWS, Azure, or Google Cloud Platform (multi-cloud experience is a plus)
Experience designing large-scale data architecture for enterprise systems
Background in pharmaceutical or supply chain domains preferred
Excellent understanding of data lakehouse architecture, Delta Lake, and streaming data solutions
Ability to drive technical decisions and contribute to data strategy discussions
Exceptional communication and stakeholder management skills
Preferred Qualifications:
Experience with CI/CD pipelines and Infrastructure-as-Code (IaC) tools such as Terraform
Familiarity with data privacy and regulatory requirements (e.g., HIPAA, GxP, GDPR)
Certifications in Databricks, AWS, Azure, or Google Cloud Platform are a plus
Prior experience leading distributed teams or working in an offshore/onsite delivery model