Responsibilities:
- Lead and mentor data engineers and related roles, promoting a culture of technical excellence and collaboration.
- Identify and own enterprise-wide data architecture solutions, ensuring alignment with business growth and emerging technologies.
- Design, build, and evolve scalable data architectures and pipelines with a focus on simplicity, performance, and reliability.
- Develop and optimize ETL/ELT processes, data flows, and infrastructure leveraging AWS, SQL, and platforms such as Databricks, Redshift, Hadoop, and Airflow.
- Assemble and manage large, complex datasets to meet functional and non-functional business requirements.
- Implement best practices in data quality, profiling, anomaly detection, and performance monitoring.
- Deliver analytical tools and data products that provide actionable insights into business performance and customer behavior.
- Partner with cross-functional teams, including engineers, analysts, data scientists, product, and government stakeholders, to address data challenges.
- Drive process improvements to enhance scalability, automation, and developer productivity.
- Promote engineering best practices through testing, CI/CD, Infrastructure as Code, and code reviews.
Requirements:
Minimum Requirements:
All candidates must pass a public trust clearance through the U.S. Federal Government. This requires candidates either to be U.S. citizens or to pass clearance through the Foreign National Government System, which requires that candidates have lived in the United States for at least 3 of the previous 5 years and hold a valid, non-expired passport from their country of birth along with appropriate visa/work-permit documentation.
- Bachelor's degree in Computer Science, Software Engineering, Data Science, Statistics, or a related technical field.
- Minimum 10 years of relevant software engineering experience.
- Minimum 2 years working on large-scale Databricks implementations.
- Proficiency in at least one of the following languages: Python, TypeScript, or JavaScript.
- Proven experience working on large-scale system architectures and petabyte-scale data systems.
- Experience with cloud-native data tools and architectures (e.g., Redshift, Glue, Airflow, Apache Spark).
- Proficiency in automated testing frameworks (PyTest, Playwright, or Jest) and testing best practices.
- Experience developing, testing, and securing RESTful and GraphQL APIs.
- Proven track record with AWS cloud architecture, including networking, security, and service orchestration.
- Experience with containerization and deployment using Docker, and with infrastructure automation using Kubernetes and Terraform/Terragrunt.
- Proficiency with Git, Git-based workflows, and release pipelines using GitHub Actions and CI/CD platforms.
- Knowledge of performance monitoring tools such as Grafana, Prometheus, and Sentry.
- Comfort working in a tightly integrated Agile team (10 or fewer people).
- Strong written and verbal communication skills, including the ability to explain technical concepts to non-technical stakeholders.
Desired Qualifications:
- Deep knowledge of relational and NoSQL databases (PostgreSQL, MySQL, MongoDB).
- Knowledge of event-driven architectures and systems such as Kafka, Kinesis, or RabbitMQ.
- Familiarity with Redis for caching or message queuing.
- Familiarity with data mesh principles and domain-oriented architectures, and experience connecting data domains securely.
- Experience with authentication/authorization frameworks such as OAuth, SAML, Okta, Active Directory, and AWS IAM (ABAC).
- Experience exploring or building ETL pipelines and data ingestion workflows.
- Experience with modern frameworks such as React.js, Next.js, Node.js, and Flask.
- Strong grasp of access control, identity management, and federated data governance.
- CMS and healthcare expertise: in-depth knowledge of CMS regulations and experience with complex healthcare projects, particularly data infrastructure projects or similar.