Role: Data Engineer with Python, PySpark and Databricks or Azure Synapse or Fabric
Location: Boston, MA (hybrid, 3 days a week onsite)
W2 position only
Visa Status: or
Interview Process: 1 Zoom call, then an on-site meeting with the team.
Key Skillsets: Experience with Python and PySpark is required. Microsoft Fabric experience is preferred; candidates without it will be considered if they have worked with Databricks or Azure Synapse.
This is an excellent opportunity for a hands-on Data Engineer who enjoys building foundational platforms from the ground up and working closely with architecture and infrastructure teams.
Key Responsibilities
- Help design and build a new enterprise data platform from inception
- Support platform provisioning and enterprise readiness initiatives
- Work within cloud IaaS and modern data ecosystem environments
- Build scalable data pipelines and integration frameworks
- Collaborate with engineering, architecture, and platform teams
- Contribute to governance, scalability, security, and operational best practices
Preferred Background
- Strong Data Engineering experience in modern cloud data environments
- Experience with Microsoft Fabric strongly preferred
- Open to candidates with strong Databricks or Snowflake backgrounds
- Financial services industry experience preferred, but other industries will be considered
- Experience working in greenfield or platform build-out environments is highly desirable
- Strong cloud and enterprise architecture mindset
Additional Notes
- Team is rapidly expanding and there is significant visibility within the program
- Looking for strong communicators who can operate in an evolving environment
- Hybrid model requires 4 days onsite in either Boston or Needham
Sapient job description:
As a Data Engineer at Publicis Sapient, you will design, build, and optimize modern data platforms that power intelligent, data-driven experiences for global clients. You will work at the intersection of cloud, data engineering, and analytics, enabling scalable ingestion, transformation, and storage of enterprise data across lakehouse and warehouse architectures. You will collaborate closely with architects, analysts, and product teams to ensure data solutions are reliable, performant, and aligned to business outcomes.
- Design and implement end-to-end data ingestion pipelines using Azure services, including API-based ingestion and Azure Data Factory (ADF).
- Build and manage lakehouse and data warehouse solutions using modern data storage formats to support analytical and operational workloads.
- Develop and optimize data transformations using PySpark, ensuring scalability, performance, and cost efficiency.
- Apply medallion architecture (bronze, silver, gold layers) to enable high-quality, governed, and reusable datasets.
- Partner with cross-functional teams to support data modeling, analytics, and downstream consumption use cases.
- Contribute to best practices around data quality, reliability, and maintainability across the data platform.
- Hands-on experience or strong working knowledge of Microsoft Fabric, including its role in modern analytics and lakehouse architectures.
- Proven experience working in Azure for data ingestion and orchestration.
- Strong experience with Azure Data Factory (ADF) for pipeline development and scheduling.
- Experience building API-based data ingestion.
- Solid understanding of data storage formats, including CSV, JSON, and Parquet.
- Experience designing and working with data warehouses and lakehouse architectures.
- Strong foundation in data modeling concepts for analytical workloads.
- Practical experience implementing medallion architecture.
- Proficiency in PySpark for large-scale data transformations and optimization.
- Ability to write clean, maintainable, and well-documented data pipelines.
Set Yourself Apart
- Experience optimising Spark jobs for performance and cost in cloud environments.
- Familiarity with data governance, data quality, or observability practices in large-scale data platforms.
- Experience collaborating with analytics, data science, or AI teams on production-grade data solutions.
- Exposure to agile delivery models and working in cross-functional, client-facing teams.