Our Client is seeking an Applied Data Scientist.
As an Applied Data Scientist, you will bridge the gap between raw data and actionable insights. You will be responsible for building robust data pipelines in Databricks and applying advanced analytical models to solve complex business problems. You will work closely with an interdisciplinary team across the full lifecycle, from data ingestion through data analysis to model deployment.
On any given day we hope that you will:
● Talk to Clients: Help our customers find the root causes of their pain points and clearly communicate complex technical ideas to a variety of audiences.
● Design & Build Pipelines: Develop end-to-end data workflows, integrating data from various APIs and databases.
● Architect Data Solutions: Implement Medallion architectures (Bronze/Silver/Gold) to organize data for analytics and AI.
● Develop Models: Build, track, and manage machine learning and data analysis experiments (e.g., using MLflow).
● Integrate AI into Workflows: Incorporate AI tools into daily development workflows to accelerate prototyping, streamline debugging, and reduce repetitive work, while critically evaluating outputs and ensuring production-quality results.
● Collaborate: Partner with Product Managers and Designers to translate vague business requests like "I want better analytics" into technical requirements and dashboard prototypes.
● Ensure Quality: Implement data governance, schema enforcement, and unit tests to maintain high data quality standards.
To succeed, you will need to have some combination of the following:
● Experience: 5–8 years of experience in data engineering and/or data science roles.
● Technical Proficiency:
○ Strong proficiency in Databricks, Spark/PySpark, and Delta Lakehouse architecture.
○ Expertise in Python, SQL, and Power BI.
○ Experience with Azure tools (e.g., Data Factory, Key Vault).
○ Comfortable with Git-based version control and collaborative development workflows.
● Data Science: Strong foundation in statistics (experimental design, hypothesis testing, causal inference) and proven ability to apply these methods to complex datasets.
● Data Analysis: Ability to explore, clean, and analyze large datasets to identify trends, anomalies, and actionable insights that inform product and business decisions.
● Data Visualization & Storytelling: Ability to distill complex analyses into clear, compelling narratives and visualizations that drive decision-making for technical and non-technical audiences alike.
● Pipeline Orchestration: Experience scheduling and monitoring data workflows using tools like Databricks Workflows, Airflow, or Azure Data Factory.
● AI-Assisted Coding: Proficiency in leveraging AI tools (e.g., code copilots, LLMs) to accelerate development, automate repetitive tasks, generate and refactor code, and improve code quality, while maintaining strong judgment around validation, correctness, and security.