Due to the nature of the work, our client is unable to work with candidates on an H-1B visa or EAD.
Our client is looking for someone who blends data science and data engineering. You’ll architect and build pipelines in Databricks, design data lake systems, and dig into data to find trends, outliers, and actual causal insights.
As an Applied Data Scientist, you will bridge the gap between raw data and actionable insights. You are responsible for building robust data pipelines in Databricks and applying advanced analytical models to solve complex business problems. You will work closely with an interdisciplinary team to move from data ingestion to data analysis to model deployment.
On any given day we hope that you will:
- Talk to clients: Help our customers find the root causes of their pain points and clearly communicate complex technical ideas for a variety of audiences.
- Design & Build Pipelines: Develop end-to-end data workflows, integrating data from various APIs and databases.
- Architect Data Solutions: Implement Medallion architectures (Bronze/Silver/Gold) to organize data for analytics and AI.
- Develop Models: Build, track, and manage machine learning and data analysis experiments (e.g., using MLflow).
- Integrate AI into Workflow: Incorporate AI tools into daily development workflows to accelerate prototyping, streamline debugging, and reduce repetitive work, while critically evaluating outputs and ensuring production-quality results.
- Collaborate: Partner with Product Managers and Designers to translate vague business requests like "I want better analytics" into technical requirements and dashboard prototypes.
- Ensure Quality: Implement data governance, schema enforcement, and unit tests to maintain high data quality standards.
To succeed, you will need to have some combination of the following:
Experience: 5–8 years of experience in data engineering and/or data science roles.
Technical Proficiency:
Strong proficiency in Databricks, Spark/PySpark, and Delta Lakehouse architecture.
Expertise in Python, SQL, and Power BI.
Experience with Azure tools (e.g., Data Factory, Key Vault).
Comfortable with Git-based version control and collaborative development workflows.
Data Science: Strong foundation in statistics (experimental design, hypothesis testing, causal inference) and proven ability to apply these methods to complex datasets.
Data Analysis: Ability to explore, clean, and analyze large datasets to identify trends, anomalies, and actionable insights that inform product and business decisions.
Data Visualization & Storytelling: Ability to distill complex analyses into clear, compelling narratives and visualizations that drive decision-making for technical and non-technical audiences alike.
Pipeline Orchestration: Experience scheduling and monitoring data workflows using tools like Databricks Workflows, Airflow, or Azure Data Factory.
AI-Assisted Coding: Proficiency in leveraging AI tools (e.g., code copilots, LLMs) to accelerate development, automate repetitive tasks, generate and refactor code, and improve code quality, while maintaining strong judgment around validation, correctness, and security.