Job Details
Essential
Education & background
o MSc or PhD in Statistics, Biostatistics, Data Science, Mathematics, or a related
quantitative field.
o Strong foundation in probability, statistical inference, and regression modelling.
Technical skills
o Proven experience with multiple (multivariable) linear regression (incl. diagnostics,
handling multicollinearity, transformations, interaction terms, model selection).
o Hands-on experience with survival analysis, especially Cox proportional hazards
models (assumption checking, time-varying covariates, baseline hazard
interpretation).
o Proficient in Python (e.g. pandas, numpy, scikit-learn, statsmodels, lifelines) and/or R
for statistical modelling.
o Experience working with messy real-world data: missingness, outliers, skewed
distributions, and appropriate remediation (imputation, robust methods,
transformations).
Domain experience
o Experience analysing pharma / healthcare / clinical datasets (e.g. clinical trials,
cost/resource-use data, outcomes data).
o Familiarity with concepts such as endpoints, covariates, censoring, follow-up time,
and patient cohorts.
Analytical & communication skills
o Able to design end-to-end analysis pipelines: problem framing, data preparation,
modelling, validation, and interpretation.
o Strong ability to explain statistical results in plain language to non-technical
stakeholders (e.g. clinicians, commercial teams).
o Comfortable producing clear documentation, slides, and summary reports of methods
and findings.
Ways of working
o Experience using Git / version control and working in a collaborative environment
(code review, branching, pull requests).
o Detail-oriented, with a pragmatic approach to balancing rigour and timelines.
Desirable
o Prior experience in a pharma, biotech, or health analytics setting.
o Familiarity with cost and resource-use modelling, health-economic concepts, and
real-world evidence.
o Experience deploying analyses into reproducible workflows (e.g. Jupyter notebooks,
automated pipelines).
o Exposure to broader ML methods (tree-based models, regularisation, gradient
boosting, etc.) and model explainability (e.g. SHAP).
o Strong SQL skills for extracting, cleaning, joining, and aggregating data from
relational databases.