The Senior Data Scientist will work with key R&D leaders to extract insights from complex clinical, translational and real-world data. The role offers the opportunity to work on complex data science problems, including developing modeling techniques to interpret data, draw inferences and make recommendations.
Roles and responsibilities
As a Senior Data Scientist you will be engaged in most or all of the following activities:
- Apply state-of-the-art machine learning knowledge to the life sciences, pharma and healthcare domains to identify solutions to healthcare problems.
- Work with large data sets and integrate diverse data sources, data types and data structures into the solution.
- Develop analytical approaches to meet business requirements; this involves translating requests into use cases and test cases, preparing training data sets and developing algorithms iteratively.
- Collaborate with various stakeholders such as Physicians, KOLs, Hospital EMR/EHR IT teams, Business Stakeholders, Product Management, and Clients on the formulation and application of new modeling solutions for a variety of healthcare-related problems.
- Lead efforts to solve problems that impede health systems.
- Engage in deep research and identify new predictive modeling techniques as appropriate for a specific solution.
- Present research results and recommendations both internally and to customers.
- Lead teams that present CPML’s research at leading conferences and academic sessions.
- Perform exploratory data analysis to gauge the need for or appropriateness of advanced analytical methods.
- Formulate, implement, test and validate predictive models, and implement efficient automated processes for producing modeling results at scale.
- Create robust models based on statistical and data mining techniques to provide insights and recommendations from large, complex data sets.
- Present stories told by data in a visually appealing and easy-to-understand manner.
- Collaborate with cross-functional teams, including but not limited to clinicians, data scientists, translational medicine scientists, statisticians and IT professionals.
- Proactively build partnerships with specialist functions and global counterparts to maximize knowledge and available resources.
- Mentor/coach team members to further develop their skills.
Qualifications
- Ph.D. in quantitative sciences (computer science, math, statistics or engineering)
- 6+ years of experience in healthcare or pharmaceuticals is preferred but not required
- Strong knowledge of programming languages, with a focus on machine learning (R, Python, and Scala)
- Strong demonstrated understanding of machine learning methods and statistical packages as applied to data analysis in open-source scripting languages (e.g. R, Python) is required
- Strong demonstrated understanding of the data lifecycle is required, including data ingestion (ingesting data from disparate systems into a cloud computing environment), data contextualization (merging large data sets, developing algorithms to merge and clean data), and data insights and analytics
- Demonstrated skills in building scalable analytical solutions are required
- Ability to quickly develop working knowledge in areas outside of one’s expertise
- Hands-on experience with business applications of distributed computing (e.g. MapReduce, Hive, Hadoop) is preferred
- Demonstrated experience in setting up and managing data in a cloud computing environment (AWS or MS Azure) is preferred.
- Ability to summarize technically/analytically complex information for a non-technical audience
- Demonstrated ability to work in a team environment with good interpersonal, communication, writing and organizational skills.
- Outstanding technical and analytical skills, with proficiency in understanding and conceptualizing business problems and implementing analytic or decision support solutions