Job Details
Position: ML / Bioinformatics Data Scientist
Location: Remote (PST work hours)
Duration: Long term contract
About the Role:
We are seeking a highly motivated and collaborative Bioinformatics/ML scientist to join the Computational Biology & Medicine department in the Computational Sciences Center of Excellence (COE) within Genentech's Research and Early Development (gRED). The successful candidate will contribute to a cross-functional project that applies machine learning (ML) models to multi-modal datasets collected from clinical trials. This role requires a deep understanding of how to apply ML models, a background in biology, a passion for innovation, and a commitment to improving healthcare outcomes through cutting-edge technology.
We are looking for exceptional researchers with a passion for interdisciplinary research and technical problem-solving, and a proven ability to develop and implement research ideas. To be considered, candidates must have worked on previous ML modeling projects and applied those models to multi-modal datasets.
About the Project:
The goal of this project is to develop a machine learning model that predicts a patient's risk of drug-induced liver toxicity from a wide variety of patient characteristics, including clinical, genetic, omics, and safety lab data. The focus will be on harmonizing these diverse data sources, deriving new features, and building machine learning models designed to identify a predictive signature that can distinguish between at-risk and not-at-risk patient populations.
Key Responsibilities:
- Centralize and harmonize data from diverse sources.
- Apply ML methods to the assembled dataset to predict patients' risk of drug-induced liver toxicity.
- Collaborate with interdisciplinary and cross-functional teams including biologists, chemists, data scientists, and other stakeholders.
Educational Background:
- PhD in a quantitative field (e.g., Computer Science, Computational Biology, Bioinformatics, Statistics, Mathematics)
Experience:
- Proven track record of working with statistical modeling techniques, including ML methods, is required
- Demonstrated interest in problems across biology as applied to the discovery and development of treatments for disease is preferred
Technical Skills:
- Data Science & Programming: Expertise in Python/R for data manipulation, statistical analysis, and ML model building (required)
- Multimodal Data & Modeling: Proven ability to work with diverse data types (omics, clinical, imaging) (required)
- Knowledge of statistics and experience with survival analysis (required)
- Domain & AI-specific Skills: Experience with NLP/LLMs for feature extraction from unstructured text, and a strong background in neuroscience (preferred)
Soft Skills:
- Excellent communication, collaboration, and problem-solving skills (required)
Publications:
- Strong publication record and experience contributing to research communities.