Biomedical Cloud Engineer

Overview

On Site
Full Time

Skills

Genomics
Data Lake
Data Processing
MVP
SQL
Management
Network
ATLAS
Palo Alto
Security Operations
PCA
Data Integrity
Data Flow
Workflow
Genetics
Computer Science
Bioinformatics
Computational Physics
Biomedicine
Python
Cloud Computing
Google Cloud Platform
Google Cloud
Database
Graph Databases
Big Data
Apache Spark
Data Security
FedRAMP
Cloud Security
Data Management
Research
Testing
Software Design
Data Modeling
Communication
Team Building
System Integration Testing
Writing
HIS
SAFE
Training
Policies and Procedures
Inventory
Recruiting
Human Resources
Law

Job Details

The Stanford Center for Genomics and Personalized Medicine (SCGPM) has an exciting opportunity available for a motivated Biomedical Cloud Engineer to create innovative data architectures that will automate the process of turning big genomic data into biomedical insights. The ideal person for this position is a keen listener who can interpret biological questions, assess the value and relevance of different technologies and methods, and deliver actionable technical solutions.

Background:
The Department of Veterans Affairs (VA) has commissioned the sequencing of hundreds of thousands of whole genomes from participants in the Million Veteran Program (MVP). This data is currently being delivered to the SCGPM's cloud computing environment and constitutes one of the largest repositories of whole-genome sequencing data in the world. The scale and richness of this data make it an incredible resource for biomedical research. Our goal is to turn this data lake into a data commons: a dynamic computing environment where researchers bring questions and get answers, all without having to go through the ordeal of manually collecting, cleaning, massaging, scrubbing, sorting, transforming, and filtering data.

As an example of a publication from this group, see this reference describing the early design of our data processing system:
Ross, P.B., Song, J., Tsao, P.S. et al. Trellis for efficient data and task management in the VA Million Veteran Program. Scientific Reports 11, 23229 (2021).

Position:
In this position, you would be the system developer for Trellis, the cloud-based MVP data management system that we have created. Trellis stores the petabytes of sequence data contributed to the MVP by veterans and orchestrates its processing while keeping track of what programs were used, maintaining a detailed record of data provenance.

To manage the enormous volumes of biomedical research data that the MVP generates, we have built Trellis on the Google Cloud Platform. The Trellis architecture takes advantage of many serverless cloud services, such as Cloud Functions, Dataproc, Cloud SQL, and Pub/Sub, to create a workflow that responds to the arrival of new data by initiating pipeline processes automatically and at scale.

A production version of Trellis has already processed the whole-genome sequences of over 150,000 veterans, and we plan to process at least as many more in the coming year. You would take the lead in keeping this production system running and optimized, and you would interface with our SecOps team, which maintains that system in a FedRAMP-secure environment.

Our Team:
Our SCGPM bioinformatics team is a multi-disciplinary group composed of about a dozen scientists, engineers, and software developers with complementary backgrounds, each contributing their own expertise in managing and analyzing complex biomedical data. Other projects supported by this team include the NCI Human Tumor Atlas Network, Human BioMolecular Atlas Program, and the Stanford Metabolic Health Center.

This position can be on-site in Palo Alto, fully remote, or hybrid.

Duties include:
  • Maintaining the smooth execution of our production Trellis system
  • Working with our Security Operations team to respond to any security incidents
  • Constructing queries to our graph database to gain insights from pipeline run data
  • Implementing population-level genomic analyses (GWAS, PCA) to verify data integrity
  • Designing and integrating novel bioinformatics pipelines into our Trellis system
  • Troubleshooting data flow in our state-driven Trellis architecture
  • Building containers for bioinformatics tools and integrating them with our internal data management system to automate workflows
  • Collaborating with researchers to explore solutions to relevant biological questions and maximize the value of our whole-genome sequencing dataset to the public
* - Other duties may also be assigned.

DESIRED QUALIFICATIONS:
  • Four-year degree in Genetics, Computer Science, Bioinformatics, Computational Physics, or a related field
  • Experience with biomedical data formats (FASTQ, FASTA, BAM, CRAM, Hail MatrixTable, etc.)
  • Comfortable programming in Python
  • Experience with cloud computing, especially Google Cloud
  • Experience with databases, especially graph databases
  • Experience with big data technologies (e.g., BigQuery, Spark, Hail, Terra)
  • Familiarity with issues in computer data security
  • Familiarity with FedRAMP cloud security
  • Familiarity with FAIR principles of data management
  • Excellent verbal and written communication skills
  • An ability to independently grasp the objectives of research projects and assemble solutions from a range of technologies, standards, and approaches
  • A desire to learn new methods and technologies and to adapt to demands of fast-paced research

EDUCATION & EXPERIENCE (REQUIRED):

Bachelor's degree and five years of relevant experience, or a combination of education and relevant experience.

KNOWLEDGE, SKILLS AND ABILITIES (REQUIRED):
  • Expertise in designing, developing, testing, and deploying applications.
  • Proficiency with application design and data modeling.
  • Ability to define and solve logical problems for highly technical applications.
  • Strong communication skills with both technical and non-technical clients.
  • Ability to lead activities on structured team development projects.
  • Ability to select, adapt, and effectively use a variety of programming methods.
  • Knowledge of application domain.

CERTIFICATIONS & LICENSES:

None

PHYSICAL REQUIREMENTS*:
  • Constantly perform desk-based computer tasks.
  • Frequently sit, grasp lightly/fine manipulation.
  • Occasionally stand/walk, writing by hand.
  • Rarely use a telephone, lift/carry/push/pull objects that weigh up to 10 pounds.
* - Consistent with its obligations under the law, the University will provide reasonable accommodation to any employee with a disability who requires accommodation to perform the essential functions of his or her job.

WORKING CONDITIONS:

May work extended hours, evenings, and weekends.

WORK STANDARDS (from JDL):
  • Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues and clients and with external organizations.
  • Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for safety; communicates safety concerns; uses and promotes safe behaviors based on training and lessons learned.
  • Subject to and expected to comply with all applicable University policies and procedures, including but not limited to the personnel policies and other policies found in the University's Administrative Guide.

The job duties listed are typical examples of work performed by positions in this job classification and are not designed to contain or be interpreted as a comprehensive inventory of all duties, tasks, and responsibilities. Specific duties and responsibilities may vary depending on department or program needs without changing the general nature and scope of the job or level of responsibility. Employees may also perform other duties as assigned.

Consistent with its obligations under the law, the University will provide reasonable accommodations to applicants and employees with disabilities. Applicants requiring a reasonable accommodation for any part of the application or hiring process should contact Stanford University Human Resources. For all other inquiries, please submit a contact form.

Stanford is an equal employment opportunity and affirmative action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected by law.

Additional Information
  • Schedule: Full-time
  • Job Code: 4822
  • Employee Status: Regular
  • Grade: J
  • Requisition ID: 107025
  • Work Arrangement: On Site