Job Details
This is a W2 position open to Denver local candidates only.
Our client, located in Denver, CO, is looking for a Principal Data Scientist to join their team on a W2 contract. This role focuses on building and maintaining data pipelines that connect Oracle-based source systems to AWS cloud environments, providing well-structured data for analysis and machine learning in AWS SageMaker. It involves working closely with data scientists to deliver scalable data workflows as a foundation for predictive modeling and analytics.
Duties
The Lead Data Scientist supports the development of advanced decision support systems by employing techniques from data analytics (including statistical analysis) and machine learning, particularly NLP. The successful candidate will join an innovative and energetic team that develops capabilities to improve the performance and efficiency of our business units. This is a hands-on role in which the Data Scientist is expected to carry a project from start to finish.
- Senior-level, data-science-centric role requiring 3+ years of recent, real-world DS experience, along with strong Python and SQL skills.
- The role leans more toward generative AI than traditional ML from a DS perspective.
- The role has a chance to extend based on performance and may convert to FTE for the right candidate.
- Develop and maintain data pipelines to extract, transform, and load data from Oracle databases and other systems into AWS environments (S3, Redshift, Glue, etc.).
- Collaborate with data scientists to ensure data is prepared, cleaned, and optimized for SageMaker-based machine learning workloads.
- Implement and manage data ingestion frameworks, including batch and streaming pipelines.
- Automate and schedule data workflows using AWS Glue, Step Functions, or Airflow.
- Develop and maintain data models, schemas, and cataloging processes for discoverability and consistency.
- Optimize data processes for performance and cost efficiency.
- Implement data quality checks, validation, and governance standards.
- Work with DevOps and security teams to align pipelines with deployment and access-control standards.
Skills
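To make the pipeline and data-quality duties above concrete, here is a minimal sketch of the kind of validation step this role would implement before loading data for SageMaker workloads. All names (`run_quality_checks`, the `orders` frame, its columns) are hypothetical and for illustration only; a real pipeline would wire such checks into Glue, Step Functions, or Airflow.

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame, required_cols, not_null_cols):
    """Return a list of human-readable failures; an empty list means the frame passes."""
    failures = []
    # Structural check: every expected column is present.
    missing = [c for c in required_cols if c not in df.columns]
    if missing:
        failures.append(f"missing columns: {missing}")
    # Completeness check: flagged columns must contain no nulls.
    for col in not_null_cols:
        if col in df.columns:
            nulls = int(df[col].isna().sum())
            if nulls:
                failures.append(f"{col}: {nulls} null value(s)")
    # Uniqueness check: no fully duplicated rows.
    dup_count = int(df.duplicated().sum())
    if dup_count:
        failures.append(f"{dup_count} duplicate row(s)")
    return failures

# Hypothetical usage: validate an extract before it is written to S3.
orders = pd.DataFrame({"order_id": [1, 2, 2], "amount": [10.0, None, 5.0]})
issues = run_quality_checks(orders, required_cols=["order_id", "amount"],
                            not_null_cols=["amount"])
```

In practice the failure list would be logged and used to halt or quarantine a batch rather than silently loading bad data downstream.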
Required
- Strong proficiency with SQL and hands-on experience working with Oracle databases.
- Experience integrating data for use in AWS SageMaker or other ML platforms.
- Expertise in Python for data engineering (pandas, boto3, pyodbc, etc.).
- Experience designing and implementing ETL/ELT pipelines and data workflows.
- Hands-on experience with AWS data services, such as S3, Glue, Redshift, Lambda, and IAM.
- Solid understanding of data modeling, relational databases, and schema design.
- Familiarity with version control, CI/CD, and automation practices.
- Ability to collaborate with data scientists to align data structures with model and analytics requirements.
Preferred
- Exposure to MLOps or ML pipeline orchestration.
- Familiarity with data cataloging and governance tools (AWS Glue Catalog, Lake Formation).
- Knowledge of data warehouse design patterns and best practices.
- Experience with data orchestration tools (e.g., Apache Airflow, Step Functions).
- Working knowledge of Java is a plus.
Education
B.S. in Computer Science, MIS, or a related field, and a minimum of five (5) years of related experience, or an equivalent combination of education, training, and experience.