Overview
Skills
Job Details
We are seeking a highly skilled Senior Data Scientist (Contractor) to join our team. This role will focus on developing and deploying advanced AI/ML models to drive key business decisions. The ideal candidate will have extensive experience in machine learning, data engineering, and cloud computing, with a proven ability to deliver impactful solutions.
As a contractor, you will work closely with cross-functional teams to design, build, and optimize machine learning models using AWS services and tools, including SageMaker, Bedrock, large language models (LLMs), PyTorch, TensorFlow, Deep Learning Containers, Jupyter notebooks, and Glue. This role offers an exciting opportunity to contribute to cutting-edge AI solutions in a fast-paced environment. You will apply advanced analytics, machine learning, and statistical modeling to optimize gas network operations, enhance asset safety, and drive data-informed decision-making. You will partner with engineering, operations, safety, and regulatory teams to transform raw data (e.g., SCADA telemetry, meter readings, inspection data, GIS layers, IoT sensors) into actionable insights that reduce risk, improve reliability, and support regulatory compliance.
Key Responsibilities:
Predictive Modeling & Forecasting:
Develop and implement advanced machine learning models for predictive analytics, forecasting, and optimization of assets such as pipelines and compressors.
Develop time-series demand-forecast models to optimize supply planning and minimize imbalances.
Build anomaly-detection algorithms for early leak and fault identification in pipelines and distribution assets.
Train, evaluate, and deploy models using AWS SageMaker, Bedrock (LLMs), PyTorch, TensorFlow, Deep Learning Containers, Jupyter notebooks, and Glue.
Conduct model validation and performance monitoring to ensure accuracy and efficiency.
Data Engineering & AWS:
Design and implement robust ETL (extract, transform, load) data pipelines using AWS Glue.
Manage and optimize data storage and processing in the AWS cloud environment.
Ensure data quality and integrity throughout the data lifecycle.
Leverage additional AWS services as needed to enhance data processing and model deployment.
Advanced Analytics & Optimization:
Develop and maintain Python scripts for data manipulation, analysis, and model implementation.
Write clean, efficient, and well-documented code.
Employ machine learning frameworks (e.g., scikit-learn, TensorFlow) to optimize assets such as compressors (e.g., scheduling, valve settings, and pressure control).
Conduct geospatial analyses leveraging GIS data to map risk zones and prioritize inspection routes.
Reporting & Visualization:
Create interactive dashboards (e.g., in Power BI, QuickSight) to communicate KPIs such as leak rates and maintenance effectiveness to stakeholders.
Translate complex model outputs into clear, concise recommendations for both technical and non-technical audiences.
Collaboration & Communication:
Work closely with engineers, business analysts, and stakeholders to define and solve business problems.
Effectively communicate complex technical concepts to both technical and non-technical audiences.
Present findings and recommendations to senior management.
Technical Skills
Programming Languages: Python, SQL
ML Frameworks: Scikit-learn, TensorFlow, PyTorch
NLP Tools: spaCy, HuggingFace Transformers, BERT, GPT-based models
Data Engineering Tools: AWS Glue, Pandas, Polars
Geospatial Tools: GeoPandas, Shapely, PostGIS
Visualization: Plotly, Dash, Power BI, or Tableau
Version Control: Git, GitHub
Preferred Experience:
Prior experience building ML models in the utility industry or another asset-heavy industry, with a focus on leak detection, damage prevention, cathodic protection, and regulatory compliance
Prior MLOps experience
Strong Data Engineering background
Knowledge of, or past experience with, generative AI techniques for model development and enhancement
GIS/Geospatial Data: Experience utilizing geospatial data and GIS tools for advanced geospatial modeling and engineering
Candidates local to Los Angeles/Southern California are preferred, but this is not required
Ph.D. in Computer Science, Electrical Engineering, Geospatial Analytics, Environmental Engineering, or another engineering field
Prior Experience in Natural Language Processing (NLP) projects
Advanced proficiency in Python programming
Utility industry experience (gas/electric preferred; familiarity with SAP and GIS is a plus)
Qualifications:
Master's or Ph.D. in Computer Science, Statistics, Mathematics, or a related field.
10+ years of experience in data science, predictive modeling, and machine learning.
15+ years of overall experience in Data Engineering, Software Engineering, and/or Data Science roles.