AI Ops Engineer / ML Ops Engineer (Bay Area, CA locals only)

Fremont, CA, US • Posted 14 hours ago • Updated 14 hours ago
Contract W2
Contract Independent
Travel Required
On-site
Depends on Experience

Job Details

Skills

  • Machine Learning (ML)
  • Machine Learning Operations (ML Ops)
  • Python
  • Regulatory Compliance
  • SLA
  • Scripting
  • ServiceNow
  • Shell
  • Artificial Intelligence
  • Gen AI
  • Attention To Detail
  • Change Management
  • Communication
  • Conflict Resolution
  • Confluence
  • Data Security
  • Amazon Web Services
  • Analytical Skill
  • Documentation
  • IT Operations
  • ITIL
  • Debugging
  • DevOps
  • Forecasting
  • Incident Management
  • Cloud Computing
  • Collaboration
  • JIRA
  • Log Analysis
  • Microsoft Azure
  • Microsoft SharePoint
  • Splunk
  • Windows PowerShell
  • BMC Remedy
  • Nagios
  • Problem Solving
  • Time Series
  • Zabbix
  • SOP
  • AI Ops

Summary

Experience Requirements:
5+ years in IT operations or L1 support roles.
Exposure to AIOps environments or automated monitoring solutions is a plus.
Qualifications:
Bachelor's or master's degree in Computer Science, Engineering, or a related field.

Key Skills:
Splunk; PowerShell or Python; log monitoring; Confluence and SharePoint.
Skill Requirements:
Hands-on experience with IT monitoring tools (e.g., Nagios, Zabbix, Prometheus, Splunk, or similar).
Understanding of scripting (PowerShell, Python, or Shell) for basic automation tasks.
Understanding of AIOps concepts and automation frameworks.
Proficiency in Confluence and SharePoint for status updates and documentation.
Ability to interpret logs and detect anomalies proactively.
Familiarity with ITIL processes for incident, problem, and change management.
Experience using ticketing systems (e.g., ServiceNow, Jira, Remedy).
Skilled in creating and updating runbooks and SOPs.
Ability to follow documented procedures accurately.
Strong attention to detail for maintaining health check reports and incident updates.
Analytical thinking for quick problem identification and escalation.
Excellent communication and documentation skills.
Proactive mindset with a passion for reliability and automation.
Strong problem-solving and debugging skills.
Preferred:
ITIL Foundation Certification.
Experience with anomaly detection, time-series forecasting, and log analysis.
Basic certifications in monitoring tools or cloud platforms (AWS, Azure).
Key Responsibilities:
Proactively monitor alerts and detect anomalies from logs.
Perform daily health checks until full automation and application monitoring are implemented.
Perform status checks as documented in existing runbooks.
Create and update runbooks as needed to reflect current processes.
Update system health status every 2 hours during the shift in Confluence or SharePoint.
Acknowledge incidents promptly and route them to the correct team.
Update incident status every 4 hours for P1/P2 tickets.
Communicate with users and provide timely updates on their requests.
Ensure timely acknowledgment, follow-up, and closure of incidents within SLA.
Complete service tasks within SLA so that queues clear quickly.
Work strictly as per SOPs documented by the team.
Apply incident management processes and ITIL principles.
Follow documented procedures and create or update runbooks as needed.
Communicate and coordinate effectively.
Use Confluence, SharePoint, and ticketing systems for day-to-day work.
Implement best practices in ML operations and productionization.
Ensure compliance with enterprise data security, governance, and regulatory requirements.
Collaborate with data engineers, analysts, DevOps/SRE teams, and business teams to ensure reliability and security.

  • Dice Id: 90942645
  • Position Id: 8929616