Job Details
Position Summary
We are seeking an AI/ML Engineer with hands-on experience designing, implementing, and deploying AI Agents and ML models for cloud resource optimization, operational automation, and predictive analytics. The ideal candidate will have deep expertise in hybrid cloud environments and be able to integrate AI-driven insights into enterprise-scale cloud monitoring and operations systems.
Responsibilities
Design & Develop AI Agents for cloud resource allocation, auto-scaling, and performance tuning.
Build predictive models for failure detection, incident management, and system health monitoring.
Automate operational workflows using ML and intelligent scripting.
Integrate AI-driven analytics into cloud monitoring tools.
Collaborate with DevOps & SRE teams for production ML deployment.
Conduct anomaly detection for security, performance, and cost optimization.
Research & recommend emerging AI technologies for operational improvement.
Maintain technical documentation and AI/ML best practices.
Mandatory Skills
Category | Skill |
---|---|
Programming | Python, GoLang, Bash, C/C++ |
ML Frameworks | PyTorch, TensorFlow, scikit-learn |
Tools | Jupyter, Terraform, Ansible, Prometheus, Splunk, AppDynamics |
Cloud Platforms | AWS, Google Cloud Platform, OpenStack, Kubernetes |
Databases | SQL & NoSQL |
Concepts | AI/ML deployment, CI/CD, API integrations |
Version Control | Git (GitLab, GitHub Actions, Jenkins) |
Other | Hybrid Cloud Operations, Streaming Data, Telemetry Systems |
Experience | 5+ years software development, 2+ years in ML/AI for cloud operations |
Preferred Skills
Cisco UCS / Nexus / ThousandEyes experience.
OS fundamentals and systems performance tuning.
Prior leadership in AI/ML initiatives.
What Kind of Profiles Cisco Can Accept for the Above AI/ML Engineer & AI SRE Roles
(Based on your shared JDs)
Criteria | What Client Will Likely Accept | What They Will Reject |
---|---|---|
Work Authorization | GC-EAD, s (as per your JD) | OPT, CPT, TN Visa (for these roles, per your note) |
Experience Level | 5-8+ years relevant hands-on experience in cloud, AI/ML, or SRE | Entry-level or purely academic AI/ML experience |
Domain Expertise | Hybrid Cloud (AWS/Google Cloud Platform/OpenStack/Kubernetes), AI Ops, HPC (DGX/UCS) | Only application development without infra/ops exposure |
Technical Breadth | Proven hands-on with Python, GoLang, Terraform, Ansible, CI/CD, Kubernetes, ML frameworks (PyTorch/TensorFlow) | Candidates who have just one cloud, no automation tools, or only data science notebooks without deployments |
Mandatory Exposure | For AI SRE: HPC/AI infra (NVIDIA DGX, Cisco UCS), Linux Sysadmin (5+ years) | No HPC exposure or generic DevOps without AI workload handling |
Soft Skills | Strong collaboration, Agile/DevOps culture, cross-functional team work | Weak communication or no experience in large team environments |
Preferred Add-ons | Certifications (Cloud, Linux, Kubernetes), Cisco product familiarity | No certifications + no enterprise-scale work history |