Overview
Accepts corp-to-corp applications
Contract - W2
Contract - Independent
Contract - 12
Skills
RAG
GPT
Expertise in GenAI models (e.g., Claude, PaLM2, Llama2) and frameworks (e.g., Agents, COT)
Cloud Platform
Database knowledge
Scripting and Automation
Monitoring tools
Kubernetes and Containerization
Job Details
- Role: GenAI SRE DevOps (L3) support
- Position: Onsite
- Location: New York
Key responsibilities
- Provide Technical Support: Offer expert-level support for GenAI products and services, including troubleshooting, debugging, and resolving technical issues.
- Troubleshooting and Problem Resolution: Identify, diagnose, and resolve complex technical problems involving AI models, algorithms, prompts, and software.
- End User Interaction: Work directly with end users to understand their needs, gather requirements, and provide solutions for GenAI-related challenges.
- Platform Enhancement and Optimization: Support and maintain a scalable, fault-tolerant AI/ML platform, monitor and troubleshoot data pipeline and model inference performance, and optimize AI models for real-time applications.
- Collaboration: Collaborate closely with engineering, DevOps, and product teams to improve product features, functionality, and reliability based on customer feedback and ensure seamless deployment improvements.
- Documentation and Training: Create and maintain detailed documentation of support processes, issues, and resolutions for internal and external stakeholders. Assist in training customers and internal teams on best practices for using and maintaining GenAI products.
- Monitoring and Reporting: Monitor system performance, perform regular health checks, and generate reports on system status, incidents, and resolutions.
- Continuous Improvement: Stay up-to-date with the latest advancements in AI/ML/GenAI technologies and contribute to the continuous improvement of support processes and tools.
Required skills and qualifications
- Experience: Proven experience in technical support or engineering, preferably in AI/ML/GenAI environments.
- Technical Proficiency: Expertise in GenAI models (e.g., GPT, Claude, PaLM2, Llama2) and frameworks (e.g., RAG, Agents, COT).
- Cloud Platform and DevOps: Hands-on experience with cloud platforms (Azure, AWS, Google Cloud Platform) and DevOps tools.
- Database Knowledge: Experience with SQL, Sybase, and MongoDB, plus a data warehouse such as Snowflake or Databricks.
- Scripting and Automation: Strong proficiency in Python and shell scripting, plus working knowledge of related programming and UI technologies such as Java, Angular, and HTTP protocols.
- Monitoring Tools: Knowledge of tools such as Splunk, AppDynamics, Autosys, and Grafana/Loki/Prometheus.
- ITIL Application Support Processes: Incident, Problem, Service, and Jira management.
- Kubernetes and Containerization: Familiarity with containerization technologies like Docker and orchestration tools like Kubernetes (preferably EKS or OpenShift).
- Problem-solving and Analytical Skills: Excellent problem-solving, analytical, and troubleshooting skills with strong attention to detail.
- Communication and Collaboration: Strong communication and interpersonal skills to collaborate effectively with diverse, globally distributed teams and stakeholders.
- Educational Background: Bachelor's degree in Computer Science, Engineering, or a related field.
Preferred qualifications
- Experience with natural language processing (NLP) and machine learning (ML) models.
- Familiarity with large language models (LLMs) such as GPT-3.5 Turbo, GPT-4, and GPT-4o.
- Experience with OpenAI technologies and managing GenAI services in cloud environments.
- Understanding of MLOps practices and model lifecycle management.
- Familiarity with application monitoring solutions like Dynatrace and Splunk.