Job Details
Role Description:
Establish cloud-native design principles, reference architectures, and best practices for AI workloads.
Design secure, cost-optimized, highly available AWS architectures for AI applications.
Architect, design, and oversee the implementation of machine learning pipelines, from data ingestion to deployment.
Lead proofs of concept (PoCs) for emerging AI capabilities such as generative AI and agentic AI.
Apply model tuning and evaluation techniques (fine-tuning, RLHF) and deliver RAG implementations.
Establish monitoring and observability frameworks using CloudWatch, Prometheus, Grafana, and OpenTelemetry.
Essential Skills:
10+ years of experience in software/solution architecture, including 3+ years in AI/ML or data-driven applications.
Hands-on experience with AWS services, especially for AI/ML workloads (SageMaker, Bedrock, Lambda, EKS, Glue).
Strong programming skills in Python/Java, with exposure to ML frameworks (TensorFlow, PyTorch, Hugging Face).
Familiarity with generative AI, LLMs, and vector databases (e.g., Pinecone, Cosmos DB, OpenSearch).
Experience with AI/MLOps, CI/CD pipelines, and containerization.
Desirable Skills:
Experience driving continuous improvement, including the design and standardization of processes and methodologies.
Experience assessing the feasibility, complexity, and scope of new capabilities and solutions.
Experience collaborating with multiple stakeholders/teams (architecture and infrastructure delivery teams) to ensure consistent solution design and integration.