NOTE:
Onsite in Santa Clara, CA
Open for Full-time/Contract/C2H
Title: Technical AI Architect
Location: Onsite (Santa Clara, CA)
Duration: Full-time Position/Contract/C2H
Key technical skills:
As a Technical Architect specializing in LLMs and Agentic AI, you will own the architecture, strategy, and delivery of enterprise-grade AI solutions. You will work with cross-functional teams and customers to define the AI roadmap, design scalable solutions, and ensure responsible deployment of Generative AI across the organization.
Primary Responsibilities:
- Architect scalable and secure AI/ML/LLM platform solutions including data, model, and inference pipelines.
- Establish enterprise reference architectures, reusable components, best practices, and governance standards for AI adoption.
- Integrate cloud-native, open-source, and enterprise tools such as vector databases, feature stores, registries, and orchestration frameworks.
- Implement automated MLOps/LLMOps workflows covering deployment, monitoring, observability, compliance, and performance optimization.
- Collaborate with cross-functional teams (engineering, data science, security, and product) to align platform capabilities with business goals and drive adoption.
Secondary Responsibilities:
- Support GenAI and AI application teams by providing platform enablement, solution advisory, and architecture reviews.
- Conduct technology research, PoCs, and benchmarking, and evaluate emerging AI tools, frameworks, and deployment patterns.
- Drive knowledge sharing through documentation, workshops, training sessions, and internal community-building initiatives.
- Provide guidance on cost estimation, usage monitoring, FinOps optimization, and capacity planning.
- Partner with security, compliance, and cloud teams to ensure alignment with regulatory, data privacy, and policy frameworks.
Primary Skills:
- 6-10 years of experience in designing and implementing large-scale distributed systems, microservices, serverless, and event-driven architectures.
- 5-8 years of cloud-native architecture experience in Azure / AWS / Google Cloud Platform, including networking, storage, compute scaling, GPU workloads, and managed AI services.
- 5-8 years of experience with platform components, API design, integration patterns, and high-performance compute architecture.
- 4-7 years of experience building or integrating AI/ML platforms, pipelines, model lifecycle components, inference gateways, and/or enterprise GenAI frameworks.
- 3-6 years of experience using AI platform tools such as Databricks, Vertex AI, Azure AI Studio, AWS Bedrock, LangChain, PromptFlow, Ray, Kubeflow, MLflow, Airflow, Kafka, etc.
- 2-5 years of experience in designing and integrating vector database solutions such as Pinecone, Weaviate, FAISS, Milvus, Qdrant, Elastic, OpenSearch, or CosmosDB Vector.
- 2-3 years of experience in LLM architectures, embeddings, tokenization, prompt engineering, evaluation strategies, hallucination reduction, and RAG patterns.
- 2-3 years of experience building GenAI applications, agent workflows, or knowledge retrieval systems using frameworks like LangChain, LlamaIndex, GraphRAG, or custom implementations.