Title: AI Solution Architect
Location: CA (Remote)
Type: Long-term contract (C2C/W2)
Role summary
The AI Solution Architect owns the technical design of AgreeYa's AI engagements from discovery through production. This person translates business problems into defensible architectures, makes the build-vs-buy and model-selection calls, and sets the governance and interface standards the delivery team executes against. They are client-facing and credible in a room with enterprise architects, security reviewers, and infrastructure partners. Critically, they must reason across cloud and on-prem deployment targets and recommend the right one for each use case, rather than fitting every problem to a single stack.
Key responsibilities
Lead AI discovery and use case prioritization, scoring opportunities on data sensitivity, cost at scale, latency, feasibility, and governance exposure.
Design end-to-end architectures spanning RAG, agentic workflows, data pipelines, model serving, and guardrails.
Make model-selection recommendations across closed (GPT, Claude, Gemini) and open-weight (Llama, Mistral, Qwen) options, applying a structured hard-attribute / soft-attribute framework.
Choose the deployment target deliberately: Azure, AWS, on-prem NVIDIA AI factory, or hybrid, and document the rationale.
Define interface specifications between AgreeYa's application layer and partner-owned infrastructure, such as model-serving endpoint contracts and performance baselines, to protect against handoff and dependency risk.
Own the application-level governance design: NIST AI RMF alignment, risk tiering, human-in-the-loop placement, audit and explainability requirements.
Set delivery standards and review the work of AI Engineers and MLOps Engineers for architectural soundness.
Required qualifications
Demonstrated experience architecting AI/ML solutions that run in production, not only pilots and proofs of concept.
Deep RAG fluency: chunking strategy, embedding model selection, vector search, retrieval evaluation.
Working knowledge of agentic patterns, orchestration (LangChain / LangGraph), and tool integration (including MCP).
Strong grasp of model selection, fine-tuning vs. RAG trade-offs, and inference cost and latency economics.
Able to lead technical client conversations and defend design decisions to a skeptical technical audience.
Must be able to architect and reason fluently across all three of the following, and recommend between them:
Azure AI: Azure AI Foundry, Azure OpenAI Service, Azure AI Search, Azure ML.
AWS AI: Amazon Bedrock, SageMaker, OpenSearch, Lambda-based serving.
On-prem NVIDIA AI factory: NVIDIA AI Enterprise (NVAIE), NIM microservices, Triton Inference Server, NeMo and NeMo Guardrails, Run:ai, TensorRT-LLM, and quantized/air-gapped deployment (GGUF, vLLM).