Job Details
We are seeking an accomplished and visionary AI Architect to spearhead the design and deployment of transformative AI solutions centered around Large Language Models (LLMs) and Agentic AI systems. As a key technical leader, you will shape the next generation of intelligent platforms that blend powerful LLM capabilities with robust, scalable cloud and data architecture.
Your expertise will drive innovation across our AI initiatives, overseeing everything from architecting advanced multi-agent frameworks to building production-grade pipelines that take projects from initial experimentation all the way to deployment in secure, compliant environments.
Key Responsibilities
- End-to-End AI System Design: Architect and design comprehensive AI systems leveraging state-of-the-art LLMs, vector databases, retrieval-augmented generation (RAG), and agentic frameworks to deliver scalable, business-aligned solutions.
- ML/LLM Pipeline Development: Build and optimize scalable machine learning and LLM-centered pipelines utilizing Databricks, MLflow, Delta Lake, and other modern frameworks for robust data engineering and experimentation workflows.
- Cloud-Native AI Solutions: Lead the deployment of LLM-driven applications on the AWS cloud platform, leveraging services such as SageMaker, Bedrock, Lambda, and ECS to ensure seamless scaling, high availability, and cost efficiency.
- Agentic AI Framework Integration: Evaluate, prototype, and integrate emerging agentic AI frameworks (e.g., AutoGen, CrewAI, BabyAGI, LangGraph) to advance the organization's autonomous system capabilities.
- Cross-Functional Collaboration: Partner closely with data scientists, product teams, and ML engineers to ensure solutions are fit-for-purpose, driving alignment with overarching business objectives and delivering measurable value.
- LLMOps & Observability: Define and implement architectural patterns for LLMOps, covering observability, performance optimization, and governance, to ensure robust and trustworthy production systems.
- Responsible AI & Compliance: Champion Responsible AI principles, embedding data privacy, ethical considerations, and regulatory compliance within all aspects of design, deployment, and operation.
- Mentorship & Code Reviews: Lead code and architecture reviews, providing technical mentorship and guidance to junior engineers and fostering a culture of excellence and continuous learning.
Must-Have Skills & Qualifications
- Extensive Experience: At least 8 years in AI/ML, data engineering, or cloud solution architecture in enterprise environments, with demonstrable impact on large-scale deployments.
- LLM Application Design: Proven hands-on experience designing and deploying LLM-powered applications in production, with in-depth understanding of their lifecycle from ideation to release.
- Agentic AI Expertise: Strong working knowledge of agentic AI concepts and frameworks, including autonomous agents, multi-agent orchestration, planning, and tool use.
- Databricks Proficiency: Advanced skills with the Databricks ecosystem (Spark, MLflow, Delta Lake) for scalable data engineering and experiment tracking at enterprise scale.
- AWS Cloud Mastery: Demonstrable experience architecting and deploying solutions on AWS, with proficiency in SageMaker, Lambda, S3, ECS, and Bedrock.
- LLM Tooling & Vector Databases: Hands-on experience with LLM tooling such as LangChain and LlamaIndex, and with leading vector databases (e.g., FAISS, Pinecone, Chroma, Weaviate).
- Software Engineering Best Practices: Deep understanding of software engineering fundamentals, including version control (e.g., Git), CI/CD pipelines, and containerization (Docker/Kubernetes).
- Programming & API Skills: Strong proficiency in Python and experience building and consuming REST APIs (e.g., with FastAPI), with an ability to design and implement cloud-native architectures.
- Leadership & Collaboration: Demonstrated ability to lead architectural conversations, mentor engineers, and work effectively across multidisciplinary teams.
Good-To-Have Skills & Preferred Qualifications
- Multi-Agent System Design: Experience architecting and deploying multi-agent intelligent systems for complex, real-world use cases.
- LLM Customization: Knowledge of LLM fine-tuning, prompt engineering, and advanced techniques for improving LLM performance and alignment with specific business needs.
- Certifications: Relevant certifications in AWS Architecture and/or Databricks demonstrating formal recognition of technical expertise.
- Regulated Industry Experience: Prior experience building secure and scalable AI systems for highly regulated industries such as finance, healthcare, or insurance, with a strong grasp of compliance requirements.
- Open-Source LLM Familiarity: Familiarity with open-source LLMs (e.g., LLaMA, Mistral, Mixtral, Phi-3) and with platforms that host open-weight models, such as Hugging Face.
- Enterprise AI Applications: Previous involvement in developing chatbots, copilots, or other autonomous business agents that demonstrate the practical application of LLMs and agentic AI systems.