Job Details
Location: Remote - USA
Experience Level: 12+ years of industry experience
The most crucial qualifications for the right candidate are very strong MLOps, chatbot building, AWS DevOps and Terraform skills.
Role Overview
We are seeking a senior-level technical architect who not only designs solutions but also rolls up their sleeves and implements them. This role demands a hands-on leader with deep expertise in cloud infrastructure, MLOps, AI/ML (or GenAI/Agentic AI) and infrastructure automation. You will own the architecture, development, deployment and tuning of advanced AI-driven chatbot systems, leverage AWS and Terraform for infrastructure automation, and serve as both designer and builder for mission-critical initiatives at Donyati.
Key Responsibilities
- Architect, design and implement cloud infrastructure on AWS that supports AI/ML/agentic-AI workloads.
- Lead infrastructure-as-code efforts using Terraform: create modules, manage environments, ensure reproducible deployments.
- Own the end-to-end MLOps lifecycle: model training, fine-tuning, deployment, monitoring, versioning, governance.
- Design, build and fine-tune AI chatbot solutions, from data ingestion and model selection through interface integration to live production operations.
- Provide hands-on development of AI/ML components or agentic AI agents, collaborating cross-functionally with data scientists, engineers and product owners.
- Establish best practices for observability, performance tuning, cost optimisation and security for AI/ML infrastructure.
- Mentor and guide engineering teams on cloud-native architecture, IaC, MLOps, and AI solution delivery.
- Drive technical decision-making, produce architecture diagrams, create reference implementations, and enforce design standards.
- Work remotely, delivering reliably across time zones and providing frequent status updates to stakeholders.
Must-Have Requirements
- Minimum 12 years of industry experience in software engineering, cloud architecture, or systems engineering.
- Demonstrated expertise in MLOps: building pipelines, deploying models, monitoring/operationalising ML/AI solutions.
- Strong hands-on experience with AWS, including services relevant to AI/ML, compute, storage, networking, security.
- Deep experience in infrastructure automation with Terraform: writing modules, managing state, promoting reuse, controlling drift.
- Proven experience designing, building and fine-tuning chatbot systems (or comparable conversational/agentic AI solutions).
- Experience in AI/ML or GenAI or agentic AI (at least one of these domains) in production or near-production environments.
- Strong coding capability in relevant languages (Python, Java, or similar), and comfort working in a combined architect-developer role.
- Excellent communication and collaboration skills for remote delivery across global teams.
- Comfortable working on a contract basis in a remote setup and delivering high-quality work within a compressed timeframe.
Preferred Qualifications
- Certifications such as AWS Certified Solutions Architect, AWS Certified Machine Learning - Specialty, or HashiCorp Certified: Terraform Associate.
- Experience with vector databases, embeddings, retrieval-augmented generation (RAG) systems, agent frameworks.
- Experience with CI/CD for ML/AI (GitHub Actions, Jenkins, pipelines) and infrastructure monitoring/observability (CloudWatch, Prometheus, Grafana).
- Experience with secure production deployment of AI systems (governance, data quality, bias mitigation, ethics).
- Past contract experience and comfort working in high-velocity environments with ambiguous requirements.
Regards,
Vinay Ram
Suwanee, GA 30024
An MBE & E-Verify Company