AI/LLM Consultant
Dallas, TX – Hybrid, 3 days per week
We need strong candidates who have developed LLMs, designed workflows, and have knowledge of deploying AI tools.
Key Responsibilities:
• Design, develop, and maintain LLM-powered multi-agent workflows for code analysis, remediation proposals, and safe patch generation.
• Implement agentic patterns including planning/execution loops, dynamic tool orchestration, sandboxing, guardrails, and failure recovery.
• Build scalable automation systems for technical debt remediation: language/runtime upgrades, vulnerability patching, dependency modernization, and config drift correction.
• Partner with Dev Experience and Platform teams to define engineering guidelines and reusable standards across the organization.
• Architect and optimize Retrieval-Augmented Generation (RAG) pipelines, managing chunking, embeddings, hybrid search, reranking, and retrieval policies.
• Develop robust evaluation frameworks for LLMs, RAG, and agent workflows, including offline datasets, validation metrics, statistical testing, and A/B tests.
• Contribute to backend systems using Python, distributed systems, microservices, PostgreSQL, dbt, vector databases, caching, streaming, and queueing.
• Build CI/CD pipelines and observability dashboards, and analyze performance across the model, retrieval, and network layers.
• Collaborate cross-functionally with product, platform, and security to move prototypes to production-grade services.
• Communicate clearly with stakeholders, write technical documentation, and mentor junior engineers.
The platform runs on AWS, so AWS knowledge is a must.
Required Skills:
• 5+ years’ experience building production-grade systems with end-to-end ownership.
• Expertise in Python programming, software engineering best practices, testing strategies, CI/CD, and system design.
• Hands-on experience shipping LLM-powered features such as autonomous workflows or function calling with measurable impact on reliability or latency.
• Deep knowledge of multi-agent architectures including planners, executors, and tool routing.
• Strong understanding of RAG systems: chunking, embeddings, vector/hybrid search, and retrieval policies.
• Experience evaluating LLMs and agent workflows using statistical reasoning and validation.
• Proficiency with AWS (Lambda, ECS/EKS, S3, API Gateway, EC2, IAM) and Infrastructure-as-Code for cloud resource automation and deployment.
• Experience with observability tooling such as Datadog, covering logging, tracing, and metrics.
• Familiarity with PostgreSQL, dbt, data modeling, schema evolution, and performance tuning.
• Knowledge of vector databases like Pinecone or pgvector.
• Experience building or optimizing CI/CD pipelines (GitHub Actions or similar).
• Proven track record in application modernization, dependency management, and technical debt reduction.
• Ability to rapidly prototype, validate, and transition solutions to production systems.
Preferred Skills:
• Experience designing agent infrastructure with sandboxing, tool isolation, and fail-safe execution.
• Background in large-scale platform engineering or developer experience tooling.
• Understanding of security, compliance, and privacy for enterprise AI systems.
• Strong architectural communication ability, including RFC writing and diagramming.
Attributes:
• Adaptable and proactive problem solver.
• Strong ownership mindset with excellent collaboration and communication skills.
• Comfortable in ambiguous, fast-paced R&D environments.
• Passionate about building high-leverage platform capabilities impacting hundreds of engineering teams.
Thanks