Position: Generative AI Engineer
Overall Experience: 8+ Years (with strong Python expertise)
GenAI Experience: 2+ Years (in Production Environments, not just POCs)
Locations: Charlotte, NC | New Jersey, NJ
Work Mode: Hybrid (3 days onsite per week)
Employment Type: Full-Time / Contract-to-Hire
Interview Process: Includes a mandatory, live hands-on coding round
Key Responsibilities
GenAI Solution Engineering & Advanced RAG
· Orchestration: Design and build production GenAI applications using LangChain and LangGraph for multi-agent, stateful, and graph-based workflows.
· RAG Optimization: Develop and optimize retrieval-augmented generation (RAG) pipelines, including advanced patterns such as HyDE, re-ranking, hybrid search, multi-hop retrieval, and RAPTOR hierarchical summarization.
· API Development: Build and expose GenAI capabilities as RESTful and streaming APIs using FastAPI (with async support, dependency injection, and OpenAPI documentation).
MCP Server Development & LLMOps
· Context Architecture: Architect and maintain Model Context Protocol (MCP) servers to securely connect LLMs to heterogeneous enterprise data sources (SQL, NoSQL, APIs).
· Observability: Instrument systems with tools such as LangSmith, Helicone, Arize, or OpenTelemetry for tracing, latency profiling, and prompt lineage.
· Guardrails & Monitoring: Own prompt versioning and model evaluation (RAGAS, ROUGE, BERTScore), and implement guardrails (Guardrails AI, NeMo Guardrails) for PII redaction and toxicity filtering.
Full-Stack Integration & Governance
· Angular Frontend: Develop Angular-based user interfaces (chat UIs, agent monitors, dashboards) and consume FastAPI streaming endpoints (SSE / WebSockets) for real-time token streaming.
· Platform Governance: Contribute to architectural decisions around model routing, semantic caching (Redis), and multi-tenant isolation while ensuring compliance with enterprise data governance.
Required Qualifications
· Experience: 8+ years of total software engineering experience; 2+ years of hands-on, production-level Generative AI experience.
· Core GenAI Stack: LangChain, LangGraph, LLM APIs (OpenAI, Anthropic, Azure, Bedrock), and Vector Stores (Chroma, Pinecone, Weaviate, pgvector).
· Backend & Protocol: Python 3.10+ (async/await, Pydantic v2), FastAPI, and hands-on experience building/deploying MCP servers or equivalent context-injection frameworks.
· Frontend: Angular 15+ (components, services, RxJS, and signals) for connecting backend AI services to the user interface.
· Data & Infrastructure: SQL + NoSQL (PostgreSQL, MongoDB, Redis), Docker, Kubernetes, and GitHub Actions CI/CD.
Preferred (Plus) Skills
· Experience with graph-based architectures for complex, cyclic reasoning tasks.
· Familiarity with fine-tuning workflows (LoRA/QLoRA, PEFT, DPO/RLHF) and distributed inference (vLLM, TGI, Triton).
· Prior experience in regulated industries (Banking, FinTech, Healthcare) with awareness of model risk management frameworks (e.g., SR 11-7).