Title: AI Voice Backend Engineer
Location: 3 days a week on site in the Redwood City, CA area (Tuesday-Thursday)
Contract Length: 2 Years
Mode of interview: 2 virtual interviews
Minimum years of experience: 7 years
Interview process: 2 virtual rounds
About the role
We’re building a production-grade AI voice agent platform that supports real-time, low-latency conversations at scale. The backend powers session orchestration, streaming pipelines, tool execution, integrations, and the data layer that makes voice agents reliable, debuggable, and continuously improving.
This role is for someone who can ship quickly while maintaining a high bar for correctness, observability, and uptime in real-time systems.
What you’ll do
- Own voice agent backend services: session lifecycle, call routing, participant management, conversation state, and real-time orchestration.
- Build streaming pipelines: integrate and operate STT/TTS and real-time model endpoints; manage latency budgets, backpressure, and jitter.
- Agent runtime + tools: implement tool calling, sandboxed execution, retries/idempotency, rate limits, and per-tenant budgets/quotas.
- Integrations: telephony/RTC providers, calendars/CRMs/support tools, webhooks, and event streams; build durable sync adapters.
- Data platform: model and store transcripts, audio metadata, agent actions, outcomes, evaluations, and audit trails; optimize for query + replay.
- Reliability engineering: SLIs/SLOs, incident response, end-to-end tracing across streaming + tool calls, and automated degradation strategies.
- Security and privacy: secrets management, least privilege, encryption, data retention controls, PII redaction, and tenant isolation.
What we’re looking for
- 6+ years of backend engineering experience, with ownership of production systems.
- Strong distributed systems fundamentals, including queues/workers, caching, and operational best practices.
- Excellent data modeling and API design skills (versioning, contracts, idempotency).
- Comfortable in a high-velocity environment; pragmatic, curious, and outcome-oriented.
- Experience building or operating low-latency systems (real-time streaming, WebSockets, RTC, event-driven architectures).
Nice to have
- Voice/real-time experience: WebRTC/RTC providers, telephony, streaming audio, SIP concepts, jitter/packet loss mitigation.
- LLM and agent architecture experience: tool calling, planning/execution loops, memory/state, eval harnesses.
- Familiarity with SDKs/frameworks: LiveKit, Twilio/telephony stacks, Vercel AI SDK, LangChain/LangGraph, Anthropic Agents SDK.
- Experience with observability for real-time systems (OpenTelemetry traces spanning streaming + async workflows).