About the role
Own the real-time backend of an AI sales coaching platform that's live, growing 2× week-over-week, and one month into go-to-market. You'll be the second backend engineer in a team of five, working directly with the head of engineering and CEO — building the systems that decide whether the product works: live audio pipelines, LLM orchestration under millisecond budgets, and the engine that turns real sales calls into personalized AI training.
We just found go-to-market fit. Now the backend has to keep up with it. That's where you come in.
A word of honesty
This is an early-stage role at a small startup in its hardest, most exciting phase, and we'd rather you know exactly what that means before you apply.
You will own large parts of the backend, and ownership here is literal: you'll design it, build it, ship it, and get woken up by it. When a customer's live call drops audio, you'll trace it through the WebSocket, the event loop, and the LLM pipeline yourself. Some of our hardest problems aren't exotic — they're the unglamorous ones every fast-moving team accumulates: a background job that fails without telling anyone, a coaching signal that lands a beat too late, a regression that a replay test should have caught. We know exactly where it hurts, and we're hiring the person who ends it — not the person who files a ticket about it.
If you spent the last few years at a company like Wix or Monday, you already know what excellent engineering looks like at scale — strong typing, clean async code, real observability. What you may not have had recently is the feeling that your work visibly moves the company every single week. Here it does. The trade is real: less process and certainty, far more impact and speed. If that trade excites you rather than worries you, keep reading.
What you'll own
- The real-time pipeline, end to end. Live speech-to-text streams over WebSockets, reconnection and backpressure, and the latency budget from audio chunk to coaching signal on screen. A signal that arrives four seconds late is a signal that didn't happen. Today nobody owns latency as a single number. You will.
- The async evaluation engine. The pipeline that processes both real calls and simulated roleplays: long-form audio at scale, LLMs extracting structured rubric data ("did they ask for budget?"), feeding the analytics layer. You'll give long-running jobs what they're missing today — status, retries, cancellation, idempotency — so a failed run never silently corrupts a rep's performance metrics.
- The simulation engine. The stateful real-time voice backend (Gemini native audio, Pipecat) that simulates buyers for rep training — handling interruptions, context switching, and dynamic feedback. It needs an owner who treats "the avatar didn't respond" as a class of bug to eliminate, not a ticket to close.
- The gap-to-game orchestrator. The heart of the product logic: ingest performance data from real calls, identify specific skill gaps, assign the right AI roleplay scenario, update the manager dashboard. This is the loop that turns "what happened on the call" into "what to practice next."
- Observability and resilience. SLOs, alerting, and distributed tracing across the full lifecycle of an AI conversation — packet arrival to LLM inference to audio generation — so race conditions and latency spikes get found before customers find them.
- Engineering discipline as a side effect. Transcript-replay regression tests in CI, sane releases and changelogs, schema contracts that survive the next feature. The minimum that lets five people ship fast without breaking customers.
Technical challenges you'll solve
- Keeping live AI sessions intact when services scale up and down or connections drop.
- LLM orchestration that branches dynamically on user input without adding perceptible delay.
- Tuning the Python asyncio event loop so audio chunks and vendor API calls are handled within strict millisecond windows.
- Running hundreds of concurrent audio streams and LLM contexts without blocking the event loop.
- Integrations that hold: call-recording platforms (Gong, Fireflies), calendar and OAuth flows, and extension clients in the field — built to degrade gracefully instead of breaking loudly.
- Making a "happens once a week, never on a dev machine" bug reproducible — and then impossible.
Tech stack
- Core: Python 3.12+, FastAPI, SQLAlchemy 2.0 (async), PostgreSQL.
- Async & messaging: RabbitMQ + FastStream, WebSockets, Redis.
- AI & data: Gemini, OpenAI, Anthropic APIs, Deepgram (live STT), vector stores, real-time audio (WebRTC/RTMP, Pipecat).
- Infra & tooling: AWS, Kubernetes + KEDA, Grafana/Prometheus, uv, ruff, basedpyright (strict), OpenTelemetry.
What success looks like — first weeks
- Ship to production on real customer calls — there's no three-month onboarding here.
- Know the full path of a coaching signal from microphone to screen, and put a number on its latency that the whole team can see.
What success looks like — first 90 days
- Take full ownership of the evaluation pipeline: real calls in, structured skill data out, reliably, at growing volume.
- No silent failures: every async job reports status, failed runs alert us, nothing corrupts user metrics.
- Transcript-replay tests run in CI and block regressions before they reach a live call.
What success looks like — first 6–12 months
- Architect and ship the core simulation engine — the stateful real-time layer the next phase of the product runs on.
- Build the gap-to-game orchestrator into the closed loop that defines the category.
- The real-time pipeline holds under hundreds of concurrent calls — and you can prove it with dashboards, not anecdotes.
- Set the engineering bar for the backend as the team grows around you.
Who you are
- 5+ years of backend engineering with deep Python expertise — and you can point to real-time systems in production that prove it.
- Mastery of asyncio: you understand the event loop inside out, you know what blocks it and what starves it, and you've kept hundreds of concurrent WebSocket or streaming connections honest, including reconnection, backpressure, and the failure modes that only show up under load.
- You've built distributed async pipelines — queues, workers, long-running jobs — and you design them with status, retries, and idempotency from day one, because you've seen what a silently failed job does to user-facing data.
- You treat observability as part of the build, not an afterthought: tracing, SLOs, and alerting are how you ship, and "we found it before the customer did" is your default bar.
- You write code that survives contact with the outside world: third-party APIs that change under you, OAuth flows that expire at the worst moment, and client versions in the wild you can't force-update. Backward compatibility is a habit, not a chore.
- Sane data modeling in PostgreSQL and strict contracts (Pydantic, typing) — because in this product, a wrong number shown to a paying customer is a critical bug, not a display issue.
- You finish things. Not "moved to next sprint" — closed, tested, monitored, documented. When a ticket is vague, you sharpen it yourself instead of waiting for a spec.
- You're hungry. You bring energy, not just experience — you ship without being pushed, follow a bug across service boundaries without being asked, and treat the product like it's yours. Because it will be.
Nice to have
- Real-time voice AI — live STT (Deepgram or similar), conversational audio pipelines (Pipecat, WebRTC), or LLM realtime APIs.
- RabbitMQ/FastStream experience.
- WebSocket scaling patterns (Redis Pub/Sub).
- Docker for production builds (DevOps handles the cluster).
- Building regression-test harnesses or replay-based testing.
- Early-stage startup experience.
Why this role matters
In a real-time product, the backend is the product. Whether a coaching signal arrives in 300 milliseconds or 3 seconds is the difference between a rep winning the deal and ignoring the tool. You'll own that line — at the exact moment the company is converting early traction into a category. The engineers who join now will have built the foundation everything after runs on.
What we offer
A founding-team seat working directly with the CEO, with real authority over the systems that matter most. Competitive salary plus meaningful equity. Performance-based comp and fast advancement — we're small, so growth isn't waiting for a promotion cycle. Bi-annual team retreats abroad. A product that's live, customers who are vocal, and a roadmap your work will visibly shape.
Who this role isn't for
- Engineers who want to manage people and avoid hands-on building.
- Architects who expect massive infrastructure and rigid processes before they can start.
- Anyone who needs big-company order and stability — priorities here shift fast, on purpose.
About Yolk
Yolk.coach is an AI sales coaching platform that guides sales teams in real time, during live calls. The platform is built and largely working, with a capable team and senior engineering leadership setting the bar. We handle sensitive sales conversation data and surface guidance in real time, so reliability and precision are what earn the trust the entire category depends on.
