Relay: Giving AI Agents a Browser Instead of a Terminal Wall

The interface between an AI coding agent and the human supervising it is, almost always, a terminal. The agent asks “Which database should I use — A, B, or C?” and you type a letter. It wants to show you an architecture before committing to it, so it draws boxes out of dashes and pipes. It finishes a migration plan and asks you to “type done when you’ve reviewed it.” Every decision, every review, every prototype is squeezed through a monospace pipe that was designed for log output, not collaboration.

I spent enough months pair-working with agents to find this genuinely limiting. So I built relay — a small CLI that gives the agent a browser board instead of a terminal wall. This is the story of why, and the decisions that shaped it.

The problem with the terminal as a decision surface

The terminal is a fantastic place to read a stack trace. It is a terrible place to make a decision. Three failures show up over and over when an agent tries to involve a human:

Questions arrive one at a time. The agent needs six answers before it can proceed, but the conversational format forces them into a serial back-and-forth. You answer one, wait, answer the next. There’s no way to see the whole decision at once and answer it as a set.
Options have no shape. “Option B is the one with the caching layer, see my earlier message.” When the choices are visual — three layout variants, two color schemes, competing architectures — describing them in prose and asking you to hold them in your head is exactly backwards.
Review is a wall of text. The agent dumps a plan, a diff summary, a table of numbers rendered as pipe-delimited text, and then asks you to confirm. Feedback means writing another paragraph describing which part you meant.

None of this is the agent’s fault. It’s working with the only surface it has.

The idea: a board, not a chat

Relay’s premise is simple. When an agent needs a decision or wants to show you something, it shouldn’t narrate — it should open a board in your browser and wait. You see real form controls, real charts, real diagrams. You click, you comment on anything, you hit Submit. The agent reads your answers back as JSON and keeps working.

The whole loop is one command:

rly ask --file board.json --timeout 1800

That blocks until you submit, then prints the result to stdout as JSON. A board is just a spec — a title, some blocks, and some questions:

{
  "title": "Pick the caching strategy",
  "blocks": [
    { "type": "mermaid", "code": "graph LR; Client-->Edge; Edge-->Origin; Origin-->DB" },
    { "type": "table",
      "columns": ["Strategy", "p95 latency", "Cost/mo"],
      "rows": [["Edge KV", "12ms", "$40"], ["Redis", "8ms", "$120"]],
      "sortable": true }
  ],
  "questions": [
    { "id": "strategy", "type": "single", "label": "Which one ships first?",
      "options": [{ "value": "kv", "label": "Edge KV" }, { "value": "redis", "label": "Redis" }] }
  ]
}

Six questions become one board with six controls. The chart is a chart. The diagram zooms. And when you want to flag a specific data point, you click it.

Design decisions that mattered

A tool that agents reach for has different constraints than a normal app. A few choices defined the project.

Zero runtime dependencies

Relay has no runtime dependencies. Plain Node ≥ 18, nothing in node_modules at all. This wasn’t minimalism for its own sake — it’s a reliability requirement. An agent installs and invokes this tool unattended, often in a fresh environment. Every transitive dependency is a thing that can fail to install, ship a breaking change, or pull in a vulnerability the agent then has to reason about. Zero dependencies means npm i -g @khanglvm/relay either works or fails for one obvious reason.

The rich blocks — Chart.js, Mermaid, a Graphviz WASM build — are vendored and lazy-loaded, served from the local board, and work fully offline. No CDN call at render time, no network dependency for a diagram to draw. The board opens whether or not you have connectivity.

Detached mode, because agents have time limits

Many agent runtimes cap how long a single tool call can run. A blocking rly ask that waits 30 minutes for a human doesn’t fit that. So relay separates asking from collecting:

rly ask --file board.json --detach   # returns immediately: {"status":"open","url":"…"}
rly wait b-xxxxx --timeout 3500       # blocks elsewhere until you submit
rly result b-xxxxx                    # non-blocking peek — includes the live draft

The agent fires the board, does other work or yields, and collects the answer later. Critically, answers autosave in real time. A page reload restores your input; a draft survives a timeout or cancellation and comes back in the result. Partial human input is never lost just because a clock ran out.

Pick by looking — option-level visuals

The fix for “Option B is the one with caching” is to attach a visual to each option. In relay, any block can live inside an option, so a “which layout?” question renders each layout as a thumbnail right under its label:

{ "id": "layout", "type": "single", "label": "Hero layout?",
  "options": [
    { "value": "split", "label": "Split", "blocks": [{ "type": "image", "src": "split.png", "height": 160 }] },
    { "value": "stack", "label": "Stacked", "blocks": [{ "type": "image", "src": "stack.png", "height": 160 }] }
  ] }

You choose by looking, not by decoding a description. Interacting with the visual — zooming, commenting — never accidentally toggles the choice; only the label row selects.

Comments on anything, as first-class feedback

This is the part I use most. Every element on a board is annotatable. Click a point on a chart, a node in a diagram, a cell in a table, or select a sentence in a markdown block, and leave a comment anchored right there. The comments come back in result.annotations with enough targeting metadata that the agent knows exactly what you meant:

{
  "blockId": "b2",
  "target": { "kind": "chart-element", "label": "Feb", "value": 19 },
  "text": "That Feb spike was a one-off onboarding push — don't extrapolate it."
}

The agent can reply, and the thread grows on the board. Feedback stops being “write another paragraph describing the thing” and becomes “point at the thing.”

A clean machine contract

Everything an agent needs is JSON on stdout; human-facing logs go to stderr. Exit codes are unambiguous — 0 submitted, 2 timeout, 3 cancelled, 4 usage error, 5 not found. Skipped questions are absent from answers and listed in skipped. The agent never has to parse prose to know what happened.

The shift in practice

Without relay	With relay
Six questions, one at a time in the terminal	One board, all of them, real controls
”Option B is the one with caching”	Each option carries its own image or chart — pick by looking
ASCII architecture art	Mermaid / Graphviz / PlantUML — zoomable, full-screen, even editable
Numbers buried in prose	Charts and sortable tables
”Type done when you’re finished”	A Submit button; answers and comments returned as JSON
Feedback = another wall of text	Click any point, node, cell, or sentence to comment

The underlying idea isn’t really “make the terminal prettier.” It’s that human-in-the-loop is a first-class part of how agents work now, and it deserves a real surface. The relay is the handoff: the agent reaches a point where a human should decide or look, raises a board, and the human’s structured response flows straight back into the run.

Try it

Relay is open source under MIT. It started life as quest-board and is now @khanglvm/relay (CLI: rly).

npm i -g @khanglvm/relay                  # the rly CLI
npx -y skills add khanglvm/relay -g -y    # the agent skill (Claude Code, Codex, Cursor, …)

The skill teaches your agent when to open a board — and the next time it needs a decision or wants to show you a plan, it will. The code, the full spec format, and the demo live on GitHub. If you build agent workflows, I’d genuinely like to hear where the terminal still gets in your way.