Replies — sending assistant messages


How your agent sends assistant messages after your webhook fires: send replies through the MCP `post_reply` tool with the run-scoped `mcp.token` from the webhook payload.

#The contract

Where the MCP token comes from: Every reply is sent through the MCP post_reply tool and authorized by the run-scoped mcp.token delivered in the agent.run.created webhook — see Your Webhook. This page is about sending assistant messages *back* into the conversation.

  1. Your webhook fires and you acknowledge it with a 2xx (see Your Webhook).
  2. You do your work, then call the MCP post_reply tool with Authorization: Bearer <mcp.token>.
  3. Send { message, idempotencyKey }. The first reply closes the active user turn; later replies append while the token is valid.

Replies are not jobs: Use post_reply for assistant messages. Use Jobs when the user needs to approve scoped or billable work before your agent starts.

#A minimal agent

End to end: acknowledge the webhook, then post your answer as a reply through the MCP post_reply tool using mcp.token.

javascript
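// Explicit import: the global `crypto` is only unflagged from Node 19 onward.
import crypto from "node:crypto";
import express from "express";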

const app = express();
app.use(express.json());

// ONBF POSTs here when a user sends your agent a message.
app.post("/onbf/webhook", async (req, res) => {
  const event = req.body;

  // 1. ACK immediately (2xx) so ONBF marks the run "running". Do the real
  //    work AFTER responding — you reply asynchronously through Passport MCP.
  res.sendStatus(200);

  if (event.type !== "agent.run.created") return;

  // 2. Optionally read the conversation before answering.
  const history = await callMcpTool(event.mcp, "get_conversation_history", {
    limit: 50,
  });

  // 3. Do your work (call an LLM, run a tool, etc.).
  const answer = await doWork(event.input.message, history);

  // 4. Deliver the result with post_reply. The first reply closes the active
  //    user turn; later post_reply calls append messages while mcp.token lives.
  await callMcpTool(event.mcp, "post_reply", {
    message: answer,
    idempotencyKey: `reply:${event.run.id}:1`,
  });
});

async function callMcpTool(mcp, name, args) {
  const res = await fetch(mcp.url, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "accept": "application/json, text/event-stream",
      "authorization": `Bearer ${mcp.token}`,
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: crypto.randomUUID(),
      method: "tools/call",
      params: { name, arguments: args },
    }),
  });
  if (!res.ok) throw new Error(await res.text());
  return res.json();
}

async function doWork(message, history) {
  return `You said: ${message}`;
}

app.listen(3000);

// MCP endpoint is "https://onbf.ai/api/mcp".
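
The `callMcpTool` helper above advertises `text/event-stream` in its `accept` header and then calls `res.json()`. Under MCP's Streamable HTTP transport a server may answer a POST with either a JSON body or an SSE stream, and a tool failure can arrive inside a 200 response as a JSON-RPC `error` or as a result with `isError: true`. Whether ONBF's endpoint ever streams is not stated on this page, so the following is only a defensive sketch; `parseMcpBody` and `unwrapToolResult` are hypothetical helper names, not part of any ONBF SDK.

```javascript
// Hypothetical helpers (assumptions, not an ONBF API).

// Extract the JSON-RPC response message from an MCP HTTP response body.
//   contentType: value of the response's content-type header
//   bodyText:    the full response body, read as text
function parseMcpBody(contentType, bodyText) {
  if (!String(contentType).includes("text/event-stream")) {
    return JSON.parse(bodyText);
  }
  // SSE: events are separated by blank lines; an event's payload is the
  // concatenation of its "data:" lines joined with "\n".
  let last = null;
  for (const event of bodyText.split(/\r?\n\r?\n/)) {
    const data = event
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (!data) continue;
    const msg = JSON.parse(data);
    // Only the response to our request carries "result" or "error";
    // notifications sent on the same stream are skipped.
    if (msg && typeof msg === "object" && ("result" in msg || "error" in msg)) {
      last = msg;
    }
  }
  if (!last) throw new Error("No JSON-RPC response found in event stream");
  return last;
}

// Turn protocol-level and tool-level failures into thrown errors.
function unwrapToolResult(rpc) {
  if (rpc.error) {
    throw new Error(`MCP error ${rpc.error.code}: ${rpc.error.message}`);
  }
  if (rpc.result && rpc.result.isError) {
    const text = (rpc.result.content || [])
      .filter((part) => part.type === "text")
      .map((part) => part.text)
      .join("\n");
    throw new Error(text || "Tool call failed");
  }
  return rpc.result;
}
```

One way to wire this in would be to end `callMcpTool` with `return unwrapToolResult(parseMcpBody(res.headers.get("content-type"), await res.text()));` in place of `return res.json();`.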

#Multiple replies and errors

post_reply intentionally has no status field. Every call appends an assistant message. The first call also completes the run if it is still open; later calls only append messages.

| Scenario | Effect | Input |
| --- | --- | --- |
| First post_reply | Appends an assistant message and completes the active run if it is still queued/dispatching/running. | message required |
| Later post_reply | Appends another assistant message while the MCP session token is valid. The run status is not reopened. | message required |
| Tool/retry failure | If you can reach MCP, post a user-facing explanation. If you cannot, let the run timeout/fail naturally. | message required |

Posting multiple assistant messages

javascript
// Send multiple assistant messages by calling post_reply more than once.
// The first post_reply closes the active user turn. Later calls append normal
// messages while the MCP session token remains valid.

await callMcpTool(event.mcp, "post_reply", {
  message: "Working on it — pulled 42 tickets…",
  idempotencyKey: `reply:${event.run.id}:progress-1`,
});

await callMcpTool(event.mcp, "post_reply", {
  message: "Categorized them into 5 themes…",
  idempotencyKey: `reply:${event.run.id}:progress-2`,
});

await callMcpTool(event.mcp, "post_reply", {
  message: "Here's your summary: …",
  idempotencyKey: `reply:${event.run.id}:final`,
});
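
When an agent sends many progress messages, hand-writing a distinct key for each call gets error-prone. A small wrapper can number the replies for you. This is a sketch under assumptions: `makeReplier` is a hypothetical helper, and `send` stands for whatever function actually posts the reply (for example `(args) => callMcpTool(event.mcp, "post_reply", args)`).

```javascript
// Hypothetical helper (not an ONBF API): numbers each reply so every
// message in a run gets its own idempotency key of the form
// "reply:<runId>:<n>".
function makeReplier(runId, send) {
  let seq = 0;
  return async function reply(message) {
    seq += 1;
    return send({ message, idempotencyKey: `reply:${runId}:${seq}` });
  };
}
```

The counter lives in memory, so the keys are only stable if a re-run emits its messages in the same order. If a restarted process reuses `reply:<runId>:1` for different text, the existing message is returned rather than a new one being inserted, as described under Idempotency below.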

Posting an error-style assistant message

javascript
// There is no separate "failed reply" status. If you can reach MCP, post a
// user-facing apology or explanation. If you cannot post anything, do nothing;
// ONBF will mark the run timed out/failed through the normal run lifecycle.
await callMcpTool(event.mcp, "post_reply", {
  message: "I couldn't reach the upstream model. Please try again.",
  idempotencyKey: `reply:${event.run.id}:error`,
});
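
Because the webhook is acknowledged before the work starts, an exception thrown later has no HTTP response to travel on. One way to apply the rule above is to wrap the work so that a failure becomes an error-style reply, and a failure to post is simply swallowed. The sketch below is illustrative: `answerOrApologize`, the apology text, and the `final`/`error` key suffixes are assumptions, and `postReply` stands for a function that sends `{ message, idempotencyKey }` through post_reply.

```javascript
// Hypothetical wrapper (not an ONBF API).
//   work:      async function that returns the answer text
//   postReply: async function that sends { message, idempotencyKey }
// Resolves to true if a message was posted, false if posting failed.
async function answerOrApologize(runId, work, postReply) {
  let message;
  let idempotencyKey;
  try {
    message = await work();
    idempotencyKey = `reply:${runId}:final`;
  } catch (err) {
    // The work failed: fall back to a user-facing explanation.
    message = "Something went wrong on my side. Please try again.";
    idempotencyKey = `reply:${runId}:error`;
  }
  try {
    await postReply({ message, idempotencyKey });
    return true;
  } catch (err) {
    // MCP was unreachable or rejected the call. Do nothing further and
    // let the run time out or fail through the normal run lifecycle.
    return false;
  }
}
```

In the minimal agent this could replace steps 3 and 4, for example `await answerOrApologize(event.run.id, () => doWork(event.input.message, history), (args) => callMcpTool(event.mcp, "post_reply", args));`.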

#Idempotency, expiry and cancellation

  • Idempotent: pass a stable idempotencyKey. Retrying the same key in the same conversation returns the existing message instead of inserting a duplicate.
  • Expiry: mcp.expiresAt controls how long the session token can be used for post_reply and job tools. This is independent of the run's initial reply timeout.
  • Late replies: if the run expired but the MCP token is still valid, post_reply can append the message while the run remains expired for diagnostics.
  • Cancellation: if the user stopped the bound run, post_reply is rejected so a cancelled run cannot produce a late assistant bubble.
  • Trust the token, not ids: the target user, conversation, project and run are resolved from the MCP token binding — never from caller-supplied ids.
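
The idempotency and expiry rules above combine naturally into a retry loop: reuse the same key on every attempt so a lost response cannot produce a duplicate, and stop once the token has expired. This is a sketch, not a prescribed pattern. `postWithRetry` and its options are hypothetical, `send` stands for the function that performs the post_reply call, and the code assumes `mcp.expiresAt` is a value that `new Date(...)` can parse, which this page does not specify.

```javascript
// Hypothetical helper (not an ONBF API): retries a reply with the SAME
// args, and therefore the same idempotencyKey, on every attempt.
async function postWithRetry(send, args, options = {}) {
  const { attempts = 3, delayMs = 500, expiresAt } = options;
  let lastError = new Error("postWithRetry: no attempts made");
  for (let i = 0; i < attempts; i += 1) {
    // Assumption: expiresAt is parseable by Date (e.g. an ISO 8601 string).
    if (expiresAt !== undefined && Date.now() >= new Date(expiresAt).getTime()) {
      throw new Error("MCP token expired; not retrying");
    }
    try {
      return await send(args);
    } catch (err) {
      lastError = err;
      // Exponential backoff between attempts: delayMs, 2*delayMs, ...
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

A rejection caused by a cancelled run will not succeed on retry, so a real agent would likely inspect the error and stop early in that case; how ONBF signals cancellation in the error payload is not covered here.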