Replies — sending assistant messages
How your agent sends assistant messages after your webhook fires: send replies through the MCP `post_reply` tool with the run-scoped `mcp.token` from the webhook payload.
#The contract
Where the MCP token comes from: Every reply is sent through the MCP `post_reply` tool and authorized by the run-scoped `mcp.token` delivered in the `agent.run.created` webhook (see Your Webhook). This page is about sending assistant messages *back* into the conversation.
- Your webhook fires and you acknowledge it with a `2xx` (see Your Webhook).
- You do your work, then call the MCP `post_reply` tool with `Authorization: Bearer <mcp.token>`.
- Send `{ message, idempotencyKey }`. The first reply closes the active user turn; later replies append while the token is valid.
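As a rough sketch of the contract above, the helper below pulls out only the fields a reply needs and returns `null` for events it cannot answer. `extractReplyContext` is a hypothetical name, and the payload shape is inferred from the fields this page mentions (`type`, `run.id`, `input.message`, `mcp.url`, `mcp.token`); treat the schema on Your Webhook as authoritative.

```javascript
// Hypothetical helper. Field names are inferred from this page, not taken
// from a published schema.
function extractReplyContext(event) {
  // Only agent.run.created events carry a run you can reply to.
  if (!event || event.type !== "agent.run.created") return null;

  const { mcp, run, input } = event;
  // Without the MCP URL and run-scoped token there is no way to call post_reply.
  if (!mcp || typeof mcp.url !== "string" || typeof mcp.token !== "string") {
    return null;
  }
  if (!run || run.id == null) return null;

  return {
    runId: String(run.id),
    message: input && typeof input.message === "string" ? input.message : "",
    mcp: { url: mcp.url, token: mcp.token },
  };
}
```

Returning `null` instead of throwing keeps the webhook handler simple: acknowledge with a `2xx` first, then skip the work when there is nothing to reply to.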
Replies are not jobs: Use `post_reply` for assistant messages. Use Jobs when the user needs to approve scoped or billable work before your agent starts.
#A minimal agent
End to end: acknowledge the webhook, then post your answer as a reply through the MCP `post_reply` tool using `mcp.token`.
// Import randomUUID explicitly so the sample does not depend on a global
// `crypto` object being available in your Node.js version.
import { randomUUID } from "node:crypto";
import express from "express";

const app = express();
app.use(express.json());

// ONBF POSTs here when a user sends your agent a message.
app.post("/onbf/webhook", async (req, res) => {
  const event = req.body;

  // 1. ACK immediately (2xx) so ONBF marks the run "running". Do the real
  //    work AFTER responding; you reply asynchronously through Passport MCP.
  res.sendStatus(200);
  if (event.type !== "agent.run.created") return;

  try {
    // 2. Optionally read the conversation before answering.
    const history = await callMcpTool(event.mcp, "get_conversation_history", {
      limit: 50,
    });

    // 3. Do your work (call an LLM, run a tool, etc.).
    const answer = await doWork(event.input.message, history);

    // 4. Deliver the result with post_reply. The first reply closes the active
    //    user turn; later post_reply calls append messages while mcp.token lives.
    await callMcpTool(event.mcp, "post_reply", {
      message: answer,
      idempotencyKey: `reply:${event.run.id}:1`,
    });
  } catch (err) {
    // The HTTP response is already sent, so a rejection here would otherwise
    // go unhandled. Log it and let the run time out or fail naturally.
    console.error("agent run failed", err);
  }
});

// Calls one MCP tool over JSON-RPC, authorized by the run-scoped token.
async function callMcpTool(mcp, name, args) {
  const res = await fetch(mcp.url, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "accept": "application/json, text/event-stream",
      "authorization": `Bearer ${mcp.token}`,
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: randomUUID(),
      method: "tools/call",
      params: { name, arguments: args },
    }),
  });
  if (!res.ok) throw new Error(await res.text());
  return res.json();
}

// Placeholder for your own logic.
async function doWork(message, history) {
  return `You said: ${message}`;
}

app.listen(3000);
// MCP endpoint is "https://onbf.ai/api/mcp".
#Multiple replies and errors
post_reply intentionally has no status field. Every call appends an assistant message. The first call also completes the run if it is still open; later calls only append messages.
| Scenario | Effect | Input |
|---|---|---|
| First `post_reply` | Appends an assistant message and completes the active run if it is still queued/dispatching/running. | `message` required |
| Later `post_reply` | Appends another assistant message while the MCP session token is valid. The run status is not reopened. | `message` required |
| Tool/retry failure | If you can reach MCP, post a user-facing explanation. If you cannot, let the run time out or fail naturally. | `message` required |
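One way to cover the failure row is a small wrapper that posts either the answer or an explanation, and gives up quietly when MCP itself cannot be reached. This is a sketch under assumptions: `replyOrExplain` is a hypothetical name, and `callTool(name, args)` is an injected stand-in for whatever function you use to call MCP tools for the current run.

```javascript
// Hypothetical wrapper: runs the work, posts the result, and falls back to a
// user-facing explanation when the work fails. `callTool(name, args)` is an
// injected stand-in for an MCP tool caller bound to the current run.
async function replyOrExplain({ runId, work, callTool }) {
  let message;
  let label = "final";
  try {
    message = await work();
  } catch (err) {
    // There is no "failed" status on post_reply, so explain in plain text.
    message = "Something went wrong while preparing your answer. Please try again.";
    label = "error";
  }

  try {
    await callTool("post_reply", {
      message,
      idempotencyKey: `reply:${runId}:${label}`,
    });
    return label; // "final" or "error"
  } catch (err) {
    // MCP could not be reached: post nothing and let the run time out or
    // fail through the normal run lifecycle.
    return "unreachable";
  }
}
```

Injecting `callTool` keeps the wrapper independent of the transport, which also makes it easy to exercise without a network.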
Posting multiple assistant messages
// Send multiple assistant messages by calling post_reply more than once.
// The first post_reply closes the active user turn. Later calls append normal
// messages while the MCP session token remains valid.
await callMcpTool(event.mcp, "post_reply", {
  message: "Working on it. Pulled 42 tickets…",
  idempotencyKey: `reply:${event.run.id}:progress-1`,
});
await callMcpTool(event.mcp, "post_reply", {
  message: "Categorized them into 5 themes…",
  idempotencyKey: `reply:${event.run.id}:progress-2`,
});
await callMcpTool(event.mcp, "post_reply", {
  message: "Here's your summary: …",
  idempotencyKey: `reply:${event.run.id}:final`,
});
Posting an error-style assistant message
// There is no separate "failed reply" status. If you can reach MCP, post a
// user-facing apology or explanation. If you cannot post anything, do nothing;
// ONBF will mark the run timed out/failed through the normal run lifecycle.
await callMcpTool(event.mcp, "post_reply", {
  message: "I couldn't reach the upstream model. Please try again.",
  idempotencyKey: `reply:${event.run.id}:error`,
});
#Idempotency, expiry and cancellation
- Idempotent: pass a stable `idempotencyKey`. Retrying the same key in the same conversation returns the existing message instead of inserting a duplicate.
- Expiry: `mcp.expiresAt` controls how long the session token can use `post_reply` and job tools. This is independent from the run's initial reply timeout.
- Late replies: if the run expired but the MCP token is still valid, `post_reply` can append the message while the run remains expired for diagnostics.
- Cancellation: if the user stopped the bound run, `post_reply` is rejected so a cancelled run cannot produce a late assistant bubble.
- Trust the token, not ids: the target user, conversation, project and run are resolved from the MCP token binding, never from caller-supplied ids.
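The idempotency and expiry rules can be sketched as two small helpers. Both names are hypothetical. The key format copies the examples on this page, and the format of `mcp.expiresAt` is an assumption (an ISO 8601 string or epoch milliseconds), since it is not stated here.

```javascript
// Hypothetical helper: derive the key from the run id and a stable label so
// that a retry of the same logical reply reuses the same key.
function makeReplyKey(runId, label) {
  return `reply:${runId}:${label}`;
}

// Hypothetical helper: a local pre-check before calling post_reply.
// ASSUMPTION: mcp.expiresAt is an ISO 8601 string or epoch milliseconds.
// skewMs leaves a margin for clock drift between your server and ONBF.
function tokenLooksValid(mcp, now = Date.now(), skewMs = 5000) {
  const expiresAt =
    typeof mcp.expiresAt === "number" ? mcp.expiresAt : Date.parse(mcp.expiresAt);
  if (Number.isNaN(expiresAt)) return false; // missing or unparseable value
  return now + skewMs < expiresAt;
}
```

A local check like this only avoids pointless calls. The server still decides: a call can be rejected even when the token looks valid, for example after the user stops the run.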