Facts, affordances, and pertinence: who decides what an agent does — code or the model

Two complaints come up again and again about production agents. One: it confidently states things that aren’t true. Two: it keeps pushing actions at the user that don’t fit the moment. They look like separate problems. They’re actually the same mistake, made in opposite directions — a decision was handed to the model that should have been in code, or buried in code when it should have been left to the model.

The fix is a vocabulary. Every piece of information in an agent’s context plays exactly one of three roles, and the role tells you who owns it.

The three roles

Facts — “I have.” Deterministic state the agent knows to be true: the account exists, the trial ends Friday, the device isn’t connected, the user told you last week they prefer plain language. Facts are computed in code, from a source of truth. The model never derives a fact; it only reads one.

Affordances — “I could.” The closed set of platform-recognised actions available given the facts: connect a device, add a payment method, review last week’s summary. Crucially this is a set with eligibility conditions, not a ranked queue. “The device is unconnected” (a fact) is what makes “connect the device” (an affordance) eligible. Affordances are also computed in code.

Pertinence — “I should.” Whether to surface a given affordance in this turn, and how to phrase it. This is conversational judgment over the facts, the available affordances, and the flow of the dialogue. Pertinence is the model’s job — and only the model’s job.

The three compose on every turn: the model reads the facts, sees which affordances are eligible, and decides what’s pertinent to say right now.

Why the split is the whole game

Look at what each role actually requires.

Facts and affordances demand correctness and consistency. The trial-end date must be the real date, every time. Eligibility for an action must follow the same rule on every turn. That’s exactly what code is good at and exactly what language models are bad at — ask a model to compute a billing date and it will, occasionally, invent a plausible one.

Pertinence demands reading the room. Should I bring up the unconnected device now, or is the user mid-crisis about something else? How firm should this reminder be given they just signed up yesterday? That’s exactly what large models are good at and exactly what code is miserable at — every attempt to encode “what to say next” as rules collapses into a brittle decision tree that feels robotic.

So the two failure modes map cleanly onto two boundary violations:

Confabulation is what you get when you delegate a fact to the model. Asking the LLM to decide whether an action “applies,” or to compute a date, or to determine which user a setting belongs to. Those are facts. Leave them in code.
Robotic nagging is what you get when you encode pertinence deterministically. A ranked “next best action” queue that fires regardless of the conversation. Surfacing in the moment is the one thing the model is for; a queue takes it away.

Naming things honestly

A small example of how much the vocabulary matters. We once had a field called next_actions. The name quietly implied a queue — a pre-decided priority order the agent should march through. It encouraged exactly the over-eager nudging we didn’t want.

Renaming it to actions — just the available set, with eligibility conditions, no ordering — changed how everyone reasoned about it. The model was now free to treat it as a library to draw from when pertinent, not a checklist to burn down. A field name shaped behaviour, because it shaped where people thought the decision lived.

The test

When you’re unsure where a piece of logic belongs, ask what happens if the model gets it wrong.

If a wrong answer means a false statement, a security or authorization slip, or an incorrect platform action — it’s a fact or an affordance. Compute it in code. The model should never be in a position to be wrong about it.

If a wrong answer just means slightly awkward phrasing or a mistimed suggestion, with no real-world consequence — that’s pertinence. Let the model handle it, and don’t bury it in rules.

Deterministic where being wrong is expensive. Conversational where being wrong is cheap. Most agent reliability work, in the end, is just moving decisions to the correct side of that line.