Discussion about this post

PEG

Thoughtful piece, but I wonder if there’s a category mismatch. Your framework assumes agents have persistent goals that can be delegated and constrained through policy. But if LLMs are stateless pattern matchers where “goals” are just temporary statistical biases from context (as the injective proof suggests), what exactly is being constrained?

The attacks you’re defending against exploit accumulated context across sessions, not weak authorization boundaries. No component-level policy can see breaches that emerge from individually compliant interactions.
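
A minimal sketch of the failure mode described above (the limits, names, and numbers here are hypothetical, not from the post): a component-level policy that sees one request at a time can approve every call, while the accumulated total quietly crosses a threshold that no single check ever observes.

```python
from dataclasses import dataclass

PER_REQUEST_LIMIT = 100    # hypothetical: records any single call may export
CROSS_SESSION_LIMIT = 250  # hypothetical: what actually matters, but nothing enforces it

@dataclass
class Request:
    records_requested: int

def component_policy(req: Request) -> bool:
    """Authorization check that sees only the current request, never the history."""
    return req.records_requested <= PER_REQUEST_LIMIT

exported_total = 0
for req in [Request(90), Request(90), Request(90)]:
    if component_policy(req):          # every call is individually compliant
        exported_total += req.records_requested

# 270 records have left the building; no policy check ever saw that number.
print(exported_total, exported_total > CROSS_SESSION_LIMIT)
```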

Curious how your framework accounts for this: are you assuming something like persistent intention emerges during inference?

Daniel Hardman

Hey, Phil. Great post, and I agree that authorizing agents is tricky and needs to be done continuously. I also strongly agree with your other post that agents are not a good choice for a policy engine due to their probabilistic nature.

I wanted to mention several things that seem adjacent to what you're talking about here, and that might pique your interest or invite your feedback.

1. A supposition in many discussions about agents is that they have sameness across time and are thus worthy of an identity that asserts that sameness. But if an agent changes its model or its training data or its policies, what is really the "same" about it from Time1 to Time2, such that it is worthy of having an identity that connects its existence at those two points? The only sameness in such a model might be the controller's access point, or possibly the agent's stored data used for RAG, or possibly the OAuth tokens it uses to authenticate to external systems. I'm worried that this issue is handwaved past by the designers of A2A and MCP, and represents a fundamental trust gap that can never be filled unless/until we figure out how to associate sameness with an agent in a reliable way. (This may be partly what PEG is alluding to with "stateless pattern matchers... just temporary statistical biases from context", or maybe that's a whole nuther point...) A sketch of this gap appears after this list.

2. I think the entire digital landscape, but ESPECIALLY agent land, is desperately in need of a concept that I call an "intent boundary", which I have written about here: https://dhh1128.github.io/papers/intent-boundaries.html

3. I 100% agree that the constraints you want to place on agents are mandatory, and I think there should be many more. I did some work on constrained delegation a couple of years ago, and it is now becoming relevant to stuff I'm working on in verifiable voice (telco). It could also be applied to agents. See https://github.com/provenant-dev/public-schema/blob/main/gcd/index.md (this doc is KERI-oriented, but the same principles could be applied more widely). Implicit in this is that KERI delegation is 2-way (it has built-in mechanisms for the delegator to hold the delegate accountable and keep it transparent), whereas normal delegation is 1-way (transfer authority and then have no ability to monitor it, revoke it, prevent its redelegation, etc.). A sketch of that contrast appears after this list.
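
As a purely illustrative sketch of the sameness gap in point 1 (not A2A, MCP, or KERI semantics; every field name here is assumed): if the identifier and controller key stay fixed while the model and policies are versioned underneath them, the identity only ever asserts controller sameness, not behavioral sameness.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentSnapshot:
    agent_id: str        # identifier that persists across time
    controller_key: str  # the controller's access point / key
    model_hash: str      # changes whenever the model is swapped or retrained
    policy_hash: str     # changes whenever policies are edited

t1 = AgentSnapshot("agent-42", "key:controller-1", "sha256:aaa", "sha256:p1")
t2 = AgentSnapshot("agent-42", "key:controller-1", "sha256:bbb", "sha256:p2")

# The "identity" still matches, but nothing that governs behavior is shared.
print(t1.agent_id == t2.agent_id and t1.controller_key == t2.controller_key)  # True
print(t1.model_hash == t2.model_hash or t1.policy_hash == t2.policy_hash)     # False
```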
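
And a rough sketch of the 1-way vs. 2-way delegation contrast in point 3 (hypothetical names, not the GCD/KERI format): the 2-way form keeps the delegator in the loop with a revocation flag, an audit callback, and a no-redelegation constraint that is checked at every use.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class OneWayGrant:
    delegate: str
    scope: str
    # Once issued, the delegator is out of the loop: no revocation hook,
    # no visibility into use, nothing preventing redelegation.

@dataclass
class TwoWayGrant:
    delegator: str
    delegate: str
    scope: str
    allow_redelegation: bool = False
    revoked: bool = False
    audit: Callable[[str], None] = print  # delegator-visible record of each use

    def use(self, actor: str, action: str) -> bool:
        if self.revoked:
            return False                           # delegator can pull authority back
        if actor != self.delegate and not self.allow_redelegation:
            return False                           # redelegation blocked by constraint
        self.audit(f"{actor} used '{self.scope}' for: {action}")
        return True

grant = TwoWayGrant("issuer", "agent-42", "send-invoices")
grant.use("agent-42", "invoice #1001")   # allowed, and reported back to the delegator
grant.revoked = True
grant.use("agent-42", "invoice #1002")   # refused after revocation
```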
