Beyond Denial: Using Policy Constraints to Guide OpenClaw Planning

Feb 18, 2026

Summary: OpenClaw agents plan, adapt, and act over time, so authorization that functions merely as a reactive gate isn’t the best architecture. In this post, I show how integrating Cedar’s query constraints and Typed Partial Evaluation lets OpenClaw discover what is allowed before acting. The result is an agent that plans within policy-defined boundaries while still enforcing every concrete action at runtime.

In my previous post, A Policy-Aware Agent Loop with Cedar and OpenClaw, I showed how to move authorization inside the OpenClaw agent loop so that every tool invocation is evaluated at runtime. Instead of acting as a one-time gate, authorization becomes a feedback signal. Denials do not terminate execution; they guide replanning.

If you haven’t read that post, I recommend starting there. This article builds directly on that architecture and extends the same repository.

In the original demo, we modified OpenClaw to include a Policy Enforcement Point (PEP) in its tool execution path. Every time OpenClaw proposes an action, the PEP intercepts the request, consults Cedar, and receives either a permit or deny decision. A denial becomes structured feedback that the agent incorporates into its next plan. That model shows that authorization belongs inside the loop.

But it is still reactive.

This post describes an extension of the same OpenClaw + Cedar demo that uses Cedar’s Typed Partial Evaluation (TPE) and query constraints to improve planning. Instead of waiting to be denied, OpenClaw can now consult the Cedar policies to determine what constraints apply before proposing an action.

The result is a system that plans within policy instead of reacting to it.

Recap: A Policy-Aware Agent Loop

The architecture from the original post remains largely intact.

In the base demo:

A goal defines the delegation: purpose, scope, duration, and conditions.
The agent produces a plan.
Each proposed tool invocation is intercepted by a Policy Enforcement Point (PEP).
The PEP consults Cedar.
Cedar returns permit or deny.
Denial feeds back into planning.
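
The recap steps can be sketched as a small loop. This is a minimal illustration rather than the demo's code: `Decision`, `run_loop`, and the three callables are hypothetical stand-ins for OpenClaw's planner, the PEP's call to Cedar, and tool execution.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    allowed: bool
    reason: str = ""


def run_loop(
    plan_next: Callable[[list[str]], Optional[dict]],
    authorize: Callable[[dict], Decision],
    execute: Callable[[dict], str],
    max_steps: int = 5,
) -> list[str]:
    """Hypothetical agent loop: every proposed action passes through the PEP."""
    feedback: list[str] = []  # denials the planner can learn from
    results: list[str] = []
    for _ in range(max_steps):
        action = plan_next(feedback)
        if action is None:  # planner considers the goal complete
            break
        decision = authorize(action)  # PEP consults the PDP (Cedar in the demo)
        if decision.allowed:
            results.append(execute(action))
        else:
            # A denial does not end the run; it becomes planning input.
            feedback.append(f"denied {action['tool']}: {decision.reason}")
    return results
```

The point of the sketch is the `else` branch: a denial is appended to the planner's input instead of raising or exiting, which is what turns authorization into a feedback signal.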

This establishes continuous, dynamic authorization. Every action is evaluated in context. Enforcement remains external and deterministic.

But there is an inefficiency: the agent only learns about constraints when it hits them.

From Reactive Authorization to Constraint-Aware Planning

The extension described in the README-query-constraints file adds a new capability: the agent can query Cedar for the constraints that apply before proposing a specific action.

Instead of asking:

“Is this particular action allowed?”

the system can now ask:

“Given this principal and action type, what must be true for actions of this kind to be allowed?”

This is where Typed Partial Evaluation (TPE) comes in.

Cedar evaluates policy with some inputs fixed (for example, the principal and action) while leaving others symbolic (such as the resource or attributes). The result is a residual constraint that describes the allowable space.

That constraint can then be used to guide planning.
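
To make the idea of a residual concrete, here is a toy partial evaluator. It is not Cedar's TPE, which is typed and schema-driven; the tuple-based expression format, the `partial_eval` function, and the use of shell-style patterns are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def partial_eval(expr, known):
    """Simplify a toy policy expression given the known inputs.

    Returns True, False, or a residual expression over the unknown inputs.
    """
    if isinstance(expr, bool):
        return expr
    op = expr[0]
    if op in ("eq", "like"):
        _, var, value = expr
        if var not in known:
            return expr  # input is unknown, so the test stays symbolic
        if op == "eq":
            return known[var] == value
        return fnmatchcase(known[var], value)
    parts = [partial_eval(e, known) for e in expr[1:]]
    if op == "and":
        if any(p is False for p in parts):
            return False
        rest = [p for p in parts if p is not True]
    elif op == "or":
        if any(p is True for p in parts):
            return True
        rest = [p for p in parts if p is not False]
    else:
        raise ValueError(f"unknown operator: {op}")
    if not rest:
        return op == "and"  # empty conjunction is True, empty disjunction is False
    return rest[0] if len(rest) == 1 else (op, *rest)


# Hypothetical policy: the agent may write files under two directories.
POLICY = (
    "and",
    ("eq", "principal", "agent"),
    ("eq", "action", "write_file"),
    ("or", ("like", "path", "/tmp/*"), ("like", "path", "/var/tmp/*")),
)

# Fix principal and action; leave the path unknown.
residual = partial_eval(POLICY, {"principal": "agent", "action": "write_file"})
```

With the principal and action fixed, the two equality tests reduce to true and drop out, so `residual` is just the `or` over the two path patterns. That leftover expression is the "allowable space" the planner can use.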

Reactive model: Policy corrects the agent.
Constraint-aware model: Policy informs the agent.

Architecture Changes

The core PEP → PDP enforcement path from the original demo remains unchanged. Every tool invocation is still evaluated at runtime before execution.

What changes in this extension is that we introduce a distinct planning phase that queries policy before an action is proposed. The system now operates in two clearly separated phases: planning informed by constraints, and execution enforced by policy.

Figure: OpenClaw agent loop extended with both constraint-aware planning (`/query-constraints`) and runtime enforcement (`/authorize`).

Agent Planning Phase

During planning, the agent does not begin by proposing a specific action. Instead, it first asks a policy question using Cedar’s Typed Partial Evaluation (TPE):

“Given this principal and action type, what resources or conditions are permitted?”

Cedar evaluates the policy with some inputs fixed and others symbolic, returning a constraint expression that defines the allowed space. This constraint is incorporated into the system prompt, shaping how the agent reasons about possible next steps.

In other words, policy defines the boundaries of planning before the agent commits to an action.

Agent Execution Phase

Once the agent proposes a concrete action, the flow returns to the familiar enforcement model:

The proposed action is intercepted by the Policy Enforcement Point (PEP).
The PEP constructs an authorization request.
Cedar evaluates the request deterministically.
If permitted, the tool executes.
If denied, the result feeds back into the loop.
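
The execution-phase steps can be sketched as a wrapper around a tool. The request fields, `make_pep`, and `PolicyDenied` are hypothetical names chosen for this example; the demo's actual request format may differ.

```python
from typing import Any, Callable


class PolicyDenied(Exception):
    """Raised by the PEP so the loop can turn a denial into feedback."""


def make_pep(principal: str, is_authorized: Callable[[dict], bool]):
    """Return a function that gates a tool behind an authorization check."""

    def guard(action: str, tool: Callable[..., Any]) -> Callable[..., Any]:
        def guarded(**kwargs: Any) -> Any:
            # Illustrative request shape, not the demo's wire format.
            request = {
                "principal": principal,
                "action": action,
                "resource": kwargs.get("path", ""),
                "context": {},
            }
            if not is_authorized(request):
                raise PolicyDenied(f"{action} denied for {request['resource']}")
            return tool(**kwargs)  # only reached after a permit

        return guarded

    return guard
```

Because the check lives in the wrapper, the tool itself never runs on a denied request, and nothing the model writes in its plan can bypass it.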

This separation is critical. The planning phase is informed by policy-derived constraints, but enforcement remains external and authoritative. The LLM is guided by policy; it does not enforce policy.

Typed Partial Evaluation makes this two-phase model possible. Policy can now both:

Describe the permissible state space during planning, and
Enforce decisions deterministically at runtime.

The result is an OpenClaw agent that moves from purely reactive authorization to constraint-aware planning, while preserving strict runtime enforcement. Policy is not only evaluated for each tool invocation as it occurs, but also defines the boundaries within which OpenClaw is allowed to plan. Typed Partial Evaluation enables OpenClaw to reason within policy-derived limits without collapsing enforcement into the model itself.

The System Prompt: Where Policy Shapes Planning

In the original demo, the system prompt did not contain dynamic policy-derived constraints. The agent would attempt actions and learn from denials. In the extended demo, the system prompt includes structured guidance derived from Cedar’s query constraints.

For example, instead of implicitly discovering that external email requires approval, the agent may now receive prompt guidance that says:

External email requires explicit approval. Do not attempt to send external email unless approval is present.

This changes planning behavior significantly. The agent can reason about constraints before attempting a prohibited action. Importantly:

These constraints are not hard-coded into the prompt.
They are derived dynamically from policy.
They remain subject to runtime enforcement.

The prompt conveys policy-derived guidance to the agent, but policy itself remains external and authoritative.
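
One way to picture the dynamic derivation is a small function that rebuilds the guidance on every planning turn. This is a sketch under assumptions: the function names, the wording of the guidance, and the idea that constraints arrive as a list of path patterns per action are all hypothetical.

```python
def render_constraints(action: str, allowed_patterns: list[str]) -> str:
    """Turn a policy-derived constraint into prompt guidance (hypothetical format)."""
    if not allowed_patterns:
        return f"Policy currently permits no resources for `{action}`. Do not attempt it."
    listed = ", ".join(allowed_patterns)
    return (
        f"Policy constraint for `{action}`: resource must match one of: {listed}. "
        "Every call is still checked at execution time."
    )


def build_system_prompt(base: str, constraints: dict[str, list[str]]) -> str:
    """Rebuild the prompt each planning turn so it tracks current policy."""
    lines = [render_constraints(a, p) for a, p in sorted(constraints.items())]
    return base + "\n\nPolicy guidance:\n" + "\n".join(f"- {line}" for line in lines)
```

Since the guidance text is generated from the query result, a policy change shows up in the next prompt without anyone editing prompt templates.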

Demo Walkthrough: Reactive vs Constraint-Aware

To make the difference concrete, the demo uses a simple file-write scenario. The agent’s goal is to create a file containing "Hello World!". Policy allows writes only under /tmp/* or /var/tmp/*, and forbids writes to protected system paths such as /etc/*.
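
The scenario's rules can be modeled in a few lines. This is a Python approximation for illustration, not the Cedar policy from the repository; it mirrors Cedar's general behavior that a matching forbid overrides any permit and that a request matching no permit is denied.

```python
from fnmatch import fnmatchcase

PERMIT_PATTERNS = ["/tmp/*", "/var/tmp/*"]
FORBID_PATTERNS = ["/etc/*"]


def is_write_allowed(path: str) -> bool:
    """Forbid wins over permit, and anything unmatched is denied by default."""
    if any(fnmatchcase(path, p) for p in FORBID_PATTERNS):
        return False
    return any(fnmatchcase(path, p) for p in PERMIT_PATTERNS)
```

Note that a path such as `/home/user/file.txt` is denied even though no rule names it, because no permit matches.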

Reactive Run (Authorization as Feedback)

In the baseline demo, OpenClaw includes only the runtime enforcement hook (/authorize). There is no planning-time constraint query.

The agent proposes writing to a path such as /etc/demo-test.txt.
The Policy Enforcement Point inside OpenClaw intercepts the request.
The PEP calls Cedar via /authorize.
Cedar evaluates the request and returns deny.
The denial is returned to the agent as structured feedback.
The agent replans and retries with a permitted path such as /tmp/demo-test.txt.
The second attempt is authorized and succeeds.

In this model, policy acts as a gate and a feedback signal. The agent learns its boundaries by hitting them.

Constraint-Aware Run (Planning Within Policy)

In the extended demo, OpenClaw adds a planning-phase hook using /query-constraints. Before committing to a specific path, the agent queries Cedar using Typed Partial Evaluation (TPE).

During planning, OpenClaw calls /query-constraints, supplying the principal (the agent), the action type (for example, write_file), and a symbolic or unknown resource value.

Cedar performs TPE and returns a residual constraint describing allowed paths (for example, /tmp/* or /var/tmp/*).

The constraint is injected into the system prompt and incorporated into planning.
The agent proposes writing directly to /tmp/hello.txt.
The execution-phase PEP still calls /authorize for the concrete request.
Cedar returns permit, and the write succeeds on the first attempt.
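
The constraint-aware run can be simulated end to end with local stand-ins for the two endpoints. The request and response shapes below are assumptions for illustration, and `plan_path` is a deliberately trivial substitute for the LLM's planning.

```python
from fnmatch import fnmatchcase

ALLOWED = ["/tmp/*", "/var/tmp/*"]  # stands in for the Cedar policy


def query_constraints(request: dict) -> dict:
    """Stand-in for /query-constraints; the response shape is assumed."""
    if request["resource"] is not None:
        raise ValueError("resource must be left unknown for a constraint query")
    return {"action": request["action"], "allowed_path_patterns": list(ALLOWED)}


def authorize(request: dict) -> str:
    """Stand-in for /authorize on a fully concrete request."""
    ok = any(fnmatchcase(request["resource"], p) for p in ALLOWED)
    return "permit" if ok else "deny"


def plan_path(filename: str, constraint: dict) -> str:
    """Pick a path inside the first allowed pattern (a trivial 'planner')."""
    prefix = constraint["allowed_path_patterns"][0].rstrip("*")
    return prefix + filename


# Planning phase: ask what is allowed before choosing a path.
constraint = query_constraints(
    {"principal": "agent", "action": "write_file", "resource": None}
)
path = plan_path("hello.txt", constraint)

# Execution phase: the concrete request is still authorized.
decision = authorize({"principal": "agent", "action": "write_file", "resource": path})
```

Here `path` comes out as `/tmp/hello.txt` and `decision` is `permit` on the first attempt. The `authorize` call is not redundant: it is the only step that actually enforces policy.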

Here, policy shapes the plan before execution begins. The agent does not need to discover boundaries through denial; it reasons within policy-derived constraints.

In the reactive version, OpenClaw proposes actions freely and relies on runtime denials to correct its course. In the constraint-aware version, OpenClaw first queries Cedar to understand what is allowed, incorporates those constraints into its reasoning, and then proposes an action that satisfies policy from the start, while still enforcing every concrete request at execution time.

Benefits of Query Constraints

Adding planning-phase constraint queries changes how OpenClaw behaves in measurable and structural ways. The benefits go beyond simply reducing errors; they improve planning quality while preserving strict runtime enforcement.

Fewer Reactive Denials: Because the agent plans within policy-derived constraints, it proposes fewer prohibited actions. Denial becomes exceptional rather than routine.
Better Planning Quality: The agent can reason about the permissible state space before committing to actions. This reduces wasted steps and produces more coherent plans.
Clear Separation of Responsibilities: Cedar remains responsible for enforcement. The agent remains responsible for reasoning. Policy logic is not embedded statically in prompts but derived dynamically from the policy engine.
Stronger Alignment with Continuous Authorization: Every action is still evaluated at runtime. No standing authority is assumed. The system remains consistent with a Zero Trust posture.

The difference between the original reactive model and the constraint-aware model can be summarized as follows:

| Reactive Authorization | Constraint-Aware Authorization |
| --- | --- |
| Agent proposes writing to any path | Agent queries allowed write paths first |
| Cedar denies disallowed paths at runtime | Cedar returns allowed path constraints up front |
| Denial triggers replanning | Plan is formed within allowed namespace |
| Higher frequency of runtime denials | Fewer runtime denials |
| Policy acts primarily as a gate | Policy acts as both boundary definition and gate |

In short, the reactive model shows that authorization adds real value inside the OpenClaw agent loop. The constraint-aware model goes further: it allows policy to define the boundaries of planning itself. OpenClaw no longer discovers limits only by violating them; it reasons within policy-derived constraints while still subjecting every concrete action to deterministic runtime enforcement.

From Feedback to Constraint Systems

In my previous post, authorization became a feedback signal inside the OpenClaw agent loop. With the addition of query constraints and Typed Partial Evaluation, policy evolves into something more powerful: a structured description of permissible behavior. Instead of simply rejecting prohibited actions, policy now defines the boundaries of autonomy while preserving deterministic enforcement.

This shift matters most in more advanced scenarios where reactive denial is insufficient:

Long-running delegations
Capability-based authorization
Multi-agent chains
Regulated environments with strict operational constraints

In these systems, simply denying actions after they are proposed is not enough. Agents must understand the constraints under which they are expected to operate before committing to a course of action. Typed Partial Evaluation provides a clean mechanism for exposing those constraints dynamically, allowing OpenClaw to reason within policy-defined limits while Cedar remains the authoritative enforcement engine.

The original Cedar + OpenClaw demo showed how to make authorization continuous and dynamic. This extension makes it anticipatory. Planning becomes aligned with policy-derived constraints from the outset, and every concrete action is still evaluated at runtime. The result is a system where policy does not merely police behavior; it shapes it.

Agentic systems benefit from dynamic constraint discovery in addition to dynamic authorization. This is the transition from feedback-driven control to policy-based constraint systems, in which OpenClaw operates within clearly defined boundaries of autonomy while enforcement authority stays with the policy engine.

Phil Windley's Technometria
