security · 8 min read

Safe Expression Evaluation for AI Agent Enforcement Boundaries

Safe expression evaluation for AI agent enforcement boundaries in capital markets workflows: restricted grammars, parser isolation, policy-condition validation, and independently verifiable execution

Published 2026-03-21 · AI Syndicate

For capital markets AI agents, workflow conditions are not just routing logic. They are part of the enforcement boundary. A condition that decides whether a client-record update, KYC field change, trade-support action, or AML disposition can proceed must be evaluated in a way that cannot execute injected code or bypass policy before the action runs.

The audit question is concrete: can the platform prove that the expression used to route or authorize the action was evaluated by a restricted parser, against defined state fields, before execution? If the evaluator accepts arbitrary code, the approval envelope, parameter binding, and fail-closed policy decision can all be undermined by the condition layer.

Syndicate Claw addresses this with a custom recursive-descent parser that evaluates a restricted grammar. The grammar is expressive enough for workflow and policy conditions, but limited enough to prevent code execution.

The eval() Problem

Python's eval() function evaluates a string as a Python expression and runs it with the full privileges of the host process. Such an expression can reach arbitrary operations: file access, network requests, system commands, and data exfiltration.

The attack vector in workflow engines is expression injection. Workflow definitions include conditions, typically as strings that the workflow engine evaluates to determine routing. If these condition strings are derived from untrusted sources, an attacker may be able to inject malicious code.

Consider a workflow with a condition: "transaction.amount > 1000". This is a straightforward comparison. Now imagine the condition is constructed from user input: f"transaction.amount > {user_amount_threshold}". If user_amount_threshold contains malicious code instead of a number, the expression becomes a code injection vector.

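eval() provides no protection against this. Any string that eval() can parse can call functions, import modules, or reach object internals, and restricting the globals passed to eval() is not a reliable sandbox.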
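To make the risk concrete, here is a minimal sketch of the f-string pattern described above. The helper build_condition, the sample state, and the payload are all hypothetical, and the payload is deliberately harmless: it only reads the process ID, where a real attacker would call something far worse.

```python
from types import SimpleNamespace

def build_condition(user_amount_threshold: str) -> str:
    # Hypothetical helper: interpolates untrusted input into the condition string.
    return f"transaction.amount > {user_amount_threshold}"

def make_scope() -> dict:
    # Fresh scope per call, because eval() adds __builtins__ to the dict it is given.
    return {"transaction": SimpleNamespace(amount=5)}

# Intended use: the threshold is a number, so the condition is a plain comparison.
benign = build_condition("1000")
benign_result = eval(benign, make_scope())

# Injected use: the "threshold" carries extra Python that eval() will run.
payload = "1000 or __import__('os').getpid() > 0"
injected = build_condition(payload)
injected_result = eval(injected, make_scope())

print(benign, "->", benign_result)
print(injected, "->", injected_result)
```

The benign condition evaluates to False because 5 is not greater than 1000. The injected condition evaluates to True: the comparison still fails, but the appended clause runs an imported function and flips the outcome, so the same input both executes code and bypasses the threshold.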

The Safe Expression Grammar

Syndicate Claw's expression evaluator, implemented as _ConditionParser, supports a restricted grammar:

**Literals:** strings, numbers, booleans, null

**Identifiers:** state.field references that access workflow state

**Comparison operators:** ==, !=, <, >, <=, >=

**Boolean operators:** and, or, not

**Grouping:** parentheses for precedence control

This grammar is expressive enough for typical workflow conditions. You can express "the transaction amount exceeds the threshold" or "the user has the required role and the resource is not flagged" or "the model output confidence is below the minimum acceptable value."

What you cannot express: function calls, attribute access beyond the defined state structure, imports, comprehensions, assignments, or any other Python construct outside the defined grammar.

The grammar is implemented as a recursive-descent parser. The parser tokenizes the input string, builds an abstract syntax tree from the tokens according to the grammar rules, and evaluates the tree. The evaluation is a straightforward tree walk with no code generation, no dynamic compilation, and no invocation of arbitrary code.
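The tokenize, build, and walk pipeline can be sketched in under a hundred lines. This is an illustration of the technique, not the actual _ConditionParser: the names tokenize, Parser, evaluate, and check are hypothetical, the AST is a plain nested tuple, and state is assumed to be a flat dict keyed by whatever follows the state. prefix.

```python
import operator
import re

# Token kinds: number, quoted string, operator or parenthesis, dotted name.
TOKEN_RE = re.compile(r"""\s*(?:
      (?P<num>-?[0-9]+(?:\.[0-9]+)?)
    | (?P<str>"[^"]*"|'[^']*')
    | (?P<op>==|!=|<=|>=|<|>|\(|\))
    | (?P<name>[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)
)""", re.VERBOSE)
KEYWORDS = {"true": True, "false": False, "null": None}
COMPARE = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
           ">": operator.gt, "<=": operator.le, ">=": operator.ge}

def tokenize(text):
    """Split the input into (kind, value) tokens; reject anything unrecognized."""
    text, pos, tokens = text.strip(), 0, []
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognized input near position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens

class Parser:
    """Builds a tuple-based AST; one method per grammar rule."""
    def __init__(self, tokens):
        self.tokens, self.i = tokens + [("end", "")], 0  # sentinel marks end of input

    def peek(self):
        return self.tokens[self.i]

    def take(self):
        self.i += 1
        return self.tokens[self.i - 1]

    def parse(self):
        node = self.or_expr()
        if self.peek()[0] != "end":
            raise ValueError(f"unexpected token {self.peek()[1]!r}")
        return node

    def binary(self, word, operand):
        node = operand()
        while self.peek() == ("name", word):
            self.take()
            node = (word, node, operand())
        return node

    def or_expr(self):
        return self.binary("or", self.and_expr)

    def and_expr(self):
        return self.binary("and", self.not_expr)

    def not_expr(self):
        if self.peek() == ("name", "not"):
            self.take()
            return ("not", self.not_expr())
        left = self.primary()
        kind, value = self.peek()
        if kind == "op" and value in COMPARE:  # at most one comparison, no chaining
            self.take()
            return ("cmp", value, left, self.primary())
        return left

    def primary(self):
        kind, value = self.take()
        if kind == "num":
            return ("lit", float(value) if "." in value else int(value))
        if kind == "str":
            return ("lit", value[1:-1])
        if kind == "name" and value in KEYWORDS:
            return ("lit", KEYWORDS[value])
        if kind == "name" and value.startswith("state."):
            return ("field", value[len("state."):])
        if (kind, value) == ("op", "("):
            node = self.or_expr()
            if self.take() != ("op", ")"):
                raise ValueError("expected ')'")
            return node
        raise ValueError(f"unexpected token {value!r}")

def evaluate(node, state):
    """Walk the AST; only dict lookups, comparisons, and boolean logic happen here."""
    tag = node[0]
    if tag == "lit":
        return node[1]
    if tag == "field":
        return state[node[1]]  # KeyError if the field is absent
    if tag == "cmp":
        return COMPARE[node[1]](evaluate(node[2], state), evaluate(node[3], state))
    if tag == "not":
        return not evaluate(node[1], state)
    if tag == "and":
        return bool(evaluate(node[1], state)) and bool(evaluate(node[2], state))
    return bool(evaluate(node[1], state)) or bool(evaluate(node[2], state))

def check(text, state):
    return bool(evaluate(Parser(tokenize(text)).parse(), state))
```

With this sketch, check("state.amount > 1000 and not state.flagged", {"amount": 5000, "flagged": False}) returns True, while an input such as "open('/etc/passwd')" raises ValueError during parsing because the grammar has no rule for a bare name or a call. Nothing in the pipeline compiles or executes the input text.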

State Field References

Workflow conditions often need to reference runtime state: the current transaction amount, the authenticated user's role, the model output value, or the classification assigned to a KYC workflow. The grammar supports state.field references for this purpose.

A state field reference accesses a named field from the workflow state. The parser validates that the field name exists and is accessible before evaluation. Arbitrary attribute access is not permitted. Only fields defined in the workflow state schema can be referenced.

This prevents attacks that attempt to access internal state, system properties, or environment variables. The allowed fields are explicitly defined, not dynamically discovered.
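A minimal sketch of that allowlist check follows. The schema contents, the resolve_field function, and the UnknownFieldError type are hypothetical names chosen for illustration, and nested dicts are assumed as the state representation.

```python
# Hypothetical schema: the only field paths a condition may reference.
ALLOWED_FIELDS = frozenset({"transaction.amount", "user.role", "kyc.classification"})

class UnknownFieldError(LookupError):
    """Raised when a reference is undeclared or missing from the state."""

def resolve_field(reference: str, state: dict, allowed=ALLOWED_FIELDS):
    prefix = "state."
    if not reference.startswith(prefix):
        raise UnknownFieldError(f"not a state reference: {reference!r}")
    path = reference[len(prefix):]
    # The allowlist is consulted first, so undeclared paths never touch the state.
    if path not in allowed:
        raise UnknownFieldError(f"field not declared in schema: {path!r}")
    value = state
    for key in path.split("."):
        # Plain dict lookups only: no getattr, so object internals stay unreachable.
        if not isinstance(value, dict) or key not in value:
            raise UnknownFieldError(f"field missing from state: {path!r}")
        value = value[key]
    return value

state = {"transaction": {"amount": 2500}, "user": {"role": "analyst", "password": "x"}}
print(resolve_field("state.transaction.amount", state))
```

In this sketch a reference such as state.user.password is rejected even though the value exists in the state, because the path is not in the declared schema. A reference such as state.__class__ fails the same way.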

Parser Implementation

The _ConditionParser is implemented as a recursive-descent parser with three isolated components:

**Tokenizer:** breaks the input string into tokens: literals, operators, identifiers, and punctuation. Malformed input is rejected at this stage.

**AST builder:** constructs an abstract syntax tree from the token stream according to the grammar. Grammar violations are rejected here.

**Evaluator:** walks the tree and produces a boolean result. The evaluator has no access to external state beyond the workflow state and the literal values in the tree.

Each component is isolated. The tokenizer does not execute. The tree builder does not evaluate. Only the evaluator executes, and it operates on a validated, restricted structure.

What Attack Classes Are Eliminated

Safe expression evaluation eliminates several attack classes:

**Code execution.** Without eval(), there is no mechanism to execute arbitrary Python code. The expression grammar does not include function calls, imports, or any construct that could execute system operations.

**OS command injection.** The grammar does not support subprocess execution, shell commands, or file system access. Those capabilities are unavailable through the expression language.

**Data exfiltration.** The grammar does not support network requests, file operations, or any mechanism for exporting data. Even if an attacker could inject content into a condition, the expression language gives them no way to transmit data.

**Attribute traversal.** The grammar does not support arbitrary attribute access. Access is limited to explicitly defined state fields. Internal objects, system properties, and environment variables are inaccessible.

**Denial of service.** The grammar is designed to prevent pathological evaluation cases. Nesting is bounded by the parser's recursion depth limit, and memory consumption is bounded by the input size.
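One way to enforce the denial-of-service bounds is a cheap pre-check that runs before tokenizing. The sketch below is an assumption about how such a guard could look, not a description of the actual parser: the function name check_bounds and both limits are hypothetical.

```python
MAX_EXPRESSION_LENGTH = 1024  # hypothetical limit, chosen for illustration
MAX_NESTING_DEPTH = 32        # hypothetical limit, chosen for illustration

def check_bounds(expression: str) -> int:
    """Reject oversized or over-nested input; return the deepest nesting seen."""
    if len(expression) > MAX_EXPRESSION_LENGTH:
        raise ValueError("expression exceeds maximum length")
    depth = deepest = 0
    quote = None  # quote character of the string literal we are inside, if any
    for ch in expression:
        if quote is not None:
            if ch == quote:
                quote = None
            continue  # parentheses inside string literals do not count
        if ch in "'\"":
            quote = ch
        elif ch == "(":
            depth += 1
            deepest = max(deepest, depth)
            if depth > MAX_NESTING_DEPTH:
                raise ValueError("expression exceeds maximum nesting depth")
        elif ch == ")" and depth > 0:
            depth -= 1  # unbalanced input is left for the parser to reject
    return deepest

print(check_bounds("(state.a > 1) and ((state.b < 2) or state.c)"))
```

The length cap also bounds constructs that recurse without parentheses, such as a long chain of not operators, since the number of tokens can never exceed the number of characters.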

Capital Markets Enforcement Relevance

For a platform engineering team operating AI agents in a Canadian capital markets firm, expression evaluation is part of the control surface. OSFI B-13 expectations around access control and attributable security events do not stop at identity checks. If a condition decides whether a privileged action proceeds, the condition evaluation itself must be bounded and attributable.

In a FINTRAC-adjacent KYC or STR workflow, the same issue appears transactionally. If an AI agent routes a case from review to disposition based on workflow state, the regulated entity needs evidence of the condition that ran, the fields it referenced, the parser outcome, and the policy version that interpreted the result.

The evidence artifact is not the parser source code by itself. The evidence artifact is the decision record that binds the workflow definition version, condition expression, accessible state fields, actor identity, parser result, and resulting action before execution.
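As an illustration, such a decision record could be modeled as an immutable structure with a content digest. The class name, field names, sample values, and the choice of SHA-256 over canonical JSON are all assumptions made for this sketch. The field list follows the items named in this section.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class DecisionRecord:
    workflow_definition_version: str
    condition_expression: str
    accessible_state_fields: tuple
    actor_identity: str
    parser_result: bool
    policy_version: str
    resulting_action: str

    def digest(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so equal records hash equally.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical values for a KYC routing decision.
record = DecisionRecord(
    workflow_definition_version="kyc-review@14",
    condition_expression="state.risk_score < 40",
    accessible_state_fields=("risk_score",),
    actor_identity="agent:kyc-router",
    parser_result=True,
    policy_version="policy@7",
    resulting_action="route_to_disposition",
)
print(record.digest())
```

Because the digest covers every field, a reviewer holding the record can recompute it and detect any later change to the expression, the field list, or the outcome.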

Safe expression evaluation does not prove that the business rule was legally sufficient. It proves a narrower control: the expression layer did not execute arbitrary code, did not reference undeclared state, and produced a bounded decision the enforcement layer could verify.

Integration with Policy Engine

Safe expression evaluation is not limited to workflow decision nodes. The policy engine uses the same _ConditionParser for policy rule conditions. Policy rules specify conditions under which an action is permitted or denied.

Using the same evaluator for both workflow conditions and policy conditions provides consistency. The security properties that apply to workflow routing also apply to access control decisions.

Policy rule conditions can reference the actor, the resource, the action, and the environment. These references are validated against the policy context schema. The grammar remains restricted; the available context is defined.

Defense in Depth

Safe expression evaluation is one layer in Syndicate Claw's defense-in-depth architecture. Even with a safe parser, other controls protect the system:

Policy evaluation gates tool execution, preventing unauthorized actions even if a condition evaluation is manipulated.

Sandbox enforcement limits what tools can do, even if they are invoked.

State redaction prevents sensitive data from appearing in outputs or logs.

Append-only evidence records capture significant events, providing independently verifiable evidence if something goes wrong.
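The article does not specify how evidence records are made independently verifiable. Hash chaining is one common technique, sketched here purely as an assumption: each entry commits to the hash of the entry before it, so a reviewer can recompute the chain from an export without trusting the system that produced it. The names EvidenceLog, verify, and GENESIS are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _link(prev_hash: str, body: str) -> str:
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

class EvidenceLog:
    """In-memory append-only log; each entry commits to the one before it."""

    def __init__(self):
        self._entries = []

    def append(self, event: dict) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        body = json.dumps(event, sort_keys=True)
        entry = {"prev": prev_hash, "event": body, "hash": _link(prev_hash, body)}
        self._entries.append(entry)
        return entry["hash"]

    def export(self) -> list:
        # Copies, so callers cannot mutate the log through the returned entries.
        return [dict(entry) for entry in self._entries]

def verify(entries: list) -> bool:
    """Recompute the chain from exported entries, without trusting the log object."""
    prev_hash = GENESIS
    for entry in entries:
        if entry["prev"] != prev_hash or entry["hash"] != _link(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True

log = EvidenceLog()
log.append({"condition": "state.risk_score < 40", "result": True, "action": "allow"})
log.append({"condition": "state.flagged == true", "result": False, "action": "deny"})
print(verify(log.export()))
```

Editing or removing any earlier entry changes its hash and breaks every link after it, which is what makes tampering detectable from the export alone.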

Safe expression evaluation removes expression injection as a route to code execution. Defense in depth means that even if another control fails, actions at the enforcement boundary remain bounded, attributable, and reviewable.

Frequently asked questions

Why is eval() dangerous in workflow condition evaluation?

eval() executes arbitrary Python code. If expression strings are derived from untrusted input, an attacker can inject code that eval() then runs, leading to data exfiltration or system compromise.

How does safe expression evaluation work?

Safe expression evaluation uses a custom recursive-descent parser that tokenizes input, builds an AST from a restricted grammar, and evaluates only that structure. No code execution occurs; the grammar supports comparisons and boolean logic only.

Why does expression evaluation matter for AI agent enforcement?

Workflow and policy conditions decide whether an AI agent can proceed. If those conditions can execute arbitrary code or reference undeclared state, the enforcement boundary can be bypassed before the action runs.

What evidence should a reviewer expect from a safe expression evaluator?

A reviewer should be able to see the workflow definition version, condition expression, accessible state fields, parser outcome, actor identity, policy version, and resulting action bound in the decision record before execution.

Can safe expression evaluation prove a workflow decision was legally sufficient?

No. Safe expression evaluation proves that the expression layer was bounded, did not execute arbitrary code, and produced a verifiable decision. It does not prove that the business rule or compliance disposition was legally sufficient.
