I’ve been experimenting with how to make agent decisions inspectable in complex CrewAI workflows.
In larger crews, I found that reconstructing why an agent made a choice usually means correlating a lot of unstructured logs across tasks and agents. That breaks down quickly once you have policy checks, risk evaluation, and multi-step reasoning.
So I tried a different approach:
instead of emitting logs, agents emit structured decision events.
Each decision captures:
- context (inputs / constraints)
- actor (agent + role)
- logic (policy / rule / rationale)
- outcome (decision result)
- lineage (parent decisions)
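To make that concrete, here's a minimal sketch of what one of those decision events could look like as a Python dataclass. This is illustrative only; the field names mirror the five components above, but the exact schema in the repo may differ.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from typing import Any

@dataclass
class DecisionEvent:
    actor: str                                         # agent name + role
    context: dict[str, Any]                            # inputs / constraints
    logic: str                                         # policy / rule / rationale
    outcome: dict[str, Any]                            # decision result
    lineage: list[str] = field(default_factory=list)   # parent decision ids
    decision_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_jsonl(self) -> str:
        # one self-contained JSON object per line -> appendable JSONL stream
        return json.dumps(asdict(self))

# hypothetical example event
event = DecisionEvent(
    actor="policy_guard",
    context={"plan": "scale-out", "budget": 500},
    logic="policy: total_cost <= budget",
    outcome={"approved": True},
)
print(event.to_jsonl())
```

Because each event is a single JSON object, disagreements (e.g. policy approves, risk rejects) show up as two events sharing the same parent in `lineage` with conflicting `outcome` fields.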
I built a deterministic 5-agent demo crew (Analyst → Optimizer → Policy Guard → Risk → Planner) where:
- policy vs. risk disagreements are visible as first-class objects
- every final plan can be traced back through a causal decision chain
- output is JSONL (easy to diff, replay, or analyze)
- no external APIs or keys required (fully offline)
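The "trace back through a causal decision chain" part reduces to a graph walk over the JSONL file: start from the final decision and follow `lineage` pointers until you hit roots. A rough sketch, assuming the event shape above (the repo's actual format may differ):

```python
import json

def load_events(jsonl_text: str) -> dict[str, dict]:
    """Index a JSONL trace by decision_id."""
    return {e["decision_id"]: e
            for e in (json.loads(line) for line in jsonl_text.splitlines() if line)}

def causal_chain(events: dict[str, dict], decision_id: str) -> list[dict]:
    """Walk lineage pointers from a final decision back to its roots (BFS)."""
    chain, queue, seen = [], [decision_id], set()
    while queue:
        did = queue.pop(0)
        if did in seen or did not in events:
            continue
        seen.add(did)
        chain.append(events[did])
        queue.extend(events[did].get("lineage", []))
    return chain

# toy 3-event trace: analyst -> policy_guard -> planner
trace = "\n".join([
    '{"decision_id": "a1", "actor": "analyst", "lineage": []}',
    '{"decision_id": "b2", "actor": "policy_guard", "lineage": ["a1"]}',
    '{"decision_id": "c3", "actor": "planner", "lineage": ["b2"]}',
])
events = load_events(trace)
chain = causal_chain(events, "c3")
print([e["actor"] for e in chain])  # → ['planner', 'policy_guard', 'analyst']
```

Since the trace is plain JSONL, the same walk works offline on a diffed or replayed file with no framework state in memory.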
Implementation is decorator-based (@trace_decision) so it doesn’t change CrewAI execution semantics.
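For anyone curious what "decorator-based" means here before opening the repo: the idea is that the wrapped function runs exactly as before, and the event is recorded as a side effect. A minimal sketch of such a decorator, with names (`trace_decision`, `TRACE`) that are illustrative rather than the repo's actual API:

```python
import functools
import json

TRACE: list[dict] = []  # in practice this would be an appendable JSONL sink

def trace_decision(logic: str):
    """Record a structured decision event around a function call."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            result = fn(*args, **kwargs)  # execution semantics unchanged
            TRACE.append({
                "actor": fn.__qualname__,
                "context": {"args": repr(args), "kwargs": repr(kwargs)},
                "logic": logic,
                "outcome": repr(result),
            })
            return result
        return inner
    return wrap

# hypothetical agent-side check
@trace_decision(logic="risk: reject if exposure > 0.8")
def assess_risk(exposure: float) -> bool:
    return exposure <= 0.8

assess_risk(0.5)
print(json.dumps(TRACE[-1]))
```

Because the decorator only observes inputs and the return value, it composes with CrewAI task callables without touching how the crew schedules or executes them.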
Curious if others here have hit similar issues around auditing, debugging, or governance for multi-agent flows.
Repo if anyone wants to look at the mechanics:
https://github.com/logicoflife/crewai-decision-trace
Sharing in case it’s useful for anyone working on more complex crews.