Hallucination is happening by agents

narendra_sompalle · September 14, 2025, 11:02am

I built an AMS project using CrewAI that integrates with JIRA, ServiceNow, GitHub, and a knowledge base (RAG). I am facing hallucination issues and running into a lot of problems. Can you please tell me how to fix this?

JoRo · September 18, 2025, 2:02pm

That’s pretty cool that you got all those integrations working. Would love to hear how easy it was. Or hard. I’d like to try the same thing for my company.

As for hallucinations: do you have observability on your LLM calls? It will help you identify where in your pipeline you need to focus. Hallucinations are a challenge we all face.

Look at your task prompt and be sure to tell it to only use retrieved sources.

Look at your retrieved chunks to see if it is even possible to answer the question based on them. These first two checks are a great place to start.

After that, you will know whether you need to change your embedding model or your prompt expansion.

Then you can check the output and experiment with the LLM that generates the response.

Lastly, add citations to build credibility.
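A minimal sketch of the "only use retrieved sources" and citation advice, in plain Python. The `Chunk` type, the `build_grounded_prompt` helper, and the prompt wording are all assumptions made for illustration. None of it is CrewAI API, and the exact instructions would need tuning for your model.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source_id: str  # hypothetical identifier, e.g. a knowledge base article ID
    text: str


def build_grounded_prompt(question: str, chunks: list[Chunk]) -> str:
    """Build a prompt that restricts the model to the retrieved chunks."""
    if not chunks:
        # Nothing was retrieved, so make that visible instead of leaving a gap.
        sources = "(no sources retrieved)"
    else:
        sources = "\n".join(f"[{c.source_id}] {c.text}" for c in chunks)
    return (
        "Answer using ONLY the sources below. "
        "Cite the source ID in square brackets after each claim. "
        "If the sources do not contain the answer, reply exactly: "
        "'Not found in the retrieved sources.'\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )
```

Printing the prompt for a few failing questions also makes it easy to see whether the retrieved chunks could have answered the question at all.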

Good luck

narendra_sompalle · September 19, 2025, 4:48am

This is not happening in RAG. It is happening with JIRA and ServiceNow. Even when nothing relates to a project ID and that project ID does not exist at all, the agent reports that it is creating the ticket. The same happens with ServiceNow: it generates random INC numbers and URLs.

Tony_Wood · September 19, 2025, 12:02pm

Great work.

For me, when I face hallucinations, I break the crew down into smaller parts. It takes a little time, but break it down and then find the problematic area.

Hope this helps

Teemu_Lantta · January 10, 2026, 7:12am

I’m curious, what sort of hallucinations have they been, for example?

Tony_Wood · January 12, 2026, 5:26pm

Generally they get lost if I am too broad. I call a hallucination anything that goes off my script.

major_tiwari · January 23, 2026, 12:35pm

Could it be due to the temperature setting of the LLM? Which LLM are you using, and can you share the settings?

SystemFlowStudio · February 13, 2026, 1:55am

This usually comes down to task boundary leakage + missing termination conditions.
In CrewAI I’ve seen hallucinations spike when agents are allowed to “self-extend” tasks without explicit success/fail criteria.

What helped me:

• hard stop conditions per task
• explicit “no new assumptions” system instruction
• logging intermediate agent thoughts (even briefly) to spot divergence
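Roughly what a hard stop with explicit success/fail criteria can look like, as a framework-independent sketch. `run_with_hard_stop`, `step`, and `is_success` are hypothetical names. `step` stands in for one agent iteration and is not a CrewAI call.

```python
from typing import Callable, Optional


def run_with_hard_stop(
    step: Callable[[int], str],
    is_success: Callable[[str], bool],
    max_steps: int = 5,
) -> Optional[str]:
    """Call `step` until `is_success` accepts its output or the budget runs out."""
    for attempt in range(1, max_steps + 1):
        output = step(attempt)
        # Log every intermediate output so divergence is visible later.
        print(f"step {attempt}: {output!r}")
        if is_success(output):
            return output
    # Explicit failure: return None instead of letting the agent keep extending the task.
    return None
```

The point of returning `None` is that the caller has to handle failure explicitly, rather than accepting whatever the last iteration produced.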

Curious — are your hallucinations appearing mid-task or at final output?

Bin_Zhang · March 9, 2026, 1:57pm

Interesting problem — this might actually be a bit different from the usual “hallucination” people talk about.

In many agent systems the issue isn’t only hallucinated text, but what I’d call action hallucination. The agent starts executing tools or workflows even when the underlying signal is weak or ambiguous.

When tools like JIRA or ServiceNow are connected, a few things can help:

• Make tool descriptions extremely explicit about when they should be used.
• Add constraints in the task prompt (for example: only create tickets if a verified incident exists).
• Log the reasoning + tool calls so you can trace why the agent decided to create something.
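One way to sketch the logging point: record every tool call, then check that any incident number in the agent's final answer was actually returned by a tool. Everything here is an assumption for illustration. `create_incident` is a stand-in that returns a fixed fake ID and does not call ServiceNow, and the `INC\d+` pattern is a guess at the ID format.

```python
import re
from functools import wraps

TOOL_LOG: list[dict] = []  # in-memory log; a real system would persist this


def logged_tool(func):
    """Record every call to a tool together with its arguments and result."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        TOOL_LOG.append(
            {"tool": func.__name__, "args": args, "kwargs": kwargs, "result": result}
        )
        return result
    return wrapper


@logged_tool
def create_incident(summary: str) -> str:
    # Hypothetical stand-in for a real ServiceNow call; returns a fixed fake ID.
    return "INC0012345"


def unverified_incident_ids(agent_output: str) -> set[str]:
    """Return INC numbers the agent mentions that no logged tool call returned."""
    claimed = set(re.findall(r"INC\d+", agent_output))
    returned = {str(entry["result"]) for entry in TOOL_LOG}
    return claimed - returned
```

If `unverified_incident_ids` returns anything, the agent is reporting a ticket that no tool created, which is the symptom described earlier in the thread.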

In practice I’ve found the biggest stability improvement comes from adding a validation layer between the agent and external systems. Instead of executing actions immediately, the system checks whether the action is consistent with the context.

Curious if others here have run into similar “action hallucination” behavior when agents interact with enterprise tools.

major_tiwari · March 10, 2026, 2:50am

@Bin_Zhang Thank you for the explanation. I would like to better understand what is meant by the validation layer. Is it intended to function as a set of guardrails, checkpoints, or another form of control mechanism? Additionally, where would this layer sit within the architecture, and how would adding an extra layer contribute to improved performance? Is it possible to achieve a similar result by designing the tasks to be more deterministic instead?

Bin_Zhang · March 10, 2026, 5:43am

Good question.

The validation layer I mentioned is basically a control point between the agent and external systems.

Instead of letting the agent execute actions directly, the flow becomes:

agent → validation layer → tools / APIs

So the agent proposes an action, and the system checks it before execution.

Typical checks are simple things like:

• does the action match the current context
• is the tool call structurally valid
• is the action allowed by policy
• should the action be blocked or delayed
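The checks above can be sketched as a small function that sits in front of the real tool call. This is an illustration under stated assumptions, not a reference implementation: `ProposedAction`, `ALLOWED_TOOLS`, `REQUIRED_PARAMS`, and the `create_jira_ticket` tool name are all hypothetical, and `known_project_ids` is assumed to come from a real lookup against JIRA.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedAction:
    tool: str
    params: dict = field(default_factory=dict)


# Hypothetical policy: which tools are allowed and which parameters each requires.
ALLOWED_TOOLS = {"create_jira_ticket"}
REQUIRED_PARAMS = {"create_jira_ticket": {"project_id", "summary"}}


def validate(action: ProposedAction, known_project_ids: set[str]) -> tuple[bool, str]:
    """Return (allowed, reason) for an action the agent proposes."""
    # Policy check: is this tool allowed at all?
    if action.tool not in ALLOWED_TOOLS:
        return False, "tool not allowed by policy"
    # Structural check: are all required parameters present?
    missing = REQUIRED_PARAMS[action.tool] - set(action.params)
    if missing:
        return False, f"missing parameters: {sorted(missing)}"
    # Context check: does the project actually exist?
    if action.params["project_id"] not in known_project_ids:
        return False, "unknown project_id"
    return True, "ok"
```

Only when `validate` returns `True` would the system execute the real API call. A rejected action can be returned to the agent with the reason, so it cannot claim a ticket was created for a project that does not exist.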

The reason I find this useful is that many agent failures are not just text hallucinations but what I call action hallucination.

The agent decides to execute something even though the signal is weak or ambiguous.

This becomes risky once agents are connected to systems like JIRA, databases, or internal APIs.

The validation layer acts as a control point between reasoning and execution.

Another benefit is that every action can be logged so the full chain of decisions can be reconstructed later.
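A minimal sketch of such a log, assuming a JSON Lines format with one decision per line. The `ActionTrace` class and its field names are hypothetical. Timestamps are left out on purpose so that identical runs produce identical traces.

```python
import json


class ActionTrace:
    """Append-only record of proposed actions and the decision made for each."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def record(self, tool: str, params: dict, allowed: bool, reason: str) -> None:
        entry = {
            "seq": len(self._lines) + 1,
            "tool": tool,
            "params": params,
            "allowed": allowed,
            "reason": reason,
        }
        # sort_keys keeps the serialized form stable for identical entries.
        self._lines.append(json.dumps(entry, sort_keys=True))

    def dump(self) -> str:
        """Return the trace as JSON Lines, one decision per line."""
        return "\n".join(self._lines)

    @staticmethod
    def load(text: str) -> list[dict]:
        """Rebuild the ordered chain of decisions from a JSON Lines dump."""
        return [json.loads(line) for line in text.splitlines() if line]
```

Loading a dump gives back the decisions in order, which is enough to see which proposed action was blocked and why.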

I’ve been experimenting with this idea as an execution-integrity layer for agent runtimes:

Still early exploration, but the goal is to make action traces deterministic and portable across frameworks like LangGraph, CrewAI, and AutoGen.

Bin_Zhang · March 10, 2026, 5:50am
