Issue: LLM Hallucination in Structured Data Extraction (CrewAI)

Category: CrewAI, Tools Issue, Task Processing

Description of the Problem:

We are using CrewAI to extract structured real estate details from property descriptions provided in a CSV file. The dataset contains columns such as House Address, Guide Price, and Description, where the Description column holds relevant property details such as:

  • Tenure (Freehold/Leasehold)
  • Number of Bedrooms
  • House Type (Flat, Detached, Bungalow, etc.)
  • Annual Ground Rent (If applicable)
  • EPC Rating (Energy Efficiency Rating)

The issue arises when extracting structured details from the unstructured Description field. The LLM is hallucinating incorrect values and generating outputs that do not match the dataset.

What We Have Tried:

  1. Refining the Task Prompt
  • Clearly instructing the agent to extract only explicitly mentioned details and return “Not Provided” for missing fields.
  • Ensuring correct UK real estate terminology is used.
  2. Adjusting the LLM Configuration (see the sketch after this list)
  • Setting temperature to 0.0 to reduce randomness.
  • Using DeepSeek-R1 (1.5B) on Ollama for structured text extraction.
  3. Testing the CSV Tool Independently
  • The CSVSearchTool correctly retrieves the Description column.
  • However, when the agent processes the descriptions, it fabricates incorrect data.
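
For reference, the LLM configuration from step 2 looks roughly like this (a minimal sketch; the model tag and base_url are assumptions, adjust them to your local Ollama setup):

```python
from crewai import LLM

# Pin temperature to 0.0 for deterministic, extraction-style output.
# Model tag and base_url are illustrative -- match them to your setup.
llm = LLM(
    model="ollama/deepseek-r1:1.5b",
    base_url="http://localhost:11434",
    temperature=0.0,
)
```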

Code Implementation Overview:

  • Agents Configuration (agents.yaml); all three files are sketched after this list

    • Role: Real Estate Data Analyst
    • Goal: Extract only factual details without making assumptions.
    • Backstory: Ensures structured extraction and correct data alignment.
  • Task Definition (tasks.yaml)

    • Extract details from the Description column and return structured CSV output.
    • Strict rules to avoid hallucination and ensure factual accuracy.
  • Crew Configuration (crew.py)

    • LLM: DeepSeek-R1 (1.5B)
    • Task Inputs: Property descriptions from the CSV.
    • Output: Structured CSV file with extracted details.
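
The actual files are longer, but stripped down they look roughly like this (sketches only; the `property_analyst` and `extract_property_details` names are illustrative, not our exact identifiers):

```yaml
# agents.yaml (sketch)
property_analyst:
  role: Real Estate Data Analyst
  goal: Extract only factual details without making assumptions.
  backstory: >
    Ensures structured extraction and correct alignment with the source data.
```

```yaml
# tasks.yaml (sketch)
extract_property_details:
  description: >
    Extract Tenure, Number of Bedrooms, House Type, Annual Ground Rent,
    and EPC Rating from the Description column. Return "Not Provided"
    for any field that is not explicitly mentioned.
  expected_output: One structured CSV row per property.
  agent: property_analyst
```

And the crew wiring (a sketch, assuming direct instantiation rather than the `@CrewBase` decorator style):

```python
# crew.py (sketch)
from crewai import Agent, Crew, Process, Task, LLM

llm = LLM(model="ollama/deepseek-r1:1.5b", temperature=0.0)

analyst = Agent(
    role="Real Estate Data Analyst",
    goal="Extract only factual details without making assumptions.",
    backstory="Ensures structured extraction and correct data alignment.",
    llm=llm,
)

extract = Task(
    description=(
        "Extract Tenure, Bedrooms, House Type, Annual Ground Rent and "
        "EPC Rating from this property description. Return 'Not Provided' "
        "for any missing field.\n\nDescription: {description}"
    ),
    expected_output="The five fields as a structured record.",
    agent=analyst,
)

crew = Crew(agents=[analyst], tasks=[extract], process=Process.sequential)
result = crew.kickoff(inputs={"description": "..."})  # one description per run
```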

Current Issue:

  • The agent sometimes makes up property details (wrong house type, incorrect tenure, fabricated EPC ratings).
  • The output CSV is not aligned with the original dataset.
  • Even when explicitly told to return “Not Provided” for missing data, it generates incorrect values instead.

Question to the Community:

Has anyone faced similar issues with CrewAI or LLM hallucination in structured data extraction?
Are there specific techniques or settings that helped in ensuring factual consistency when extracting structured details?

Any insights on refining prompting techniques, CrewAI task configuration, or LLM adjustments to mitigate hallucination would be greatly appreciated.

Have you tried using Pydantic models?

Hello Alain! I believe I have not tried them. Would you please guide me on this? Currently I'm running DeepSeek-R1 1.5B locally on Ollama and using that.

Check the docs; also, my experience is that smaller models perform poorly when it comes to structured output. Experiment with other models if you can, work on your prompts, include examples (see the sketches below) … I am no expert, but I struggled with this as well.
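
In CrewAI that usually means defining a Pydantic schema and passing it to the task via `output_pydantic`. A minimal sketch (field names are illustrative, and `analyst` stands in for your existing agent):

```python
from pydantic import BaseModel
from crewai import Task

class PropertyDetails(BaseModel):
    # All fields are strings so the model can return "Not Provided".
    tenure: str
    bedrooms: str
    house_type: str
    annual_ground_rent: str
    epc_rating: str

extract = Task(
    description=(
        "Extract the property details from: {description}. "
        "Use 'Not Provided' for any field not explicitly stated."
    ),
    expected_output="A PropertyDetails object.",
    agent=analyst,  # your existing agent
    output_pydantic=PropertyDetails,
)
```

With `output_pydantic`, the task output is parsed and validated against the schema rather than left as free-form text, which tends to cut down on invented fields.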
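
And for in-prompt examples, a few-shot block appended to the task description might look like this (the example descriptions are invented):

```python
# Invented examples showing the exact output format expected,
# including "Not Provided" for absent fields.
FEW_SHOT = """
Description: "A charming two-bedroom leasehold flat with EPC rating C."
Output: Tenure=Leasehold; Bedrooms=2; House Type=Flat; Annual Ground Rent=Not Provided; EPC Rating=C

Description: "Detached family home offered freehold."
Output: Tenure=Freehold; Bedrooms=Not Provided; House Type=Detached; Annual Ground Rent=Not Provided; EPC Rating=Not Provided
"""
```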

Thank you so much for the suggestion, Alain.

I have been trying a lot of DeepSeek models. Currently I am not even using any tool like the CSV or JSON search tools; I am manually preprocessing the CSV file and giving only the required column to the model, roughly as in the sketch below.
Still, I am not getting exactly what I require.
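
The preprocessing looks roughly like this (a sketch; `properties.csv` is a placeholder filename):

```python
import pandas as pd

# Load the source file and keep only the column the agent needs.
df = pd.read_csv("properties.csv")
descriptions = df["Description"].dropna().tolist()

# Each description is then passed to the crew one at a time.
for desc in descriptions:
    result = crew.kickoff(inputs={"description": desc})
```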

I tried giving the complete prompt to the same model directly, without CrewAI, and got the right output.
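
The direct test looked roughly like this (a sketch, assuming the `ollama` Python package; `prompt` is the same extraction prompt):

```python
import ollama

# Same extraction prompt, sent straight to the model with no framework.
response = ollama.chat(
    model="deepseek-r1:1.5b",
    messages=[{"role": "user", "content": prompt}],
    options={"temperature": 0.0},
)
print(response["message"]["content"])
```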

It would be very helpful if someone could guide me through this blocker.

Hi, @Naveed_Ali. I think you'd have more success if you could illustrate your problem in a more concrete way. Even if your solution contains confidential data, you could anonymize it and perhaps generate 5 or 10 sample cases (CSV rows) of the source data, along with examples of how you would like the final data to be presented.
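
For example, a couple of entirely made-up rows in the shape of your source data:

```csv
House Address,Guide Price,Description
"1 Example Road, Testtown","£250,000","Two bedroom leasehold flat, EPC rating C, £150 annual ground rent."
"2 Sample Lane, Testtown","£475,000","Detached four bedroom freehold family home. EPC rating B."
```

plus the matching rows you would want in the extracted output.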