Category: CrewAI, Tools Issue, Task Processing
Description of the Problem:
We are using CrewAI to extract structured real estate details from property descriptions provided in a CSV file. The dataset has columns such as House Address, Guide Price, and Description, where the Description column holds free-text property details such as:
- Tenure (Freehold/Leasehold)
- Number of Bedrooms
- House Type (Flat, Detached, Bungalow, etc.)
- Annual Ground Rent (If applicable)
- EPC Rating (Energy Efficiency Rating)
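To make the target concrete, each property should reduce to a record like the one below (a rough sketch only; the Pydantic model and field names are our own illustration, not something CrewAI requires):

```python
from pydantic import BaseModel

class PropertyDetails(BaseModel):
    """Target record for one property; 'Not Provided' marks missing fields."""
    tenure: str = "Not Provided"              # Freehold / Leasehold
    bedrooms: str = "Not Provided"            # e.g. "3"
    house_type: str = "Not Provided"          # Flat, Detached, Bungalow, ...
    annual_ground_rent: str = "Not Provided"  # only relevant for leasehold
    epc_rating: str = "Not Provided"          # e.g. "C"
```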
The issue arises when extracting structured details from the unstructured Description field. The LLM is hallucinating incorrect values and generating outputs that do not match the dataset.
What We Have Tried:
- Refining the Task Prompt
  - Clearly instructing the agent to extract only explicitly mentioned details and return “Not Provided” for missing fields (the exact rules appear in the tasks.yaml sketch further down).
  - Ensuring correct UK real estate terminology is used.
- Adjusting the LLM Configuration
  - Setting temperature to 0.0 to reduce randomness (the full configuration appears in the crew.py sketch further down).
  - Using DeepSeek-R1 (1.5B) on Ollama for structured text extraction.
- Testing the CSV Tool Independently
  - The CSVSearchTool correctly retrieves the Description column (see the snippet just after this list).
  - However, when the agent processes the descriptions, it fabricates incorrect data.
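This is roughly how we tested the tool on its own (the CSV path and query are illustrative):

```python
from crewai_tools import CSVSearchTool

# RAG tool pointed at our dataset (path is illustrative)
csv_tool = CSVSearchTool(csv="data/properties.csv")

# Querying the tool directly returns the expected Description text
print(csv_tool.run(search_query="12 Example Road description"))
```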
Code Implementation Overview:
- Agents Configuration (agents.yaml)
  - Role: Real Estate Data Analyst
  - Goal: Extract only factual details without making assumptions.
  - Backstory: Ensures structured extraction and correct data alignment.
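A condensed sketch of the agent definition (the exact wording in our file is longer):

```yaml
real_estate_analyst:
  role: >
    Real Estate Data Analyst
  goal: >
    Extract only factual property details that are explicitly stated in the
    source text, without making assumptions.
  backstory: >
    A meticulous analyst who ensures structured extraction and correct
    alignment between source descriptions and output data.
```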
- Task Definition (tasks.yaml)
  - Extract details from the Description column and return structured CSV output.
  - Strict rules to avoid hallucination and ensure factual accuracy.
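A condensed sketch of the task definition; the rule list is the part we keep tightening:

```yaml
extract_property_details:
  description: |
    From the property description below, extract: Tenure, Number of Bedrooms,
    House Type, Annual Ground Rent, EPC Rating.

    Rules:
    - Use only details explicitly stated in the description.
    - If a field is not mentioned, return exactly "Not Provided".
    - Do not guess, infer, or invent values.
    - Use correct UK real estate terminology.

    Description: {description}
  expected_output: >
    One CSV row with the columns Tenure, Bedrooms, House Type,
    Annual Ground Rent, EPC Rating.
  agent: real_estate_analyst
```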
- Crew Configuration (crew.py)
  - LLM: DeepSeek-R1 (1.5B)
  - Task Inputs: Property descriptions from the CSV.
  - Output: Structured CSV file with extracted details.
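And a condensed sketch of crew.py. In our project the agent and task are loaded from the YAML files above; they are shown inline here so the wiring is visible, and the model tag and base_url are the usual Ollama defaults:

```python
from crewai import Agent, Crew, LLM, Process, Task
from crewai_tools import CSVSearchTool

# DeepSeek-R1 1.5B served locally via Ollama; temperature 0.0 to reduce randomness
llm = LLM(
    model="ollama/deepseek-r1:1.5b",
    base_url="http://localhost:11434",
    temperature=0.0,
)

csv_tool = CSVSearchTool(csv="data/properties.csv")  # path is illustrative

analyst = Agent(
    role="Real Estate Data Analyst",
    goal="Extract only factual details without making assumptions.",
    backstory="Ensures structured extraction and correct data alignment.",
    llm=llm,
    tools=[csv_tool],
    verbose=True,
)

extract = Task(
    description=(
        "Extract Tenure, Bedrooms, House Type, Annual Ground Rent and EPC "
        "Rating from this description: {description}. Return 'Not Provided' "
        "for any field not explicitly stated."
    ),
    expected_output=(
        "One CSV row: Tenure, Bedrooms, House Type, Annual Ground Rent, EPC Rating"
    ),
    agent=analyst,
)

crew = Crew(agents=[analyst], tasks=[extract], process=Process.sequential)
result = crew.kickoff(inputs={"description": "..."})  # one Description cell per run
print(result)
```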
Current Issue:
- The agent sometimes makes up property details (wrong house type, incorrect tenure, fabricated EPC ratings).
- The output CSV is not aligned with the original dataset.
- Even when explicitly told to return “Not Provided” for missing data, it generates incorrect values instead.
Question to the Community:
Has anyone faced similar issues with CrewAI or LLM hallucination in structured data extraction?
Are there specific techniques or settings that helped in ensuring factual consistency when extracting structured details?
Any insights on refining prompting techniques, CrewAI task configuration, or LLM adjustments to mitigate hallucination would be greatly appreciated.