Trying to understand CrewAI: is this really about agents, or just managing LLM calls?

I am a faculty member at a US university. I got interested in CrewAI because we have an upcoming project where we evaluate our courses and compare them against industry trends for relevancy discussions. I thought it might work for this use case.

I decided to implement a simple test for just grading and see how it goes. I created three “agents”: one to pull a student’s discussion posts from the discussion forum, one to grade based on a rubric and instructions, and one to use the grading results to craft a feedback response, plus one more at the end that just looks at all the results and provides a summary for the instructor. It worked fine.

The issue is, what I implemented is just a series of LLM calls and not much more. The first pushes the forum export and receives that student’s specific work. The grader pushes that plus the rubric and grading instructions and receives an evaluation. The feedback writer pushes that plus instructions on tone and receives an email. I could easily do all of this manually using custom GPTs or Gemini Gems. This is nice automation, but I am not seeing the agent angle.
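To make the setup concrete, here is a stripped-down sketch of roughly what I built, expressed with CrewAI’s Python API. The role names, task descriptions, and input keys are illustrative rather than my exact code, and field names may vary slightly by version:

```python
from crewai import Agent, Task, Crew, Process

retriever = Agent(
    role="Discussion Post Retriever",
    goal="Pull a single student's posts out of the forum export",
    backstory="Knows the structure of the LMS discussion export.",
)
grader = Agent(
    role="Grader",
    goal="Grade the student's posts against the rubric and grading instructions",
    backstory="An experienced TA who applies rubrics consistently.",
)
feedback_writer = Agent(
    role="Feedback Writer",
    goal="Turn the grading results into a constructive feedback email",
    backstory="Writes in a supportive, instructor-approved tone.",
)

pull_posts = Task(
    description="Extract the posts for student {student_id} from this export: {forum_export}",
    expected_output="The student's posts as plain text.",
    agent=retriever,
)
grade_posts = Task(
    description="Grade the posts using this rubric: {rubric}",
    expected_output="A rubric-aligned evaluation with a score per criterion.",
    agent=grader,
    context=[pull_posts],      # receives the retriever's output
)
write_feedback = Task(
    description="Draft a feedback email based on the evaluation. Tone: {tone}.",
    expected_output="A short feedback email addressed to the student.",
    agent=feedback_writer,
    context=[grade_posts],
)

crew = Crew(
    agents=[retriever, grader, feedback_writer],
    tasks=[pull_posts, grade_posts, write_feedback],
    process=Process.sequential,    # each task runs once, in order
)
result = crew.kickoff(inputs={
    "student_id": "123", "forum_export": "...", "rubric": "...", "tone": "supportive",
})
```

With `Process.sequential` and no tools attached, each task runs exactly once in a fixed order, which is why it feels like a scripted chain of LLM calls.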

For me, agents imply:

  • A goal or objective.

  • The ability to plan or decompose that goal into tasks.

  • Iterative reasoning with feedback loops.

  • Some notion of state and progress.

  • A stopping condition that is internally determined rather than externally scripted.

That implies loops, reflection, self-correction, decisions about tool use, and termination logic that emerges from the agent’s own reasoning rather than from being told specifically what to do.
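As a rough, framework-agnostic sketch of what I mean, something like the loop below. The helper functions are trivial stand-ins for LLM or tool calls and are purely illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str
    plan: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)   # explicit notion of state/progress

def plan(goal: str, history: list[str] | None = None) -> list[str]:
    return [f"work toward: {goal}"]            # stand-in for an LLM planning/decomposition call

def act(step: str) -> str:
    return f"did: {step}"                      # stand-in for a tool call the agent chose to make

def goal_satisfied(state: AgentState) -> bool:
    return len(state.history) >= 3             # stand-in for an LLM self-critique / reflection

def run_agent(goal: str, max_steps: int = 10) -> AgentState:
    state = AgentState(goal=goal, plan=plan(goal))
    for _ in range(max_steps):                 # hard safety cap, not the real stop condition
        state.history.append(act(state.plan[0]))         # iterate with feedback
        if goal_satisfied(state):              # stopping condition decided by the agent itself
            break
        state.plan = plan(state.goal, state.history)     # self-correct / replan from what it learned
    return state

print(run_agent("grade one student's discussion posts"))
```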

Is the difference I am seeing here because of my implementation? My real project of looking at courses and their relevancy wouldn’t be all that different. It would still be a bunch of calls to gather various bits of information, and then calling an LLM to evaluate all of it together.

If CrewAI is not really an agent framework but an automated, managed workflow of LLM calls, there is nothing wrong with that. This was helpful to me, and the other project would also benefit from automation. I just want to understand the terms and what I am doing. If I left some capabilities unexplored and can tap into more agentic behavior as described above, that’s great to learn.

Hi Kami,

Your observation is spot on and touches the very core of the “Agent vs. Workflow” debate.

The reason your implementation feels like a series of scripted LLM calls is that there is no Accountability Layer between the steps. In a typical workflow, agents are often just “unreliable narrators” passing text to each other without any formal validation of their claims.

For a system to be truly agentic, it needs what I call Claim Admissibility. An agent shouldn’t just output text; it should produce a Claim that must be validated against Evidence before the next agent (or human) accepts it.

I have been exploring a framework called Neutral Witness & AI Flight Recorder that addresses exactly the points you raised:

Iterative Reasoning: Instead of just moving to the next step, an agent’s output is evaluated by a “Neutral Witness”. If the claim (e.g., a grade) isn’t admissible based on the evidence (the rubric and student post), the agent is forced back into a reasoning loop.

Internal Stopping Condition: The process only terminates when the claims meet the admissibility threshold, rather than just reaching the end of a script.

State and Progress: The AI Flight Recorder creates an immutable “epistemic chain.” In your course evaluation project, this would provide a permanent, auditable record of why a certain course was deemed relevant, making the AI’s “thought process” legally and academically defensible.
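To make the idea concrete, here is a purely illustrative sketch of that claim/evidence/admissibility loop. Since the framework is still at the concept stage, none of these names correspond to a real API, and the witness check is a trivial stand-in:

```python
from dataclasses import dataclass

@dataclass
class Claim:
    statement: str         # e.g. "This post meets criterion 2 at the 'proficient' level"
    evidence: list[str]    # rubric lines and post excerpts cited in support

def witness_admits(claim: Claim) -> bool:
    # Stand-in for the Neutral Witness: here, a claim is admissible only if it cites evidence.
    return len(claim.evidence) > 0

def grade_with_accountability(make_claim, max_rounds: int = 5) -> Claim:
    """Send the grading agent back into its reasoning loop until its claim is admissible."""
    flight_recorder: list[str] = []            # the "epistemic chain" of attempts
    for round_no in range(max_rounds):
        claim = make_claim()                   # stand-in for the grading agent's output
        flight_recorder.append(f"round {round_no}: {claim.statement}")
        if witness_admits(claim):              # internal stopping condition
            return claim
    raise RuntimeError(f"no admissible claim after {max_rounds} rounds: {flight_recorder}")

# Trivial usage: a claim that cites evidence is admitted on the first round.
admitted = grade_with_accountability(
    lambda: Claim("Meets criterion 2", evidence=["rubric item 2", "post excerpt"])
)
print(admitted)
```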

I am currently investigating the best way to integrate this Neutral Witness architecture directly into frameworks like CrewAI to move beyond simple “managed workflows” toward true, accountable autonomy. The conceptual framework is ready, and I am just starting the actual implementation phase.

Does this perspective help answer your question about the “agent angle” you were looking for? I’d love to hear your thoughts on whether an accountability-first approach would satisfy the requirements of your university project.

P.S. One key feature of this architecture is that it supports confidential evaluation. It can compare sensitive data against constraints (like private student records vs. grading rubrics) and confirm admissibility without revealing the underlying private details to other agents or the system logs.

My concept is here in case you want to take a look: GitHub - fxg55647/Neutral-Witness-: An Accountability Infrastructure for Claims in Agent and Human Systems

Great question. CrewAI is a goal-based agentic tool. You give your lead crew a goal and craft your agents, tasks, and tools (the means for it to achieve that goal), so in that way it is agentic.

There are lots of options and ways to change and configure things based on the task, including using flows: Flows - CrewAI
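For example, a Flow lets you wire steps (plain Python, single LLM calls, or whole crews) together explicitly. A rough sketch based on the Flows docs linked above; the class and method names here are made up, and the exact import path and signatures may differ by version:

```python
from crewai.flow.flow import Flow, listen, start

class CourseReviewFlow(Flow):
    @start()
    def gather_course_data(self):
        # In a real flow this step might run a crew or call tools; here it returns a stub.
        return {"course": "CS101", "syllabus": "..."}

    @listen(gather_course_data)
    def compare_to_industry(self, course_data):
        # Each step receives the previous step's output, so plain Python,
        # LLM calls, and whole crews can be mixed inside one flow.
        return f"Relevancy assessment for {course_data['course']}"

print(CourseReviewFlow().kickoff())
```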

When I want a job done right… I use CrewAI

Have a look at some of the examples and see if it is the type of tool you want for achieving your goals: CrewAI Examples - CrewAI