Say you want to search the web to find 10 companies, get each company's website URL and summary, and analyze which products and services each could benefit from:
Agent 1 - searches the web (SerperDevTool) to find a company.
Agent 2 - scrapes (ScrapeWebsiteTool) the contents of the found company website and summarizes it, describing what the company does and what market it operates in.
Agent 3 - scrapes (ScrapeWebsiteTool) the contents of the found company website and suggests products that the company might be interested in.
Question 1: How can I pass the whole scraped website from Agent 2 to Agent 3, to reduce duplication and avoid re-scraping the same website?
Question 2: How do I run the same crew more than once (10 times in total) to find and analyze 10 company websites? Is a for loop the best practice in this case, or is there a more specific tool that I should use (something related to flows, agent managers, kickoffs)?
Question 3: How do I pass a website URL from Agent 1 of the first crew iteration to Agent 1 of the second crew iteration, and so on, so that with every iteration Agent 1 knows which websites to exclude from the search because they were already looked at in previous iterations? By the end of the process, the crew must have looked at and analyzed 10 UNIQUE websites. Do I just use an empty array and append to it during every iteration, or is there a more sophisticated/best-practice approach?
I am certain you will get a better answer than this, but here is what I would try.
Configure Agent 3 as a Product Specialist with experience in market research, but do not assign it a scraper tool; just give it an LLM. Write its role and goal so that it analyzes the scraped data it receives rather than fetching anything itself. When you define the task for Agent 3, use context so that the task waits for an earlier task and receives its output:
Task(
    ...,
    agent=product_specialist,  # a single agent, not a list
    context=[scrape_task],  # this task waits for scrape_task to complete and receives its output
)
Another option would be to use a Manager Agent and require that the final report include analysis of 10 distinct companies. It will do them all at once (in theory), no loop required. If you go this route, I would try putting the 10 unique companies requirement in the 'expected_output' part of the task description (tasks.yaml file). But you will need to be sure your LLM rate limits are not exceeded.
Otherwise, you will need to set up rate limits within the crew, or just do one company at a time and build a list of excluded companies as you produce reports, so you can tell the Manager Agent which companies are already done. This could be done with simple code that calls the crew with an f-string that populates the topic listed in the main.py file; {topic} then appears in the tasks.yaml file for all of the tasks except the one for Agent 3.
In either case you could add one additional agent that has the sole purpose of selecting one company or 10 companies from the initial search space, or even an agent that keeps track of what has already been done.
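Here is a rough sketch of the one-at-a-time approach with an exclusion list, in case it helps. It is not tied to any particular CrewAI setup: `kickoff` is a hypothetical callable that you would write yourself, for instance a wrapper around `crew.kickoff(inputs=inputs)` that also pulls the company URL out of the result. The names `build_topic`, `analyze_unique_companies`, `target`, and `max_attempts` are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: takes the crew inputs, returns (website_url, report).
KickoffFn = Callable[[Dict[str, str]], Tuple[str, str]]


def build_topic(base_topic: str, excluded: List[str]) -> str:
    """Fold the already-analyzed URLs into the {topic} input via an f-string."""
    if not excluded:
        return base_topic
    return f"{base_topic}. Exclude these already-analyzed websites: {', '.join(excluded)}"


def analyze_unique_companies(
    kickoff: KickoffFn,
    base_topic: str,
    target: int = 10,
    max_attempts: int = 30,
) -> Dict[str, str]:
    """Run the crew until `target` distinct URLs are analyzed or attempts run out."""
    reports: Dict[str, str] = {}
    for _ in range(max_attempts):
        if len(reports) >= target:
            break
        inputs = {"topic": build_topic(base_topic, list(reports))}
        url, report = kickoff(inputs)
        # Normalize lightly so "https://a.com" and "https://a.com/" count as one site.
        key = url.strip().lower().rstrip("/")
        if key in reports:
            continue  # the agent ignored the exclusion list; try again
        reports[key] = report
    return reports
```

The `max_attempts` cap is there because an LLM may ignore the exclusion list now and then; without it, a stubborn duplicate would loop forever. A plain list (or dict, as here) that grows each iteration is probably all the "memory" this needs.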