Issue with Litellm Perplexity API

Hi,

I am trying to use Perplexity as a search agent. It used to work with the old Llama models, but it no longer works with the latest Sonar models. I have the following code:

from openai import OpenAI
import os
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, Process, LLM
from langchain_openai import ChatOpenAI

load_dotenv(".env", override=True)

perplexity_llm = LLM(
    api_key=os.getenv("PERPLEXITYAI_API_KEY"),
    model="perplexity/sonar",
    base_url="https://api.perplexity.ai/",
)

Is there a new setup for Perplexity to get it working in CrewAI? By the way, it works fine in a non-CrewAI context. With the setup above it now returns error code 400.

Thx

Seems the Sonar models are not supported by LiteLLM yet: Perplexity AI (pplx-api) | liteLLM
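If you want to check which Perplexity models your installed LiteLLM release actually registers, something like this should work (a sketch; `model_cost` is LiteLLM's internal model registry, the exact entries depend on your LiteLLM version, and a model missing from it isn't necessarily uncallable):

# Quick check of the Perplexity entries registered in the installed LiteLLM release
import litellm

print("perplexity/sonar" in litellm.model_cost)
print(sorted(m for m in litellm.model_cost if m.startswith("perplexity/")))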

Yeah, but from that page I also tried:

perplexity/pplx-70b-online

and this also does not work. Any idea which one would work?

Thx

After digging around a bit, the Sonar models should work. Can you try this one more time:

perplexity_llm = LLM(
    model="sonar",
    base_url="https://api.perplexity.ai/"
)

Thx, I tried, but it still does not work. What versions of crewai and litellm are you using?

CrewAI version 0.102.0, which version are you on?

Same version, but for some reason it does not work for me. This is most exasperating. I really think CrewAI should have someone monitoring the community support forum to help out with queries like these. I wonder if the other multi-agent platforms are as problematic as this one.

Well, I don’t think you need to set base_url. Have you tried just this:

perplexity_llm = LLM(
    model="perplexity/sonar",
)

Hi Max, yes I did try your suggestion. Here is the result that I got:

BadRequestError: litellm.BadRequestError: PerplexityException - Error code: 400 - {'error': {'message': 'custom stop words are not implemented for completions.', 'type': 'unsupported_parameter', 'code': 400}}

It somehow implies that CrewAI is adding some kind of stop words (a `stop` parameter) to the Perplexity request. This is baffling, because my CrewAI setup used to work perfectly well in the past with the older Llama models on Perplexity.
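If you want to confirm that theory, a quick sketch like this should reproduce the same 400 by passing `stop` explicitly through LiteLLM (the stop sequence below is just an illustrative, CrewAI-style value, not the exact one CrewAI sends):

# Sketch: reproduce the 400 by sending an explicit `stop` parameter through LiteLLM
import os
from litellm import completion

os.environ['PERPLEXITYAI_API_KEY'] = 'YOUR_KEY_NOT_MINE'

try:
    completion(
        model='perplexity/sonar',
        messages=[{'role': 'user', 'content': 'Say hello.'}],
        stop=['\nObservation:'],  # illustrative stop sequence; any value should trigger it
    )
except Exception as e:
    print(e)  # expected: the same "custom stop words are not implemented" 400 error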

Hey Moe, digging around a bit more, I saw that other people are encountering error code 400 in the LiteLLM library when using models from the Sonar series; for example, see here. CrewAI relies on LiteLLM for its interfaces with LLMs, so let's confirm whether the problem is in LiteLLM itself, alright? I propose the following code:

from litellm import completion
import os

os.environ['PERPLEXITYAI_API_KEY'] = 'YOUR_KEY_NOT_MINE'
MODEL = 'perplexity/sonar'

messages = [
    {'role': 'system', 'content': 'You are a helpful AI philosopher.'},
    {'role': 'user', 'content': 'What is the meaning of life in one paragraph?'},
]

litellm_response = completion(
    model=MODEL,
    messages=messages,
)

print(f'\nLiteLLM Response:\n\n{litellm_response}')

If the same problem shows up, then it’s something that needs to be fixed in the LiteLLM library, which CrewAI depends on.

I’m new to this area, still learning how to handle agent frameworks, but I suspect you could make the connection with LangChain, for example, using something like langchain_community.chat_models.perplexity.ChatPerplexity.
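For reference, a connection through that class might look roughly like this (a sketch only; I'm assuming the PPLX_API_KEY environment variable name and that the bare "sonar" model name is accepted, so double-check against your installed langchain_community version):

# Sketch of the LangChain route mentioned above (names assumed; verify against
# your installed langchain_community version)
import os
from langchain_community.chat_models.perplexity import ChatPerplexity

os.environ["PPLX_API_KEY"] = "YOUR_API_KEY_HERE"  # env var name is an assumption

chat = ChatPerplexity(
    model="sonar",      # assumption: the bare Sonar model name is accepted here
    temperature=0.3,
)

print(chat.invoke("What is the meaning of life in one paragraph?").content)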

Hi Max, I tried your code above and it works perfectly fine for me. I do get a response from Perplexity. So where does that leave this issue at? Is it a problem with CrewAI that Perplexity cannot run inside CrewAI?

Max, does this make sense?

Moe, if the code using the LiteLLM library worked, then I really can’t understand how CrewAI wouldn’t, since it uses the LiteLLM library under the hood.

For one last check, let’s take a look at a more complete code example, trying a direct connection and using a crew.

My knowledge is limited to this, so I hope it works out.

from crewai import LLM, Agent, Crew, Task, Process
import os

os.environ['PERPLEXITYAI_API_KEY'] = 'YOUR_API_KEY_HERE'
MODEL = 'perplexity/sonar'

perplexity_llm = LLM(
    model=MODEL,
)

print('-------------------------')
print('  LLM.call()')
print('-------------------------')

call_output = perplexity_llm.call(
    "Tell me how many stars there are in the visible sky. "
    "Output a single paragraph."
)

print(f'LLM.call Output:\n\n{call_output}')

print('-------------------------')
print('  LLM.Crew.kickoff()')
print('-------------------------\n')

agent = Agent(
    role="Senior Astrophysicist",
    goal="Spread deep thoughts about the Universe",
    backstory="With over 10 years of experience in Astrophysic, "
              "you excel at contemplating the Universe.",
    llm=perplexity_llm,
    verbose=True,
)

task = Task(
    description="Tell me how many stars there are in the visible sky.",
    expected_output="A single paragraph.",
    agent=agent
)

crew = Crew(
    agents=[agent],
    tasks=[task],
    process=Process.sequential,
    verbose=True,
)

crew_output = crew.kickoff()

print(f"LLM.Crew.kickoff Output:\n\n{crew_output.raw}")

Hi Max, the new code still gives the same 400 error.

Just to be clear, what is currently working for me is the following, and it is not LiteLLM related:

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.getenv("PERPLEXITYAI_API_KEY"),
    base_url="https://api.perplexity.ai",
)

messages = [
    {
        "role": "system",
        "content": (
            "You are an artificial intelligence assistant and you need to "
            "engage in a helpful, detailed, polite conversation with a user."
        ),
    },
    {
        "role": "user",
        "content": "How many stars are in the universe?",
    },
]

# chat completion without streaming
response = client.chat.completions.create(
    model="sonar",
    messages=messages,
)


In the past I was able to get Perplexity to work through LiteLLM with the old Llama-128k models, but those have since been deprecated, and now this error 400 problem has appeared with the new Sonar series.

Hi,

When I use a specific agent, I use this monkeypatch:

import litellm
from crewai import Agent

# Patch unsupported parameters
original_completion = litellm.completion

def patched_completion(*args, **kwargs):
    kwargs.pop('stop', None)  # Remove problematic parameter
    return original_completion(*args, **kwargs)

litellm.completion = patched_completion

# Now use LiteLLM normally
perplexity_agent = Agent(
    role='Researcher',
    llm='perplexity/sonar-pro'
)

At the moment I don't use a Perplexity agent anymore, but have defined it as a tool:

import os
import logging
from typing import Optional, Dict, Any
import requests
from tenacity import retry, wait_exponential, stop_after_attempt, retry_if_exception_type

from crewai.tools import BaseTool

logger = logging.getLogger(__name__)

class PerplexitySearchTool(BaseTool):
    """Tool that performs research using Perplexity API with citation tracking.

    This tool allows agents to search the web for information on a given topic
    and returns results with proper citations.
    """

    name: str = "perplexity_search"
    description: str = "Performs in-depth research on topics with citation tracking"
    api_key: Optional[str] = None
    max_results: int = 5
    result_count: int = 3

    class Config:
        extra = "allow"

    def __init__(
        self,
        api_key: Optional[str] = None,
        max_results: int = 5,
        result_count: int = 3
    ) -> None:
        """Initialize the Perplexity search tool.

        Args:
            api_key: Perplexity API key. Defaults to PERPLEXITY_API_KEY env variable.
            max_results: Maximum number of results to return.
            result_count: Number of results to include in Perplexity API query.
        """
        super().__init__()
        self.api_key = api_key or os.getenv("PERPLEXITY_API_KEY")
        self.max_results = max_results
        self.result_count = result_count

        if not self.api_key:
            logger.warning("PERPLEXITY_API_KEY not found in environment variables")

    @retry(
        wait=wait_exponential(multiplier=1, min=4, max=60),
        stop=stop_after_attempt(3),
        retry=retry_if_exception_type(requests.exceptions.RequestException)
    )
    def _run(self, query: str) -> str:
        """Run research query through Perplexity API and format results with citations.

        Args:
            query: The search query.

        Returns:
            str: Research results with citations.
        """
        if not self.api_key:
            return "Error: Perplexity API key not found in environment variables."

        try:
            headers = {
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json"
            }

            data = {
                "query": query,
                "max_results": self.result_count
            }

            logger.info(f"Sending query to Perplexity API: {query}")
            response = requests.post(
                "https://api.perplexity.ai/search",
                headers=headers,
                json=data,
                timeout=30
            )
            response.raise_for_status()

            results = response.json()
            logger.debug(f"Received response from Perplexity API for query: {query}")

            # Format the response with citations
            formatted_response = f"## Research Results for: {query}\n\n"

            # Add results with citations
            for i, result in enumerate(results.get("results", [])):
                formatted_response += f"### Source {i+1}: {result.get('title', 'Untitled')}\n"
                formatted_response += f"URL: {result.get('url', 'No URL')}\n\n"
                formatted_response += f"{result.get('snippet', 'No content available')}\n\n"

            # Add a summary section
            formatted_response += "## Summary\n\n"
            formatted_response += results.get("summary", "No summary available")

            return formatted_response

        except requests.exceptions.RequestException as e:
            error_msg = f"Error performing Perplexity search: {str(e)}"
            logger.error(error_msg)

            # Fallback to a plain response when the API fails
            return f"""Research results for: {query}

{error_msg}"""

I still have a crew with a research agent which uses an llm_config.yaml:

research_analyst:
  cache: true
  max_retries: 5
  max_tokens: 24000
  model: perplexity/sonar-pro
  temperature: 0.3
  timeout: 60
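
In case it helps, loading that entry into a CrewAI LLM could look roughly like this (a sketch; the loading code and file name are assumptions, only the keys come from the config above):

# Sketch: build a CrewAI LLM from the research_analyst entry in llm_config.yaml
import yaml
from crewai import LLM

with open("llm_config.yaml") as f:
    cfg = yaml.safe_load(f)["research_analyst"]

research_llm = LLM(
    model=cfg["model"],
    temperature=cfg["temperature"],
    max_tokens=cfg["max_tokens"],
    timeout=cfg["timeout"],
)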

Hi Vinnie,

Your monkeypatch code worked!! Fantastic. Thx!

With regard to using it as a tool, I noticed that you are using BaseTool in your code. In the latest update of the crewai_tools library, BaseTool is deprecated. What should be used now instead?