Own CrewAI Ollama Cloud Provider

My Problem

CrewAI’s built-in Ollama support has a catch: it routes through the OpenAI-compatible shim (/v1/chat/completions). That’s not the real Ollama API — it’s a translation layer bolted onto Ollama to make it look like OpenAI. Works… mostly. But it means:

  • No access to Ollama-native features — thinking/reasoning tokens, keep_alive, native tool calls, fine-grained options like num_predict, mirostat, etc.
  • Indirect error handling — context overflows get mangled through the OpenAI error format, making debugging harder.
  • No Ollama Cloud support — ollama’s hosted models (like gpt-oss:120b-cloud or kimi-k2.6-cloud) expect native API auth flows, not the OpenAI shim.
  • LiteLLM dependency — another layer between you and the model, adding latency and complexity.

If you’re running models locally, the shim works. But if you want the full Ollama feature set — especially on cloud — you’re out of luck.

Zero Bloat

The whole provider is ~500 lines of Python. Dependencies: crewai (≥0.80.0) and httpx (≥0.25.0). No OpenAI SDK. No LiteLLM. No proxy.

39 unit tests, ruff-clean, GitHub Actions CI on Python 3.10–3.12.

Who Is This For?

  • Ollama Cloud users — finally a direct way to use ollama.com models in CrewAI
  • DeepSeek-R1 / Kimi users — proper thinking token handling
  • Self-hosting with auth — any HTTPS Ollama instance with API key
  • Anyone who wants full control over Ollama’s native parameters

Try It

  pip install crewai-ollama-cloud
  from crewai import Agent, Task, Crew
  from crewai_ollama_cloud import OllamaCloudProvider

  llm = OllamaCloudProvider(
      model="llama3.1:8b",
      base_url="http://localhost:11434",
      stream=True,
  )

  agent = Agent(role="Analyst", goal="Analyze", backstory="...", llm=llm)
  task = Task(description="Summarize", expected_output="Summary")
  crew = Crew(agents=[agent], tasks=[task])
  result = crew.kickoff()

Repo: https://github.com/Hackbard/crewai-ollama-cloud
Issues / PRs welcome.

1 Like