My Problem
CrewAI’s built-in Ollama support has a catch: it routes through the OpenAI-compatible shim (/v1/chat/completions). That’s not the real Ollama API — it’s a translation layer bolted onto Ollama to make it look like OpenAI. Works… mostly. But it means:
- No access to Ollama-native features — thinking/reasoning tokens, keep_alive, native tool calls, fine-grained options like num_predict, mirostat, etc.
- Indirect error handling — context overflows get mangled through the OpenAI error format, making debugging harder.
- No Ollama Cloud support — ollama’s hosted models (like gpt-oss:120b-cloud or kimi-k2.6-cloud) expect native API auth flows, not the OpenAI shim.
- LiteLLM dependency — another layer between you and the model, adding latency and complexity.
If you’re running models locally, the shim works. But if you want the full Ollama feature set — especially on cloud — you’re out of luck.
Zero Bloat
The whole provider is ~500 lines of Python. Dependencies: crewai (≥0.80.0) and httpx (≥0.25.0). No OpenAI SDK. No LiteLLM. No proxy.
39 unit tests, ruff-clean, GitHub Actions CI on Python 3.10–3.12.
Who Is This For?
- Ollama Cloud users — finally a direct way to use ollama.com models in CrewAI
- DeepSeek-R1 / Kimi users — proper thinking token handling
- Self-hosting with auth — any HTTPS Ollama instance with API key
- Anyone who wants full control over Ollama’s native parameters
Try It
pip install crewai-ollama-cloud
from crewai import Agent, Task, Crew
from crewai_ollama_cloud import OllamaCloudProvider
llm = OllamaCloudProvider(
model="llama3.1:8b",
base_url="http://localhost:11434",
stream=True,
)
agent = Agent(role="Analyst", goal="Analyze", backstory="...", llm=llm)
task = Task(description="Summarize", expected_output="Summary")
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
Repo: https://github.com/Hackbard/crewai-ollama-cloud
Issues / PRs welcome.