The Ollama provider enables running local language models on your machine using Ollama.
## Configuration

- Provider: set to `"ollama"`
- Server URL: the Ollama server URL; defaults to `http://localhost:11434`
- Model: the model name (`llama3.2`, `mistral`, `qwen2.5-coder`, etc.)
- Temperature: sampling temperature (0.0-2.0); defaults to `0.7`

### Example Configuration
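The original example was lost in extraction. A minimal sketch, assuming a JSON config with the field names `provider`, `base_url`, `model`, and `temperature` (the names are illustrative, not confirmed by the source):

```json
{
  "provider": "ollama",
  "base_url": "http://localhost:11434",
  "model": "llama3.2",
  "temperature": 0.7
}
```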
## Installation

1. Install Ollama.
2. Pull a model.
3. Verify it's running.
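On Linux or macOS, the three steps above typically look like this (the install script is Ollama's official one; `/api/tags` is Ollama's model-listing endpoint):

```shell
# 1. Install Ollama via the official install script
curl -fsSL https://ollama.com/install.sh | sh

# 2. Pull a model
ollama pull llama3.2

# 3. Verify the server is running (lists installed models)
curl http://localhost:11434/api/tags
```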
## Supported Models

- Llama 3.2: `llama3.2`, `llama3.2:3b`, `llama3.2:1b`
- Llama 3.1: `llama3.1:70b`, `llama3.1:8b`
- Qwen: `qwen2.5-coder`, `qwen2.5`, `qwen2.5:32b`
- Mistral: `mistral`, `mistral-nemo`, `mistral-large`
- Vision: `llava`, `llava:13b`, `bakllava`
- Code: `codellama`, `deepseek-coder`, `starcoder2`
## Capabilities
| Feature | Support |
|---|---|
| Streaming | No (planned) |
| Function Calling | No (uses structured output instead) |
| Vision (images) | Yes (llava, bakllava models) |
| System Messages | Yes |
| Tool Calls | Partial (quirky format, auto-fixed) |
## Tool Call Handling

Ollama models sometimes produce tool calls in quirky formats. The provider auto-fixes these:

### Pattern 1: Nested Wrapper

Wrapper layers around the call are unwrapped to the standard form, e.g. `{"name": "shell", "arguments": {"cmd": "ls"}}`.

### Pattern 2: Prefixed Names

`tool.shell` → `shell`, `tools.file_read` → `file_read`
### Pattern 3: Normal

Standard format is passed through unchanged.
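The actual fix lives in the provider's Zig code; as a rough illustration only, Pattern 2 (prefix stripping) could be sketched in Python like this (the function name and prefix list are assumptions, not from the source):

```python
def normalize_tool_name(name: str) -> str:
    """Strip the 'tool.'/'tools.' prefixes some Ollama models
    prepend to tool names (e.g. 'tool.shell' -> 'shell')."""
    for prefix in ("tools.", "tool."):
        if name.startswith(prefix):
            return name[len(prefix):]
    return name
```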
## Vision Support

Vision models (`llava`, `bakllava`) support image input via an `images` array:
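The page's example was lost in extraction; a sketch in the style of Ollama's generate API, with a placeholder for the base64 image data:

```json
{
  "model": "llava",
  "prompt": "Describe this image.",
  "images": ["<base64-encoded image>"]
}
```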
## Remote Ollama Server
To connect to a remote Ollama instance:
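The original snippet was lost in extraction; a hedged sketch, reusing the assumed `base_url` field pointed at a remote host:

```json
{
  "provider": "ollama",
  "base_url": "http://192.168.1.50:11434",
  "model": "llama3.2"
}
```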
## Code Example

From `src/providers/ollama.zig`:
## Thinking-Only Responses

Some models produce "thinking" content without final output.

## No Authentication

Ollama runs locally and does not require authentication; all requests are sent without credentials.

## Performance Tips
- GPU Acceleration: Ollama automatically uses GPU if available (NVIDIA, AMD, Apple Metal)
- Model Size: Smaller models (3B, 7B) run faster on consumer hardware
- Context Window: Reduce `num_ctx` in Ollama for faster inference
- Quantization: Use quantized models (`q4_0`, `q5_K_M`) for better performance
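For example, `num_ctx` can be lowered per request via the `options` object of Ollama's API:

```json
{
  "model": "llama3.2",
  "prompt": "Hello",
  "options": { "num_ctx": 2048 }
}
```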