
The Ollama provider enables running local language models on your machine using Ollama.

Configuration

  • provider (string, required): Set to "ollama"
  • base_url (string, optional): Ollama server URL. Defaults to http://localhost:11434
  • model (string, required): Model name, e.g. llama3.2, mistral, qwen2.5-coder
  • temperature (number, optional): Sampling temperature (0.0-2.0). Defaults to 0.7.

Example Configuration

{
  "provider": "ollama",
  "model": "llama3.2",
  "base_url": "http://localhost:11434",
  "temperature": 0.7
}
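
Since base_url and temperature fall back to their defaults, a minimal configuration only needs the two required fields:
{
  "provider": "ollama",
  "model": "llama3.2"
}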

Installation

  1. Install Ollama:
    # macOS/Linux
    curl -fsSL https://ollama.ai/install.sh | sh
    
    # Or download from https://ollama.ai/download
    
  2. Pull a model:
    ollama pull llama3.2
    
  3. Verify it’s running:
    ollama list
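
    # Or query the HTTP API directly; /api/tags returns the locally installed models
    curl http://localhost:11434/api/tags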
    

Supported Models

  • Llama 3.2: llama3.2, llama3.2:3b, llama3.2:1b
  • Llama 3.1: llama3.1:70b, llama3.1:8b
  • Qwen: qwen2.5-coder, qwen2.5, qwen2.5:32b
  • Mistral: mistral, mistral-nemo, mistral-large
  • Vision: llava, llava:13b, bakllava
  • Code: codellama, deepseek-coder, starcoder2
Browse all models at ollama.ai/library

Capabilities

  Feature            Support
  Streaming          No (planned)
  Function Calling   No (uses structured output instead)
  Vision (images)    Yes (llava, bakllava models)
  System Messages    Yes
  Tool Calls         Partial (quirky format, auto-fixed)

Tool Call Handling

Ollama models sometimes produce tool calls in quirky formats. The provider auto-fixes these:

Pattern 1: Nested Wrapper

{
  "name": "tool_call",
  "arguments": {
    "name": "shell",
    "arguments": {"cmd": "ls"}
  }
}
The provider unwraps this to {"name": "shell", "arguments": {"cmd": "ls"}}.

Pattern 2: Prefixed Names

  • tool.shell → shell
  • tools.file_read → file_read

Pattern 3: Normal

Standard format is passed through unchanged.
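
As a rough illustration of the Pattern 2 fix, a prefix-stripping helper could look like the sketch below. This is a simplified example rather than the actual code from src/providers/ollama.zig, and the function name normalizeToolName is made up:
const std = @import("std");

// Hypothetical sketch: strip a "tool." or "tools." prefix from the
// reported tool name before dispatching (Pattern 2 above).
fn normalizeToolName(name: []const u8) []const u8 {
    const prefixes = [_][]const u8{ "tools.", "tool." };
    for (prefixes) |prefix| {
        if (std.mem.startsWith(u8, name, prefix)) return name[prefix.len..];
    }
    return name;
}

test "prefixed tool names are normalized" {
    try std.testing.expectEqualStrings("shell", normalizeToolName("tool.shell"));
    try std.testing.expectEqualStrings("file_read", normalizeToolName("tools.file_read"));
}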

Vision Support

Vision models (llava, bakllava) support image input:
{
  "provider": "ollama",
  "model": "llava",
  "base_url": "http://localhost:11434"
}
Images are sent as base64-encoded data in the images array:
{
  "model": "llava",
  "messages": [
    {
      "role": "user",
      "content": "What's in this image?",
      "images": ["iVBORw0KGgo..."]
    }
  ]
}
Note: Ollama only supports base64-encoded images, not URLs. HTTP URLs are automatically skipped.
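
For reference, the same request body can be sent to Ollama's /api/chat endpoint directly with curl. The base64 string here is a placeholder, and "stream": false asks Ollama to return a single JSON object instead of a streamed response:
curl http://localhost:11434/api/chat -d '{
  "model": "llava",
  "messages": [
    {
      "role": "user",
      "content": "What is in this image?",
      "images": ["<base64-encoded image data>"]
    }
  ],
  "stream": false
}'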

Remote Ollama Server

To connect to a remote Ollama instance:
{
  "provider": "ollama",
  "model": "llama3.2",
  "base_url": "http://192.168.1.100:11434"
}
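
Ollama binds to 127.0.0.1 by default, so the remote machine must expose the server on a reachable address, for example:
# on the remote machine: listen on all interfaces instead of loopback only
OLLAMA_HOST=0.0.0.0 ollama serve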

Code Example

From src/providers/ollama.zig:
pub const OllamaProvider = struct {
    base_url: []const u8,
    allocator: std.mem.Allocator,

    const DEFAULT_BASE_URL = "http://localhost:11434";

    pub fn init(allocator: std.mem.Allocator, base_url: ?[]const u8) OllamaProvider {
        const url = if (base_url) |u| trimTrailingSlash(u) else DEFAULT_BASE_URL;
        return .{
            .base_url = url,
            .allocator = allocator,
        };
    }

    pub fn chatUrl(self: OllamaProvider, allocator: std.mem.Allocator) ![]const u8 {
        return std.fmt.allocPrint(allocator, "{s}/api/chat", .{self.base_url});
    }
};
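
A hypothetical usage sketch based on the excerpt above (the allocator is assumed to come from the caller):
// Hypothetical usage; not taken from the repository.
const provider = OllamaProvider.init(allocator, "http://192.168.1.100:11434");
const url = try provider.chatUrl(allocator);
defer allocator.free(url);
// url is "http://192.168.1.100:11434/api/chat"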

Thinking-Only Responses

Some models produce “thinking” content without final output:
{
  "message": {
    "role": "assistant",
    "content": "",
    "thinking": "Let me reason about this carefully..."
  }
}
The provider returns a preview:
I was thinking about this: Let me reason about this carefully... but I didn't complete my response. Could you try asking again?

No Authentication

Ollama runs locally and does not require authentication. All requests are sent without credentials.

Performance Tips

  1. GPU Acceleration: Ollama automatically uses GPU if available (NVIDIA, AMD, Apple Metal)
  2. Model Size: Smaller models (3B, 7B) run faster on consumer hardware
  3. Context Window: Reduce num_ctx in Ollama for faster inference (see the Modelfile sketch below)
  4. Quantization: Use quantized model tags (q4_0, q5_K_M) to reduce memory use and speed up inference at a small cost in output quality
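
Tips 3 and 4 are applied on the Ollama side rather than in the provider configuration. As an illustrative sketch, a custom Modelfile can lower the context window; the model name llama3.2-small and the num_ctx value are arbitrary examples:
# Modelfile
FROM llama3.2
PARAMETER num_ctx 2048
Build the variant and reference it in the model field of the configuration:
ollama create llama3.2-small -f Modelfile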