Providers

Overview

Providers are NullClaw’s abstraction layer for AI model APIs. Every provider implements the Provider vtable interface, enabling runtime-swappable model backends with zero code changes.

Provider Interface

pub const Provider = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    pub const VTable = struct {
        // Simple one-shot: system prompt + user message → text
        chatWithSystem: *const fn(
            ptr: *anyopaque,
            allocator: std.mem.Allocator,
            system_prompt: ?[]const u8,
            message: []const u8,
            model: []const u8,
            temperature: f64,
        ) anyerror![]const u8,

        // Structured chat: full conversation → ChatResponse
        chat: *const fn(
            ptr: *anyopaque,
            allocator: std.mem.Allocator,
            request: ChatRequest,
            model: []const u8,
            temperature: f64,
        ) anyerror!ChatResponse,

        // Capabilities
        supportsNativeTools: *const fn(ptr: *anyopaque) bool,
        getName: *const fn(ptr: *anyopaque) []const u8,
        deinit: *const fn(ptr: *anyopaque) void,

        // Optional: streaming, vision, native tools
        supports_streaming: ?*const fn(ptr: *anyopaque) bool = null,
        supports_vision: ?*const fn(ptr: *anyopaque) bool = null,
        stream_chat: ?*const fn(...) anyerror!StreamChatResult = null,
        chat_with_tools: ?*const fn(...) anyerror!ChatResponse = null,
    };
};

Supported Providers

Core Providers (9)

These providers have dedicated implementations in src/providers/:

Provider	Models	Features
Anthropic	Claude Opus/Sonnet/Haiku 3/3.5/4	Native tools, streaming, vision, extended thinking
OpenAI	GPT-4o, GPT-4, GPT-3.5, o1/o3	Native tools, streaming, vision, reasoning effort
OpenRouter	200+ models (proxy)	Native tools, streaming, vision (model-dependent)
Ollama	Local models (llama3, mistral, etc.)	Native tools, streaming, local inference
Gemini	Gemini Pro/Flash/Ultra	Native tools, streaming, vision, thinking
Venice	Privacy-focused models	No logs, Swiss jurisdiction
xAI	Grok models	Real-time data, Twitter integration
Groq	Fast inference (llama3, mixtral)	Ultra-low latency
Mistral	Mistral Large/Medium/Small	EU-based, open weights

Compatible Providers (41+)

These providers work via the OpenAI-compatible adapter (src/providers/compatible.zig):

All Compatible Providers

Together AI — Fast inference for open models
Fireworks — Optimized model deployment
Perplexity — Search-augmented generation
Cohere — Enterprise-grade NLP
DeepSeek — Code-specialized models
Replicate — Community model hosting
Hugging Face — Inference API
Cloudflare Workers AI — Edge inference
AI21 Labs — Jurassic models
Anyscale — Ray-based inference
OctoAI — Optimized model serving
Novita AI — Affordable inference
Deep Infra — GPU cloud
Lepton AI — Serverless inference
Cerebras — Ultra-fast inference chips
Monster API — Low-cost hosting
Nebius — EU cloud AI
Scaleway — EU inference
Hyperbolic — Fast open models
Lambda — GPU cloud
Modal — Serverless containers
Runpod — GPU marketplace
Avian — Privacy-first AI
Glhf — Community models
Kindo — Enterprise AI
SambaNova — Fast LLM chips
Featherless — Lightweight inference
GitHub Models — Azure-backed playground
LM Studio — Local model server
Llamafile — Single-file executables
Jan — Desktop AI runtime
Text Generation Web UI — Local web interface
vLLM — High-throughput inference
TGI (Text Generation Inference) — Hugging Face server
LocalAI — OpenAI-compatible local server
Tabby — Code completion server
Continue — IDE assistant backend
Amazon Bedrock — AWS managed models
Azure OpenAI — Microsoft-hosted OpenAI
Google Vertex AI — GCP model deployment
Custom endpoint — Any OpenAI-compatible API

Configuration

Basic Setup

Providers are configured in ~/.nullclaw/config.json:

{
  "models": {
    "providers": {
      "openrouter": { "api_key": "sk-or-..." },
      "anthropic": { "api_key": "sk-ant-..." },
      "ollama": { "base_url": "http://localhost:11434" }
    }
  },
  "agents": {
    "defaults": {
      "model": { "primary": "openrouter/anthropic/claude-sonnet-4" }
    }
  }
}

Custom Endpoints

Use custom: prefix for any OpenAI-compatible API:

{
  "models": {
    "providers": {
      "custom:my-endpoint": {
        "api_key": "token",
        "base_url": "https://my-inference.example.com/v1"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": { "primary": "custom:my-endpoint/my-model" }
    }
  }
}

Model Selection

Model string format: <provider>/<model>

"model": {
  "primary": "openrouter/anthropic/claude-opus-4",
  "fallback": "openai/gpt-4o"
}

Provider Selection Flow

Capabilities

Native Tool Calling

Providers that return supportsNativeTools() == true:

Anthropic (function calling API)
OpenAI (function calling API)
Gemini (function calling API)
Ollama (via tool schema)
OpenRouter (proxies native support)

Other providers fall back to prompt-based tool simulation.

Streaming

Providers implement stream_chat vtable method:

pub fn streamChat(
    self: Provider,
    allocator: std.mem.Allocator,
    request: ChatRequest,
    model: []const u8,
    temperature: f64,
    callback: StreamCallback,
    callback_ctx: *anyopaque,
) !StreamChatResult

Supported by: Anthropic, OpenAI, Ollama, Gemini, most compatible providers.

Vision (Multimodal Input)

Providers check supportsVision() or supportsVisionForModel(model):

const parts = [_]ContentPart{
    makeTextPart("What's in this image?"),
    makeImageUrlPart("https://example.com/cat.jpg"),
};
const msg = ChatMessage{
    .role = .user,
    .content = "",
    .content_parts = &parts,
};

Supported by: GPT-4o, Claude Opus/Sonnet 3.5/4, Gemini Pro Vision, OpenRouter (model-dependent).

Error Handling

Automatic Retries

The ReliableProvider wrapper (src/providers/reliable.zig) adds:

Exponential backoff (50ms → 200ms → 800ms)
Rate limit detection (429 status)
Transient error classification
Circuit breaker integration

Fallback Chain

Configure fallback models in agent config:

"model": {
  "primary": "openrouter/anthropic/claude-opus-4",
  "fallback": "openai/gpt-4o",
  "fallback_on_error": true
}

Fallback triggers only on provider errors (rate limits, outages). It does NOT trigger on LLM refusals or content policy violations.

Performance

Provider Warmup

Optional vtable method warmup() pre-establishes connections:

provider.warmup();

Currently a no-op for all providers (HTTP clients are stateless).

Response Caching

SQLite-backed cache deduplicates identical requests:

{
  "memory": {
    "response_cache": {
      "enabled": true,
      "ttl_minutes": 60,
      "max_entries": 10000
    }
  }
}

Cache key: sha256(system_prompt + messages + model + temperature)

Implementation Guide

Minimal Provider

const std = @import("std");
const Provider = @import("root.zig").Provider;

pub const MyProvider = struct {
    api_key: []const u8,
    base_url: []const u8,

    pub fn chatWithSystem(
        ptr: *anyopaque,
        allocator: std.mem.Allocator,
        system_prompt: ?[]const u8,
        message: []const u8,
        model: []const u8,
        temperature: f64,
    ) anyerror![]const u8 {
        const self: *MyProvider = @ptrCast(@alignCast(ptr));
        // Call API, parse response...
        return try allocator.dupe(u8, "response text");
    }

    pub fn chat(
        ptr: *anyopaque,
        allocator: std.mem.Allocator,
        request: ChatRequest,
        model: []const u8,
        temperature: f64,
    ) anyerror!ChatResponse {
        // Full implementation...
        return .{ .content = "response" };
    }

    pub fn supportsNativeTools(_: *anyopaque) bool {
        return false;
    }

    pub fn getName(_: *anyopaque) []const u8 {
        return "my_provider";
    }

    pub fn deinit(_: *anyopaque) void {}

    pub fn provider(self: *MyProvider) Provider {
        return .{ .ptr = @ptrCast(self), .vtable = &vtable };
    }

    pub const vtable = Provider.VTable{
        .chatWithSystem = chatWithSystem,
        .chat = chat,
        .supportsNativeTools = supportsNativeTools,
        .getName = getName,
        .deinit = deinit,
    };
};

Get Started

Core Concepts

Configuration

Deployment

Guides

Overview

Provider Interface

Supported Providers

Core Providers (9)

Compatible Providers (41+)

Configuration

Basic Setup

Custom Endpoints

Model Selection

Provider Selection Flow

Capabilities

Native Tool Calling

Streaming

Vision (Multimodal Input)

Error Handling

Automatic Retries

Fallback Chain

Performance

Provider Warmup

Response Caching

Implementation Guide

Minimal Provider

Next Steps

Configuration

Tools

Get Started

Core Concepts

Configuration

Deployment

Guides

Documentation Index

​Overview

​Provider Interface

​Supported Providers

​Core Providers (9)

​Compatible Providers (41+)

​Configuration

​Basic Setup

​Custom Endpoints

​Model Selection

​Provider Selection Flow

​Capabilities

​Native Tool Calling

​Streaming

​Vision (Multimodal Input)

​Error Handling

​Automatic Retries

​Fallback Chain

​Performance

​Provider Warmup

​Response Caching

​Implementation Guide

​Minimal Provider

​Next Steps

Configuration

Tools

Overview

Provider Interface

Supported Providers

Core Providers (9)

Compatible Providers (41+)

Configuration

Basic Setup

Custom Endpoints

Model Selection

Provider Selection Flow

Capabilities

Native Tool Calling

Streaming

Vision (Multimodal Input)

Error Handling

Automatic Retries

Fallback Chain

Performance

Provider Warmup

Response Caching

Implementation Guide

Minimal Provider

Next Steps