Documentation

Set up spaturzu and ship cost-attributed agents.

Swap one import — your OpenAI, Anthropic, Bedrock, Gemini, or Mistral client becomes a drop-in that attributes every call to the agent and run that made it. Construction and call sites stay the same — no proxy, no prompt changes.

Agent setup

Wire the SDK into your scripts.

spaturzu sits at the call site, not the invoice. It records token counts and cost as each request happens — your keys still call the provider directly, and prompts never touch us.

Install the SDK

The Python SDK is published on PyPI as spaturzu, and the Node SDK on npm as @spaturzu/sdk. Install the one for your runtime alongside the provider client you already use.

# published on npm
pnpm add @spaturzu/sdk openai

# published on PyPI
pip install spaturzu openai

Set your environment variables

The SDK reads these on construction. Set them in your shell, your process manager, or a .env file.

SPATURZU_API_KEY

Your project key. Create or copy it from the Onboarding page in your dashboard — the SDK authenticates every call to the gateway with it.

SPATURZU_BASE_URL

Your spaturzu gateway URL. The SDK posts token and cost metadata here; it falls back to https://spaturzu-api.superchiu.org if unset.

OPENAI_API_KEY

Your provider key (or ANTHROPIC_API_KEY). The drop-in client calls OpenAI or Anthropic directly with it — exactly as the real client does — and it never reaches the spaturzu gateway.
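
To make the fallback behaviour above concrete, here is a minimal preflight sketch that resolves the three variables before any client is built. It is not the SDK's own loader: resolve_settings is a hypothetical helper, and treating an empty SPATURZU_BASE_URL as unset is an assumption. Only the variable names and the default URL come from the text above.

```python
import os
from typing import Mapping

# Default gateway URL named in the docs above.
DEFAULT_BASE_URL = "https://spaturzu-api.superchiu.org"


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical preflight helper; not part of the spaturzu SDK."""
    api_key = env.get("SPATURZU_API_KEY")
    if not api_key:
        raise RuntimeError("SPATURZU_API_KEY is not set")

    # Either provider key is acceptable; it is only used for direct provider calls.
    provider_key = env.get("OPENAI_API_KEY") or env.get("ANTHROPIC_API_KEY")
    if not provider_key:
        raise RuntimeError("Set OPENAI_API_KEY or ANTHROPIC_API_KEY")

    return {
        "api_key": api_key,
        # Assumption: an empty value is treated the same as an unset one.
        "base_url": env.get("SPATURZU_BASE_URL") or DEFAULT_BASE_URL,
    }
```

Calling resolve_settings() at startup surfaces a missing key immediately instead of at the first request.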

Swap one import and tag your calls

Change the import specifier (openai → @spaturzu/sdk/openai in Node, spaturzu.openai in Python) — construction and call sites are unchanged. Then tag any call with the agent that made it: .withAgent("name") — no wrapper, no closure. To group several calls into one unit of work, wrap them in run(...) (see Attribution).

// Swap "openai" for "@spaturzu/sdk/openai" — construction + calls unchanged.
import OpenAI from "@spaturzu/sdk/openai";
import { flush } from "@spaturzu/sdk";

// SPATURZU_API_KEY + SPATURZU_BASE_URL are read from env; OpenAI() reads
// OPENAI_API_KEY, exactly like the real client.
const openai = new OpenAI();

async function main() {
  // Tag the call with the agent that made it - no wrapper, no closure.
  const res = await openai.withAgent("support-triage").chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Summarise this ticket" }],
  });
  console.log(res.choices[0]?.message?.content);

  // CLIs, Lambdas, CI: flush queued metadata before the process exits.
  await flush();
}

main();

# Swap "openai" for "spaturzu.openai" — construction + calls unchanged.
from spaturzu.openai import OpenAI
from spaturzu import flush

# SPATURZU_API_KEY + SPATURZU_BASE_URL are read from env; OpenAI() reads
# OPENAI_API_KEY, exactly like the real client.
openai = OpenAI()

# Tag the call with the agent that made it - no wrapper, no with-block.
res = openai.with_agent("support-triage").chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise this ticket"}],
)
print(res.choices[0].message.content)

# CLIs, Lambdas, CI: flush queued metadata before the process exits.
flush()

Need a custom-configured or injected client (Azure, a proxy baseURL, a client you built elsewhere)? The explicit new Spaturzu() + sp.wrapOpenAI(client) path still works — see Providers → Bring your own client.

Run it

npx tsx agent.ts
# Node 22.6+: node --experimental-strip-types agent.ts

python agent.py

Within a couple of seconds of the first call, the agent, its run, and the cost appear in your dashboard. Short-lived processes (CLIs, Lambdas, CI jobs) must call flush() before exit — otherwise queued metadata is dropped when the process ends.

Providers

Five providers, one-line swap.

spaturzu ships a drop-in module per provider — swap the import and your client is auto-instrumented from env, construction and call sites unchanged. Each instruments only the inference methods listed below; everything else on the client passes through untouched.

Same in Python — from spaturzu.openai import OpenAI, spaturzu.anthropic, spaturzu.google (Client), spaturzu.mistral, spaturzu.bedrock (BedrockRuntime). See the Python SDK reference for full Python signatures.

Provider	Drop-in import	Methods intercepted	Install
OpenAI	@spaturzu/sdk/openai	chat.completions.create	pnpm add openai
Anthropic	@spaturzu/sdk/anthropic	messages.create	pnpm add @anthropic-ai/sdk
Bedrock	@spaturzu/sdk/bedrock	converse, converseStream	pnpm add @aws-sdk/client-bedrock-runtime

Attribution

Three layers of context.

spaturzu attributes every call to three layers of context: process-wide tags from configure(), frame-scoped tags from run(...), and per-call agents from .withAgent(...). For nested frames, an agentPath array also tracks the parent → child chain. The hierarchy is built per process; nothing leaves your servers besides token counts, cost, and the tags you set.

Same API in Python — run() is a context manager importable from spaturzu: with run("nightly-report"): for sync code, async with run("nightly-report"): for async. Frame-scoped tags pass as a tags= kwarg: async with run("synthesize", tags={"phase": "draft"}):.

Tag a single call — `.withAgent()`

The simplest tag is per-call: .withAgent(name) attributes a single call to the agent that made it — no run() block, no wrapper. It nests under any enclosing run() as a sub-agent, and is concurrency-safe — each call gets its own frame. In Python it is .with_agent(name).

// Standalone - its own run, agentPath = ["writer"]
await openai.withAgent("writer").chat.completions.create({ /* ... */ });

// Inside a run() - nests as a sub-agent, agentPath = ["planner", "writer"]
await run("planner", async () => {
  await openai.withAgent("writer").chat.completions.create({ /* ... */ });
});

# Standalone - its own run, agent_path = ["writer"]
openai.with_agent("writer").chat.completions.create(...)

# Inside a run() - nests as a sub-agent, agent_path = ["planner", "writer"]
with run("planner"):
    openai.with_agent("writer").chat.completions.create(...)
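
One way to picture how the two paths in the examples above come about is a context variable that holds the current chain of frame names. The sketch below is an illustration only, not the SDK's implementation: frame and path_for_call are hypothetical stand-ins for run() and .with_agent().

```python
import contextvars
from contextlib import contextmanager

# Agent names from the outermost frame to the innermost; empty outside any frame.
_agent_path = contextvars.ContextVar("agent_path", default=())


@contextmanager
def frame(name: str):
    """Hypothetical stand-in for run(): push a name for the duration of the block."""
    token = _agent_path.set(_agent_path.get() + (name,))
    try:
        yield
    finally:
        # Restore the enclosing path even if the block raised.
        _agent_path.reset(token)


def path_for_call(agent: str) -> list:
    """Path that a call tagged with `agent` would record; the frame is not mutated."""
    return list(_agent_path.get() + (agent,))


print(path_for_call("writer"))      # standalone call
with frame("planner"):
    print(path_for_call("writer"))  # nested under the planner frame
```

Because contextvars gives each asyncio task its own view of the variable, concurrent calls in a design like this do not see each other's frames.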

Process-wide tags

Set once at startup with configure(), before constructing any drop-in client, and merged into every logged call. Use them for dimensions that don't change inside the process: env, deployment, region, version, pod name. Values are coerced to strings at the SDK boundary, so every tag value on the wire is a string. Per-run tags override these on key conflict. (On the explicit path, pass them to the constructor instead.)
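
The merge rule described here (string coercion, per-run tags winning on key conflict) can be sketched in a few lines. merge_tags is a hypothetical helper, and using Python's str() for the coercion is an assumption; the SDK may format non-string values differently.

```python
from typing import Any, Mapping, Optional


def merge_tags(process_tags: Mapping[str, Any],
               run_tags: Optional[Mapping[str, Any]] = None) -> dict:
    """Hypothetical helper: merge process-wide and per-run tags into string values."""
    # Start from the process-wide tags, coercing every value to a string.
    merged = {key: str(value) for key, value in process_tags.items()}
    # Per-run tags are applied second, so they override on key conflict.
    for key, value in (run_tags or {}).items():
        merged[key] = str(value)
    return merged


print(merge_tags({"env": "prod", "replicas": 3}, {"env": "staging"}))
```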

Cost control

Hard-cap the bill before the call.

The same client that meters cost can also enforce it. Pass { budget: { hardCap: true } } as the drop-in client's second argument and it throws BudgetExceededError before the call hits the provider when any applicable hard-cap budget is breached — no token cost incurred for refused calls. Enforcement is in-process; no proxy in the request path.

Same API in Python: pass a spaturzu= kwarg to the drop-in, as in OpenAI(spaturzu={"budget": {"hard_cap": True, "on_breach": "throw"}}). The error class is spaturzu.BudgetExceededError; its fields match the Node error in snake_case (limit_cost, current_cost, agent_name).

Enable a hard cap

Budgets are configured in the dashboard's Budgets page, scoped to a project or to a specific agent name. The SDK keeps an in-process policy cache that refreshes via SSE within seconds of a budget being hit; if the SSE connection drops, a polling backstop keeps the cache fresh.

import OpenAI from "@spaturzu/sdk/openai";

// Second arg carries spaturzu options; the first is the usual client config.
const openai = new OpenAI({}, {
  budget: { hardCap: true, onBreach: "throw" }, // 'warn' to log + proceed
});

from spaturzu.openai import OpenAI

# spaturzu= carries budget/fallback; other kwargs go to the real client.
openai = OpenAI(spaturzu={"budget": {"hard_cap": True, "on_breach": "throw"}})
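
Conceptually, in-process enforcement comes down to checking cached budget state before the provider call is made. The sketch below illustrates that idea under stated assumptions: CachedBudget, BudgetBreached, and check_before_call are hypothetical names, and treating a budget as breached once spend reaches the limit is a guess about the comparison, not documented behaviour.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable, Optional


@dataclass(frozen=True)
class CachedBudget:
    """Hypothetical shape of one cached budget; field names are illustrative."""
    scope: str                 # "project" or "agent"
    agent_name: Optional[str]  # set only for agent-scoped budgets
    limit_cost: str            # decimal string, e.g. "10.00"
    current_cost: str          # decimal string, e.g. "10.42"
    hard_cap: bool


class BudgetBreached(Exception):
    """Local stand-in for the SDK's BudgetExceededError."""


def check_before_call(budgets: Iterable[CachedBudget], agent: Optional[str]) -> None:
    """Raise before the provider call if any applicable hard cap is breached."""
    for b in budgets:
        applies = b.scope == "project" or (b.scope == "agent" and b.agent_name == agent)
        # Assumption: breached means current spend has reached or passed the limit.
        if applies and b.hard_cap and Decimal(b.current_cost) >= Decimal(b.limit_cost):
            raise BudgetBreached(
                f"{b.scope} budget {b.limit_cost} reached (current {b.current_cost})"
            )
```

Comparing with Decimal rather than float keeps string amounts such as "10.00" exact.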

Catch the typed error

BudgetExceededError exposes the scope, period, current cost, limit cost, and agent name. Re-throw anything else so the upstream call's own errors continue to surface normally.

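// run is assumed to be exported from the same package as BudgetExceededError.
import { BudgetExceededError, run } from "@spaturzu/sdk";

// `openai` is the drop-in client constructed earlier. The function wrapper
// makes the early `return` in the catch block legal.
async function nightlyReport() {
  try {
    await run("nightly-report", async () => {
      await openai.chat.completions.create({
        model: "gpt-4o",
        messages: [{ role: "user", content: "..." }],
      });
    });
  } catch (err) {
    if (err instanceof BudgetExceededError) {
      // err.scope        - "project" | "agent"
      // err.period       - "daily" | "monthly"
      // err.limitCost    - string, e.g. "10.00" (USD)
      // err.currentCost  - string, e.g. "10.42"
      // err.agentName    - string | null
      console.error(`Budget cap hit: ${err.message}`);
      return; // handled: skip this run
    }
    throw err; // anything else surfaces normally
  }
}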

from spaturzu import BudgetExceededError, run

try:
    with run("nightly-report"):
        openai.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "..."}],
        )
except BudgetExceededError as err:
    # err.scope        — "project" | "agent"
    # err.period       — "daily" | "monthly"
    # err.limit_cost   — str, e.g. "10.00" (USD)
    # err.current_cost — str, e.g. "10.42"
    # err.agent_name   — str | None
    print(f"Budget cap hit: {err}")
    # Other exceptions propagate normally — only this type is caught.

Reliability

Production knobs.

Three knobs to handle the realities of production LLM traffic: cross-provider fallback when a provider 429s or 5xxs, streaming usage capture across heterogeneous provider response formats, and constructor options that shape how the metering plane itself behaves under load.

Same API in Python — spaturzu.flush(timeout_s=None) and spaturzu.shutdown() are sync. Fallback config is a list of dicts. Constructor option names use snake_case but with two divergences from Node: timeout is timeout_s (seconds, float — not timeout_ms), and on_error is (exc, entry) → None (Node passes only the error).

Cross-provider fallback

On a retryable upstream error (429, 5xx, connection-class), the wrap walks the chain. Request and response shapes are translated automatically; your code keeps receiving OpenAI-shaped responses even when an Anthropic or Bedrock fallback served the call. All 20 directional pairs (5 providers × 4 other-providers) are supported.

import OpenAI from "@spaturzu/sdk/openai";
import Anthropic from "@anthropic-ai/sdk";
import { BedrockRuntime } from "@aws-sdk/client-bedrock-runtime";

// Fallback targets are raw provider clients, passed as the second arg.
const openai = new OpenAI({}, {
  fallback: [
    {
      provider: "anthropic",
      client: new Anthropic(),
      model: "claude-3-5-haiku-20241022",
    },
    {
      provider: "bedrock",
      client: new BedrockRuntime({ region: "us-east-1" }),
      model: "anthropic.claude-3-5-haiku-20241022-v1:0",
    },
  ],
});

from spaturzu.openai import OpenAI
from anthropic import Anthropic
import boto3

# spaturzu= carries the fallback chain; targets are raw provider clients.
openai = OpenAI(spaturzu={
    "fallback": [
        {
            "provider": "anthropic",
            "client": Anthropic(),
            "model": "claude-3-5-haiku-20241022",
        },
        {
            "provider": "bedrock",
            "client": boto3.client("bedrock-runtime", region_name="us-east-1"),
            "model": "anthropic.claude-3-5-haiku-20241022-v1:0",
        },
    ],
})

v1 fallback limits: non-streaming only, text content only, no tools, no response_format. Calls that use any of those features quietly skip fallback and surface the original error.
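
The control flow described in this section can be sketched without any provider SDK. Everything below is illustrative: UpstreamError, call_with_fallback, and the helper predicates are hypothetical names, shape translation is left out, and two behaviours are assumptions rather than documented facts: the original error is re-raised when every fallback fails, and a non-retryable error from a fallback stops the walk.

```python
from typing import Any, Callable, Dict, List, Optional


class UpstreamError(Exception):
    """Illustrative error carrying an HTTP status; None means a connection failure."""

    def __init__(self, status: Optional[int] = None):
        super().__init__(f"upstream error (status={status})")
        self.status = status


def is_retryable(err: UpstreamError) -> bool:
    # 429, any 5xx, or a connection-class failure (no status at all).
    return err.status is None or err.status == 429 or 500 <= err.status <= 599


def fallback_eligible(request: Dict[str, Any]) -> bool:
    # v1 limits from the text: no streaming, no tools, no response_format.
    # The text-content-only check is omitted here for brevity.
    return not (request.get("stream") or request.get("tools")
                or request.get("response_format"))


def call_with_fallback(request: Dict[str, Any],
                       primary: Callable[[Dict[str, Any]], Any],
                       fallbacks: List[Callable[[Dict[str, Any]], Any]]) -> Any:
    try:
        return primary(request)
    except UpstreamError as original:
        if not is_retryable(original) or not fallback_eligible(request):
            raise  # surface the original error untouched
        for target in fallbacks:
            try:
                return target(request)
            except UpstreamError as err:
                if not is_retryable(err):
                    raise  # assumption: a hard failure ends the walk
        raise original  # assumption: every target failed, report the first error
```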

Metering-plane options

Option	Default	What it controls
timeoutMs	10000	Per-attempt POST timeout to the gateway.
backoffMs	[1000, 2000, 4000, 8000, 16000]	Retry backoff schedule. Length = retries after the initial send. Set [] to disable retries.
maxConcurrent	50	Max in-flight log POSTs at once. Excess calls queue FIFO.
onError	silent (no-op)	Called when a log POST fails after all retries. The default swallows the failure so your app never crashes because the metering plane hiccupped.
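
To show how the backoff schedule and the error callback in the table interact, here is a minimal sketch of a retrying sender. It is not the SDK's code: post_with_retries is a hypothetical helper, the schedule is taken from the table's default, and the callback receives only the error for brevity (the Python SDK's on_error is described above as also receiving the entry).

```python
import time
from typing import Callable, Optional, Sequence

# Default schedule from the table above, in milliseconds.
DEFAULT_BACKOFF_MS = (1000, 2000, 4000, 8000, 16000)


def post_with_retries(send: Callable[[], None],
                      backoff_ms: Sequence[int] = DEFAULT_BACKOFF_MS,
                      on_error: Optional[Callable[[Exception], None]] = None,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try `send` once, then once more per backoff entry; never raise to the caller."""
    last_error: Optional[Exception] = None
    for attempt in range(len(backoff_ms) + 1):
        if attempt > 0:
            # Wait for the scheduled delay before each retry.
            sleep(backoff_ms[attempt - 1] / 1000)
        try:
            send()
            return True
        except Exception as err:
            last_error = err
    # All attempts failed: report once, then swallow the failure.
    if on_error is not None and last_error is not None:
        on_error(last_error)
    return False
```

Passing backoff_ms=[] makes the loop run exactly once, which matches the table's note that an empty schedule disables retries. The injectable sleep keeps the helper easy to exercise without real delays.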
