How to track LLM API costs per agent in Node.js

The spaturzu team · 3 June 2026 · 9 min read

If you run more than one AI agent on the same OpenAI or Anthropic key, your provider bill is a single number with no breakdown. This guide shows how to attribute that spend to the agent and run that caused each call in Node.js — with a drop-in SDK, no proxy, and a hard budget cap you can set before the call ever hits the provider.

What you'll build

By the end you'll have a Node script whose every LLM call is tagged with the agent that made it, grouped into runs, visible as per-agent cost in a dashboard, and protected by a budget that stops runaway spend before it happens. For the concepts behind this, see what LLM cost attribution is.

Before you start

—Node 18+ and an OpenAI (or Anthropic) API key.
—A spaturzu project key — sign in and copy it from the Onboarding page. Accounts are free and created on first sign-in.

Step 1 — Install the SDK

Add the spaturzu SDK alongside the provider client you already use.

install

pnpm add @spaturzu/sdk openai
# see the docs for the exact artifact URL and current version

Step 2 — Set your environment variables

The SDK reads these on construction. Your provider key never reaches the spaturzu gateway: the instrumented client calls OpenAI directly with it, exactly as the stock openai client does.

.env

SPATURZU_API_KEY=sk_...        # your project key (from Onboarding)
OPENAI_API_KEY=sk-...          # your provider key — calls go direct to OpenAI
# SPATURZU_BASE_URL is optional; it defaults to the hosted gateway
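
Because the SDK reads these variables on construction, it can help to fail fast when one is missing. The snippet below is a minimal sketch of such a check. The missingEnv helper is hypothetical and not part of the spaturzu SDK; only the two variable names come from the .env file above.

```typescript
// Hypothetical helper, not part of the spaturzu SDK.
// Returns the names of required variables that are unset or blank.
function missingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[],
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Variable names taken from the .env example above.
const REQUIRED_VARS = ["SPATURZU_API_KEY", "OPENAI_API_KEY"] as const;
```

At startup you might call missingEnv(process.env, REQUIRED_VARS) and exit with a clear message if the returned list is not empty, before constructing the client.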

Step 3 — Swap one import and tag your calls

Change the import specifier from openai to @spaturzu/sdk/openai. Construction and call sites stay the same. Then tag any call with .withAgent("name"), where the name identifies the agent that made the call.

agent.ts

import OpenAI from "@spaturzu/sdk/openai";
import { flush } from "@spaturzu/sdk";

// Swap "openai" for "@spaturzu/sdk/openai" — construction + calls unchanged.
const openai = new OpenAI();

// Tag the call with the agent that made it. No wrapper, no closure.
const res = await openai.withAgent("support-triage").chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Summarise this ticket" }],
});

// Short-lived processes (CLIs, Lambdas, CI) must flush before exit.
await flush();

Run it (npx tsx agent.ts) and, within a couple of seconds, the agent, its run, and the cost appear in your dashboard.

Step 4 — Group related calls into a run

A single task usually spans several calls. Wrap them in run(...) so they share one runId and each agent nests under the run as an agentPath. Now you can drill from a costly run into the exact calls that made it up.

nightly.ts

import OpenAI from "@spaturzu/sdk/openai";
import { run } from "@spaturzu/sdk";

// Same instrumented client as in agent.ts.
const openai = new OpenAI();

// Group several calls into one unit of work. They share a runId, and each
// agent nests under the run as an agentPath (e.g. ["nightly-report", "fetch"]).
await run("nightly-report", async () => {
  await openai.withAgent("fetch").chat.completions.create({ /* ... */ });
  await openai.withAgent("summarise").chat.completions.create({ /* ... */ });
});
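
To make the shared runId and nested agentPath concrete, here is one way such grouping could be built in Node, using AsyncLocalStorage from node:async_hooks. This is an illustrative sketch only: runSketch, tagFor, and RunContext are hypothetical names, and nothing here claims to be how the spaturzu SDK implements run(...).

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Hypothetical shape for illustration; not the SDK's internal data model.
interface RunContext {
  runId: string;
  agentPath: string[];
}

const storage = new AsyncLocalStorage<RunContext>();

// Start a run: code called inside fn, including code that resumes after an
// await, sees the same context through storage.getStore().
function runSketch<T>(name: string, fn: () => T): T {
  return storage.run({ runId: randomUUID(), agentPath: [name] }, fn);
}

// Build the tag a call would carry: the run's id plus the run's path
// extended with the agent's own name.
function tagFor(agent: string): RunContext | undefined {
  const ctx = storage.getStore();
  if (ctx === undefined) return undefined; // called outside any run
  return { runId: ctx.runId, agentPath: [...ctx.agentPath, agent] };
}
```

Inside runSketch("nightly-report", ...), tagFor("fetch") and tagFor("summarise") return the same runId with paths ["nightly-report", "fetch"] and ["nightly-report", "summarise"], which mirrors the nesting described above.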

Step 5 — Read per-agent cost in the dashboard

With calls tagged, the dashboard rolls cost up per agent over 7- and 30-day windows, and per run for individual units of work. The agent that quietly became your most expensive one is now the one at the top of the list — no spreadsheet of guesses required.
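
The roll-up itself is simple arithmetic. As a rough sketch of what a per-agent view computes, the function below sums cost per agent inside a time window and sorts the most expensive agent first. The CallRecord shape and the costByAgent name are assumptions made for illustration; the dashboard's actual data model is not described in this post.

```typescript
// Hypothetical record shape, assumed for this example only.
interface CallRecord {
  agent: string;
  costUsd: number;
  at: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Sum cost per agent for calls made within the last `days` days,
// returned as [agent, total] pairs sorted by total, highest first.
function costByAgent(
  calls: CallRecord[],
  days: number,
  now: Date = new Date(),
): [string, number][] {
  const cutoff = now.getTime() - days * MS_PER_DAY;
  const totals = new Map<string, number>();
  for (const call of calls) {
    if (call.at.getTime() < cutoff) continue; // outside the window
    totals.set(call.agent, (totals.get(call.agent) ?? 0) + call.costUsd);
  }
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}
```

Calling it with days set to 7 or 30 gives the two windows mentioned above.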

Step 6 — Cap the budget before the call

Attribution tells you what happened; a budget stops the next runaway from happening. Pass { budget: { hardCap: true } } as the client's second argument and it throws a typed BudgetExceededError before the call reaches the provider when a cap is breached, so refused calls cost nothing. Enforcement is in-process; there's no proxy in the request path.

capped.ts

import OpenAI from "@spaturzu/sdk/openai";
import { BudgetExceededError } from "@spaturzu/sdk";

// hardCap throws BEFORE the provider call when a cap is breached — no token
// cost for refused calls. Budgets themselves are set in the dashboard.
const openai = new OpenAI({}, { budget: { hardCap: true, onBreach: "throw" } });

try {
  await openai.withAgent("nightly-report").chat.completions.create({ /* ... */ });
} catch (err) {
  // Re-throw anything that is not a budget breach. (A bare `return` is not
  // allowed at the top level of a module, so branch instead.)
  if (!(err instanceof BudgetExceededError)) throw err;
  // err.scope / err.period / err.limitCost / err.currentCost / err.agentName
  console.error("Budget cap hit: " + err.message);
}
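
To see why a refused call costs nothing, it may help to look at the general pattern of an in-process check that runs before the request is sent. The sketch below is a simplified illustration under assumed names: BudgetGuard and LocalBudgetError are hypothetical, and this is not the SDK's BudgetExceededError or its enforcement logic.

```typescript
// Hypothetical error type for this sketch; not the SDK's BudgetExceededError.
class LocalBudgetError extends Error {
  agentName: string;
  limitCost: number;
  currentCost: number;

  constructor(agentName: string, limitCost: number, currentCost: number) {
    super(`${agentName} has spent ${currentCost} against a cap of ${limitCost}`);
    this.name = "LocalBudgetError";
    this.agentName = agentName;
    this.limitCost = limitCost;
    this.currentCost = currentCost;
  }
}

// Hypothetical in-process guard: tracks spend per agent and refuses new
// calls once an agent has reached its cap.
class BudgetGuard {
  private limits: Record<string, number>;
  private spent = new Map<string, number>();

  constructor(limits: Record<string, number>) {
    this.limits = limits;
  }

  // Call BEFORE the provider request. Throws if the agent is at or over cap,
  // so no request is sent and no tokens are billed.
  check(agent: string): void {
    const limit = this.limits[agent];
    const current = this.spent.get(agent) ?? 0;
    if (limit !== undefined && current >= limit) {
      throw new LocalBudgetError(agent, limit, current);
    }
  }

  // Call AFTER a completed request with that call's cost.
  record(agent: string, cost: number): void {
    this.spent.set(agent, (this.spent.get(agent) ?? 0) + cost);
  }
}
```

The key design point is the ordering: check runs first and throws locally, so the provider never sees the refused request. Agents without a configured limit pass through unchecked in this sketch.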

Don't forget flush() in short-lived processes

Long-running servers can ignore this. But CLIs, Lambdas, and CI jobs must call flush() before the process exits; otherwise queued metadata is dropped. If you enabled a hard cap, prefer shutdown() instead: it flushes and also tears down the budget's event stream so the process can exit cleanly.
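
One way to make sure the flush happens even when the work throws is a try/finally wrapper. The helper below is a minimal sketch; withFlush is a hypothetical name, and the flush function is passed in as a parameter so the example does not assume anything about the SDK beyond what is described above.

```typescript
// Hypothetical helper: runs a task, then always awaits the supplied flush
// function, whether the task resolved or threw.
async function withFlush<T>(
  task: () => Promise<T>,
  flush: () => Promise<void>,
): Promise<T> {
  try {
    return await task();
  } finally {
    await flush(); // runs on success and on failure, before the result propagates
  }
}
```

In a CLI or Lambda handler you might pass the SDK's flush (or shutdown, if a hard cap is enabled) as the second argument, so that a failed run still reports the calls it made.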

Where to go next

—The documentation covers all five providers (OpenAI, Anthropic, Bedrock, Gemini, Mistral), nested tags, and cross-provider fallback.
—Weighing tools? See spaturzu vs Helicone and spaturzu vs Langfuse.
—New to the idea? Start with what LLM cost attribution is.

See which agent spent the money.

spaturzu attributes every OpenAI, Anthropic, Bedrock, Gemini, and Mistral call to the agent and run that made it — no proxy, no prompt changes. Free to start.
