SDK Documentation

Complete guide to integrating AgentCost into your LangChain applications

Installation

Install the AgentCost SDK using pip:

bash
pip install agentcost

Or install from source:

bash
cd agentcost-sdk
pip install -e .

Quick Start

Start tracking LLM costs with a single import and one init() call:

python
from agentcost import track_costs

# Initialize tracking
track_costs.init(
    api_key="your_api_key",
    project_id="my-project"
)

# Your existing code works unchanged
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
response = llm.invoke("Hello, world!")  # Automatically tracked

Note: The SDK uses monkey patching to intercept LangChain calls. Your existing code requires no modifications.

Security: API keys are shown once on creation. Store them securely and rotate keys from the dashboard if needed.
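
The note above mentions monkey patching; for intuition, here is a minimal sketch of that pattern. It is illustrative only, assuming a wrapper around ChatOpenAI.invoke, and is not AgentCost's actual implementation:

python
# Illustrative sketch only: NOT AgentCost's real internals.
# Monkey patching replaces a method on the class so every call is observed.
import time
from langchain_openai import ChatOpenAI

_original_invoke = ChatOpenAI.invoke

def _tracked_invoke(self, *args, **kwargs):
    start = time.time()
    result = _original_invoke(self, *args, **kwargs)
    latency_ms = (time.time() - start) * 1000
    # A real tracker would build an event (model, tokens, cost, latency)
    # here and queue it for batched delivery.
    return result

ChatOpenAI.invoke = _tracked_invoke  # all instances are now tracked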

Configuration

The SDK supports extensive configuration options:

python
track_costs.init(
    # Required for cloud mode
    api_key="sk_...",
    project_id="my-project",

    # Optional settings
    base_url="https://api.agentcost.tech",  # Your backend URL
    batch_size=10,                          # Events before auto-flush
    flush_interval=5.0,                     # Seconds between flushes
    debug=True,                             # Enable debug logging
    default_agent_name="my-agent",          # Default agent tag
    local_mode=False,                       # Store locally (no backend)
    enabled=True,                           # Enable/disable tracking

    # Custom pricing (overrides defaults)
    custom_pricing={
        "my-custom-model": {"input": 0.001, "output": 0.002}
    },

    # Global metadata (attached to all events)
    global_metadata={
        "environment": "production",
        "version": "1.0.0"
    }
)

Configuration Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | None | Your project API key |
| project_id | str | None | Your project identifier |
| batch_size | int | 10 | Events before auto-flush |
| flush_interval | float | 5.0 | Seconds between flushes |
| local_mode | bool | False | Store events locally only |
| debug | bool | False | Enable debug logging |

Agent Tagging

Tag LLM calls by agent for granular analytics:

python
# Option 1: Set default agent
track_costs.set_agent_name("router-agent")

# Option 2: Context manager (recommended)
with track_costs.agent("technical-agent"):
    llm.invoke("How do I fix this bug?")  # Tagged as "technical-agent"

with track_costs.agent("billing-agent"):
    llm.invoke("What's my balance?")  # Tagged as "billing-agent"

Agent names appear in your dashboard, allowing you to track costs per agent and identify which parts of your system are most expensive.
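
For example, a simple routing pipeline can attribute each step to its own agent. The routing logic here is hypothetical; only track_costs.agent() is the documented API:

python
# Hypothetical two-step pipeline: routing, then handling, each tagged separately.
with track_costs.agent("router-agent"):
    route = llm.invoke("Classify this query as 'billing' or 'technical': What's my balance?")

handler = "billing-agent" if "billing" in route.content.lower() else "technical-agent"
with track_costs.agent(handler):
    answer = llm.invoke("What's my balance?")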

Metadata

Attach custom metadata for filtering and grouping:

python
# Persistent metadata (attached to all subsequent events)
track_costs.add_metadata("user_id", "user_123")
track_costs.add_metadata("tenant_id", "acme_corp")

# Temporary metadata (context manager)
with track_costs.metadata(conversation_id="conv_456", step="routing"):
    llm.invoke("Route this query")

Local Mode

Test without running a backend:

python
track_costs.init(local_mode=True, debug=True)

# Make LLM calls
llm.invoke("Hello!")
llm.invoke("World!")

# Retrieve captured events
events = track_costs.get_local_events()
for event in events:
    print(f"Model: {event['model']}")
    print(f"Tokens: {event['total_tokens']}")
    print(f"Cost: ${event['cost']:.6f}")

Streaming Support

Streaming calls are automatically tracked:

python
# Sync streaming
for chunk in llm.stream("Tell me a story"):
    print(chunk.content, end="")
# Event recorded after the stream completes

# Async streaming (must run inside an async function)
import asyncio

async def main():
    async for chunk in llm.astream("Tell me a story"):
        print(chunk.content, end="")
    # Event recorded after the stream completes

asyncio.run(main())

Supported Models

AgentCost supports more than 1,900 models from all major providers. Pricing is synced automatically from LiteLLM's pricing database, so cost figures stay accurate and up to date.

View all models: Browse the complete model catalog with search, filtering, and live pricing.
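
If you want to inspect these rates yourself, the litellm package exposes the same pricing table (this assumes you have litellm installed; it is not required by the SDK):

python
import litellm

# LiteLLM's pricing table maps model names to per-token USD rates
gpt4 = litellm.model_cost["gpt-4"]
print(gpt4["input_cost_per_token"])   # e.g. 3e-05 ($0.03 per 1K input tokens)
print(gpt4["output_cost_per_token"])  # e.g. 6e-05 ($0.06 per 1K output tokens)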

| Provider | Examples |
|---|---|
| OpenAI | gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, o1, o1-mini, o1-preview |
| Anthropic | claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-3.5-sonnet, claude-3.5-haiku, claude-4-opus |
| Google | gemini-pro, gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash |
| Groq | llama-3.1-8b, llama-3.1-70b, llama-3.3-70b, mixtral-8x7b |
| DeepSeek | deepseek-chat, deepseek-coder, deepseek-reasoner |
| Cohere | command, command-r, command-r-plus |
| Mistral | mistral-small, mistral-medium, mistral-large |
| Together AI | meta-llama/Llama-3-70b, Qwen models, Phi models |
| AWS Bedrock | All Bedrock-hosted models (Claude, Titan, Llama) |
| Azure OpenAI | All Azure-hosted OpenAI models |
| 30+ More | Replicate, Fireworks, Anyscale, Perplexity, etc. |

For custom or private models, you can provide custom pricing via the custom_pricing parameter. The SDK also fetches the latest pricing from the backend automatically.
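
For example (the model name below is hypothetical, and the input/output values follow the format of the custom_pricing example in Configuration; confirm whether the unit scale is per token or per 1K tokens for your deployment):

python
track_costs.init(
    api_key="sk_...",
    project_id="my-project",
    # "my-finetuned-llama" is a hypothetical model name
    custom_pricing={
        "my-finetuned-llama": {"input": 0.0005, "output": 0.0010},
    },
)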

Event Structure

Each tracked event contains:

json
{
  "agent_name": "my-agent",
  "model": "gpt-4",
  "input_tokens": 150,
  "output_tokens": 80,
  "total_tokens": 230,
  "cost": 0.0093,
  "latency_ms": 1234,
  "timestamp": "2024-01-23T10:30:45.123Z",
  "success": true,
  "error": null,
  "streaming": false,
  "metadata": {"conversation_id": "conv_456"}
}
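
As a sanity check, the cost field above is consistent with gpt-4's published rates of $0.03 per 1K input tokens and $0.06 per 1K output tokens:

python
input_cost = 150 / 1000 * 0.03    # $0.0045
output_cost = 80 / 1000 * 0.06    # $0.0048
print(f"{input_cost + output_cost:.4f}")  # 0.0093, matching the "cost" field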

Graceful Shutdown

Ensure all events are sent before your application exits:

python
# Send pending events
track_costs.flush()

# Full shutdown
track_costs.shutdown()

Tip: Use Python's atexit module to automatically call shutdown() when your application exits.

Error Handling

The SDK is designed to never interfere with your application. All tracking operations are:

  • Non-blocking: Events are batched and sent asynchronously
  • Fault-tolerant: Network failures are silently handled
  • Retry-enabled: Failed batches are retried with exponential backoff (sketched below)

python
# The SDK never throws exceptions to your code
try:
    response = llm.invoke("Hello!")  # This works even if tracking fails
except Exception as e:
    # This will only catch LLM errors, not tracking errors
    print(f"LLM error: {e}")

# To see tracking errors, enable debug mode
track_costs.init(api_key="...", debug=True)  # Logs errors to console
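
The retry behaviour mentioned above follows the standard exponential-backoff pattern. A minimal sketch of that pattern, illustrative only and not AgentCost's actual internals:

python
import random
import time

def send_with_backoff(send_batch, batch, max_retries=5):
    """Illustrative retry loop, not AgentCost's real internals."""
    for attempt in range(max_retries):
        try:
            send_batch(batch)
            return True
        except Exception:
            # Wait 1s, 2s, 4s, ... plus jitter before retrying
            time.sleep(2 ** attempt + random.random())
    return False  # drop the batch silently; tracking must never crash the app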

Best Practices

1. Initialize Early

Call track_costs.init() before creating any LLM instances:

python
# Correct: Initialize before importing LLM
from agentcost import track_costs
track_costs.init(api_key="sk_...")

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4")

# Wrong: LLM created before initialization
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4")

from agentcost import track_costs
track_costs.init(api_key="sk_...")  # Too late!

2. Use Agent Context Managers

Context managers ensure proper agent tagging even if exceptions occur:

python
# Recommended: Context manager
with track_costs.agent("router"):
    response = llm.invoke(query)

# Less safe: Manual setting
track_costs.set_agent_name("router")
response = llm.invoke(query)  # What if this throws?
track_costs.set_agent_name("default")  # Might not run

3. Environment Variables

Store sensitive configuration in environment variables:

python
import os
from agentcost import track_costs

track_costs.init(
    api_key=os.environ["AGENTCOST_API_KEY"],
    base_url=os.environ.get("AGENTCOST_URL", "https://api.agentcost.tech"),
    debug=os.environ.get("DEBUG", "false").lower() == "true"
)

4. Graceful Shutdown

Always flush events before your application exits:

python
import atexit
from agentcost import track_costs

track_costs.init(api_key="sk_...")

# Register shutdown handler
atexit.register(track_costs.shutdown)

# Or in FastAPI
@app.on_event("shutdown")
async def shutdown_event():
    track_costs.shutdown()
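
Note that on_event is deprecated in recent FastAPI releases; the lifespan equivalent (assuming FastAPI 0.93 or newer) looks like this:

python
from contextlib import asynccontextmanager
from fastapi import FastAPI
from agentcost import track_costs

@asynccontextmanager
async def lifespan(app: FastAPI):
    yield  # application runs here
    track_costs.shutdown()  # flush remaining events on shutdown

app = FastAPI(lifespan=lifespan)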

Troubleshooting

Events not appearing in dashboard

  • Ensure track_costs.init() is called before LLM usage
  • Check your API key is correct
  • Enable debug=True to see error messages
  • Call track_costs.flush() to force send events (see the snippet below)
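
A quick end-to-end check, using the init parameters shown earlier:

python
from agentcost import track_costs
from langchain_openai import ChatOpenAI

track_costs.init(api_key="sk_...", project_id="my-project", debug=True)

llm = ChatOpenAI(model="gpt-4o-mini")
llm.invoke("ping")

track_costs.flush()  # in debug mode, delivery errors are logged to the console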

Token counts seem wrong

  • The SDK uses tiktoken for accurate counting
  • Make sure tiktoken is installed: pip install tiktoken
  • Some models may use different tokenizers

Connection errors

  • Verify your base_url is correct
  • Check that the backend is running and accessible
  • Look for firewall or proxy issues

Getting support

If you're still having issues, check our GitHub Issues or start a discussion.