SDK Documentation

Complete guide to integrating AgentCost into your LangChain applications

Installation

Install the AgentCost SDK using pip:

bash
pip install agentcost

Or install from source:

bash
cd agentcost-sdk
pip install -e .

Quick Start

Start tracking LLM costs with a single import and one init() call:

python
from agentcost import track_costs

# Initialize tracking
track_costs.init(
    api_key="your_api_key",
    project_id="my-project"
)

# Your existing code works unchanged
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
response = llm.invoke("Hello, world!")  # Automatically tracked

Note: The SDK uses monkey patching to intercept LangChain calls. Your existing code requires no modifications.

Security: API keys are shown once on creation. Store them securely and rotate keys from the dashboard if needed.
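
The note above mentions monkey patching; for intuition, here is a minimal sketch of that pattern. It is illustrative only, assuming a wrapper around ChatOpenAI.invoke, and is not AgentCost's actual implementation:

python
# Illustrative sketch only: NOT AgentCost's real internals.
# Monkey patching replaces a method on the class so every call is observed.
import time
from langchain_openai import ChatOpenAI

_original_invoke = ChatOpenAI.invoke

def _tracked_invoke(self, *args, **kwargs):
    start = time.time()
    result = _original_invoke(self, *args, **kwargs)
    latency_ms = (time.time() - start) * 1000
    # A real tracker would build an event (model, tokens, cost, latency)
    # here and queue it for batched delivery.
    return result

ChatOpenAI.invoke = _tracked_invoke  # all instances are now tracked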

Configuration

The SDK supports extensive configuration options:

python
track_costs.init(
    # Required for cloud mode
    api_key="sk_...",
    project_id="my-project",

    # Optional settings
    base_url="https://api.agentcost.tech",  # Your backend URL
    batch_size=10,                          # Events before auto-flush
    flush_interval=5.0,                     # Seconds between flushes
    debug=True,                             # Enable debug logging
    default_agent_name="my-agent",          # Default agent tag
    local_mode=False,                       # Store locally (no backend)
    enabled=True,                           # Enable/disable tracking

    # Custom pricing (overrides defaults)
    custom_pricing={
        "my-custom-model": {"input": 0.001, "output": 0.002}
    },

    # Global metadata (attached to all events)
    global_metadata={
        "environment": "production",
        "version": "1.0.0"
    }
)

Configuration Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | None | Your project API key |
| project_id | str | None | Your project identifier |
| batch_size | int | 10 | Events before auto-flush |
| flush_interval | float | 5.0 | Seconds between flushes |
| local_mode | bool | False | Store events locally only |
| debug | bool | False | Enable debug logging |

Agent Tagging

Tag LLM calls by agent for granular analytics:

python
# Option 1: Set default agent
track_costs.set_agent_name("router-agent")

# Option 2: Context manager (recommended)
with track_costs.agent("technical-agent"):
    llm.invoke("How do I fix this bug?")  # Tagged as "technical-agent"

with track_costs.agent("billing-agent"):
    llm.invoke("What's my balance?")  # Tagged as "billing-agent"

Agent names appear in your dashboard, allowing you to track costs per agent and identify which parts of your system are most expensive.
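
For example, a simple routing pipeline can attribute each step to its own agent. The routing logic here is hypothetical; only track_costs.agent() is the documented API:

python
# Hypothetical two-step pipeline: routing, then handling, each tagged separately.
with track_costs.agent("router-agent"):
    route = llm.invoke("Classify this query as 'billing' or 'technical': What's my balance?")

handler = "billing-agent" if "billing" in route.content.lower() else "technical-agent"
with track_costs.agent(handler):
    answer = llm.invoke("What's my balance?")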

Metadata

Attach custom metadata for filtering and grouping:

python
# Persistent metadata (attached to all subsequent events)
track_costs.add_metadata("user_id", "user_123")
track_costs.add_metadata("tenant_id", "acme_corp")

# Temporary metadata (context manager)
with track_costs.metadata(conversation_id="conv_456", step="routing"):
    llm.invoke("Route this query")

Local Mode

Test without running a backend:

python
track_costs.init(local_mode=True, debug=True)

# Make LLM calls
llm.invoke("Hello!")
llm.invoke("World!")

# Retrieve captured events
events = track_costs.get_local_events()
for event in events:
    print(f"Model: {event['model']}")
    print(f"Tokens: {event['total_tokens']}")
    print(f"Cost: ${event['cost']:.6f}")

Streaming Support

Streaming calls are automatically tracked:

python
# Sync streaming
for chunk in llm.stream("Tell me a story"):
    print(chunk.content, end="")
# Event recorded after the stream completes

# Async streaming (must run inside an async function)
import asyncio

async def main():
    async for chunk in llm.astream("Tell me a story"):
        print(chunk.content, end="")
    # Event recorded after the stream completes

asyncio.run(main())

Supported Models

AgentCost supports more than 1,900 models from all major providers. Pricing is synced automatically from LiteLLM's pricing database, so cost figures stay accurate and up to date.

View all models: Browse the complete model catalog with search, filtering, and live pricing.
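
If you want to inspect these rates yourself, the litellm package exposes the same pricing table (this assumes you have litellm installed; it is not required by the SDK):

python
import litellm

# LiteLLM's pricing table maps model names to per-token USD rates
gpt4 = litellm.model_cost["gpt-4"]
print(gpt4["input_cost_per_token"])   # e.g. 3e-05 ($0.03 per 1K input tokens)
print(gpt4["output_cost_per_token"])  # e.g. 6e-05 ($0.06 per 1K output tokens)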

| Provider | Examples |
|---|---|
| OpenAI | gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, o1, o1-mini, o1-preview |
| Anthropic | claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-3.5-sonnet, claude-3.5-haiku, claude-4-opus |
| Google | gemini-pro, gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash |
| Groq | llama-3.1-8b, llama-3.1-70b, llama-3.3-70b, mixtral-8x7b |
| DeepSeek | deepseek-chat, deepseek-coder, deepseek-reasoner |
| Cohere | command, command-r, command-r-plus |
| Mistral | mistral-small, mistral-medium, mistral-large |
| Together AI | meta-llama/Llama-3-70b, Qwen models, Phi models |
| AWS Bedrock | All Bedrock-hosted models (Claude, Titan, Llama) |
| Azure OpenAI | All Azure-hosted OpenAI models |
| 30+ More | Replicate, Fireworks, Anyscale, Perplexity, etc. |

For custom or private models, you can provide custom pricing via the custom_pricing parameter. The SDK also fetches the latest pricing from the backend automatically.
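
For example (the model name below is hypothetical, and the input/output values follow the format of the custom_pricing example in Configuration; confirm whether the unit scale is per token or per 1K tokens for your deployment):

python
track_costs.init(
    api_key="sk_...",
    project_id="my-project",
    # "my-finetuned-llama" is a hypothetical model name
    custom_pricing={
        "my-finetuned-llama": {"input": 0.0005, "output": 0.0010},
    },
)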

Event Structure

Each tracked event contains:

json
{
  "agent_name": "my-agent",
  "model": "gpt-4",
  "input_tokens": 150,
  "output_tokens": 80,
  "total_tokens": 230,
  "cost": 0.0093,
  "latency_ms": 1234,
  "timestamp": "2024-01-23T10:30:45.123Z",
  "success": true,
  "error": null,
  "streaming": false,
  "metadata": {"conversation_id": "conv_456"}
}
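
As a sanity check, the cost field above is consistent with gpt-4's published rates of $0.03 per 1K input tokens and $0.06 per 1K output tokens:

python
input_cost = 150 / 1000 * 0.03    # $0.0045
output_cost = 80 / 1000 * 0.06    # $0.0048
print(f"{input_cost + output_cost:.4f}")  # 0.0093, matching the "cost" field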

Graceful Shutdown

Ensure all events are sent before your application exits:

python
# Send pending events
track_costs.flush()

# Full shutdown
track_costs.shutdown()

Tip: Use Python's atexit module to automatically call shutdown() when your application exits.

Error Handling

The SDK is designed to never interfere with your application. All tracking operations are:

  • Non-blocking: Events are batched and sent asynchronously
  • Fault-tolerant: Network failures are silently handled
  • Retry-enabled: Failed batches are retried with exponential backoff (sketched below)

python
# The SDK never throws exceptions to your code
try:
    response = llm.invoke("Hello!")  # This works even if tracking fails
except Exception as e:
    # This will only catch LLM errors, not tracking errors
    print(f"LLM error: {e}")

# To see tracking errors, enable debug mode
track_costs.init(api_key="...", debug=True)  # Logs errors to console
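
The retry behaviour mentioned above follows the standard exponential-backoff pattern. A minimal sketch of that pattern, illustrative only and not AgentCost's actual internals:

python
import random
import time

def send_with_backoff(send_batch, batch, max_retries=5):
    """Illustrative retry loop, not AgentCost's real internals."""
    for attempt in range(max_retries):
        try:
            send_batch(batch)
            return True
        except Exception:
            # Wait 1s, 2s, 4s, ... plus jitter before retrying
            time.sleep(2 ** attempt + random.random())
    return False  # drop the batch silently; tracking must never crash the app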

Best Practices

1. Initialize Early

Call track_costs.init() before creating any LLM instances:

python
# Correct: Initialize before importing LLM
from agentcost import track_costs
track_costs.init(api_key="sk_...")

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4")

# Wrong: LLM created before initialization
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4")

from agentcost import track_costs
track_costs.init(api_key="sk_...")  # Too late!

2. Use Agent Context Managers

Context managers ensure proper agent tagging even if exceptions occur:

python
# Recommended: Context manager
with track_costs.agent("router"):
    response = llm.invoke(query)

# Less safe: Manual setting
track_costs.set_agent_name("router")
response = llm.invoke(query)  # What if this throws?
track_costs.set_agent_name("default")  # Might not run

3. Environment Variables

Store sensitive configuration in environment variables:

python
import os
from agentcost import track_costs

track_costs.init(
    api_key=os.environ["AGENTCOST_API_KEY"],
    base_url=os.environ.get("AGENTCOST_URL", "https://api.agentcost.tech"),
    debug=os.environ.get("DEBUG", "false").lower() == "true"
)

4. Graceful Shutdown

Always flush events before your application exits:

python
import atexit
from agentcost import track_costs

track_costs.init(api_key="sk_...")

# Register shutdown handler
atexit.register(track_costs.shutdown)

# Or in FastAPI
@app.on_event("shutdown")
async def shutdown_event():
    track_costs.shutdown()
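
Note that on_event is deprecated in recent FastAPI releases; the lifespan equivalent (assuming FastAPI 0.93 or newer) looks like this:

python
from contextlib import asynccontextmanager
from fastapi import FastAPI
from agentcost import track_costs

@asynccontextmanager
async def lifespan(app: FastAPI):
    yield  # application runs here
    track_costs.shutdown()  # flush remaining events on shutdown

app = FastAPI(lifespan=lifespan)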

Troubleshooting

Events not appearing in dashboard

  • Ensure track_costs.init() is called before LLM usage
  • Check your API key is correct
  • Enable debug=True to see error messages
  • Call track_costs.flush() to force send events (see the snippet below)
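
A quick end-to-end check, using the init parameters shown earlier:

python
from agentcost import track_costs
from langchain_openai import ChatOpenAI

track_costs.init(api_key="sk_...", project_id="my-project", debug=True)

llm = ChatOpenAI(model="gpt-4o-mini")
llm.invoke("ping")

track_costs.flush()  # in debug mode, delivery errors are logged to the console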

Token counts seem wrong

  • The SDK uses tiktoken for accurate counting
  • Make sure tiktoken is installed: pip install tiktoken
  • Some models may use different tokenizers

Connection errors

  • Verify your base_url is correct
  • Check that the backend is running and accessible
  • Look for firewall or proxy issues

Getting support

If you're still having issues, check our GitHub Issues or start a discussion.