Complete guide to integrating AgentCost into your LangChain applications
Install the AgentCost SDK using pip:
```bash
pip install agentcost
```

Or install from source:

```bash
cd agentcost-sdk
pip install -e .
```

Add just two lines of code to start tracking LLM costs:
```python
from agentcost import track_costs

# Initialize tracking
track_costs.init(
    api_key="your_api_key",
    project_id="my-project"
)

# Your existing code works unchanged
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
response = llm.invoke("Hello, world!")  # Automatically tracked
```

Note: The SDK uses monkey patching to intercept LangChain calls. Your existing code requires no modifications.
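For intuition, here is a minimal sketch of the monkey-patching idea (illustration only, not AgentCost's actual implementation):

```python
# Illustration only: the monkey-patching idea in miniature,
# not AgentCost's actual implementation.
from langchain_openai import ChatOpenAI

_original_invoke = ChatOpenAI.invoke

def _tracked_invoke(self, *args, **kwargs):
    response = _original_invoke(self, *args, **kwargs)
    # A real tracker would read token usage from the response
    # metadata here and enqueue a cost event.
    return response

ChatOpenAI.invoke = _tracked_invoke  # calls are now intercepted
```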
Security: API keys are shown once on creation. Store them securely and rotate keys from the dashboard if needed.
The SDK supports extensive configuration options:
```python
track_costs.init(
    # Required for cloud mode
    api_key="sk_...",
    project_id="my-project",

    # Optional settings
    base_url="https://api.agentcost.tech",  # Your backend URL
    batch_size=10,                  # Events before auto-flush
    flush_interval=5.0,             # Seconds between flushes
    debug=True,                     # Enable debug logging
    default_agent_name="my-agent",  # Default agent tag
    local_mode=False,               # Store locally (no backend)
    enabled=True,                   # Enable/disable tracking

    # Custom pricing (overrides defaults)
    custom_pricing={
        "my-custom-model": {"input": 0.001, "output": 0.002}
    },

    # Global metadata (attached to all events)
    global_metadata={
        "environment": "production",
        "version": "1.0.0"
    }
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | None | Your project API key |
| project_id | str | None | Your project identifier |
| batch_size | int | 10 | Events before auto-flush |
| flush_interval | float | 5.0 | Seconds between flushes |
| local_mode | bool | False | Store events locally only |
| debug | bool | False | Enable debug logging |
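To make batch_size and flush_interval concrete, here is a conceptual sketch of buffer-and-flush semantics like these (not the SDK's actual internals, which may also flush on a background timer):

```python
import time

# Conceptual sketch: buffer events, flush when the batch fills
# or the interval elapses. The SDK's real implementation may differ.
class EventBuffer:
    def __init__(self, batch_size=10, flush_interval=5.0):
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.events = []
        self.last_flush = time.monotonic()

    def add(self, event):
        self.events.append(event)
        if (len(self.events) >= self.batch_size
                or time.monotonic() - self.last_flush >= self.flush_interval):
            self.flush()

    def flush(self):
        if self.events:
            print(f"sending {len(self.events)} events")  # POST to backend in reality
            self.events.clear()
        self.last_flush = time.monotonic()
```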
Tag LLM calls by agent for granular analytics:
```python
# Option 1: Set default agent
track_costs.set_agent_name("router-agent")

# Option 2: Context manager (recommended)
with track_costs.agent("technical-agent"):
    llm.invoke("How do I fix this bug?")  # Tagged as "technical-agent"

with track_costs.agent("billing-agent"):
    llm.invoke("What's my balance?")  # Tagged as "billing-agent"
```

Agent names appear in your dashboard, allowing you to track costs per agent and identify which parts of your system are most expensive.
Attach custom metadata for filtering and grouping:
```python
# Persistent metadata (attached to all subsequent events)
track_costs.add_metadata("user_id", "user_123")
track_costs.add_metadata("tenant_id", "acme_corp")

# Temporary metadata (context manager)
with track_costs.metadata(conversation_id="conv_456", step="routing"):
    llm.invoke("Route this query")
```

Test without running a backend:
```python
track_costs.init(local_mode=True, debug=True)

# Make LLM calls
llm.invoke("Hello!")
llm.invoke("World!")

# Retrieve captured events
events = track_costs.get_local_events()
for event in events:
    print(f"Model: {event['model']}")
    print(f"Tokens: {event['total_tokens']}")
    print(f"Cost: ${event['cost']:.6f}")
```
Streaming calls are automatically tracked:

```python
# Sync streaming
for chunk in llm.stream("Tell me a story"):
    print(chunk.content, end="")
# Event recorded after stream completes

# Async streaming
async for chunk in llm.astream("Tell me a story"):
    print(chunk.content, end="")
# Event recorded after stream completes
```

AgentCost supports more than 1,900 models from all major providers. Pricing is automatically synced from LiteLLM's comprehensive pricing database, ensuring you always have accurate, up-to-date cost information.
View all models: Browse the complete model catalog with search, filtering, and live pricing.
| Provider | Examples |
|---|---|
| OpenAI | gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, o1, o1-mini, o1-preview |
| Anthropic | claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-3.5-sonnet, claude-3.5-haiku, claude-4-opus |
| Google | gemini-pro, gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash |
| Groq | llama-3.1-8b, llama-3.1-70b, llama-3.3-70b, mixtral-8x7b |
| DeepSeek | deepseek-chat, deepseek-coder, deepseek-reasoner |
| Cohere | command, command-r, command-r-plus |
| Mistral | mistral-small, mistral-medium, mistral-large |
| Together AI | meta-llama/Llama-3-70b, Qwen models, Phi models |
| AWS Bedrock | All Bedrock-hosted models (Claude, Titan, Llama) |
| Azure OpenAI | All Azure-hosted OpenAI models |
| 30+ More | Replicate, Fireworks, Anyscale, Perplexity, etc. |
For custom or private models, you can provide custom pricing via the custom_pricing parameter. The SDK also fetches the latest pricing from the backend automatically.
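For instance (hypothetical model name; use the same price units as the other custom_pricing entries in your configuration):

```python
# Hypothetical private model registered with custom pricing.
track_costs.init(
    api_key="sk_...",
    project_id="my-project",
    custom_pricing={
        "my-finetuned-llama": {"input": 0.001, "output": 0.002},
    },
)
```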
Each tracked event contains:
```json
{
  "agent_name": "my-agent",
  "model": "gpt-4",
  "input_tokens": 150,
  "output_tokens": 80,
  "total_tokens": 230,
  "cost": 0.0093,
  "latency_ms": 1234,
  "timestamp": "2024-01-23T10:30:45.123Z",
  "success": true,
  "error": null,
  "streaming": false,
  "metadata": {"conversation_id": "conv_456"}
}
```
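As a sanity check, the cost field above is consistent with gpt-4 list pricing of $0.03 per 1K input tokens and $0.06 per 1K output tokens (pricing quoted here for illustration):

```python
# Reproduce the cost above from its token counts, assuming gpt-4
# pricing of $0.03/1K input and $0.06/1K output tokens.
input_tokens, output_tokens = 150, 80
cost = input_tokens / 1000 * 0.03 + output_tokens / 1000 * 0.06
print(f"${cost:.4f}")  # $0.0093
```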
Ensure all events are sent before your application exits:

```python
# Send pending events
track_costs.flush()

# Full shutdown
track_costs.shutdown()
```

Tip: Use Python's atexit module to automatically call shutdown() when your application exits.
The SDK is designed to never interfere with your application: tracking operations fail silently and never raise exceptions into your code.
```python
# The SDK never throws exceptions to your code
try:
    response = llm.invoke("Hello!")  # This works even if tracking fails
except Exception as e:
    # This will only catch LLM errors, not tracking errors
    print(f"LLM error: {e}")

# To see tracking errors, enable debug mode
track_costs.init(api_key="...", debug=True)  # Logs errors to console
```

Call track_costs.init() before creating any LLM instances:
```python
# Correct: Initialize before importing LLM
from agentcost import track_costs
track_costs.init(api_key="sk_...")

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4")

# Wrong: LLM created before initialization
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4")

from agentcost import track_costs
track_costs.init(api_key="sk_...")  # Too late!
```

Context managers ensure proper agent tagging even if exceptions occur:
```python
# Recommended: Context manager
with track_costs.agent("router"):
    response = llm.invoke(query)

# Less safe: Manual setting
track_costs.set_agent_name("router")
response = llm.invoke(query)  # What if this throws?
track_costs.set_agent_name("default")  # Might not run
```
Store sensitive configuration in environment variables:

```python
import os
from agentcost import track_costs

track_costs.init(
    api_key=os.environ["AGENTCOST_API_KEY"],
    base_url=os.environ.get("AGENTCOST_URL", "https://api.agentcost.tech"),
    debug=os.environ.get("DEBUG", "false").lower() == "true"
)
```

Always flush events before your application exits:
```python
import atexit
from agentcost import track_costs

track_costs.init(api_key="sk_...")

# Register shutdown handler
atexit.register(track_costs.shutdown)

# Or in FastAPI/Flask
@app.on_event("shutdown")
async def shutdown_event():
    track_costs.shutdown()
```

If events aren't showing up, check that:

- track_costs.init() is called before LLM usage
- debug=True is enabled so you can see error messages
- track_costs.flush() has been called to force-send events
- tiktoken is installed (pip install tiktoken)
- base_url is correct

If you're still having issues, check our GitHub Issues or start a discussion.