Rate Limits
Keystore enforces rate limits and budget caps at the proxy layer, before requests ever reach the downstream provider. This protects you from runaway agent loops, accidental infinite retries, and unexpected cost spikes.
Limit Types
Each agent can be configured with three types of limits:
| Limit | Description | Default |
|---|---|---|
| RPM (Requests Per Minute) | Sliding window, per agent + provider combination | 60 |
| RPD (Requests Per Day) | Fixed window (24h), per agent + provider combination | 10,000 |
| Monthly Budget | Total estimated spend cap in cents, per agent | No limit |
How Limits Are Enforced
On every proxy request, the following checks run in order:
1. Validate agent token (SHA-256 hash lookup)
2. Check agent status (active / paused / revoked)
3. Check RPM limit → if exceeded, return 429
4. Check RPD limit → if exceeded, return 429
5. Check budget → if exceeded, return 429
6. Forward request to provider
7. After response: increment spend counter
If any check fails, the request is rejected immediately and no downstream API call is made.
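The ordering of checks 1 through 5 can be sketched as a single function. This is only an illustration of the sequence, not Keystore's implementation: the `Agent` class, the `check_request` function, and the callback parameters are hypothetical, and the 401 and 403 codes for token and status failures are assumptions (this page only specifies 429 for the limit checks).

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Agent:  # hypothetical record, for illustration only
    status: str  # "active", "paused", or "revoked"
    budget_cents_monthly: Optional[int] = None  # None means no budget cap


def check_request(
    token: str,
    agents_by_hash: Dict[str, Agent],
    rpm_ok: Callable[[Agent], bool],
    rpd_ok: Callable[[Agent], bool],
    spent_cents: Callable[[Agent], int],
) -> Tuple[int, str]:
    """Run the checks in order; (200, "forward") means the request may proceed."""
    # 1. Validate the token by looking up its SHA-256 hash.
    token_hash = hashlib.sha256(token.encode("utf-8")).hexdigest()
    agent = agents_by_hash.get(token_hash)
    if agent is None:
        return 401, "invalid token"  # assumed status code
    # 2. Only active agents may send requests.
    if agent.status != "active":
        return 403, agent.status  # assumed status code
    # 3-4. Request-count limits.
    if not rpm_ok(agent):
        return 429, "rpm"
    if not rpd_ok(agent):
        return 429, "rpd"
    # 5. Budget: rejected once spend equals or exceeds the cap.
    budget = agent.budget_cents_monthly
    if budget is not None and spent_cents(agent) >= budget:
        return 429, "budget"
    # 6. Nothing failed, so the caller would forward the request.
    return 200, "forward"
```

Because the function returns at the first failing check, a request that exceeds both RPM and the budget is reported as an `rpm` rejection.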
Configuring Limits
Via Dashboard
Navigate to Agents > [Agent Name] > Settings. Set the RPM, RPD, and monthly budget fields. Changes take effect within 5 minutes (the agent context cache TTL).
Via API
curl -X PATCH https://api.keystore.io/v1/agents/<agent-id> \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "rpmLimit": 120,
    "rpdLimit": 50000,
    "budgetCentsMonthly": 5000
  }'
Via CLI
keystore agents update <agent-id> \
  --rpm-limit 120 \
  --rpd-limit 50000 \
  --budget 5000
Setting a field to null removes the custom limit and falls back to the default.
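For scripted updates, the same PATCH request can be built with Python's standard library. This is a minimal sketch: the helper name `build_limit_update`, the agent ID `agent-123`, and the token value are placeholders, and the request is only constructed here, not sent. Passing `None` serializes to JSON `null`, which is how a custom limit would be cleared.

```python
import json
import urllib.request


def build_limit_update(agent_id: str, token: str, **limits) -> urllib.request.Request:
    """Build (but do not send) the PATCH request shown in the curl example."""
    body = json.dumps(limits).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.keystore.io/v1/agents/{agent_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# None becomes JSON null: the RPM limit falls back to the default,
# while the RPD limit is set to 50000.
req = build_limit_update("agent-123", "example-token", rpmLimit=None, rpdLimit=50000)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it: urllib.request.urlopen(req)
```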
Rate Limit Response
When a request is rate-limited, the proxy returns HTTP 429 Too Many Requests with headers indicating the limit state:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1709712000
Content-Type: application/json

{
  "error": "Rate limit exceeded",
  "limitType": "rpm",
  "limit": 60,
  "remaining": 0,
  "resetAt": 1709712000
}
The limitType field indicates which limit was hit (rpm, rpd, or budget).
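A client can use these fields to decide whether and when to retry. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds (the sample value is consistent with that, but this page does not state it) and that a `budget` rejection is not worth retrying on a short timer. Both helper names are hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Seconds to wait before retrying, never negative.

    Assumes X-RateLimit-Reset holds a Unix timestamp in seconds.
    """
    current = time.time() if now is None else now
    reset_at = float(headers["X-RateLimit-Reset"])
    return max(0.0, reset_at - current)


def should_retry(body: Mapping[str, object]) -> bool:
    """Retry rpm/rpd rejections; treat budget rejections as final (an assumption)."""
    return body.get("limitType") in ("rpm", "rpd")
```

A caller would typically sleep for `seconds_until_reset(...)` after a 429 whose body passes `should_retry`, and surface the error otherwise.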
RPM: Sliding Window
RPM uses a sliding window algorithm via Upstash Redis. This gives smoother rate limiting than a fixed window because there is no "burst at window boundary" problem.
The window slides continuously, so the limit represents the maximum number of requests in any 60-second period.
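To make the behavior concrete, here is a minimal in-memory sliding-window log. It is a stand-in for illustration only: the class name is hypothetical, it keeps timestamps in a local deque rather than in Redis, and it is not a description of how Upstash implements its limiter.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window_seconds` period."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.timestamps: deque = deque()  # accepted request times, oldest first

    def allow(self, now: float) -> bool:
        # Drop requests that are at least one full window old.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False  # the window is full; reject without recording
        self.timestamps.append(now)
        return True
```

Capacity is freed one request at a time as each old request ages out, rather than all at once at a window boundary.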
Example: RPM = 60
Time 0:00 → Request 1 (remaining: 59)
Time 0:30 → Request 30 (remaining: 30)
Time 0:59 → Request 60 (remaining: 0)
Time 1:00 → Request 1 from 0:00 expires (remaining: 1)
RPD: Fixed Window
RPD uses a fixed window that resets every 24 hours. The window is aligned to the first request of the day for each agent+provider combination.
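A fixed window anchored to the first request can be sketched as follows. The class name is hypothetical and the counter lives in memory rather than Redis; the point is only to show that the whole count resets at once when the window elapses.

```python
from typing import Optional


class FixedWindowCounter:
    """Count requests in a window that starts at the first request seen."""

    def __init__(self, limit: int, window_seconds: float = 24 * 60 * 60) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start: Optional[float] = None
        self.count = 0

    def allow(self, now: float) -> bool:
        # Open a new window on the first request, or once the old one has elapsed.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return False  # rejected until the window resets
        self.count += 1
        return True
```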
Example: RPD = 10,000
Day starts → counter at 0
Throughout the day → counter increments
Counter hits 10,000 → all further requests rejected until reset
Day ends → counter resets to 0
Budget Enforcement
The monthly budget is tracked in Redis as a running total of estimated costs in cents. The key follows the pattern:
agent:{agentId}:spend:{YYYY-MM}
Before each request, the proxy compares the current spend to budgetCentsMonthly. If the spend equals or exceeds the budget, the request is rejected.
After each successful request, the proxy increments the spend counter by the estimated cost. Spend keys automatically expire after 32 days, ensuring old month data is cleaned up.
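The check-then-increment flow can be sketched with a plain dictionary standing in for Redis. The names `spend_key` and `BudgetTracker` are hypothetical, key expiry is not modeled, and deriving the month from a UTC timestamp is an assumption (this page does not say which time zone defines the month).

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def spend_key(agent_id: str, when: datetime) -> str:
    """Build the agent:{agentId}:spend:{YYYY-MM} key for the given moment."""
    return f"agent:{agent_id}:spend:{when:%Y-%m}"


class BudgetTracker:
    """In-memory stand-in for the Redis spend counters (no TTL handling)."""

    def __init__(self) -> None:
        self.store: Dict[str, int] = {}

    def within_budget(self, key: str, budget_cents: Optional[int]) -> bool:
        if budget_cents is None:  # no budget configured
            return True
        # Rejected once spend equals or exceeds the budget.
        return self.store.get(key, 0) < budget_cents

    def record_spend(self, key: str, cost_cents: int) -> int:
        self.store[key] = self.store.get(key, 0) + cost_cents
        return self.store[key]


key = spend_key("agent-123", datetime(2024, 3, 6, tzinfo=timezone.utc))
tracker = BudgetTracker()
tracker.record_spend(key, 4999)
```

Since the check happens before the request and the increment after it, the request that crosses the cap is still served; only later requests are rejected.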
Budget Response
{
  "error": "Budget exceeded",
  "limitType": "budget",
  "spentCents": 5000,
  "budgetCents": 5000
}
Per-Provider Scoping
RPM and RPD limits are tracked per agent and per provider. An agent with RPM=60 can make up to 60 requests/minute to OpenAI and 60 requests/minute to Anthropic independently.
Rate limit key format: {agentId}:{providerSlug}
Example: 550e8400-e29b-41d4-a716-446655440000:openai
Budget limits, on the other hand, are tracked per agent across all providers. A $50/month budget applies to the agent's combined spend across OpenAI, Anthropic, Resend, and any other assigned providers.
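The difference in scoping comes down to which key each counter uses. In this small sketch the function names and the in-memory counters are hypothetical; only the `{agentId}:{providerSlug}` key format comes from this page.

```python
from collections import Counter


def rate_limit_key(agent_id: str, provider_slug: str) -> str:
    """RPM/RPD counters are keyed per agent and per provider."""
    return f"{agent_id}:{provider_slug}"


request_counts: Counter = Counter()  # keyed by agent + provider
spend_cents: Counter = Counter()     # keyed by agent only


def record(agent_id: str, provider_slug: str, cost_cents: int) -> None:
    request_counts[rate_limit_key(agent_id, provider_slug)] += 1
    spend_cents[agent_id] += cost_cents  # all providers share one budget


agent = "550e8400-e29b-41d4-a716-446655440000"
record(agent, "openai", 3)
record(agent, "openai", 3)
record(agent, "anthropic", 5)
```

After these three calls the request counts are separate per provider, while the spend total is a single figure for the agent.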
Fail-Open Behavior
If the Redis instance used for rate limiting becomes unavailable, the proxy fails open: requests are allowed through without rate limiting. This design choice prioritizes availability over strict enforcement during infrastructure incidents.
When Redis recovers, rate limiting resumes automatically. A warning is logged for each request that skipped rate limiting:
[Redis] Rate limiting skipped -- Redis unavailable (fail open)
The same fail-open behavior applies to budget checks.
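Fail-open handling amounts to wrapping the limit check so that a backend failure counts as "allowed". In the sketch below, the function name is hypothetical and Python's built-in `ConnectionError` stands in for whatever error a real Redis client would raise.

```python
import logging
from typing import Callable

logger = logging.getLogger("ratelimit")


def allow_fail_open(check: Callable[[], bool]) -> bool:
    """Return the limiter's decision, or True if the backend is unreachable."""
    try:
        return check()
    except ConnectionError:  # stand-in for a Redis client error
        logger.warning("[Redis] Rate limiting skipped -- Redis unavailable (fail open)")
        return True
```

The trade-off is that limits and budgets are not enforced for the duration of an outage, so spend during that period is bounded only by the downstream provider.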
Monitoring Usage
Use the usage API or CLI to monitor how agents are tracking against their limits:
# Get a summary with budget insights
keystore usage summary <agent-id> --period 24h
# View per-provider breakdown
keystore usage providers <agent-id> --period 7d
The summary endpoint returns burn rate, projected monthly cost, and estimated days until budget exhaustion:
{
  "totalRequests": 4250,
  "costCents": 1200,
  "budgetCents": 5000,
  "burnRateCentsPerHour": 1.67,
  "projectedMonthlyCents": 1202,
  "daysUntilBudget": 76.2
}
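The sample `projectedMonthlyCents` value is consistent with multiplying the hourly burn rate by 24 hours and 30 days. That relationship is inferred from the sample numbers, not from a documented formula, so treat the helper below (a hypothetical name) as a way to sanity-check the figures rather than a specification.

```python
def projected_monthly_cents(burn_rate_cents_per_hour: float, days_in_month: int = 30) -> int:
    """Extrapolate an hourly burn rate to a month (30 days is an assumption)."""
    return round(burn_rate_cents_per_hour * 24 * days_in_month)


# 1.67 cents/hour * 24 hours * 30 days is about 1202.4 cents
projection = projected_monthly_cents(1.67)
```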