Overview
Cost Harvesting safeguards LLM usage by monitoring and limiting the number of tokens consumed by individual users. If a user exceeds a defined token limit, the system blocks further requests to avoid unnecessary cost spikes. The policy tracks the prompt and response tokens consumed by each user on a per-minute basis. If the tokens exceed the configured threshold, all additional requests for that minute will be denied.User Configuration
- Threshold Range: 0 - 100,000,000 prompt and response tokens per minute.
- Default: 100,000 prompt and response tokens per minute.
User ID Integration
To ensure this policy functions correctly, the user should provide a unique User ID to activate the policy. Without the User ID, the policy will not function. The User ID parameter should be passed in the request body asuser:
.
Security Standards
- OWASP LLM Top 10 Mapping: N/A.
- NIST Mapping: N/A.
- MITRE ATLAS Mapping: AML.T0034 - Cost Harvesting.