Overview

Cost Harvesting safeguards LLM usage by monitoring and limiting the number of tokens consumed by individual users. If a user exceeds a defined token limit, the system blocks further requests to avoid unnecessary cost spikes. The policy tracks the prompt and response tokens consumed by each user on a per-minute basis. If the tokens exceed the configured threshold, all additional requests for that minute will be denied.

User Configuration

  • Threshold Range: 0 - 100,000,000 prompt and response tokens per minute.
  • Default: 100,000 prompt and response tokens per minute.

If the number of prompt and response tokens exceeds the defined threshold within a minute, all additional requests from that user will be blocked for the remainder of that minute, including history.

User ID Integration

To ensure this policy functions correctly, the user should provide a unique User ID to activate the policy. Without the User ID, the policy will not function. The User ID parameter should be passed in the request body as user:.

Security Standards

  1. OWASP LLM Top 10 Mapping: N/A.
  2. NIST Mapping: N/A.
  3. MITRE ATLAS Mapping: AML.T0034 - Cost Harvesting.