
Best Practices for Managing AI Credit Usage

Learn how to monitor and optimize your account's AI inference credit usage to maximize value from Casefleet's AI-powered features.

Written by Meg Hall
Updated this week

Understanding AI Credits

AI inference credits power Casefleet's AI Assistant and Suggested Facts features. Each subscription includes 10,000 free AI inference credits per billed user per month, shared between both features. This allocation resets on the 1st of each calendar month and does not roll over.
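
As a rough illustration of the allocation described above, the shared monthly pool is the per-user allowance multiplied by the number of billed users. The function name below is hypothetical and is not part of any Casefleet API; only the 10,000-credit figure comes from this article.

```python
# Free AI inference credits per billed user per month (figure from this article).
CREDITS_PER_BILLED_USER = 10_000


def monthly_free_credits(billed_users: int) -> int:
    """Return the shared free-tier pool for one calendar month.

    Hypothetical helper for planning purposes only.
    """
    if billed_users < 0:
        raise ValueError("billed_users must not be negative")
    return billed_users * CREDITS_PER_BILLED_USER


print(monthly_free_credits(8))  # 80000
```

The pool is shared across AI Assistant and Suggested Facts, so a single total is all that needs to be tracked.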

How Credits Are Consumed

Credit consumption varies based on:

  • Query complexity: Simple lookups use fewer credits than complex analysis

  • Context size: Longer conversations with more history consume more credits

  • Document scope: Queries across many documents use more credits than focused searches

Credit consumption is based on tokens (units of text processed). The number of tokens a query will use cannot be predicted precisely before it runs. For complete details on free-tier allocations and overage billing, see Subscriptions and Billing.

Monitoring Credit Usage

Account-Wide Tracking

Navigate to Account Settings > Usage to view:

  • Current month's free-tier allotment of AI credits

  • Month-to-date total AI credit consumption

  • Real-time updates as credits are consumed

Case-Level Tracking

  • Generate reports showing AI credit consumption by case (current month and up to 5 months prior)

  • Identify which matters are driving usage

  • Export reports for client billing or internal analysis

Conversation-Level Tracking

When using AI Assistant, each conversation displays its credit consumption in real time below the query box. The total shown covers the entire conversation, not individual messages.

Tip: Monitor conversation credit usage and start a new chat when consumption gets high to reset context and reduce ongoing costs.


Optimizing AI Assistant Usage

✅ Choose the Right Model

Casefleet offers three AI models with different cost-to-performance ratios. The selected model is displayed in the query box and can be changed at any time, even mid-conversation:

Model                   Credit Cost
Fast and Low-Cost       2-10 credits per 1k tokens
Balanced (default)      6-30 credits per 1k tokens
Premium Intelligence    30-150 credits per 1k tokens

Strategic model selection:

  • Start with Balanced for most queries

  • Switch to Fast and Low-Cost for simple follow-up questions or straightforward lookups

  • Upgrade to Premium Intelligence for complex multi-document analysis or detailed legal reasoning

  • You can switch models mid-conversation based on the complexity of each specific query
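
To compare models before running a query, the per-1k-token rates listed above can be turned into a rough low/high estimate. This is only a sketch: the token count is your own guess (actual tokens cannot be known in advance), and the function and dictionary names are hypothetical rather than anything Casefleet provides.

```python
# Credit rates per 1,000 tokens as (low, high), taken from the model list above.
MODEL_RATES = {
    "Fast and Low-Cost": (2, 10),
    "Balanced": (6, 30),
    "Premium Intelligence": (30, 150),
}


def estimate_credit_range(model: str, tokens: int) -> tuple[float, float]:
    """Return a (low, high) credit estimate for an assumed token count."""
    low_rate, high_rate = MODEL_RATES[model]
    thousands = tokens / 1000
    return (thousands * low_rate, thousands * high_rate)


# Assumed example: a query that processes about 4,000 tokens.
for name in MODEL_RATES:
    print(name, estimate_credit_range(name, 4000))
# Fast and Low-Cost (8.0, 40.0)
# Balanced (24.0, 120.0)
# Premium Intelligence (120.0, 600.0)
```

Because the ranges are wide, treat the result as an order-of-magnitude comparison between models rather than a forecast.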

✅ Narrow the Context

The most effective way to reduce credit consumption is to help the AI focus on relevant information. A few strategies:

  • Reference specific documents by name or date

  • Use document tags: "In documents tagged as witness statements..."

  • Select particular sources within the conversation interface

  • Ask focused questions rather than broad queries

Example:

  • Good: "Review the medical records from Dr. Smith dated March 2024"

  • Bad: "Tell me everything about this case"

✅ Monitor Conversation Length

Longer conversations consume more credits because the AI maintains context from previous messages. Start a new chat when:

  • Conversation credit usage becomes high (generally when context approaches capacity)

  • Shifting to a completely different topic or document set

  • You notice the AI's responses becoming less precise about earlier parts of the conversation

Understanding context compaction: Casefleet automatically compacts context when conversations grow large (you'll see a notification when this occurs). Compaction helps manage credit consumption, but it makes the AI's memory of earlier conversation details less precise. Starting a fresh conversation preserves response quality and is the most efficient approach for both accuracy and cost.


Managing Suggested Facts Efficiently

Suggested Facts are generated on demand when you provide a prompt. Each prompt consumes credits based on document length and complexity. You can run multiple prompts on the same document to extract different types of facts.

Optimization Strategies

Be specific in your prompts:

  • Good: "Extract facts about medical treatments and procedures performed"

  • Bad: "Find important information"

Use selectively:

  • Prioritize key documents (depositions, medical records, expert reports)

  • Use manual fact creation for shorter, straightforward documents

Usage Alerts and Overages

Automatic Notifications

All account administrators receive an email notification when your account reaches its monthly free-tier limit for AI inference credits. The notification indicates:

  • Whether your account is opted in to overage billing

  • What happens next based on your settings

When You Reach the Free Tier

If Overages Are Disabled (default):

  • All features using AI credits pause immediately until the 1st of the next month

  • Your existing data remains fully accessible

  • All other Casefleet work continues normally

If Overages Are Enabled:

  • AI features continue working without interruption

  • Usage beyond the free tier is billed at $0.001 per credit

  • Charges are billed monthly in arrears on the 1st
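
As a worked example of the overage rate above, the charge is the number of credits used beyond the free tier multiplied by $0.001. The helper below is a hypothetical sketch for budgeting, not Casefleet's billing logic; it uses Decimal to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

# Overage price per credit in USD (figure from this article).
OVERAGE_RATE_USD = Decimal("0.001")


def overage_charge(credits_used: int, free_credits: int) -> Decimal:
    """Return the estimated overage charge in USD for one month."""
    extra_credits = max(0, credits_used - free_credits)
    return extra_credits * OVERAGE_RATE_USD


# Assumed example: 8 billed users (80,000 free credits), 95,000 credits used.
print(overage_charge(95_000, 80_000))  # 15.000
```

Usage at or below the free tier produces a charge of zero.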

Managing Overage Settings

Navigate to Account Settings > Usage to enable or disable overage billing. You can change this setting mid-month, but disabling overages doesn't cancel charges for credits already consumed.

Best Practices for Administrators

  • Monitor regularly: Check the Usage dashboard weekly and generate Case Reports monthly to understand patterns and identify high-usage cases.

  • Educate your team: Share guidelines on context narrowing, starting new conversations, and writing focused queries.

  • Use Case Reports for client billing: Export AI credit usage by case to determine if costs should be passed through to clients.

  • Strategically enable overages: Enable if you need uninterrupted AI access for urgent deadlines; keep disabled for strict cost control.

  • Optimize document processing: While Document Intelligence itself doesn't consume AI inference credits (it uses a separate page-based allocation), process critical documents strategically for better AI Assistant results.

  • Plan around monthly resets: If you're close to the limit late in the month, consider whether queries can wait until the 1st when allocations reset.
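
For planning around the monthly reset, it can help to check how many credits remain and how many days are left until the 1st. The sketch below is hypothetical: it takes figures you read off the Usage dashboard yourself, and it does not call any Casefleet API.

```python
from datetime import date


def days_until_reset(today: date) -> int:
    """Days until the 1st of the next calendar month, when allocations reset."""
    if today.month == 12:
        next_first = date(today.year + 1, 1, 1)
    else:
        next_first = date(today.year, today.month + 1, 1)
    return (next_first - today).days


def credits_remaining(free_credits: int, used_credits: int) -> int:
    """Free-tier credits left this month (never negative)."""
    return max(0, free_credits - used_credits)


# Assumed example: 80,000 free credits, 76,500 used, checked on January 28.
print(days_until_reset(date(2025, 1, 28)))  # 4
print(credits_remaining(80_000, 76_500))    # 3500
```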

Frequently Asked Questions

Q: Can I see which team members are using the most AI credits?
A: Individual user attribution is not currently available. Usage is tracked at the account and case level only.

Q: Do unused AI credits roll over to the next month?
A: No, unused allocations expire at the end of each calendar month.

Q: If I add users mid-month, do they get full AI credit allocations?
A: Yes. Your monthly free tier is calculated based on the maximum number of billed users active at any point during the month. If you add users mid-month, your total allocation increases immediately to reflect the new user count.
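
The "maximum number of billed users" rule in the answer above can be sketched as follows. The daily granularity of the user counts is an assumption made for illustration, and the function name is hypothetical; the article only states that the peak count during the month is used.

```python
# Free AI inference credits per billed user per month (figure from this article).
CREDITS_PER_BILLED_USER = 10_000


def free_tier_for_month(billed_user_counts: list[int]) -> int:
    """Return the free tier based on the peak billed-user count in the month.

    billed_user_counts holds one count per observation (assumed daily).
    """
    peak_users = max(billed_user_counts, default=0)
    return peak_users * CREDITS_PER_BILLED_USER


# Assumed example: 8 users at the start, 10 after mid-month additions, then 9.
print(free_tier_for_month([8, 8, 10, 9]))  # 100000
```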

Q: Can I set a spending limit for AI credit overages?
A: Spending limits are not currently available. To control costs, disable overage billing at Account Settings > Usage.

Q: Does running Document Intelligence affect later AI credit usage?
A: Document Intelligence processing itself doesn't consume AI inference credits. It uses a separate page-based allocation and enhances the AI Assistant's ability to work with your documents.
