Understanding AI Credits
AI inference credits power Casefleet's AI Assistant and Suggested Facts features. Each subscription includes 10,000 free AI inference credits per billed user per month, shared between both features. This allocation resets on the 1st of each calendar month and does not roll over.
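As a rough illustration of the allocation arithmetic, the sketch below multiplies the per-user allowance stated above by a billed-user count. The function name and the example user count are hypothetical; only the 10,000-credit figure comes from this article.

```python
# Per-user monthly allowance, taken from the paragraph above.
FREE_CREDITS_PER_USER = 10_000


def monthly_free_credits(billed_users: int) -> int:
    """Return the account's shared free-tier credit pool for one month."""
    if billed_users < 0:
        raise ValueError("billed_users must be non-negative")
    return billed_users * FREE_CREDITS_PER_USER


# Hypothetical account with 8 billed users.
print(monthly_free_credits(8))  # 80000
```

The pool is shared across AI Assistant and Suggested Facts, so the result is a single account-wide total rather than a per-feature budget.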
How Credits Are Consumed
Credit consumption varies based on:
Query complexity: Simple lookups use fewer credits than complex analysis
Context size: Longer conversations with more history consume more credits
Document scope: Queries across many documents use more credits than focused searches
Credits are based on tokens (units of text processed), so the exact cost of a query cannot be predicted before it runs. For complete details on free-tier allocations and overage billing, see Subscriptions and Billing.
Monitoring Credit Usage
Account-Wide Tracking
Navigate to Account Settings > Usage to view:
Current month's free-tier allotment of AI credits
Month-to-date total AI credit consumption
Real-time updates as credits are consumed
Case-Level Tracking
Navigate to Account Settings > Case Reports to:
Generate reports showing AI credit consumption by case (current month and up to 5 months prior)
Identify which matters are driving usage
Export reports for client billing or internal analysis
Conversation-Level Tracking
When using AI Assistant, each conversation displays its credit consumption in real time below the query box. The total shown represents the entire conversation, not individual messages.
Tip: Monitor conversation credit usage and start a new chat when consumption gets high to reset context and reduce ongoing costs.
Optimizing AI Assistant Usage
✅ Choose the Right Model
Casefleet offers three AI models with different cost-to-performance ratios. The selected model is displayed in the query box and can be changed at any time, even mid-conversation:
| Model | Credit Cost |
| --- | --- |
| Fast and Low-Cost | 2-10 credits per 1k tokens |
| Balanced (default) | 6-30 credits per 1k tokens |
| Premium Intelligence | 30-150 credits per 1k tokens |
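To compare models for a given workload, the per-1k-token ranges in the table can be turned into a rough low/high estimate. The sketch below is only an approximation: the token count is a hypothetical input (actual token usage cannot be known before a query runs), and the function name is illustrative.

```python
# Credit ranges per 1,000 tokens, copied from the table above.
CREDITS_PER_1K_TOKENS = {
    "Fast and Low-Cost": (2, 10),
    "Balanced": (6, 30),
    "Premium Intelligence": (30, 150),
}


def estimate_credit_range(model: str, tokens: int) -> tuple[float, float]:
    """Return a (low, high) credit estimate for an assumed token count."""
    low_rate, high_rate = CREDITS_PER_1K_TOKENS[model]
    thousands = tokens / 1000
    return (thousands * low_rate, thousands * high_rate)


# Hypothetical query that processes about 5,000 tokens.
for model in CREDITS_PER_1K_TOKENS:
    low, high = estimate_credit_range(model, 5_000)
    print(f"{model}: {low:.0f}-{high:.0f} credits")
```

For the assumed 5,000 tokens, the ranges work out to 10-50, 30-150, and 150-750 credits respectively, which shows how quickly the gap between models widens on larger queries.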
Strategic model selection:
Start with Balanced for most queries
Switch to Fast and Low-Cost for simple follow-up questions or straightforward lookups
Upgrade to Premium Intelligence for complex multi-document analysis or detailed legal reasoning
You can switch models mid-conversation based on the complexity of each specific query
✅ Narrow the Context
The most effective way to reduce credit consumption is to help the AI focus on relevant information:
Reference specific documents by name or date
Use document tags: "In documents tagged as witness statements..."
Select particular sources within the conversation interface
Ask focused questions rather than broad queries
Example:
Good: "Review the medical records from Dr. Smith dated March 2024"
Bad: "Tell me everything about this case"
✅ Monitor Conversation Length
Longer conversations consume more credits because the AI maintains context from previous messages. Start a new chat when:
Conversation credit usage becomes high (generally when context approaches capacity)
Shifting to a completely different topic or document set
You notice the AI's responses becoming less precise about earlier parts of the conversation
Understanding context compaction: Casefleet automatically compacts context when conversations grow large (you'll see a notification when this occurs). Compaction helps manage credit consumption, but the AI's recall of earlier conversation details becomes less precise afterward. Starting a fresh conversation preserves response quality and is the most efficient approach for both accuracy and cost.
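The effect of conversation length can be illustrated with a simplified model. The sketch below assumes that every new message causes the full conversation history to be processed again and that each turn adds a fixed number of tokens. Both are assumptions made for illustration, not a description of how Casefleet meters usage, and the model ignores context compaction entirely.

```python
def cumulative_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total tokens processed if turn k re-reads all k turns so far.

    Assumption: history is reprocessed in full on every turn.
    """
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


# Hypothetical workload: 20 turns of roughly 500 tokens each.
one_long_chat = cumulative_tokens(20, 500)
two_short_chats = 2 * cumulative_tokens(10, 500)

print(one_long_chat)    # 105000
print(two_short_chats)  # 55000
```

Under these assumptions, splitting the same 20 turns across two conversations processes roughly half as many tokens, which is consistent with the advice to start a new chat when the topic changes.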
Managing Suggested Facts Efficiently
Suggested Facts are generated on demand when you provide a prompt. Each prompt consumes credits based on document length and complexity. You can run multiple prompts on the same document to extract different types of facts.
Optimization Strategies
Be specific in your prompts:
Good: "Extract facts about medical treatments and procedures performed"
Bad: "Find important information"
Use selectively:
Prioritize key documents (depositions, medical records, expert reports)
Use manual fact creation for shorter, straightforward documents
Usage Alerts and Overages
Automatic Notifications
All account administrators receive email notifications when your account reaches its monthly free-tier limit for AI inference credits. These notifications indicate:
Whether your account is opted in to overage billing
What happens next based on your settings
When You Reach the Free Tier
If Overages Are Disabled (default):
All features using AI credits pause immediately until the 1st of the next month
Your existing data remains fully accessible
All other Casefleet work continues normally
If Overages Are Enabled:
AI features continue working without interruption
Usage beyond the free tier is billed at $0.001 per credit
Charges are billed monthly in arrears on the 1st
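As a worked example of the overage arithmetic, the sketch below applies the $0.001 per-credit rate to usage above the free tier. The function name and the example figures are hypothetical; the rate and the 10,000-credit allowance come from this article. `Decimal` is used to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

OVERAGE_RATE_USD = Decimal("0.001")  # per credit, from the list above
FREE_CREDITS_PER_USER = 10_000       # per billed user per month


def overage_charge(credits_used: int, billed_users: int) -> Decimal:
    """Return the overage charge in USD for one month of usage."""
    free_credits = billed_users * FREE_CREDITS_PER_USER
    overage_credits = max(0, credits_used - free_credits)
    return overage_credits * OVERAGE_RATE_USD


# Hypothetical month: 5 billed users, 62,500 credits consumed.
print(overage_charge(62_500, 5))  # 12.500
```

In this example the account has 50,000 free credits, so 12,500 credits are billed as overage, for a charge of $12.50. Usage at or below the free tier produces no charge.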
Managing Overage Settings
Navigate to Account Settings > Usage to enable or disable overage billing. You can change this setting mid-month, but disabling overages doesn't cancel charges for credits already consumed.
Best Practices for Administrators
Monitor regularly: Check the Usage dashboard weekly and generate Case Reports monthly to understand patterns and identify high-usage cases.
Educate your team: Share guidelines on context narrowing, starting new conversations, and writing focused queries.
Use Case Reports for client billing: Export AI credit usage by case to determine whether costs should be passed through to clients.
Strategically enable overages: Enable if you need uninterrupted AI access for urgent deadlines; keep disabled for strict cost control.
Optimize document processing: While Document Intelligence itself doesn't consume AI inference credits (it uses a separate page-based allocation), process critical documents strategically for better AI Assistant results.
Plan around monthly resets: If you're close to the limit late in the month, consider whether queries can wait until the 1st when allocations reset.
Frequently Asked Questions
Q: Can I see which team members are using the most AI credits?
A: Individual user attribution is not currently available. Usage is tracked at the account and case level only.
Q: Do unused AI credits roll over to the next month?
A: No, unused allocations expire at the end of each calendar month.
Q: If I add users mid-month, do they get full AI credit allocations?
A: Yes. Your monthly free tier is calculated based on the maximum number of billed users active at any point during the month. If you add users mid-month, your total allocation increases immediately to reflect the new user count.
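A minimal sketch of that rule, assuming billed-user counts are sampled once per day (the daily list and the function name are hypothetical), takes the peak count for the month and multiplies it by the per-user allowance:

```python
FREE_CREDITS_PER_USER = 10_000  # per billed user per month


def free_tier_for_month(daily_billed_users: list[int]) -> int:
    """Return the month's free tier based on the peak billed-user count."""
    peak_users = max(daily_billed_users, default=0)
    return peak_users * FREE_CREDITS_PER_USER


# Hypothetical month: starts at 5 users, grows to 7, then drops to 6.
counts = [5] * 10 + [7] * 12 + [6] * 8
print(free_tier_for_month(counts))  # 70000
```

Because the calculation uses the maximum rather than the latest count, the allocation in this example stays at 70,000 credits even after the user count falls back to 6.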
Q: Can I set a spending limit for AI credit overages?
A: Spending limits are not currently available. To control costs, disable overage billing at Account Settings > Usage.
Q: Does running Document Intelligence affect later AI credit usage?
A: Document Intelligence processing itself doesn't consume AI inference credits. It uses a separate page-based allocation and enhances the AI Assistant's ability to work with your documents.
