Agent LLM Selection Guide
TriLuna offers multiple Large Language Model (LLM) options for your agents. Choosing the right LLM impacts your agent's intelligence, response quality, and conversation capabilities.
Available LLM Options
Your TriLuna dashboard includes an LLM dropdown menu where you can select from several powerful language models, each optimized for different use cases:
Current LLM Options
GPT-4 Series (Recommended for Most Use Cases)
- Best for: Complex reasoning, nuanced conversations, professional interactions
- Strengths: Excellent comprehension, context retention, creative problem-solving
- Use Cases: Customer service, sales, technical support, appointment setting
- Response Quality: Highest quality, most human-like responses
GPT-3.5 Turbo (Fast and Efficient)
- Best for: Quick interactions, high-volume calling, cost-conscious deployments
- Strengths: Fast response times, efficient processing, good general knowledge
- Use Cases: Lead qualification, appointment reminders, basic information gathering
- Response Quality: Good quality with faster response times
Claude Models (Alternative Option)
- Best for: Analytical tasks, detailed explanations, thoughtful responses
- Strengths: Careful reasoning, thorough responses, safety-focused
- Use Cases: Consultative selling, complex information gathering, detailed support
- Response Quality: High quality with emphasis on accuracy and helpfulness
How to Change Your Agent’s LLM
Using the Interactive LLM Selector
- Log into your TriLuna dashboard
- Navigate to My Agents
- Click on the agent you want to modify
- Find the AI Model section with an interactive dropdown
- Click the dropdown to see all available models organized by provider:
- Google: Gemini models (fast, efficient)
- OpenAI: GPT models (versatile, powerful)
- Anthropic: Claude models (thoughtful, safe)
- Select your preferred model - the change applies immediately
- Your agent will use the new LLM for all future conversations
Understanding the Numbers
Each model in the dropdown shows two important specifications:
Max Tokens (e.g., “8,192 tokens”)
- What it means: The maximum length of a single reply the model can generate
- Practical impact: Higher numbers = longer, more detailed responses possible
- Rough conversion: 4,096 tokens ≈ 3,000 words; 8,192 tokens ≈ 6,000 words
- Choose higher for: Detailed explanations, complex scenarios, thorough customer service
- Choose lower for: Quick interactions, brief responses, fast-paced conversations
Context Tokens (e.g., “128,000 tokens”)
- What it means: How much conversation history the model can remember and reference
- Practical impact: Higher numbers = better memory of earlier conversation parts
- Example: a 128,000-token window retains roughly the last 96,000 words of conversation
- Choose higher for: Long consultations, complex multi-topic discussions, detailed support calls
- Choose lower for: Simple, short interactions where conversation history isn’t critical
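The conversions above can be sketched numerically. A common rule of thumb for English text is roughly 0.75 words per token (an approximation, not a TriLuna-specific figure):

```python
# Rough token-to-word conversion, assuming ~0.75 words per English token.
WORDS_PER_TOKEN = 0.75

def tokens_to_words(tokens: int) -> int:
    """Estimate how many words fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)

print(tokens_to_words(4_096))    # 3072 -- max tokens: ~3,000 words per reply
print(tokens_to_words(8_192))    # 6144 -- ~6,000 words per reply
print(tokens_to_words(128_000))  # 96000 -- context: ~96,000 words of history
```

Actual word counts vary with vocabulary and language, so treat these as planning estimates rather than guarantees.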
Choosing the Right LLM
Consider Your Use Case
High-Touch Customer Service
Recommended: GPT-4 or Claude
- Handles complex customer issues with nuance
- Better at understanding emotional context
- More sophisticated problem-solving capabilities
- Superior at de-escalating difficult situations
Lead Qualification and Sales
Recommended: GPT-4
- Excellent at reading between the lines
- Strong persuasion and rapport-building abilities
- Good at asking qualifying questions naturally
- Effective at handling objections
High-Volume Appointment Setting
Recommended: GPT-3.5 Turbo
- Fast response times keep conversations flowing
- Efficient for straightforward scheduling tasks
- Cost-effective for large call volumes
- Adequate intelligence for routine interactions
Technical Support
Recommended: GPT-4 or Claude
- Better at understanding technical concepts
- More accurate troubleshooting guidance
- Superior at breaking down complex solutions
- Better context retention for multi-step processes
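The use-case recommendations above collapse into a simple lookup. This is a sketch only; the use-case keys are illustrative labels, not TriLuna configuration values:

```python
# Map each use case to the models this guide recommends (illustrative keys).
RECOMMENDED_MODELS = {
    "customer_service": ["GPT-4", "Claude"],    # nuance, de-escalation
    "lead_qualification": ["GPT-4"],            # rapport, objection handling
    "appointment_setting": ["GPT-3.5 Turbo"],   # speed, cost at volume
    "technical_support": ["GPT-4", "Claude"],   # accuracy, multi-step context
}

def recommend(use_case: str) -> list[str]:
    """Return the recommended models, defaulting to GPT-4 for unknown cases."""
    return RECOMMENDED_MODELS.get(use_case, ["GPT-4"])

print(recommend("appointment_setting"))  # ['GPT-3.5 Turbo']
```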
Performance vs. Cost Considerations
Premium Models (GPT-4, Claude)
- Higher Cost: More expensive per conversation
- Superior Quality: Better understanding and responses
- Best For: High-value interactions, complex use cases
- ROI Consideration: Higher conversion rates often justify increased costs
Efficient Models (GPT-3.5 Turbo)
- Lower Cost: More conversations per dollar
- Good Quality: Adequate for most routine interactions
- Best For: High-volume, straightforward use cases
- ROI Consideration: Cost savings enable higher call volumes
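One way to weigh this trade-off is a break-even check: a pricier model pays for itself when its extra conversions cover its extra per-conversation cost. The prices and rates below are illustrative placeholders, not TriLuna pricing:

```python
def value_per_call(cost_per_call: float, conversion_rate: float,
                   revenue_per_conversion: float) -> float:
    """Expected revenue minus cost for a single conversation."""
    return conversion_rate * revenue_per_conversion - cost_per_call

# Illustrative numbers only -- substitute your own costs and conversion data.
cheap = value_per_call(cost_per_call=0.05, conversion_rate=0.08,
                       revenue_per_conversion=50.0)    # GPT-3.5-style economics
premium = value_per_call(cost_per_call=0.25, conversion_rate=0.12,
                         revenue_per_conversion=50.0)  # GPT-4-style economics

print(cheap, premium)  # the higher expected value wins for this scenario
```

With these placeholder numbers the premium model comes out ahead despite costing five times more per call, which is the "higher conversion rates often justify increased costs" point in miniature.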
LLM Performance Characteristics
Response Time Comparison
- GPT-3.5 Turbo: ~1-2 seconds (fastest)
- GPT-4: ~2-4 seconds (moderate)
- Claude: ~2-5 seconds (varies by complexity)
Context Window Sizes
Different LLMs can remember different amounts of conversation history:
- GPT-4: Large context window - remembers entire conversations
- GPT-3.5 Turbo: Moderate context - good for most interactions
- Claude: Large context window - excellent memory for long conversations
Specialized Capabilities
GPT-4 Strengths
- Complex reasoning and analysis
- Creative problem-solving
- Understanding subtle context and implications
- Excellent at adapting communication style
GPT-3.5 Turbo Strengths
- Fast, efficient responses
- Good general knowledge
- Reliable performance for routine tasks
- Cost-effective scaling
Claude Strengths
- Careful, thoughtful responses
- Strong analytical capabilities
- Excellent safety and appropriateness
- Good at detailed explanations
Testing Different LLMs
A/B Testing Your LLM Choice
- Set Baseline: Use your current LLM for a week and track metrics
- Switch Models: Change to a different LLM for comparison
- Monitor Performance: Track conversation success rates, customer satisfaction
- Compare Results: Analyze which LLM performs better for your specific use case
Key Metrics to Track
- Conversation Success Rate: How often does the agent achieve the desired outcome?
- Customer Satisfaction: How do customers respond to different LLMs?
- Response Appropriateness: How well does the LLM understand context and respond appropriately?
- Efficiency: How quickly does the agent reach conversation goals?
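A minimal way to compare the two test periods is to compute each model's success rate from logged outcomes. This is a sketch; the outcome lists below are hypothetical, standing in for your own call records:

```python
def success_rate(outcomes: list[bool]) -> float:
    """Fraction of conversations that reached the desired outcome."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

# Hypothetical logs: True = goal achieved, False = not achieved.
baseline_week = [True, True, False, True, False, True, True, False]  # current LLM
test_week = [True, True, True, False, True, True, True, True]        # candidate LLM

print(f"baseline:  {success_rate(baseline_week):.1%}")
print(f"candidate: {success_rate(test_week):.1%}")
```

With real call volumes, make sure each period is long enough that the difference isn't noise before committing to a switch.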
LLM-Specific Configuration Tips
Optimizing System Prompts by LLM
For GPT-4:
GPT-4 handles complex, nuanced prompts very well. You can include detailed instructions about tone, personality, and specific behaviors. It excels with context-rich prompts that provide examples and edge case handling.
For GPT-3.5 Turbo:
Keep prompts clear and concise. Focus on specific, actionable instructions rather than nuanced personality descriptions. Works best with straightforward, goal-oriented prompts.
For Claude:
Claude responds well to structured, thoughtful prompts that emphasize helpfulness and accuracy. Include clear guidelines about when to be thorough vs. concise.
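To make the contrast concrete, here are sketch system prompts in the three styles described above. The agent name, business, and wording are illustrative examples, not TriLuna-supplied copy:

```python
# Illustrative prompt styles per model family -- adapt to your own agent.
PROMPT_STYLES = {
    # GPT-4: detailed, context-rich, with personality and edge cases.
    "gpt-4": (
        "You are Ava, a warm, patient support agent for Acme Dental. "
        "Mirror the caller's tone, acknowledge frustration before solving, "
        "and if billing comes up, summarize the dispute before transferring."
    ),
    # GPT-3.5 Turbo: short, specific, goal-oriented.
    "gpt-3.5-turbo": (
        "You book dental appointments. Ask for name, preferred date, and "
        "reason for visit. Confirm details, then end the call."
    ),
    # Claude: structured, accuracy-focused, thorough-vs-concise guidance.
    "claude": (
        "You are a support agent. Be accurate and helpful. Give thorough "
        "answers to technical questions; keep scheduling exchanges brief. "
        "If unsure, say so rather than guessing."
    ),
}
```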
Troubleshooting LLM Issues
Common Issues and Solutions
Agent Responses Too Slow
- Solution: Switch to GPT-3.5 Turbo for faster responses
- Check: Ensure system prompts aren’t overly complex
- Consider: Whether response quality vs. speed trade-off is acceptable
Agent Doesn’t Understand Context
- Solution: Upgrade to GPT-4 or Claude for better comprehension
- Check: System prompt clarity and examples
- Adjust: Behavior settings to be more explicit
Responses Too Expensive
- Solution: Switch to GPT-3.5 Turbo for cost efficiency
- Optimize: System prompts to be more concise
- Review: Whether premium model ROI justifies cost
Agent Responses Inappropriate
- Solution: Switch to Claude for more conservative responses
- Adjust: System prompt to include safety guidelines
- Review: Behavior settings for appropriateness rules
Best Practices Summary
- Start with GPT-4 for most use cases, then optimize based on performance
- Use GPT-3.5 Turbo for high-volume, routine interactions
- Choose Claude for conservative, analytical, or safety-critical applications
- Test thoroughly before making permanent changes
- Monitor performance metrics after LLM changes
- Adjust prompts and behaviors to match your chosen LLM’s strengths
- Consider cost vs. performance trade-offs for your specific use case
Need Help Choosing?
LLM selection can significantly impact your agent’s performance. Get expert guidance:
- Email our AI optimization team
- Schedule an LLM consultation through your dashboard
- Use the chat widget for quick LLM questions
- Request performance analysis of your current LLM choice