Overview
This guide covers configuring and optimizing AI response times and queuing settings to balance speed, accuracy, and a natural conversational feel.
Steps to Optimize
Set Up the AI Assistant
Create a new assistant and add an Active Tag to enable logging and monitoring.
Adjust Queuing Times
In the assistant’s settings, locate Wait Time and adjust it to fit the use case:
- Zero seconds — Instant responses, ideal for widgets or fast interactions
- 15+ seconds — More human-like delays for conversational realism
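As a rough illustration of the two ends of the Wait Time range, the sketch below maps a channel to a wait time. The channel names (`widget`, `sms`) and the `wait_time_for` helper are hypothetical and only stand in for whatever value you enter in the assistant's settings; they are not part of any platform API.

```python
# Hypothetical mapping from channel to Wait Time (seconds).
# The channel names are assumptions made for illustration only.
WAIT_TIME_SECONDS = {
    "widget": 0,   # instant responses for widgets or fast interactions
    "sms": 15,     # longer, more human-like delay for conversational realism
}


def wait_time_for(channel: str, default: int = 0) -> int:
    """Return the wait time in seconds for a channel, or the default if unknown."""
    return WAIT_TIME_SECONDS.get(channel, default)
```

Keeping the values in one table makes it easy to review which channels respond instantly and which use a delay.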
Optimize Prompt Design
- Minimize prompt length to reduce processing time
- Avoid verbose instructions — focus on concise, direct commands
- Use templates or structured prompts for clarity without bulk
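One possible way to keep prompts short and structured is a small template with named slots. The sketch below uses Python's standard `string.Template`; the field names (`role`, `task`, `rules`) and the `build_prompt` helper are illustrative assumptions, not a format the assistant requires.

```python
from string import Template

# A compact, structured prompt: one line per concern, no filler text.
# The three fields are assumptions chosen for illustration.
PROMPT = Template(
    "Role: $role\n"
    "Task: $task\n"
    "Rules: $rules"
)


def build_prompt(role: str, task: str, rules: list[str]) -> str:
    """Fill the template, joining the rules into a single short line."""
    return PROMPT.substitute(role=role, task=task, rules="; ".join(rules))
```

A template like this keeps each instruction to a direct command and makes it obvious when a prompt starts to grow.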
Evaluate Tools and Context
Analyze logs to identify the impact of:
- Tool calls integrated within the assistant
- Conversation history size (larger contexts slow processing)
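If the logs can be exported, a quick grouping by tool calls or history size may show which factor dominates. The sketch below assumes each log record is a dictionary with `tool_calls`, `history_messages`, and `latency_ms` keys; that record shape is hypothetical and will need adapting to the actual log export.

```python
from collections import defaultdict
from statistics import mean


def mean_latency_by(records: list[dict], key: str) -> dict:
    """Group records by `key` and return the mean latency (ms) per group.

    Assumes every record has the grouping key and a "latency_ms" field
    (a hypothetical log shape, not a documented export format).
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["latency_ms"])
    # Sort by group value so the trend is easy to read top to bottom.
    return {value: mean(latencies) for value, latencies in sorted(groups.items())}
```

Calling `mean_latency_by(records, "tool_calls")` and then `mean_latency_by(records, "history_messages")` gives two simple views of the same logs to compare.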
Incorporate Synthetic Delays (Optional)
For non-immediate workflows, add 5-10 second delays to allow time for workflow detection or tag processing.
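Where a delay has to be added in custom code rather than in the assistant's settings, it could look like the sketch below. The `send` callback and both helper names are hypothetical; the only detail taken from this guide is the 5-10 second range, which the code enforces by clamping.

```python
import time


def clamp_delay(seconds: float, low: float = 5.0, high: float = 10.0) -> float:
    """Keep the delay inside the 5-10 second range suggested above."""
    return max(low, min(high, seconds))


def reply_after_delay(send, message, seconds: float = 5.0, sleep=time.sleep):
    """Wait for the clamped delay, then pass the message to `send`.

    `send` is a hypothetical callback that delivers the reply. `sleep` is
    injectable so the function can be exercised without actually waiting.
    """
    sleep(clamp_delay(seconds))
    return send(message)
```

The pause gives workflow detection or tag processing time to finish before the reply goes out.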
Optimize Knowledge Bases
Connect relevant knowledge bases — but ensure they’re focused and not overly broad.
Key Factors Affecting Response Time
| Factor | Impact | Recommendation |
|---|---|---|
| Prompt length | Longer = slower | Keep concise |
| Number of tools | More tools = more processing | Only enable needed tools |
| Conversation history | Larger context = slower | Keep conversations focused |
| Knowledge base size | Larger = slower embedding | Use targeted FAQ entries |
| Model choice | Larger models are slower | Use gpt-4o-mini for speed |
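The first three rows of the table can be turned into a simple checklist, sketched below. The config dictionary shape and the threshold values are arbitrary assumptions for illustration; this guide does not define numeric limits, so tune them against your own logs.

```python
def audit(config: dict, max_prompt_chars: int = 2000,
          max_tools: int = 5, max_history: int = 20) -> list[str]:
    """Return the table's recommendations that apply to a hypothetical config.

    The thresholds are illustrative defaults, not documented limits.
    """
    notes = []
    if len(config.get("prompt", "")) > max_prompt_chars:
        notes.append("Prompt length: keep concise")
    if len(config.get("tools", [])) > max_tools:
        notes.append("Number of tools: only enable needed tools")
    if config.get("history_messages", 0) > max_history:
        notes.append("Conversation history: keep conversations focused")
    return notes
```

An empty result means none of the assumed thresholds were exceeded; it does not guarantee fast responses.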
Related Pages
- Knowledge Base Optimization: Build a better knowledge base
- Engine & Streaming: How the Flow Builder engine works
- API Key Setup: Choose the right model
