Overview

This guide covers configuring and optimizing AI response times and queuing settings to balance speed, accuracy, and a natural conversational feel.

Steps to Optimize

1. Set Up the AI Assistant

Create a new assistant and add an Active Tag to enable logging and monitoring.

2. Adjust Queuing Times

In the assistant’s settings, locate Wait Time and adjust:
  • Zero seconds — Instant responses, ideal for widgets or fast interactions
  • 15+ seconds — More human-like delays for conversational realism
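
The trade-off above can be sketched in code. This is a minimal, illustrative sketch; the function name and channel values are hypothetical, not part of any real assistant API.

```python
# Hypothetical sketch: picking a queuing wait time per interaction channel.
# Channel names and the fallback value are illustrative assumptions.

def pick_wait_time(channel: str) -> int:
    """Return a queuing delay (seconds) suited to the interaction style."""
    if channel == "widget":
        return 0    # instant responses for embedded widgets
    if channel == "chat":
        return 15   # human-like pacing for conversational channels
    return 5        # middle ground for anything else

print(pick_wait_time("widget"))  # → 0
print(pick_wait_time("chat"))    # → 15
```

The key design point is that wait time is a per-channel setting, not a global one: a widget and a conversational chat serve different expectations.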

3. Optimize Prompt Design

  • Minimize prompt length to reduce processing time
  • Avoid verbose instructions — focus on concise, direct commands
  • Use templates or structured prompts for clarity without bulk
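
To make the "concise over verbose" point concrete, here is an illustrative comparison. The prompts and the rough 4-characters-per-token estimate are assumptions for demonstration, not measurements from any specific model.

```python
# Illustrative only: a verbose prompt vs. a structured template with the same intent.
VERBOSE = (
    "You are a very helpful assistant and you should always try your best "
    "to answer the user's question about billing in a polite, friendly way."
)
TEMPLATE = "Role: billing support.\nTask: answer the user's question.\nTone: polite, brief."

def approx_tokens(text: str) -> int:
    """Rough token estimate (~4 characters per token) for comparing prompt sizes."""
    return max(1, len(text) // 4)

print(approx_tokens(VERBOSE), approx_tokens(TEMPLATE))
```

The structured template carries the same role, task, and tone in far fewer tokens, which is what reduces processing time.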

4. Evaluate Tools and Context

Analyze logs to identify the impact of:
  • Tool calls integrated within the assistant
  • Conversation history size (larger contexts slow processing)
Remove or simplify unnecessary tools and context.
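
One common way to keep context small is to cap conversation history before each model call. This sketch assumes the familiar role/content message format; the cap value and function name are illustrative.

```python
# Sketch: cap conversation history while preserving the system prompt.
# Message shape ({"role": ..., "content": ...}) and the cap are assumptions.

def trim_history(messages: list[dict], max_messages: int = 10) -> list[dict]:
    """Keep the system prompt plus only the most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]

history = [{"role": "system", "content": "Be brief."}] + [
    {"role": "user", "content": f"message {i}"} for i in range(50)
]
print(len(trim_history(history)))  # → 11 (system prompt + last 10 turns)
```

Trimming on every call keeps latency stable even in long-running conversations.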

5. Incorporate Synthetic Delays (Optional)

For non-immediate workflows, add 5-10 second delays to allow time for workflow detection or tag processing.
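
A synthetic delay can be as simple as sleeping before the reply is sent. The wrapper below is a hypothetical sketch; `generate_reply` stands in for whatever actually produces the assistant's answer.

```python
import time

# Sketch: pause before replying so background workflow detection or tag
# processing has time to finish. The default mirrors the guide's 5-10s range.

def reply_with_delay(generate_reply, delay_seconds: float = 5.0) -> str:
    time.sleep(delay_seconds)
    return generate_reply()

# Usage (shortened delay so the example runs quickly):
print(reply_with_delay(lambda: "Thanks for waiting!", delay_seconds=0.1))
```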

6. Optimize Knowledge Bases

Connect relevant knowledge bases — but ensure they’re focused and not overly broad.
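
"Focused, not broad" can be enforced mechanically by filtering entries to the assistant's scope before connecting them. The entry structure and topics below are illustrative assumptions.

```python
# Sketch: keep a knowledge base focused by filtering entries to one topic.
# The FAQ structure and topic labels are illustrative, not a real schema.

FAQ = [
    {"topic": "billing", "q": "How do I update my card?"},
    {"topic": "billing", "q": "Why was I charged twice?"},
    {"topic": "shipping", "q": "Where is my order?"},
]

def focused_kb(entries: list[dict], topic: str) -> list[dict]:
    """Return only the entries relevant to the assistant's scope."""
    return [e for e in entries if e["topic"] == topic]

print(len(focused_kb(FAQ, "billing")))  # → 2 of 3 entries kept
```

A smaller, targeted set of entries means less to embed and search, which is the speed win the table below describes.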

7. Test Robustness

Simulate varied scenarios and record response-time metrics for each:
  • Simple inquiries vs. complex requests
  • High vs. low context demands
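
Recording those metrics only needs a small timing harness. This is a sketch; `ask_assistant` is a hypothetical placeholder for a real API call.

```python
import time

# Sketch: time a set of test scenarios and collect per-prompt metrics.
# `ask_assistant` is a stand-in; swap in the real assistant call.

def ask_assistant(prompt: str) -> str:
    return f"Answer to: {prompt}"  # placeholder response

def measure(scenarios: list[str]) -> dict[str, float]:
    timings = {}
    for prompt in scenarios:
        start = time.perf_counter()
        ask_assistant(prompt)
        timings[prompt] = time.perf_counter() - start
    return timings

metrics = measure(["Reset my password", "Summarize my last 20 orders"])
for prompt, seconds in metrics.items():
    print(f"{prompt!r}: {seconds:.3f}s")
```

Running the same scenario set after each configuration change makes it easy to see whether a tweak actually improved latency.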

Key Factors Affecting Response Time

| Factor | Impact | Recommendation |
| --- | --- | --- |
| Prompt length | Longer = slower | Keep concise |
| Number of tools | More tools = more processing | Only enable needed tools |
| Conversation history | Larger context = slower | Keep conversations focused |
| Knowledge base size | Larger = slower embedding | Use targeted FAQ entries |
| Model choice | Larger models are slower | Use gpt-4o-mini for speed |

Related Guides

  • Knowledge Base Optimization: Build a better knowledge base
  • Engine & Streaming: How the Flow Builder engine works
  • API Key Setup: Choose the right model