AI chat concierge
A customer-facing chat grounded in your own content with retrieval-augmented generation and cited sources.
What it is
The concierge is a customer-facing chat widget that answers questions from your own content rather than acting as a generic bot. It uses retrieval-augmented generation (RAG): your content is embedded into a vector store, the most relevant chunks are retrieved for each question, and the answer cites the sources it used.
Knowledge sources
Point the concierge at:
- Site URLs: it crawls and indexes the pages you list.
- Uploaded documents: PDFs and Markdown files (product guides, policies, FAQs).
Under the hood, content is chunked (around 500 tokens per chunk), embedded with text-embedding-3-small (1536-dimensional), and retrieved by cosine similarity in pgvector. When you update a source, re-index to refresh the answers.
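As a rough illustration of the chunk-and-retrieve step, the sketch below splits text into fixed-size chunks and ranks stored vectors by cosine similarity. It is not the concierge's actual code: the function names `chunk_text`, `cosine_similarity`, and `top_k` are hypothetical, whitespace-separated words stand in for real tokens, and the tiny hand-written vectors stand in for 1536-dimensional embeddings. In production the similarity ranking would run inside pgvector rather than in Python.

```python
import math


def chunk_text(text: str, max_tokens: int = 500) -> list[str]:
    """Split text into chunks of at most max_tokens words.

    Assumption: words approximate tokens; a real tokenizer would differ.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float],
          indexed: list[tuple[str, list[float]]],
          k: int = 3) -> list[tuple[float, str]]:
    """Return the k (score, chunk_id) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    index = [("pricing", [1.0, 0.0]),
             ("returns", [0.0, 1.0]),
             ("plans", [1.0, 1.0])]
    for score, chunk_id in top_k([1.0, 0.0], index, k=2):
        print(f"{chunk_id}: {score:.3f}")
```

Only the top-ranked chunks are passed to the model, which is why a stale index produces stale answers until the source is re-indexed.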
Citations
Every answer can surface the source chunks it drew from, so a visitor (and you) can see why it answered the way it did. This keeps the concierge honest and debuggable.
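One common way to make answers citable, shown here only as a sketch, is to number the retrieved chunks in the prompt and then map the `[n]` markers in the model's answer back to their sources. The `Chunk` type, `build_prompt`, `extract_citations`, and the `[n]` marker convention are all assumptions for illustration; the concierge's own prompt format and citation mechanism are not described in this page.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # URL or document name the chunk came from
    text: str


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Number each retrieved chunk so the model can cite it as [n]."""
    numbered = "\n".join(
        f"[{i}] ({c.source}) {c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the numbered sources below. "
        "Cite each source you use as [n].\n\n"
        f"{numbered}\n\nQuestion: {question}"
    )


def extract_citations(answer: str, chunks: list[Chunk]) -> list[str]:
    """Map [n] markers in the answer to sources, in order of first use.

    Markers that point outside the retrieved chunks are ignored.
    """
    cited: list[str] = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        idx = int(match.group(1))
        if 1 <= idx <= len(chunks):
            source = chunks[idx - 1].source
            if source not in cited:
                cited.append(source)
    return cited


if __name__ == "__main__":
    retrieved = [Chunk("returns-policy.md", "Returns are accepted within 30 days."),
                 Chunk("https://example.com/shipping", "Shipping takes 3 to 5 days.")]
    print(build_prompt("Can I return an item?", retrieved))
    print(extract_citations("Yes, within 30 days [1].", retrieved))
```

Dropping out-of-range markers means a visitor is never shown a citation for a source that was not actually retrieved.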
Capture and escalation
The concierge isn't a dead end:
- It can capture a lead mid-conversation into the CRM with the transcript attached.
- When a visitor wants a human or a demo, it escalates to a lead form rather than looping.
Safety
Prompt-injection defences guard the retrieval and answer path, and the public chat endpoint is rate-limited per IP. The concierge answers only from indexed content; it has no access to your internal data.
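Per-IP rate limiting can be implemented in several ways; the sketch below uses a sliding window, which is one common choice and not necessarily the one the concierge uses. The class name `SlidingWindowLimiter` and the limits (20 requests per 60 seconds) are hypothetical values chosen for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self,
                 max_requests: int = 20,
                 window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        """Record a request from ip and report whether it is permitted."""
        now = self.clock()
        hits = self.hits[ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; rejected requests are not recorded
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
    print([limiter.allow("203.0.113.7") for _ in range(3)])
```

This version keeps state in process memory, so it only suits a single server instance; a deployment with several instances would need a shared store for the counters.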
Next
- CRM sync: what happens to a captured conversation.