
AI chat concierge

A customer-facing chat grounded in your own content with retrieval-augmented generation and cited sources.

What it is

The concierge is a customer-facing chat widget that answers questions from your own content rather than acting as a generic bot. It uses retrieval-augmented generation (RAG): your knowledge is embedded into a vector store, the most relevant chunks are retrieved for each question, and the answer cites the sources it used.

Knowledge sources

Point the concierge at:

  • Site URLs: it crawls and indexes the pages you list.
  • Uploaded documents: PDFs and Markdown files (product guides, policies, FAQs).

Under the hood, content is chunked (around 500 tokens per chunk), embedded with text-embedding-3-small (1536-dimensional vectors), and retrieved by cosine similarity in pgvector. When you update a source, re-index it so that answers reflect the new content.
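The chunk-and-retrieve steps can be illustrated with a minimal Python sketch. It is not the concierge's implementation: the function names are hypothetical, whitespace-separated words stand in for model tokens, and the similarity is computed in plain Python rather than in pgvector (where the `<=>` operator returns cosine distance, which is 1 minus this similarity).

```python
import math

CHUNK_TOKENS = 500  # approximate chunk size mentioned in the text


def chunk_text(text: str, size: int = CHUNK_TOKENS) -> list[str]:
    """Split text into chunks of at most `size` words.

    Assumption: words approximate tokens; a real tokenizer counts differently.
    """
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float],
          indexed: list[tuple[str, list[float]]],
          k: int = 3) -> list[tuple[float, str]]:
    """Return the k (score, chunk) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk) for chunk, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional vectors; real embeddings would have 1536 dimensions.
    index = [("Refund policy", [1.0, 0.0]), ("Shipping times", [0.0, 1.0])]
    for score, chunk in top_k([0.9, 0.1], index, k=1):
        print(chunk, round(score, 3))
```

In a production setup the ranking would typically be pushed into the database so that only the top matches leave the vector store.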

Citations

Every answer can surface the source chunks it drew from, so a visitor (and you) can see why it answered the way it did. This keeps the concierge honest and debuggable.
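One common way to make answers traceable is to number the retrieved chunks in the prompt and then map the markers in the answer back to their sources. The sketch below assumes that approach; the `Chunk` type, the `[n]` marker format, and both function names are hypothetical and are not taken from the concierge itself.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # hypothetical field: a page URL or document name
    text: str


def build_context(chunks: list[Chunk]) -> str:
    """Number each retrieved chunk so the model can refer to it as [1], [2], ..."""
    return "\n\n".join(
        f"[{i}] ({c.source})\n{c.text}" for i, c in enumerate(chunks, start=1)
    )


def cited_chunks(answer: str, chunks: list[Chunk]) -> list[Chunk]:
    """Return the chunks whose [n] marker appears in the answer, in first-use order.

    Markers that point outside the retrieved set are ignored.
    """
    seen: list[int] = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        n = int(match.group(1))
        if 1 <= n <= len(chunks) and n not in seen:
            seen.append(n)
    return [chunks[n - 1] for n in seen]
```

Ignoring out-of-range markers means a citation shown to the visitor always corresponds to a chunk that was actually retrieved.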

Capture and escalation

The concierge isn't a dead end:

  • It can capture a lead mid-conversation into the CRM with the transcript attached.
  • When a visitor wants a human or a demo, it escalates to a lead form rather than looping.

Safety

Prompt-injection defences guard the retrieval and answer path, and the public chat endpoint is rate-limited per IP. The concierge answers only from indexed content; it has no access to your internal data.
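Per-IP rate limiting can be sketched as a sliding-window counter. This is an illustration of the general technique only: the class name, the limit of 20 requests, and the 60-second window are assumptions, not the concierge's actual settings.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most `limit` requests per `window` seconds for each IP."""

    def __init__(self, limit: int = 20, window: float = 60.0, clock=time.monotonic):
        self.limit = limit      # hypothetical default
        self.window = window    # hypothetical default, in seconds
        self.clock = clock      # injectable so the limiter can be tested
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow(client_ip)` before doing any retrieval work and return an HTTP 429 response when it is false. This in-memory version only covers a single process; a deployment with several workers would need a shared store.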

Next

  • CRM sync: what happens to a captured conversation.
