Indexing Content

The Knowledge Base page shows the status of your indexed content and lets you manage the crawling and embedding process.

Navigate to: Inqyra > Knowledge Base

Figure: The Knowledge Base page with indexing statistics, progress bar, and chunk table.

How Indexing Works

Inqyra indexes your website content in three steps:

1. Crawl

Inqyra reads the published content on your WordPress site by visiting each page as a regular visitor would. It extracts text from the HTML tags you selected in the Crawling configuration.
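As a rough illustration of tag-based extraction, the sketch below collects text only from a chosen set of HTML tags. It is not Inqyra's crawler: the class name, function name, and default tag list are hypothetical, and it assumes well-formed HTML where every selected tag is closed.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collect text that appears inside a chosen set of HTML tags."""

    def __init__(self, wanted_tags):
        super().__init__()
        self.wanted_tags = set(wanted_tags)
        self.depth = 0    # number of selected tags currently open
        self.parts = []   # raw text fragments, joined at the end

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted_tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.wanted_tags and self.depth > 0:
            self.depth -= 1
            self.parts.append(" ")  # keep text from adjacent tags apart

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b> within <p>) is kept as well.
        if self.depth > 0:
            self.parts.append(data)


def extract_text(html, wanted_tags=("h1", "h2", "p", "li")):
    """Return the text inside the selected tags, with whitespace collapsed."""
    parser = TagTextExtractor(wanted_tags)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


page = "<nav>Menu</nav><h1>Pricing</h1><p>Plans start at <b>$5</b>.</p>"
print(extract_text(page))  # prints: Pricing Plans start at $5.
```

The navigation text is skipped because `nav` is not in the selected tag list, which is why choosing the right tags in the Crawling configuration affects what ends up in the index.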

2. Chunk

The extracted text is split into smaller pieces called chunks (approximately 1,000 characters each, with 200 characters of overlap between adjacent chunks). Small chunks keep search results specific, and the overlap preserves context that would otherwise be cut off at a chunk boundary.
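The sizes above can be pictured with a minimal sliding-window sketch. It assumes fixed-size character windows. Since the chunk size is described as approximate, the real splitter may well break on sentence or word boundaries instead. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text, size=1000, overlap=200):
    """Split text into windows of `size` characters that share `overlap` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap  # each new chunk starts 800 characters after the previous one
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(2500))  # 2,500 characters of dummy text
chunks = chunk_text(sample)
print([len(c) for c in chunks])            # prints: [1000, 1000, 900]
print(chunks[0][-200:] == chunks[1][:200])  # prints: True
```

With these numbers, a 2,500-character page yields three chunks, and the last 200 characters of each chunk reappear at the start of the next one.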

3. Embed

Each chunk is converted into a numerical vector (embedding) that captures its meaning. This allows Inqyra to find content that is semantically similar to a visitor's question, not just keyword matches.
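To make "semantically similar" concrete, the sketch below ranks chunks by cosine similarity between vectors. This is an assumption for illustration: the documentation does not state which similarity measure Inqyra uses. The vectors are toy values and the chunk names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_id, score) pairs, most similar first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
chunk_vecs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "opening-hours": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # stands in for a question such as "Can I get my money back?"

ranked = rank_chunks(query_vec, chunk_vecs)
print(ranked[0][0])  # prints: refund-policy
```

The query shares no keywords with the "refund-policy" chunk, yet it ranks first because the vectors point in a similar direction.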

Browser-based embeddings

Embeddings are generated in your browser using Transformers.js. This means no data is sent to any external embedding service. Keep your browser tab open while embeddings are being processed.

Statistics

At the top of the page, you'll see:

Metric            Description
Website Chunks    Number of chunks from crawled web pages
Document Chunks   Number of chunks from uploaded documents (Premium)
Total Chunks      Combined total
Searchable        Chunks that have embeddings and can be found by the chatbot
Pending           Chunks waiting for embeddings to be generated
Progress          Percentage of chunks that are searchable
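The relationship between these metrics can be sketched as simple arithmetic. The sketch assumes every chunk is either searchable or pending, and that the percentage is rounded to a whole number. Neither detail is confirmed by the page, and the function name is hypothetical.

```python
def indexing_stats(website_chunks, document_chunks, searchable):
    """Derive the total, pending, and progress figures from the raw counts."""
    total = website_chunks + document_chunks
    pending = total - searchable  # assumes a chunk is either searchable or pending
    progress = round(100 * searchable / total) if total else 0
    return {
        "total": total,
        "searchable": searchable,
        "pending": pending,
        "progress": progress,
    }


stats = indexing_stats(website_chunks=180, document_chunks=20, searchable=150)
print(stats)
# prints: {'total': 200, 'searchable': 150, 'pending': 50, 'progress': 75}
```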

Re-indexing Content

Click Re-index to recrawl all your content. This is useful when:

  • You've made significant changes to multiple pages
  • You've changed the post types or HTML tags to crawl
  • Content appears outdated in chatbot responses

Figure: The Re-index button and the progress bar, which shows crawling and embedding progress during re-indexing.

The progress bar shows two phases:

  1. Crawling — Reading pages and creating chunks
  2. Generating embeddings — Converting chunks to searchable vectors

Processing Pending Embeddings

If there are chunks without embeddings (shown as "Pending"), click Process Embeddings to generate them. The embedding process runs in your browser, and a progress indicator updates after each batch.
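Batch-by-batch processing can be pictured with the sketch below. It is illustrative only: the batch size of 10 is an assumption, `fake_embed` is a placeholder for the real in-browser embedding call, and the function names are hypothetical.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_pending(pending_ids, embed_batch, batch_size=10):
    """Embed pending chunks one batch at a time and record progress after each batch."""
    done = 0
    progress_log = []
    for batch in batches(pending_ids, batch_size):
        embed_batch(batch)  # placeholder for the real embedding step
        done += len(batch)
        progress_log.append(f"{done}/{len(pending_ids)}")
    return progress_log


def fake_embed(batch):
    """Stand-in for the embedding model: returns one dummy vector per chunk."""
    return [[0.0] for _ in batch]


log = process_pending(list(range(25)), fake_embed)
print(log)  # prints: ['10/25', '20/25', '25/25']
```

Because progress is recorded per batch, work that is interrupted (for example, by closing the tab) loses at most the batch in flight, and the remaining chunks simply stay pending.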

Auto-indexing After Post Save

When Auto Re-crawl is enabled, Inqyra automatically re-indexes content when you save a post or page:

  1. You save a post in the WordPress editor
  2. The post is queued for re-crawling
  3. After the page reloads, the content is crawled and embedded automatically in the background
  4. If the admin page doesn't process it, a cron fallback handles the embedding within 10 seconds

Knowledge Chunks Table

The bottom of the page shows a paginated table of all indexed chunks:

Column             Description
ID                 Chunk identifier
Source             URL or source where the content came from
Type               Website or Document
Content Preview    First ~20 words of the chunk
Embedding Status   Ready (searchable) or Pending
Updated            When the chunk was last updated

This table helps you verify that your content has been indexed correctly.