Knowledge Base Service

The Knowledge Base Service provides a centralized repository for managing and accessing personal data stored as files and documents. The service handles document storage, retrieval, and management while integrating with the AI Backend to enable intelligent search.

Architecture Overview

The Knowledge Base Service is built on a Node.js backend with ArangoDB for graph-based data persistence. The service consists of several key components:

  • Record Management - Core document storage and metadata handling
  • File Storage Integration - Connects with external storage services
  • Event Broadcasting - Kafka-based events for system integration
  • AI Indexing - Automatic content indexing for search capabilities

The service integrates with these components:

  • Storage Service - Handles actual file storage and versioning
  • AI Backend - Processes and indexes content for search
  • Enterprise Search - Provides search capabilities across the knowledge base
  • IAM Service - Handles user authentication and authorization
  • Configuration Manager - Manages application settings

Data Models

Records

Records represent the core entities in the knowledge base:

  • Metadata about stored content (name, type, source)
  • References to physical files in storage
  • Versioning information
  • Indexing status and history

File Records

File Records contain file-specific metadata:

  • File format information (extension, MIME type)
  • Size and storage information
  • Access URLs and paths
  • Checksum and integrity information

Knowledge Base

A Knowledge Base represents a collection of records:

  • Organizational grouping of records
  • Permission structure
  • Metadata about the collection

Record API

The Record API enables management of documents and files in the knowledge base.

Create Records

Upload and create new records in the knowledge base.

POST /api/v1/kb

Get Records

Retrieve records from the knowledge base, with optional filtering and pagination.

GET /api/v1/kb
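The exact query parameter names for filtering and pagination are not specified here, so the `page`, `limit`, and `recordType` parameters in this sketch are illustrative assumptions:

```typescript
// Build a GET /api/v1/kb URL with illustrative pagination/filter parameters.
// Parameter names (page, limit, recordType) are assumptions, not documented here.
function buildListRecordsUrl(
  baseUrl: string,
  opts: { page?: number; limit?: number; recordType?: string } = {},
): string {
  const url = new URL("/api/v1/kb", baseUrl);
  if (opts.page !== undefined) url.searchParams.set("page", String(opts.page));
  if (opts.limit !== undefined) url.searchParams.set("limit", String(opts.limit));
  if (opts.recordType !== undefined) url.searchParams.set("recordType", opts.recordType);
  return url.toString();
}

// e.g. buildListRecordsUrl("https://kb.example.com", { page: 2, limit: 50, recordType: "FILE" })
```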

Get Record By ID

Retrieve a specific record by its ID.

GET /api/v1/kb/record/:recordId

Update Record

Update a record with new data or file content.

PUT /api/v1/kb/record/:recordId

Delete Record

Soft-delete a record by setting its deleted flag.

DELETE /api/v1/kb/record/:recordId

Archive Record

Move a record into the archived state.

PATCH /api/v1/kb/record/:recordId/archive

Unarchive Record

Restore a record from archived state.

PATCH /api/v1/kb/record/:recordId/unarchive

Stream Record Content

Stream the actual file content of a record.

GET /api/v1/kb/stream/record/:recordId

Reindex Record

Force reindexing of a record by the AI Backend.

POST /api/v1/kb/reindex/record/:recordId

Event System

The Knowledge Base Service broadcasts events through Kafka to notify other services about record changes. These events trigger actions like content indexing, search updates, and audit logging.

Event Types

Event Type     Description
-------------  ------------------------------------------
newRecord      Triggered when a new record is created
updateRecord   Triggered when a record is updated
deletedRecord  Triggered when a record is deleted
reindexRecord  Triggered when a record needs reindexing
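A downstream consumer might dispatch on the event type. The sketch below is illustrative only: the envelope shape (`eventType` plus `payload`) and the handler actions are assumptions, not part of this document's contract:

```typescript
// Illustrative dispatcher for the four Knowledge Base event types.
// The envelope shape ({ eventType, payload }) is an assumption for this sketch.
type KbEventType = "newRecord" | "updateRecord" | "deletedRecord" | "reindexRecord";

interface KbPayload {
  recordId: string;
}

interface KbEventEnvelope {
  eventType: KbEventType;
  payload: KbPayload;
}

// Each handler returns a description of the action a consumer would take.
const handlers: Record<KbEventType, (p: KbPayload) => string> = {
  newRecord: (p) => `index record ${p.recordId}`,
  updateRecord: (p) => `re-index record ${p.recordId}`,
  deletedRecord: (p) => `remove record ${p.recordId} from the index`,
  reindexRecord: (p) => `force re-index of record ${p.recordId}`,
};

function handleKbEvent(event: KbEventEnvelope): string {
  return handlers[event.eventType](event.payload);
}
```

In a real consumer, `handleKbEvent` would be called from the Kafka message loop (e.g. kafkajs's `consumer.run({ eachMessage })`) after parsing the message value.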

Event Payload Structure

New Record Event

{
  "orgId": "org123456",
  "recordId": "rec789012",
  "recordName": "Company Policy.pdf",
  "recordType": "FILE",
  "version": 1,
  "signedUrlRoute": "https://storage.example.com/api/v1/document/internal/doc123456/download",
  "origin": "UPLOAD",
  "extension": "pdf",
  "createdAtTimestamp": "1714208400000",
  "updatedAtTimestamp": "1714208400000",
  "sourceCreatedAtTimestamp": "1714208400000"
}

Update Record Event

{
  "orgId": "org123456",
  "recordId": "rec789012",
  "version": 2,
  "signedUrlRoute": "https://storage.example.com/api/v1/document/internal/doc123456/download",
  "updatedAtTimestamp": "1714294800000",
  "sourceLastModifiedTimestamp": "1714294800000"
}

Deleted Record Event

{
  "orgId": "org123456",
  "recordId": "rec789012",
  "version": 2
}
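The three payloads above can be described with TypeScript interfaces derived directly from the example JSON (note that the timestamps are string-encoded epoch milliseconds in the examples):

```typescript
// Typed views of the event payloads shown above, derived from the example JSON.
interface NewRecordEvent {
  orgId: string;
  recordId: string;
  recordName: string;
  recordType: string; // e.g. "FILE"
  version: number;
  signedUrlRoute: string; // download URL for the file content
  origin: string; // e.g. "UPLOAD"
  extension: string;
  createdAtTimestamp: string; // epoch milliseconds, string-encoded
  updatedAtTimestamp: string;
  sourceCreatedAtTimestamp: string;
}

interface UpdateRecordEvent {
  orgId: string;
  recordId: string;
  version: number;
  signedUrlRoute: string;
  updatedAtTimestamp: string;
  sourceLastModifiedTimestamp: string;
}

interface DeletedRecordEvent {
  orgId: string;
  recordId: string;
  version: number;
}

// Runtime check for the minimal fields shared by all three payloads.
function isKbEventPayload(v: unknown): v is DeletedRecordEvent {
  const o = v as Record<string, unknown>;
  return (
    typeof o === "object" && o !== null &&
    typeof o.orgId === "string" &&
    typeof o.recordId === "string" &&
    typeof o.version === "number"
  );
}
```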

Integration with AI Indexing

When a record is created or updated, the Knowledge Base Service:

  1. Stores the file metadata in ArangoDB
  2. Uploads the file content to the Storage Service
  3. Publishes a Kafka event with the file metadata and download URL

The AI Backend then consumes these events and:

  • Downloads the file content
  • Extracts text and structured data
  • Processes the content for search indexing
  • Updates the indexing status in the Knowledge Base Service

This integration enables Enterprise Search to provide intelligent search capabilities across the entire knowledge base.
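The event-publishing step of this flow can be sketched as assembling the newRecord payload from stored metadata. The helper, its input shape, and the Kafka topic name mentioned in the comment are assumptions for illustration; the field names follow the example payloads:

```typescript
// Sketch of assembling the newRecord event from stored metadata.
// Input shape and helper name are illustrative, not part of the documented API.
interface RecordMeta {
  orgId: string;
  recordId: string;
  recordName: string;
  extension: string;
  signedUrlRoute: string;
}

function buildNewRecordEvent(meta: RecordMeta, nowMs: number) {
  const ts = String(nowMs); // timestamps are string-encoded epoch milliseconds
  return {
    orgId: meta.orgId,
    recordId: meta.recordId,
    recordName: meta.recordName,
    recordType: "FILE",
    version: 1, // a newly created record starts at version 1
    signedUrlRoute: meta.signedUrlRoute,
    origin: "UPLOAD",
    extension: meta.extension,
    createdAtTimestamp: ts,
    updatedAtTimestamp: ts,
    sourceCreatedAtTimestamp: ts,
  };
}

// Publishing would then hand this object to a Kafka producer, e.g. with kafkajs:
//   await producer.send({ topic: "kb-events", messages: [{ value: JSON.stringify(event) }] });
// (topic name assumed for illustration)
```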

Schema Definitions