
Knowledge Base API

The Knowledge Base Service provides a centralized repository for managing and accessing personal data through files and documents. This service enables document storage, retrieval, and management while integrating with the AI backend for intelligent search capabilities.

Base URL

All endpoints are prefixed with /api/v1/kb

Authentication

All endpoints require authentication via Bearer token:
Authorization: Bearer YOUR_TOKEN
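As a minimal sketch of the header above, a client might centralize header construction in one place. The function name `authHeaders` is hypothetical, and the JSON `Content-Type` is an assumption that only suits the JSON endpoints (the upload endpoints use multipart/form-data instead).

```typescript
// Hypothetical helper, not part of the service: builds the headers for an
// authenticated JSON request to any /api/v1/kb endpoint.
function authHeaders(token: string): Record<string, string> {
  if (token.trim() === "") {
    throw new Error("A non-empty bearer token is required");
  }
  return {
    Authorization: `Bearer ${token}`,
    // Assumption: JSON body. Omit this header for multipart uploads.
    "Content-Type": "application/json",
  };
}
```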

Architecture Overview

The Knowledge Base Service is built on a Node.js backend with ArangoDB for graph-based data persistence. The service consists of several key components:
  • Record Management - Core document storage and metadata handling
  • File Storage Integration - Connects with external storage services
  • Event Broadcasting - Kafka-based events for system integration
  • AI Indexing - Automatic content indexing for search capabilities
The service integrates with these components:
  • Storage Service - Handles actual file storage and versioning
  • AI Backend - Processes and indexes content for search
  • Enterprise Search - Provides search capabilities across the knowledge base
  • IAM Service - Handles user authentication and authorization
  • Configuration Manager - Manages application settings

Data Models

Records

Records represent the core entities in the knowledge base:
  • Metadata about stored content (name, type, source)
  • References to physical files in storage
  • Versioning information
  • Indexing status and history

File Records

File Records contain file-specific metadata:
  • File format information (extension, MIME type)
  • Size and storage information
  • Access URLs and paths
  • Checksum and integrity information

Knowledge Base

Knowledge Base represents a collection of records:
  • Organizational grouping of records
  • Permission structure
  • Metadata about the collection

API Endpoints

Manage knowledge base instances including creation, listing, updates, and deletion.
Create a new knowledge base for organizing documents.
Endpoint: POST /api/v1/kb/
Request Body Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbName | string | Yes | Name for the knowledge base (1-255 characters) |
{
  "kbName": "Company Documents"
}
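The name constraint can be checked on the client before the request is sent. The following is a sketch only: `buildCreateKbRequest` is a hypothetical helper, and trimming whitespace before measuring the length is an assumption, since the source does not say how the service counts characters.

```typescript
interface CreateKbBody {
  kbName: string;
}

// Hypothetical helper: validates the name and returns the pieces of the request.
function buildCreateKbRequest(kbName: string): { method: string; path: string; body: string } {
  const trimmed = kbName.trim(); // assumption: surrounding whitespace is not meaningful
  if (trimmed.length < 1 || trimmed.length > 255) {
    throw new RangeError("kbName must be 1-255 characters long");
  }
  const payload: CreateKbBody = { kbName: trimmed };
  return { method: "POST", path: "/api/v1/kb/", body: JSON.stringify(payload) };
}
```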
Retrieve all knowledge bases accessible to the user with filtering and pagination.
Endpoint: GET /api/v1/kb/
Query Parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| page | number | 1 | Page number for pagination |
| limit | number | 20 | Number of items per page (1-100) |
| search | string | - | Search term for filtering knowledge bases (max 100 characters) |
| permissions | string | - | Comma-separated list of permissions to filter by |
| sortBy | string | createdAt | Field to sort by (createdAt, updatedAt, name) |
| sortOrder | string | desc | Sort order (asc, desc) |
GET /api/v1/kb/?page=1&limit=10&search=documents&permissions=OWNER,ADMIN&sortBy=name&sortOrder=asc
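A query string like the one above can be assembled from typed options. This sketch assumes a hypothetical `buildListKbPath` helper; clamping `limit` into the documented 1-100 range and rejecting over-long search terms are client-side choices, not documented server behavior.

```typescript
interface ListKbQuery {
  page?: number;
  limit?: number;
  search?: string;
  permissions?: string[];
  sortBy?: "createdAt" | "updatedAt" | "name";
  sortOrder?: "asc" | "desc";
}

// Hypothetical helper: builds the request path for listing knowledge bases.
function buildListKbPath(q: ListKbQuery = {}): string {
  const parts: string[] = [];
  if (q.page !== undefined) {
    parts.push(`page=${Math.max(1, Math.floor(q.page))}`);
  }
  if (q.limit !== undefined) {
    // Clamp into the documented 1-100 range.
    parts.push(`limit=${Math.min(100, Math.max(1, Math.floor(q.limit)))}`);
  }
  if (q.search) {
    if (q.search.length > 100) {
      throw new RangeError("search must be at most 100 characters");
    }
    parts.push(`search=${encodeURIComponent(q.search)}`);
  }
  if (q.permissions && q.permissions.length > 0) {
    // Encode each value but keep literal commas between them, as in the example request.
    parts.push(`permissions=${q.permissions.map((p) => encodeURIComponent(p)).join(",")}`);
  }
  if (q.sortBy) parts.push(`sortBy=${q.sortBy}`);
  if (q.sortOrder) parts.push(`sortOrder=${q.sortOrder}`);
  return parts.length > 0 ? `/api/v1/kb/?${parts.join("&")}` : "/api/v1/kb/";
}
```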
Retrieve a specific knowledge base by its ID.
Endpoint: GET /api/v1/kb/:kbId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
Update knowledge base properties.
Endpoint: PUT /api/v1/kb/:kbId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
Request Body Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | New name for the knowledge base |
{
  "name": "Updated Company Documents"
}
Soft-delete a knowledge base and all its contents.
Endpoint: DELETE /api/v1/kb/:kbId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
Manage individual records within knowledge bases including creation, retrieval, updates, and deletion.
Retrieve all records accessible to the user across all knowledge bases.
Endpoint: GET /api/v1/kb/records
Query Parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| page | number | 1 | Page number for pagination |
| limit | number | 20 | Number of items per page (1-100) |
| search | string | - | Search term for filtering records (max 100 characters) |
| recordTypes | string | - | Comma-separated list of record types (FILE, EMAIL, WEBPAGE, MESSAGE, OTHERS) |
| origins | string | - | Comma-separated list of origins (UPLOAD, CONNECTOR) |
| connectors | string | - | Comma-separated list of connector names |
| indexingStatus | string | - | Comma-separated list of indexing statuses (NOT_STARTED, IN_PROGRESS, FAILED, COMPLETED) |
| permissions | string | - | Comma-separated list of permissions to filter by |
| dateFrom | number | - | Filter by creation timestamp (milliseconds) |
| dateTo | number | - | Filter by creation timestamp (milliseconds) |
| sortBy | string | createdAtTimestamp | Field to sort by |
| sortOrder | string | desc | Sort order (asc, desc) |
| source | string | all | Source filter (all, local, connector) |
GET /api/v1/kb/records?page=1&limit=20&recordTypes=FILE&origins=UPLOAD&indexingStatus=COMPLETED&sortBy=createdAtTimestamp&sortOrder=desc
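Because several record filters take comma-separated enumerations, a client may want to check values before sending them. The sketch below is illustrative: `csvParam` and `buildRecordsQuery` are hypothetical names, the allowed values are copied from the parameter descriptions, and the `dateFrom`/`dateTo` ordering check is an assumption about what a sensible range is.

```typescript
// Allowed values copied from the query parameter descriptions.
const RECORD_TYPES: readonly string[] = ["FILE", "EMAIL", "WEBPAGE", "MESSAGE", "OTHERS"];
const ORIGINS: readonly string[] = ["UPLOAD", "CONNECTOR"];
const INDEXING_STATUSES: readonly string[] = ["NOT_STARTED", "IN_PROGRESS", "FAILED", "COMPLETED"];

interface RecordFilters {
  recordTypes?: string[];
  origins?: string[];
  indexingStatus?: string[];
  dateFrom?: number; // milliseconds
  dateTo?: number;   // milliseconds
}

// Joins a list into one comma-separated parameter after checking every value.
function csvParam(key: string, values: string[], allowed: readonly string[]): string {
  const rejected = values.filter((v) => allowed.indexOf(v) === -1);
  if (rejected.length > 0) {
    throw new Error(`Unsupported ${key}: ${rejected.join(", ")}`);
  }
  return `${key}=${values.join(",")}`;
}

// Hypothetical helper: returns the filter portion of the query string.
function buildRecordsQuery(f: RecordFilters): string {
  const parts: string[] = [];
  if (f.recordTypes && f.recordTypes.length > 0) {
    parts.push(csvParam("recordTypes", f.recordTypes, RECORD_TYPES));
  }
  if (f.origins && f.origins.length > 0) {
    parts.push(csvParam("origins", f.origins, ORIGINS));
  }
  if (f.indexingStatus && f.indexingStatus.length > 0) {
    parts.push(csvParam("indexingStatus", f.indexingStatus, INDEXING_STATUSES));
  }
  if (f.dateFrom !== undefined && f.dateTo !== undefined && f.dateFrom > f.dateTo) {
    throw new RangeError("dateFrom must not be later than dateTo");
  }
  if (f.dateFrom !== undefined) parts.push(`dateFrom=${f.dateFrom}`);
  if (f.dateTo !== undefined) parts.push(`dateTo=${f.dateTo}`);
  return parts.join("&");
}
```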
Retrieve all records within a specific knowledge base.
Endpoint: GET /api/v1/kb/:kbId/records
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
Query Parameters: Same as “Get All Records” endpoint
Retrieve a specific record by its ID with full metadata.
Endpoint: GET /api/v1/kb/record/:recordId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| recordId | string | Yes | Record UUID |
Update a record with new metadata or file content.
Endpoint: PUT /api/v1/kb/record/:recordId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| recordId | string | Yes | Record UUID |
Request Body Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| recordName | string | No | New name for the record |
| file | file | No | New file content (creates a new version) |
Content-Type: multipart/form-data (when uploading file)
File Constraints:
  • Maximum file size: 30MB
  • Supported MIME types: Based on extensionToMimeType mapping
  • File extension must match existing record’s extension for versioned updates
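# BASE_URL is a placeholder for your deployment's origin (for example https://kb.example.com)
curl -X PUT \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "recordName=Updated Company Policy.pdf" \
  -F "file=@updated_policy.pdf" \
  "$BASE_URL/api/v1/kb/record/rec789012"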
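The file constraints for a versioned update can be pre-checked before uploading. This is a sketch under stated assumptions: `checkVersionedUpdate` is a hypothetical helper, "30MB" is read as 30 × 1024 × 1024 bytes, and extensions are compared case-insensitively. The service's own validation remains authoritative.

```typescript
// Assumption: "30MB" means 30 MiB; the source does not specify the unit precisely.
const MAX_FILE_BYTES = 30 * 1024 * 1024;

// Returns the lowercase extension without the dot, or "" when there is none.
function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot > 0 ? fileName.slice(dot + 1).toLowerCase() : "";
}

// Hypothetical pre-check: returns a list of problems (empty when the update looks acceptable).
function checkVersionedUpdate(existingExtension: string, newFileName: string, sizeInBytes: number): string[] {
  const problems: string[] = [];
  if (sizeInBytes > MAX_FILE_BYTES) {
    problems.push("File exceeds the 30MB limit");
  }
  if (extensionOf(newFileName) !== existingExtension.toLowerCase()) {
    problems.push(`Extension must match the existing record (.${existingExtension})`);
  }
  return problems;
}
```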
Soft-delete a record by setting its deleted flag.
Endpoint: DELETE /api/v1/kb/record/:recordId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| recordId | string | Yes | Record UUID |
Stream the actual file content of a record for download.
Endpoint: GET /api/v1/kb/stream/record/:recordId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| recordId | string | Yes | Record UUID |
Force reindexing of a record by the AI backend.
Endpoint: POST /api/v1/kb/reindex/record/:recordId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| recordId | string | Yes | Record UUID |
Upload files and folders to knowledge bases with support for bulk operations and folder structures.
Upload files directly to a knowledge base, creating records and folder structures as needed.
Endpoint: POST /api/v1/kb/:kbId/upload
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
Request Body Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| files | file[] | Yes | Array of files to upload (max 1000 files, 30MB per file) |
| file_paths | string[] | Yes | Array of file paths corresponding to files |
| last_modified | number[] | Yes | Array of last modified timestamps (milliseconds) |
| isVersioned | boolean | No | Whether files should be versioned (default: true) |
Content-Type: multipart/form-data
File Constraints:
  • Maximum files per upload: 1000
  • Maximum file size: 30MB per file
  • Supported MIME types: Based on extensionToMimeType mapping
  • File paths must be unique and not contain invalid characters
# BASE_URL is a placeholder for your deployment's origin (for example https://kb.example.com)
curl -X POST \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "files=@document1.pdf" \
  -F "files=@folder/document2.docx" \
  -F "file_paths=document1.pdf" \
  -F "file_paths=folder/document2.docx" \
  -F "last_modified=1714208400000" \
  -F "last_modified=1714208500000" \
  -F "isVersioned=true" \
  "$BASE_URL/api/v1/kb/kb123456/upload"
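Since files, file_paths, and last_modified are parallel arrays, one way to keep them aligned is to model each upload as a single object and derive the arrays at the end. The sketch below assumes hypothetical names (`UploadCandidate`, `validateUploadBatch`, `toFormFields`) and reads "30MB" as 30 × 1024 × 1024 bytes; it mirrors the documented constraints but does not replace server-side validation.

```typescript
const MAX_FILES_PER_UPLOAD = 1000;
const MAX_UPLOAD_BYTES = 30 * 1024 * 1024; // assumption: "30MB" read as 30 MiB

interface UploadCandidate {
  path: string;         // becomes one file_paths entry
  lastModified: number; // becomes one last_modified entry (milliseconds)
  sizeInBytes: number;
}

// Hypothetical pre-check: returns a list of problems (empty when the batch looks acceptable).
function validateUploadBatch(files: UploadCandidate[]): string[] {
  const problems: string[] = [];
  if (files.length === 0) problems.push("At least one file is required");
  if (files.length > MAX_FILES_PER_UPLOAD) problems.push("At most 1000 files per upload");
  const paths = files.map((f) => f.path);
  files.forEach((f, i) => {
    if (f.sizeInBytes > MAX_UPLOAD_BYTES) problems.push(`${f.path}: larger than 30MB`);
    // A path whose first occurrence is at an earlier index is a duplicate.
    if (paths.indexOf(f.path) !== i) problems.push(`${f.path}: duplicate file path`);
  });
  return problems;
}

// Derives the parallel form fields so that index i always refers to the same file.
function toFormFields(files: UploadCandidate[]): { file_paths: string[]; last_modified: string[] } {
  return {
    file_paths: files.map((f) => f.path),
    last_modified: files.map((f) => String(f.lastModified)),
  };
}
```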
Upload files to a specific folder within a knowledge base.
Endpoint: POST /api/v1/kb/:kbId/folder/:folderId/upload
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
| folderId | string | Yes | Target folder UUID |
Request Body Parameters: Same as “Upload to Knowledge Base”
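# BASE_URL is a placeholder for your deployment's origin (for example https://kb.example.com)
curl -X POST \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "files=@document.pdf" \
  -F "file_paths=document.pdf" \
  -F "last_modified=1714208400000" \
  "$BASE_URL/api/v1/kb/kb123456/folder/folder456/upload"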
Manage folder structures within knowledge bases including content retrieval, updates, and deletion.
Retrieve all records and subfolders within a specific folder.
Endpoint: GET /api/v1/kb/:kbId/folder/:folderId/records
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
| folderId | string | Yes | Folder UUID |
Query Parameters: Same filtering and pagination parameters as record endpoints
Update folder properties such as name.
Endpoint: PUT /api/v1/kb/:kbId/folder/:folderId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
| folderId | string | Yes | Folder UUID |
Request Body Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| folderName | string | Yes | New name for the folder |
{
  "folderName": "Updated Folder Name"
}
Delete a folder and all its contents.
Endpoint: DELETE /api/v1/kb/:kbId/folder/:folderId
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| kbId | string | Yes | Knowledge base UUID |
| folderId | string | Yes | Folder UUID |
Administrative endpoints for managing connectors, bulk operations, and system maintenance.
Retrieve statistics for a specific connector.
Access Control: Admin users only
Endpoint: GET /api/v1/kb/stats/:connector
Path Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| connector | string | Yes | Connector name (onedrive, google_drive, confluence, slack, etc.) |
Force reindexing of all failed records for a specific connector.
Access Control: Admin users only
Endpoint: POST /api/v1/kb/reindex-all/connector
Request Body Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| connector | string | Yes | Connector name to reindex |
| orgId | string | No | Organization ID (defaults to user’s org) |
{
  "connector": "google_drive",
  "orgId": "org123456"
}
Force resynchronization of all records for a specific connector.
Access Control: Admin users only
Endpoint: POST /api/v1/kb/resync/connector
Request Body Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| connector | string | Yes | Connector name to resync |
| orgId | string | No | Organization ID (defaults to user’s org) |
| fullResync | boolean | No | Whether to perform full resync (default: false) |
{
  "connector": "google_drive",
  "orgId": "org123456",
  "fullResync": true
}

Event System

The Knowledge Base Service broadcasts events through Kafka to notify other services about record changes. These events trigger actions like content indexing, search updates, and audit logging.

Event Types

| Event Type | Description |
| --- | --- |
| newRecord | Triggered when a new record is created |
| updateRecord | Triggered when a record is updated |
| deletedRecord | Triggered when a record is deleted |
| reindexRecord | Triggered when a record needs reindexing |

Event Payload Structure

{
  "orgId": "org123456",
  "recordId": "rec789012",
  "recordName": "Company Policy.pdf",
  "recordType": "FILE",
  "version": 1,
  "signedUrlRoute": "https://storage.example.com/api/v1/document/internal/doc123456/download",
  "origin": "UPLOAD",
  "extension": "pdf",
  "mimeType": "application/pdf",
  "createdAtTimestamp": "1714208400000",
  "updatedAtTimestamp": "1714208400000",
  "sourceCreatedAtTimestamp": "1714208400000"
}
{
  "orgId": "org123456",
  "recordId": "rec789012",
  "version": 2,
  "signedUrlRoute": "https://storage.example.com/api/v1/document/internal/doc123456/download",
  "updatedAtTimestamp": "1714294800000",
  "sourceLastModifiedTimestamp": "1714294800000"
}
{
  "orgId": "org123456",
  "recordId": "rec789012",
  "version": 2,
  "extension": "pdf",
  "mimeType": "application/pdf"
}
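The sample payloads carry timestamps as strings, while the schema definitions type them as numbers, so a consumer may want to normalize them on receipt. The following sketch assumes a hypothetical `normalizeEvent` function and treats every key ending in `Timestamp` as a millisecond value. It does not depend on any particular Kafka client.

```typescript
// Hypothetical consumer-side helper: parses a raw event message and converts
// string timestamps (keys ending in "Timestamp") to numbers.
function normalizeEvent(raw: string): Record<string, unknown> {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Event payload must be a JSON object");
  }
  const input = parsed as Record<string, unknown>;
  // orgId and recordId appear in every sample payload, so they are required here.
  if (typeof input["orgId"] !== "string" || typeof input["recordId"] !== "string") {
    throw new Error("Event payload must include orgId and recordId");
  }
  const output: Record<string, unknown> = {};
  for (const key of Object.keys(input)) {
    const value = input[key];
    output[key] = /Timestamp$/.test(key) && typeof value === "string" ? Number(value) : value;
  }
  return output;
}
```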

Schema Definitions

interface IRecordDocument {
  // Identifiers
  _key: string;
  orgId: string;
  
  // Core metadata
  recordName: string;
  externalRecordId: string;
  externalRevisionId?: string;
  recordType: RecordType; // 'FILE' | 'WEBPAGE' | 'MESSAGE' | 'EMAIL' | 'OTHERS'
  origin: OriginType; // 'UPLOAD' | 'CONNECTOR'
  
  // Timestamps
  createdAtTimestamp: number;
  updatedAtTimestamp?: number;
  lastSyncTimestamp?: number;
  sourceCreatedAtTimestamp?: number;
  sourceLastModifiedTimestamp?: number;
  
  // Status flags
  isDeletedAtSource?: boolean;
  deletedAtSourceTimestamp?: number;
  isDeleted?: boolean;
  isArchived?: boolean;
  deletedByUserId?: string;
  
  // Indexing information
  lastIndexTimestamp?: number;
  lastExtractionTimestamp?: number;
  indexingStatus?: IndexingStatus; // 'NOT_STARTED' | 'IN_PROGRESS' | 'FAILED' | 'COMPLETED'
  
  // Versioning
  version?: number;
  isLatestVersion?: boolean;
  isDirty?: boolean;
  
  // File-specific fields (for FILE records)
  webUrl?: string;
  mimeType?: string;
  
  // Optional connector information
  connectorName?: ConnectorName; // 'ONEDRIVE' | 'GOOGLE_DRIVE' | 'CONFLUENCE' | 'SLACK'
}

interface IFileRecordDocument {
  // Identifiers
  _key?: string;
  orgId: string;
  
  // Core metadata
  name: string;
  isFile?: boolean;
  extension?: string | null;
  mimeType?: string | null;
  sizeInBytes?: number;
  webUrl?: string;
  path?: string;
  
  // Hash information for integrity
  etag?: string | null;
  ctag?: string | null;
  quickXorHash?: string | null;
  crc32Hash?: string | null;
  sha1Hash?: string | null;
  sha256Hash?: string | null;
  
  // External identifiers
  externalFileId?: string;
}

interface IKnowledgeBase {
  // Identifiers
  _key: string;
  _id?: string;
  orgId: string;
  
  // Metadata
  name: string;
  createdAtTimestamp: number;
  updatedAtTimestamp: number;
  
  // Status flags
  isDeleted: boolean;
  isArchived: boolean;
}

interface UploadRecordsRequest {
  // File upload arrays (must have same length)
  files: Express.Multer.File[];
  file_paths: string[];
  last_modified: number[];
  
  // Optional parameters
  isVersioned?: boolean;
  recordType?: string;
  origin?: string;
}

interface UploadRecordsResponse {
  message: string;
  data: {
    total_created: number;
    folders_created: number;
    records_created: number;
    knowledgeBase: {
      id: string;
      name: string;
    };
    createdRecords: Array<{
      id: string;
      name: string;
      type: string;
      folderId?: string;
    }>;
    createdFolders?: Array<{
      id: string;
      name: string;
      path: string;
    }>;
  };
  meta: {
    requestId: string;
    timestamp: string;
  };
}

Error Handling

All endpoints return structured error responses with specific HTTP status codes:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid request parameters",
    "details": {
      "field": "kbName",
      "issue": "Knowledge base name is required"
    }
  },
  "meta": {
    "requestId": "req-error-123",
    "timestamp": "2025-04-27T13:20:00.000Z"
  }
}
Common Error Codes:
  • 400 Bad Request - Invalid request parameters, file validation errors
  • 401 Unauthorized - Missing or invalid authentication
  • 403 Forbidden - Insufficient permissions
  • 404 Not Found - Resource not found
  • 413 Payload Too Large - File size exceeds limit
  • 422 Unprocessable Entity - Validation errors
  • 500 Internal Server Error - Server errors, backend service failures
File Upload Specific Errors:
  • File extension mismatch for versioned updates
  • Unsupported MIME types
  • File size exceeds 30MB limit
  • Invalid file path characters
  • Duplicate file paths in upload
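On the client side, the error shape and status codes above can be combined into a readable message. This is a sketch only: `isApiError` and `describeFailure` are hypothetical names, and the interface simply mirrors the sample error response rather than a published type.

```typescript
// Mirrors the sample error response; not an official type.
interface ApiErrorBody {
  error: { code: string; message: string; details?: Record<string, unknown> };
  meta: { requestId: string; timestamp: string };
}

// Labels copied from the list of common error codes.
const STATUS_LABELS: Record<number, string> = {
  400: "Bad Request",
  401: "Unauthorized",
  403: "Forbidden",
  404: "Not Found",
  413: "Payload Too Large",
  422: "Unprocessable Entity",
  500: "Internal Server Error",
};

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Type guard: checks only the fields this sketch reads.
function isApiError(body: unknown): body is ApiErrorBody {
  if (!isRecord(body)) return false;
  const error = body["error"];
  const meta = body["meta"];
  return (
    isRecord(error) &&
    typeof error["code"] === "string" &&
    typeof error["message"] === "string" &&
    isRecord(meta) &&
    typeof meta["requestId"] === "string"
  );
}

// Hypothetical helper: turns a failed response into one log-friendly line.
function describeFailure(status: number, body: unknown): string {
  const label = STATUS_LABELS[status] || "Unexpected status";
  if (!isApiError(body)) return `${status} ${label}`;
  return `${status} ${label}: ${body.error.message} (request ${body.meta.requestId})`;
}
```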

Integration with AI Indexing

When a record is created or updated, the Knowledge Base Service:
  1. Stores the file metadata in ArangoDB
  2. Uploads the file content to the Storage Service
  3. Publishes a Kafka event with the file metadata and download URL
  4. The AI Backend consumes these events and:
    • Downloads the file content
    • Extracts text and structured data
    • Processes the content for search indexing
    • Updates the indexing status in the Knowledge Base
This integration enables Enterprise Search to provide intelligent search capabilities across the entire knowledge base.

Security and Permissions

  • All endpoints require valid JWT authentication
  • Admin endpoints require additional role verification
  • File uploads are validated for MIME type and size
  • Access control is enforced at the organization level
  • File content is streamed securely through signed URLs
  • Upload paths are sanitized to prevent directory traversal
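To illustrate the kind of check that guards against directory traversal, a client could reject suspicious relative paths before uploading. The rules below are assumptions chosen for illustration, since the service's actual sanitization rules are not documented here; `isSafeUploadPath` is a hypothetical name.

```typescript
// Hypothetical client-side pre-check for file_paths entries. Assumed rules:
// relative paths only, forward slashes only, no empty, "." or ".." segments.
function isSafeUploadPath(path: string): boolean {
  if (path.length === 0) return false;
  if (path.indexOf("\\") !== -1 || path.indexOf("\0") !== -1) return false;
  // An absolute path starts with "/", which yields an empty first segment.
  return path.split("/").every((segment) => segment !== "" && segment !== "." && segment !== "..");
}
```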

Rate Limits and Constraints

  • File Upload: Maximum 30MB per file, 1000 files per upload
  • Pagination: Maximum 100 items per page
  • Search Terms: Maximum 100 characters
  • Knowledge Base Names: 1-255 characters
  • File Paths: Must not contain invalid characters or be duplicated