Storage Service

The Storage Service provides file storage, versioning, and retrieval across your organization’s infrastructure. It abstracts away the underlying storage vendor, so clients interact with files the same way regardless of where they are physically stored.

Architecture Overview

The Storage Service is built on a Node.js backend with MongoDB for metadata persistence. It provides a flexible storage interface that supports three storage vendor backends:

  1. Amazon S3 - For cloud-based storage on AWS
  2. Azure Blob Storage - For cloud-based storage on Microsoft Azure
  3. Local Storage - For on-premises file storage

The service integrates with other components such as:

  • Configuration Manager - Manages storage configuration settings stored in etcd
  • IAM Service - Handles user authentication and authorization
  • Key-Value Store - Provides access to configuration data from etcd

Storage Configuration

Storage configuration is maintained in etcd under the /services/storage path. The configuration determines which storage vendor is active and the credentials needed to access it. The service dynamically loads this configuration at runtime and can switch between storage vendors without restarting.
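As a rough illustration of what loading this configuration could involve, the sketch below validates a raw JSON value read from the /services/storage path and rejects unknown vendors. The field names (bucket, region, container, accountName, rootDir) and the parseStorageConfig helper are assumptions made for the example, not the service's actual schema, and credential fields are left out for brevity.

```typescript
// Hypothetical shape of the value stored under /services/storage.
// Field names are illustrative assumptions; credentials are omitted.
type StorageConfig =
  | { type: "s3"; bucket: string; region: string }
  | { type: "azureBlob"; container: string; accountName: string }
  | { type: "local"; rootDir: string };

const STORAGE_CONFIG_PATH = "/services/storage";

// Read a required, non-empty string field or fail with a clear message.
function requireString(obj: Record<string, unknown>, key: string): string {
  const value = obj[key];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`Missing "${key}" in config at ${STORAGE_CONFIG_PATH}`);
  }
  return value;
}

// Parse the raw JSON string and reject unsupported storage types early.
function parseStorageConfig(raw: string): StorageConfig {
  const obj = JSON.parse(raw) as Record<string, unknown>;
  switch (obj["type"]) {
    case "s3":
      return {
        type: "s3",
        bucket: requireString(obj, "bucket"),
        region: requireString(obj, "region"),
      };
    case "azureBlob":
      return {
        type: "azureBlob",
        container: requireString(obj, "container"),
        accountName: requireString(obj, "accountName"),
      };
    case "local":
      return { type: "local", rootDir: requireString(obj, "rootDir") };
    default:
      throw new Error(`Unsupported storage type: ${String(obj["type"])}`);
  }
}
```

Validating at load time means a malformed update in etcd fails loudly instead of leaving the service with a half-configured vendor.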

Authentication Modes

The Storage Service offers two authentication modes:

  1. User Authentication - Uses standard user JWTs for operations performed by end users
  2. Service Authentication - Uses scoped tokens with specific permissions for service-to-service communication
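The difference between the two modes can be sketched as a scope check that applies only to service tokens. The token shapes, the scope strings, and the isAllowed helper below are hypothetical; the actual claim names are not specified in this document.

```typescript
// Hypothetical token shapes; real claim names are not documented here.
type UserToken = { kind: "user"; userId: string };
type ServiceToken = { kind: "service"; serviceName: string; scopes: string[] };
type AuthToken = UserToken | ServiceToken;

// Decide whether a token may call an operation that needs `requiredScope`.
// Assumption: user tokens are authorized separately against document
// permissions, so this sketch only enforces scopes for service tokens.
function isAllowed(token: AuthToken, requiredScope: string): boolean {
  if (token.kind === "user") {
    return true;
  }
  return token.scopes.some((scope) => scope === requiredScope);
}
```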

Data Models

Document

The Document model represents metadata about files stored in the system, including:

  • Basic information (name, path, size, MIME type)
  • Versioning metadata and history
  • Storage location details
  • Permissions and ownership
  • Custom metadata
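One possible way to express these fields as a type is shown below. All property names, and the canRead helper, are assumptions chosen for illustration; the real MongoDB schema is not given in this document.

```typescript
type StorageVendor = "s3" | "azureBlob" | "local";

// Hypothetical metadata record mirroring the categories listed above.
interface DocumentRecord {
  id: string;
  // Basic information
  name: string;
  path: string;
  sizeBytes: number;
  mimeType: string;
  // Versioning metadata
  currentVersion: number;
  // Storage location details
  storage: { vendor: StorageVendor; key: string };
  // Permissions and ownership
  ownerId: string;
  readerIds: string[];
  // Custom metadata
  custom: Record<string, string>;
}

// Owners can always read; other users need an explicit entry in readerIds.
function canRead(doc: DocumentRecord, userId: string): boolean {
  return doc.ownerId === userId || doc.readerIds.some((id) => id === userId);
}

const exampleRecord: DocumentRecord = {
  id: "doc-1",
  name: "report.pdf",
  path: "/reports/2024/report.pdf",
  sizeBytes: 20480,
  mimeType: "application/pdf",
  currentVersion: 1,
  storage: { vendor: "local", key: "reports/2024/report.pdf" },
  ownerId: "user-1",
  readerIds: ["user-2"],
  custom: { department: "finance" },
};
```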

Version History

Each versioned document maintains a complete history of changes, including:

  • Previous versions of the document
  • Metadata for each version (size, creation time, creator)
  • Storage locations for each version
  • Version notes and labels
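A version history of this kind can be modeled as an append-only list. The sketch below is an assumption about how such a list might look; the DocumentVersion fields and the appendVersion helper are hypothetical names.

```typescript
// Hypothetical per-version entry mirroring the items listed above.
interface DocumentVersion {
  version: number;
  sizeBytes: number;
  createdAt: string; // ISO 8601 timestamp
  createdBy: string;
  storageKey: string;
  note?: string;
  label?: string;
}

// Append a new entry without mutating the existing history.
// The next version number is one more than the highest number seen so far.
function appendVersion(
  versions: readonly DocumentVersion[],
  next: Omit<DocumentVersion, "version">
): DocumentVersion[] {
  const highest = versions.reduce((max, v) => Math.max(max, v.version), 0);
  return [...versions, { ...next, version: highest + 1 }];
}
```

Returning a new array keeps earlier snapshots of the history intact, which is convenient when the list is later compared or audited.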

Storage API

The Storage API provides endpoints for document management, versioning, and retrieval operations.

Create Document

Upload a new document to the storage service.

POST /api/v1/document/upload

Create Document Placeholder

Create a document placeholder for future direct uploads.

POST /api/v1/document/placeholder

Get Document

Retrieve document metadata by ID.

GET /api/v1/document/:documentId

Download Document

Get a signed URL for downloading a document, or stream the file directly when it is held in local storage.

GET /api/v1/document/:documentId/download
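Reading the description above as "signed URL for the cloud vendors, streaming for local storage" (an interpretation, not a documented rule), the choice could be sketched as follows. The DownloadPlan type, the planDownload helper, and the 900-second default expiry are all hypothetical.

```typescript
type StorageVendor = "s3" | "azureBlob" | "local";

// Hypothetical description of how a download will be served.
type DownloadPlan =
  | { mode: "signedUrl"; expiresInSeconds: number }
  | { mode: "stream" };

// Local files have no signed-URL mechanism, so they are streamed;
// cloud vendors hand back a time-limited URL instead.
function planDownload(vendor: StorageVendor, expiresInSeconds: number = 900): DownloadPlan {
  if (vendor === "local") {
    return { mode: "stream" };
  }
  return { mode: "signedUrl", expiresInSeconds };
}
```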

Delete Document

Mark a document as deleted (soft delete).

DELETE /api/v1/document/:documentId
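A soft delete typically flags the metadata record rather than removing the stored file. The sketch below assumes a deletedAt field for this purpose; that field name and both helpers are illustrative, not taken from the service.

```typescript
// Hypothetical minimal record; only the fields needed for the example.
interface SoftDeletable {
  id: string;
  deletedAt: string | null; // ISO 8601 timestamp once deleted
}

// Mark a record as deleted without touching the stored file.
function softDelete<T extends SoftDeletable>(doc: T, now: Date): T {
  return { ...doc, deletedAt: now.toISOString() };
}

// Listings would normally hide soft-deleted records.
function onlyActive<T extends SoftDeletable>(docs: readonly T[]): T[] {
  return docs.filter((doc) => doc.deletedAt === null);
}
```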

Get Document Buffer

Get the document’s content as a buffer.

GET /api/v1/document/:documentId/buffer

Update Document Buffer

Update a document’s content.

PUT /api/v1/document/:documentId/buffer

Upload Next Version

Upload a new version of an existing document.

POST /api/v1/document/:documentId/uploadNextVersion

Roll Back to Previous Version

Restore a document to a previous version.

POST /api/v1/document/:documentId/rollBack
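This document does not say whether a rollback discards later versions or simply re-points the document at an earlier one. The sketch below assumes the second behavior, keeping the full history; the VersionedDoc shape and the rollBack helper are hypothetical.

```typescript
// Hypothetical minimal shape for a versioned document.
interface VersionedDoc {
  currentVersion: number;
  versions: { version: number; storageKey: string }[];
}

// Re-point the document at an earlier version while keeping the history.
function rollBack(doc: VersionedDoc, targetVersion: number): VersionedDoc {
  const target = doc.versions.find((v) => v.version === targetVersion);
  if (target === undefined) {
    throw new Error(`Version ${targetVersion} does not exist`);
  }
  if (targetVersion >= doc.currentVersion) {
    throw new Error("Rollback target must be earlier than the current version");
  }
  return { ...doc, currentVersion: targetVersion };
}
```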

Get Direct Upload URL

Generate a presigned URL for directly uploading to storage.

POST /api/v1/document/:documentId/directUpload

Check If Document Is Modified

Check if a document has been modified since its last version.

GET /api/v1/document/:documentId/isModified

Internal Service API

Every endpoint described above also has a service-to-service equivalent that is called with a scoped token instead of a user token. These equivalents add an /internal segment directly after /api/v1/document, provide the same functionality, and are intended for use by other services.

Example: /api/v1/document/internal/:documentId/download
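Based on the example path above, the mapping from a user-facing path to its internal counterpart can be sketched as a small helper. The toInternalPath function is hypothetical, and the assumption that every endpoint follows the same pattern as the download example is not confirmed by this document.

```typescript
const DOCUMENT_BASE = "/api/v1/document";

// Insert the /internal segment directly after the document base path.
function toInternalPath(userPath: string): string {
  if (!userPath.startsWith(DOCUMENT_BASE + "/")) {
    throw new Error(`Not a document endpoint: ${userPath}`);
  }
  return DOCUMENT_BASE + "/internal" + userPath.slice(DOCUMENT_BASE.length);
}
```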

Schema Definitions

Storage Provider Implementation

The Storage Service uses an adapter pattern to abstract the underlying storage provider implementation. Three storage providers are supported:

Amazon S3

Uses the AWS SDK to interact with S3 buckets. Supports:

  • Direct file uploads/downloads
  • Signed URLs with expiration
  • Multipart uploads for large files
  • Object versioning

Azure Blob Storage

Uses the Azure Storage SDK to interact with blob containers. Supports:

  • Direct file uploads/downloads
  • SAS token generation for secure access
  • Blob versioning

Local Storage

Uses the Node.js filesystem API to store files locally. Supports:

  • Direct file storage and retrieval
  • File streaming
  • Directory-based organization
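The adapter pattern used across the three providers can be sketched as one shared interface with interchangeable implementations. The StorageAdapter interface and its method names are assumptions, and the in-memory class below is a stand-in for illustration, not one of the real S3, Azure, or filesystem adapters.

```typescript
type StorageVendor = "s3" | "azureBlob" | "local";

// Hypothetical common interface that each provider would implement.
interface StorageAdapter {
  readonly vendor: StorageVendor;
  put(key: string, data: Uint8Array): Promise<void>;
  get(key: string): Promise<Uint8Array>;
  remove(key: string): Promise<void>;
}

// In-memory stand-in used here in place of a real provider.
class InMemoryAdapter implements StorageAdapter {
  readonly vendor: StorageVendor;
  private readonly objects = new Map<string, Uint8Array>();

  constructor(vendor: StorageVendor) {
    this.vendor = vendor;
  }

  async put(key: string, data: Uint8Array): Promise<void> {
    this.objects.set(key, data.slice()); // store a copy
  }

  async get(key: string): Promise<Uint8Array> {
    const found = this.objects.get(key);
    if (found === undefined) {
      throw new Error(`Object not found: ${key}`);
    }
    return found.slice(); // return a copy
  }

  async remove(key: string): Promise<void> {
    this.objects.delete(key);
  }
}

// Callers depend only on the interface, never on a concrete provider.
async function copyObject(adapter: StorageAdapter, from: string, to: string): Promise<void> {
  await adapter.put(to, await adapter.get(from));
}
```

Because copyObject only sees the interface, the same call would work unchanged against any provider that implements it.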

Configuration

Storage configuration is stored in etcd and accessed via the Key-Value Store. The configuration includes:

  • Storage type (s3, azureBlob, local)
  • Credentials for the selected storage vendor
  • Default options and behaviors

Updating the configuration in etcd switches the active storage vendor at runtime, without restarting the service.
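A minimal sketch of runtime switching is shown below, assuming some watcher on the etcd key calls onConfigChange whenever the configuration changes. The watch mechanism itself is not shown, and AdapterHolder, VendorConfig, and the factory signature are hypothetical names.

```typescript
type StorageVendor = "s3" | "azureBlob" | "local";

interface VendorConfig {
  type: StorageVendor;
}

interface ActiveAdapter {
  vendor: StorageVendor;
}

type AdapterFactory = (config: VendorConfig) => ActiveAdapter;

// Holds the adapter currently in use and replaces it when config changes.
class AdapterHolder {
  private current: ActiveAdapter;
  private readonly create: AdapterFactory;

  constructor(initial: VendorConfig, create: AdapterFactory) {
    this.create = create;
    this.current = create(initial);
  }

  // Intended to be called by whatever watches the etcd key.
  onConfigChange(next: VendorConfig): void {
    this.current = this.create(next);
  }

  // New requests fetch the adapter here; requests already in flight keep
  // the instance they obtained earlier.
  getAdapter(): ActiveAdapter {
    return this.current;
  }
}
```

Swapping a single reference keeps the switch atomic from the point of view of request handlers, which is one simple way to avoid a restart.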