
Storage Service API

The Storage Service provides robust file storage, versioning, and retrieval capabilities across your organization’s infrastructure. This service abstracts away the underlying storage vendor implementation, allowing seamless interaction with files regardless of where they’re physically stored.

Base URL

All endpoints are prefixed with /api/v1/document

Authentication

All endpoints require authentication via Bearer token:
Authorization: Bearer YOUR_TOKEN
Internal endpoints use scoped token authentication for service-to-service communication with specific scopes:
  • STORAGE_TOKEN - For storage operations
  • FETCH_CONFIG - For configuration updates
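As a minimal sketch of the header format described above, the helper below builds the Authorization header from a token. The function name buildAuthHeaders is hypothetical and not part of the service; how tokens are issued is outside the scope of this example.

```typescript
// Hypothetical helper (not part of the Storage Service): builds the
// "Authorization: Bearer <token>" header used by every endpoint.
type AuthHeaders = { Authorization: string };

function buildAuthHeaders(token: string): AuthHeaders {
  const trimmed = token.trim();
  if (trimmed.length === 0) {
    throw new Error("A non-empty bearer token is required");
  }
  return { Authorization: `Bearer ${trimmed}` };
}
```

The same header shape applies to scoped tokens on internal endpoints; only the token itself differs.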

Architecture Overview

The Storage Service is built on a Node.js backend with MongoDB for metadata persistence. It provides a flexible storage interface that supports three storage vendor backends:
  1. Amazon S3 - For cloud-based storage on AWS
  2. Azure Blob Storage - For cloud-based storage on Microsoft Azure
  3. Local Storage - For on-premises file storage
The service integrates with other components such as:
  • Configuration Manager - Manages storage configuration settings stored in etcd
  • IAM Service - Handles user authentication and authorization
  • Key-Value Store - Provides access to configuration data from etcd

Storage Configuration

Storage configuration is maintained in etcd under the /services/storage path. The configuration determines which storage vendor is active and the credentials needed to access it. The service dynamically loads this configuration at runtime and can switch between storage vendors without restarting.

Authentication Modes

The Storage Service offers two authentication modes:
  1. User Authentication - Uses standard user JWT tokens for operations performed by end users
  2. Service Authentication - Uses scoped tokens with specific permissions for service-to-service communication

Data Models

Document

The Document model represents metadata about files stored in the system, including:
  • Basic information (name, path, size, MIME type)
  • Versioning metadata and history
  • Storage location details
  • Permissions and ownership
  • Custom metadata

Version History

Each versioned document maintains a complete history of changes, including:
  • Previous versions of the document
  • Metadata for each version (size, creation time, creator)
  • Storage locations for each version
  • Version notes and labels

API Endpoints

Core document operations including upload, retrieval, and deletion.
Upload a new document to the storage service.
Endpoint: POST /api/v1/document/upload
Request Body Parameters:
Parameter | Type | Required | Description
file | File | Yes | The file to upload (multipart/form-data)
documentName | string | Yes | Name of the document (no extensions allowed)
documentPath | string | No | Optional path for document storage
alternateDocumentName | string | No | Alternative name for the document
permissions | string | No | Access permissions (owner, editor, commentator, readonly)
customMetadata | object | No | Custom metadata to associate with the document
isVersionedFile | string | Yes | Whether the document supports versioning (“true”/“false”)
File Size Limits:
  • User endpoints: 1GB (1024 * 1024 * 1000 = 1,048,576,000 bytes)
  • Internal endpoints: 100MB (1024 * 1024 * 100 = 104,857,600 bytes)
Direct Upload Threshold: 0MB (every file triggers direct upload evaluation on S3 and Azure Blob storage)
Validation Rules:
  • Document name cannot contain extensions or forward slashes
  • File must have a valid extension with supported MIME type
  • File buffer is required in multipart form data
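The document name rules above can be checked on the client before uploading. The sketch below is one possible pre-flight check, not the service's validator: the function name is hypothetical, and it assumes that "contains an extension" means the name ends with a dot followed by a short alphanumeric suffix.

```typescript
// Assumption: an "extension" is a trailing dot followed by 1-10 alphanumerics.
// The service's actual rule may be stricter or looser.
const TRAILING_EXTENSION = /\.[A-Za-z0-9]{1,10}$/;

// Hypothetical client-side check; returns a list of problems (empty if valid).
function validateDocumentName(name: string): string[] {
  const problems: string[] = [];
  if (name.length === 0) {
    problems.push("Document name is required");
  }
  if (name.includes("/")) {
    problems.push("Document name cannot contain forward slashes");
  }
  if (TRAILING_EXTENSION.test(name)) {
    problems.push("Document name cannot contain extensions");
  }
  return problems;
}
```

Returning all problems at once, rather than throwing on the first, lets a caller report every issue in a single pass.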
Supported MIME Types: Extensive list including documents (PDF, Office), images (JPEG, PNG, SVG), archives (ZIP, RAR), and many others.
Internal endpoint for service-to-service document uploads.
Endpoint: POST /api/v1/document/internal/upload
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Request Body: Same structure as public upload endpoint.
File Size Limit: 100MB for internal uploads.
Create a document placeholder for future direct uploads.
Endpoint: POST /api/v1/document/placeholder
Request Body Parameters:
Parameter | Type | Required | Description
documentName | string | Yes | Name of the document (no extensions allowed)
documentPath | string | Yes | Path for document storage
alternateDocumentName | string | No | Alternative name for the document
permissions | string | No | Access permissions
metaData | object | No | Custom metadata
isVersionedFile | boolean | No | Whether the document supports versioning
extension | string | Yes | File extension (e.g., “pdf”)
{
  "documentName": "large-dataset",
  "documentPath": "analytics/data",
  "isVersionedFile": false,
  "extension": "csv",
  "permissions": "owner"
}
Validation:
  • Document name cannot contain extensions or forward slashes
  • Extension must be provided separately
Internal endpoint for creating document placeholders.
Endpoint: POST /api/v1/document/internal/placeholder
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Request Body: Same structure as public placeholder endpoint.
Retrieve document metadata by ID.
Endpoint: GET /api/v1/document/:documentId
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Request Headers:
  • Authorization: Bearer token (required)
Request Body: Contains fileBuffer field for validation (per DocumentIdParams schema)
Internal endpoint for retrieving document metadata.
Endpoint: GET /api/v1/document/internal/:documentId
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Mark a document as deleted (soft delete).
Endpoint: DELETE /api/v1/document/:documentId/
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Note: The endpoint includes a trailing slash in the route definition.
Internal endpoint for deleting documents.
Endpoint: DELETE /api/v1/document/internal/:documentId/
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Operations for downloading and accessing document content.
Get a signed URL to download a document or stream it for local storage.
Endpoint: GET /api/v1/document/:documentId/download
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Query Parameters:
Parameter | Type | Required | Description
version | string | No | Version number to download (transformed to number, must be > 0)
expirationTimeInSeconds | string | No | Expiration time for the signed URL (transformed to number, must be > 0, default: 3600)
Validation:
  • Version is transformed from string to number and must be greater than 0 if provided
  • expirationTimeInSeconds is transformed from string to number and must be greater than 0 if provided
  • Version must exist in document’s version history
  • Only versioned documents can specify version parameter
  • Storage vendor must match current configuration
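To illustrate the query parameter rules above, here is a small sketch that assembles the download path on the client side. The function and interface names are hypothetical; the checks mirror only the documented rules (24-character hex ID, values greater than 0), and server-side checks such as version existence are not modeled.

```typescript
// Hypothetical client-side options; both values are optional, as documented.
interface DownloadOptions {
  version?: number;
  expirationTimeInSeconds?: number;
}

// Builds the relative path for the public download endpoint.
function buildDownloadPath(documentId: string, options: DownloadOptions = {}): string {
  if (!/^[0-9a-fA-F]{24}$/.test(documentId)) {
    throw new Error("documentId must be a 24-character hex string");
  }
  const query: string[] = [];
  for (const key of ["version", "expirationTimeInSeconds"] as const) {
    const value = options[key];
    if (value === undefined) continue;
    // Documented rule: the value must be a number greater than 0.
    if (!Number.isFinite(value) || value <= 0) {
      throw new Error(`${key} must be a number greater than 0`);
    }
    query.push(`${key}=${value}`);
  }
  const suffix = query.length > 0 ? `?${query.join("&")}` : "";
  return `/api/v1/document/${documentId}/download${suffix}`;
}
```

Omitting expirationTimeInSeconds leaves the documented default of 3600 seconds to the server rather than hard-coding it in the client.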
Internal endpoint for downloading documents.
Endpoint: GET /api/v1/document/internal/:documentId/download
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Query Parameters: Same as public download endpoint.
Get the document’s content as a buffer.
Endpoint: GET /api/v1/document/:documentId/buffer
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Query Parameters:
Parameter | Type | Required | Description
version | string | No | Version number to retrieve (transformed to number, min: 0)
Validation:
  • Version is transformed from string to number via .pipe(z.number().min(0).optional())
  • Version must exist if specified
  • Version cannot exceed available version history length
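The string-to-number transformation and range checks above can be approximated without a schema library. The sketch below is an illustration only: the function name is hypothetical, the integer requirement is an assumption, and the upper bound reads "cannot exceed available version history length" literally (a version equal to the history length is accepted), which may not match the service's exact boundary.

```typescript
// Hypothetical stand-in for the query transformation: string in, optional number out.
function parseBufferVersion(raw: string | undefined, historyLength: number): number | undefined {
  // The parameter is optional; treat a missing or empty value as "not provided".
  if (raw === undefined || raw === "") return undefined;
  const version = Number(raw);
  // Documented minimum is 0; requiring an integer is an assumption.
  if (!Number.isInteger(version) || version < 0) {
    throw new Error("version must be a non-negative integer");
  }
  // Literal reading of the documented rule; the exact boundary is an assumption.
  if (version > historyLength) {
    throw new Error("version exceeds available version history length");
  }
  return version;
}
```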
Internal endpoint for retrieving document buffers.
Endpoint: GET /api/v1/document/internal/:documentId/buffer
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Query Parameters: Same as public buffer endpoint.
Update a document’s content.
Endpoint: PUT /api/v1/document/:documentId/buffer
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Request Body Parameters:
  • file: File to upload (multipart/form-data with field name ‘file’)
File Size Limit: 100MB (1024 * 1024 * 100 bytes)
Validation: Uses DocumentIdParams schema which includes fileBuffer in body
Behavior:
  • Updates the document content in storage
  • Increments mutation count
  • Updates document size metadata
Internal endpoint for updating document buffers.
Endpoint: PUT /api/v1/document/internal/:documentId/buffer
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Request Body: Same as public buffer update endpoint.
Version control operations for document lifecycle management.
Upload a new version of an existing document.
Endpoint: POST /api/v1/document/:documentId/uploadNextVersion
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Request Body Parameters:
Parameter | Type | Required | Description
file | File | Yes | The new file content (multipart/form-data with field name ‘file’)
currentVersionNote | string | No | Note for the current version (if modified)
nextVersionNote | string | No | Note for the next version
File Size Limit: 100MB (1024 * 1024 * 100 bytes)
Validation:
  • Document must support versioning (isVersionedFile: true)
  • File extension must match existing document extension
  • Document existence and access verification
Behavior:
  • Checks if current document differs from last version
  • If changed, creates version entry for current state with currentVersionNote
  • Uploads new version with nextVersionNote
  • Updates current document content
  • Increments mutation count
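The behavior steps above can be modeled in memory to make the ordering concrete. This is a simplified sketch, not the service's code: strings stand in for file buffers, all type and function names are hypothetical, and treating an empty history as "changed" is an assumption.

```typescript
// Hypothetical in-memory model of a versioned document.
interface VersionEntry {
  content: string;
  note?: string;
}

interface VersionedDoc {
  isVersionedFile: boolean;
  extension: string;
  content: string;
  versionHistory: VersionEntry[];
  mutationCount: number;
}

function uploadNextVersion(
  doc: VersionedDoc,
  newContent: string,
  newExtension: string,
  currentVersionNote?: string,
  nextVersionNote?: string,
): VersionedDoc {
  if (!doc.isVersionedFile) {
    throw new Error("Document does not support versioning");
  }
  if (newExtension !== doc.extension) {
    throw new Error("File extension must match existing document extension");
  }
  const history = [...doc.versionHistory];
  const last = history[history.length - 1];
  // Snapshot the current state only if it differs from the last recorded version
  // (assumption: an empty history also counts as "changed").
  if (last === undefined || last.content !== doc.content) {
    history.push({ content: doc.content, note: currentVersionNote });
  }
  // Record the uploaded content as the next version.
  history.push({ content: newContent, note: nextVersionNote });
  // Update the current content and increment the mutation count.
  return { ...doc, content: newContent, versionHistory: history, mutationCount: doc.mutationCount + 1 };
}
```

In this model, uploading over an unmodified document adds one history entry, while uploading over a modified one adds two: the snapshot of the edited state and the new version.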
Internal endpoint for uploading next version.
Endpoint: POST /api/v1/document/internal/:documentId/uploadNextVersion
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Request Body: Same structure as public upload next version endpoint.
Restore a document to a previous version.
Endpoint: POST /api/v1/document/:documentId/rollBack
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Query Parameters:
Parameter | Type | Required | Description
version | string | Yes | Version number to roll back to (transformed to number, min: 0)
Request Body Parameters:
Parameter | Type | Required | Description
note | string | Yes | Note about the rollback operation
Validation:
  • Uses RollBackToPreviousVersionSchema which extends GetBufferSchema
  • Version comes from query parameter (inherited from GetBufferSchema)
  • Note comes from request body
  • Document must support versioning
  • Target version must exist and be less than current version count
Note: There appears to be a discrepancy between the validator schema (which expects version in query) and the controller implementation (which reads version from body). The schema specification is documented here.
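As a sketch of a request that follows the schema specification (version in the query string, note in the body), the helper below assembles a rollback request description. The names are hypothetical, and because of the discrepancy noted above, a request built this way may need adjusting against the running controller.

```typescript
// Hypothetical description of a rollback request; nothing is sent over the network.
interface RollbackRequest {
  method: "POST";
  path: string;
  body: { note: string };
}

function buildRollbackRequest(
  documentId: string,
  version: number,
  note: string,
  versionCount: number,
): RollbackRequest {
  if (!Number.isInteger(version) || version < 0) {
    throw new Error("version must be a non-negative integer");
  }
  // Documented rule: the target version must be less than the current version count.
  if (version >= versionCount) {
    throw new Error("version must be less than the current version count");
  }
  if (note.trim().length === 0) {
    throw new Error("note is required");
  }
  return {
    method: "POST",
    path: `/api/v1/document/${documentId}/rollBack?version=${version}`,
    body: { note },
  };
}
```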
Internal endpoint for rolling back document versions.
Endpoint: POST /api/v1/document/internal/:documentId/rollBack
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Request Body: Same structure as public rollback endpoint.
Direct upload capabilities and utility functions.
Generate a presigned URL for directly uploading to storage.
Endpoint: POST /api/v1/document/:documentId/directUpload
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Request Body: Empty object (per DirectUploadSchema)
Use Case: For large files that need to be uploaded directly to the storage vendor without going through the API server.
Validation:
  • Document and document path must exist
  • Storage vendor configuration must be valid
  • Uses DirectUploadSchema validation
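One plausible way to combine the placeholder and direct upload endpoints is sketched below as two request descriptions. The ordering (placeholder first, then directUpload with the returned document ID) is an inference from the endpoint descriptions, not a documented flow, and the type and function names are hypothetical. The final transfer to the presigned URL is left out because its HTTP method and response fields are not specified here.

```typescript
// Fields taken from the placeholder parameter table; optional ones marked with "?".
interface PlaceholderBody {
  documentName: string;
  documentPath: string;
  extension: string;
  isVersionedFile?: boolean;
  permissions?: string;
}

// Hypothetical request description; nothing is sent over the network.
interface PlannedRequest {
  method: "POST";
  path: string;
  body: object;
}

// Step 1 (assumed): create the placeholder that will receive the file.
function placeholderRequest(body: PlaceholderBody): PlannedRequest {
  return { method: "POST", path: "/api/v1/document/placeholder", body };
}

// Step 2 (assumed): request a presigned URL for that document; the body is an empty object.
function directUploadRequest(documentId: string): PlannedRequest {
  return { method: "POST", path: `/api/v1/document/${documentId}/directUpload`, body: {} };
}
```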
Internal endpoint for generating direct upload URLs.
Endpoint: POST /api/v1/document/internal/:documentId/directUpload
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Check if a document has been modified since its last version.
Endpoint: GET /api/v1/document/:documentId/isModified
Path Parameters:
  • documentId: MongoDB ObjectId (24-character hex string)
Validation: Uses DocumentIdParams schema
Behavior:
  • Compares current document buffer with latest version in history (or version 0)
  • Returns boolean indicating modification status
  • Uses buffer comparison for accurate change detection
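The buffer comparison described above amounts to a byte-for-byte equality check. The sketch below shows one way to express it with Uint8Array; the function names are hypothetical and the service's own comparison may differ (for example, it could use a built-in buffer comparison).

```typescript
// Byte-for-byte equality of two buffers.
function buffersEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// A document counts as modified when its current bytes differ from the latest version.
function isModified(current: Uint8Array, latestVersion: Uint8Array): boolean {
  return !buffersEqual(current, latestVersion);
}
```

Comparing lengths first avoids scanning the content when the sizes already differ.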
Internal endpoint for checking document modification status.
Endpoint: GET /api/v1/document/internal/:documentId/isModified
Authentication: Requires scoped token with STORAGE_TOKEN scope.
Internal configuration management endpoints for service administration.
Internal endpoint for updating application configuration.
Endpoint: POST /api/v1/document/updateAppConfig
Authentication: Requires scoped token with FETCH_CONFIG scope.
Behavior:
  • Reloads application configuration from configuration manager
  • Updates storage configuration in dependency injection container
  • Recreates storage controller with new configuration
  • Allows dynamic reconfiguration without service restart

Schema Definitions

  • Document Schema
  • DocumentVersion Schema
  • StorageInfo Schema
  • CustomMetadata Schema
  • Storage Vendor Types
  • Validation Schemas
interface Document {
  // Basic document information
  documentName: string;
  alternateDocumentName?: string;
  documentPath?: string;
  isVersionedFile: boolean;
  orgId: mongoose.Types.ObjectId;
  
  // Document metrics
  mutationCount?: number;
  sizeInBytes?: number;
  mimeType?: string;
  extension: string;
  
  // Access control
  permissions?: 'owner' | 'editor' | 'commentator' | 'readonly';
  initiatorUserId: mongoose.Types.ObjectId | null;
  
  // Versioning
  versionHistory?: DocumentVersion[];
  currentVersion: number;
  
  // Metadata
  customMetadata?: CustomMetadata[];
  tags?: string[];
  
  // Timestamps
  createdAt?: number;
  updatedAt?: number;
  
  // Deletion status
  deletedByUserId?: mongoose.Types.ObjectId;
  isDeleted: boolean;
  
  // Storage information
  s3?: StorageInfo;
  azureBlob?: StorageInfo;
  local?: StorageInfo;
  storageVendor: StorageVendor;
}
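Because the Document interface carries one optional StorageInfo field per vendor alongside storageVendor, a reader typically needs to select the field that matches the active vendor. The sketch below illustrates that lookup under two assumptions that are not confirmed by this page: that StorageVendor values match the field names (s3, azureBlob, local), and a placeholder StorageInfo shape with a single url field.

```typescript
// Assumption: vendor identifiers match the Document field names.
type StorageVendor = "s3" | "azureBlob" | "local";

// Hypothetical placeholder shape; the real StorageInfo schema is not shown on this page.
interface StorageInfo {
  url: string;
}

// Only the storage-related subset of the Document interface.
interface StoredDocument {
  storageVendor: StorageVendor;
  s3?: StorageInfo;
  azureBlob?: StorageInfo;
  local?: StorageInfo;
}

// Returns the StorageInfo for the document's active vendor, or throws if it is missing.
function activeStorageInfo(doc: StoredDocument): StorageInfo {
  const info = doc[doc.storageVendor];
  if (info === undefined) {
    throw new Error(`No storage info recorded for vendor ${doc.storageVendor}`);
  }
  return info;
}
```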

Supported MIME Types

The storage service supports an extensive list of MIME types including:
  • Document Formats
  • Image Formats
  • Archive Formats
  • Media & Other Formats
Document Formats:
  • PDF: application/pdf
  • Microsoft Office: Word (.docx, .doc), Excel (.xlsx, .xls), PowerPoint (.pptx, .ppt)
  • OpenDocument: .odt, .ods, .odp
  • Text: .txt, .rtf, .csv, .md, .mdx
  • Web: .html, .css, .js, .json, .xml
  • eBooks: .epub
  • Google Workspace: .gdoc, .gsheet, .gslides, .gdraw

Storage Provider Implementation

The Storage Service uses an adapter pattern to abstract the underlying storage provider implementation.
  • Amazon S3
  • Azure Blob Storage
  • Local Storage
Amazon S3 features:
  • Direct file uploads/downloads via AWS SDK
  • Signed URLs with configurable expiration (default: 1 hour)
  • Multipart uploads for large files
  • Object versioning support
  • Comprehensive error handling with specific S3 error types
Configuration:
interface S3StorageConfig {
  accessKeyId: string;
  secretAccessKey: string;
  region: string;
  bucketName: string;
}
URL Structure:
https://{bucket}.s3.{region}.amazonaws.com/{orgId}/PipesHub/{path}/{documentId}/current/{name}.{ext}
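The URL structure above can be assembled from its parts as in the following sketch. The function and interface names are hypothetical, no URL encoding is applied, and the handling of an empty document path is not specified by this page, so the example assumes a non-empty path.

```typescript
// Parts of the object key, named after the placeholders in the URL structure.
interface KeyParts {
  orgId: string;
  documentPath: string;
  documentId: string;
  name: string;
  ext: string;
}

// Builds {orgId}/PipesHub/{path}/{documentId}/current/{name}.{ext}
function buildCurrentObjectKey(parts: KeyParts): string {
  return [
    parts.orgId,
    "PipesHub",
    parts.documentPath,
    parts.documentId,
    "current",
    `${parts.name}.${parts.ext}`,
  ].join("/");
}

// Prefixes the key with the virtual-hosted S3 host shown in the URL structure.
function buildS3Url(bucket: string, region: string, key: string): string {
  return `https://${bucket}.s3.${region}.amazonaws.com/${key}`;
}
```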
Error Handling:
  • StorageConfigurationError - Missing credentials
  • StorageUploadError - Upload failures
  • StorageDownloadError - Download failures
  • PresignedUrlError - URL generation failures
  • MultipartUploadError - Multipart operation failures

File Size Limits & Direct Upload

  • Size Limits
  • Direct Upload Flow
Public Endpoints:
  • Upload: 1GB (1024 * 1024 * 1000 = 1,048,576,000 bytes)
  • Buffer operations: 100MB (1024 * 1024 * 100 = 104,857,600 bytes)
Internal Endpoints:
  • All operations: 100MB (1024 * 1024 * 100 = 104,857,600 bytes)
Direct Upload Threshold:
  • Set via maxFileSizeForPipesHubService constant
  • Currently configured to 0MB (all files trigger direct upload evaluation)
  • Applied only to S3 and Azure Blob storage
  • Local storage always uses direct API upload
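The limits above reduce to simple arithmetic, shown here as constants with a size check. The constant and function names are hypothetical, and treating a file exactly at the limit as allowed is an assumption.

```typescript
const BYTES_PER_MB = 1024 * 1024;

// Limits as documented: 1024 * 1024 * 1000 and 1024 * 1024 * 100 bytes.
const SIZE_LIMITS = {
  publicUpload: 1000 * BYTES_PER_MB, // 1,048,576,000 bytes
  publicBuffer: 100 * BYTES_PER_MB, // 104,857,600 bytes
  internal: 100 * BYTES_PER_MB, // 104,857,600 bytes
} as const;

type LimitKind = keyof typeof SIZE_LIMITS;

// Assumption: a file exactly at the limit is accepted; only larger files are rejected.
function exceedsLimit(kind: LimitKind, sizeInBytes: number): boolean {
  return sizeInBytes > SIZE_LIMITS[kind];
}
```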

Error Handling

All endpoints return structured error responses:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid request parameters",
    "details": {
      "field": "documentName",
      "issue": "Document name cannot contain extensions"
    }
  },
  "timestamp": "2025-04-27T13:20:00.000Z"
}
Common Error Codes:
  • VALIDATION_ERROR - Invalid request parameters
  • NOT_FOUND - Document not found
  • BAD_REQUEST - Invalid request format
  • FORBIDDEN - File format mismatch or access denied
  • INTERNAL_SERVER_ERROR - Storage service errors
Storage-Specific Errors:
  • Extension validation failures
  • File size limit exceeded
  • Storage vendor configuration errors
  • Version control constraint violations
  • MIME type validation failures
HTTP Status Codes:
  • 200 - Success
  • 308 - Permanent Redirect (for direct uploads)
  • 400 - Bad Request
  • 401 - Unauthorized
  • 403 - Forbidden
  • 404 - Not Found
  • 500 - Internal Server Error
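A client can guard against unexpected response bodies before reading error fields. The sketch below defines a type that mirrors the sample error response and a runtime check for it; the names are hypothetical, and only code and message are treated as required, which is an assumption.

```typescript
// Mirrors the sample error response; details and timestamp are treated as optional.
interface ApiErrorBody {
  error: {
    code: string;
    message: string;
    details?: Record<string, unknown>;
  };
  timestamp?: string;
}

// Runtime check that an unknown value has the expected error shape.
function isApiErrorBody(value: unknown): value is ApiErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const error = (value as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return false;
  const { code, message } = error as { code?: unknown; message?: unknown };
  return typeof code === "string" && typeof message === "string";
}

// Produces a short log line from an HTTP status and a parsed response body.
function describeError(status: number, body: unknown): string {
  if (isApiErrorBody(body)) {
    return `${status} ${body.error.code}: ${body.error.message}`;
  }
  return `${status} (unrecognized error body)`;
}
```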

Configuration Management

  • Storage Configuration
  • Service Integration
Storage configuration is maintained in etcd under the /services/storage path:
{
  "storageType": "s3|azureBlob|local",
  "credentials": {
    // Provider-specific configuration
  }
}
Dynamic Configuration:
  • Service loads configuration at startup and via watch mechanism
  • Can switch storage vendors without service restart
  • Configuration changes trigger adapter reinitialization
  • Graceful handling of configuration errors
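The adapter reinitialization described above can be sketched as a controller that swaps its adapter when a new configuration arrives and keeps the previous one if the new configuration is invalid. This is an illustrative model only: all names are hypothetical, the adapters are stubs with no storage logic, and the etcd watch mechanism is not modeled.

```typescript
type StorageType = "s3" | "azureBlob" | "local";

// Configuration as read from the key-value store; storageType is unvalidated at this point.
interface StorageConfig {
  storageType: string;
  credentials: Record<string, string>;
}

// Stub adapter: a real adapter would expose upload and download operations.
interface StorageAdapter {
  readonly vendor: StorageType;
}

function createAdapter(config: StorageConfig): StorageAdapter {
  const known: StorageType[] = ["s3", "azureBlob", "local"];
  const vendor = known.find((candidate) => candidate === config.storageType);
  if (vendor === undefined) {
    throw new Error(`Unsupported storageType: ${config.storageType}`);
  }
  return { vendor };
}

class StorageController {
  private adapter: StorageAdapter;

  constructor(config: StorageConfig) {
    this.adapter = createAdapter(config);
  }

  // Rebuilds the adapter; on failure the previous adapter stays active.
  applyConfig(config: StorageConfig): boolean {
    try {
      this.adapter = createAdapter(config);
      return true;
    } catch (error) {
      return false;
    }
  }

  get vendor(): StorageType {
    return this.adapter.vendor;
  }
}
```

Keeping the previous adapter on failure is one way to read "graceful handling of configuration errors"; the service may instead surface the error or retry.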

Important Notes

  1. Schema Discrepancy: There’s a mismatch between the rollback validator schema (expects version in query) and controller implementation (reads version from body). Documentation follows schema specification.
  2. File Extensions: Automatically detected from original filename and validated against comprehensive MIME type mappings.
  3. Version Control: Optional per document via isVersionedFile flag, maintains complete history with metadata and notes.
  4. Storage Abstraction: Seamless switching between storage vendors via configuration without code changes.
  5. Organization Isolation: All documents are organization-scoped (orgId) for secure multi-tenancy.
  6. Soft Deletion: Documents are marked as deleted (isDeleted: true) rather than physically removed.
  7. Direct Upload: Large files bypass API server for optimal performance, using vendor-specific presigned URLs.
  8. Internal APIs: Service-to-service communication with scoped authentication using specific token scopes.
  9. Mutation Tracking: Documents track change count (mutationCount) for auditing and optimization.
  10. MIME Type Validation: Extensive validation against supported file types with automatic extension detection.
  11. Path Structure: Organized as {orgId}/PipesHub/{documentPath}/{documentId}/current|versions/ for clear file organization.
  12. Cross-Platform Support: Local storage handles Windows, macOS, and Linux filesystem differences transparently.