# Crawling Manager API

The Crawling Manager API provides endpoints for scheduling, managing, and monitoring data crawling jobs across various connector types, including Google Workspace, OneDrive, SharePoint Online, Slack, and Confluence.

## Base URL

All endpoints are prefixed with `/api/v1/crawlingManager`.
## Authentication

All endpoints require:

- Authentication via the `Authorization` header with a valid JWT token
- Admin privileges: only users with the admin role can access these endpoints
## Supported Connectors

- `gmail` - Gmail connector
- `drive` - Google Drive connector
- `onedrive` - OneDrive connector
- `sharepointonline` - SharePoint Online connector
- `confluence` - Confluence connector
- `slack` - Slack connector
- `linear` - Linear connector
- `dropbox` - Dropbox connector
- `outlook` - Outlook connector
- `jira` - Jira connector
- `atlassian` - Atlassian connector
- `github` - GitHub connector
- `box` - Box connector
- `s3` - Amazon S3 connector
- `azure` - Azure connector
- `airtable` - Airtable connector
- `zendesk` - Zendesk connector

Jobs for each connector run on a queue named `crawl-{connector}-{orgId}`, where the connector name is normalized to lowercase with hyphens.
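The queue-naming convention can be sketched as a small helper. This is an illustrative sketch only: the exact normalization rules (trimming, lowercasing, replacing spaces and underscores with hyphens) are assumptions, not taken from this document.

```python
# Hypothetical helper for the documented queue-name convention
# `crawl-{connector}-{orgId}`. The normalization details (strip,
# lowercase, spaces/underscores to hyphens) are assumed for illustration.
def queue_name(connector: str, org_id: str) -> str:
    normalized = connector.strip().lower().replace(" ", "-").replace("_", "-")
    return f"crawl-{normalized}-{org_id}"
```

For example, `queue_name("Gmail", "org1")` produces `crawl-gmail-org1`.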
## API Endpoints
### POST /:connector/schedule - Schedule Job
Schedule a new crawling job for a specific connector type.

**Endpoint:** `POST /api/v1/crawlingManager/:connector/schedule`

**Parameters:**

- `connector` (string, path) - The connector type (gmail, drive, onedrive, sharepointonline, confluence, slack, linear, dropbox, outlook, jira, atlassian, github, box, s3, azure, airtable, zendesk)

- Request Body
- Success Response
- Error Responses

**Schedule Configuration Types:** daily, weekly, monthly, custom (cron), and once. All schedule configurations inherit the base properties:

- `scheduleType` (required) - The type of schedule
- `isEnabled` (boolean, default: true) - Whether the schedule is enabled
- `timezone` (string, default: "UTC") - Timezone for schedule execution
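To illustrate the base properties, here are two hedged example payloads. Only `scheduleType`, `isEnabled`, and `timezone` are documented above; the per-type fields (`hour`, `minute`, `cronExpression`) are hypothetical names used for illustration only.

```python
# Illustrative schedule payloads. `scheduleType`, `isEnabled`, and
# `timezone` come from the documentation; the remaining fields are
# assumed names, not confirmed API fields.
daily_schedule = {
    "scheduleType": "daily",
    "isEnabled": True,
    "timezone": "UTC",
    "hour": 2,      # assumed field: run at 02:00 in the given timezone
    "minute": 0,    # assumed field
}

custom_schedule = {
    "scheduleType": "custom",
    "isEnabled": True,
    "timezone": "America/New_York",
    "cronExpression": "0 */6 * * *",  # assumed field: every 6 hours
}
```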
### GET /:connector/schedule - Get Job Status
Retrieve the status of a scheduled crawling job for a specific connector.

**Endpoint:** `GET /api/v1/crawlingManager/:connector/schedule`

**Parameters:**

- `connector` (string, path) - The connector type

- Success Response
- Error Responses

**Status:** `200 OK`

**Job States:**

- `waiting` - Job is waiting to be processed
- `active` - Job is currently being processed
- `completed` - Job completed successfully
- `failed` - Job failed with errors
- `delayed` - Job is delayed for future execution
- `paused` - Job is paused
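A client polling this endpoint might group the states above into terminal and in-flight sets. The state strings come from the list above; the grouping itself is our interpretation, not part of the API.

```python
# The six state strings are documented; splitting them into terminal
# vs. in-flight sets is an illustrative client-side convention.
TERMINAL_STATES = {"completed", "failed"}
IN_FLIGHT_STATES = {"waiting", "active", "delayed", "paused"}

def is_finished(state: str) -> bool:
    """Return True once a job has reached a terminal state."""
    if state not in TERMINAL_STATES | IN_FLIGHT_STATES:
        raise ValueError(f"unknown job state: {state}")
    return state in TERMINAL_STATES
```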
### GET /schedule/all - Get All Job Statuses
Retrieve all scheduled crawling jobs for the organization.

**Endpoint:** `GET /api/v1/crawlingManager/schedule/all`

- Success Response
- Error Responses

**Status:** `200 OK`
### DELETE /:connector/schedule - Remove Job
Remove a scheduled crawling job for a specific connector.

**Endpoint:** `DELETE /api/v1/crawlingManager/:connector/schedule`

**Parameters:**

- `connector` (string, path) - The connector type

- Success Response
- Error Responses

**Status:** `200 OK`
### DELETE /schedule/all - Remove All Jobs
Remove all scheduled crawling jobs for the organization.

**Endpoint:** `DELETE /api/v1/crawlingManager/schedule/all`

- Success Response
- Error Responses

**Status:** `200 OK`
### POST /:connector/pause - Pause Job
Pause a scheduled crawling job for a specific connector.

**Endpoint:** `POST /api/v1/crawlingManager/:connector/pause`

**Parameters:**

- `connector` (string, path) - The connector type

- Success Response
- Error Responses

**Status:** `200 OK`
### POST /:connector/resume - Resume Job
Resume a paused crawling job for a specific connector.

**Endpoint:** `POST /api/v1/crawlingManager/:connector/resume`

**Parameters:**

- `connector` (string, path) - The connector type

- Success Response
- Error Responses

**Status:** `200 OK`
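The per-connector schedule, pause, and resume endpoints share the same path shape, which a client can build with a small helper. The base prefix is documented above; the helper itself is an illustrative sketch, not part of the API.

```python
# Illustrative client-side path builder for the per-connector endpoints.
# BASE is the documented prefix; the function is a hypothetical helper.
BASE = "/api/v1/crawlingManager"

def endpoint(connector: str, action: str) -> str:
    """Build the path for a per-connector action: schedule, pause, or resume."""
    if action not in {"schedule", "pause", "resume"}:
        raise ValueError(f"unsupported action: {action}")
    return f"{BASE}/{connector}/{action}"
```

For example, `endpoint("slack", "pause")` yields `/api/v1/crawlingManager/slack/pause`.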
### GET /stats - Get Queue Statistics
Retrieve statistics about the crawling job queue.

**Endpoint:** `GET /api/v1/crawlingManager/stats`

- Success Response
- Error Responses

**Status:** `200 OK`

**Statistics Fields:**

- `waiting` - Number of jobs waiting to be processed
- `active` - Number of jobs currently being processed
- `completed` - Number of completed jobs
- `failed` - Number of failed jobs
- `delayed` - Number of delayed jobs
- `paused` - Number of paused jobs
- `repeatable` - Number of repeatable/scheduled jobs
- `total` - Total number of jobs across all states
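If `total` equals the sum of the per-state counts (an assumption: the documentation only says "across all states", without specifying whether `repeatable` is included), a client can sanity-check a stats response like so. The sample numbers are made up for illustration.

```python
# Sanity check over the documented statistics fields, assuming `total`
# is the sum of all listed per-state counts (an assumption).
def check_stats(stats: dict) -> bool:
    states = ["waiting", "active", "completed", "failed",
              "delayed", "paused", "repeatable"]
    return stats["total"] == sum(stats[s] for s in states)

# Made-up sample response for illustration.
sample = {"waiting": 2, "active": 1, "completed": 10, "failed": 1,
          "delayed": 0, "paused": 1, "repeatable": 3, "total": 18}
```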
## Data Types

- Enums
- Interfaces
## System Configuration

- Rate Limiting
- Job Lifecycle
- Sync Events
The API includes built-in rate limiting and concurrency controls:
- Maximum 5 concurrent jobs per queue
- Jobs are automatically retried with exponential backoff (5000ms initial delay) on failure
- Stalled jobs are detected after 30 seconds
- Maximum 3 retry attempts for failed jobs
- Job history retention: Last 10 completed and 10 failed jobs per connector type
- Jobs are removed and recreated when updating schedules (no job modification)
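Under the settings above (3 retry attempts, 5000 ms initial delay, exponential backoff), and assuming a doubling factor per attempt (a common default for exponential backoff in BullMQ-style queues, not stated in this document), the retry delays would work out as follows:

```python
# Sketch of the retry schedule implied by the configuration above,
# assuming the delay doubles on each retry (an assumption).
def retry_delays(max_retries: int = 3, initial_ms: int = 5000) -> list:
    """Delay in ms before each retry: 5000, 10000, 20000, ..."""
    return [initial_ms * 2 ** i for i in range(max_retries)]
```

So a job that keeps failing would be retried after roughly 5, 10, and 20 seconds before being marked `failed` for good.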
## Error Handling

All endpoints follow a consistent error response format:

- `200` - Success
- `201` - Created (for scheduling jobs)
- `400` - Bad Request (validation errors, invalid configuration)
- `401` - Unauthorized (missing or invalid authentication)
- `403` - Forbidden (insufficient privileges, admin required)
- `404` - Not Found (job not found)
- `500` - Internal Server Error
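A client might bucket these documented status codes into coarse handling categories; the grouping below is illustrative, not prescribed by the API.

```python
# Illustrative client-side categorization of the documented status codes.
def classify(status: int) -> str:
    if status in (200, 201):
        return "success"
    if status in (401, 403):
        return "auth-error"      # re-authenticate or check admin role
    if status in (400, 404):
        return "client-error"    # fix the request before retrying
    if status == 500:
        return "server-error"    # safe to retry with backoff
    return "unknown"
```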