Job System

The job system is the backbone of OSAPI. Every state-reading or state-changing operation runs as an asynchronous job, allowing the API server to remain unprivileged while agents execute operations on target hosts.

How It Works

OSAPI uses a KV-first, stream-notification architecture built on NATS JetStream:

  1. The API server writes a job definition to a NATS KV bucket
  2. A notification is published to a NATS stream
  3. An agent receives the notification, reads the job from KV, and executes the operation
  4. The agent writes the result to a response KV bucket
  5. The client polls the API server, which reads the result from KV
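The five steps above can be sketched with in-memory stand-ins. This is not the OSAPI implementation and uses no NATS client: the two dictionaries stand in for the KV buckets, the deque stands in for the stream, and the function names, the job payload shape, and the `hostname.get` operation name are all hypothetical.

```python
import json
import uuid
from collections import deque

# In-memory stand-ins (assumptions, not real NATS objects).
job_kv = {}        # stands in for the job-definition KV bucket
response_kv = {}   # stands in for the response KV bucket
stream = deque()   # stands in for the notification stream


def submit_job(operation, target="_any"):
    """API server side: write the job to KV first, then publish a notification."""
    job_id = str(uuid.uuid4())
    job_kv[job_id] = json.dumps({"operation": operation, "target": target})
    stream.append(job_id)  # the notification carries only the job ID
    return job_id


def agent_step(handlers):
    """Agent side: take one notification, read the job from KV, run it, store the result."""
    if not stream:
        return False
    job_id = stream.popleft()
    job = json.loads(job_kv[job_id])
    result = handlers[job["operation"]]()
    response_kv[job_id] = json.dumps({"result": result})
    return True


def poll_result(job_id):
    """API server side: return the stored result, or None if the agent has not written one."""
    raw = response_kv.get(job_id)
    return json.loads(raw) if raw is not None else None


job_id = submit_job("hostname.get")
print(poll_result(job_id))                      # nothing written yet
agent_step({"hostname.get": lambda: "web-01"})  # the agent executes the job
print(poll_result(job_id))                      # the result is now readable
```

Because the job definition is written before the notification is published, the notification only needs to identify the job; the agent always finds the full definition in KV.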

Job Routing

Jobs can be targeted to specific agents using routing modes:

Target        Behavior
_any          Load-balanced across available agents (default)
_all          Broadcast to every agent
hostname      Sent to a specific host
group:label   Sent to all agents matching a label

Agents register with their hostname and optional key-value labels. Label values are hierarchical, with dots separating the levels, so a target matches every agent whose label falls under it (e.g., group:web.dev matches agents labeled group: web.dev.us-east).
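The routing modes can be illustrated with a small resolver. This is a sketch under stated assumptions, not OSAPI's routing code: `resolve_targets` and `label_matches` are hypothetical names, hierarchical matching is assumed to mean a prefix match on whole dot-separated segments, and `_any` is modeled as a random pick rather than real queue-group load balancing.

```python
import random


def label_matches(pattern, value):
    """True if value equals pattern or extends it by whole dot-separated segments."""
    return value == pattern or value.startswith(pattern + ".")


def resolve_targets(target, agents, rng=random):
    """Return the hostnames a job would reach.

    agents maps hostname -> dict of labels (an assumed registry shape).
    """
    if target == "_all":
        return sorted(agents)
    if target == "_any":
        # Stand-in for load balancing: pick one available agent.
        return [rng.choice(sorted(agents))] if agents else []
    if ":" in target:
        key, _, pattern = target.partition(":")
        return sorted(
            host
            for host, labels in agents.items()
            if label_matches(pattern, labels.get(key, ""))
        )
    # Otherwise treat the target as a hostname.
    return [target] if target in agents else []


agents = {
    "web-01": {"group": "web.dev.us-east"},
    "web-02": {"group": "web.prod"},
    "db-01": {},
}
print(resolve_targets("group:web.dev", agents))
print(resolve_targets("_all", agents))
```

Matching on `pattern + "."` rather than a bare string prefix keeps `group:web.de` from matching `web.dev.us-east`.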

Job Lifecycle

Jobs progress through a defined set of states:

Jobs can be listed, inspected, deleted, and retried through the API and CLI. See CLI Reference for usage and examples, or the API Reference for the REST endpoints.

Configuration

nats:
  kv:
    bucket: 'job-queue' # KV bucket for job definitions
    response_bucket: 'job-responses' # KV bucket for results
    ttl: '1h' # Entry time-to-live
    max_bytes: 104857600 # 100 MiB max bucket size

node:
  agent:
    max_jobs: 10 # Max concurrent jobs
    queue_group: 'job-agents' # Queue group for load balancing
    hostname: '' # Defaults to OS hostname
    labels: # Key-value labels for routing
      group: 'web.dev.us-east'

See Configuration for the full reference including NATS stream, consumer, and DLQ settings.
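To make the effect of max_jobs concrete, the sketch below caps concurrent work with a semaphore. How the agent actually enforces the limit is not described here, so treat this as one possible model; `run_jobs` and the sample job are hypothetical.

```python
import asyncio


async def run_jobs(jobs, max_jobs=10):
    """Run all jobs, never more than max_jobs at once. Returns (results, peak)."""
    limit = asyncio.Semaphore(max_jobs)
    running = 0
    peak = 0

    async def run_one(job):
        nonlocal running, peak
        async with limit:  # waits while max_jobs jobs are already running
            running += 1
            peak = max(peak, running)
            try:
                return await job()
            finally:
                running -= 1

    results = await asyncio.gather(*(run_one(job) for job in jobs))
    return results, peak


async def sample_job():
    await asyncio.sleep(0.01)  # placeholder for real work
    return "ok"


results, peak = asyncio.run(run_jobs([sample_job] * 25, max_jobs=10))
print(len(results), peak <= 10)
```

Jobs beyond the limit are not rejected in this model; they wait until a slot frees up.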

Permissions

Operation       Permission
Create job      job:write
List/get jobs   job:read
Delete job      job:write
Retry job       job:write
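The table above maps directly to a lookup. The following is a minimal sketch of such a check, assuming the caller's granted permissions are available as a set of strings; the operation keys and the `is_allowed` helper are hypothetical, and only the permission names come from the table.

```python
# Required permission per job operation, taken from the table above.
REQUIRED_PERMISSION = {
    "create": "job:write",
    "list": "job:read",
    "get": "job:read",
    "delete": "job:write",
    "retry": "job:write",
}


def is_allowed(operation, granted):
    """True if the granted permission set covers the operation; unknown operations are denied."""
    required = REQUIRED_PERMISSION.get(operation)
    return required is not None and required in granted


print(is_allowed("list", {"job:read"}))   # read-only caller listing jobs
print(is_allowed("retry", {"job:read"}))  # read-only caller retrying a job
```

Note that a caller holding only job:read can inspect jobs but cannot create, delete, or retry them.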