
Job System

The job system is the backbone of OSAPI. Every state-reading or state-changing operation runs as an asynchronous job, allowing the API server to remain unprivileged while agents execute operations on target hosts.

How It Works

OSAPI uses a KV-first, stream-notification architecture built on NATS JetStream:

  1. The API server writes a job definition to a NATS KV bucket
  2. A notification is published to a NATS stream
  3. An agent receives the notification, reads the job from KV, and executes the operation
  4. The agent writes the result to a response KV bucket
  5. The client polls the API server, which reads the result from KV
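
The five steps above can be sketched with in-memory stand-ins. This is a simulation, not OSAPI's implementation: plain dicts play the two KV buckets, a `queue.Queue` plays the stream, and the function names, the operation name `node.hostname.get`, and the payload shape are all hypothetical.

```python
import json
import queue
import uuid

# In-memory stand-ins for the job KV bucket, the response KV bucket,
# and the notification stream. Real deployments use NATS JetStream.
job_kv = {}
response_kv = {}
notifications = queue.Queue()


def submit_job(operation, data):
    """API server: write the job to KV first, then publish a notification."""
    job_id = str(uuid.uuid4())
    job_kv[job_id] = json.dumps({"operation": operation, "data": data})
    notifications.put(job_id)  # the notification carries only the KV key
    return job_id


def run_next_job(handlers):
    """Agent: take one notification, load the job from KV, run it, store the result."""
    job_id = notifications.get_nowait()
    job = json.loads(job_kv[job_id])
    result = handlers[job["operation"]](job["data"])
    response_kv[job_id] = json.dumps({"result": result})


def poll_result(job_id):
    """API server: return the stored result, or None while none exists yet."""
    raw = response_kv.get(job_id)
    return json.loads(raw) if raw is not None else None


# Example run with a hypothetical operation name and handler.
job_id = submit_job("node.hostname.get", {})
pending = poll_result(job_id)  # no agent has run the job yet
run_next_job({"node.hostname.get": lambda data: "web-01"})
result = poll_result(job_id)
```

Because the job definition lives in KV and the notification only points at it, the sketch never passes a payload through the queue itself. That is the property "KV-first" refers to.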

Job Routing

Each job carries a target that determines which agents receive it:

Target        Behavior
_any          Load-balanced across available agents (default)
_all          Broadcast to every agent
hostname      Sent to a specific host
group:label   Sent to all agents matching a label

Agents register with their hostname and optional key-value labels. Label values support hierarchical matching on dot separators (e.g., the target group:web.dev matches an agent labeled group: web.dev.us-east).
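
As a rough illustration of the routing rules, the sketch below resolves a target against a set of registered agents. It assumes hierarchical matching means "the selector is a whole-segment prefix of the label value", which is consistent with the example above but is not confirmed by it. The function names and the agent registry shape are hypothetical.

```python
def label_matches(selector, agent_value):
    """True if selector equals agent_value or is a dot-separated prefix of it."""
    sel = selector.split(".")
    val = agent_value.split(".")
    return val[:len(sel)] == sel


def resolve_targets(target, agents):
    """Return the hostnames a target selects. agents maps hostname -> labels dict."""
    if target == "_all":
        return sorted(agents)
    if target == "_any":
        # Real load balancing happens in the queue group; this sketch
        # simply picks the first hostname in sorted order.
        return sorted(agents)[:1]
    if ":" in target:
        key, _, selector = target.partition(":")
        return sorted(
            host
            for host, labels in agents.items()
            if key in labels and label_matches(selector, labels[key])
        )
    # Anything else is treated as a literal hostname.
    return [target] if target in agents else []


agents = {
    "web-01": {"group": "web.dev.us-east"},
    "web-02": {"group": "web.prod.us-east"},
    "db-01": {},
}
dev_hosts = resolve_targets("group:web.dev", agents)
web_hosts = resolve_targets("group:web", agents)
```

Comparing whole segments rather than raw string prefixes keeps a selector such as `web.de` from matching `web.dev.us-east`.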

Job Lifecycle

Jobs progress through a defined set of states.

Jobs are created implicitly when you call a domain endpoint. Once created, jobs can be listed, inspected, deleted, and retried through the job API and CLI. See CLI Reference for usage and examples, or the API Reference for the REST endpoints.

Configuration

nats:
  kv:
    bucket: 'job-queue' # KV bucket for job definitions
    response_bucket: 'job-responses' # KV bucket for results
    ttl: '1h' # Entry time-to-live
    max_bytes: 104857600 # 100 MiB max bucket size

agent:
  max_jobs: 10 # Max concurrent jobs
  queue_group: 'job-agents' # Queue group for load balancing
  hostname: '' # Defaults to OS hostname
  labels: # Key-value labels for routing
    group: 'web.dev.us-east'

See Configuration for the full reference including NATS stream, consumer, and DLQ settings.
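
To make the units in the example concrete, the following sketch converts the `ttl` and `max_bytes` values into seconds and MiB. It assumes `ttl` is a single number followed by `s`, `m`, or `h`. The helper names are hypothetical, and OSAPI's own duration parsing may accept more forms.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_ttl(text):
    """Convert a single-unit duration such as '1h' or '30m' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def mib(count):
    """Return the number of bytes in `count` mebibytes."""
    return count * 1024 * 1024


ttl_seconds = parse_ttl("1h")
max_bytes = mib(100)
```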

Permissions

Operation       Permission
Create job      Domain-specific (e.g., node:read, network:write)
List/get jobs   job:read
Delete job      job:write
Retry job       job:write

Jobs are created implicitly through typed domain endpoints, so the permission required to create a job depends on the operation being performed. For example, reading a hostname requires node:read, while updating DNS requires network:write.
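
The permission rules can be sketched as a lookup plus a membership check. The permission strings come from the table above. The action keys, the two example domain operations, and the function names are hypothetical and do not reflect OSAPI's actual authorization code.

```python
# Permissions for operations on existing jobs, taken from the table.
JOB_PERMISSIONS = {
    "list": "job:read",
    "get": "job:read",
    "delete": "job:write",
    "retry": "job:write",
}

# Job creation depends on the domain operation; two examples from the text.
DOMAIN_PERMISSIONS = {
    "read_hostname": "node:read",
    "update_dns": "network:write",
}


def required_permission(action):
    """Look up the permission needed for a job action or domain operation."""
    if action in JOB_PERMISSIONS:
        return JOB_PERMISSIONS[action]
    return DOMAIN_PERMISSIONS[action]


def is_allowed(action, granted):
    """True if the caller's granted permissions include the required one."""
    return required_permission(action) in granted


granted = {"node:read", "job:read"}
can_read_hostname = is_allowed("read_hostname", granted)
can_retry = is_allowed("retry", granted)
```

In this sketch a caller holding only `node:read` and `job:read` can create hostname-read jobs and inspect them, but cannot retry or delete them.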