API Usage Guide

Use this guide after you deploy your own Rafiki Worker and Modal backend. It is the main API reference for the Cloudflare routes your app calls. Do not call internal Modal URLs directly or send internal service headers from clients.

If you are still setting up, start with Getting Started, Deploy to Modal, and Deploy to Cloudflare.

Base URL

https://<your-worker>.workers.dev

Most Developers Start Here

If you need to... | Use | Notes
send one request and wait for the full answer | POST /query | simplest request/response path
stream execution events live | GET /query_stream | WebSocket upgrade
run work in the background | POST /submit | poll with GET /jobs/{job_id}

Auth

What your client sends

  • All public endpoints except GET /health require Authorization: Bearer <session_token>.
  • WebSocket routes also accept token=<session_token> as a query parameter.
  • Session tokens are usually minted by your server. They authorize one session_id or a list of session IDs, and can also carry user_id and tenant_id scope.
  • Session-scoped read routes that do not already carry session_id in the path or body must include session_id as a query parameter. Common examples are /schedules?... and some /jobs/{job_id}... reads.
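As a rough illustration of the two ways a client can present its token, the sketch below builds the Authorization header for HTTP routes and a WebSocket URL that carries token=<session_token>. The helper names auth_headers and websocket_url are hypothetical, and switching https to wss is an assumption about how your client opens the socket.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def auth_headers(session_token: str) -> dict[str, str]:
    """Headers for any authenticated HTTP route (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {session_token}",
        "Content-Type": "application/json",
    }


def websocket_url(base_url: str, path: str, session_token: str, **params: str) -> str:
    """Build a WebSocket URL that passes the token as a query parameter.

    Assumption: the Worker base URL is https://..., so the socket uses wss://.
    Extra keyword arguments (for example session_id) become query parameters.
    """
    parts = urlsplit(base_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    query = urlencode({**params, "token": session_token})
    return urlunsplit((scheme, parts.netloc, path, query, ""))
```

Prefer the header where your WebSocket client allows it; query-string tokens tend to end up in access logs.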

What Rafiki handles internally

  • Rafiki signs and verifies its own service-to-service auth between the Worker and the execution backend.
  • Clients should only integrate with the public Worker routes documented here.

Core Endpoints

GET /health

Health probe for the public Worker.

POST /query

Non-streaming query execution. This is the easiest first integration path.

Example:

curl -X POST https://<your-worker>.workers.dev/query \
  -H "Authorization: Bearer <session_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the capital of Canada?",
    "session_id": "sess_demo_001"
  }'

Request body:

{
  "question": "Your prompt",
  "agent_type": "default",
  "session_id": null,
  "session_key": null,
  "fork_session": false
}

Error responses:

  • 400 when the request body is not valid JSON or does not include a string question
  • 409 when a session is already executing
  • 429 when session budget rails deny preflight execution
  • 500 when execution fails after request validation
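One way a client might react to the 409 case is to wait and try again while leaving the other statuses to the caller. The sketch below is only that: query_with_busy_retry and its send callable are hypothetical, and the decision to retry 409 with exponential backoff is an assumption, not documented Rafiki behavior.

```python
import time
from typing import Callable, Tuple


def query_with_busy_retry(
    send: Callable[[], Tuple[int, dict]],
    attempts: int = 3,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns something other than 409.

    send is a hypothetical callable that performs POST /query and returns
    (status_code, parsed_body). Only 409 (session already executing) is
    retried; 400, 429, and 500 are returned to the caller unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = send()
        if status != 409:
            return status, body
        if attempt < attempts - 1:
            # Back off a little longer after each busy response.
            sleep(delay_s * (2 ** attempt))
    return status, body
```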

Response envelope:

{
  "ok": true,
  "messages": [
    {
      "type": "assistant",
      "content": [{ "type": "text", "text": "..." }],
      "model": "gpt-4.1"
    },
    {
      "type": "result",
      "subtype": "success",
      "duration_ms": 1234,
      "session_id": "...",
      "usage": {
        "requests": 1,
        "input_tokens": 100,
        "output_tokens": 50,
        "total_tokens": 150
      },
      "result": "..."
    }
  ],
  "summary": {
    "text": "...",
    "is_complete": true,
    "subtype": "success",
    "duration_ms": 1234,
    "session_id": "..."
  },
  "session_id": "..."
}
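To show how the envelope above might be consumed, here is a minimal sketch that pulls out the final answer text and the token count. The function names are hypothetical, and falling back from summary.text to the result message is an assumption about what a client would want, not a documented rule.

```python
def final_answer(envelope: dict) -> str:
    """Return the answer text from a /query response envelope."""
    if not envelope.get("ok"):
        raise ValueError("envelope reports ok=false")
    text = (envelope.get("summary") or {}).get("text")
    if text is not None:
        return text
    # Assumed fallback: use the last "result" message if there is no summary text.
    for message in reversed(envelope.get("messages") or []):
        if message.get("type") == "result":
            return message.get("result") or ""
    return ""


def total_tokens(envelope: dict) -> int:
    """Sum usage.total_tokens across all "result" messages."""
    return sum(
        (message.get("usage") or {}).get("total_tokens", 0)
        for message in envelope.get("messages") or []
        if message.get("type") == "result"
    )
```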

GET /query_stream

Streaming query execution via WebSocket upgrade.

Use this when your client needs live execution updates instead of one final response.

Expected public WebSocket message types include:

  • connection_ack
  • query_start
  • assistant_message
  • execution_state
  • query_complete
  • query_error

When resuming a session, pass session_id in the query string. WebSocket query messages must not include session_id, session_key, user_id, or tenant_id; Rafiki derives actor scope from the authenticated connection.
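The rules above can be sketched as two small client-side helpers: one that rejects outgoing query messages carrying identity fields, and one that reads frames until a terminal message arrives. Both names are hypothetical. The sketch assumes each frame is a JSON object with a type field and that query_complete and query_error end the stream; neither detail is spelled out in this guide.

```python
import json
from typing import Iterable

# Fields this guide says a WebSocket query message must not include.
FORBIDDEN_QUERY_FIELDS = {"session_id", "session_key", "user_id", "tenant_id"}
# Assumption: these two message types mark the end of a streamed query.
TERMINAL_TYPES = {"query_complete", "query_error"}


def check_outgoing_query(message: dict) -> dict:
    """Raise if the message carries identity fields; otherwise return it."""
    leaked = FORBIDDEN_QUERY_FIELDS.intersection(message)
    if leaked:
        raise ValueError(f"remove identity fields: {sorted(leaked)}")
    return message


def collect_until_done(frames: Iterable[str]) -> list[dict]:
    """Parse raw text frames and stop after the first terminal message."""
    seen = []
    for raw in frames:
        message = json.loads(raw)
        seen.append(message)
        if message.get("type") in TERMINAL_TYPES:
            break
    return seen
```

In a real client, frames would be whatever iterator your WebSocket library exposes for incoming text messages.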

GET /ws or GET /events

WebSocket event-bus subscriptions for multi-session updates.

Authentication:

  • Authorization: Bearer <session_token> header, or
  • token=<session_token> query parameter

Supported subscription query parameters:

  • session_id
  • session_ids (comma-separated)
  • user_id
  • tenant_id

Expected public event-bus message types include:

  • connection_ack
  • presence_update
  • session_update
  • job_submitted
  • job_status
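For illustration, a subscription URL for the event bus could be assembled as below. event_bus_url is a hypothetical helper; the wss:// scheme and the choice to send session_id for one ID and session_ids for several are assumptions layered on the parameter list above.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def event_bus_url(
    worker_host: str,
    session_ids: Sequence[str] = (),
    user_id: Optional[str] = None,
    tenant_id: Optional[str] = None,
    path: str = "/ws",
) -> str:
    """Build a subscription URL for GET /ws (or /events via path=)."""
    params = {}
    ids = list(session_ids)
    if len(ids) == 1:
        params["session_id"] = ids[0]
    elif ids:
        # session_ids is documented as comma-separated.
        params["session_ids"] = ",".join(ids)
    if user_id is not None:
        params["user_id"] = user_id
    if tenant_id is not None:
        params["tenant_id"] = tenant_id
    # safe="," keeps the separator readable instead of encoding it as %2C.
    query = urlencode(params, safe=",")
    return f"wss://{worker_host}{path}" + (f"?{query}" if query else "")
```

Authentication is left out of the URL here; add the Authorization header or the token query parameter as described under Authentication.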

Session Resources

Only the /state, /messages, /queue, /queue/{prompt_id}, and /stop routes under /session/{session_id} are part of the public Worker contract.

  • GET /session/{session_id} and /session/{session_id}/query are blocked at the Worker edge and return 404
  • Methods outside the documented set return 405

GET /session/{session_id}/state

Returns current session metadata.

Response payload:

{
  "ok": true,
  "state": {
    "session_id": "...",
    "session_key": "...",
    "user_id": "...",
    "tenant_id": "...",
    "created_at": 1234,
    "last_active_at": 1234,
    "status": "idle"
  }
}

GET /session/{session_id}/messages

Returns persisted user and assistant message history for the session.

Response payload:

{
  "ok": true,
  "messages": [
    {
      "id": "...",
      "session_id": "...",
      "role": "user",
      "content": [{ "type": "text", "text": "..." }],
      "created_at": 1234
    }
  ]
}

POST /session/{session_id}/queue

Queue a prompt for sequential execution in the target session.

Actor scope is derived from the authenticated session context, not from client-controlled identity fields in the request body.

Request body:

{
  "question": "Your prompt",
  "agent_type": "default"
}

Error responses:

  • 400 when the request body is not valid JSON or does not include a string question
  • 429 when the session queue has reached its configured max size
  • 429 when session budget rails deny queue preflight

GET /session/{session_id}/queue

Returns queue state, prompt positions, and expiry timestamps.

Response payload:

{
  "ok": true,
  "session_id": "...",
  "is_executing": false,
  "queue_size": 1,
  "max_queue_size": 10,
  "prompts": [
    {
      "prompt_id": "...",
      "question": "...",
      "user_id": "...",
      "queued_at": 1234,
      "expires_at": 5678,
      "position": 1
    }
  ]
}
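A client could read this payload before queueing more work, for example to avoid the 429 returned when the queue is full. The helpers below are a hypothetical sketch that relies only on the fields shown in the payload above.

```python
from typing import Optional


def queue_has_room(state: dict) -> bool:
    """True when another prompt should fit under max_queue_size."""
    return state["queue_size"] < state["max_queue_size"]


def position_of(state: dict, prompt_id: str) -> Optional[int]:
    """Return the queue position of prompt_id, or None if it is not queued."""
    for prompt in state.get("prompts", []):
        if prompt.get("prompt_id") == prompt_id:
            return prompt.get("position")
    return None
```

The check is advisory only: another client may enqueue a prompt between the read and your POST, so the 429 path still needs handling.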

DELETE /session/{session_id}/queue

Clears the session queue.

Response payload includes ok, session_id, cleared_count, and a human-readable message.

DELETE /session/{session_id}/queue/{prompt_id}

Removes a single queued prompt.

Error responses:

  • 404 when the prompt_id is not present in the queue

POST /session/{session_id}/stop

Stops an active run.

Request body:

{
  "mode": "graceful",
  "reason": "optional"
}

Modes:

  • graceful: after-turn cancellation
  • immediate: immediate cancellation

Response payload mirrors the stop contract, including status, requested_at, expires_at, reason, and requested_by.

Additional behavior:

  • requested_by is derived from the authenticated actor scope and is not accepted as a client-controlled field
  • 400 when the request body is not valid JSON or fails the public stop schema
  • 502 when the upstream stop response fails the public runtime contract
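A small client-side builder can catch an invalid mode before the request reaches the Worker and get a 400. build_stop_body is a hypothetical helper; treating reason as optional follows the sample body above, and the full public stop schema may contain rules this sketch does not check.

```python
from typing import Optional

STOP_MODES = ("graceful", "immediate")


def build_stop_body(mode: str = "graceful", reason: Optional[str] = None) -> dict:
    """Build the JSON body for POST /session/{session_id}/stop.

    requested_by is deliberately not a parameter: the guide states it is
    derived from the authenticated actor scope, not sent by the client.
    """
    if mode not in STOP_MODES:
        raise ValueError(f"mode must be one of {STOP_MODES}, got {mode!r}")
    body = {"mode": mode}
    if reason is not None:
        body["reason"] = reason
    return body
```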

GET /session/{session_id}/stop

Returns cancellation flag state.

Error responses:

  • 502 when the upstream stop-status response fails the public runtime contract

Session Semantics

  • A new run returns a stable session_id
  • Reuse session_id to continue memory
  • Set fork_session=true to branch into a new session_id with inherited history
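The three cases above map onto the /query request body roughly as follows. build_query_body is a hypothetical helper, and requiring a session_id when forking is an assumption based on the branch inheriting an existing session's history.

```python
from typing import Optional


def build_query_body(
    question: str,
    session_id: Optional[str] = None,
    fork_session: bool = False,
) -> dict:
    """Build a /query body for a new, continued, or forked session."""
    if not isinstance(question, str):
        raise TypeError("question must be a string")
    if fork_session and session_id is None:
        # Assumption: a fork needs an existing session to inherit from.
        raise ValueError("fork_session requires the session_id to branch from")
    body = {"question": question}
    if session_id is not None:
        body["session_id"] = session_id
    if fork_session:
        body["fork_session"] = True
    return body
```

For a new run, omit session_id and read the returned session_id from the response; pass it back on later calls to continue, or add fork_session=True to branch.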

Background Jobs

POST /submit

Queues a background job.

Error responses:

  • 400 when the request body is not valid JSON or fails the public job schema

GET /jobs/{job_id}

Returns the job status, plus result, summary, and metrics once the job is complete.

Job reads enforce the authenticated session, user, and tenant scope. Rafiki returns deterministic 403 responses when ownership checks fail.

Additional validation:

  • 502 when the upstream job-status payload is malformed
  • 502 when the upstream payload omits contract-required identity fields
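Polling GET /jobs/{job_id} after POST /submit could look like the loop below. This is a sketch under stated assumptions: poll_job and fetch_status are hypothetical names, the payload is assumed to expose a top-level status field, and because this guide does not list the status values, the caller supplies the set it treats as terminal.

```python
import time
from typing import Callable, Collection


def poll_job(
    fetch_status: Callable[[], dict],
    terminal_statuses: Collection[str],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until the job reports a terminal status.

    fetch_status is a hypothetical callable that performs the authenticated
    GET /jobs/{job_id} request and returns the parsed JSON payload.
    """
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") in terminal_statuses:
            return payload
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_attempts} polls")
```

A 403 or 502 from the status route should be surfaced by fetch_status rather than retried here, since neither is expected to resolve by waiting.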

GET /jobs/{job_id}/artifacts

Returns the artifact manifest for a completed job.

Artifact reads also enforce authenticated ownership scope and return 403 on mismatch.

GET /jobs/{job_id}/artifacts/{artifact_path}

Downloads a single artifact.

Additional behavior:

  • Rafiki performs ownership checks before attempting the download
  • Malformed URL-encoded artifact paths return a deterministic 400 with the error Invalid artifact path encoding
  • Rafiki handles internal artifact-access tokens on the backend; clients call the public Worker route only
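Encoding the artifact path on the client is one way to avoid the 400 described above. The sketch below percent-encodes each path segment and keeps "/" as the separator; artifact_url is a hypothetical helper, and whether the Worker expects separators left unencoded is an assumption you should confirm against your deployment.

```python
from typing import Optional
from urllib.parse import quote


def artifact_url(
    base_url: str,
    job_id: str,
    artifact_path: str,
    session_id: Optional[str] = None,
) -> str:
    """Build the download URL for GET /jobs/{job_id}/artifacts/{artifact_path}."""
    # Encode every segment fully so characters such as "%" or spaces are valid.
    encoded_path = "/".join(quote(segment, safe="") for segment in artifact_path.split("/"))
    url = f"{base_url.rstrip('/')}/jobs/{quote(job_id, safe='')}/artifacts/{encoded_path}"
    if session_id is not None:
        # Some job reads need session_id as a query parameter (see Auth).
        url += f"?session_id={quote(session_id, safe='')}"
    return url
```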

Schedules

POST /schedules

Creates a schedule owned by the authenticated session and actor scope.

Error responses:

  • 400 when the request body is not valid JSON or fails the public schedule-create schema

GET /schedules

Lists schedules visible to the authenticated actor scope.

Additional validation:

  • 502 when the upstream payload does not match the public schedule-list contract

GET /schedules/{schedule_id}

Returns a single schedule resource.

Additional validation:

  • 502 when the upstream payload does not match the public schedule contract

PATCH /schedules/{schedule_id}

Updates a schedule resource.

Error responses:

  • 400 when the request body is not valid JSON or fails the public schedule-update schema

Browser preflight for this route explicitly allows PATCH.
