
# Streaming

Server-Sent Events (SSE) provide a lightweight, one-way stream over HTTP: the server pushes incremental tokens and events to the client, which keeps UIs responsive during long-running completions.

#### Key concepts

* **Transport**: HTTP with `Content-Type: text/event-stream`, connection kept open.
* **Events**: Lines prefixed by `event:` and `data:`; each event ends with a blank line.
* **Heartbeat**: Periodic comments (`: ping`) keep the connection alive.
* **Termination**: A final event (e.g., `event: done`) or stream close.
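
The wire format described above can be illustrated with a small parser. This is a minimal sketch, not the service's own implementation: the names `ParsedEvent` and `parseSSE` are hypothetical, and the function assumes it receives only complete events (no partial trailing frame).

```typescript
// A parsed SSE event: the `event:` name and the joined `data:` payload.
interface ParsedEvent {
  event: string;
  data: string;
}

// Parse a block of complete SSE text into events.
// Assumption: `raw` contains whole events only.
function parseSSE(raw: string): ParsedEvent[] {
  const events: ParsedEvent[] = [];
  // Each event ends with a blank line.
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default name when no `event:` line is present
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      // Lines starting with ":" are comments, e.g. the ": ping" heartbeat.
      if (line === "" || line.startsWith(":")) continue;
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Blocks without any `data:` line (such as heartbeats) produce no event.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return events;
}
```

Heartbeat comments produce no event here, so a caller only ever sees named events with a payload.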

#### Typical flow

1. Client sends a completion request indicating streaming mode.
2. Server responds with `text/event-stream` and starts sending tokens/events.
3. Client renders tokens incrementally and listens for the terminal event.
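
When a client reads the response body itself rather than using a browser `EventSource`, network chunks do not necessarily line up with event boundaries. The sketch below buffers incoming text and releases only complete frames; `SSEFrameBuffer` is a hypothetical helper name, not part of the API.

```typescript
// Buffers streamed text and yields complete SSE frames
// (the text between blank-line separators).
class SSEFrameBuffer {
  private pending = "";

  // Add a chunk of decoded text; returns any frames completed by it.
  push(chunk: string): string[] {
    // Normalize CRLF on the whole buffer so a "\r\n" split across chunks is handled.
    this.pending = (this.pending + chunk).replace(/\r\n/g, "\n");
    const parts = this.pending.split("\n\n");
    // The last part is either empty or an unfinished frame; keep it for later.
    this.pending = parts.pop() ?? "";
    return parts.filter((frame) => frame.trim() !== "");
  }
}
```

Each returned frame can then be parsed for its `event:` and `data:` lines and rendered immediately, which is what allows incremental display.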

### Example of a full SSE stream

![SSE Example](/files/1KP4QVMOShQNnmClQ3lW)

## How our SSE works

![Overview of our streaming SSE](/files/hEcwB5dWbp8jtQ5QpoVo)

Our streaming communication flow consists of four main components:

* Client
* Streaming Service
* Completion Service
* AI Provider

The sequence can be broken down into several key phases:

### Initial Connection Setup

* The Client initiates a connection to the Streaming Service using the endpoint `/message-stream/{convo_id}/{session_id}`.

{% openapi src="<https://blocky.theblockbrain.ai/openapi.json>" path="/message-stream/{convo\_id}/{session\_id}" method="get" %}
<https://blocky.theblockbrain.ai/openapi.json>
{% endopenapi %}

* Once the connection is established, the Streaming Service responds with a `connected` event.
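
As a rough illustration of the connection setup, the sketch below builds the stream URL and reacts to the `connected` event. The base URL is a placeholder, and `EventSourceLike`, `streamUrl`, and `onConnected` are hypothetical names; authentication and error handling are omitted because they depend on your deployment.

```typescript
// Minimal subset of the browser EventSource API that this sketch relies on.
interface EventSourceLike {
  addEventListener(type: string, listener: (ev: { data: string }) => void): void;
  close(): void;
}

// Build the stream URL. `baseUrl` is a placeholder for your API host.
function streamUrl(baseUrl: string, convoId: string, sessionId: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/message-stream/${encodeURIComponent(convoId)}/${encodeURIComponent(sessionId)}`;
}

// Call `ready` with the payload message once the service confirms the stream.
function onConnected(source: EventSourceLike, ready: (message: string) => void): void {
  source.addEventListener("connected", (ev) => {
    const payload = JSON.parse(ev.data) as { message: string };
    ready(payload.message);
  });
}
```

In a browser, a native `EventSource` created from `streamUrl(...)` could be passed as the `source` argument (a type cast may be needed depending on your TypeScript settings). Waiting for `connected` before sending the completion request helps ensure no early events are missed.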

### Request Initiation

* The Client sends a request to the Completion Service using the endpoint `/cortex/completions/user-input`.

{% openapi src="<https://blocky.theblockbrain.ai/openapi.json>" path="/cortex/completions/user-input" method="post" %}
<https://blocky.theblockbrain.ai/openapi.json>
{% endopenapi %}

* The Completion Service forwards this request to the AI Provider with streaming enabled. The actual endpoint varies by AI Provider (OpenAI, Azure, Anthropic, Google, ...).

### Streaming Process

* Streaming begins with a `message_start` event, which indicates that the service has started generating a response. The event travels from the Completion Service to the Streaming Service via the streaming bus, and the Streaming Service forwards it to the Client.
* The AI Provider streams generated tokens incrementally to the Completion Service.
* Each token is forwarded in a `new_token` event through the chain: AI Provider → Completion Service → Streaming Service → Client.
* This token streaming process repeats until text generation is complete.
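
On the client side, the streaming phase amounts to appending each `token` to the reply identified by its `gid`. The following is a minimal sketch of that accumulation; `ReplyAssembler` is a hypothetical helper, and the event shapes mirror the interfaces listed in the event reference on this page.

```typescript
interface MessageStartEvent {
  role: string;
  gid: string; // The id of the reply message
}

interface NewTokenEvent {
  role: string;
  token: string; // The new token to append
  gid: string;
}

// Collects streamed tokens into one text per reply message.
class ReplyAssembler {
  private readonly drafts = new Map<string, string>();

  // `message_start`: begin an empty draft for this reply.
  start(ev: MessageStartEvent): void {
    this.drafts.set(ev.gid, "");
  }

  // `new_token`: append the token and return the text so far.
  append(ev: NewTokenEvent): string {
    const text = (this.drafts.get(ev.gid) ?? "") + ev.token;
    this.drafts.set(ev.gid, text);
    return text;
  }

  // Current text for a reply, or an empty string if nothing was received.
  text(gid: string): string {
    return this.drafts.get(gid) ?? "";
  }
}
```

Keying drafts by `gid` keeps the sketch tolerant of a `new_token` that arrives without a preceding `message_start`, for example after a late connection.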

### Completion and Persistence

* The Completion Service signals `message_end` when text generation is complete.
* The response message is persisted internally by the Completion Service.
* A `message_ready` event is sent to indicate the full response message is now available for further processing (copy, delete, generate text-to-speech, ...).
* The Completion Service returns a `200 OK` response to the original client request.
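
The distinction between `message_end` and `message_ready` matters for UI state: the text is complete after the former, but follow-up actions should wait for the latter. The sketch below tracks that lifecycle. `ReplyStatus`, `LifecycleEvent`, and `applyLifecycle` are hypothetical names, and the code assumes that the ids in `messageIds` match the reply's `gid`, as they do in the example data on this page.

```typescript
type ReplyStatus = "streaming" | "ended" | "ready";

type LifecycleEvent =
  | { event: "message_start"; data: { role: string; gid: string } }
  | { event: "message_end"; data: { role: string; gid: string } }
  | { event: "message_ready"; data: { messageIds: string[] } };

// Update the per-message status map according to one lifecycle event.
function applyLifecycle(status: Map<string, ReplyStatus>, ev: LifecycleEvent): void {
  switch (ev.event) {
    case "message_start":
      status.set(ev.data.gid, "streaming"); // generation has begun
      break;
    case "message_end":
      status.set(ev.data.gid, "ended"); // text complete, not yet persisted
      break;
    case "message_ready":
      // Assumption: each saved message id equals the `gid` seen while streaming.
      for (const id of ev.data.messageIds) status.set(id, "ready");
      break;
  }
}
```

A client might enable copy, delete, or text-to-speech controls only for messages whose status is `"ready"`.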

### Connection Termination

* The sequence ends with the Streaming Service closing the connection to the Client.

## List of SSE Events

Besides the main SSE events, other events provide extra information about the completion request. The details of each event are as follows:

<details>

<summary>Event: connected</summary>

Event: `connected` - The stream is connected

TypeScript Interface

```typescript
interface SSEData {
  message: string;
}
```

Example Data:

```json
{
  "message": "Connection established"
}
```

</details>

<details>

<summary>Event: user_message</summary>

Event: `user_message` - The input message has been saved and is available for further processing (copy, delete, generate text-to-speech, ...)

TypeScript Interface

```typescript
interface UserMessage {
  _id: string;
  createdAt: string;
  modifiedAt: string;
  botId: string;
  userId: string;
  content: string;
  role: string;
  model: string;
  messageType: string;
  status: string;
}
```

Example Data:

```json
{
  "_id": "678097bc45b174911deeeb44",
  "createdAt": "2025-01-10T03:45:00.534383",
  "modifiedAt": "2025-01-10T03:45:00.534384",
  "botId": "6628fddb8c9b707741f7b551",
  "userId": "255317220014445068",
  "content": "hello",
  "role": "user",
  "model": "azure-gpt-4o",
  "messageType": "user-question",
  "status": "activate"
}
```

</details>

<details>

<summary>Event: message_start</summary>

Event: `message_start` - The reply message is about to be sent

TypeScript Interface

```typescript
interface MessageStartEvent {
  role: string; // The role of the reply message, e.g. assistant
  gid: string; // The id of the reply message
}
```

Example Data:

```json
{
  "role": "assistant",
  "gid": "678097bc45b174911deeeb45"
}
```

</details>

<details>

<summary>Event: new_token</summary>

Event: `new_token` - A new token is sent

TypeScript Interface

```typescript
interface NewTokenEvent {
  role: string;
  token: string; // The new token to append
  gid: string; // The gid of the reply message
}
```

Example Data:

```json
{
  "role": "assistant",
  "token": "Hello",
  "gid": "678097bc45b174911deeeb45"
}
```

</details>

<details>

<summary>Event: message_end</summary>

Event: `message_end` - The reply message is complete

TypeScript Interface

```typescript
interface MessageEndEvent {
  role: string;
  gid: string;
}
```

Example Data:

```json
{
  "role": "assistant",
  "gid": "678097bc45b174911deeeb45"
}
```

</details>

<details>

<summary>Event: message_ready</summary>

Event: `message_ready` - The reply message has been saved and is available for further processing (copy, delete, generate text-to-speech, ...)

TypeScript Interface

```typescript
interface MessageReadyEvent {
  messageIds: string[]; // List of new messages that are successfully saved
}
```

Example Data:

```json
{
  "messageIds": ["678097bc45b174911deeeb45"]
}
```

</details>
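
To route the events listed above, a client can check the event name against the known set and decode the JSON payload before handing it to a handler. This is a minimal sketch under the assumption that every `data:` payload is a JSON object, as in the examples; `KNOWN_EVENTS`, `DecodedEvent`, and `decodeEvent` are hypothetical names.

```typescript
// Event names documented on this page.
const KNOWN_EVENTS = [
  "connected",
  "user_message",
  "message_start",
  "new_token",
  "message_end",
  "message_ready",
] as const;

type EventName = (typeof KNOWN_EVENTS)[number];

interface DecodedEvent {
  event: EventName;
  data: Record<string, unknown>;
}

// Decode one event; returns undefined for unknown names or malformed payloads.
function decodeEvent(name: string, rawData: string): DecodedEvent | undefined {
  if ((KNOWN_EVENTS as readonly string[]).indexOf(name) === -1) return undefined;
  let data: unknown;
  try {
    data = JSON.parse(rawData);
  } catch {
    return undefined; // payload was not valid JSON
  }
  // Only plain JSON objects are accepted (assumption based on the examples).
  if (typeof data !== "object" || data === null || Array.isArray(data)) return undefined;
  return { event: name as EventName, data: data as Record<string, unknown> };
}
```

Returning `undefined` for unrecognized names lets a client ignore event types it does not handle instead of failing on them.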

