en_US/design/durable-storage.md
+2 -20 (2 additions & 20 deletions)
@@ -59,7 +59,7 @@ A single EMQX cluster can host multiple DS databases.
59
59
60
60
#### Shard
61
61
62
-
A shard is the horizontal partition of a durable storage database. Data is distributed across shards based on the publisher's client ID, enabling parallel processing and high availability. Each EMQX node can host one or more shards, and the total number of shards is determined by [n_shards](./managing-replication.md#number-of-shards) configuration parameter during the initial startup of EMQX.
62
+
A shard is a horizontal partition of a durable storage database. Data is distributed across shards based on the publisher's client ID, enabling parallel processing and high availability. Each EMQX node can host one or more shards, and the total number of shards is determined by the [n_shards](../durability/managing-replication.md#number-of-shards) configuration parameter during the initial startup of EMQX.
63
63
64
64
Shards also serve as the fundamental unit of replication. Each shard is replicated across multiple nodes according to the `durable_storage.messages.replication_factor` setting, ensuring that all replicas maintain identical message sets for redundancy and fault tolerance.
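As a rough illustration of distributing data by client ID, the sketch below maps a publisher's client ID to a shard index. The hash function (SHA-256) and the helper name `shard_for_client` are assumptions made for this example; the text above does not state which hash EMQX actually uses.

```python
import hashlib


def shard_for_client(client_id: str, n_shards: int) -> int:
    """Map a client ID to a shard index (hypothetical helper, not EMQX code)."""
    if n_shards <= 0:
        raise ValueError("n_shards must be positive")
    # A stable hash keeps every message from the same publisher in one shard.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards
```

Because the mapping depends only on the client ID and the shard count, it also hints at why the number of shards is fixed at initial startup: changing it would move existing publishers to different shards.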
65
65
@@ -76,7 +76,7 @@ Generations may also differ in how they internally structure and store data, dep
76
76
77
77
#### Slab
78
78
79
-
A slab is a physical partition of data identified by both shard ID and generation ID. Each slab acts as a durable container for one or more streams. All data in a slab shares the same encoding schema, eliminating the need for storing extra metadata. Atomicity and consistency properties are guaranteed within a slab.
79
+
A slab is a physical partition of data identified by both shard ID and generation ID. Each slab acts as a durable container for one or more durable storage streams. All data in a slab shares the same encoding schema, eliminating the need for storing extra metadata. Atomicity and consistency properties are guaranteed within a slab.
80
80
81
81
Example: `shard 2, gen 3` represents a distinct slab that stores all streams written during that generation’s time range.
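One way to picture the slab as a container keyed by shard ID and generation ID is the in-memory sketch below. `SlabId` and `SlabIndex` are hypothetical names for this example only and do not reflect how EMQX lays out data on disk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SlabId:
    """Identifies a slab by shard and generation (illustrative type)."""
    shard: int
    generation: int


class SlabIndex:
    """Toy container: each slab holds one or more streams of records."""

    def __init__(self):
        self._slabs = defaultdict(lambda: defaultdict(list))

    def append(self, slab: SlabId, stream: str, record: bytes) -> None:
        self._slabs[slab][stream].append(record)

    def streams(self, slab: SlabId) -> list:
        # Return stream names without creating an entry for unknown slabs.
        return sorted(self._slabs[slab]) if slab in self._slabs else []
```

With this model, `SlabId(2, 3)` corresponds to the `shard 2, gen 3` example above, and a write to the same shard in generation 4 lands in a different slab.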
82
82
@@ -160,24 +160,6 @@ Both pools group subscribers by stream and topic, reusing resources to serve mul
## Applications: Durable Sessions and Shared Subscriptions
164
-
165
-
Durable Storage is the backbone for EMQX's advanced reliability features:
166
-
167
-
### Durable Sessions (EMQX 5+)
168
-
169
-
Durable sessions are a parallel session implementation that uses DS for message routing.
170
-
171
-
- **Mechanism:** When a client connects with a session expiry interval greater than zero and subscribes to a topic, the filter is marked as durable. Messages published to matching topics are saved to DS *in addition* to being dispatched.
172
-
- **State:** Durable sessions access saved messages via the DS subscription mechanism. Their state includes a set of iterators for each matching stream, allowing them to precisely track their progress. Only one copy of each message is stored per database replica, regardless of how many durable sessions share it.
173
-
174
-
### Shared Subscriptions (EMQX 6.0)
175
-
176
-
EMQX 6.0 extended DS to shared subscriptions for enhanced load balancing and reliability.
177
-
178
-
- **Iterator Management:** The iterator sets for shared subscriptions are managed by a separate entity called the **shared sub leader**.
179
-
- **Replay and Rebalancing:** Sessions subscribing to a shared topic communicate with the leader, which **lends them iterators** for message replay. Updated iterators are reported back. If a client disconnects or the group is rebalanced, the leader **revokes the iterators** and redistributes them to other members, ensuring consumption continuity and load distribution.
180
-
181
163
## Conclusion: The Foundation of High-Reliability MQTT
182
164
183
165
The Optimized Durable Storage in EMQX 6.0 is the resilient foundation for high-reliability MQTT messaging. By re-engineering RocksDB and embedding concepts like TTVs and Streams, DS provides a purpose-built, highly available, and persistent internal database. This architecture, coupled with sophisticated features like the LTS algorithm and Raft replication, ensures lossless message delivery and optimal retrieval for complex wildcard and shared subscriptions, solidifying EMQX's position as a leading solution for demanding IoT infrastructure.
en_US/durability/durability_introduction.md
+35 -1 (35 additions & 1 deletion)
@@ -12,7 +12,7 @@ Before learning the Durable Sessions feature in EMQX, it's essential to understa
12
12
13
13
**Session**: A session is a lightweight process within EMQX created for every client connection. Sessions implement the behaviors that the MQTT standard prescribes for the broker, including initial connection, subscribing and unsubscribing to topics, and message dispatching.
14
14
15
-
**Durable Storage**: Durable storage is an internal database within EMQX. Sessions may use it to save their state and MQTT messages sent to the topics. Database engine powering durable storage uses [RocksDB](https://rocksdb.org/) to save the data on disk, and [Raft algorithm](https://raft.github.io/) to consistently replicate data across the cluster. It is important not to confuse durable storage with **Durable Sessions**.
15
+
**Durable Storage (DS)**: Durable storage is an internal database within EMQX. Sessions may use it to save their state and the MQTT messages sent to topics. The database engine powering durable storage uses [RocksDB](https://rocksdb.org/) to save data on disk and the [Raft algorithm](https://raft.github.io/) to consistently replicate data across the cluster. It is important not to confuse durable storage with **Durable Sessions**.
16
16
17
17
### Session Expiry Interval
18
18
@@ -161,6 +161,40 @@ Even if durable sessions are not enabled, following steps 2-4 will still retain
161
161
162
162
Durable Sessions rely on the Durable Storage for persisting session state and messages. To understand how this storage layer is structured and operates, refer to the *Architecture: Backends and Storage Hierarchy* section in [Design for Durable Storage](../design/durable-storage.md).
163
163
164
+
## How Durable Storage Supports Durable and Shared Subscription Sessions
165
+
166
+
Durable Storage is the backbone for durable sessions and shared subscription sessions in EMQX.
167
+
168
+
### Durable Sessions
169
+
170
+
Durable Sessions are implemented on top of the DS database engine. When a client connects with a **non-zero session expiry interval**, EMQX stores the session state and the messages routed to that session in DS.
171
+
172
+
- **Message persistence:**
173
+
174
+
When a durable session subscribes to a topic, matching messages are saved to DS in addition to being delivered to online clients. This ensures that messages published while the client is offline are available when it reconnects.
175
+
176
+
- **Progress tracking:**
177
+
178
+
Durable sessions read messages from DS using *iterators*, lightweight markers that track how far the session has progressed within each durable storage stream. This allows message replay to resume reliably after disconnection or node restart.
179
+
180
+
- **Efficient storage:**
181
+
182
+
Messages are stored only once per DS replica, regardless of how many durable sessions subscribe to the topic, minimizing storage overhead.
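The relationship between a single stored copy of each message and per-session iterators can be sketched as follows. This is a minimal in-memory model under stated assumptions: `Stream` and `StreamIterator` are hypothetical names, and a plain list offset stands in for whatever position encoding DS actually uses.

```python
class Stream:
    """Holds each message once, no matter how many sessions read it."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)


class StreamIterator:
    """Lightweight marker recording how far one session has read."""

    def __init__(self, stream, position=0):
        self.stream = stream
        self.position = position

    def next_batch(self, limit=10):
        # Read from the saved position and advance past what was returned.
        batch = self.stream.messages[self.position:self.position + limit]
        self.position += len(batch)
        return batch
```

Persisting only `position` is enough to resume replay: a new `StreamIterator(stream, saved_position)` continues where the previous one stopped, which is the idea behind resuming after a disconnection or restart.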
183
+
184
+
### Shared Subscription Sessions
185
+
186
+
Starting from EMQX v6.0, DS also supports the persistence of shared subscription sessions. Shared subscriptions rely on DS to maintain consistent message distribution across a subscriber group.
187
+
188
+
- **Iterator management:**
189
+
190
+
A designated shared subscription leader manages iterator sets for the group. It assigns iterators to members to ensure coordinated consumption.
191
+
192
+
- **Replay and rebalancing:**
193
+
194
+
Sessions subscribing to a shared topic communicate with the leader, which lends them iterators for message replay. Updated iterators are reported back. If a client disconnects or the group is rebalanced, the leader revokes the iterators and redistributes them to other members, ensuring consumption continuity and load distribution.
195
+
196
+
These mechanisms ensure load balancing, message ordering, and fault tolerance across the entire subscription group.
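The lend, report, and revoke cycle described above can be sketched with a toy leader. Everything here is an assumption for illustration: the class name `SharedSubLeader`, the round-robin assignment, and integer positions are not taken from the EMQX implementation.

```python
class SharedSubLeader:
    """Toy leader that owns iterator positions and lends them to members."""

    def __init__(self, stream_ids):
        self._position = {s: 0 for s in stream_ids}   # stream -> last reported position
        self._holder = {s: None for s in stream_ids}  # stream -> member borrowing it
        self._members = []

    def join(self, member):
        if member not in self._members:
            self._members.append(member)
        self._rebalance()

    def leave(self, member):
        # Removing a member revokes its iterators via the rebalance below.
        if member in self._members:
            self._members.remove(member)
        self._rebalance()

    def report(self, member, stream, position):
        # Accept progress only from the member currently holding the iterator.
        if self._holder.get(stream) != member:
            return False
        self._position[stream] = max(self._position[stream], position)
        return True

    def borrowed(self, member):
        return {s: self._position[s] for s, m in self._holder.items() if m == member}

    def _rebalance(self):
        # Assumed policy: spread streams round-robin over current members.
        for i, stream in enumerate(sorted(self._holder)):
            self._holder[stream] = (
                self._members[i % len(self._members)] if self._members else None
            )
```

Because the leader keeps the last reported position for every stream, a member that takes over a revoked iterator continues from the previous holder's progress rather than from the beginning.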
197
+
164
198
## Durable Storage Across Cluster
165
199
166
200
Each node within an EMQX cluster is assigned a unique *Site ID*, which serves as a stable identifier, independent of the Erlang node name (`emqx@...`). Site IDs are persistent, and they are randomly generated at the first startup of the node. This stability maintains the integrity of the data, especially in scenarios where nodes might undergo name modifications or reconfigurations.