mirror of https://github.com/taigrr/nats.docs synced 2025-01-18 04:03:23 -08:00

adding JetStream Docs

This commit is contained in:
ainsley
2020-11-24 15:12:44 -06:00
parent 8eb1e6a171
commit 1fd9cab1a5
19 changed files with 2109 additions and 0 deletions


@@ -0,0 +1,19 @@
## Concepts
In JetStream the configuration for storing messages is defined separately from how they are consumed. Storage is defined in a *Stream* and consumption is defined by one or more *Consumers*.
We'll discuss these two subjects in the context of this architecture.
![Orders](../.gitbook/assets/images/streams-and-consumers-75p.png)
While this is an incomplete architecture, it does show a number of key points:
* Many related subjects are stored in a Stream
* Consumers can have different modes of operation and receive just subsets of the messages
* Multiple Acknowledgement modes are supported
A new order arrives on `ORDERS.received` and is sent to the `NEW` Consumer which, on success, creates a new message on `ORDERS.processed`. The `ORDERS.processed` message again enters the Stream, where the `DISPATCH` Consumer receives it; once processed, it creates an `ORDERS.completed` message which again enters the Stream. These operations are all `pull` based, meaning they act as work queues and can scale horizontally. All require acknowledged delivery, ensuring no order is missed.
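As a sketch, a worker consuming from one of these pull-based Consumers could fetch and acknowledge one message at a time with the `nats` utility (assuming a running JetStream server; exact flags may vary by CLI version):

```bash
# Request the next message from the pull-based NEW Consumer and acknowledge it
$ nats con next ORDERS NEW --ack
```

Because each fetch is an explicit request, many such workers can run in parallel against the same Consumer to scale horizontally.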
All messages are delivered to a `MONITOR` Consumer without any acknowledgement and using Pub/Sub semantics - they are pushed to the monitor.
As messages are acknowledged to the `NEW` and `DISPATCH` Consumers, a percentage of them are Sampled and messages indicating redelivery counts, ack delays and more, are delivered to the monitoring system.


@@ -0,0 +1,10 @@
### Configuration
The rest of this document introduces the `nats` utility, but for completeness and reference, this is how you'd create the ORDERS scenario. We'll configure a 1-year retention for order-related messages:
```bash
$ nats str add ORDERS --subjects "ORDERS.*" --ack --max-msgs=-1 --max-bytes=-1 --max-age=1y --storage file --retention limits --max-msg-size=-1 --discard=old
$ nats con add ORDERS NEW --filter ORDERS.received --ack explicit --pull --deliver all --max-deliver=-1 --sample 100
$ nats con add ORDERS DISPATCH --filter ORDERS.processed --ack explicit --pull --deliver all --max-deliver=-1 --sample 100
$ nats con add ORDERS MONITOR --filter '' --ack none --target monitor.ORDERS --deliver last --replay instant
```


@@ -0,0 +1,33 @@
### Consumers
Each Consumer, or related group of Consumers, of a Stream will need a Consumer defined. It's fine to define thousands of these pointing at the same Stream.
Consumers can either be `push` based where JetStream will deliver the messages as fast as possible to a subject of your choice or `pull` based for typical work queue like behavior. The rate of message delivery in both cases is subject to `ReplayPolicy`. A `ReplayInstant` Consumer will receive all messages as fast as possible while a `ReplayOriginal` Consumer will receive messages at the rate they were received, which is great for replaying production traffic in staging.
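As a hypothetical example of the `ReplayOriginal` behavior described above, a push Consumer that replays all order traffic into a staging subject at its original rate might be created like this (the Consumer name and target subject are illustrative):

```bash
# Push all ORDERS messages to staging.ORDERS at the rate they originally arrived
$ nats con add ORDERS STAGING_REPLAY --filter '' --ack none --target staging.ORDERS --deliver all --replay original
```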
In the orders example above we have 3 Consumers. The first two select a subset of the messages from the Stream by specifying a specific subject like `ORDERS.processed`. The Stream consumes `ORDERS.*` and this allows you to receive just what you need. The final Consumer receives all messages in a `push` fashion.
Consumers track their progress: they know which messages were delivered and acknowledged, and will redeliver messages they sent that were not acknowledged. When first created, a Consumer has to know which message to send first. You can configure either a specific message in the set (`StreamSeq`), a specific time (`StartTime`), all (`DeliverAll`) or last (`DeliverLast`). This is the starting point; from there, they all behave the same, delivering all of the following messages with optional Acknowledgement.
Acknowledgements default to `AckExplicit` - the only supported mode for pull-based Consumers - meaning every message requires a distinct acknowledgement. But for push-based Consumers, you can set `AckNone`, which does not require any acknowledgement, or `AckAll`, which quite interestingly allows you to acknowledge a specific message, like message `100`, and thereby also acknowledge messages `1` through `99`. The `AckAll` mode can be a great performance boost.
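A push-based Consumer using the `AckAll` mode could, as a sketch, be created like this (the `AUDIT` name and target subject are made up for illustration):

```bash
# Acknowledging message 100 on this Consumer also acknowledges 1 through 99
$ nats con add ORDERS AUDIT --filter ORDERS.completed --ack all --target audit.ORDERS --deliver all --replay instant
```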
Some messages may cause your applications to crash, creating a never-ending loop that forever poisons your system. The `MaxDeliver` setting allows you to set an upper bound on how many times a message may be delivered.
To assist with creating monitoring applications, one can set a `SampleFrequency`, which is the percentage of messages for which the system should sample and create events. These events will include delivery counts and ack waits.
When defining Consumers the items below make up the entire configuration of the Consumer:
|Item|Description|
|----|-----------|
|AckPolicy|How messages should be acknowledged, `AckNone`, `AckAll` or `AckExplicit`|
|AckWait|How long to allow messages to remain un-acknowledged before attempting redelivery|
|DeliverPolicy|The initial starting mode of the consumer, `DeliverAll`, `DeliverLast`, `DeliverNew`, `DeliverByStartSequence` or `DeliverByStartTime`|
|DeliverySubject|The subject to deliver observed messages, when not set, a pull-based Consumer is created|
|Durable|The name of the Consumer|
|FilterSubject|When consuming from a Stream with many subjects, or wildcards, select only specific incoming subjects; supports wildcards|
|MaxDeliver|Maximum number of times a specific message will be delivered. Use this to avoid poison pills crashing all your services forever|
|OptStartSeq|When first consuming messages from the Stream start at this particular message in the set|
|ReplayPolicy|How messages are sent `ReplayInstant` or `ReplayOriginal`|
|SampleFrequency|What percentage of acknowledgements should be sampled for observability, 0-100|
|OptStartTime|When first consuming messages from the Stream start with messages on or after this time|
|RateLimit|The rate of message delivery in bits per second|
|MaxAckPending|The maximum number of messages without acknowledgement that can be outstanding, once this limit is reached message delivery will be suspended|
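Assembled into JSON, the configuration for the pull-based `NEW` Consumer from the earlier example might look roughly like this (a sketch mapping the table items to their wire-format names; `ack_wait` is expressed in nanoseconds):

```json
{
  "durable_name": "NEW",
  "filter_subject": "ORDERS.received",
  "ack_policy": "explicit",
  "ack_wait": 30000000000,
  "deliver_policy": "all",
  "max_deliver": -1,
  "replay_policy": "instant",
  "sample_freq": "100"
}
```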


@@ -0,0 +1,31 @@
### Streams
Streams define how messages are stored and for how long they are retained. Streams consume normal NATS subjects; any message found on those subjects will be delivered to the defined storage system. You can do a normal publish to the subject for unacknowledged delivery, or, if you send a Request to the subject, the JetStream server will reply with an acknowledgement that it was stored.
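The difference between the two delivery modes can be sketched with the `nats` utility (assumes a running JetStream server):

```bash
# Fire-and-forget: the message is stored but no confirmation is returned
$ nats pub ORDERS.received "order 1"

# Request: the JetStream server replies with an acknowledgement once stored
$ nats req ORDERS.received "order 2"
```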
As of January 2020, in the tech preview we have `file` and `memory` based storage systems; we do not yet support clustering.
In the diagram above we show the concept of storing all `ORDERS.*` in the Stream even though there are many types of order-related messages. We'll show later how you can selectively consume subsets of messages. Relatively speaking, the Stream is the most resource-consuming component, so being able to combine related data in this manner is important to consider.
Streams can consume many subjects. Here we have `ORDERS.*` but we could also consume `SHIPPING.state` into the same Stream should that make sense (not shown here).
Streams support various retention policies - they can be kept based on limits like max count, size or age but also more novel methods like keeping them as long as any Consumers have them unacknowledged, or work queue like behavior where a message is removed after first ack.
Streams support deduplication using a `Msg-Id` header and a sliding window within which to track duplicate messages. See the [Message Deduplication](#message-deduplication) section.
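As a sketch, publishing the same message twice with an identical `Msg-Id` header would cause the second copy to be dropped within the `Duplicates` window (assuming your client or CLI version supports setting headers; the `-H` flag here is illustrative):

```bash
$ nats req -H "Msg-Id:ORDER-1" ORDERS.received "new order"
$ nats req -H "Msg-Id:ORDER-1" ORDERS.received "new order"   # treated as a duplicate
```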
When defining Streams, the items below make up the entire configuration of the Stream:
|Item|Description|
|----|-----------|
|MaxAge|Maximum age of any message in the stream, expressed in microseconds|
|MaxBytes|How big the Stream may be, when the combined stream size exceeds this old messages are removed|
|MaxMsgSize|The largest message that will be accepted by the Stream|
|MaxMsgs|How many messages may be in a Stream, oldest messages will be removed if the Stream exceeds this size|
|MaxConsumers|How many Consumers can be defined for a given Stream, `-1` for unlimited|
|Name|A name for the Stream that may not have spaces, tabs or `.`|
|NoAck|Disables acknowledging messages that are received by the Stream|
|Replicas|How many replicas to keep for each message (not implemented as of January 2020)|
|Retention|How message retention is considered, `LimitsPolicy` (default), `InterestPolicy` or `WorkQueuePolicy`|
|Discard|When a Stream reaches its limits, `DiscardNew` refuses new messages while `DiscardOld` (default) deletes old messages|
|Storage|The type of storage backend, `file` and `memory` as of January 2020|
|Subjects|A list of subjects to consume, supports wildcards|
|Duplicates|The window within which to track duplicate messages|
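To illustrate a retention policy other than the limits-based ORDERS example, a hypothetical in-memory work queue Stream, where each message is removed after its first acknowledgement, might be created like this (the `TASKS` name and flag values are a sketch and may vary by CLI version):

```bash
$ nats str add TASKS --subjects "TASKS.*" --ack --storage memory --retention work --max-msgs=-1 --max-bytes=-1 --max-age=-1 --max-msg-size=-1 --discard old
```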