diff --git a/doc/README.md b/doc/README.md index 87e45cdc..e431f8a8 100644 --- a/doc/README.md +++ b/doc/README.md @@ -1,21 +1,3 @@ # Architecture Decision Records -The directory [adr](adr) hold Architecture Decision Records that document major decisions made -in the design of the NATS Server. - -A good intro to ADRs can be found in [Documenting Architecture Decisions](http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions) by Michael Nygard. - -## When to write an ADR - -Not every little decision needs an ADR, and we are not overly prescriptive about the format. -The kind of change that should have an ADR are ones likely to impact many client libraries -or those where we specifically wish to solicit wider community input. - -## Format - -The [adr-tools](https://github.com/npryce/adr-tools) utility ships with a template that's a -good starting point. We do not have a fixed format at present. - -## ADR Statuses - -Each ADR has a status, let's try to use `Proposed`, `Approved` `Partially Implemented`, `Implemented` and `Rejected`. +The NATS ADR documents have moved to their [own repository](https://github.com/nats-io/nats-architecture-and-design/) diff --git a/doc/adr/0001-jetstream-json-api-design.md b/doc/adr/0001-jetstream-json-api-design.md deleted file mode 100644 index 1a88e828..00000000 --- a/doc/adr/0001-jetstream-json-api-design.md +++ /dev/null @@ -1,155 +0,0 @@ -# 1. JetStream JSON API Design - -Date: 2020-04-30 -Author: @ripienaar - -## Status - -Partially Implemented - -## Context - -At present, the API encoding consists of mixed text and JSON, we should improve consistency and error handling. - -### Admin APIs - -#### Requests - -All Admin APIs that today accept `nil` body should also accept an empty JSON document as request body. - -Any API that responds with JSON should also accept JSON, for example to delete a message by sequence we accept -`10` as body today, this would need to become `{"seq": 10}` or similar. 
- -#### Responses - -All responses will be JSON objects, a few examples will describe it best. Any error that happens has to be -communicated within the originally expected message type. Even the case where JetStream is not enabled for -an account, the response has to be a valid data type with the addition of `error`. When `error` is present -empty fields may be omitted as long as the response still adheres to the schema. - -Successful Stream Info: - -```json -{ - "type": "io.nats.jetstream.api.v1.stream_info", - "time": "2020-04-23T16:51:18.516363Z", - "config": { - "name": "STREAM", - "subjects": [ - "js.in" - ], - "retention": "limits", - "max_consumers": -1, - "max_msgs": -1, - "max_bytes": -1, - "max_age": 31536000, - "max_msg_size": -1, - "storage": "file", - "num_replicas": 1 - }, - "state": { - "messages": 95563, - "bytes": 40104315, - "first_seq": 34, - "last_seq": 95596, - "consumer_count": 1 - } -} -``` - -Consumer Info Error: - -```json -{ - "type": "io.nats.jetstream.api.v1.consumer_info", - "error": { - "description": "consumer not found", - "code": 404, - "error_code": 10059 - } -} -``` - -Here we have a minimally correct response with the additional error object. - -In the `error` struct we have `description` as a short human friendly explanation that should include enough context to -identify what Stream or Consumer acted on and whatever else we feel will help the user while not sharing privileged account -information. These strings are not part of the API promises, we can update and re-word or translate them at any time. Programmatic -error handling should look at the `code` which will be HTTP like, 4xx human error, 5xx server error etc. Finally, the `error_code` -indicates the specific reason for the 404 - here `10059` means the stream did not exist, helping developers identify the -real underlying cause. - -More information about the `error_code` system can be found in [ADR-7](0007-error-codes.md). 
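The error handling described above could be sketched in Go roughly as follows. This is a minimal illustration, not the server or client implementation: the `apiResponse`, `apiError` and `classify` names are hypothetical, and the JSON key `error_code` is taken from the example response shown above (the Go struct in the Implementation section tags the same field `err_code`).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiError mirrors the `error` object from the example response.
// The `error_code` key follows the JSON example; this is an assumption.
type apiError struct {
	Code        int    `json:"code"`
	ErrorCode   int    `json:"error_code"`
	Description string `json:"description"`
}

// apiResponse holds only the fields every response is expected to share.
type apiResponse struct {
	Type  string    `json:"type"`
	Error *apiError `json:"error,omitempty"`
}

// classify decodes a response body and branches on the HTTP-like `code`,
// never on the human readable description.
func classify(body []byte) (string, error) {
	var resp apiResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("invalid response: %w", err)
	}
	switch {
	case resp.Error == nil:
		return "ok", nil
	case resp.Error.Code >= 500:
		return fmt.Sprintf("server error %d", resp.Error.ErrorCode), nil
	case resp.Error.Code >= 400:
		return fmt.Sprintf("request error %d", resp.Error.ErrorCode), nil
	default:
		return fmt.Sprintf("unexpected code %d", resp.Error.Code), nil
	}
}
```

Because the description is only used for display, re-wording or translating it would not affect a caller written this way.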
- -Ideally the error response includes a minimally valid body of what was requested but this can be very hard to implement correctly. - -Today the list API's just return `["ORDERS"]`, these will become: - -```json -{ - "type": "io.nats.jetstream.api.v1.stream_list", - "time": "2020-04-23T16:51:18.516363Z", - "streams": [ - "ORDERS" - ] -} -``` - -With the same `error` treatment when some error happens. - -## Implementation - -While implementing this in JetStream the following pattern emerged: - -```go -type JSApiResponse struct { - Type string `json:"type"` - Error *ApiError `json:"error,omitempty"` -} - -type ApiError struct { - Code int `json:"code"` - ErrCode int `json:"err_code,omitempty"` - Description string `json:"description,omitempty"` - URL string `json:"-"` - Help string `json:"-"` -} - -type JSApiConsumerCreateResponse struct { - JSApiResponse - *ConsumerInfo -} -``` - -This creates error responses without the valid `ConsumerInfo` fields but this is by far the most workable solution. - -Validating this in JSON Schema draft 7 is a bit awkward, not impossible and specifically leads to some hard to parse validation errors, but it works.: - -```json -{ - "$schema": "http://json-schema.org/draft-07/schema#", - "$id": "https://nats.io/schemas/jetstream/api/v1/consumer_create_response.json", - "description": "A response from the JetStream $JS.API.CONSUMER.CREATE API", - "title": "io.nats.jetstream.api.v1.consumer_create_response", - "type": "object", - "required": ["type"], - "oneOf": [ - { - "$ref": "definitions.json#/definitions/consumer_info" - }, - { - "$ref": "definitions.json#/definitions/error_response" - } - ], - "properties": { - "type": { - "type": "string", - "const": "io.nats.jetstream.api.v1.consumer_create_response" - } - } -} -``` - -## Consequences - -URL Encoding does not carry data types, and the response fields will need documenting. 
diff --git a/doc/adr/0002-nats-typed-messages.md b/doc/adr/0002-nats-typed-messages.md deleted file mode 100644 index 32c6fd4e..00000000 --- a/doc/adr/0002-nats-typed-messages.md +++ /dev/null @@ -1,183 +0,0 @@ -# 2. NATS Typed Messages - -Date: 2020-05-06 -Author: @ripienaar - -## Status - -Accepted - -## Context - -NATS Server has a number of JSON based messages - monitoring, JetStream API and more. These are consumed, -and in the case of the API produced, by 3rd party systems in many languages. To assist with standardization -of data validation, variable names and more we want to create JSON Schema documents for all our outward facing -JSON based communication. Specifically this is not for server to server communication protocols. - -This effort is ultimately not for our own use - though libraries like `jsm.go` will use these to do validation -of inputs - this is about easing interoperability with other systems and to eventually create a Schema Registry. - -There are a number of emerging formats for describing message content: - - * JSON Schema - transport agnostic way of describing the shape of JSON documents - * AsyncAPI - middleware specific API description that uses JSON Schema for payload descriptions - * CloudEvents - standard for wrapping system specific events in a generic, routable, package. Supported by all - major Public Clouds and many event gateways. Can reference JSON Schema. - * Swagger / OpenAPI - standard for describing web services that uses JSON Schema for payload descriptions - -In all of these many of the actual detail like how to label types of event or how to version them are left up -to individual projects to solve. This ADR describes how we are approaching this. - -## Decision - -### Overview - -We will start by documenting our data types using JSON Schema Draft 7. 
AsyncAPI and Swagger can both reference -these documents using remote references so this, as a starting point, gives us most flexibility and interoperability -to later create API and Transport specific schemas that reference these. - -We define 2 major types of typed message: - - * `Message` - any message with a compatible `type` hint embedded in it - * `Event` - a specialized `message` that has timestamps and event IDs, suitable for transformation to - Cloud Events. Typically, published unsolicited. - -Today NATS Server does not support publishing Cloud Events natively; however, a bridge can be created to publish -those to other cloud systems using the `jsm.go` package that supports converting `events` into Cloud Event format. - -### Message Types - -There is no standard way to indicate the schema of a specific message. We looked at a lot of prior art from CNCF -projects, public clouds and more but found very little commonality. The nearest standard is the Uniform Resource Name -which still leaves most of the details up to the project and does not conventionally support versioning. - -We chose a message type like `io.nats.jetstream.api.v1.consumer_delete_response`, `io.nats.server.advisory.v1.client_connect` -or `io.nats.unknown_message`. - -`io.nats.unknown_message` is a special type returned for anything without valid type hints. In Go that implies -`map[string]interface{}`. - -The structure is as follows: io.nats.`<source>`.`<category>`.v`<version>`.`<name>` - -#### Source - -The project is the overall originator of a message and should be short but descriptive, today we have 2 - `server` and -`jetstream` - as we continue to build systems around Stream Processing and more we'd add more of these types. I anticipate -for example adding a few to Surveyor for publishing significant lifecycle events. - -Generated Cloud Events messages have the `source` set to `urn:nats:<source>`. 
- -|Project|Description| -|-------|-----------| -|`server`|The core NATS Server excluding JetStream related messages| -|`jetstream`|Any JetStream related message| - -#### Category - -The `category` groups messages by related sub-groups of the `source`, often this also appears in the subjects -these messages get published to. - -This is a bit undefined, examples in use now are `api`, `advisory`, `metric`. Where possible try to fit in with -existing chosen ones, if none suits update this table with your choice and try to pick generic category names. - -|Category|Description| -|----|-----------| -|`api`|Typically these are `messages` used in synchronous request response APIs| -|`advisory`|These are `events` that describe a significant event that happened like a client connecting or disconnecting| -|`metric`|These are `events` that relate to monitoring - how long did it take a message to be acknowledged| - -#### Versioning - -The ideal outcome is that we never need to version any message and maintain future compatibility. - -We think we can do that with the JetStream API. Monitoring, Observability and black box management is emerging, and we -know less about how that will look in the long run, so we think we will need to version those. - -The philosophy has to be that we only add fields and do not significantly change the meaning of existing ones, this -means the messages stay `v1`, but major changes will require bumps. So all message types includes a single digit version. - -#### Message Name - -Just a string identifying what this message is about - `client_connect`, `client_disconnect`, `api_audit` etc. 
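To illustrate the type structure described in the sections above, here is a hedged sketch that splits a type string into its source, category, version and message name. The `messageType` struct and `parseType` function are hypothetical names for this example and are not part of `jsm.go` or the server.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unknownType is the special type the ADR assigns to anything without a valid type hint.
const unknownType = "io.nats.unknown_message"

// messageType holds the parts of a type hint such as
// io.nats.server.advisory.v1.client_connect.
type messageType struct {
	Source   string
	Category string
	Version  int
	Name     string
}

// parseType splits a type hint into its parts. It returns an error for
// anything that does not have the six dot-separated tokens the ADR describes.
func parseType(t string) (messageType, error) {
	parts := strings.Split(t, ".")
	if len(parts) != 6 || parts[0] != "io" || parts[1] != "nats" || !strings.HasPrefix(parts[4], "v") {
		return messageType{}, fmt.Errorf("%q has no valid type hint, treat it as %s", t, unknownType)
	}
	// ParseUint rejects signs, so "v-1" and "v+1" are not accepted.
	version, err := strconv.ParseUint(strings.TrimPrefix(parts[4], "v"), 10, 32)
	if err != nil {
		return messageType{}, fmt.Errorf("%q has an invalid version: %w", t, err)
	}
	return messageType{
		Source:   parts[2],
		Category: parts[3],
		Version:  int(version),
		Name:     parts[5],
	}, nil
}
```

A consumer could use the parsed `Category` to route `advisory` and `metric` events differently from `api` messages, while keeping the version check in one place.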
- -## Examples - -### Messages - -At minimum a typed message must include a `type` string: - -```json -{ - "type": "io.nats.jetstream.api.v1.stream_configuration" -} -``` - -Rest of the document is up to the specific use case - -### Advisories - -Advisories must include additional fields: - -```json -{ - "type": "io.nats.jetstream.advisory.v1.api_audit", - "id": "uafvZ1UEDIW5FZV6kvLgWA", - "timestamp": "2020-04-23T16:51:18.516363Z" -} -``` - - * `timestamp` - RFC 3339 format in UTC timezone, with sub-second precision added if present - * `id` - Any sufficiently unique ID such as those produced by `nuid` - -### Errors - -Any `message` can have an optional `error` property if needed and can be specified in the JSON Schema, -they are not a key part of the type hint system which this ADR focus on. - -In JetStream [ADR 0001](0001-jetstream-json-api-design.md) we define an error message as this: - -``` -{ - "error": { - "description": "Server Error", - "code": 500 - } -} -``` - -Where error codes follow basic HTTP standards. This `error` object is not included on success and so -acceptable error codes are between `300` and `599`. - -It'll be advantageous to standardise around this structure, today only JetStream API has this and we have -not evaluated if this will suit all our needs. - -## Schema Storage - -Schemas will eventually be kept in some form of formal Schema registry. In the near future they will all be placed as -fully dereferenced JSON files at `http://nats.io/schemas`. - -The temporary source for these can be found in the `nats-io/jetstream` repository including tools to dereference the -source files. - -## Usage - -Internally the `jsm.go` package use these Schemas to validate all requests to the JetStream API. This is not required as -the server does its own validation too - but it's nice to fail fast and give extended errors like a JSON validator will -give. 
- -Once we add JetStream API support to other languages it would be good if those languages use the same Schemas for -validation to create a unified validation strategy. - -Eventually these Schemas could be used to generate the API structure. - -The `nats` utility has a `nats events` command that can display any `event`. It will display any it finds, special -formatting can be added using Golang templates in its source. Consider adding support to it whenever a new `event` is added. - -## Status - -While this is marked `accepted`, we're still learning and exploring their usage so changes should be anticipated. - -## Consequences - -Many more aspects of the Server move into the realm of being controlled and versioned where previously we took a much -more relaxed approach to modifications to the data produced by `/varz` and more. diff --git a/doc/adr/0003-distributed-tracing.md b/doc/adr/0003-distributed-tracing.md deleted file mode 100644 index 71414142..00000000 --- a/doc/adr/0003-distributed-tracing.md +++ /dev/null @@ -1,156 +0,0 @@ -# 3. NATS Service Latency Distributed Tracing Interoperability - -Date: 2020-05-21 -Author: @ripienaar - -## Status - -Approved - -## Context - -The goal is to enable the NATS internal latencies to be exported to distributed tracing systems, here we see a small -architecture using Traefik, a Go microservice and a NATS hosted service all being observed in Jaeger. - -![Jaeger](0003-jaeger-trace.png) - -The lowest 3 spans were created from a NATS latency Advisory. - -These traces can be ingested by many other commercial systems like Data Dog and Honeycomb where they can augment the -existing operations tooling in use by our users. Additionally Grafana 7 supports Jaeger and Zipkin today. 
- -Long term I think every server that handles a message should emit a unique trace so we can also get visibility into -the internal flow of the NATS system and exactly which gateway connection has a delay - see our current HM issues - but -ultimately I don't think we'll be doing that in the hot path of the server, though these traces are easy to handle async - -Meanwhile, this proposal will let us get very far with our current Latency Advisories. - -## Configuring an export - -Today there are no standards for the HTTP headers that communicate span context downstream - with Trace Context being -an emerging w3c standard. - -I suggest we support the Jaeger and Zipkin systems as well as Trace Context for long term standardisation efforts. - -Supporting these would mean we have to interpret the headers that are received in the request to determine if we should -publish a latency advisory rather than the static `50%` configuration we have today. - -Today we have: - -``` -exports: [ - { - service: weather.service - accounts: [WEB] - latency: { - sampling: 50% - subject: weather.latency - } - } -] -``` - -This enables sampling based `50%` of the service requests on this service. - -I propose we support the additional sampling value `headers` which will configure the server to -interpret the headers as below to determine if a request should be sampled. - -## Propagating headers - -The `io.nats.server.metric.v1.service_latency` advisory gets updated with an additional `headers` field. - -`headers` contains only the headers used for the sampling decision. 
- -```json -{ - "type": "io.nats.server.metric.v1.service_latency", - "id": "YBxAhpUFfs1rPGo323WcmQ", - "timestamp": "2020-05-21T08:06:29.4981587Z", - "status": 200, - "headers": { - "Uber-Trace-Id": ["09931e3444de7c99:50ed16db42b98999:0:1"] - }, - "requestor": { - "acc": "WEB", - "rtt": 1107500, - "start": "2020-05-21T08:06:20.2391509Z", - "user": "backend", - "lang": "go", - "ver": "1.10.0", - "ip": "172.22.0.7", - "cid": 6, - "server": "nats2" - }, - "responder": { - "acc": "WEATHER", - "rtt": 1389100, - "start": "2020-05-21T08:06:20.218714Z", - "user": "weather", - "lang": "go", - "ver": "1.10.0", - "ip": "172.22.0.6", - "cid": 6, - "server": "nats1" - }, - "start": "2020-05-21T08:06:29.4917253Z", - "service": 3363500, - "system": 551200, - "total": 6411300 -} -``` - -## Header Formats - -Numerous header formats are found in the wild, main ones are Zipkin and Jaeger and w3c `tracestate` being an emerging standard. - -Grafana supports Zipkin and Jaeger we should probably support at least those, but also Trace Context for future interop. 
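As a rough sketch of how a `headers` based sampling decision might be made for the three header families this ADR names (Zipkin B3, Jaeger and Trace Context), consider the following. It is an assumption about one possible approach, not the server's implementation: `shouldSample` and `first` are hypothetical helpers, headers are modelled as a plain `map[string][]string`, and only the "sampled" flag is inspected (the B3 debug state, for instance, is ignored).

```go
package main

import (
	"strconv"
	"strings"
)

// first returns the first value stored under name, ignoring case.
func first(h map[string][]string, name string) string {
	for k, v := range h {
		if strings.EqualFold(k, name) && len(v) > 0 {
			return v[0]
		}
	}
	return ""
}

// sampledFlag reports whether bit 0x01 is set in a hex encoded flags field.
func sampledFlag(flags string) bool {
	f, err := strconv.ParseUint(flags, 16, 8)
	return err == nil && f&0x01 != 0
}

// shouldSample reports whether any supported tracing header marks the
// request as sampled.
func shouldSample(h map[string][]string) bool {
	// Zipkin B3, multi-header form.
	if first(h, "X-B3-Sampled") == "1" {
		return true
	}
	// Zipkin B3 single header: {TraceId}-{SpanId}-{SamplingState}-{ParentSpanId},
	// or just the sampling state on its own.
	b3 := first(h, "b3")
	if b3 == "1" {
		return true
	}
	if p := strings.Split(b3, "-"); len(p) >= 3 && p[2] == "1" {
		return true
	}
	// Jaeger: {trace-id}:{span-id}:{parent-span-id}:{flags}.
	if p := strings.Split(first(h, "Uber-Trace-Id"), ":"); len(p) == 4 && sampledFlag(p[3]) {
		return true
	}
	// Trace Context: {version}-{trace-id}-{parent-id}-{flags}.
	if p := strings.Split(first(h, "traceparent"), "-"); len(p) == 4 && sampledFlag(p[3]) {
		return true
	}
	return false
}
```

Requests carrying none of these headers, or carrying them with the sampled flag cleared, would simply not produce a latency advisory under this scheme.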
- -### Zipkin - -``` -X-B3-TraceId: 80f198ee56343ba864fe8b2a57d3eff7 -X-B3-ParentSpanId: 05e3ac9a4f6e3b90 -X-B3-SpanId: e457b5a2e4d86bd1 -X-B3-Sampled: 1 -``` - -Also supports a single `b3` header like `b3={TraceId}-{SpanId}-{SamplingState}-{ParentSpanId}` or just `b3=0` - -[Source](https://github.com/openzipkin/b3-propagation) - -### Jaeger - -``` -uber-trace-id: {trace-id}:{span-id}:{parent-span-id}:{flags} -``` - -Where flags are: - - * One byte bitmap, as two hex digits - * Bit 1 (right-most, least significant, bit mask 0x01) is “sampled” flag - * 1 means the trace is sampled and all downstream services are advised to respect that - * 0 means the trace is not sampled and all downstream services are advised to respect that - -Also a number of keys like `uberctx-some-key: value` - -[Source](https://www.jaegertracing.io/docs/1.17/client-libraries/#tracespan-identity) - -### Trace Context - -Supported by many vendors including things like New Relic - -``` -traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01 -tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE -``` - -Here the `01` of `traceparent` means its sampled. - -[Source](https://www.w3.org/TR/trace-context/) - -### OpenTelemetry - -Supports Trace Context - -[Source](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md) - diff --git a/doc/adr/0003-jaeger-trace.png b/doc/adr/0003-jaeger-trace.png deleted file mode 100644 index e1dc831c..00000000 Binary files a/doc/adr/0003-jaeger-trace.png and /dev/null differ diff --git a/doc/adr/0004-nats-headers.md b/doc/adr/0004-nats-headers.md deleted file mode 100644 index a3cb9c20..00000000 --- a/doc/adr/0004-nats-headers.md +++ /dev/null @@ -1,171 +0,0 @@ -# 4. nats-headers - -Date: 2021-05-12 - -## Status - -Accepted - -## Context - -This document describes NATS Headers from the perspective of clients. NATS -headers allow clients to specify additional meta-data in the form of headers. 
-The headers are effectively -[HTTP Headers](https://tools.ietf.org/html/rfc7230#section-3.2). - -The salient points of the HTTP header specification are: - -- Each header field consists of a case-insensitive field name followed by a - colon (`:`), optional leading whitespace, the field value, and optional - trailing whitespace. -- No spaces are allowed between the header field name and colon. -- Field value may be preceded or followed by optional whitespace. -- The specification may allow any number of strange things like comments/tokens - etc. -- The keys can repeat. - -More specifically from [rfc822](https://www.ietf.org/rfc/rfc822.txt) Section -3.1.2: - -> Once a field has been unfolded, it may be viewed as being composed of a -> field-name followed by a colon (":"), followed by a field-body, and terminated -> by a carriage-return/line-feed. The field-name must be composed of printable -> ASCII characters (i.e., characters that have values between 33. and 126., -> decimal, except colon). The field-body may be composed of any ASCII -> characters, except CR or LF. (While CR and/or LF may be present in the actual -> text, they are removed by the action of unfolding the field.) - -The only difference between a NATS header and HTTP is the first line. Instead of -an HTTP method followed by a resource, and the HTTP version (`GET / HTTP/1.1`), -NATS will provide a string identifying the header version (`NATS/X.x`), -currently 1.0, so it is rendered as `NATS/1.0␍␊`. - -Please refer to the -[specification](https://tools.ietf.org/html/rfc7230#section-3.2) for information -on how to encode/decode HTTP headers. - -### Enabling Message Headers - -The server that is able to send and receive headers will specify so in it's -[`INFO`](https://docs.nats.io/nats-protocol/nats-protocol#info) protocol -message. The `headers` field if present, will have a boolean value. 
If the -client wishes to send headers, it must enable them by adding a `headers` field -with the `true` value in its -[`CONNECT` message](https://docs.nats.io/nats-protocol/nats-protocol#connect): - -``` -"lang": "node", -"version": "1.2.3", -"protocol": 1, -"headers": true, -... -``` - -### Publishing Messages With A Header - -Messages that include a header are published with the `HPUB` protocol message: - -``` -HPUB SUBJECT REPLY 23 30␍␊NATS/1.0␍␊Header: X␍␊␍␊PAYLOAD␍␊ -HPUB SUBJECT REPLY 23 23␍␊NATS/1.0␍␊Header: X␍␊␍␊␍␊ -HPUB SUBJECT REPLY 48 55␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊PAYLOAD␍␊ -HPUB SUBJECT REPLY 48 48␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊␍␊ - -HPUB <SUBJECT> [REPLY] <HDR_LEN> <TOT_LEN> -<HEADER><PAYLOAD>
-``` - -#### NOTES: - -- `HDR_LEN` includes the entire serialized header, from the start of the version - string (`NATS/1.0`) up to and including the ␍␊ before the payload -- `TOT_LEN` is the payload length plus `HDR_LEN` - -### MSG with Headers - -Clients will see `HMSG` protocol lines for `MSG`s that contain headers: - -``` -HMSG SUBJECT 1 REPLY 23 30␍␊NATS/1.0␍␊Header: X␍␊␍␊PAYLOAD␍␊ -HMSG SUBJECT 1 REPLY 23 23␍␊NATS/1.0␍␊Header: X␍␊␍␊␍␊ -HMSG SUBJECT 1 REPLY 48 55␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊PAYLOAD␍␊ -HMSG SUBJECT 1 REPLY 48 48␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊␍␊ - -HMSG <SUBJECT> <SID> [REPLY] <HDR_LEN> <TOT_LEN> -<HEADER><PAYLOAD> -``` - -- `HDR_LEN` includes the entire serialized header, from the start of the version - string (`NATS/1.0`) up to and including the ␍␊ before the payload -- `TOT_LEN` is the payload length plus `HDR_LEN` - -## Decision - -Implemented and merged to master. - -## Consequences - -Use of headers is possible. - -## Compatibility Across NATS Clients - -The following is a list of features to ensure compatibility across NATS clients -that support headers. Because the feature in the Go client and nats-server leverages -the Go implementation as described above, the API used will determine how header -names are serialized. - -### Case-sensitive Operations - -In order to promote compatibility across clients, this section describes how -clients should behave. All operations are _case-sensitive_. Implementations -should provide an option(s) to enable clients to work in a case-insensitive manner or -to format header names canonically. - -#### Reading Values - -`GET` and `VALUES` are case-sensitive operations. - -- `GET` returns a `string` of the first value found matching the specified key - in a case-sensitive lookup or an empty string. -- `VALUES` returns a list of all values that case-sensitive match the specified - key or an empty/nil/null list. - -#### Setting Values - -- `APPEND` is a case-sensitive and case-preserving operation. 
The header is set - exactly as specified by the user. -- `SET` and `DELETE` are case-sensitive: - - `DELETE` removes headers in case-sensitive operation - - `SET` can be considered the result of a `DELETE` followed by an `APPEND`. - This means only exact-match keys are deleted, and the specified value is - added under the specified key. - -#### Case-insensitive Option - -The operations `GET`, `VALUES`, `SET`, `DELETE`, `APPEND` in the presence of a -`case-insensitive` match requirement, will operate on equivalent matches. - -This functionality is constrained as follows: - -- `GET` returns the first matching header value in a case-insensitive match. -- `VALUES` returns the union of all headers that case-insensitive match. If the - exact key is not found, an empty/nil/null list is returned. -- `DELETE` removes the all headers that case-insensitive match the specified - key. -- `SET` is the combination of a case-insensitive `DELETE` followed by an - `APPEND`. -- `APPEND` will use the first matching key found and add values. If no key is - found, values are added to a key preserving the specified case. - -Note that case-insensitive operations are only suggested, and not required to be -implemented by clients, specially if the implementation allows the user code to -easily iterate over keys and values. - -### Multiple Header Values Serialization - -When serializing, entries that have more than one value should be serialized one -per line. While the http Header standard, prefers values to be a comma separated -list, this introduces additional parsing requirements and ambiguity from client -code. HTTP itself doesn't implement this requirement on headers such as -`Set-Cookie`. Libraries, such as Go, do not interpret comma-separated values as -lists. diff --git a/doc/adr/0005-lame-duck-notification.md b/doc/adr/0005-lame-duck-notification.md deleted file mode 100644 index d2ab11a7..00000000 --- a/doc/adr/0005-lame-duck-notification.md +++ /dev/null @@ -1,21 +0,0 @@ -# 5. 
lame-duck-notification - -Date: 2020-07-20 - -## Status - -Accepted - -## Context - -This document describes the _Lame Duck Mode_ server notification. When a server enters lame duck mode, it removes itself from being advertised in the cluster, and slowly starts evicting connected clients as per [`lame_duck_duration`](https://docs.nats.io/nats-server/configuration#runtime-configuration). This document describes how this information is communicated -to the client, in order to allow clients to cooperate and initiate an orderly migration to a different server in the cluster. - - -## Decision - -The server notifies that it has entered _lame duck mode_ by sending an [`INFO`](https://docs.nats.io/nats-protocol/nats-protocol#info) update. If the `ldm` property is set to true, the server has entered _lame duck mode_ and the client should initiate an orderly self-disconnect or close. Note the `ldm` property is only available on servers that implement the notification feature. - -## Consequences - -By becoming aware of a server changing state to _lame duck mode_ clients can disconnect from a server in an orderly fashion and connect to a different server. Currently, clients have no automatic support to _disconnect_ while keeping current state. Future documentation will describe strategies for initiating a new connection and exiting the old one. diff --git a/doc/adr/0006-protocol-naming-conventions.md b/doc/adr/0006-protocol-naming-conventions.md deleted file mode 100644 index 02f4f9a5..00000000 --- a/doc/adr/0006-protocol-naming-conventions.md +++ /dev/null @@ -1,55 +0,0 @@ -# 6. 
protocol-naming-conventions - -Date: 2021-06-28 - -## Status - -Accepted - -## Context - -This document describes naming conventions for these protocol components: - -* Subjects (including Reply Subjects) -* Stream Names -* Consumer Names -* Account Names - -## Prior Work - -Currently the NATS Docs regarding [protocol convention](https://docs.nats.io/nats-protocol/nats-protocol#protocol-conventions) says this: - -> Subject names, including reply subject (INBOX) names, are case-sensitive and must be non-empty alphanumeric strings with no embedded whitespace. All ascii alphanumeric characters except spaces/tabs and separators which are "." and ">" are allowed. Subject names can be optionally token-delimited using the dot character (.), e.g.: -A subject is comprised of 1 or more tokens. Tokens are separated by "." and can be any non space ascii alphanumeric character. The full wildcard token ">" is only valid as the last token and matches all tokens past that point. A token wildcard, "*" matches any token in the position it was listed. Wildcard tokens should only be used in a wildcard capacity and not part of a literal token. - -> Character Encoding: Subject names should be ascii characters for maximum interoperability. Due to language constraints and performance, some clients may support UTF-8 subject names, as may the server. No guarantees of non-ASCII support are provided. - -## Specification - -``` -dot = "." -asterisk = "*" -lt = "<" -gt = ">" -dollar = "$" -colon = ":" -double-quote = ["] -fwd-slash = "/" -backslash = "\" -pipe = "|" -question-mark = "?" -ampersand = "&" -printable = all printable ascii (33 to 126 inclusive) -term = (printable except dot, asterisk or gt)+ -prefix = (printable except dot, asterisk, gt or dollar)+ -filename-safe = (printable except dot, asterisk, lt, gt, colon, double-quote, fwd-slash, backslash, pipe, question-mark, ampersand) - -message-subject = term (dot term | asterisk)* (dot gt)? 
-reply-to = term (dot term)* -stream-name = term -queue-name = term -durable-name = term -js-internal-prefix = dollar (prefix dot)+ -js-user-prefix = (prefix dot)+ -account-name = (filename-safe)+ maximum 255 characters -``` diff --git a/doc/adr/0007-error-codes.md b/doc/adr/0007-error-codes.md deleted file mode 100644 index 1521da55..00000000 --- a/doc/adr/0007-error-codes.md +++ /dev/null @@ -1,150 +0,0 @@ -# 7. NATS Server Error Codes - -Date: 2021-05-12 -Author: @ripienaar - -## Status - -Partially Implemented in [#1811](https://github.com/nats-io/nats-server/issues/1811) - -The current focus is JetStream APIs, we will as a followup do a refactor and generalization and move onto other -areas of the server. - -## Context - -When a developer performs a Consumer Info API request she might get a 404 error, there is no way to know if this is -a 404 due to the Stream not existing or the Consumer not existing. The only way is to parse the returned error description -like `consumer not found`. Further several other error situations might arise which due to our code design would be surfaced -as a 404 when in fact it's more like a 5xx error - I/O errors and such. - -If users are parsing our error strings it means our error text form part of the Public API, we can never improve errors, -fix spelling errors or translate errors into other languages. - -This ADR describes an additional `error_code` that provides deeper context into the underlying cause of the 404. - -## Design - -We will adopt a numbering system for our errors where every error has a unique number within a range that indicates the -subsystem it belongs to. - -|Range|Description| -|-----|-----------| -|1xxxx|JetStream related errors| -|2xxxx|MQTT related errors| - -The JetStream API error will be adjusted like this initially with later work turning this into a more generic error -usable in other parts of the NATS Server code base. - -```go -// ApiError is included in all responses if there was an error. 
-type ApiError struct { - Code int `json:"code"` - ErrCode int `json:"err_code,omitempty"` - Description string `json:"description,omitempty"` - URL string `json:"-"` - Help string `json:"-"` -} -``` - -Here the `code` and `error_code` is what we'll consider part of the Public API with `description` being specifically -out of scope for SemVer protection and changes to these will not be considered a breaking change. - -The `ApiError` type will implement `error` and whenever it will be logged will append the code to the log line, for example: - -``` -stream not found (10059) -``` - -The `nats` CLI will have a lookup system like `nats error 10059` that will show details of this error including help, -urls and such. It will also assist in listing and searching errors. The same could be surfaced later in documentation -and other areas. - -## Using in code - -### Raising an error - -Here we raise a `stream not found` error without providing any additional context to the user, the constant is -`JSStreamNotFoundErr` from which you can guess it takes no printf style interpolations vs one that does which would -end in `...ErrF`: - -The go doc for this constant would also include the content of the error to assist via intellisense in your IDE. 
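The logging behaviour described above, where the error code is appended to the description, could look something like the sketch below. This only illustrates the idea under stated assumptions: the struct is a trimmed copy of the one shown in this ADR, and `hasErrCode` is a hypothetical stand-in for a check in the spirit of `IsNatsErr()`, not the server's function.

```go
package main

import (
	"errors"
	"fmt"
)

// ApiError is a trimmed copy of the struct shown in this ADR.
type ApiError struct {
	Code        int    `json:"code"`
	ErrCode     int    `json:"err_code,omitempty"`
	Description string `json:"description,omitempty"`
}

// Error implements the error interface and appends the error code,
// producing lines such as "stream not found (10059)".
func (e *ApiError) Error() string {
	return fmt.Sprintf("%s (%d)", e.Description, e.ErrCode)
}

// hasErrCode is a hypothetical helper that reports whether err is, or wraps,
// an *ApiError carrying one of the given error codes.
func hasErrCode(err error, codes ...int) bool {
	var apiErr *ApiError
	if !errors.As(err, &apiErr) {
		return false
	}
	for _, code := range codes {
		if apiErr.ErrCode == code {
			return true
		}
	}
	return false
}
```

Comparing on the numeric code rather than on pointer identity also works for freshly created instances, which matters for errors whose description is built by interpolation.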
- -```go -err = doThing() -if err != nil { - return ApiErrors[JSStreamNotFoundErr] -} -``` - -If we have to do string interpolation of the error body, here the `JSStreamRestoreErrF` has the body -`"restore failed: {err}"`, and `NewT()` simply uses string replacement to create a new `ApiError` with the full string. -Note this is a new instance of `ApiError`, so a normal compare of `err == ApiErrors[x]` won't match: - -```go -err = doRestore() -if err != nil { - return ApiErrors[JSStreamRestoreErrF].NewT("{err}", err) -} -``` - -If we had to handle an error that may be an `ApiError` or a traditional Go error we can use the `ErrOr` function. -This will look at the result from `lookupConsumer()`: if it's an `ApiError` that error will be set, else `JSConsumerNotFoundErr` will be -returned. Essentially, `lookupConsumer()` would return a `JSStreamNotFoundErr` if the stream does not exist, else -a `JSConsumerNotFoundErr`, or a Go error on I/O failure, for example. - -```go -var resp = JSApiConsumerCreateResponse{ApiResponse: ApiResponse{Type: JSApiConsumerCreateResponseType}} - -_, err = lookupConsumer(stream, consumer) -if err != nil { - resp.Error = ApiErrors[JSConsumerNotFoundErr].ErrOr(err) -} -``` - -While the `ErrOr` function returns the `ApiErrors` pointer exactly - meaning `err == ApiErrors[x]`, the counterpart -`ErrOrNewT` will create a new instance. - -### Testing Errors - -Should you need to determine if an error is of a specific kind (error code) this can be done using the `IsNatsErr()` function: - -```go -err = doThing() -if IsNatsErr(err, JSStreamNotFoundErr, JSConsumerNotFoundErr) { - // create the stream and consumer -} else if err != nil { - // other critical failure -} -``` - -## Maintaining the errors - -The file `server/errors.json` holds the data used to generate the error constants, lists etc. This is a JSON version of -`server.ErrorsData`.
- -```json -[ - { - "constant": "JSClusterPeerNotMemberErr", - "code": 400, - "error_code": 10040, - "description": "peer not a member" - }, - { - "constant": "JSNotEnabledErr", - "code": 503, - "error_code": 10039, - "description": "JetStream not enabled for account", - "help": "This error indicates that JetStream is not enabled either at a global level or at global and account level", - "url": "https://docs.nats.io/jetstream" - } -] -``` - -The `nats` CLI allows you to edit, add and view these files using the `nats errors` command; use the `--errors` flag to -view your local file during development. - -After editing this file run `go generate` at the top of the `nats-server` repo, and it will update the needed files. Check -in the result. - -When run, this will verify that the `error_code` and `constant` are unique in each error diff --git a/doc/adr/0009-js-idle-heartbeat.md b/doc/adr/0009-js-idle-heartbeat.md deleted file mode 100644 index 28d9b9dd..00000000 --- a/doc/adr/0009-js-idle-heartbeat.md +++ /dev/null @@ -1,52 +0,0 @@ -# 9. js-idle-heartbeat - -Date: 2021-06-30 - -## Status - -Accepted - -## Context - -The JetStream ConsumerConfig option `idle_heartbeat` enables server-side -heartbeats to be sent to the client. To enable the option on the consumer simply -specify it with a value representing the number of nanoseconds that the server -should use as the notification interval. - -The server will only notify after the specified interval has elapsed and no new -messages have been delivered to the consumer. Delivering a message to the -consumer resets the interval. - -Idle heartbeat notifications are sent to the consumer's subscription as a -regular NATS message. The message will have a `code` of `100` with a -`description` of `Idle Heartbeat`. The message will contain additional headers -that the client can use to re-affirm that it has not lost any messages: - -- `Nats-Last-Consumer` indicates the last consumer sequence delivered to the - client.
If `0`, no messages have been delivered. -- `Nats-Last-Stream` indicates the sequence number of the newest message in the - stream. - -Here's an example of a client creating a consumer with an idle_heartbeat of 10 -seconds, followed by a server notification. - -``` -$JS.API.CONSUMER.CREATE.FRHZZ447RL7NR8TAICHCZ6 _INBOX.FRHZZ447RL7NR8TAICHCQ8.FRHZZ447RL7NR8TAICHDQ0 136␍␊ -{"config":{"ack_policy":"explicit","deliver_subject":"my.messages","idle_heartbeat":10000000000}, -"stream_name":"FRHZZ447RL7NR8TAICHCZ6"}␍␊ -... - -> HMSG my.messages 2 75 75␍␊NATS/1.0 100 Idle Heartbeat␍␊Nats-Last-Consumer: 0␍␊Nats-Last-Stream: 0␍␊␍␊␍␊ -alive - last stream seq: 0 - last consumer seq: 0 -``` - -This feature is intended as an aid to clients to detect when they have been -disconnected. Without it the consumer's subscription may sit idly waiting for -messages, without knowing that the server might have simply gone away and -recovered elsewhere. - -## Consequences - -Clients can use this information to set client-side timers that track how many -heartbeats have been missed and perhaps take some action such as re-create a -subscription to resume messages. diff --git a/doc/adr/0010-js-purge.md b/doc/adr/0010-js-purge.md deleted file mode 100644 index 797e84fe..00000000 --- a/doc/adr/0010-js-purge.md +++ /dev/null @@ -1,62 +0,0 @@ -# 10. js-purge - -Date: 2021-06-30 - -## Status - -Accepted - -## Context - -JetStream provides the ability to purge streams by sending a request message to: -`$JS.API.STREAM.PURGE.`. The request will return a new message with -the following JSON: - -```typescript -{ - type: "io.nats.jetstream.api.v1.stream_purge_response", - error?: ApiError, - success: boolean, - purged: number -} -``` - -The `error` field is an [ApiError](0007-error-codes.md). The `success` field will be set to `true` if the request -succeeded. The `purged` field will be set to the number of messages that were -purged from the stream. 
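
The response shape described above could be decoded in Go roughly as follows. This is a minimal sketch, not the server's or any client library's code: the names `purgeResponse`, `apiError` and `parsePurgeResponse` are hypothetical, and only the `code` and `description` fields of the error object are modelled.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// apiError models a subset of the error object (hypothetical type,
// limited to the two fields this sketch needs).
type apiError struct {
	Code        int    `json:"code"`
	Description string `json:"description,omitempty"`
}

// purgeResponse mirrors the purge response fields listed in the ADR
// (hypothetical type name).
type purgeResponse struct {
	Type    string    `json:"type"`
	Error   *apiError `json:"error,omitempty"`
	Success bool      `json:"success"`
	Purged  uint64    `json:"purged"`
}

// parsePurgeResponse returns the number of purged messages, or an error
// if the payload is invalid or the response reports a failure.
func parsePurgeResponse(data []byte) (uint64, error) {
	var resp purgeResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return 0, fmt.Errorf("decoding purge response: %w", err)
	}
	if resp.Error != nil {
		return 0, fmt.Errorf("purge failed: %s (%d)", resp.Error.Description, resp.Error.Code)
	}
	if !resp.Success {
		return 0, errors.New("purge did not succeed")
	}
	return resp.Purged, nil
}

func main() {
	payload := []byte(`{"type":"io.nats.jetstream.api.v1.stream_purge_response","success":true,"purged":42}`)
	purged, err := parsePurgeResponse(payload)
	fmt.Println(purged, err)
}
```

Checking `error` before `success` follows the rule that errors are reported inside the expected response type.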
- -## Options - -More fine-grained control over the purge request can be achieved by specifying -additional options as JSON payload. - -```typescript -{ - seq?: number, - keep?: number, - filter?: string -} -``` - -- `seq` is the optional upper-bound sequence for messages to be deleted - (non-inclusive) -- `keep` is the maximum number of messages to be retained (might be less - depending on whether the specified count is available). -- The options `seq` and `keep` are mutually exclusive. -- `filter` is an optional subject (may include wildcards) to filter on. Only - messages matching the filter will be purged. -- `filter` and `seq` purges all messages matching filter having a sequence - number lower than the value specified. -- `filter` and `keep` purges all messages matching filter keeping at most the - specified number of messages. -- If `seq` or `keep` is specified, but `filter` is not, the stream will - remove/keep the specified number of messages. -- To `keep` _N_ number of messages for multiple subjects, invoke `purge` with - different `filter`s. -- If no options are provided, all messages are purged. - -## Consequences - -Tooling and services can use this endpoint to remove messages in creative ways. -For example, a stream may contain a number of samples, at periodic intervals a -service can sum them all and replace them with a single aggregate.
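
As an illustration of the options above, a client might build the request payload like this. It is a sketch under assumptions: `purgeRequest` and its `payload` method are hypothetical names, and rejecting `seq` together with `keep` on the client side is a design choice of this example, not something the ADR mandates.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// purgeRequest carries the optional purge settings described in the ADR
// (hypothetical type name; JSON keys follow the ADR).
type purgeRequest struct {
	Sequence uint64 `json:"seq,omitempty"`
	Keep     uint64 `json:"keep,omitempty"`
	Filter   string `json:"filter,omitempty"`
}

// payload validates the options and encodes them as JSON.
// seq and keep are mutually exclusive, so setting both is rejected here.
func (r purgeRequest) payload() ([]byte, error) {
	if r.Sequence > 0 && r.Keep > 0 {
		return nil, errors.New("seq and keep are mutually exclusive")
	}
	return json.Marshal(r)
}

func main() {
	// Keep at most 10 messages on subjects matching the filter.
	b, err := purgeRequest{Keep: 10, Filter: "samples.*"}.payload()
	fmt.Println(string(b), err)

	// No options: an empty JSON object, which purges all messages.
	b, err = purgeRequest{}.payload()
	fmt.Println(string(b), err)

	// Invalid combination.
	_, err = purgeRequest{Sequence: 100, Keep: 10}.payload()
	fmt.Println(err)
}
```

To keep _N_ messages for each of several subjects, the same payload would be sent once per `filter` value.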