nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-02 03:38:42 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	1eb08505d4	[FIXED] Routes: Pinned Accounts connect/reconnect in some cases The issue is with a server that has a route for a given account but connects to a server that does not support it. The creation of the route for this account will fail - as expected - and the server will stop trying to create the route for this account. But it needs to retry to create this route if it were to reconnect to that same URL in case the server (or its config) is updated to support a route for this account. There was also an issue even with 2.10.0 servers in some gossip situations. Namely, if server B is soliciting connections to A (but not vice-versa) and A would solicit connections to C (but not vice-versa). In this case, connections for pinned-accounts would not be created. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-09-28 10:46:32 -06:00
Derek Collison	c5b98f5c79	Make server shutdown an atomic and check inside unsubscribe to avoid unnecessary work. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-26 17:53:58 -07:00
Derek Collison	a0029181ae	Fix datarace Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 12:04:42 -07:00
Derek Collison	7ce47fd182	Move server running state to atomic to avoid contention at NRG layer. Signed-off-by: Derek Collison <derek@nats.io>	2023-09-25 11:18:15 -07:00
Waldemar Quevedo	baa2805de9	Fix discarding explicit routes while removing duplicate ones (#4414) In the new clustering logic for v2.10, sometimes the `TestStressChainedSolicitWorks` test would flake because a node would end up with only implicit routes. In this change, we stamp that one of the remotes is configured so that the nodes at least have one explicit configured remote node.	2023-08-22 08:50:35 -07:00
Waldemar Quevedo	bdb874a6a8	Update LastActivity on connect for routes Signed-off-by: Waldemar Quevedo <wally@nats.io>	2023-08-22 07:10:30 -07:00
Waldemar Quevedo	673f654fbe	Fix discarding explicit routes while removing duplicate ones In the new clustering logic sometimes the TestStressChainedSolicitWorks test would fail because a node would end up with only implicit routes. In this change, we stamp that one of the remotes is configured so that the nodes at least have one explicit configured remote node. Signed-off-by: Waldemar Quevedo <wally@nats.io>	2023-08-21 15:16:08 -07:00
Derek Collison	4d7cd26956	Add in support for segmented binary stream snapshots. Streams with many interior deletes were causing issues due to the fact that the interior deletes were represented as a sorted []uint64. This approach introduces 3 sub types of delete blocks: an avl bitmask tree, a run length encoding, and the legacy format above. We also take into account large interior deletes such that on receiving a snapshot we can skip things we already know about. Signed-off-by: Derek Collison <derek@nats.io>	2023-07-03 08:41:33 -07:00
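To illustrate why a run length encoding helps when interior deletes are dense, here is a minimal sketch that collapses a sorted list of deleted sequence numbers into runs. The `run` type and `encodeRuns` function are hypothetical names chosen for this example; they are not the server's actual delete-block format, and the input is assumed to be strictly increasing.

```go
package main

import "fmt"

// run is a hypothetical (first, num) pair: num consecutive deleted
// sequence numbers starting at first. Not the server's wire format.
type run struct {
	first, num uint64
}

// encodeRuns collapses a strictly increasing list of deleted sequence
// numbers into runs of consecutive values.
func encodeRuns(seqs []uint64) []run {
	var runs []run
	for _, s := range seqs {
		// Extend the last run when s is the next consecutive sequence.
		if n := len(runs); n > 0 && runs[n-1].first+runs[n-1].num == s {
			runs[n-1].num++
			continue
		}
		runs = append(runs, run{first: s, num: 1})
	}
	return runs
}

func main() {
	// Six deleted sequences collapse into three runs.
	fmt.Println(encodeRuns([]uint64{3, 4, 5, 9, 10, 20})) // prints [{3 3} {9 2} {20 1}]
}
```

A long contiguous block of deletes costs one pair here regardless of its length, whereas the sorted `[]uint64` representation grows by one entry per deleted message.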
Derek Collison	f342f6a758	Merge branch 'main' into dev	2023-06-05 14:13:18 -07:00
Nikita Mochalov	4c181bc99a	Use sentinel error	2023-06-05 22:41:09 +03:00
Nikita Mochalov	f71c49511b	Fix client panic on absent server field	2023-06-05 15:27:45 +03:00
Ivan Kozlovic	607b0ca7f3	Fixed cluster permissions configuration reload This is a rework of incorrect changes made in PR #4001. This affects only the `dev` branch. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-18 19:02:03 -06:00
Ivan Kozlovic	cf474d6333	Revert changes related to leafnode PING interval Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-16 13:49:00 -06:00
Ivan Kozlovic	67498af2dc	[ADDED] LeafNode: Support for s2 compression This is similar to PR #4115 but for LeafNodes. Compression mode can be set on both sides (the accept and in remotes). ``` leafnodes { port: 7422 compression: s2_best remotes [ { url: "nats://host2:74222" compression: s2_better } ] } ``` Possible modes are similar to those for routes (described in PR #4115), except that when not defined we default to `s2_auto`. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-15 17:42:39 -06:00
Derek Collison	4175e4ee9c	Merge branch 'main' into dev	2023-05-06 09:55:34 -07:00
Derek Collison	80db7a22ab	Optimizations for large single hub account leafnode fleets. Added a leafnode lock to allow better traversal without copying of large leafnodes in a single hub account. Signed-off-by: Derek Collison <derek@nats.io>	2023-05-05 13:14:49 -07:00
Ivan Kozlovic	311e3feb5f	Merge branch 'main' into dev	2023-05-03 17:38:40 -06:00
Ivan Kozlovic	8a4ead22bc	Updates based on code review Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-03 16:14:51 -06:00
Ivan Kozlovic	840c264f45	Cleanup use of s.opts and fixed some lock (deadlock/inversion) issues One should not access s.opts directly but instead use s.getOpts(). Also, server lock needs to be released when performing an account lookup (since this may result in server lock being acquired). A function was calling s.LookupAccount under the client lock, which technically creates a lock inversion situation. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-05-03 14:09:02 -06:00
Ivan Kozlovic	d6fe9d4c2d	[ADDED] Support for route S2 compression The new field `compression` in the `cluster{}` block allows specifying which compression mode to use between servers. It can be simply specified as a boolean or a string for the simple modes, or as an object for the "s2_auto" mode where a list of RTT thresholds can be specified. By default, if no compression field is specified, the server will use the s2_auto mode with default RTT thresholds of 10ms, 50ms and 100ms for the "uncompressed", "fast", "better" and "best" modes. ``` cluster { .. # Possible values are "disabled", "off", "enabled", "on", # "accept", "s2_fast", "s2_better", "s2_best" or "s2_auto" compression: s2_fast } ``` To specify a different list of thresholds for the s2_auto mode, here is what it would look like: ``` cluster { .. compression: { mode: s2_auto # This means that for RTT up to 5ms (included), then # the compression level will be "uncompressed", then # from 5ms+ to 15ms, the mode will switch to "s2_fast", # then from 15ms+ to 50ms, the level will switch to # "s2_better", and anything above 50ms will result # in the "s2_best" compression mode. rtt_thresholds: [5ms, 15ms, 50ms] } } ``` Note that the "accept" mode means that a server will accept compression from a remote and switch to that same compression mode, but will otherwise not initiate compression. That is, if 2 servers are configured with "accept", then compression will actually be "off". If one of the servers had, say, s2_fast then they would both use this mode. If a server has compression mode set (other than "off") but connects to an older server, there will be no compression between those 2 routes. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-27 17:59:25 -06:00
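The threshold rule described in this commit message can be sketched as a small lookup. This is only an illustration of the stated rule (each threshold is inclusive, and anything above the last one gets the strongest mode); `selectMode`, `modes` and `exampleThresholds` are hypothetical names, not the server's API, and the sketch assumes at most three thresholds.

```go
package main

import (
	"fmt"
	"time"
)

// modes lists the compression levels in order of increasing RTT,
// using the names from the commit message.
var modes = []string{"uncompressed", "s2_fast", "s2_better", "s2_best"}

// exampleThresholds mirrors `rtt_thresholds: [5ms, 15ms, 50ms]`.
var exampleThresholds = []time.Duration{
	5 * time.Millisecond,
	15 * time.Millisecond,
	50 * time.Millisecond,
}

// selectMode (hypothetical) returns the mode for the first threshold
// that rtt does not exceed, or the mode after the last threshold.
func selectMode(rtt time.Duration, thresholds []time.Duration) string {
	n := len(thresholds)
	if n > len(modes)-1 {
		n = len(modes) - 1 // assumption: ignore extra thresholds
	}
	for i := 0; i < n; i++ {
		if rtt <= thresholds[i] {
			return modes[i]
		}
	}
	return modes[n]
}

func main() {
	for _, ms := range []int{3, 5, 12, 40, 80} {
		rtt := time.Duration(ms) * time.Millisecond
		fmt.Printf("%v -> %s\n", rtt, selectMode(rtt, exampleThresholds))
	}
}
```

With the example thresholds, an RTT of exactly 5ms still maps to "uncompressed", while anything above 50ms maps to "s2_best".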
Derek Collison	c3c4c8f1f5	Merge branch 'main' into dev	2023-04-14 11:47:15 -07:00
Derek Collison	0fe48fe91e	Use new server read locks now that we have them Signed-off-by: Derek Collison <derek@nats.io>	2023-04-14 10:11:40 -07:00
Ivan Kozlovic	83c5c0177a	Changes based on code review Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-03 09:32:28 -06:00
Ivan Kozlovic	bd1b7b8d55	Cleanup use of s.opts and fixed some lock (deadlock/inversion) issues One should not access s.opts directly but instead use s.getOpts(). Also, server lock needs to be released when performing an account lookup (since this may result in server lock being acquired). A function was calling s.LookupAccount under the client lock, which technically creates a lock inversion situation. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-03 09:32:28 -06:00
Ivan Kozlovic	105237cba8	[ADDED] Multiple routes and ability to have per-account routes New configuration fields: ``` cluster { ... pool_size: 5 accounts: ["A", "B"] } ``` The configuration `pool_size` in the example above means that this server will create 5 routes to a remote server, assuming that that server has the same `pool_size` setting. Accounts (which are not part of the `accounts[]` configuration) are assigned a specific route in this pool, and this will be the same route on all servers in the cluster. Accounts that are defined in the `accounts` field will each have a dedicated route connection. This will allow suppression of the account name in some of the route protocols, reducing bytes transmitted which may increase performance. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-03 09:32:25 -06:00
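For every server to pick the same pooled route for a given account, the assignment has to be a deterministic function of the account name and the pool size. The commit message does not say how the server computes it; the sketch below assumes an FNV-1a hash taken modulo `pool_size`, and `poolIndex` is a hypothetical name used only for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// poolIndex (hypothetical) maps an account name to a route index in
// [0, poolSize). Hashing with FNV-1a is an assumption made for this
// sketch, not the server's documented algorithm. poolSize must be > 0.
func poolIndex(account string, poolSize int) int {
	h := fnv.New32a()
	h.Write([]byte(account)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(poolSize))
}

func main() {
	// Accounts not listed in `accounts` share the pooled routes.
	for _, acc := range []string{"C", "D", "E"} {
		fmt.Printf("account %s -> route %d of 5\n", acc, poolIndex(acc, 5))
	}
}
```

Because the result depends only on the name and the pool size, two servers configured with the same `pool_size` agree on the index without exchanging any extra state.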
Neil Twigg	14d0ba1c65	Fix some lint errors after move to `golangci-lint`	2022-12-30 20:00:08 +00:00
Ivan Kozlovic	3ec42d5b85	Updates to PR #3611 - Save the TLS name only if not already set - Use the passed URLs slice instead of using s.getOpts().Routes - Enhanced the test - Fixed an unrelated DATA RACE report Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-08 09:36:08 -07:00
Ivan Kozlovic	2d181e1c27	[FIXED] Routing: TLS connections to discovered server may fail The server was not setting "server name" in the TLS configuration for route connections, which may lead to failed (re)connect if the certificate does not allow for the IP and the URL did not have the hostname, which would happen with gossip protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-07 17:26:17 -07:00
Ivan Kozlovic	170ff49837	[ADDED] JetStream: peer (the hash of server name) in statsz/jsz A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something like this: ``` ... "meta_cluster": { "name": "local", "leader": "A", "peer": "NUmM6cRx", "replicas": [ { "name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w" }, { "name": "Server name unknown at this time (peerID: jZ6RvVRH)", "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH" } ], "cluster_size": 3 } ``` Note the "peer" field following the "leader" field that contains the server name. The new field is the node ID, which is a hash of the server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-16 15:31:37 -06:00
Ivan Kozlovic	b6208c775b	[FIXED] Memory leak when unsubscribing the last queue subscription A server maintains a map for the subject+queue to know the number of members on the same group. However, on unsubscribe when we get to the last one being unsubscribed, we were removing from the map but then unfortunately adding back with a value of 0, which caused a leak. If the same subscription was coming back, then this map entry would be reused, but if that queue sub never came back, then memory could increase continuously. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-04 18:42:13 -06:00
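The leak pattern described above is easy to reproduce with any reference-counted map. The following sketch shows the corrected behaviour, where the entry is deleted once the last member leaves rather than being stored back with a count of 0. `queueCounts` and its methods are hypothetical names for illustration, not the server's actual types.

```go
package main

import "fmt"

// queueCounts (hypothetical) tracks how many members share a
// subject+queue key.
type queueCounts map[string]int

// subscribe records one more member for key.
func (q queueCounts) subscribe(key string) {
	q[key]++
}

// unsubscribe drops one member and deletes the entry when the last
// member leaves, so keys that never come back do not accumulate.
func (q queueCounts) unsubscribe(key string) {
	n, ok := q[key]
	if !ok {
		return
	}
	if n <= 1 {
		delete(q, key) // the buggy variant stored q[key] = 0 here
		return
	}
	q[key] = n - 1
}

func main() {
	q := queueCounts{}
	q.subscribe("orders workers")
	q.subscribe("orders workers")
	q.unsubscribe("orders workers")
	q.unsubscribe("orders workers")
	fmt.Println(len(q)) // prints 0
}
```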
Ivan Kozlovic	5261d98781	[ADDED] Monitoring: Routez's individual route now has more info Added Start, LastActivity, Uptime and Idle that we normally have in a Connz for non route connections. This info can be useful to determine if a route is recent, etc. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-05-18 13:18:53 -06:00
Ivan Kozlovic	63c750e295	[CHANGED] Gateway: Detect duplicate names between clusters Gateway connection will be closed and error reported if a remote has a name that is a duplicate of the local cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-15 15:00:13 -06:00
Ivan Kozlovic	85b3f8a7fd	Gateways: data race when setting first ping timer This was introduced when fixing #2881. The call to setFirstPingTimer needed to be done under the client's lock. Moved setFirstPingTimer from a server receiver to a client receiver. The only reason it was a server receiver is because we need the server options, but c.srv is always set when invoking this function, so we will get the server from c.srv in that function now. Related to #2881 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-04 19:55:07 -07:00
Derek Collison	ca1132a01d	Allow stream placement by tags. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-15 17:07:32 -08:00
Derek Collison	a0a2e32185	Remove dynamic account behaviors. We used these in tests and for experimenting with sandboxed environments like the demo network. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-04 13:32:18 -08:00
Derek Collison	d962500827	Track reply subjects for pending pull requests across clustered consumers. We will only send if all peers in our group are >= 2.7.1 and we will check for updates. When a consumer follower takes over it will notify all pending requests that those requests are invalid now. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-21 16:31:59 -08:00
Derek Collison	52da55c8c6	Implement overflow placement for JetStream streams. This allows stream placement to overflow to adjacent clusters. We also do more balanced placement based on resources (store or mem). We can continue to expand this as well. We also introduce an account requirement that stream configs contain a MaxBytes value. We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-06 19:33:08 -08:00
Ivan Kozlovic	5fc9e0e1cc	[FIXED] Gateway URLs gossip and `/varz` report issues - When detecting duplicate route, it was possible that a server would lose track of the peer's gateway URL, which would prevent it from gossiping that URL to inbound gateway connections - When a server has gateways enabled and has as a remote its own gateway, the monitoring endpoint `/varz` would include it but without the "urls" array. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-10-28 12:05:30 -06:00
Matthias Hanel	41a253dabb	fix daisy chained leaf node subject propagation issue. (#2468) fixes #2448 initLeafNodeSmapAndSendSubs did not pick up enough local subscriptions. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-08-25 18:10:09 -04:00
Derek Collison	925a6fe6b2	Fix for #2388. Leafnodes with no JS can seamlessly access a HUB with JS. This is the reverse of the early work to have LNs extend a non-JS cluster. Also have mixed mode tests as well. Signed-off-by: Derek Collison <derek@nats.io>	2021-08-01 14:57:47 -07:00
Matthias Hanel	a40ea298e5	[fixed] jetstream unique server name requirement across domains (#2378) * [fixed] jetstream unique server name requirement across domains including domain in server info adding check for cluster name in duplicate leaf node connection check This does not address non unique domains in the same domain, say within super cluster. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-07-27 18:42:19 -04:00
Ivan Kozlovic	d7933631a9	[FIXED] Failed route TLS handshake would leave failed conn's lock, locked This is a regression introduced in v2.2.6. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-06-22 14:05:43 -06:00
Matthias Hanel	b1dee292e6	[changed] pinned certs to check the server connected to as well (#2247) * [changed] pinned certs to check the server connected to as well on reload clients with removed pinned certs will be disconnected. The check happens only on tls handshake now. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-05-24 17:28:32 -04:00
R.I.Pienaar	5e06e5e232	Export the clientOpts structure This structure is used in ClientAuthentication, an interface designed to let 3rd parties extend the authentication mechanisms of the server In order to allow those 3rd parties to create unit tests, mocks etc we need to export this structure so it's accessible externally Signed-off-by: R.I.Pienaar <rip@devco.net>	2021-05-07 15:51:31 +02:00
Ivan Kozlovic	e2e3de9977	[FIXED] Message loop with cluster, leaf nodes and queue subs In a setup with a cluster of servers to which 2 different leaf nodes attach, and queue subs are attached to one of the leaf nodes, if the leaf server is restarted and reconnects to another server in the cluster, there was a risk for an infinite message loop between some servers in the "hub" cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-28 17:11:51 -06:00
Ivan Kozlovic	1014041be3	[FIXED] Possible panic due to concurrent access to unlocked map This could happen when a leafnode has permissions set and another connection (client, etc.) is about to assign a message to the leafnode while the leafnode itself is receiving messages and they both check permissions at the same time. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-20 21:18:13 -06:00
Jaime Piña	e12181cb83	Return not ready for connection reason Currently, we use ReadyForConnections in server tests to wait for the server to be ready. However, when this fails we don't get a clue about why it failed. This change adds a new unexported method called readyForConnections that returns an error describing which check failed. The exported ReadyForConnections version works exactly as before. The unexported version gets used in internal tests only.	2021-04-20 11:45:08 -07:00
Ivan Kozlovic	56d0d9ec87	Do not propagate service import interest across GW and ROUTES Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-15 11:34:36 -06:00
Derek Collison	e438d2f5fa	Mixed mode improvements. 1. When in mixed mode and only running the global account we now will check the account for JS. 2. Added code to decrease the cluster set size if we guessed wrong in mixed mode setup. Signed-off-by: Derek Collison <derek@nats.io>	2021-04-09 14:58:35 -07:00
Ivan Kozlovic	21a9bfa1d8	[FIXED] Leafnode: incorrect loop detection in multi-cluster setup If leafnodes from a cluster were to reconnect to a server in a different cluster, it was possible for that server to send to the leafnodes some of their own subscriptions that could cause an improper loop detection error. There was also a defect that would cause subscriptions over route for leafnode subscriptions to be registered under the wrong key, which would lead to those subscriptions not being properly removed on route disconnect. Finally, during route disconnect, the leafnodes map was not updated. This PR fixes that too. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-05 16:49:37 -06:00


282 Commits