nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-15 18:50:41 -07:00

Author	SHA1	Message	Date
Derek Collison	c16d361ead	Merge branch 'main' into dev	2023-06-27 20:41:57 -07:00
Derek Collison	9805101953	Use an account protected method to check for service imports to avoid data race when reloading accounts. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-27 19:52:43 -07:00
Derek Collison	8a8c37231f	Merge branch 'main' into dev	2023-06-10 20:56:42 -07:00
Derek Collison	11963e51fe	Optimize statsz locking and only send if we know we have external interest. Signed-off-by: Derek Collison <derek@nats.io>	2023-06-10 20:25:05 -07:00
Derek Collison	4c26cbb3de	Merge branch 'main' into dev	2023-05-12 12:38:20 -07:00
Waldemar Quevedo	286a1632ca	Use monotonic time for measuring time internally Signed-off-by: Waldemar Quevedo <wally@nats.io>	2023-05-12 08:27:46 -07:00
Ivan Kozlovic	311e3feb5f	Merge branch 'main' into dev	2023-05-03 17:38:40 -06:00
Waldemar Quevedo	938ffcba20	Fix race in reload and gateway sublist check Signed-off-by: Waldemar Quevedo <wally@nats.io>	2023-05-02 17:51:53 -07:00
Ivan Kozlovic	105237cba8	[ADDED] Multiple routes and ability to have per-account routes New configuration fields: ``` cluster { ... pool_size: 5 accounts: ["A", "B"] } ``` The configuration `pool_size` in the example above means that this server will create 5 routes to a remote server, assuming that that server has the same `pool_size` setting. Accounts (which are not part of the `accounts[]` configuration) are assigned a specific route in this pool, and this will be the same route on all servers in the cluster. Accounts that are defined in the `accounts` field will each have a dedicated route connection. This will allow suppression of the account name in some of the route protocols, reducing bytes transmitted which may increase performance. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2023-04-03 09:32:25 -06:00
peaaceChoi	038037381b	Fix some typos in code comment	2023-01-12 10:31:32 +09:00
Ivan Kozlovic	8d9c57ad44	[IMPROVED] Fan-out performance There was an observed degradation (around 5%) for large fan-out in v2.9.0 compared to earlier releases. This is because we added accounting of the in/out messages for the account, which results in 4 atomic operations, 2 for in and 2 for out; however, it means that for a fan-out of say 100 matching subscriptions, it is now 2 + 2 * 100 = 202. This PR reworks how the stats accounting is done, which removes the regression and even boosts the numbers a bit since we are doing the server stats update as an aggregate too. There are still degradations for queues and no-sub at all that need to be looked at. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-27 19:43:32 -06:00
Ivan Kozlovic	170ff49837	[ADDED] JetStream: peer (the hash of server name) in statsz/jsz A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something like this: ``` ... "meta_cluster": { "name": "local", "leader": "A", "peer": "NUmM6cRx", "replicas": [ { "name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w" }, { "name": "Server name unknown at this time (peerID: jZ6RvVRH)", "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH" } ], "cluster_size": 3 } ``` Note the "peer" field following the "leader" field that contains the server name. The new field is the node ID, which is a hash of the server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-16 15:31:37 -06:00
Ivan Kozlovic	8d1fb4bc92	[FIXED] JetStream: possible routing issues through gateways Internally jetstream may subscribe to some subject and then send a request with a reply subject matching that subscription. Due to interest propagation through a super cluster, it is possible that the reply comes back to a node that is not yet aware of the subscription interest which would cause the reply to be dropped. Some code detects that the subscription is recent and "maps" the reply subject so that it can be routed back to the origin server. However, this was done with the use of the connection object that created the subscription, but at the time of the send, a different internal "*client" object may be used which would then cause the code to not be aware of the recent subscription and not do the mapping. This code was changed to scope at the account level instead of connection. A recent change in PR #3412 is no longer needed and was reverted in favor of changes in this PR. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-31 14:18:28 -06:00
Derek Collison	98bf861a7a	Updates to stream and consumer move logic. Signed-off-by: Derek Collison <derek@nats.io>	2022-08-30 16:11:35 -07:00
Ivan Kozlovic	9d1e773e8f	[FIXED] Gateway: system request/replies may not work properly When a subscription is recently made, gateway code ensures that if there is a reply subject, the reply is "mapped" or rewritten to allow the reply to come back to the origin cluster, regardless of subscription interest propagation. The issue was that this uses a map with a `*client` as the key but the pointer for SYSTEM clients would not always be the same, which meant that the rewrite would not happen, causing possible "loss" of replies. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-29 14:05:51 -06:00
Ivan Kozlovic	f6c4e5fcee	[CHANGED] Gateway: Switch all accounts to interest-only mode We are phasing out the optimistic-only mode. Servers accepting inbound gateway connections will switch the accounts to interest-only mode. The servers with outbound gateway connection will check interest and ignore the "optimistic" mode if it is known that the corresponding inbound is going to switch the account to interest-only. This is done using a boolean in the gateway INFO protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-19 16:41:44 -06:00
Ivan Kozlovic	5d3ee8ebf4	[FIXED] Gateway: possible panic if monitor endpoint inspected too soon The monitoring http server is started early and the gateway setup (when configured) may not be fully ready when the `/gatewayz` endpoint is inspected and could cause a panic. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-17 13:30:58 -06:00
Ivan Kozlovic	3c9a7cc6e5	Move to Go 1.19, remove io/ioutil, fix data race and a flapper Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-05 09:55:37 -06:00
Ivan Kozlovic	98c1f0ecb2	Fixed some data races and some flappers Got a data race: ``` ================== WARNING: DATA RACE Write at 0x00c001c736b0 by goroutine 605: runtime.mapassign_faststr() /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/map_faststr.go:202 +0x0 github.com/nats-io/nats-server/v2/server.(Account).addServiceImport() /home/travis/gopath/src/github.com/nats-io/nats-server/server/accounts.go:1868 +0xb7b github.com/nats-io/nats-server/v2/server.(Account).AddServiceImportWithClaim() ... Previous read at 0x00c001c736b0 by goroutine 301: runtime.mapaccess2_faststr() /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/map_faststr.go:107 +0x0 github.com/nats-io/nats-server/v2/server.(Server).registerSystemImports() /home/travis/gopath/src/github.com/nats-io/nats-server/server/events.go:1577 +0x284 github.com/nats-io/nats-server/v2/server.(Server).updateAccountClaimsWithRefresh() ... ``` Also, removed a condition in gateway.go on how we were checking if a subject was a service reply, which was causing a test to flap. Finally, used AckSync() in a test (instead of m.Respond(nil)) to prevent it from flapping. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-29 19:02:41 -06:00
Ivan Kozlovic	63c750e295	[CHANGED] Gateway: Detect duplicate names between clusters Gateway connection will be closed and error reported if a remote has a name that is a duplicate of the local cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-15 15:00:13 -06:00
Ivan Kozlovic	85b3f8a7fd	Gateways: data race when setting first ping timer This was introduced when fixing #2881. The call to setFirstPingTimer needed to be done under the client's lock. Moved setFirstPingTimer from a server receiver to a client receiver. The only reason it was a server receiver is because we need the server options, but c.srv is always set when invoking this function, so we will get the server from c.srv in that function now. Related to #2881 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-04 19:55:07 -07:00
Ivan Kozlovic	08d6aaa78f	[FIXED] Gateway: connect could fail due to PING sent before CONNECT When a gateway connection was created (either accepted or initiated) the timer to fire the first PING was started at that time, which means that for an outbound connection, if the INFO coming from the other side was delayed, it was possible for the outbound to send the PING protocol before the CONNECT, which would cause the accepting side to close the connection due to a "parse" error (since the CONNECT for an inbound is supposed to be the very first protocol). Also noticed that we were not setting the auth timer like we do for the other type of connections. If authorization{timeout:<n>} is not set, the default is 2 seconds. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-02-23 15:19:20 -07:00
Ivan Kozlovic	5fc9e0e1cc	[FIXED] Gateway URLs gossip and `/varz` report issues - When detecting duplicate route, it was possible that a server would lose track of the peer's gateway URL, which would prevent it from gossiping that URL to inbound gateway connections - When a server has gateways enabled and has as a remote its own gateway, the monitoring endpoint `/varz` would include it but without the "urls" array. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-10-28 12:05:30 -06:00
Ivan Kozlovic	0bd38bd424	[FIXED] Monitoring: `/varz` gateway URLs not always updated When servers leave a cluster and their gateway URLs were not in the remote cluster's configuration, it is possible that their gateway URLs do not disappear from the list of URLs in the `/varz` monitoring endpoint. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-10-26 13:11:06 -06:00
Matthias Hanel	1c508220d8	Review comment Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-10-19 18:03:59 -04:00
Matthias Hanel	c4a3a4c95e	fix timer not being stopped prior to reset Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-10-19 16:56:20 -04:00
Derek Collison	f13fa767c2	Remove the swapping of accounts during processing of service imports. When processing service imports we would swap out the accounts during processing. With the addition of internal subscriptions and internal clients publishing in JetStream we had an issue with the wrong account being used. This was specific to delayed pull subscribers trying to unsubscribe due to max of 1 while other JetStream API calls were running concurrently.	2021-07-26 07:57:10 -07:00
Derek Collison	1270977322	When receiving a response across a gateway that has headers and a globally routed subject (_GR_) we were dropping header information. Signed-off-by: Derek Collison <derek@nats.io>	2021-06-10 14:29:33 -07:00
Matthias Hanel	b1dee292e6	[changed] pinned certs to check the server connected to as well (#2247 ) * [changed] pinned certs to check the server connected to as well on reload clients with removed pinned certs will be disconnected. The check happens only on tls handshake now. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-05-24 17:28:32 -04:00
Matthias Hanel	6f6f22e9a7	[added] pinned_cert option to tls block hex(sha256(spki)) (#2233 ) * [added] pinned_cert option to tls block hex(sha256(spki)) When read from config, the values are automatically lower cased. The check when seeing the values programmatically requires lower case to avoid having to alter the map at this point. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-05-20 17:00:09 -04:00
Ivan Kozlovic	2881e4a1f0	[FIXED] MQTT fixes and improvements Some issues that have been fixed would manifest by timeouts on connect, unexpected memory usage on high publish message rate. Some details: - Replies were not always GW routed properly because we were looking at the wrong connection's rsubs - GW routed replies would not be found because they were tracked in the subscription's client object, which may not be the same used to send the reply - Increased the mqtt timeout to wait for JS replies since in some tests it was sometimes taking more than the original 2 seconds - Incoming gateway messages destined for an MQTT internal subscription may have been rejected as a no interest if the account had service imports - Don't use time.After(), instead create explicit timer so it can be stopped when not timing out. - Unnecessary copy of a slice since we were converting to a string anyway. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-05-04 20:48:14 -06:00
Jaime Piña	e12181cb83	Return not ready for connection reason Currently, we use ReadyForConnections in server tests to wait for the server to be ready. However, when this fails we don't get a clue about why it failed. This change adds a new unexported method called readyForConnections that returns an error describing which check failed. The exported ReadyForConnections version works exactly as before. The unexported version gets used in internal tests only.	2021-04-20 11:45:08 -07:00
Ivan Kozlovic	56d0d9ec87	Do not propagate service import interest across GW and ROUTES Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-15 11:34:36 -06:00
Derek Collison	8eefff2b3b	Make sure the jetstream accounts use the name as the key to the map. This prevents possible double adds under reload or restart scenarios. Signed-off-by: Derek Collison <derek@nats.io>	2021-03-18 17:29:26 -07:00
Ivan Kozlovic	cbcff97244	[CHANGED] Move Gateway interest-only mode switch from INF to DBG Also fixed a test that would sometimes fail depending on timing. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-03-14 11:34:36 -06:00
Ivan Kozlovic	27f51d4028	Fix ephemeral consumer delete in single cluster Also remove retry of sources/mirror in the setSourceConsumer() itself when not getting a response. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-03-10 15:16:31 -07:00
Ivan Kozlovic	e7e756034a	Switch Gateway JS accounts to interest-only mode + some other fixes - Fixed the close of a TLS connection which starting Go 1.16 set the deadline to 5 seconds. - Fixed an issue with setHeader that was causing these error messages ``` === RUN TestServiceImportReplyMatchCycleMultiHops nats: message could not decode headers on connection [4] for subscription on "foo" --- PASS: TestServiceImportReplyMatchCycleMultiHops (0.04s) ``` - Fixed names of tests in norace_test.go since they must start with TestNoRace in order to make sure that we execute them in Travis: ``` go test -v -run=TestNoRace --failfast -p=1 ./... ``` Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-03-03 19:15:28 -07:00
Matthias Hanel	c50ee2a1c6	[Changed] all times exposed will be computed in UTC (#1943 ) This also applies to times that end up in that json. Where applicable moved time.Now() to where it is used. Moved calls to .UTC() to where time is created if that time is converted later anyway. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-03-02 21:37:42 -05:00
Ivan Kozlovic	1652fe62ef	Updates to when do snapshot Remove panic on runAsLeader when not able to subscribe (which happens on shutdown) Gateway name access does not need lock since it is immutable. Will prevent deadlocks in some situations. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-02-23 19:06:07 -07:00
Derek Collison	bb58d455f6	Revert switching to interest only mode Signed-off-by: Derek Collison <derek@nats.io>	2021-02-23 18:00:47 -08:00
Derek Collison	6d6a6c07ff	Don't send empty subjects, always put system account in interest only Signed-off-by: Derek Collison <derek@nats.io>	2021-02-23 10:57:12 -08:00
Derek Collison	fa8a74ceb5	Allow placement directives for metacontroller stepdown to allow placement to new clusters. Signed-off-by: Derek Collison <derek@nats.io>	2021-02-19 10:55:22 -08:00
Ivan Kozlovic	61bd1b8d86	MQTT clustering Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-02-19 08:50:00 -07:00
Ivan Kozlovic	8598de6dbe	[FIXED] Gateway's implicit connection not using global user/pass If a gateway is configured with an authorization block containing username and password and accepts an unknown Gateway connection, when initiating the outbound connection, it should use the gateway authorization's user/pass information. Resolves #1912 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-02-16 10:06:06 -07:00
Derek Collison	6d32c307ef	Remove pretty indent for json. Signed-off-by: Derek Collison <derek@nats.io>	2021-02-06 20:09:44 -08:00
Ivan Kozlovic	2b8c6e0124	Support for Websocket Leafnode connections Added two options in the remote leaf node configuration - compress, for websocket only at the moment - ws_masking, to force remote leafnode connections to mask websocket frames (default is no masking since it is communication between server to server) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Ivan Kozlovic	131be1cb33	Make TLS client/server handshake helpers function This reduces code duplication Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Ivan Kozlovic	ef38abe75b	Fixed gateway reply mapping following changes in JetStream clustering Those changes are required to maintain backward compatibility. Since the replies are "_G_.<gateway name hash>.<server ID hash>" and the hashes were 6 characters long, changing the hash function to 8 characters would break things. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-15 17:32:04 -07:00
Derek Collison	f0cdf89c61	JetStream Clustering WIP Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 01:14:52 -08:00
Ivan Kozlovic	d24e9b75b3	Fixed GW implicit reconnection PR #1412 had a fix for races during implicit GW reconnection. However, the fix was a bit too simplistic in that it was checking only if there was any inbound gateway to decide to try to reconnect an implicit disconnected GW. We need to check the name, not only presence of inbound GW connections. Related to #1412 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-28 12:28:55 -07:00
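
Commit 6f6f22e9a7 above describes `pinned_cert` values as hex(sha256(spki)) and notes that values read from config are lower-cased. The following is a minimal sketch of how such a fingerprint could be computed and compared in Go using only the standard library. The function names (`pinFromSPKI`, `pinFromCertPEM`, `isPinned`) are hypothetical and chosen for illustration; this is not the server's actual implementation.

```go
package main

import (
	"crypto/sha256"
	"crypto/x509"
	"encoding/hex"
	"encoding/pem"
	"errors"
	"fmt"
	"strings"
)

// pinFromSPKI returns the lower-case hex SHA-256 digest of the raw
// SubjectPublicKeyInfo bytes, i.e. hex(sha256(spki)).
// Hypothetical helper, not part of nats-server.
func pinFromSPKI(spki []byte) string {
	sum := sha256.Sum256(spki)
	return hex.EncodeToString(sum[:]) // EncodeToString emits lower-case hex
}

// pinFromCertPEM parses a PEM-encoded certificate and fingerprints its
// public key info. Hypothetical helper.
func pinFromCertPEM(pemBytes []byte) (string, error) {
	block, _ := pem.Decode(pemBytes)
	if block == nil || block.Type != "CERTIFICATE" {
		return "", errors.New("no PEM certificate block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return "", err
	}
	return pinFromSPKI(cert.RawSubjectPublicKeyInfo), nil
}

// isPinned reports whether fingerprint is in the pin set. The set is
// assumed to hold lower-case keys; only the lookup value is normalized,
// so the map itself never has to be altered.
func isPinned(pins map[string]struct{}, fingerprint string) bool {
	_, ok := pins[strings.ToLower(fingerprint)]
	return ok
}

func main() {
	// Arbitrary bytes stand in for real SPKI data in this demonstration.
	fp := pinFromSPKI([]byte("abc"))
	pins := map[string]struct{}{fp: {}}
	fmt.Println(fp)
	fmt.Println(isPinned(pins, strings.ToUpper(fp)))
}
```

Hashing the public key info rather than the whole certificate means a pin of this form survives certificate renewal as long as the key pair is unchanged.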


149 Commits