nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-15 10:40:41 -07:00

Author	SHA1	Message	Date
Neil Twigg	14d0ba1c65	Fix some lint errors after move to `golangci-lint`	2022-12-30 20:00:08 +00:00
Ivan Kozlovic	3ec42d5b85	Updates to PR #3611 - Save the TLS name only if not already set - Use the passed URLs slice instead of using s.getOpts().Routes - Enhanced the test - Fixed an unrelated DATA RACE report Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-08 09:36:08 -07:00
Ivan Kozlovic	2d181e1c27	[FIXED] Routing: TLS connections to discovered server may fail The server was not setting "server name" in the TLS configuration for route connections, which may lead to failed (re)connect if the certificate does not allow for the IP and the URL did not have the hostname, which would happen with gossip protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-07 17:26:17 -07:00
Ivan Kozlovic	170ff49837	[ADDED] JetStream: peer (the hash of server name) in statsz/jsz A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something like this: ``` ... "meta_cluster": { "name": "local", "leader": "A", "peer": "NUmM6cRx", "replicas": [ { "name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w" }, { "name": "Server name unknown at this time (peerID: jZ6RvVRH)", "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH" } ], "cluster_size": 3 } ``` Note the "peer" field following the "leader" field that contains the server name. The new field is the node ID, which is a hash of the server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-16 15:31:37 -06:00
Ivan Kozlovic	b6208c775b	[FIXED] Memory leak when unsubscribing the last queue subscription A server maintains a map for the subject+queue to know the number of members on the same group. However, on unsubscribe when we get to the last one being unsubscribed, we were removing from the map but then unfortunately adding back with a value of 0, which caused a leak. If the same subscription was coming back, then this map entry would be reused, but if it is a never coming back queue sub, then memory could increase continuously. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-04 18:42:13 -06:00
Ivan Kozlovic	5261d98781	[ADDED] Monitoring: Routez's individual route has now more info Added Start, LastActivity, Uptime and Idle that we normally have in a Connz for non route connections. This info can be useful to determine if a route is recent, etc.. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-05-18 13:18:53 -06:00
Ivan Kozlovic	63c750e295	[CHANGED] Gateway: Detect duplicate names between clusters Gateway connection will be closed and error reported if a remote has a name that is a duplicate of the local cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-15 15:00:13 -06:00
Ivan Kozlovic	85b3f8a7fd	Gateways: data race when setting first ping timer This was introduced when fixing #2881. The call to setFirstPingTimer needed to be done under the client's lock. Moved setFirstPingTimer from a server receiver to a client receiver. The only reason it was a server receiver is because we need the server options, but c.srv is always set when invoking this function, so we will get the server from c.srv in that function now. Related to #2881 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-03-04 19:55:07 -07:00
Derek Collison	ca1132a01d	Allow stream placement by tags. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-15 17:07:32 -08:00
Derek Collison	a0a2e32185	Remove dynamic account behaviors. We used these in tests and for experimenting with sandboxed environments like the demo network. Signed-off-by: Derek Collison <derek@nats.io>	2022-02-04 13:32:18 -08:00
Derek Collison	d962500827	Track reply subjects for pending pull requests across clustered consumers. We will only send if all peers in our group are >= 2.7.1 and we will check for updates. When a consumer follower takes over it will notify all pending requests that those requests are invalid now. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-21 16:31:59 -08:00
Derek Collison	52da55c8c6	Implement overflow placement for JetStream streams. This allows stream placement to overflow to adjacent clusters. We also do more balanced placement based on resources (store or mem). We can continue to expand this as well. We also introduce an account requirement that stream configs contain a MaxBytes value. We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves. Signed-off-by: Derek Collison <derek@nats.io>	2022-01-06 19:33:08 -08:00
Ivan Kozlovic	5fc9e0e1cc	[FIXED] Gateway URLs gossip and `/varz` report issues - When detecting duplicate route, it was possible that a server would lose track of the peer's gateway URL, which would prevent it from gossiping that URL to inbound gateway connections - When a server has gateways enabled and has as a remote its own gateway, the monitoring endpoint `/varz` would include it but without the "urls" array. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-10-28 12:05:30 -06:00
Matthias Hanel	41a253dabb	fix daisy chained leaf node subject propagation issue. (#2468 ) fixes #2448 initLeafNodeSmapAndSendSubs did not pick up enough local subscriptions. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-08-25 18:10:09 -04:00
Derek Collison	925a6fe6b2	Fix for #2388 . Leafnodes with no JS can seamlessly access a HUB with JS. This is the reverse of the early work to have LNs extend a non-JS cluster. Also have mixed mode tests as well. Signed-off-by: Derek Collison <derek@nats.io>	2021-08-01 14:57:47 -07:00
Matthias Hanel	a40ea298e5	[fixed] jetstream unique server name requirement across domains (#2378 ) * [fixed] jetstream unique server name requirement across domains including domain in server info adding check for cluster name in duplicate leaf node connection check This does not address non unique domains in the same domain, say within super cluster. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-07-27 18:42:19 -04:00
Ivan Kozlovic	d7933631a9	[FIXED] Failed route TLS handshake would leave failed conn's lock, locked This is a regression introduced in v2.2.6. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-06-22 14:05:43 -06:00
Matthias Hanel	b1dee292e6	[changed] pinned certs to check the server connected to as well (#2247 ) * [changed] pinned certs to check the server connected to as well on reload clients with removed pinned certs will be disconnected. The check happens only on tls handshake now. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-05-24 17:28:32 -04:00
R.I.Pienaar	5e06e5e232	Export the clientOpts structure This structure is used in ClientAuthentication, an interface designed to let 3rd parties extend the authentication mechanisms of the server In order to allow those 3rd parties to create unit tests, mocks etc we need to export this structure so it's accessible externally Signed-off-by: R.I.Pienaar <rip@devco.net>	2021-05-07 15:51:31 +02:00
Ivan Kozlovic	e2e3de9977	[FIXED] Message loop with cluster, leaf nodes and queue subs In a setup with a cluster of servers to which 2 different leaf nodes attach to, and queue subs are attached to one of the leaf, if the leaf server is restarted and reconnects to another server in the cluster, there was a risk for an infinite message loop between some servers in the "hub" cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-28 17:11:51 -06:00
Ivan Kozlovic	1014041be3	[FIXED] Possible panic due to concurrent access to unlocked map This could happen when a leafnode has permissions set and another connection (client, etc..) is about to assign a message to the leafnode while the leafnode itself is receiving messages and they both check permissions at the same time. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-20 21:18:13 -06:00
Jaime Piña	e12181cb83	Return not ready for connection reason Currently, we use ReadyForConnections in server tests to wait for the server to be ready. However, when this fails we don't get a clue about why it failed. This change adds a new unexported method called readyForConnections that returns an error describing which check failed. The exported ReadyForConnections version works exactly as before. The unexported version gets used in internal tests only.	2021-04-20 11:45:08 -07:00
Ivan Kozlovic	56d0d9ec87	Do not propagate service import interest across GW and ROUTES Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-15 11:34:36 -06:00
Derek Collison	e438d2f5fa	Mixed mode improvements. 1. When in mixed mode and only running the global account we now will check the account for JS. 2. Added code to decrease the cluster set size if we guessed wrong in mixed mode setup. Signed-off-by: Derek Collison <derek@nats.io>	2021-04-09 14:58:35 -07:00
Ivan Kozlovic	21a9bfa1d8	[FIXED] Leafnode: incorrect loop detection in multi-cluster setup If leafnodes from a cluster were to reconnect to a server in a different cluster, it was possible for that server to send to the leafnodes some of their own subscriptions that could cause an improper loop detection error. There was also a defect that would cause subscriptions over route for leafnode subscriptions to be registered under the wrong key, which would lead to those subscriptions not being properly removed on route disconnect. Finally, during route disconnect, the leafnodes map was not updated. This PR fixes that too. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-05 16:49:37 -06:00
Ivan Kozlovic	0f53bf6580	Fixed data race with nodeInfo Took the approach of storing struct instead of pointer. Of course, when changing the offline bool from false to true, it means that we need to call Store again (with same key). This is based on the assumption that those Load/Store are not too frequent. Otherwise, we may need to use locking (and keep *nodeInfo) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-03-03 13:28:45 -07:00
Matthias Hanel	4f2db7d187	Fixed linter issues Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-03-02 20:21:44 -05:00
Derek Collison	1c79d96de8	Use single node info struct Signed-off-by: Derek Collison <derek@nats.io>	2021-02-06 20:10:29 -08:00
Derek Collison	a1e0f7dc1a	First pass at supercluster enablement. This allows metacontrollers to span superclusters. Also includes placement directives for streams. By default they select the request origin cluster. Signed-off-by: Derek Collison <derek@nats.io>	2021-02-03 17:28:13 -08:00
Ivan Kozlovic	2b8c6e0124	Support for Websocket Leafnode connections Added two options in the remote leaf node configuration - compress, for websocket only at the moment - ws_masking, to force remote leafnode connections to mask websocket frames (default is no masking since it is communication between server to server) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Ivan Kozlovic	131be1cb33	Make TLS client/server handshake helpers function This reduces code duplication Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Derek Collison	a1730f1b31	Report on RAFT group information. This adds in optional reporting to stream and consumer info when running in clustered mode. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-20 11:58:31 -08:00
Ivan Kozlovic	42dcdd2eb2	Simplify sendSubsToRoute() Since we were creating subs on the fly, sub.im would always be nil. We were passing a client because it was needed in sendRouteSubOrUnSubProtos(). This PR simply fills the buffer with each account's subscriptions. There is also no need to have subs sent from different go routine based on some threshold. Routes are no longer subject to max pending. Some code has been made into a function so that they can be shared by sendSubsToRoute() and sendRouteSubOrUnSubProtos(). The function is simply adding to given buffer the RS+/- protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-19 14:01:43 -07:00
Ivan Kozlovic	ef38abe75b	Fixed gateway reply mapping following changes in JetStream clustering Those changes are required to maintain backward compatibility. Since the replies are "_G_.<gateway name hash>.<server ID hash>" and the hash were 6 characters long, changing to 8 the hash function would break things. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-15 17:32:04 -07:00
Derek Collison	f0cdf89c61	JetStream Clustering WIP Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 01:14:52 -08:00
Ivan Kozlovic	67425d23c8	Add c.isMqtt() and c.isWebsocket() This hides the check on "c.mqtt != nil" or "c.ws != nil". Added some tests. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-02 15:52:06 -07:00
Ivan Kozlovic	77aead807c	Send LS- without origin to route When cluster origin code was added, a server may send LS+ with an origin cluster name in the protocol. Parsing code from a ROUTER connection was adjusted to understand this LS+ protocol. However, the server was also sending an LS- with origin but the parsing code was not able to understand that. When the unsub was for a queue subscription, this would cause the parser to error out and close the route connection. This PR sends an LS- without the origin in this case (so that tracing makes sense in term of LS+/LS- sent to a route). The receiving side then traces appropriate LS- but processes as a normal RS-. Resolves #1751 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-11-30 13:31:32 -07:00
Ivan Kozlovic	13df1a55fd	Changed warning message Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-09 09:36:30 -06:00
Ivan Kozlovic	df9d5f5fd9	Accepting route warns if remote server has same name Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-08 17:59:33 -06:00
Matthias Hanel	634ce9f7c8	[Adding] Accountz monitoring endpoint and INFO monitoring req subject Returned imports/exports are formatted like jwt exports imports, even if the originating account is from config. Fixes #1604 Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-09-23 16:22:48 -04:00
Ivan Kozlovic	2ad2bed170	[ADDED] Support for route hostname resolution We previously simply called DialTimeout() on a route's url when soliciting. If it resolved to the IP of the host, it would create a route to self, which server detects, but then would not try again with other IPs that would have allowed to form a cluster with other servers running on the other IPs. This PR keeps track of local IPs + cluster port and excludes them from the list of IPs returned by LookupHost API. This even prevents solicitation of routes to self. Only non-local IPs will be tried. Resolves #1586 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-09-08 13:40:17 -06:00
Phil Pennock	3c680eceb9	Inhibit Go's default TCP keepalive settings for NATS (#1562 ) Inhibit Go's default TCP keepalive settings for NATS Go 1.13 changed the semantics of the tuning parameters for TCP keepalives, including the default value. This affects all TCP listeners. The NATS protocol has its own L7 keepalive system (PING/PONG) and the Go defaults are not a good fit for some valid deployment scenarios, while Go doesn't directly expose a working API for tuning these. Rather than add a configuration knob and pull in another dependency (with portability issues) just disable TCP keepalives for all listeners used for speaking the NATS protocol. Change the tests so we test the same logic. Do not change HTTP monitoring, profiling, or the websocket API listeners. Change KeepAlive on client connections too.	2020-08-14 13:37:59 -04:00
Ivan Kozlovic	c620175353	Rework closeConnection() This change allows the removal of the connection and update of the server state to be done "in place" but still delay the flushing of and close of tcp connection to the writeLoop. With ref counting we ensure that the reconnect happens after the flushing but not before the state has been updated. Had to fix some places where we may have called closeConnection() from under the server lock since it now would deadlock for sure. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-31 15:30:17 -06:00
Ivan Kozlovic	96ccf91566	[FIXED] Possible deadlock with solicited leafnodes when cluster conflict We cannot call c.closeConnection() under the server lock because closeConnection() can invoke server lock in some cases. Created a test that should run without `-race` to reproduce the deadlock (which it does) but sometimes would fail because cluster would not be formed. This uncovered an issue with conflict resolution which test TestRouteClusterNameConflictBetweenStaticAndDynamic() can reproduce easily. The issue was that we were not updating a dynamic name with the remote if the remote was non dynamic. Resolves #1543 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-30 18:45:36 -06:00
Matthias Hanel	3da66ad80d	Remove unnecessary account fetch from remote remove functions Changed: removeReplySub, removeRemoteSubs and processRemoteUnsub Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-07-28 11:00:17 -04:00
Matthias Hanel	946e8415a0	Incorporating review comments	2020-07-27 19:19:43 -04:00
Matthias Hanel	00faefec06	Reduce usage of tmpAccounts to only location where it is needed imports On import handle it with priority as in non recursive situations, it won't be present.	2020-07-27 17:38:39 -04:00
Matthias Hanel	37692d2cf9	[Fixed] Skip fetch when a non config based account resolver is used Resolves #1532 Instead of the fetched account we create a dummy account that is expired. Any client connecting will trigger a fetch of the actual account jwt. This also avoids one fetch, thus the unit test was changed to reflect this. Unlike other resolver the memory resolver does not depend on external systems. It is purely based on server configuration. Therefore, fetch can be done and not finding an account means there is a configuration issue.	2020-07-27 17:36:55 -04:00
Ivan Kozlovic	9b0967a5d1	[FIXED] Handling of gossiped URLs If some servers in the cluster have the same connect URLs (due to the use of client advertise), then it would be possible to have a server send the connect_urls INFO update to clients with missing URLs. Resolves #1515 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-15 17:39:12 -06:00
Ivan Kozlovic	4d495104de	Fixed no_responders use of sendProtoNow() The call sendProtoNow() should not normally be used (only when setting up a connection when the writeLoop is not yet started and server needs to send something before being able to start the writeLoop). Instead, code should use enqueueProto(). For this particular case though, use queueOutbound() directly and add to the producer's pcd map. Also fixed other places where we were using queueOutbound() + flushSignal() which is what enqueueProto is doing. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-09 17:55:14 -06:00


257 Commits