nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-14 10:10:42 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	d7933631a9	[FIXED] Failed route TLS handshake would leave failed conn's lock, locked This is a regression introduced in v2.2.6. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-06-22 14:05:43 -06:00
Matthias Hanel	b1dee292e6	[changed] pinned certs to check the server connected to as well (#2247) * [changed] pinned certs to check the server connected to as well. On reload, clients with removed pinned certs will be disconnected. The check happens only on tls handshake now. Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-05-24 17:28:32 -04:00
R.I.Pienaar	5e06e5e232	Export the clientOpts structure This structure is used in ClientAuthentication, an interface designed to let 3rd parties extend the authentication mechanisms of the server In order to allow those 3rd parties to create unit tests, mocks etc we need to export this structure so it's accessible externally Signed-off-by: R.I.Pienaar <rip@devco.net>	2021-05-07 15:51:31 +02:00
Ivan Kozlovic	e2e3de9977	[FIXED] Message loop with cluster, leaf nodes and queue subs In a setup with a cluster of servers to which 2 different leaf nodes attach to, and queue subs are attached to one of the leaf, if the leaf server is restarted and reconnects to another server in the cluster, there was a risk for an infinite message loop between some servers in the "hub" cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-28 17:11:51 -06:00
Ivan Kozlovic	1014041be3	[FIXED] Possible panic due to concurrent access to unlocked map This could happen when a leafnode has permissions set and another connection (client, etc..) is about to assign a message to the leafnode while the leafnode itself is receiving messages and they both check permissions at the same time. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-20 21:18:13 -06:00
Jaime Piña	e12181cb83	Return not ready for connection reason Currently, we use ReadyForConnections in server tests to wait for the server to be ready. However, when this fails we don't get a clue about why it failed. This change adds a new unexported method called readyForConnections that returns an error describing which check failed. The exported ReadyForConnections version works exactly as before. The unexported version gets used in internal tests only.	2021-04-20 11:45:08 -07:00
Ivan Kozlovic	56d0d9ec87	Do not propagate service import interest across GW and ROUTES Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-15 11:34:36 -06:00
Derek Collison	e438d2f5fa	Mixed mode improvements. 1. When in mixed mode and only running the global account we now will check the account for JS. 2. Added code to decrease the cluster set size if we guessed wrong in mixed mode setup. Signed-off-by: Derek Collison <derek@nats.io>	2021-04-09 14:58:35 -07:00
Ivan Kozlovic	21a9bfa1d8	[FIXED] Leafnode: incorrect loop detection in multi-cluster setup If leafnodes from a cluster were to reconnect to a server in a different cluster, it was possible for that server to send to the leafnodes some of their own subscriptions that could cause an improper loop detection error. There was also a defect that would cause subscriptions over route for leafnode subscriptions to be registered under the wrong key, which would lead to those subscriptions not being properly removed on route disconnect. Finally, during route disconnect, the leafnodes map was not updated. This PR fixes that too. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-04-05 16:49:37 -06:00
Ivan Kozlovic	0f53bf6580	Fixed data race with nodeInfo Took the approach of storing struct instead of pointer. Of course, when changing the offline bool from false to true, it means that we need to call Store again (with same key). This is based on the assumption that those Load/Store are not too frequent. Otherwise, we may need to use locking (and keep *nodeInfo) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-03-03 13:28:45 -07:00
Matthias Hanel	4f2db7d187	Fixed linter issues Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-03-02 20:21:44 -05:00
Derek Collison	1c79d96de8	use single node info struct Signed-off-by: Derek Collison <derek@nats.io>	2021-02-06 20:10:29 -08:00
Derek Collison	a1e0f7dc1a	First pass at supercluster enablement. This allows metacontrollers to span superclusters. Also includes placement directives for streams. By default they select the request origin cluster. Signed-off-by: Derek Collison <derek@nats.io>	2021-02-03 17:28:13 -08:00
Ivan Kozlovic	2b8c6e0124	Support for Websocket Leafnode connections Added two options in the remote leaf node configuration - compress, for websocket only at the moment - ws_masking, to force remote leafnode connections to mask websocket frames (default is no masking since it is communication between server to server) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Ivan Kozlovic	131be1cb33	Make TLS client/server handshake helpers function This reduces code duplication Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Derek Collison	a1730f1b31	Report on RAFT group information. This adds in optional reporting to stream and consumer info when running in clustered mode. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-20 11:58:31 -08:00
Ivan Kozlovic	42dcdd2eb2	Simplify sendSubsToRoute() Since we were creating subs on the fly, sub.im would always be nil. We were passing a client because it was needed in sendRouteSubOrUnSubProtos(). This PR simply fills the buffer with each account's subscriptions. There is also no need to have subs sent from different go routine based on some threshold. Routes are no longer subject to max pending. Some code has been made into a function so that they can be shared by sendSubsToRoute() and sendRouteSubOrUnSubProtos(). The function is simply adding to given buffer the RS+/- protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-19 14:01:43 -07:00
Ivan Kozlovic	ef38abe75b	Fixed gateway reply mapping following changes in JetStream clustering Those changes are required to maintain backward compatibility. Since the replies are "_G_.<gateway name hash>.<server ID hash>" and the hash were 6 characters long, changing to 8 the hash function would break things. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-15 17:32:04 -07:00
Derek Collison	f0cdf89c61	JetStream Clustering WIP Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 01:14:52 -08:00
Ivan Kozlovic	67425d23c8	Add c.isMqtt() and c.isWebsocket() This hides the check on "c.mqtt != nil" or "c.ws != nil". Added some tests. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-02 15:52:06 -07:00
Ivan Kozlovic	77aead807c	Send LS- without origin to route When cluster origin code was added, a server may send LS+ with an origin cluster name in the protocol. Parsing code from a ROUTER connection was adjusted to understand this LS+ protocol. However, the server was also sending an LS- with origin but the parsing code was not able to understand that. When the unsub was for a queue subscription, this would cause the parser to error out and close the route connection. This PR sends an LS- without the origin in this case (so that tracing makes sense in term of LS+/LS- sent to a route). The receiving side then traces appropriate LS- but processes as a normal RS-. Resolves #1751 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-11-30 13:31:32 -07:00
Ivan Kozlovic	13df1a55fd	Changed warning message Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-09 09:36:30 -06:00
Ivan Kozlovic	df9d5f5fd9	Accepting route warns if remote server has same name Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-08 17:59:33 -06:00
Matthias Hanel	634ce9f7c8	[Adding] Accountz monitoring endpoint and INFO monitoring req subject Returned imports/exports are formatted like jwt exports imports, even if the originating account is from config. Fixes #1604 Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-09-23 16:22:48 -04:00
Ivan Kozlovic	2ad2bed170	[ADDED] Support for route hostname resolution We previously simply called DialTimeout() on a route's url when soliciting. If it resolved to the IP of the host, it would create a route to self, which server detects, but then would not try again with other IPs that would have allowed to form a cluster with other servers running on the other IPs. This PR keeps track of local IPs + cluster port and excludes them from the list of IPs returned by LookupHost API. This even prevents solicitation of routes to self. Only non-local IPs will be tried. Resolves #1586 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-09-08 13:40:17 -06:00
Phil Pennock	3c680eceb9	Inhibit Go's default TCP keepalive settings for NATS (#1562) Go 1.13 changed the semantics of the tuning parameters for TCP keepalives, including the default value. This affects all TCP listeners. The NATS protocol has its own L7 keepalive system (PING/PONG) and the Go defaults are not a good fit for some valid deployment scenarios, while Go doesn't directly expose a working API for tuning these. Rather than add a configuration knob and pull in another dependency (with portability issues) just disable TCP keepalives for all listeners used for speaking the NATS protocol. Change the tests so we test the same logic. Do not change HTTP monitoring, profiling, or the websocket API listeners. Change KeepAlive on client connections too.	2020-08-14 13:37:59 -04:00
Ivan Kozlovic	c620175353	Rework closeConnection() This change allows the removal of the connection and update of the server state to be done "in place" but still delay the flushing of and close of tcp connection to the writeLoop. With ref counting we ensure that the reconnect happens after the flushing but not before the state has been updated. Had to fix some places where we may have called closeConnection() from under the server lock since it now would deadlock for sure. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-31 15:30:17 -06:00
Ivan Kozlovic	96ccf91566	[FIXED] Possible deadlock with solicited leafnodes when cluster conflict We cannot call c.closeConnection() under the server lock because closeConnection() can invoke server lock in some cases. Created a test that should run without `-race` to reproduce the deadlock (which it does) but sometimes would fail because cluster would not be formed. This uncovered an issue with conflict resolution which test TestRouteClusterNameConflictBetweenStaticAndDynamic() can reproduce easily. The issue was that we were not updating a dynamic name with the remote if the remote was non dynamic. Resolves #1543 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-30 18:45:36 -06:00
Matthias Hanel	3da66ad80d	Remove unnecessary account fetch from remote remove functions Changed: removeReplySub, removeRemoteSubs and processRemoteUnsub Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-07-28 11:00:17 -04:00
Matthias Hanel	946e8415a0	Incorporating review comments	2020-07-27 19:19:43 -04:00
Matthias Hanel	00faefec06	Reduce usage of tmpAccounts to only location where it is needed imports On import handle it with priority as in non recursive situations, it won't be present.	2020-07-27 17:38:39 -04:00
Matthias Hanel	37692d2cf9	[Fixed] Skip fetch when a non config based account resolver is used Resolves #1532 Instead of the fetched account we create a dummy account that is expired. Any client connecting will trigger a fetch of the actual account jwt. This also avoids one fetch, thus the unit test was changed to reflect this. Unlike other resolver the memory resolver does not depend on external systems. It is purely based on server configuration. Therefore, fetch can be done and not finding an account means there is a configuration issue.	2020-07-27 17:36:55 -04:00
Ivan Kozlovic	9b0967a5d1	[FIXED] Handling of gossiped URLs If some servers in the cluster have the same connect URLs (due to the use of client advertise), then it would be possible to have a server sends the connect_urls INFO update to clients with missing URLs. Resolves #1515 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-15 17:39:12 -06:00
Ivan Kozlovic	4d495104de	Fixed no_responders use of sendProtoNow() The call sendProtoNow() should not normally be used (only when setting up a connection when the writeLoop is not yet started and server needs to send something before being able to start the writeLoop). Instead, code should use enqueueProto(). For this particular case though, use queueOutbound() directly and add to the producer's pcd map. Also fixed other places where we were using queueOutbound() + flushSignal() which is what enqueueProto is doing. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-09 17:55:14 -06:00
Ivan Kozlovic	9288283d90	Fixed accept loops that could leave connections opened This was discovered with the test TestLeafNodeWithGatewaysServerRestart that was sometimes failing. Investigation showed that when cluster B was shutdown, one of the servers on A that had a connection from B that just broke tried to reconnect (as part of reconnect retries of implicit gateways) to a server in B that was in the process of shutting down. The connection had been accepted but createGateway not called because the server's running boolean had been set to false as part of the shutdown. However, the connection was not closed so the server on A had a valid connection to a dead server from cluster B. When the B cluster (now single server) was restarted and a LeafNode connection connected to it, then the gateway from B to A was created, that server on A did not create outbound connection to that B server because it already had one (the zombie one). So this PR strengthens the starting of accept loops and also makes sure that if a connection (all types of connections) is not accepted because the server is shutting down, that connection is properly closed. Since all accept loops had almost same code, made a generic function that accept functions to call specific create connection functions. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-06 17:03:19 -06:00
Derek Collison	6c805eebc7	Properly support leafnode clusters. Leafnodes that formed clusters were partially supported. This adds proper support for origin cluster, subscription suppression and data message no echo for the origin cluster. Signed-off-by: Derek Collison <derek@nats.io>	2020-06-26 09:03:22 -07:00
Derek Collison	c7e4d8b194	Avoid data race on cluster name Signed-off-by: Derek Collison <derek@nats.io>	2020-06-18 13:17:50 -07:00
Derek Collison	1e52a1007b	More updates based on feedback Signed-off-by: Derek Collison <derek@nats.io>	2020-06-13 08:00:57 -07:00
Derek Collison	146d8f5dcb	Updates based on feedback, sped up some slow tests Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 17:26:43 -07:00
Derek Collison	dd61535e5a	Cluster names are now required. Added cluster names as required for prep work for clustered JetStream. System can dynamically pick a cluster name and settle on one even in large clusters. Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 15:48:38 -07:00
Derek Collison	4dee03b587	Allow mixed TLS and non-TLS on same port Signed-off-by: Derek Collison <derek@nats.io>	2020-06-05 18:04:11 -07:00
Ivan Kozlovic	dc0f688cbf	[FIXED] LameDuckMode sends INFO to clients Also send an INFO to routes so that the remotes can remove the LDM's server client URLs and notify their own clients of this change. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-21 12:15:20 -06:00
Ivan Kozlovic	9715848a8e	[ADDED] Websocket support Websocket support can be enabled with a new websocket configuration block:
```
websocket {
    # Specify a host and port to listen for websocket connections
    # listen: "host:port"

    # It can also be configured with individual parameters,
    # namely host and port.
    # host: "hostname"
    # port: 4443

    # This will optionally specify what host:port for websocket
    # connections to be advertised in the cluster
    # advertise: "host:port"

    # TLS configuration is required
    tls {
      cert_file: "/path/to/cert.pem"
      key_file: "/path/to/key.pem"
    }

    # If same_origin is true, then the Origin header of the
    # client request must match the request's Host.
    # same_origin: true

    # This list specifies the only accepted values for
    # the client's request Origin header. The scheme,
    # host and port must match. By convention, the
    # absence of port for an http:// scheme will be 80,
    # and for https:// will be 443.
    # allowed_origins [
    #    "http://www.example.com"
    #    "https://www.other-example.com"
    # ]

    # This enables support for compressed websocket frames
    # in the server. For compression to be used, both server
    # and client have to support it.
    # compression: true

    # This is the total time allowed for the server to
    # read the client request and write the response back
    # to the client. This include the time needed for the
    # TLS handshake.
    # handshake_timeout: "2s"
}
```
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-20 11:14:39 -06:00
Derek Collison	019c105ca7	Updates based on feedback, more tests, few bug fixes Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:33:06 -07:00
Derek Collison	f5ceab339a	Server support for headers between routes Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:33:06 -07:00
Derek Collison	ea5e5bd364	Services rewrite #2 This contains a rewrite to the services layer for exporting and importing. The code this merges to already had a first significant rewrite that moved from special interest processing to plain subscriptions. This code changes the prior version's dealing with reverse mapping which was based mostly on thresholds and manual pruning, with some sporadic timer usage. This version uses the jetstream branch's code that understands interest and failed deliveries. So this code is much more tuned to reacting to interest changes. It also removes thresholds and goes only by interest changes or expirations based around a new service export property, response thresholds. This allows a service provider to provide semantics on how long a response should take at a maximum. This commit also introduces formal support for service export streamed and chunked response types send an empty message to signify EOF. This commit also includes additions to the service latency tracking such that errors are now sent, not only successful interactions. We have added a Status field and an optional Error field to ServiceLatency. We support the following Status codes, these are directly from HTTP. 400 Bad Request (request did not have a reply subject) 408 Request Timeout (when system detects request interest went away, old request style to make dependable). 503 Service Unavailable (no service responders running) 504 Service Timeout (The new response threshold expired) Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:26:46 -07:00
Derek Collison	df774e44b0	Rework how service imports are handled to avoid performance hits Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:18:34 -07:00
Derek Collison	8d1f3cc7c2	Allow JetStream consumers to work across multi-server hops Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:16:03 -07:00
Ivan Kozlovic	fef94759ab	[FIXED] Update remote gateway URLs when node goes away in cluster If a node in the cluster goes away, an async INFO is sent to inbound gateway connections so they get a chance to update their list of remote gateway URLs. Same happens when a node is added to the cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-20 13:48:47 -06:00
Matthias Hanel	6a1c3fc29b	Moving inbound tracing to the caller (client.parse) Tracing for outgoing operations is always done while holding the client lock. Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-03-04 17:31:18 -05:00


241 Commits