nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-02 03:38:42 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	0f53bf6580	Fixed data race with nodeInfo Took the approach of storing struct instead of pointer. Of course, when changing the offline bool from false to true, it means that we need to call Store again (with same key). This is based on the assumption that those Load/Store are not too frequent. Otherwise, we may need to use locking (and keep *nodeInfo) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-03-03 13:28:45 -07:00
Matthias Hanel	4f2db7d187	Fixed linter issues Signed-off-by: Matthias Hanel <mh@synadia.com>	2021-03-02 20:21:44 -05:00
Derek Collison	1c79d96de8	user single node info struct Signed-off-by: Derek Collison <derek@nats.io>	2021-02-06 20:10:29 -08:00
Derek Collison	a1e0f7dc1a	First pass at supercluster enablement. This allows metacontrollers to span superclusters. Also includes placement directives for streams. By default they select the request origin cluster. Signed-off-by: Derek Collison <derek@nats.io>	2021-02-03 17:28:13 -08:00
Ivan Kozlovic	2b8c6e0124	Support for Websocket Leafnode connections Added two options in the remote leaf node configuration - compress, for websocket only at the moment - ws_masking, to force remote leafnode connections to mask websocket frames (default is no masking since it is communication between server to server) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Ivan Kozlovic	131be1cb33	Make TLS client/server handshake helpers function This reduces code duplication Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-28 13:13:11 -07:00
Derek Collison	a1730f1b31	Report on RAFT group information. This adds in optional reporting to stream and consumer info when running in clustered mode. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-20 11:58:31 -08:00
Ivan Kozlovic	42dcdd2eb2	Simplify sendSubsToRoute() Since we were creating subs on the fly, sub.im would always be nil. We were passing a client because it was needed in sendRouteSubOrUnSubProtos(). This PR simply fills the buffer with each account's subscriptions. There is also no need to have subs sent from different go routine based on some threshold. Routes are no longer subject to max pending. Some code has been made into a function so that they can be shared by sendSubsToRoute() and sendRouteSubOrUnSubProtos(). The function is simply adding to given buffer the RS+/- protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-19 14:01:43 -07:00
Ivan Kozlovic	ef38abe75b	Fixed gateway reply mapping following changes in JetStream clustering Those changes are required to maintain backward compatibility. Since the replies are "_G_.<gateway name hash>.<server ID hash>" and the hash were 6 characters long, changing to 8 the hash function would break things. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-15 17:32:04 -07:00
Derek Collison	f0cdf89c61	JetStream Clustering WIP Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 01:14:52 -08:00
Ivan Kozlovic	67425d23c8	Add c.isMqtt() and c.isWebsocket() This hides the check on "c.mqtt != nil" or "c.ws != nil". Added some tests. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-02 15:52:06 -07:00
Ivan Kozlovic	77aead807c	Send LS- without origin to route When cluster origin code was added, a server may send LS+ with an origin cluster name in the protocol. Parsing code from a ROUTER connection was adjusted to understand this LS+ protocol. However, the server was also sending an LS- with origin but the parsing code was not able to understand that. When the unsub was for a queue subscription, this would cause the parser to error out and close the route connection. This PR sends an LS- without the origin in this case (so that tracing makes sense in term of LS+/LS- sent to a route). The receiving side then traces appropriate LS- but processes as a normal RS-. Resolves #1751 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-11-30 13:31:32 -07:00
Ivan Kozlovic	13df1a55fd	Changed warning message Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-09 09:36:30 -06:00
Ivan Kozlovic	df9d5f5fd9	Accepting route warns if remote server has same name Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-10-08 17:59:33 -06:00
Matthias Hanel	634ce9f7c8	[Adding] Accountz monitoring endpoint and INFO monitoring req subject Returned imports/exports are formatted like jwt exports imports, even if the originating account is from config. Fixes #1604 Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-09-23 16:22:48 -04:00
Ivan Kozlovic	2ad2bed170	[ADDED] Support for route hostname resolution We previously simply called DialTimeout() on a route's url when soliciting. If it resolved to the IP of the host, it would create a route to self, which server detects, but then would not try again with other IPs that would have allowed it to form a cluster with other servers running on the other IPs. This PR keeps track of local IPs + cluster port and excludes them from the list of IPs returned by LookupHost API. This even prevents solicitation of routes to self. Only non-local IPs will be tried. Resolves #1586 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-09-08 13:40:17 -06:00
Phil Pennock	3c680eceb9	Inhibit Go's default TCP keepalive settings for NATS (#1562) Go 1.13 changed the semantics of the tuning parameters for TCP keepalives, including the default value. This affects all TCP listeners. The NATS protocol has its own L7 keepalive system (PING/PONG) and the Go defaults are not a good fit for some valid deployment scenarios, while Go doesn't directly expose a working API for tuning these. Rather than add a configuration knob and pull in another dependency (with portability issues) just disable TCP keepalives for all listeners used for speaking the NATS protocol. Change the tests so we test the same logic. Do not change HTTP monitoring, profiling, or the websocket API listeners. Change KeepAlive on client connections too.	2020-08-14 13:37:59 -04:00
Ivan Kozlovic	c620175353	Rework closeConnection() This change allows the removal of the connection and update of the server state to be done "in place" but still delay the flushing of and close of tcp connection to the writeLoop. With ref counting we ensure that the reconnect happens after the flushing but not before the state has been updated. Had to fix some places where we may have called closeConnection() from under the server lock since it now would deadlock for sure. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-31 15:30:17 -06:00
Ivan Kozlovic	96ccf91566	[FIXED] Possible deadlock with solicited leafnodes when cluster conflict We cannot call c.closeConnection() under the server lock because closeConnection() can invoke server lock in some cases. Created a test that should run without `-race` to reproduce the deadlock (which it does) but sometimes would fail because cluster would not be formed. This uncovered an issue with conflict resolution which test TestRouteClusterNameConflictBetweenStaticAndDynamic() can reproduce easily. The issue was that we were not updating a dynamic name with the remote if the remote was non dynamic. Resolves #1543 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-30 18:45:36 -06:00
Matthias Hanel	3da66ad80d	Remove unnecessary account fetch from remote remove functions Changed: removeReplySub, removeRemoteSubs and processRemoteUnsub Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-07-28 11:00:17 -04:00
Matthias Hanel	946e8415a0	Incorporating review comments	2020-07-27 19:19:43 -04:00
Matthias Hanel	00faefec06	Reduce usage of tmpAccounts to only location where it is needed imports On import handle it with priority as in non recursive situations, it won't be present.	2020-07-27 17:38:39 -04:00
Matthias Hanel	37692d2cf9	[Fixed] Skip fetch when a non config based account resolver is used Resolves #1532 Instead of the fetched account we create a dummy account that is expired. Any client connecting will trigger a fetch of the actual account jwt. This also avoids one fetch, thus the unit test was changed to reflect this. Unlike other resolver the memory resolver does not depend on external systems. It is purely based on server configuration. Therefore, fetch can be done and not finding an account means there is a configuration issue.	2020-07-27 17:36:55 -04:00
Ivan Kozlovic	9b0967a5d1	[FIXED] Handling of gossiped URLs If some servers in the cluster have the same connect URLs (due to the use of client advertise), then it would be possible to have a server sends the connect_urls INFO update to clients with missing URLs. Resolves #1515 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-15 17:39:12 -06:00
Ivan Kozlovic	4d495104de	Fixed no_responders use of sendProtoNow() The call sendProtoNow() should not normally be used (only when setting up a connection when the writeLoop is not yet started and server needs to send something before being able to start the writeLoop). Instead, code should use enqueueProto(). For this particular case though, use queueOutbound() directly and add to the producer's pcd map. Also fixed other places where we were using queueOutbound() + flushSignal() which is what enqueueProto is doing. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-09 17:55:14 -06:00
Ivan Kozlovic	9288283d90	Fixed accept loops that could leave connections opened This was discovered with the test TestLeafNodeWithGatewaysServerRestart that was sometimes failing. Investigation showed that when cluster B was shutdown, one of the servers on A that had a connection from B that just broke tried to reconnect (as part of reconnect retries of implicit gateways) to a server in B that was in the process of shutting down. The connection had been accepted but createGateway not called because the server's running boolean had been set to false as part of the shutdown. However, the connection was not closed so the server on A had a valid connection to a dead server from cluster B. When the B cluster (now single server) was restarted and a LeafNode connection connected to it, then the gateway from B to A was created, that server on A did not create outbound connection to that B server because it already had one (the zombie one). So this PR strengthens the starting of accept loops and also makes sure that if a connection (all types of connections) is not accepted because the server is shutting down, that connection is properly closed. Since all accept loops had almost same code, made a generic function that accept functions to call specific create connection functions. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-07-06 17:03:19 -06:00
Derek Collison	6c805eebc7	Properly support leafnode clusters. Leafnodes that formed clusters were partially supported. This adds proper support for origin cluster, subscription suppression and data message no echo for the origin cluster. Signed-off-by: Derek Collison <derek@nats.io>	2020-06-26 09:03:22 -07:00
Derek Collison	c7e4d8b194	Avoid data race on cluster name Signed-off-by: Derek Collison <derek@nats.io>	2020-06-18 13:17:50 -07:00
Derek Collison	1e52a1007b	More updates based on feedback Signed-off-by: Derek Collison <derek@nats.io>	2020-06-13 08:00:57 -07:00
Derek Collison	146d8f5dcb	Updates based on feedback, sped up some slow tests Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 17:26:43 -07:00
Derek Collison	dd61535e5a	Cluster names are now required. Added cluster names as required for prep work for clustered JetStream. System can dynamically pick a cluster name and settle on one even in large clusters. Signed-off-by: Derek Collison <derek@nats.io>	2020-06-12 15:48:38 -07:00
Derek Collison	4dee03b587	Allow mixed TLS and non-TLS on same port Signed-off-by: Derek Collison <derek@nats.io>	2020-06-05 18:04:11 -07:00
Ivan Kozlovic	dc0f688cbf	[FIXED] LameDuckMode sends INFO to clients Also send an INFO to routes so that the remotes can remove the LDM's server client URLs and notify their own clients of this change. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-21 12:15:20 -06:00
Ivan Kozlovic	9715848a8e	[ADDED] Websocket support Websocket support can be enabled with a new websocket configuration block:
```
websocket {
    # Specify a host and port to listen for websocket connections
    # listen: "host:port"

    # It can also be configured with individual parameters,
    # namely host and port.
    # host: "hostname"
    # port: 4443

    # This will optionally specify what host:port for websocket
    # connections to be advertised in the cluster
    # advertise: "host:port"

    # TLS configuration is required
    tls {
      cert_file: "/path/to/cert.pem"
      key_file: "/path/to/key.pem"
    }

    # If same_origin is true, then the Origin header of the
    # client request must match the request's Host.
    # same_origin: true

    # This list specifies the only accepted values for
    # the client's request Origin header. The scheme,
    # host and port must match. By convention, the
    # absence of port for an http:// scheme will be 80,
    # and for https:// will be 443.
    # allowed_origins [
    #    "http://www.example.com"
    #    "https://www.other-example.com"
    # ]

    # This enables support for compressed websocket frames
    # in the server. For compression to be used, both server
    # and client have to support it.
    # compression: true

    # This is the total time allowed for the server to
    # read the client request and write the response back
    # to the client. This include the time needed for the
    # TLS handshake.
    # handshake_timeout: "2s"
}
```
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-05-20 11:14:39 -06:00
Derek Collison	019c105ca7	Updates based on feedback, more tests, few bug fixes Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:33:06 -07:00
Derek Collison	f5ceab339a	Server support for headers between routes Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:33:06 -07:00
Derek Collison	ea5e5bd364	Services rewrite #2 This contains a rewrite to the services layer for exporting and importing. The code this merges to already had a first significant rewrite that moved from special interest processing to plain subscriptions. This code changes the prior version's dealing with reverse mapping which was based mostly on thresholds and manual pruning, with some sporadic timer usage. This version uses the jetstream branch's code that understands interest and failed deliveries. So this code is much more tuned to reacting to interest changes. It also removes thresholds and goes only by interest changes or expirations based around a new service export property, response thresholds. This allows a service provider to provide semantics on how long a response should take at a maximum. This commit also introduces formal support for service export streamed and chunked response types send an empty message to signify EOF. This commit also includes additions to the service latency tracking such that errors are now sent, not only successful interactions. We have added a Status field and an optional Error fields to ServiceLatency. We support the following Status codes, these are directly from HTTP. 400 Bad Request (request did not have a reply subject) 408 Request Timeout (when system detects request interest went away, old request style to make dependable).. 503 Service Unavailable (no service responders running) 504 Service Timeout (The new response threshold expired) Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:26:46 -07:00
Derek Collison	df774e44b0	Rework how service imports are handled to avoid performance hits Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:18:34 -07:00
Derek Collison	8d1f3cc7c2	Allow JetStream consumers to work across multi-server hops Signed-off-by: Derek Collison <derek@nats.io>	2020-05-19 14:16:03 -07:00
Ivan Kozlovic	fef94759ab	[FIXED] Update remote gateway URLs when node goes away in cluster If a node in the cluster goes away, an async INFO is sent to inbound gateway connections so they get a chance to update their list of remote gateway URLs. Same happens when a node is added to the cluster. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-04-20 13:48:47 -06:00
Matthias Hanel	6a1c3fc29b	Moving inbound tracing to the caller (client.parse) Tracing for outgoing operations is always done while holding the client lock. Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-03-04 17:31:18 -05:00
Matthias Hanel	fe373ac597	Incorporating comments. c -> client defer in oneliner argument order Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-03-04 15:48:19 -05:00
Matthias Hanel	f5bd07b36c	[FIXED] trace/debug/sys_log reload will affect existing clients Fixed #1296, by altering client state on reload Detect a trace level change on reload and update all clients. To avoid data races, read client.trace while holding the lock, pass the value into functions that trace while not holding the lock. Delete unused client.debug. Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-03-04 13:54:15 -05:00
Ivan Kozlovic	c73be88ac0	Updated based on comments Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-01-06 16:57:48 -07:00
Ivan Kozlovic	947798231b	[UPDATED] TCP Write and SlowConsumer handling - All writes will now be done by the writeLoop, unless when the writeLoop has not been started yet (likely in connection init). - Slow consumers for non CLIENT connections will be reported but not failed. The idea is that routes, gateway, etc.. connections should stay connected as much as possible. However if a flush operation times out and no data at all has been written, the connection will be closed (regardless of type). - Slow consumers due to max pending is only for CLIENT connections. This allows sending of SUBs through routes, etc.. to not have to be chunked. - The backpressure to CLIENT connections is increased (up to 1sec) based on the sub's connection pending bytes level. - Connection is flushed on close from the writeLoop as to not block the "fast path". Some tests have been fixed and adapted since now closeConnection() is not flushing/closing/removing connection in place. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-31 15:06:27 -07:00
Ivan Kozlovic	a22da91647	[FIXED] Closing of Gateway or Route TLS connection may hang This could happen if the remote server is running but not dequeueing from the socket. TLS connection Close() may send/read and so we need to protect with a deadline. For non client/leaf connection, do not call flushOutbound(). Set the write deadline regardless of handshakeComplete flag, and set it to a low value. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-12-04 17:27:00 -07:00
Derek Collison	6ad8287bbe	Introduced wildcard handling of _R_ mapped replies. We had too much special processing, so reduced to a single wildcard which will propagate across routes and gateways and is consistent with gateway handling of globally routed subjects and timeouts. Signed-off-by: Derek Collison <derek@nats.io>	2019-11-16 12:50:53 -08:00
Ivan Kozlovic	d85f9a9388	Fixed bug with duplicate route and GW replies When a duplicate route is detected and closed, we need to clear the route's hash in order to prevent the removal from the server's routeByHash map. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-15 17:24:50 -07:00
Ivan Kozlovic	aa843945c9	Work on Gateways reply mapping - New prefix that includes origin server for the request - Mapping done if request is service import or requestor has recent subscription - Subscription considered recent if less than 250ms - Destination server strip GW prefix before giving to client and restore when getting a reply on that subject - Mapping removed after 250ms - Server rejects client publish on "$GNR." (the new prefix) - Cluster and server hash are now 8 chars long and from base 62 alphabets - Mapped replies need to be sent to leafnode servers due to race (cluster B sends RS+ on GW inbound then RMSG on outbound, the RS+ may be processed later and cluster A may have given message to LN before RS+ on reply subject. So LN needs to accept the mapped reply but will strip to give to client and reassemble before sending it back) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2019-11-06 16:06:49 -07:00
Ivan Kozlovic	d20f76cbaa	Merge pull request #1166 from nats-io/add_servername_to_routestat [ADDED] Server name in the RouteStat for statsz	2019-10-28 13:19:53 -06:00

232 Commits