nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-02 03:38:42 -07:00

Author	SHA1	Message	Date
Derek Collison	74c06ed046	Shutdown cluster on errors Signed-off-by: Derek Collison <derek@nats.io>	2021-01-20 11:58:31 -08:00
Derek Collison	ff54c9dc9c	Reworked snapshot and restore. Underestimated the effort to get stream restore working properly in cluster mode. Some good bug fixes and stability improvements. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-20 11:58:31 -08:00
Derek Collison	a1730f1b31	Report on RAFT group information. This adds in optional reporting to stream and consumer info when running in clustered mode. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-20 11:58:31 -08:00
Ivan Kozlovic	f5df209022	Fixed SIGSEGV when sending update for unknown stream Will now return an error that the stream is unknown. Resolves #1827 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-20 12:42:14 -07:00
Ivan Kozlovic	7d1a4778b8	Merge pull request #1826 from nats-io/fix_consumer_loop_delivery_exit Fix stop of consumer's delivery loop	2021-01-20 10:34:57 -07:00
Ivan Kozlovic	c4a284b58f	Fix stop of consumer's delivery loop I noticed that some consumer go routines were left running at the end of the test suite. It turns out that there was a race in the way the consumer's qch was closed. Since it was closed and then set to nil, it is possible that the go routines that are started and then try to capture o.qch would actually get qch==nil, which then when doing a select on that nil channel would block forever. So we now pass the qch to the 2 go routines loopAndGatherMsgs() and loopAndDeliverMsgs() so that when we close the channel there is no risk of that race happening. I do believe that there is still something that should be looked at: it seems that a consumer's delivery loop can now be started/stopped many times based on leadership acquired/lost. If that is the case, I think that the consumer should wait for previous go routine to complete before trying to start new ones. Also moved 3 JetStream tests to the test/norace_test.go file because they would consume several GB of memory when running with the -race flag. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-19 17:39:32 -07:00
Ivan Kozlovic	42dcdd2eb2	Simplify sendSubsToRoute() Since we were creating subs on the fly, sub.im would always be nil. We were passing a client because it was needed in sendRouteSubOrUnSubProtos(). This PR simply fills the buffer with each account's subscriptions. There is also no need to have subs sent from different go routine based on some threshold. Routes are no longer subject to max pending. Some code has been made into a function so that they can be shared by sendSubsToRoute() and sendRouteSubOrUnSubProtos(). The function is simply adding to given buffer the RS+/- protocol. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-19 14:01:43 -07:00
Derek Collison	78747b2414	Stability improvements around startup and restore. We were incorrectly starting clustering before enabling accounts and restoring state. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-17 13:44:49 -08:00
Derek Collison	a603f439bb	Make all requests same timeout Signed-off-by: Derek Collison <derek@nats.io>	2021-01-16 14:17:35 -08:00
Derek Collison	a18a6803c1	Added support for stream and consumer lists. This utilizes a scatter and gather approach. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-16 12:42:45 -08:00
Derek Collison	cb69df7118	Add proper support for stream update Signed-off-by: Derek Collison <derek@nats.io>	2021-01-16 06:29:37 -08:00
Derek Collison	b606dceb59	Stabilize restart/catchup for raft. Signed-off-by: Derek Collison <derek@nats.io>	2021-01-16 05:47:48 -08:00
Ivan Kozlovic	1874964498	Merge pull request #1812 from nats-io/leafnode_fixes Fixed some leafnode issues introduced from JS cluster work	2021-01-15 18:22:02 -07:00
Ivan Kozlovic	0d78bce9cf	Fixed some leafnode issues introduced from JS cluster work Also fixed a flapper. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-15 12:00:34 -07:00
Ryota	91a1d9a556	Update error message with correct config value	2021-01-15 13:18:31 +00:00
Ivan Kozlovic	6c4229300a	Fixed service import cycle detection that broke with JS clustering Also added some no-op error handler for some tests to silence the error report in the log. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-14 11:27:36 -07:00
Derek Collison	1b0e740123	Fix for race Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 07:28:14 -08:00
Derek Collison	491b3c34cd	Tweak timing more for tests Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 07:07:04 -08:00
Derek Collison	ab2a645791	Fix for various flappers Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 06:54:08 -08:00
Derek Collison	4bfe9d4c24	Fixes to PR. Add nats to default storage directory Fix race in raft, change leader notice Fix test crash on failure Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 05:56:05 -08:00
Derek Collison	37cf7584bd	Merge branch 'master' into jsc	2021-01-14 02:52:35 -07:00
Derek Collison	f0cdf89c61	JetStream Clustering WIP Signed-off-by: Derek Collison <derek@nats.io>	2021-01-14 01:14:52 -08:00
Ivan Kozlovic	7b116379cb	Propose going back to condition variable to notify writeLoop This is how it was up to v2.1.2 included (changed in v2.1.4 onward). I added a benchmark that has 3 subscribers running and increase the number of publishers: 1, 2, 5 and 10. This is the comparison between the pre-PR and post-PR: ``` benchcmp old.txt new.txt benchmark old ns/op new ns/op delta Benchmark___BumpPubCount_1x3-16 396 385 -2.78% Benchmark___BumpPubCount_2x3-16 495 406 -17.98% Benchmark___BumpPubCount_5x3-16 542 395 -27.12% Benchmark__BumpPubCount_10x3-16 549 515 -6.19% benchmark old MB/s new MB/s speedup Benchmark___BumpPubCount_1x3-16 717.27 737.54 1.03x Benchmark___BumpPubCount_2x3-16 574.31 699.02 1.22x Benchmark___BumpPubCount_5x3-16 524.35 718.80 1.37x Benchmark__BumpPubCount_10x3-16 517.26 551.53 1.07x ``` It is in line with what the user reported, seeing a 20% drop in performance when going from 1 publisher to 2. But, as we can see, the difference between go channel and cond variable reduces with the increased number of publishers after a certain number. I am not sure of the performance impact on other situations, so this PR is more of a proposal than a fix. Resolves #1786 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2021-01-12 12:24:37 -07:00
Ivan Kozlovic	14aecb2202	Fixed headers support for inbound leafnode connection The server that solicits a LeafNode connection does not send an INFO, so the accepting side had no way to know if the remote supports headers or not. The solicit side will now send the headers support capability in the CONNECT protocol so that the receiving side can mark the inbound connection with headers support based on that and its own support for headers. Resolves #1781 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-21 11:53:24 -07:00
Alberto Ricart	f09992a889	updated iteration of signing keys (previously a list, now a map). (#1779)	2020-12-17 13:59:18 -07:00
Derek Collison	eb403ed4d0	Merge pull request #1773 from nats-io/cycle_wc_bug Catch condition where a serviceImport response matched the original import.	2020-12-14 08:20:55 -08:00
Derek Collison	ced28eca93	Fix flapper Signed-off-by: Derek Collison <derek@nats.io>	2020-12-13 10:29:34 -08:00
Derek Collison	a3f7e97f9a	Catch condition where a serviceImport response matched the original import subject. Signed-off-by: Derek Collison <derek@nats.io>	2020-12-13 10:17:29 -08:00
Ivan Kozlovic	d5f255b98e	Merge pull request #1771 from nats-io/gw_ln_tls_config_reload [FIXED] Config reload for gateways/leaf remote TLS configurations	2020-12-12 10:51:52 -07:00
Ivan Kozlovic	9f345ac420	Reduce risk of failure for TestJetStreamConsumerMaxDeliveryAndServerRestart Just increased the AckWait from 20ms to 100ms and reduced max deliveries from 4 to 3. I believe that there is still the risk that the message is redelivered while the server is being shutdown and that message is not making it to the sub. But using those new values (100ms/3), I have run 200 rounds on a Linux VM and did not get the failure (but did before the change). Again, this is not a proper test fix, but may help. This test has been failing 11 times already (keeping track in spreadsheet) and causes several minutes of tests to have to be recycled. Note that the test ran in about 0.4s and now 0.7s, so not that much and would be worth the added delay if it helps avoid breaking the whole test suite! Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-11 19:49:58 -07:00
Ivan Kozlovic	399ff89817	Fixed debug num subs tests Subject interest propagation delays could cause some of the system service tests to fail. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-11 19:27:23 -07:00
Ivan Kozlovic	ce5f9d6683	Fixed some flappers Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-11 17:30:33 -07:00
Ivan Kozlovic	fc1521636c	[FIXED] Config reload for gateways/leaf remote TLS configurations Presence of TLS config in any remote gateway or leafnode would cause the config reload to fail (because TLS config internal content may change which fails the DeepEqual check). This PR excludes the TLS configs in such case to check for changes in gateways and leafnodes. Although GW and LN config reload is technically supported, this PR updates the internal remotes' TLS configuration so that changes/updates to TLS certificates would take effect after a configuration reload. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2020-12-11 16:56:25 -07:00
Derek Collison	a97e84d8b9	Merge pull request #1760 from nats-io/jsbug [FIXES] https://github.com/nats-io/jetstream/issues/396	2020-12-02 16:29:39 -08:00
Derek Collison	0f7d18d6e8	Fixes https://github.com/nats-io/jetstream/issues/396 Had a deadlock with new preconditions. We need to hold lock across Store() call but that call could call into storeUpdate() such that we may need to acquire the lock. We can enter this callback from the storage layer itself and the lock would not be held so added an atomic. Signed-off-by: Derek Collison <derek@nats.io>	2020-12-02 16:18:00 -08:00
Derek Collison	cddf23c200	Limit search depth for account cycles for imports Signed-off-by: Derek Collison <derek@nats.io>	2020-12-02 11:44:27 -08:00
Derek Collison	9b107c0f4b	Merge pull request #1759 from nats-io/acc_cycles Better implementation to detect various cycles from account imports/exports.	2020-12-02 10:02:24 -08:00
Waldemar Quevedo	a9a6bdc04f	Merge pull request #1732 from nats-io/rdn-ordering Match DNs regardless of order when using TLS auth	2020-12-02 09:25:36 -08:00
Derek Collison	705cc0f5ea	Better impl for detecting cycles between accounts Signed-off-by: Derek Collison <derek@nats.io>	2020-12-02 08:56:19 -08:00
Derek Collison	bfb726e8e9	Make sure to clear JS resources on reload Signed-off-by: Derek Collison <derek@nats.io>	2020-11-30 17:18:33 -08:00
Derek Collison	4e6d600ecc	Also make sure account works after reload Signed-off-by: Derek Collison <derek@nats.io>	2020-11-30 16:18:36 -08:00
Derek Collison	7e27042e6e	Fix for #1736 When a system account was configured and not the default when we did a reload we would lose the JetStream service exports. Signed-off-by: Derek Collison <derek@nats.io>	2020-11-30 16:11:50 -08:00
Derek Collison	4532447908	Remove limitation on ackall for filtered consumers Signed-off-by: Derek Collison <derek@nats.io>	2020-11-28 07:18:17 -08:00
R.I.Pienaar	5e5b2e4dfd	ensure the stream originating a pub error is reported Signed-off-by: R.I.Pienaar <rip@devco.net>	2020-11-27 12:24:41 +01:00
Derek Collison	954f5a9093	Flattened filters for stream names API Signed-off-by: Derek Collison <derek@nats.io>	2020-11-25 07:46:56 -08:00
Derek Collison	44a1373f89	JetStream changes. Made several changes based on feedback. 1. Made PubAckResponse only optionally include an ApiError and not force an API type. 2. Allow FilterSubject to be set on a consumer config and cleared if it matches the only stream subject. 3. Remove LookupStream by subject, and add in filters for stream names API. Signed-off-by: Derek Collison <derek@nats.io>	2020-11-25 06:50:25 -08:00
Matthias Hanel	f8872c8307	Added more straightforward loop detection fail Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-11-23 23:44:30 -05:00
Matthias Hanel	66fff6259a	Adding test that fails where there is no cycle but sometimes passes Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-11-23 17:11:51 -05:00
Matthias Hanel	f467f32f4a	We prevent cycles between services but not streams Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-11-23 16:19:41 -05:00
Matthias Hanel	352f6b3b45	Imported services can be renamed, this eludes cycle detection Signed-off-by: Matthias Hanel <mh@synadia.com>	2020-11-23 15:49:49 -05:00


908 Commits