nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-17 03:24:40 -07:00

Author	SHA1	Message	Date
Derek Collison	6a2063f5b3	Revert logic Signed-off-by: Derek Collison <derek@nats.io>	2023-02-06 22:14:37 +04:00
Derek Collison	e9a983c802	Do not let !NeedSnapshot() avoid snapshots and compaction. Signed-off-by: Derek Collison <derek@nats.io>	2023-02-01 22:05:25 -07:00
Derek Collison	e0798d26eb	Merge pull request #3831 from nats-io/snapshots Minor fixes and optimizations for snapshots.	2023-01-30 19:53:22 -08:00
Derek Collison	6058056e3b	Minor fixes and optimizations for snapshots. We were snapshotting more than needed, so double check that we should be doing this at the stream and consumer level. At the raft level, we should have always been compacting the WAL to last+1, so made that consistent. Also fixed bug that would not skip last if more items behind the snapshot. Signed-off-by: Derek Collison <derek@nats.io>	2023-01-30 17:54:18 -08:00
Waldemar Quevedo	13372508e2	Fix for isGroupLeaderless when JS not available (due to shutdown) Signed-off-by: Waldemar Quevedo <wally@nats.io>	2023-01-30 15:29:42 -08:00
Derek Collison	52a78c0352	Small optimizations. 1. Only snapshot with minSnap time window like consumers and meta. Make it consistent for all to 5s. 2. Only snapshot at the end of processing all entries pending vs inside the loop. 3. Use fast state when calculating sync request, do not need deleted details there. Signed-off-by: Derek Collison <derek@nats.io>	2023-01-29 10:58:00 -08:00
Neil Twigg	83932b4be6	Don't mark a clustered stream as unhealthy if making forward progress, add `TestJetStreamClusterCurrentVsHealth`	2023-01-26 16:57:34 +00:00
Derek Collison	461aad17a5	Merge pull request #3820 from nats-io/issue-3791 [FIXED] Select consumer peer(s) from active peers only.	2023-01-26 08:27:11 -08:00
Derek Collison	e15eb22ca6	When we create a consumer with fewer replicas than the stream, make sure to select from online peers. Signed-off-by: Derek Collison <derek@nats.io>	2023-01-25 20:08:04 -08:00
Derek Collison	a5cbd0b029	Fixed a bug that would not properly process updates on a stream on restart. During restart if the stream existed but was also in a meta-snapshot delivered by the leader we would not process the update properly. Signed-off-by: Derek Collison <derek@nats.io>	2023-01-25 18:16:33 -08:00
Neil Twigg	1baa1fbda8	Use highwayhash for last stream, consumer and cluster snapshots	2023-01-12 16:16:14 +00:00
Derek Collison	6c5f0a669d	Ensure we add in new consumers from a meta snapshot from the leader. Signed-off-by: Derek Collison <derek@nats.io>	2023-01-04 22:18:31 -08:00
Neil Twigg	14d0ba1c65	Fix some lint errors after move to `golangci-lint`	2022-12-30 20:00:08 +00:00
Todd Beets	c463b398db	Validate no overlapping stream subscriptions on update config (non-clustered jetstream)	2022-12-16 12:58:59 -08:00
Derek Collison	5f9a69e4f9	Make sure js is non-nil. Signed-off-by: Derek Collison <derek@nats.io>	2022-12-13 16:37:00 -08:00
Derek Collison	fa67c50bec	Make sure we clear the old raft node from our stream assignment. This would not allow a re-assignment of a peer to work correctly. Signed-off-by: Derek Collison <derek@nats.io>	2022-12-12 12:51:08 -05:00
Derek Collison	2f27438230	Make stream removal from a server consistent. Signed-off-by: Derek Collison <derek@nats.io>	2022-12-06 17:11:43 -08:00
Todd Beets	3fdfb8a12f	Merge branch 'main' into ut-replacepeer # Conflicts: # server/jetstream_cluster_3_test.go	2022-12-06 10:51:22 -08:00
Todd Beets	ef27d4d534	tag policies not honored in reassignment after peer remove	2022-12-04 20:39:11 -08:00
Derek Collison	5f7c8e21a2	Fixed issues with multiple concurrent stream create requests. First issue was applications not getting any response. However, there was also a more serious issue that would create multiple raft groups for each concurrent request. The servers would only run one stream monitor loop, however they would update the state to the new raft group's name, so on server restart the stream would be using a different raft group than existing servers. Signed-off-by: Derek Collison <derek@nats.io>	2022-12-04 19:13:51 -08:00
Derek Collison	36ef788112	When determining whether we need an ack, no need to copy since under consumer lock. Signed-off-by: Derek Collison <derek@nats.io>	2022-11-14 11:47:31 -08:00
Ivan Kozlovic	304744ce08	Merge pull request #3615 from nats-io/js_acc_max_streams_consumers [FIXED] JetStream: Account max streams/consumers not always honoured	2022-11-09 18:02:51 -07:00
Ivan Kozlovic	1b892837cb	[FIXED] JetStream: Account max streams/consumers not always honoured This could happen during concurrent requests where the assignments are not yet fully processed. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-09 17:29:20 -07:00
Derek Collison	e008e015b3	Make sure to enforce HA asset limits during peer processing as well as assignment. Signed-off-by: Derek Collison <derek@nats.io>	2022-11-09 16:24:54 -08:00
Ivan Kozlovic	ca237bdfa0	[FIXED] JetStream: Stream scale down while it has no quorum If an R2 stream had one of its servers network-partitioned and at that time the stream was edited to be scaled down to an R1, it would cause the stream to no longer have quorum even when the network partition is resolved. Signed-off-by: Derek Collison <derek@nats.io> Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-04 09:08:31 -06:00
Derek Collison	56919ebc97	On stream proposal failures we could accidentally warn on high stream lag. We were not taking the clfs into account. Signed-off-by: Derek Collison <derek@nats.io>	2022-11-02 14:40:31 -07:00
Ivan Kozlovic	ab4470ccdc	[FIXED] JetStream: possible panic on some rare cases Very difficult to reproduce. Had to run TestJetStreamSuperClusterMoveCancel in covermode=atomic on a slow machine to hit the condition where the monitorConsumer go routine is started but the RAFT node is nil, which caused the warning message to produce the panic (since n is nil). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-02 10:02:09 -06:00
Ivan Kozlovic	55e651c118	[FIXED] JetStream: processing of snapshot with expired messages The issue was that a "first sequence mismatch" during processing of a snapshot was causing the state to be reset and caused a lot of catchup from the follower. An attempt to fix that in PR #3567 caused an issue that was addressed in PR #3589. However, this was then causing the follower to sometimes never be able to catch up, or to take a very long time. This PR - we believe - addresses the original and subsequent issues. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-01 12:58:45 -06:00
Derek Collison	121bf6ebb5	Move to past check for nil Signed-off-by: Derek Collison <derek@nats.io>	2022-10-27 17:30:07 -07:00
Derek Collison	2241ad089e	Make local error since non-fatal for now. Signed-off-by: Derek Collison <derek@nats.io>	2022-10-25 16:56:10 -07:00
Derek Collison	aa52c2fecf	Added warning for high message lag into a clustered stream. Signed-off-by: Derek Collison <derek@nats.io>	2022-10-25 16:11:35 -07:00
Derek Collison	db13766f18	Merge pull request #3576 from nats-io/signal-pull-consumers Removed ephemeral consumer migration.	2022-10-25 17:35:35 -05:00
Derek Collison	f0afa49b9f	Make sure to stop raft nodes on all monitor exits. Signed-off-by: Derek Collison <derek@nats.io>	2022-10-25 14:48:28 -07:00
Derek Collison	ff2cd1d7f9	Fixed test and bug that would override consumer replicas. Signed-off-by: Derek Collison <derek@nats.io>	2022-10-25 14:35:20 -07:00
Ivan Kozlovic	7ca85e0e80	[FIXED] JetStream: Update of an R1 consumer would not get a response The update was accepted but the server would not respond to the client/CLI. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-10-25 09:04:35 -06:00
Ivan Kozlovic	f8aa3ac11d	[FIXED] JetStream: "first sequence mismatch" error on catchup with message expiration When a server was restarted and expired messages, but the leader had a snapshot that still had the old messages, we would reset the complete follower stream state. This fix just skips over the expired messages as we prepare the request to the leader. Resolves #3516 Signed-off-by: Derek Collison <derek@nats.io> Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-10-17 17:02:08 -06:00
Ivan Kozlovic	9bd11580e3	[FIXED] JetStream: User-defined ephemeral Name not used in cluster mode If the user sends a CONSUMER.CREATE request with a configuration that specifies the name that the user wants for the ephemeral consumer, this would not work in cluster mode, that is, the server would still pick a name instead of using the provided one. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-10-10 13:48:38 -06:00
Ivan Kozlovic	3472f6aec2	[FIXED] JetStream: unresponsiveness while creating raft group Originally, createRaftGroup() would not hold the jetstream's lock for the whole duration. But some race reports made us change this function to keep the lock for the whole duration. A test called TestJetStreamClusterRaceOnRAFTCreate() was demonstrating the race between "consumer info" request handling and createRaftGroup code. Since then, the race has been fixed, so this PR restores the more fine-grained locking inside createRaftGroup. Resolves #3516 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-10-04 17:27:36 -06:00
Derek Collison	52b5cd12bb	Allow meta layer to snapshot on a clean shutdown. Signed-off-by: Derek Collison <derek@nats.io>	2022-09-29 09:17:12 -06:00
Ivan Kozlovic	e151cfcd57	[FIXED] JetStream: Scale down of consumer to R1 would not get a response Updating a consumer configuration from say R3 to R1 would work but no response was received by the client sending the request. Resolves #3493 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-27 10:02:31 -06:00
Ivan Kozlovic	170ff49837	[ADDED] JetStream: peer (the hash of server name) in statsz/jsz A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something like this: ``` ... "meta_cluster": { "name": "local", "leader": "A", "peer": "NUmM6cRx", "replicas": [ { "name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w" }, { "name": "Server name unknown at this time (peerID: jZ6RvVRH)", "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH" } ], "cluster_size": 3 } ``` Note the "peer" field following the "leader" field that contains the server name. The new field is the node ID, which is a hash of the server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-16 15:31:37 -06:00
Ivan Kozlovic	378fed164d	[FIXED] JetStream: possible panic on peer remove on server shutdown This was discovered by new test TestJetStreamClusterRemovePeerByID. I saw this on Travis and repeating the test locally with -count=10 I was able to reproduce. The issue is cc.meta being nil but accessing cc.meta.ID() directly. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-16 15:06:58 -06:00
Ivan Kozlovic	f113163b9f	Change ByID boolean to Peer string and add Peer id in replicas output The CLI will now be able to display the peer IDs in MetaGroupInfo if it chooses to do so, and possibly help user select the peer ID from a list with a new command to remove by peer ID instead of by server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-15 10:39:23 -06:00
Deepak	e9ce118c56	Fix peer randomisation when creating consumer groups for replica=1 Signed-off-by: Deepak <sah.sslpu@gmail.com>	2022-09-14 13:58:49 +05:30
Matthias Hanel	f7cb5b1f0d	changed format of JSClusterNoPeers error (#3459) * changed format of JSClusterNoPeers error This error was introduced in #3342 and reveals too much information. This change gets rid of cluster names and peer counts. All other counts were changed to booleans, which are only included in the output when the filter was hit. In addition, the set of not matching tags is included. Furthermore, the static error description in server/errors.json is moved into selectPeerError. Sample errors: 1) no suitable peers for placement, tags not matched ['cloud:GCP', 'country:US'] 2) no suitable peers for placement, insufficient storage Signed-off-by: Matthias Hanel <mh@synadia.com> Signed-off-by: Ivan Kozlovic <ivan@synadia.com> Co-authored-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-08 18:25:48 -07:00
Derek Collison	c3203a3bb5	Use lostQuorum default versus live for reporting. Signed-off-by: Derek Collison <derek@nats.io>	2022-09-07 13:56:38 -07:00
Derek Collison	b86e941ce4	tweak lost quorum reporting Signed-off-by: Derek Collison <derek@nats.io>	2022-09-07 10:57:01 -07:00
Derek Collison	fbf2233e4a	Only complain about leaderless group with previous leader if we know jetstream has been running for some threshold. Signed-off-by: Derek Collison <derek@nats.io>	2022-09-07 08:47:55 -07:00
Ivan Kozlovic	5573933034	Bump back the defaultMaxTotalCatchupOutBytes to 128MB Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-31 09:19:28 -06:00
Derek Collison	98bf861a7a	Updates to stream and consumer move logic. Signed-off-by: Derek Collison <derek@nats.io>	2022-08-30 16:11:35 -07:00

389 Commits