nats-server

mirror of https://github.com/gogrlx/nats-server.git synced 2026-04-14 02:07:59 -07:00

Author	SHA1	Message	Date
Ivan Kozlovic	6ffa6d1e4b	[FIXED] JetStream: possible panic on stream info when leader not elected It is possible that a stream info request would be handled at a time where the raft group would not yet be set/created, causing a panic. Resolves #3626 (at least the panic reports there) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-11-15 11:56:41 -07:00
Ivan Kozlovic	3358247e6b	Added warning if internal sub callback takes too long Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-10-10 14:39:37 -06:00
Derek Collison	fef702a688	[FIXED] bug in consumer names paging, did not honor limits and returned duplicate results. Signed-off-by: Derek Collison <derek@nats.io>	2022-09-29 06:14:00 -07:00
Ivan Kozlovic	170ff49837	[ADDED] JetStream: peer (the hash of server name) in statsz/jsz A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something like this: ``` ... "meta_cluster": { "name": "local", "leader": "A", "peer": "NUmM6cRx", "replicas": [ { "name": "B", "current": true, "active": 690369000, "peer": "b2oh2L6w" }, { "name": "Server name unknown at this time (peerID: jZ6RvVRH)", "current": false, "offline": true, "active": 0, "peer": "jZ6RvVRH" } ], "cluster_size": 3 } ``` Note the "peer" field following the "leader" field that contains the server name. The new field is the node ID, which is a hash of the server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-16 15:31:37 -06:00
Ivan Kozlovic	f113163b9f	Change ByID boolean to Peer string and add Peer id in replicas output The CLI will now be able to display the peer IDs in MetaGroupInfo if it chooses to do so, and possibly help user select the peer ID from a list with a new command to remove by peer ID instead of by server name. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-15 10:39:23 -06:00
Ivan Kozlovic	e1f0361b98	[ADDED] JetStream: ability to remove a server by peer ID instead of name This can be helpful after a partial cluster restart since in that case the server name may not be known. However "server report jetstream" would report the peer ID that then can be used. For instance here is the output after a cluster restart where server "C" is not restarted. ``` nats -s nats://sys:pwd@localhost:4222 server report jetstream ... ╭────────────────────────────────────────────────────────────────────────────────────────────────╮ │ RAFT Meta Group Information │ ├─────────────────────────────────────────────────────┬────────┬─────────┬────────┬────────┬─────┤ │ Name │ Leader │ Current │ Online │ Active │ Lag │ ├─────────────────────────────────────────────────────┼────────┼─────────┼────────┼────────┼─────┤ │ A │ yes │ true │ true │ 0.00s │ 0 │ │ B │ │ true │ true │ 0.53s │ 0 │ │ Server name unknown at this time (peerID: jZ6RvVRH) │ │ false │ false │ 0.00s │ 0 │ ╰─────────────────────────────────────────────────────┴────────┴─────────┴────────┴────────┴─────╯ ``` With a change to the NATS CLI we could have something like: ``` nats -s nats://sys:pwd@localhost:4222 server raft peer-remove jZ6RvVRH --by_id ``` Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-14 18:10:26 -06:00
Matthias Hanel	f7cb5b1f0d	changed format of JSClusterNoPeers error (#3459) * changed format of JSClusterNoPeers error This error was introduced in #3342 and reveals too much information. This change gets rid of cluster names and peer counts. All other counts were changed to booleans, which are only included in the output when the filter was hit. In addition, the set of not matching tags is included. Furthermore, the static error description in server/errors.json is moved into selectPeerError sample errors: 1) no suitable peers for placement, tags not matched ['cloud:GCP', 'country:US']" 2) no suitable peers for placement, insufficient storage Signed-off-by: Matthias Hanel <mh@synadia.com> Signed-off-by: Ivan Kozlovic <ivan@synadia.com> Co-authored-by: Ivan Kozlovic <ivan@synadia.com>	2022-09-08 18:25:48 -07:00
jnmoyne	95c1946231	Implements pagination for JS Stream Info requests	2022-09-08 10:45:20 -07:00
Derek Collison	c3203a3bb5	Use lostQuorum default versus live for reporting. Signed-off-by: Derek Collison <derek@nats.io>	2022-09-07 13:56:38 -07:00
Derek Collison	98bf861a7a	Updates to stream and consumer move logic. Signed-off-by: Derek Collison <derek@nats.io>	2022-08-30 16:11:35 -07:00
Derek Collison	aa94a0bc0f	New consumer create that allows elevation of stream and consumer names, and optional filter subject to the request subject. Similar to changes in direct get allows proper security if needed for filter subject selection. Signed-off-by: Derek Collison <derek@nats.io>	2022-08-30 09:29:38 -07:00
Matthias Hanel	7015e46dd9	fix move cancel issue where tags and peers diverge (#3354) This can happen if the move was initiated by the user. A subsequent cancel resets the initial peer list. The original peer list was picked on the old set of tags. A cancel would then keep the new list of tags but reset to the old peers. Thus tags and peers diverge. The problem is that at the time of cancel, the old placement tags can't be found anymore. This fix causes cancel to remove the placement tags, if the old peers do not satisfy the new placement tags. Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-08-10 18:48:18 +02:00
Derek Collison	a5119008a5	Fix up some processing during account purge to fix flapping tests Signed-off-by: Derek Collison <derek@nats.io>	2022-08-08 11:06:10 -06:00
Matthias Hanel	52c4872666	better error when peer selection fails (#3342) * better error when peer selection fails It is pretty hard to diagnose what went wrong when not enough peers for an operation were found. This change now returns counts of reasons why peers were discarded. Changed the error to JSClusterNoPeers as it seems more appropriate of an error for that operation. Not having enough resources is one of the conditions for a peer not being considered. But so is having a non matching tag. Which is why JSClusterNoPeers seems more appropriate. In addition, JSClusterNoPeers was already used as error after one call to selectPeerGroup already. example: no suitable peers for placement: peer selection cluster 'C' with 3 peers offline: 0 excludeTag: 1 noTagMatch: 2 noSpace: 0 uniqueTag: 0 misc: 0 Example for mqtt: mid:12 - "mqtt" - unable to connect: create sessions stream for account "$G": no suitable peers for placement: peer selection cluster 'MQTT' with 3 peers offline: 0 excludeTag: 0 noTagMatch: 0 noSpace: 0 uniqueTag: 0 misc: 0 (10005) Signed-off-by: Matthias Hanel <mh@synadia.com> * review comment Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-08-06 00:17:01 +02:00
Ivan Kozlovic	d90854a45f	Merge pull request #3341 from nats-io/go_1_19 Move to Go 1.19, remove io/ioutil, fix data race and a flapper	2022-08-05 12:49:06 -06:00
Matthias Hanel	c56f3b9fbd	Adding account purge operation (#3319) * Adding account purge operation The new request is available for the system account. The subject to send the request to is $JS.API.ACCOUNT.PURGE.*, with the name of the account to purge in place of the wildcard. Also added directory cleanup code such that servers do not end up with empty streams directories and account dirs that only contain streams. Also adding ACCOUNT to leaf node domain rewrite table. Addresses #3186 and #3306 by providing a way to get rid of the streams for existing and non existing accounts Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-08-05 18:24:19 +02:00
Ivan Kozlovic	3c9a7cc6e5	Move to Go 1.19, remove io/ioutil, fix data race and a flapper Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-08-05 09:55:37 -06:00
Derek Collison	28ccaa4371	Direct get across a leafnode using cross domain mappings to a queue subscriber did not work. The interest moved across the leafnode would be for the mapping, and not the actual qsub. So when received if we did detect that we are mapped and do not have a queue filter present make sure to ignore. This will allow queue subscriber processing on the local server that received the message from the leafnode. Signed-off-by: Derek Collison <derek@nats.io>	2022-08-03 20:21:28 -07:00
Derek Collison	c82c49451c	Allow direct get by subject to be all subject based. This avoids marshalling or unmarshalling but also allows subject based permissioning. Signed-off-by: Derek Collison <derek@nats.io>	2022-08-02 18:19:33 -07:00
Matthias Hanel	3358205de3	add implementation for consumer replica change (#3293) * add implementation for consumer replica change fixes #3262 also check peer list on every update Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-07-27 03:56:28 +02:00
Ivan Kozlovic	ebeca00e20	[FIXED] JetStream/Cluster: Stream names/infos would return bad response If there are more stream names than the current limit of 1024, getting the list of names would return them all instead of using pagination. For "stream infos", the Total amount returned would be the API limit instead of the actual number of streams. Resolves https://github.com/nats-io/natscli/issues/541 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-07-25 14:41:05 -06:00
Ivan Kozlovic	1da5ecfb96	[IMPROVED] JetStream: stream already exists error description The `JSStreamNameExistErr` will now include in the description that the stream exists with a different configuration, because that is the error clients would get when trying to add a stream with a different configuration (otherwise this is a no-op and clients don't get an error). Since that error was used in case of restore, a new error is added that uses the same description prefix "stream name already in use" but adds ", cannot restore" to indicate that this is a restore failure because the stream already exists. Resolves #3273 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-07-21 10:20:07 -06:00
Matthias Hanel	89b5e872ac	Move and cancel fixes (#3270) The Move/Cancel/Downscale mechanism did not take into account that the consumer's replica count can be set independently. This also alters peer selection to have the ability to skip the unique tag prefix check for a server that will be replaced. Say you have 3 az, and want to add another server to az:1, in order to replace a server that is in the same zone. Without this change, the uniqueTagPrefix check would filter out the replacement server and cause a failure. The cancel move response could not be received due to the wrong account name. Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-07-18 18:42:03 +02:00
Matthias Hanel	023500e1da	add the ability to cancel a move in progress (#3253) * add the ability to cancel a move in progress Move to individual subjects for move and cancel_move New subjects are: $JS.API.ACCOUNT.STREAM.MOVE.. $JS.API.ACCOUNT.STREAM.CANCEL_MOVE.. last and second to last token are account and stream name Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-07-12 21:54:18 +02:00
Derek Collison	7ff534d1da	Allow get next for json stream get version Signed-off-by: Derek Collison <derek@nats.io>	2022-07-08 07:19:15 -07:00
Matthias Hanel	70be4b77f9	fixes peer removal, simplifies move, more tests Make sure when processing a peer removal that the stream assignment agrees. When a new leader takes over it can resend a peer removal, and if the stream/consumer really was rescheduled we could remove by accident. Also need to make sure that when we remove a stream we remove the node as part of the stream assignment. If we didn't, if the same asset returned to this server we would not start up the monitoring loop. Simplify migration logic in monitorStream, to be driven by leader only Improved unit tests Added failure when server not in peer list Move command does not require server anymore Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-07-07 03:32:13 +02:00
Derek Collison	1ea608eabf	Allows direct get to also do get next for subject with starting sequence Signed-off-by: Derek Collison <derek@nats.io>	2022-07-06 14:22:28 -07:00
Derek Collison	4075721651	Allow direct msg get for stream to operate in queue group and allows mirrors to opt-in to the same group. Signed-off-by: Derek Collison <derek@nats.io>	2022-07-02 14:16:55 -07:00
R.I.Pienaar	97ad346c34	fix json tag for meta stream move Signed-off-by: R.I.Pienaar <rip@devco.net>	2022-07-02 11:51:50 +02:00
Matthias Hanel	6bd14e1b7a	removed commented out code (#3228) Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-06-29 20:31:12 +02:00
Derek Collison	abc5905aa9	Merge pull request #3221 from nats-io/direct Made direct get from a stream part of the $JS.API hierarchy vs separate.	2022-06-28 09:59:44 -07:00
Derek Collison	b8ef9b19a0	Made direct get from a stream part of the $JS.API hierarchy vs separate. Also for direct get and for pull requests, if we are not on a client connection check how long we have been away from the readloop. If need be execute in a separate go routine. Signed-off-by: Derek Collison <derek@nats.io>	2022-06-28 08:53:48 -07:00
Matthias Hanel	3421c49310	[Add] ability for operator to move streams (#3217) Also added: ability to reload tags special tag (!jetstream) to remove peer from peer placement $JS.API.SERVER.STREAM.MOVE subject to initiate move away from a server This changes a detail about regular stream move as well. Before, differing cluster names were used to start/stop a transfer. Now only the peer list and its size relative to configured replica matter. Once a transfer is considered completed, excess peers will be dropped from the beginning of the list. This allows transfers within the cluster as well. Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-06-28 02:36:32 +02:00
Derek Collison	1ade8fc881	When stream or consumer names contained path separators it prevented backup and restore. Signed-off-by: Derek Collison <derek@nats.io>	2022-06-20 11:59:18 -07:00
Derek Collison	301eb11725	Merge pull request #3168 from nats-io/no_fds_imp [IMPROVED] Loaded server and low on resources like FDs.	2022-06-06 06:01:20 -07:00
Derek Collison	e1c8f9fb55	This improves when a server is under load or low on resources like FDs and a user is trying to delete a stream with lots of consumers. Signed-off-by: Derek Collison <derek@nats.io>	2022-06-04 16:49:17 -07:00
Derek Collison	c8a730ce55	Stream get for KV was going through API layer, but with popularity needed a more performant, lighter weight and direct approach. Signed-off-by: Derek Collison <derek@nats.io>	2022-05-30 16:34:54 -07:00
Derek Collison	daa4b97eeb	Don't do advisories or API stats for a direct get msg from a stream. Signed-off-by: Derek Collison <derek@nats.io>	2022-05-30 09:32:07 -07:00
Ivan Kozlovic	4bf81420e2	[FIXED] Fast routed JetStream API requests were dropped If a JS API request is received from a non client connection, it was processed in its own go routine. To reduce the number of such go routines, we were limiting the number of outstanding routines to 4096. However, in some situations, it was possible to issue many requests at the same time that would then cause those requests to be dropped. (an example was an MQTT benchmark tool that would create 5000 sessions, each with one QoS1 R1 consumer (with the use of consumer_replicas=1). On abrupt exit of the tool, the consumers and their sessions needed to be deleted. This would cause fast incoming delete consumer requests which would cause the original code to drop some of them) Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-05-23 11:15:55 -06:00
Matthias Hanel	114474562c	removed redundant republish code (#3138) was not removed when moving the code into checkStreamCfg Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-05-19 18:36:28 -04:00
Derek Collison	c166c9b199	Enable republishing of messages once stored in a stream. This enables lightweight distribution of messages to very large number of NATS subscribers. We add in metadata as headers that allows for gap detection which enables initial value (via JetStream, maybe KV) and realtime NATS core updates but all globally ordered. Signed-off-by: Derek Collison <derek@nats.io>	2022-05-17 15:18:54 -07:00
Derek Collison	50be0a6599	Allow explicit configuration of consumer's replica count and allow a consumer to force memory storage. Signed-off-by: Derek Collison <derek@nats.io>	2022-05-16 19:03:56 -07:00
Derek Collison	6bbc5f627c	Support for MaxBytes for pull requests. Signed-off-by: Derek Collison <derek@nats.io>	2022-05-16 08:43:33 -07:00
Ivan Kozlovic	5050092468	[FIXED] JetStream: possible lock inversion When updating usage, there is a lock inversion in that the jetStream lock was acquired while under the stream's (mset) lock, which is not correct. Also, updateUsage was locking the jsAccount lock, which again, is not really correct since jsAccount contains streams, so it should be jsAccount->stream, not the other way around. Removed the locking of jetStream to check for clustered state since js.clustered is immutable. Replaced using jsAccount lock to update usage with a dedicated lock. Originally moved all the update/limit fields in jsAccount to new structure to make sure that I would see all code that is updating or reading those fields, and also all functions so that I could make sure that I use the new lock when calling these. Once that work was done, and to reduce code changes, I put the fields back into jsAccount (although I grouped them under the new usageMu mutex field). Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-05-02 09:50:32 -06:00
Matthias Hanel	d520a27c36	[fixed] step down timing, consumer stream seqno, clear redelivery (#3079) Step down timing for consumers or streams. Signals loss of leadership and sleeps before stepping down. This makes it less likely that messages are being processed during step down. When becoming leader, consumer stream seqno got reset, even though the consumer existed already. Proper cleanup of redelivery data structures and timer Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-04-27 03:32:08 -04:00
Leander Kohler	966d9d56f4	Add JSConsumerDeliveryNakAdvisory The advisory `JSAdvisoryConsumerMsgNakPre` will be triggered when a message is nacked	2022-04-25 16:13:32 +02:00
Matthias Hanel	79b4374d01	[Fixed] limits enforcement issues (#3046) * [Fixed] limits enforcement issues stream create had checks that stream restore did not have. Moved code into commonly used function checkStreamCfg. Also introduced (cluster/non clustered) StreamLimitsCheck functions to perform checks specific to clustered/non clustered data structures. Checking for valid stream config and limits/reservations before receiving all the data. Now fails the request right away. Added a jetstream limit "max_request_batch" to limit fetch batch size Shortened max name length from 256 to 255, more common file name limit Added check for loop in cyclic source stream configurations features related to limits Signed-off-by: Matthias Hanel <mh@synadia.com>	2022-04-18 01:53:48 -04:00
Ivan Kozlovic	eb4856e4a7	Cleanup timers on consumer leader change Signed-off-by: Ivan Kozlovic <ivan@synadia.com>	2022-04-16 13:37:46 -06:00
Derek Collison	d5ed5b1d92	fix Signed-off-by: Derek Collison <derek@nats.io>	2022-04-15 13:38:12 -07:00
Derek Collison	3d9fdff315	Protect against no cluster Signed-off-by: Derek Collison <derek@nats.io>	2022-04-15 13:24:55 -07:00

266 Commits