Commit Graph

204 Commits

Author SHA1 Message Date
Ivan Kozlovic
50c3986863 [FIXED] JetStream stream catchup issues
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-12 16:05:12 -06:00
Derek Collison
37cbac99e7 Improvements to the raft layer for general stability and support of scale up and down and asset move.
Also fixed a bug that would allow a leadership transfer when catching up.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:59:39 -07:00
Matthias Hanel
d9da66d67e returns -1 for new unlimited/unset limits and tests/fixes info counts (#3002)
iterates on tiered limits

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-05 12:25:55 -04:00
Matthias Hanel
569328bc18 fix crash on shutdown with meta being nil (#2999)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-04 17:40:19 -04:00
Ivan Kozlovic
19783a9f11 [CHANGED] Rate limit similar warnings
Some warnings, especially when dealing with JS limits that were
printed on a per-message basis, are now limited to ~1 per second
if the content of the warning is already found in a map.

This is also for "client" warnings, but the client porting of the
warning is not taken into account so that helps with reducing logging
for similar content, but coming from different clients.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-01 15:24:03 -06:00
Matthias Hanel
a77f95faa8 error handling and info when moving a stream from non existing tier (#2992)
adds unit test to test this scenario
improves reporting of correct error
only show info for non existing tiers where streams exist

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-01 14:21:35 -04:00
Matthias Hanel
92f4dc986a added max_ack_pending setting to js account limits (#2982)
* added max_ack_penind setting to js account limits

because of the addition, defaults now have to be set later (depend on
these new limits now)

also re-organized the code to closer track how stream create looks

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-31 14:17:16 -04:00
Matthias Hanel
1445153130 Adding max stream bytes check (#2970)
* Adding max stream bytes check

Also start checking on  stream update

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-30 15:50:28 -04:00
Matthias Hanel
1aeaaf0ca3 Adding server limits (max ack pending/dedupe window) to js config (#2967)
* Adding server limits (max ack pending/dedupe window) to js config

Also shifting consumer config check to jsConsumerCreate as in clustered
mode this was enforced in the wrong place

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-29 13:19:36 -04:00
Matthias Hanel
9ee03aedd1 update message size was too short when nothing needed to be sent (#2968)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-28 23:37:30 -04:00
Matthias Hanel
0c5f3688a7 [ADDED] Tiered limits and fix limit issues on updates (#2945)
* Adding tiered limits and fix limit issues on updates

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-28 20:47:54 -04:00
Ivan Kozlovic
4739eebfc4 [FIXED] JetStream: possible deadlock during consumer leadership change
Would possibly show up when a consumer leader changes for a consumer
that had redelivered messages and for instance messages were inbound
on the stream.

Resolves #2912

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-25 12:21:51 -06:00
Ivan Kozlovic
b4128693ed Ensure file path is correct during stream restore
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.

Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:31:51 -07:00
Matthias Hanel
9a2da9ed8c Adding denies $KV.>/$OBJ.> along leaf connections on differing domain (#2916)
* Adding denies $KV.>/$OBJ.> along leaf connections on differing domain

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-09 13:17:59 -05:00
Derek Collison
11cad6be6b In the process of working on #2885 with a user, I was struggling to map $SYS directories to consumer names.
This change allows a bit better logging on startup to more easily map a RAFT log directory etc to the stream/consumer.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-03 09:50:00 -08:00
Derek Collison
fb15dfd9b7 Allow replica updates during stream update.
Also add in HAAssets count to Jsz.

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-13 19:33:46 -08:00
Derek Collison
6fd41e5ea4 Updates based on review feedback
Signed-off-by: Derek Collison <derek@nats.io>
2022-01-24 10:23:47 -08:00
Derek Collison
d962500827 Track reply subjects for pending pull requests across clustered consumers.
We will only send if all peers in our group are >= 2.7.1 and we will check for updates.
When a consumer follower takes over it will notify all pending requests that those requests are invalid now.

Signed-off-by: Derek Collison <derek@nats.io>
2022-01-21 16:31:59 -08:00
Ivan Kozlovic
84f6cbb760 Pooling pubMsg and jsPubMsg objects
This should help with GC pressure, however, it may have an effect
on performance (based on some benchmark). Calling sync.Pool.Get/Put
too often has a performance impact...

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-01-13 13:14:25 -07:00
Ivan Kozlovic
92e8997506 Replaced system event queue
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-01-13 13:03:33 -07:00
Derek Collison
52da55c8c6 Implement overflow placement for JetStream streams.
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.

We now track account limits and server limits more distinctly, and do not reserver server resources based on account limits themselves.

Signed-off-by: Derek Collison <derek@nats.io>
2022-01-06 19:33:08 -08:00
Derek Collison
1a37f0963a Avoid race condition
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-29 08:26:10 -08:00
Matthias Hanel
3e8b66286d Js leaf deny (#2693)
Along a leaf node connection, unless the system account is shared AND the JetStream domain name is identical, the default JetStream traffic (without a domain set) will be denied.

As a consequence, all clients that wants to access a domain that is not the one in the server they are connected to, a domain name must be specified.
Affected from this change are setups where: a leaf node had no local JetStream OR the server the leaf node connected to had no local JetStream. 
One of the two accounts that are connected via a leaf node remote, must have no JetStream enabled.
The side that does not have JetStream enabled, will loose JetStream access and it's clients must set `nats.Domain` manually.

For workarounds on how to restore the old behavior, look at:
https://github.com/nats-io/nats-server/pull/2693#issuecomment-996212582

New config values added:
`default_js_domain` is a mapping from account to domain, settable when JetStream is not enabled in an account.
`extension_hint` are hints for non clustered server to start in clustered mode (and be usable to extend)
`js_domain` is a way to set the JetStream domain to use for mqtt.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-16 16:53:20 -05:00
Derek Collison
6f5263e12d Add in a warning when detecting subjects on a mirror
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-01 14:00:31 -07:00
Derek Collison
ca12a11be3 There were situations where invalid subjects could be assigned to streams.
This will patch them on the fly during recovery. Specifically subjects with leading or trailing spaces and mirror streams with any subjects at all.

Signed-off-by: Derek Collison <derek@nats.io>
2021-12-01 14:00:23 -07:00
Derek Collison
cd54b4028d Undo race fix which could cause deadlock
Signed-off-by: Derek Collison <derek@nats.io>
2021-11-04 15:36:03 -07:00
Derek Collison
7ef0cc5651 Fix for race on js.cluster status
Signed-off-by: Derek Collison <derek@nats.io>
2021-11-04 15:09:40 -07:00
Derek Collison
c9f615933a Honor JetStream server settings of 0.
Signed-off-by: Derek Collison <derek@nats.io>
2021-10-12 14:16:47 -07:00
Derek Collison
bbffd71c4a Improvements to meta raft layer around snapshots and recovery.
Signed-off-by: Derek Collison <derek@nats.io>
2021-10-12 05:53:52 -07:00
Ivan Kozlovic
d34b8fdd7d [FIXED] JetStream: data race on shutdown
Replaced use of eventsEnabled() with EventsEnabled() that will
check under server lock. Also found another reference when
creating templates.

Resolves #2588

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-09-30 11:17:12 -06:00
Derek Collison
7340dde408 Improve handling when exceeding account resources and trying to catch up streams, etc.
Signed-off-by: Derek Collison <derek@nats.io>
2021-09-23 14:21:14 -07:00
Derek Collison
944b90d4a3 Fix for no storage limits
Signed-off-by: Derek Collison <derek@nats.io>
2021-09-22 15:28:43 -07:00
Matthias Hanel
29a6367889 incorporating review comments.
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-09-21 17:13:53 -04:00
Matthias Hanel
9911b37b0c [added] value to JS stats showing memory used from accounts with reservations
[fixed] reservations accounting issue on reload introduced by:
commit: bfb726e8e9
clearResources appeared to have been a workaround and broke
reload for non global accounts

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-09-21 16:35:24 -04:00
Derek Collison
9534372113 Fix for #2551
When a mirror would be processed before the origin stream we would not recover the consumers due to failure on looking up source's subjects.
This change processes all streams first then does all consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-21 08:53:12 -07:00
Matthias Hanel
5b9d20871d Exposing reserved memory in jsz/varz
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-09-16 18:55:50 -04:00
Derek Collison
cfbc69b12c Allow clustered JetStream to allow duplicate stream creation like single server mode.
Resolves #2528

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-15 20:18:44 -07:00
Ivan Kozlovic
0c9952f21d Additional lock inversion between jetStream and jsAccount
Order seem to be jetStream -> jsAccount. In JetStreamUsage()
we were doing the opposite. Moved the js.mu.RLock() to encompass
the whole getting of statistics from jsa. We could do in 2 phases:
get js's RLock to get cluster info (if clustered), then get
jsa's RLock for the rest, and set cluster info with what we gathered
in the first phase.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-08-31 14:18:31 -06:00
Derek Collison
476c264560 If we are in a simple mixed-mode setup with just global account and system account and clustered, allow pass through.
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-26 09:41:01 -07:00
Derek Collison
3a20582ad5 Add in optional compression schemes for Accept-Encoding on server api requests.
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-23 13:06:18 -07:00
Derek Collison
a7cf0ad985 Add in additional checks for failures during filestore encryption.
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-17 14:08:50 -07:00
R.I.Pienaar
76ab1b8d17 attempt to improve UX of the error system
Previously we had a few confusing functions like NewT
and similar that were quite fragile to use due to minimal
validation and a panic in go stdlib string Replacer.

Now we generate helper methods for every string, these
are used to access errors, fill in templates and conditional
returns of error type using the new Unless() option

We now get compile time errors for some common mistakes
and have better IDE helpers for arguments etc

Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-08-10 16:08:28 +02:00
Derek Collison
925a6fe6b2 Fix for #2388. Leafnodes with no JS can seamlessly access a HUB with JS.
This is the reverse of the early work to have LNs extend a non-JS cluster.
Also have mixed mode tests as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-01 14:57:47 -07:00
Derek Collison
f13fa767c2 Remove the swapping of accounts during processing of service imports.
When processing service imports we would swap out the accounts during processing.
With the addition of internal subscriptions and internal clients publishing in JetStream we had an issue with the wrong account being used.
This was specific to delyaed pull subscribers trying to unsubscribe due to max of 1 while other JetStream API calls were running concurrently.
2021-07-26 07:57:10 -07:00
Derek Collison
99fed910f0 Improvements to large numbers of JetStream R1 consumers per stream.
1. We were holding open FDs longer than we should for consumers causing issues with open FD limits. We now do not hold them open and cap updates a bit better.

2. When doing a stream delete, consumer delete was repeating alot of work that was not necessary, causing longer delays. This has been optimized a bit, still more improvements to be made.

3. We cover all JS under a single export, but that was also trapping GetNext for pull based consumers, and since this was a no-op (is handled at user account level) we were creating alot of garbage service import responses and reverse map entries that had to be garbage collected. We have a fix in to avoind this but still looking for a better one.

4. Still had some lingering references to all exports vs single JS export.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-29 05:45:55 -07:00
Derek Collison
bf6335dff9 Add in ability to have encrypted JetStream filestores.
This supports XChaChaPoly1305 for Seal and Open and ChaCha20 for our message blocks which use highway hashes and sequence numbers for authenticity.
We support snapshot and restore as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-21 19:28:10 -07:00
dtest11
a268905cd5 remove config !nil check,beacuse the if branch is always not nil 2021-06-18 18:02:45 +08:00
Derek Collison
08cdb2d2ea Make filtered consumers in large mixed streams more efficient.
Allow wider scoped filtered subjects.

We introduce a per subject information tracking to filestore to optimize for large mux'd streams and more efficient filtered consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-15 04:44:05 -07:00
Derek Collison
ceebc3ae07 When checking limits we would check total ask against the server limits if limits were not set.
We were also dynamically setting account limits based on a single server limit.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-12 10:27:43 -07:00
Matthias Hanel
2caf2303f2 [adding] jetstream info to statsz (#2269)
* [adding] jetstream info to statsz

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-06-10 11:54:56 -04:00