Commit Graph

516 Commits

Author SHA1 Message Date
Ivan Kozlovic
5f0ee2344a [FIXED] JetStream: stream mirror updates not rejected in standalone
Updates to stream mirror config are rejected in cluster mode, but
were not in standalone. This PR adds the check in standalone mode.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-07-15 10:51:57 -06:00
Derek Collison
59a0a3ba5c Allow stream and mirror direct get abilities to be updated.
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-08 07:13:34 -07:00
Ivan Kozlovic
be8734e32b [CHANGED] JetStream: accept "Nats-Expected-Last-Sequence" with "0"
We use to ignore if the seq was 0, but now would treat it as
a requirement that the stream be empty if header is present but
set to 0.

This relates to client PR: https://github.com/nats-io/nats.go/pull/958

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-07-07 09:44:27 -06:00
Matthias Hanel
70be4b77f9 fixes peer removal, simplifies move, more tests
Make sure when processing a peer removal that the stream assignment agrees.
When a new leader takes over it can resend a peer removal, and if the stream/consumer really was rescheduled we could remove by accident.

Also need to make sure that when we remove a stream we remove the node as part of the stream assignment.
If we didn't, if the same asset returned to this server we would not start up the monitoring loop.

Simplify migration logic in monitorStream, to be driven by leader only

Improved unit tests

Added failure when server not in peer list

Move command does not require server anymore

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-07-07 03:32:13 +02:00
Derek Collison
f8939b40bc Do not unsubscribe from direct access on leader stepdown, only stopping.
Also wait for stream to have replicas and leader for test.

Signed-off-by: Derek Collison <derek@nats.io>
2022-07-06 16:20:12 -07:00
Derek Collison
1ea608eabf Allows direct get to also do get next for subject with starting sequence
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-06 14:22:28 -07:00
Derek Collison
5690059dac Reserve a system queue group
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-06 13:16:13 -07:00
Derek Collison
4075721651 Allow direct msg get for stream to operate in queue group and allows mirrors to opt-in to the same group.
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-02 14:16:55 -07:00
Derek Collison
a77d5941ef Allow filtered stream mirrors
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-29 08:12:38 -07:00
Derek Collison
abc5905aa9 Merge pull request #3221 from nats-io/direct
Made direct get from a stream part of the $JS.API hierarchy vs separate.
2022-06-28 09:59:44 -07:00
Matthias Hanel
aabaf6f106 [fixed] reload related races (#3222)
account.rm had races caused by reload copying rm from one account to
another

mset.store was used outsisde the lock

in rare cases the stasz message was not received in time.
Trigger automatically now

sometimes a statsz message received before reload cause issues.
try receiving a second time

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-06-28 18:36:13 +02:00
Derek Collison
b8ef9b19a0 Made direct get from a stream part of the $JS.API hierarchy vs separate.
Also for direct get and for pull requests, if we are not on a client connection check how long we have been away from the readloop.
If need be execute in a separate go routine.

Signed-off-by: Derek Collison <derek@nats.io>
2022-06-28 08:53:48 -07:00
Derek Collison
301eb11725 Merge pull request #3168 from nats-io/no_fds_imp
[IMPROVED] Loaded server and low on resources like FDs.
2022-06-06 06:01:20 -07:00
Derek Collison
b64f3095ce Republish on the republish subject, place original in a header like direct get
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-05 16:21:39 -07:00
Derek Collison
dfc74dd4c1 Make sure only stream leader does a republish
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-05 15:27:07 -07:00
Derek Collison
fddc31adb5 Merge pull request #3158 from nats-io/kv-direct-get
[IMPROVED] Fast and Direct access to stream messages.
2022-06-05 07:04:18 -07:00
Derek Collison
e1c8f9fb55 This improves when a server is under load or low on resources like FDs and a user is trying to delete a stream with lots of consumers.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-04 16:49:17 -07:00
R.I.Pienaar
52a1c542f5 export the correct subject transformer subject
While the TransformSubject function was doing the right
thing it did not match first and so would panic for subjects
that do not match the mapping.

The map function does the right thing so this is a more
appropriate function to export.

This undoes the exporting of unsafe TransformSubject and
exports the safer Match instead.

Signed-off-by: R.I.Pienaar <rip@devco.net>
2022-06-02 18:26:12 +02:00
Derek Collison
0979bce543 Updates based on group feedback.
1. Do not use original subject since this could use Request() and we want to use muxing.
2  Place original subject and timestamp into headers.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-31 19:15:52 -07:00
Derek Collison
c8a730ce55 Stream get for KV was going through API layer, but with popularity needed a more peformant and lighter weight and direct approach.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-30 16:34:54 -07:00
Derek Collison
e08f6d863d Allow for republish to be headers only
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-30 12:05:17 -07:00
Derek Collison
5592315e89 Suppress consumer create and R1 stream update advisories on server restart.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-30 09:58:35 -07:00
R.I.Pienaar
dc9d6776f8 Export the subject transformer
This exports the one key function of the subject transformer
allowing external tools to be written to test mappings are
valid and see how they would interact without the hassle of
configuring a serrver

The APIs are specifically marked as being unsupported and
having kept the transform struct itself unexported one can
not cast from the interface to the real implementation

Signed-off-by: R.I.Pienaar <rip@devco.net>
2022-05-27 10:33:59 +02:00
Derek Collison
46f7f7bfc9 Consumer pending was not correct when stream had max msgs per subject set > 1 and a consumer that filtered out part of the stream was created.
Also make sure to update stream's config on a stream restore in case of changes.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-24 14:44:15 -07:00
Ivan Kozlovic
53e3c53d96 [FIXED] JetStream: consumer with deliver new may miss messages
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
step down or server restart.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-23 12:01:48 -06:00
Derek Collison
790d643431 Consumer's num pending can now rely on the stream's store vs trying to maintain furing runtime which could be wrong under certain conditions.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-20 08:45:43 -07:00
Derek Collison
e3249d8b6c Move cfg check for republish to common func
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-17 15:33:43 -07:00
Derek Collison
c166c9b199 Enable republishing of messages once stored in a stream.
This enables lightweight distribution of messages to very large number of NATS subscribers.
We add in metadata as headers that allows for gap detection which enables initial value (via JetStream, maybe KV) and realtime NATS core updates but all globally ordered.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-17 15:18:54 -07:00
Derek Collison
bcecae42ac Fix for #3119
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-12 15:45:29 -07:00
Derek Collison
88ebfdaee8 Merge pull request #3109 from nats-io/issue-3107-3069
[FIXED] Downstream sourced retention policy streams during restart have redelivered messages
2022-05-09 09:13:48 -07:00
Derek Collison
b35988adf9 Remember the last timestamp by not removing last msgBlk when empty and during purge pull last timestamp forward until new messages arrive.
When a downstream stream uses retention modes that delete messages, fallback to timebased start time for the new source consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-09 09:04:19 -07:00
Derek Collison
6507cba2a9 Fix for race on recovery
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-07 12:42:56 -07:00
Ivan Kozlovic
5050092468 [FIXED] JetStream: possible lock inversion
When updating usage, there is a lock inversion in that the jetStream
lock was acquired while under the stream's (mset) lock, which is
not correct. Also, updateUsage was locking the jsAccount lock, which
again, is not really correct since jsAccount contains streams, so
it should be jsAccount->stream, not the other way around.

Removed the locking of jetStream to check for clustered state since
js.clustered is immutable.

Replaced using jsAccount lock to update usage with a dedicated lock.

Originally moved all the update/limit fields in jsAccount to new
structure to make sure that I would see all code that is updating
or reading those fields, and also all functions so that I could
make sure that I use the new lock when calling these. Once that
works was done, and to reduce code changes, I put the fields back
into jsAccount (although I grouped them under the new usageMu mutex
field).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-02 09:50:32 -06:00
Matthias Hanel
d520a27c36 [fixed] step down timing, consumer stream seqno, clear redelivery (#3079)
Step down timing for consumers or streams.
Signals loss of leadership and sleeps before stepping down.
This makes it less likely that messages are being processed during step
down.

When becoming leader, consumer stream seqno got reset,
even though the consumer existed already.

Proper cleanup of redelivery data structures and timer

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-27 03:32:08 -04:00
Derek Collison
f702e279ab Fix for a consumer recovery issue.
Also update healthz to check all assets that are assigned, not just running.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-26 19:22:19 -07:00
Ivan Kozlovic
b9463b322f [FIXED] JetStream: stream mirror issues in mixed mode clusters
Similar to PR #3061

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-20 23:21:15 -06:00
Ivan Kozlovic
df61a335c7 Merge pull request #3061 from nats-io/js_fix_stream_source
[FIXED] JetStream: stream sources issue in mixed mode clusters
2022-04-20 23:20:41 -06:00
Ivan Kozlovic
9975a38c6e [FIXED] JetStream: stream sources issue in mixed mode clusters
The main issue was that in mixed-mode, the interest through gateway
may still be in optimistic mode, which when creating the source
consumer would start delivery before we had a chance to setup
the subscription to receive those messages.

The approach is to create the subscription prior to sending
the consumer create request. Also refactored a bit the code in
the hope to make the retries a bit more bullet proof.

We may also look at making sure that gateways are switched to
interest-mode when detecting a mixed-mode setup.

Also fixed a defect that could cause a source to be canceled
when updating a stream.

Resovles #2801

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-20 21:02:35 -06:00
Matthias Hanel
ff5d60973d introducing max_age/dupe_window minimum value of 100ms. (#3056)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-20 13:58:19 -04:00
Ivan Kozlovic
a78ccdcb2f [FIXED] JetStream: some stream SOURCE issues
- Possibly missing some early messages from the sourced stream
- In some cancel situations, the processing of sourced messages
would not longer work

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-18 12:42:16 -06:00
Matthias Hanel
79b4374d01 [Fixed] limits enforcement issues (#3046)
* [Fixed] limits enforcement issues

stream create had checks that stream restore did not have.
Moved code into commonly used function checkStreamCfg.
Also introduced (cluster/non clustered) StreamLimitsCheck functions to
perform checks specific to clustered /non clustered data structures.

Checking for valid stream config and limits/reservations before
receiving all the data. Now fails the request right away.

Added a jetstream limit "max_request_batch" to limit fetch batch size

Shortened max name length from 256 to 255, more common file name limit

Added check for loop in cyclic source stream configurations

features related to limits

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-18 01:53:48 -04:00
Derek Collison
12730af5c2 Make sure mirror re-syncs after origin is moved.
Speed up mirror and sources heartbeats.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-15 07:01:55 -07:00
Derek Collison
9748925f13 Improvements to stream and consumer move.
During elected stepdown and transfer allow the new leader to take over before we stepdown.
We could receive a leader change, so make sure to also check migration state.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-14 07:27:29 -07:00
Ivan Kozlovic
50c3986863 [FIXED] JetStream stream catchup issues
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-12 16:05:12 -06:00
Derek Collison
aa256de55b Add in Domain to alternates
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 18:47:19 -06:00
Derek Collison
b7718e2b7a First pass support for stream alternates
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-11 18:47:19 -06:00
Derek Collison
7e38ebcb6e Allow assets such as streams and their associated consumers to migrate between clusters.
The system will allow an update to a stream, and subsequently all attached consumers, to be placed in another cluster either directly or via tag placement.
The meta layer will scale the underlying peerset appropriately to straddle the two clusters for both the stream and consumers, taking into account the consumer type.
Control will then pass to the current leaders of the assets who will monitor the catchup status of the new peers.
(Note we can optimize this later to only traverse once across a GW for any given asset, but for now this is simpler)
Once the original leaders have determined the assets are synched it will pass leadership to a member of the new peerset.
Once the new leader has been elected, it will forward a request for the meta layer to shrink the peerset by removing the old peers.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-04 18:28:36 -07:00
Ivan Kozlovic
19783a9f11 [CHANGED] Rate limit similar warnings
Some warnings, especially when dealing with JS limits that were
printed on a per-message basis, are now limited to ~1 per second
if the content of the warning is already found in a map.

This is also for "client" warnings, but the client porting of the
warning is not taken into account so that helps with reducing logging
for similar content, but coming from different clients.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-01 15:24:03 -06:00
Matthias Hanel
a77f95faa8 error handling and info when moving a stream from non existing tier (#2992)
adds unit test to test this scenario
improves reporting of correct error
only show info for non existing tiers where streams exist

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-01 14:21:35 -04:00
Derek Collison
7f78d3e618 Not allowing streams to be created meant we could not recover on server restart.
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-01 06:41:22 -07:00