This exports the one key function of the subject transformer
allowing external tools to be written to test mappings are
valid and see how they would interact without the hassle of
configuring a serrver
The APIs are specifically marked as being unsupported and
having kept the transform struct itself unexported one can
not cast from the interface to the real implementation
Signed-off-by: R.I.Pienaar <rip@devco.net>
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
step down or server restart.
Signed-off-by: Derek Collison <derek@nats.io>
This enables lightweight distribution of messages to very large number of NATS subscribers.
We add in metadata as headers that allows for gap detection which enables initial value (via JetStream, maybe KV) and realtime NATS core updates but all globally ordered.
Signed-off-by: Derek Collison <derek@nats.io>
When a downstream stream uses retention modes that delete messages, fallback to timebased start time for the new source consumers.
Signed-off-by: Derek Collison <derek@nats.io>
When updating usage, there is a lock inversion in that the jetStream
lock was acquired while under the stream's (mset) lock, which is
not correct. Also, updateUsage was locking the jsAccount lock, which
again, is not really correct since jsAccount contains streams, so
it should be jsAccount->stream, not the other way around.
Removed the locking of jetStream to check for clustered state since
js.clustered is immutable.
Replaced using jsAccount lock to update usage with a dedicated lock.
Originally moved all the update/limit fields in jsAccount to new
structure to make sure that I would see all code that is updating
or reading those fields, and also all functions so that I could
make sure that I use the new lock when calling these. Once that
works was done, and to reduce code changes, I put the fields back
into jsAccount (although I grouped them under the new usageMu mutex
field).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Step down timing for consumers or streams.
Signals loss of leadership and sleeps before stepping down.
This makes it less likely that messages are being processed during step
down.
When becoming leader, consumer stream seqno got reset,
even though the consumer existed already.
Proper cleanup of redelivery data structures and timer
Signed-off-by: Matthias Hanel <mh@synadia.com>
The main issue was that in mixed-mode, the interest through gateway
may still be in optimistic mode, which when creating the source
consumer would start delivery before we had a chance to setup
the subscription to receive those messages.
The approach is to create the subscription prior to sending
the consumer create request. Also refactored a bit the code in
the hope to make the retries a bit more bullet proof.
We may also look at making sure that gateways are switched to
interest-mode when detecting a mixed-mode setup.
Also fixed a defect that could cause a source to be canceled
when updating a stream.
Resovles #2801
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Possibly missing some early messages from the sourced stream
- In some cancel situations, the processing of sourced messages
would not longer work
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
* [Fixed] limits enforcement issues
stream create had checks that stream restore did not have.
Moved code into commonly used function checkStreamCfg.
Also introduced (cluster/non clustered) StreamLimitsCheck functions to
perform checks specific to clustered /non clustered data structures.
Checking for valid stream config and limits/reservations before
receiving all the data. Now fails the request right away.
Added a jetstream limit "max_request_batch" to limit fetch batch size
Shortened max name length from 256 to 255, more common file name limit
Added check for loop in cyclic source stream configurations
features related to limits
Signed-off-by: Matthias Hanel <mh@synadia.com>
During elected stepdown and transfer allow the new leader to take over before we stepdown.
We could receive a leader change, so make sure to also check migration state.
Signed-off-by: Derek Collison <derek@nats.io>
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The system will allow an update to a stream, and subsequently all attached consumers, to be placed in another cluster either directly or via tag placement.
The meta layer will scale the underlying peerset appropriately to straddle the two clusters for both the stream and consumers, taking into account the consumer type.
Control will then pass to the current leaders of the assets who will monitor the catchup status of the new peers.
(Note we can optimize this later to only traverse once across a GW for any given asset, but for now this is simpler)
Once the original leaders have determined the assets are synched it will pass leadership to a member of the new peerset.
Once the new leader has been elected, it will forward a request for the meta layer to shrink the peerset by removing the old peers.
Signed-off-by: Derek Collison <derek@nats.io>
Some warnings, especially when dealing with JS limits that were
printed on a per-message basis, are now limited to ~1 per second
if the content of the warning is already found in a map.
This is also for "client" warnings, but the client porting of the
warning is not taken into account so that helps with reducing logging
for similar content, but coming from different clients.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
adds unit test to test this scenario
improves reporting of correct error
only show info for non existing tiers where streams exist
Signed-off-by: Matthias Hanel <mh@synadia.com>
* Adding server limits (max ack pending/dedupe window) to js config
Also shifting consumer config check to jsConsumerCreate as in clustered
mode this was enforced in the wrong place
Signed-off-by: Matthias Hanel <mh@synadia.com>
Also fixed a bug where we were incorrectly not spining up the monitoring loop for a stream when going from 3->1->3.
Signed-off-by: Derek Collison <derek@nats.io>
Would possibly show up when a consumer leader changes for a consumer
that had redelivered messages and for instance messages were inbound
on the stream.
Resolves#2912
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoimd copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.
The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.
Also fixed some flappers and a bug where de-dupe might not be reformed correctly.
Signed-off-by: Derek Collison <derek@nats.io>
This was introduced by the change for ipQueues in #2931.
The (*ipQueue).unregister() was written with a protection for
the ipQueue to be nil, however, mset.outq is actually not a bare
ipQueue but a jsOutQ that embeds a pointer to an ipQueue. So we
need to implement register() for jsOutQ.
Added a test that reproduced the issue, but found it with a flapping
test (TestJetStreamLongStreamNamesAndPubAck) that failed due to
a file name too long.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removal of a stream source that was external was not working properly,
allowing messages to still flow after the removal and until the
server hosting the stream to which the source was removed was
restarted.
Resolves#2920
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have beend pop()'ed. When recycle() is invoked this count
goes down.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.
Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The "deleted" advisory was missing because the stream's send loop
was closed before the advisory was pushed to the queue to be sent.
Added tests, both for single and clustered mode to test all stream
advisories.
Resolves#2886
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This change allows a bit better logging on startup to more easily map a RAFT log directory etc to the stream/consumer.
Signed-off-by: Derek Collison <derek@nats.io>
This should help with GC pressure, however, it may have an effect
on performance (based on some benchmark). Calling sync.Pool.Get/Put
too often has a performance impact...
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>