3606 Commits

Author SHA1 Message Date
Jaime Piña
33cfc748bf Disable some supercluster limit placement tests (#2937) 2022-03-21 11:05:13 -07:00
Ivan Kozlovic
29ff67e2ac Tests: Replace all Ack() with AckSync() for now
For the reason explained in the previous commit, tests that were
expecting the number of ack/pending to be of a certain value after
an Ack() would be flapping. Replaced all references, and
we can go back to selectively calling Ack() when AckSync() is not
needed.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-17 20:25:01 -06:00
Ivan Kozlovic
ac52ecd9ff Fixing flapper
Since acks are now processed in a different goroutine, the tests
that use Ack() cannot expect the number of ack messages to be
exact immediately. So in this test use AckSync() to ensure that
the ack is processed. Alternatively, the pending count should
be checked with a checkFor().

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-17 19:53:33 -06:00
Ivan Kozlovic
a23b1b73ef Merge pull request #2931 from nats-io/ipq_changes
Changes to IPQueues
2022-03-17 19:13:02 -06:00
Derek Collison
a4e795c996 Attempt to fix flapper
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-17 17:38:32 -07:00
Ivan Kozlovic
c3da392832 Changes to IPQueues
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have been pop()'ed. When recycle() is invoked this count
goes down.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-17 17:53:06 -06:00
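
To illustrate the "in progress" notion described in this commit, here is a minimal sketch of a queue whose counter goes up when a batch is popped and down when the batch is recycled. The `queue` type, its `int` elements, and all method names are assumptions made for the example; the actual IPQueue implementation is not shown in this log and may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical, simplified queue that tracks how many popped
// elements are still being processed by the consumer.
type queue struct {
	mu         sync.Mutex
	elts       []int
	inProgress int
}

// push appends one element to the queue.
func (q *queue) push(v int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.elts = append(q.elts, v)
}

// pop hands the whole pending batch to the caller and counts it as in progress.
func (q *queue) pop() []int {
	q.mu.Lock()
	defer q.mu.Unlock()
	batch := q.elts
	q.elts = nil
	q.inProgress += len(batch)
	return batch
}

// recycle signals that the caller is done with the batch, lowering the count.
func (q *queue) recycle(batch []int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.inProgress -= len(batch)
}

// stats reports queued and in-progress counts, e.g. for a monitoring page.
func (q *queue) stats() (queued, inProgress int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.elts), q.inProgress
}

func main() {
	q := &queue{}
	for i := 0; i < 3; i++ {
		q.push(i)
	}
	batch := q.pop()
	queued, inProgress := q.stats()
	fmt.Println("after pop:", queued, inProgress)

	q.recycle(batch)
	queued, inProgress = q.stats()
	fmt.Println("after recycle:", queued, inProgress)
}
```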
Derek Collison
69d265601d Merge pull request #2930 from nats-io/dupe_urls_config
Detect exact duplicates for URLs for routes, gateways or leafnodes.
2022-03-17 16:23:56 -07:00
Derek Collison
0bb84bf76b Make warning more detailed
Co-authored-by: Waldemar Quevedo <wally@synadia.com>
2022-03-17 14:59:14 -07:00
Derek Collison
e204a7961d When detecting exact duplicates for URLs for routes, gws or leafnodes, enter a warning and ignore.
If misconfigured, this could prevent the JetStream system from electing a leader.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-17 14:52:01 -07:00
Jaime Piña
50ca685a3b Add stream limit update test (#2929)
This adds a test to see if we can update a stream when the stream limit
is 1. Currently, this test fails, so we're skipping it. This test will
be enabled in a future PR.
2022-03-17 13:49:37 -07:00
Derek Collison
0601da2186 Merge pull request #2928 from nats-io/m_ver
Show version on main monitoring page with link to source
2022-03-17 11:24:35 -07:00
Ivan Kozlovic
7d9bb32c1d Fix a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-17 12:18:22 -06:00
Derek Collison
fa098f1af0 Show version on main monitoring page with link to source
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-17 11:04:11 -07:00
Ivan Kozlovic
fe6d7b305f Merge pull request #2898 from nats-io/js_cons_ack_processing
[CHANGED] JetStream: Redeliveries may be delayed if necessary
2022-03-17 10:57:22 -06:00
Ivan Kozlovic
2c0f5046f1 Merge pull request #2923 from nats-io/gw_detect_duplicate_srv_name
[CHANGED] Gateway: Detect duplicate names between clusters
2022-03-17 10:57:08 -06:00
Derek Collison
dbfa47f9b1 Improve state preservation for consumers, specifically DeliverNew variants when no activity has been present.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-16 20:55:14 -07:00
Derek Collison
287b567b1c Add consumer check to healthz and allow to be called directly
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-16 20:52:31 -07:00
Jaime Piña
acfd456758 Prevent reserved bytes underflow (#2907) 2022-03-16 15:19:35 -07:00
Derek Collison
59753ec0da Bump to 2.7.5-beta.2
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-16 09:29:58 -07:00
Derek Collison
848670e45c Merge pull request #2925 from nats-io/delete_offline_stream
Offline streams behavior during list and delete improved.
2022-03-16 09:29:20 -07:00
Derek Collison
e4ebc4648e When a stream or consumer was offline we would not properly respond to a delete.
We also would hang if no stream info requests were sent during a stream list due to the asset being offline.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-15 21:11:23 -07:00
Derek Collison
303bb93c18 Test ack metrics
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-15 16:41:06 -07:00
Ivan Kozlovic
63c750e295 [CHANGED] Gateway: Detect duplicate names between clusters
Gateway connection will be closed and error reported if a remote
has a name that is a duplicate of the local cluster.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-15 15:00:13 -06:00
Ivan Kozlovic
5c0d1999ff Bump version
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 14:21:30 -07:00
Ivan Kozlovic
773636c1c5 Release v2.7.4
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:58:33 -07:00
Ivan Kozlovic
b4128693ed Ensure file path is correct during stream restore
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.

Also fixed lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:31:51 -07:00
Ivan Kozlovic
0cb0f6d380 Merge pull request #2914 from nats-io/fix_2913
[FIXED] Consumer with no activity can lose quorum
2022-03-09 11:55:50 -07:00
Matthias Hanel
9a2da9ed8c Adding denies $KV.>/$OBJ.> along leaf connections on differing domain (#2916)
* Adding denies $KV.>/$OBJ.> along leaf connections on differing domain

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-09 13:17:59 -05:00
Ivan Kozlovic
5a97ee6b64 Merge pull request #2911 from nats-io/fix_lock_inversions
[FIXED] Some lock inversions
2022-03-09 10:31:49 -07:00
Ivan Kozlovic
3538aea34e Merge pull request #2915 from nats-io/fix_atomic_unaligned
[FIXED] Panic when monitoring enabled on non 64bit architectures
2022-03-09 10:30:50 -07:00
Ivan Kozlovic
0fae8067ae [FIXED] Some lock inversions
The established ordering is client -> Account, so fixed a few places
where we had Account -> client.

Added a new file, locksordering.txt with the list of known ordering
for some of the objects.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 09:47:37 -07:00
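
A lock inversion arises when one code path takes lock A then B while another takes B then A. The sketch below shows the client -> Account ordering named in this commit applied consistently. The `client` and `account` types and the `register` and `clientCount` functions are simplified stand-ins invented for the example, not the server's structures.

```go
package main

import (
	"fmt"
	"sync"
)

// account and client are hypothetical, stripped-down stand-ins.
type account struct {
	mu      sync.Mutex
	clients int
}

type client struct {
	mu  sync.Mutex
	acc *account
}

// register always locks the client first and the account second, matching
// the client -> Account ordering. No path may take them in reverse order.
func register(c *client, a *account) {
	c.mu.Lock()
	defer c.mu.Unlock()
	a.mu.Lock()
	defer a.mu.Unlock()
	c.acc = a
	a.clients++
}

// clientCount needs only account state, so it takes only the account lock
// and never reaches back into a client while holding it.
func clientCount(a *account) int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.clients
}

func main() {
	acc := &account{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			register(&client{}, acc)
		}()
	}
	wg.Wait()
	fmt.Println("registered clients:", clientCount(acc))
}
```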
Ivan Kozlovic
dde235a92e [FIXED] Panic when monitoring enabled on non 64bit architectures
This is due to an unaligned 64-bit atomic operation. Move the
field to the top of the structure so that it is preceded only by
64-bit aligned fields.

Resolves #2011

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 09:29:29 -07:00
Derek Collison
3216eb5ee5 When a consumer has no state we are now compacting the log, but were not snapshotting.
This caused issues on leader change and losing quorum.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-09 07:21:25 -05:00
Matthias Hanel
d0c183106a Fixed lock inversion by not using account lock to get the name
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-07 21:22:41 -05:00
Derek Collison
58da4b917a Made improvements to scale up and down for streams and consumers.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 16:59:02 -08:00
Derek Collison
4e5150e3ae Bump to 2.7.4-beta.2
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 10:26:11 -08:00
Derek Collison
eb1ed5574d Merge pull request #2904 from nats-io/peer_remove_bad_consumer_state
[FIXED] Inconsistent durable consumer state after stream peer removal
2022-03-06 10:24:31 -08:00
Ivan Kozlovic
196319b106 [FIXED] JetStream: Some stream advisories missing
The "deleted" advisory was missing because the stream's send loop
was closed before the advisory was pushed to the queue to be sent.

Added tests, both for single and clustered mode to test all stream
advisories.

Resolves #2886

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-06 10:52:42 -07:00
Derek Collison
d52f607881 Merge pull request #2903 from nats-io/active_wrap
Set active for Sources and Mirrors to -1 on no contact
2022-03-06 09:45:23 -08:00
Ivan Kozlovic
be3e299587 Merge pull request #2902 from nats-io/gw_fix_race_set_first_ping_timer
Gateways: data race when setting first ping timer
2022-03-06 10:18:50 -07:00
Derek Collison
31a19729b0 When removing a stream peer with an attached durable consumer, the consumer could become inconsistent.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 05:42:22 -08:00
Derek Collison
4b9bc29e53 If we had not heard from a source or mirror we would still calculate the delta since now.
This would wrap and create a large number which overflowed JSON's 2^53 limit.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-05 12:46:55 -08:00
Derek Collison
037e3c6bbe Spiffied up monitoring landing page a bit
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-05 09:18:07 -08:00
Ivan Kozlovic
85b3f8a7fd Gateways: data race when setting first ping timer
This was introduced when fixing #2881. The call to setFirstPingTimer
needed to be done under the client's lock.

Moved setFirstPingTimer from a server receiver to a client receiver.
The only reason it was a server receiver is because we need the
server options, but c.srv is always set when invoking this function,
so we will get the server from c.srv in that function now.

Related to #2881

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-04 19:55:07 -07:00
Ivan Kozlovic
52e3b4b545 Merge pull request #2901 from nats-io/leaf_qgroup_weight
[FIXED] LeafNode: queue sub interest not properly sent to new LN
2022-03-04 18:37:04 -07:00
Ivan Kozlovic
1e53d81cb3 [FIXED] LeafNode: queue sub interest not properly sent to new LN
In complex situations, queue members count across various servers
may not be properly accounted for when sent to a new leafnode
connection.

The new test TestLeafNodeQueueGroupWithLateLNJoin has a drawing
of such a setup: after LN1 joined and queue members were then
removed with 1 left, LN1 was told that there was no
more interest, so a message published to LN1 would not reach
the remaining queue sub connected to LN2.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-04 17:03:06 -07:00
Derek Collison
b759ff481f Some users reporting checksums don't match and "no message cache" on recovery.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 11:50:15 -08:00
Derek Collison
1b5f651c22 Fixed bug that would not recover a stream after non-clean shutdown with deleted messages.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 10:48:10 -08:00
Derek Collison
77bce19379 Bump to 2.7.4-beta
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 10:08:14 -08:00
Derek Collison
ad6020ae72 Fix for #2885.
A filtered consumer that has no state, meaning no messages are being processed, will still receive updates to properly track the delivered sequence as it relates to the entire stream.
Since we did not have state we were inadvertently skipping the compaction logic for the raft store.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 08:53:16 -08:00