Commit Graph

3582 Commits

Author SHA1 Message Date
Derek Collison
e4ebc4648e When a stream or consumer was offline we would not properly respond to a delete.
We also would hang if no stream info requests were sent during a stream list due to the asset being offline.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-15 21:11:23 -07:00
Ivan Kozlovic
5c0d1999ff Bump version
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 14:21:30 -07:00
Ivan Kozlovic
773636c1c5 Release v2.7.4
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:58:33 -07:00
Ivan Kozlovic
b4128693ed Ensure file path is correct during stream restore
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.

Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:31:51 -07:00
Ivan Kozlovic
0cb0f6d380 Merge pull request #2914 from nats-io/fix_2913
[FIXED] Consumer with no activity can lose quorum
2022-03-09 11:55:50 -07:00
Matthias Hanel
9a2da9ed8c Adding denies $KV.>/$OBJ.> along leaf connections on differing domain (#2916)
* Adding denies $KV.>/$OBJ.> along leaf connections on differing domain

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-09 13:17:59 -05:00
Ivan Kozlovic
5a97ee6b64 Merge pull request #2911 from nats-io/fix_lock_inversions
[FIXED] Some lock inversions
2022-03-09 10:31:49 -07:00
Ivan Kozlovic
3538aea34e Merge pull request #2915 from nats-io/fix_atomic_unaligned
[FIXED] Panic when monitoring enabled on non 64bit architectures
2022-03-09 10:30:50 -07:00
Ivan Kozlovic
0fae8067ae [FIXED] Some lock inversions
The established ordering is client -> Account, so fixed few places
where we had Account -> client.

Added a new file, locksordering.txt with the list of known ordering
for some of the objects.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 09:47:37 -07:00
Ivan Kozlovic
dde235a92e [FIXED] Panic when monitoring enabled on non 64bit architectures
This is due to an unaligned 64-bit atomic operation. Move the
field at top of structure with 64-bit aligned preceding fields.

Resolves #2011

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 09:29:29 -07:00
Derek Collison
3216eb5ee5 When a consumer has no state we are now compacting the log, but were not snapshotting.
This caused issues on leader change and losing quorum.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-09 07:21:25 -05:00
Matthias Hanel
d0c183106a Fixed lock inversion by not using account lock to get the name
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-07 21:22:41 -05:00
Derek Collison
58da4b917a Made improvements to scale up and down for streams and consumers.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 16:59:02 -08:00
Derek Collison
4e5150e3ae Bump to 2.7.4-beta.2
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 10:26:11 -08:00
Derek Collison
eb1ed5574d Merge pull request #2904 from nats-io/peer_remove_bad_consumer_state
[FIXED] Inconsistent durable consumer state after stream peer removal
2022-03-06 10:24:31 -08:00
Ivan Kozlovic
196319b106 [FIXED] JetStream: Some stream advisories missing
The "deleted" advisory was missing because the stream's send loop
was closed before the advisory was pushed to the queue to be sent.

Added tests, both for single and clustered mode to test all stream
advisories.

Resolves #2886

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-06 10:52:42 -07:00
Derek Collison
d52f607881 Merge pull request #2903 from nats-io/active_wrap
Set active for Sources and Mirrors to -1 on no contact
2022-03-06 09:45:23 -08:00
Ivan Kozlovic
be3e299587 Merge pull request #2902 from nats-io/gw_fix_race_set_first_ping_timer
Gateways: data race when setting first ping timer
2022-03-06 10:18:50 -07:00
Derek Collison
31a19729b0 When removing a stream peer with an attached durable consumer, the consumer could become inconsistent.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 05:42:22 -08:00
Derek Collison
4b9bc29e53 If we had not heard from a source or mirror we would still calculate the delta since now.
This would wrap and create a large number which overflowed JSON's 2^53 limit.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-05 12:46:55 -08:00
Derek Collison
037e3c6bbe Spiffied up monitoring landing page a bit
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-05 09:18:07 -08:00
Ivan Kozlovic
85b3f8a7fd Gateways: data race when setting first ping timer
This was introduced when fixing #2881. The call to setFirstPingTimer
needed to be done under the client's lock.

Moved setFirstPingTimer from a server receiver to a client receiver.
The only reason it was a server receiver is because we need the
server options, but c.srv is always set when invoking this function,
so we will get the server from c.srv in that function now.

Related to #2881

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-04 19:55:07 -07:00
Ivan Kozlovic
52e3b4b545 Merge pull request #2901 from nats-io/leaf_qgroup_weight
[FIXED] LeafNode: queue sub interest not properly sent to new LN
2022-03-04 18:37:04 -07:00
Ivan Kozlovic
1e53d81cb3 [FIXED] LeafNode: queue sub interest not properly sent to new LN
In complex situations, queue members count across various servers
may not be properly accounted for when sent to a new leafnode
connection.

The new test TestLeafNodeQueueGroupWithLateLNJoin has a drawing
of such setup, when after LN1 joined, and then queue members
were removed with 1 left, LN1 was told that there was no
more interest, so message published to LN1 would not reach
the remaining queue sub connected to LN2.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-04 17:03:06 -07:00
Derek Collison
b759ff481f Some users reporting checksums don't match and "no message cache" on recovery.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 11:50:15 -08:00
Derek Collison
1b5f651c22 Fixed bug that would not recover a stream after non-clean shutdown with deleted messages.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 10:48:10 -08:00
Derek Collison
77bce19379 Bump to 2.7.4-beta
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 10:08:14 -08:00
Derek Collison
ad6020ae72 Fix for #2885.
When a filtered consumer who has no state, meaning no messages are being processed, it still will receive updates to properly track the delivered sequence as it relates to the entire stream.
Since we did not have state we were inadvertently skipping the compaction logic for the raft store.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 08:53:16 -08:00
Derek Collison
30009fdd78 Merge pull request #2897 from nats-io/js-raft-logging
Better startup logging to help debug RAFT to streams/consumers.
2022-03-03 11:26:09 -07:00
Derek Collison
11cad6be6b In the process of working on #2885 with a user, I was struggling to map $SYS directories to consumer names.
This change allows a bit better logging on startup to more easily map a RAFT log directory etc to the stream/consumer.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-03 09:50:00 -08:00
Ivan Kozlovic
dfe96944d2 [FIXED] JetStream stream info consumers count in clustered mode
In clustering mode, the number of consumers in stream info may be
wrong in presence of non durable consumers. Ephemeral are handled
by specific nodes. The StreamInfo response would contain only the
consumer count that the stream leader is handling.

This fix overrides the stream's state consumers count with the
number of consumers from the stream assignment record.

Resolves #2895

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-03 09:46:35 -07:00
Ivan Kozlovic
ab2b9f3fbf Release v2.7.3
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-02-24 13:32:53 -07:00
Ivan Kozlovic
0d6f303309 Bump version to v2.7.3-beta.3
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-02-23 17:00:29 -07:00
Ivan Kozlovic
08d6aaa78f [FIXED] Gateway: connect could fail due to PING sent before CONNECT
When a gateway connection was created (either accepted or initiated)
the timer to fire the first PING was started at that time, which
means that for an outbound connection, if the INFO coming from
the other side was delayed, it was possible for the outbound to
send the PING protocol before the CONNECT, which would cause
the accepting side to close the connection due to a "parse" error
(since the CONNECT for an inbound is supposed to be the very
first protocol).

Also noticed that we were not setting the auth timer like we do
for the other type of connections. If authorization{timeout:<n>}
is not set, the default is 2 seconds.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-02-23 15:19:20 -07:00
Ivan Kozlovic
b7917e1d05 Various changes
- Limit IPQueue logging
- Add a rand per raft node. This is because in some situations when
running 3 servers at the same time, we would end-up with identical
inboxes for different subjects on the different nodes which would
cause panics
- Move the creation of internal subscriptions after the tracking
of the node and its peers.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-02-22 16:22:32 -07:00
Derek Collison
58806c1290 Bump to 2.7.3-beta.2
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-17 12:41:30 -08:00
Derek Collison
2942d012f6 Merge pull request #2878 from nats-io/key_file_leak
Cleanup key files when removing message blocks.
2022-02-17 13:26:41 -07:00
Derek Collison
330a40009c Cleanup key files when removing message blocks.
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-17 11:33:41 -08:00
Derek Collison
255a7c3792 Merge pull request #2875 from nats-io/issue_2873
[FIXED] Interest policy and staggered filtered consumers could fail to remove messages.
2022-02-17 12:31:24 -07:00
Derek Collison
4efce40bbd Small improvements to send performance to a full stream.
Cleaned up some locking and if fifo make index updates lazy like writeMsgRecord.

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-17 05:39:27 -08:00
Derek Collison
1c8f7de848 On filtered subjects when consumers were staggered we need to disqualify a filtered consumer if not applicable.
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-16 18:24:27 -08:00
Derek Collison
ca1132a01d Allow stream placement by tags.
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-15 17:07:32 -08:00
Derek Collison
fb15dfd9b7 Allow replica updates during stream update.
Also add in HAAssets count to Jsz.

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-13 19:33:46 -08:00
Derek Collison
5a93b0e9d8 Allow pull requests to specify a heartbeat when idle to detect when a request is invalidated.
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-11 09:51:51 -08:00
R.I.Pienaar
6bb0861eb7 avoid seg fault when stream restore fails
Signed-off-by: R.I.Pienaar <rip@devco.net>
2022-02-11 10:45:09 +01:00
Ivan Kozlovic
55ffde7251 Fixed consumer dlv count and num pending wrong due to redeliveries
Introduced by #2848, so should not have impacted existing releases.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-02-10 17:33:53 -07:00
Derek Collison
9010f3542a Bump to 2.7.3-beta
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-09 17:36:15 -08:00
Derek Collison
6fcd87935c Merge pull request #2859 from nats-io/fss_cleanup
Remove fss files from a snapshot when a block is removed.
2022-02-09 17:33:29 -08:00
Derek Collison
68104d7cf3 During a filestore snapshot we generate the fss files but were not cleaning them up if the block was deleted before a server restart.
https://gist.github.com/nekufa/010185dfb59261f222a0042d3a7d2a1c

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-09 17:12:08 -08:00
Derek Collison
ecfe42630a Merge pull request #2858 from nats-io/add_consumer_with_info
Make sure we snapshot initial consumer info during consumer creation.
2022-02-09 17:05:01 -08:00