Waldemar Quevedo
fc51af9542
Fix for data race in memstore.LoadNextMsg (#4552)
2023-09-17 18:15:11 -07:00
Waldemar Quevedo
32021f66f1
Use write lock in memstore.LoadNextMsg
...
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2023-09-17 17:24:53 -07:00
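A minimal sketch of why a lookup can need the write lock, assuming the load path also updates internal bookkeeping. The `store` type, its `loads` field, and `loadNext` are hypothetical and are not the nats-server memstore code.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical in-memory store used only for illustration.
type store struct {
	mu    sync.RWMutex
	msgs  map[uint64]string
	loads int // bookkeeping that is mutated on every load
}

// loadNext returns the first message with a sequence in [start, last].
// Because it mutates s.loads, it takes the write lock. Holding only
// RLock would let concurrent callers race on that field.
func (s *store) loadNext(start, last uint64) (string, uint64, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.loads++
	for seq := start; seq <= last; seq++ {
		if m, ok := s.msgs[seq]; ok {
			return m, seq, true
		}
	}
	return "", 0, false
}

func main() {
	s := &store{msgs: map[uint64]string{2: "a", 5: "b"}}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.loadNext(3, 5)
		}()
	}
	wg.Wait()
	fmt.Println("loads:", s.loads)
}
```

Running a variant that uses `RLock` under `go run -race` would be expected to report the race on `loads`.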
Derek Collison
6f3805650b
[FIXED] Data race, protect access to c.acc (#4550)
...
Signed-off-by: Derek Collison <derek@nats.io>
Resolves #4549
2023-09-17 10:35:34 -07:00
Derek Collison
4e9cd9aa36
Protect access to c.acc
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-17 10:01:24 -07:00
Derek Collison
0d9328027f
Change code coverage GHA workflow to use main (#4546)
2023-09-16 11:00:04 -07:00
Neil
0283c4bc45
Add Raft goroutine labels, tweak logging (#4545)
...
This adds some more debugging information to the Raft goroutines in
pprof and improves the logging when a consumer was already running.
Example:
```
1 @ 0x1025b1838 0x1025c2ac8 0x102a47d1c 0x102a47244 0x102a858e0 0x1025e5ad4
# labels: {"account":"$SYS", "group":"_meta_", "type":"metaleader"}
# 0x102a47d1b github.com/nats-io/nats-server/v2/server.(*raft).runAsFollower+0xbb server/raft.go:1795
# 0x102a47243 github.com/nats-io/nats-server/v2/server.(*raft).run+0x2c3 server/raft.go:1715
# 0x102a858df github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine.func1+0x17f server/server.go:3609
1 @ 0x1025b1838 0x1025c2ac8 0x102a47d1c 0x102a47244 0x102a858e0 0x1025e5ad4
# labels: {"account":"$G", "group":"S-R3M-hn5zv7o3", "stream":"benchstream", "type":"stream"}
# 0x102a47d1b github.com/nats-io/nats-server/v2/server.(*raft).runAsFollower+0xbb server/raft.go:1795
# 0x102a47243 github.com/nats-io/nats-server/v2/server.(*raft).run+0x2c3 server/raft.go:1715
# 0x102a858df github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine.func1+0x17f server/server.go:3609
1 @ 0x1025b1838 0x1025c2ac8 0x102a49b60 0x102a47250 0x102a858e0 0x1025e5ad4
# labels: {"account":"$G", "consumer":"foobar", "group":"C-R3M-djqHTUCq", "stream":"benchstream", "type":"consumer"}
# 0x102a49b5f github.com/nats-io/nats-server/v2/server.(*raft).runAsLeader+0x4bf server/raft.go:2198
# 0x102a4724f github.com/nats-io/nats-server/v2/server.(*raft).run+0x2cf server/raft.go:1719
# 0x102a858df github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine.func1+0x17f server/server.go:3609
```
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-16 11:28:43 +01:00
Neil Twigg
1f9ddf2bbd
Add Raft goroutine labels, tweak logging
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-16 11:15:06 +01:00
Todd Beets
349e718d39
Changes for max log files option (active plus backups); remove redundant lexical sort of backups; adjust test
2023-09-15 22:08:09 -07:00
Derek Collison
7df0e42ce8
[FIXED] Fix for data race accessing consumer assignment (#4547)
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-15 16:48:00 -07:00
Todd Beets
46147cf0ea
Add logfile_max_archives feature and test.
2023-09-15 16:21:51 -07:00
Derek Collison
9781025b40
Fix for data race accessing consumer assignment
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-15 16:21:12 -07:00
Derek Collison
a5344c099f
AuthCallout request should include TLS data when client is NATS WS client (#4544)
...
Make sure the client handshake flag is set when TLS handshake is made as
part of WebSocket connection/upgrade (notionally HTTPS) rather than as
part of the NATS protocol TLS initiation chain. AuthCallout tests the
flag when building the data for the AuthCallout service request.
Added AuthCallout unit test for NATS WS client auth that requires the
TLS data.
2023-09-15 15:56:12 -07:00
Todd Beets
aed99441c6
Use preferred value tests (equal, not equal) rather than booleans for better fail logs
2023-09-15 14:41:41 -07:00
Todd Beets
7b0a12d7da
Add *tls.Conn safe type check as some black box unit tests override the natural underlying type for test purposes which would otherwise cause a panic
2023-09-15 13:52:41 -07:00
Todd Beets
40cf145ee6
Map both 127.0.0.1 and 127.0.1.1 to localhost for HTTPS server host validate
2023-09-15 13:13:24 -07:00
Byron Ruth
8b089b4a12
Change ref to main
...
Signed-off-by: Byron Ruth <byron@nats.io>
2023-09-15 16:12:46 -04:00
Todd Beets
75d2ddb26b
AuthCallout request should include TLS data when client is NATS WS client
2023-09-15 12:36:34 -07:00
Derek Collison
0af378cf85
Bump to 2.10.0-RC.4
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-15 08:54:27 -07:00
Derek Collison
d7c66e753f
[FIXED] Possible panic in consumer, needed to recheck if consumer was closed (#4541)
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-15 08:53:57 -07:00
Derek Collison
097e4097d1
Allow longer times due to travis slowdowns
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-15 08:52:50 -07:00
Derek Collison
f2e7ed91cb
Fix for panic in consumer, needed to recheck if consumer was closed
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-15 08:40:21 -07:00
Waldemar Quevedo
8f84ea4224
Bump to 2.10.0-RC.3 (#4537)
2023-09-14 12:16:56 -07:00
Waldemar Quevedo
76cbef79cc
Bump to 2.10.0-RC.3
...
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2023-09-14 12:11:09 -07:00
Derek Collison
56c5e4aede
[IMPROVED] Consumer cleanup monitoring and FIX for datarace (#4536)
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-14 11:57:30 -07:00
Derek Collison
22f40eafa0
Add jitter in case many consumers all try to clean up at the same time
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-14 11:24:32 -07:00
Derek Collison
392f25b6da
Fix for data race and adjustment to do a backoff on making sure consumers are cleaned up.
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-14 11:21:11 -07:00
Waldemar Quevedo
b79b180498
flake: Fix TestJetStreamConsumerAckFloorFill (#4534)
...
The test can sometimes fail on the first check of the ack floor but pass
when checked again.
2023-09-14 10:02:05 -07:00
Waldemar Quevedo
db0faf4538
flake: Fix TestJetStreamConsumerAckFloorFill
...
The test can sometimes fail on the first check of the ack floor
but pass when checked again.
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2023-09-14 09:31:38 -07:00
Neil Twigg
f38faafbc9
Bump to 2.10.0-RC.2
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-14 16:35:36 +01:00
Neil
46361e86a3
Fix leaking timers in stream sources (#4532)
...
Repeated calls to `scheduleSetSourceConsumerRetry` could end up creating
multiple timers for the same source, which would eventually schedule
even more timers, which would result in runaway CPU usage. This PR
instead bounds to one timer per source per stream.
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-14 16:32:36 +01:00
Neil
f259207270
De-flake TestJetStreamClusterAccountPurge (#4533)
...
This adds a new `waitForAccount` test helper that ensures that an
account exists across the cluster, and updates
`TestJetStreamClusterAccountPurge` to use it after submitting new JWTs.
This should prevent `require no error, but got: nats: JetStream not
enabled for account` errors.
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-14 13:34:54 +01:00
Neil Twigg
904f4c388e
De-flake TestJetStreamClusterAccountPurge by waiting for account to exist
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-14 11:40:30 +01:00
Neil Twigg
6f3f544841
Fix leaking timers in stream sources
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-14 10:30:24 +01:00
Derek Collison
ea93e77b7f
[FIXED] Fix for a call into mb.recalculateFirstForSubj() that did not hold lock. (#4530)
...
This unprotected access most likely allowed the cache to be flushed,
after which a subsequent writeMsgRecord would see an offset > slot value.
That cannot happen when the lock is held, because the cache is loaded
properly at the beginning of the function.
Signed-off-by: Derek Collison <derek@nats.io>
Resolves #4529
2023-09-13 16:26:39 -07:00
Derek Collison
787f6acf31
Fix for a call into fs.recalculateFirstForSubj() from fs.recalculateFirstForSubj() that did not lock the mb properly.
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-13 15:35:34 -07:00
Neil
c7d5441900
Bump to 2.10.0-RC.1 (#4527)
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-13 17:22:52 +01:00
Neil Twigg
505190266a
Bump to 2.10.0-RC.1
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-13 17:22:30 +01:00
Neil
6004865e1d
Update nats.go to v1.29.0 (#4528)
...
This updates `nats.go` to v1.29.0.
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-13 17:22:13 +01:00
Neil Twigg
7b85fd1045
Update nats.go to v1.29.0
...
Signed-off-by: Neil Twigg <neil@nats.io>
2023-09-13 16:24:30 +01:00
Neil
cb484eeb43
Update docker nightly builds (#4526)
...
main now holds the contents of the dev branch; the old nightly-main
points to the last release for now.
2023-09-13 16:01:33 +01:00
Waldemar Quevedo
643d49d8fe
Update docker nightly builds
...
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-09-13 07:39:11 -07:00
Phil Pennock
ff6216d84f
BRANCH SURGERY: main is now the prior contents of dev
2023-09-13 10:34:06 -04:00
Neil
59bc094e86
[FIXED] Increased AckWait in TestMQTTQoS2RetriesPublish, TestMQTTQoS2RetriesPubRel (#4518)
...
Increased AckWait in TestMQTTQoS2RetriesPublish to 100ms, and in
TestMQTTQoS2RetriesPubRel to 50ms.
A lower value caused another PUBLISH to be fired while the test was
still processing the final QoS2 flow. Also reduced the number of retries
the test waits for, to make it a little quicker.
2023-09-13 11:49:17 +01:00
Neil
3f6ac9b675
flake: Fixes TestServerOperatorModeUserInfoExpiration (#4525)
2023-09-13 11:49:01 +01:00
Piotr Piotrowski
e08442fbfc
flake: Fixes TestServerOperatorModeUserInfoExpiration
...
Signed-off-by: Piotr Piotrowski <piotr@synadia.com>
2023-09-13 11:57:53 +02:00
Derek Collison
23aab24323
Merge branch 'main' into dev
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-12 19:44:18 -07:00
Derek Collison
e7aa54f728
[FIXED] Test fix for TestNoRaceJetStreamClusterStreamReset (#4521)
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-12 19:42:28 -07:00
Derek Collison
58b5fc4abf
Fix for a flapper
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-12 17:19:30 -07:00
Derek Collison
3407eda769
Bump to 2.10.0-beta.56
...
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-12 15:52:00 -07:00
Derek Collison
a345a4040e
[FIXED] Fix for panic from a bug in selecting a filestore block in new fs code. (#4520)
...
When the number of blocks was > 32 and the new binary search in
NumPending() was used, block selection could return -1, nil. If the
sequence is inclusive, this should always return a valid index and mb.
It could return -1 because we were not accounting for gaps:
mb.first.seq can move ahead as the first message is removed. The panic
could orphan held locks for the filestore, consumer and possibly stream,
which would lock up the system, leading to memory growth and unstable
behavior.
Signed-off-by: Derek Collison <derek@nats.io>
2023-09-12 15:51:24 -07:00
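A minimal sketch of a gap-tolerant binary search over blocks, assuming blocks are sorted by sequence and do not overlap. The `block` type and `selectBlock` function are hypothetical and are not the filestore code. Searching on each block's last sequence means a sequence that falls in a gap resolves to the next block rather than -1.

```go
package main

import (
	"fmt"
	"sort"
)

// block is a hypothetical message block covering sequences [first, last].
type block struct {
	first, last uint64
}

// selectBlock returns the index of the first block whose last sequence
// is >= seq. A seq that lands in a gap between blocks selects the next
// block. It returns -1 only when seq is past the end of the final block.
func selectBlock(blks []block, seq uint64) int {
	i := sort.Search(len(blks), func(i int) bool {
		return blks[i].last >= seq
	})
	if i == len(blks) {
		return -1
	}
	return i
}

func main() {
	// Gaps exist between 10 and 15, and between 20 and 31.
	blks := []block{{1, 10}, {15, 20}, {31, 40}}
	for _, seq := range []uint64{1, 12, 25, 41} {
		fmt.Println(seq, selectBlock(blks, seq))
	}
}
```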