Commit Graph

323 Commits

Author SHA1 Message Date
Derek Collison
4c8110c3ff Add in support for inactivity thresholds for durable consumers.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-14 06:51:00 -07:00
Derek Collison
fb51162e37 Make error on make bytes exceeded on a pull request a 409
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-07 06:59:08 -07:00
Derek Collison
72ed48d096 Merge pull request #3149 from nats-io/pull_perf_stable
[FIXED] Spurious pull consumer 408s under load
2022-05-25 09:44:27 -07:00
Derek Collison
d69394efad Fix spurious 408s under load and move processing of acks to their own Go routine.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-25 09:27:34 -07:00
Derek Collison
46f7f7bfc9 Consumer pending was not correct when stream had max msgs per subject set > 1 and a consumer that filtered out part of the stream was created.
Also make sure to update stream's config on a stream restore in case of changes.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-24 14:44:15 -07:00
Ivan Kozlovic
53e3c53d96 [FIXED] JetStream: consumer with deliver new may miss messages
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
step down or server restart.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-23 12:01:48 -06:00
Derek Collison
790d643431 Consumer's num pending can now rely on the stream's store vs trying to maintain furing runtime which could be wrong under certain conditions.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-20 08:45:43 -07:00
Derek Collison
41cca8d6c4 Allow proper mix and match of consumer stores and stream stores.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-18 12:51:48 -07:00
Derek Collison
906eb332fc Make sure consumer store is memory based when selected
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-17 18:48:27 -07:00
Derek Collison
50be0a6599 Allow explicit configuration of consumer's replica count and allow a consumer to force memory storage.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-16 19:03:56 -07:00
Derek Collison
6bbc5f627c Support for MaxBytes for pull requests.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-16 08:43:33 -07:00
Ivan Kozlovic
5c3be1ee68 [FIXED] JetStream: panic processing cluster consumer create
Before PR #3099, `waitQueue.isEmpty()` returned `wq.len() == 0`
and `waitQueue.len()` was protecting against the pointer being
nil (and then return 0).

The change in #3099 caused `waitQueue.isEmpty()` to return `wq.n == 0`,
which means that if `wq` was nil, then it would crash.

This PR restores `waitQueue.isEmpty()` to return `wq.len() == 0` and
add the protection for waitQueue being nil in `len()` similar to
how it was prior to PR #3099.

Resolves #3117

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-11 11:03:50 -06:00
Ivan Kozlovic
cadf921ed1 [FIXED] JetStream: PullConsumer MaxWaiting==1 and Canceled requests
There was an issue with MaxWaiting==1 that was causing a request
with expiration to actually not expire. This was because processWaiting
would not pick it up because wq.rp was actually equal to wq.wp
(that is, the read pointer was equal to write pointer for a slice
of capacity of 1).

The other issue was that when reaching the maximum of waiting pull
requests, a new request would evict an old one with a "408 Request Canceled".

There is no reason for that, instead the server will first try to
find some existing expired requests (since some of the expiration
is lazily done), but if none is expired, and the queue is full,
the server will return a "409 Exceeded MaxWaiting" to the new
request, and not a "408 Request Canceled" to an old one...

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-03 15:17:20 -06:00
Ivan Kozlovic
5050092468 [FIXED] JetStream: possible lock inversion
When updating usage, there is a lock inversion in that the jetStream
lock was acquired while under the stream's (mset) lock, which is
not correct. Also, updateUsage was locking the jsAccount lock, which
again, is not really correct since jsAccount contains streams, so
it should be jsAccount->stream, not the other way around.

Removed the locking of jetStream to check for clustered state since
js.clustered is immutable.

Replaced using jsAccount lock to update usage with a dedicated lock.

Originally moved all the update/limit fields in jsAccount to new
structure to make sure that I would see all code that is updating
or reading those fields, and also all functions so that I could
make sure that I use the new lock when calling these. Once that
works was done, and to reduce code changes, I put the fields back
into jsAccount (although I grouped them under the new usageMu mutex
field).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-02 09:50:32 -06:00
Ivan Kozlovic
d4d37e67f4 [FIXED] JetStream: file store compact and when to write index
When deciding to compact a file, we need to remove from the raw
bytes the empty records, otherwise, for small messages, we would
end-up calling compact() too many times.

When removing a message from the stream, in FIFO cases we would
write the index every 2 seconds at most when doing it in place,
when when dealing with out of order deletes, we would do it for
every single delete, which can be costly. We are now writing
only every 500ms for non FIFO cases.

Also fixed some unrelated code:
- Decision to install a snapshot was based on incorrect logical
expression
- In checkPending(), protect against the timer being nil which
could happen when consumer is stopped or leadership change.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-28 12:35:19 -06:00
Matthias Hanel
d520a27c36 [fixed] step down timing, consumer stream seqno, clear redelivery (#3079)
Step down timing for consumers or streams.
Signals loss of leadership and sleeps before stepping down.
This makes it less likely that messages are being processed during step
down.

When becoming leader, consumer stream seqno got reset,
even though the consumer existed already.

Proper cleanup of redelivery data structures and timer

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-27 03:32:08 -04:00
Derek Collison
f702e279ab Fix for a consumer recovery issue.
Also update healthz to check all assets that are assigned, not just running.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-26 19:22:19 -07:00
Leander Kohler
966d9d56f4 Add JSConsumerDeliveryNakAdvisory
The advisory `JSAdvisoryConsumerMsgNakPre` will be triggered
when a message is naked
2022-04-25 16:13:32 +02:00
Ivan Kozlovic
b9463b322f [FIXED] JetStream: stream mirror issues in mixed mode clusters
Similar to PR #3061

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-20 23:21:15 -06:00
Derek Collison
656a9534a5 Bump flow control window back up.
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-18 05:33:20 -07:00
Matthias Hanel
79b4374d01 [Fixed] limits enforcement issues (#3046)
* [Fixed] limits enforcement issues

stream create had checks that stream restore did not have.
Moved code into commonly used function checkStreamCfg.
Also introduced (cluster/non clustered) StreamLimitsCheck functions to
perform checks specific to clustered /non clustered data structures.

Checking for valid stream config and limits/reservations before
receiving all the data. Now fails the request right away.

Added a jetstream limit "max_request_batch" to limit fetch batch size

Shortened max name length from 256 to 255, more common file name limit

Added check for loop in cyclic source stream configurations

features related to limits

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-18 01:53:48 -04:00
Ivan Kozlovic
eb4856e4a7 Cleanup timers on consumer leader change
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-16 13:37:46 -06:00
Ivan Kozlovic
fc873c6f2f Return limit in consumer max_ack_pending limit exceeded
- Updated tests that were checking for the error to include the limit
- Moved some tests above the benchmark ones

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-15 18:23:25 -06:00
Ivan Kozlovic
0e841d4acf Tweak ordered consumer flow control and bump to beta.18
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-14 17:43:43 -06:00
Derek Collison
04cce6df68 Merge pull request #3020 from nats-io/move-updates
[IMPROVED] Raft layer for general stability and leader election.
2022-04-11 17:33:13 -07:00
Matthias Hanel
13e5ab10bd fix js nex interest check where leaf node masked gw subj propagation (#3016)
basically a gw subject propagation issue could be hidden behind a leaf
node.
also change error text when this was the case

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-11 14:04:09 -04:00
Derek Collison
37cbac99e7 Improvements to the raft layer for general stability and support of scale up and down and asset move.
Also fixed a bug that would allow a leadership transfer when catching up.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:59:39 -07:00
Derek Collison
7e38ebcb6e Allow assets such as streams and their associated consumers to migrate between clusters.
The system will allow an update to a stream, and subsequently all attached consumers, to be placed in another cluster either directly or via tag placement.
The meta layer will scale the underlying peerset appropriately to straddle the two clusters for both the stream and consumers, taking into account the consumer type.
Control will then pass to the current leaders of the assets who will monitor the catchup status of the new peers.
(Note we can optimize this later to only traverse once across a GW for any given asset, but for now this is simpler)
Once the original leaders have determined the assets are synched it will pass leadership to a member of the new peerset.
Once the new leader has been elected, it will forward a request for the meta layer to shrink the peerset by removing the old peers.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-04 18:28:36 -07:00
Matthias Hanel
92f4dc986a added max_ack_pending setting to js account limits (#2982)
* added max_ack_penind setting to js account limits

because of the addition, defaults now have to be set later (depend on
these new limits now)

also re-organized the code to closer track how stream create looks

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-31 14:17:16 -04:00
Ivan Kozlovic
4ddbdbd74c Rewrite trackDownAccountAndInterest() to make it easier to read
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-30 16:41:22 -06:00
Ivan Kozlovic
c0ab2d4959 [FIXED] Possible panic due to data races
A panic was reported that looked like this:
```
fatal error: concurrent map read and map write
goroutine 200 [running]:
runtime.throw({0xa366ce, 0xe620e0})
	/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/panic.go:1198 +0x71 fp=0xc00105f098 sp=0xc00105f068 pc=0x434ff1
runtime.mapaccess1_faststr(0x0, 0x0, {0xc0054b6f18, 0x11})
	/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/map_faststr.go:21 +0x3a5 fp=0xc00105f100 sp=0xc00105f098 pc=0x412285"
github.com/nats-io/nats-server/v2/server.(*consumer).processNextMsgReq(0xc000681000, 0xc00105f2a8, 0x4503e9, 0x11, {0x0, 0xc000246900}, {0xc0054b6f18, 0x11}, {0xc0002469c4, 0x90, ...})
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/consumer.go:2454 +0x8ce fp=0xc00105f250 sp=0xc00105f100 pc=0x77dc2e
github.com/nats-io/nats-server/v2/server.(*consumer).processNextMsgReq-fm(0x9c, 0x7f302e954fff, 0xc00105f2f8, {0xc000774280, 0x400}, {0xc0054b6f18, 0x40}, {0xc0002469c4, 0x90, 0x63c})
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/consumer.go:2380 +0x77 fp=0xc00105f2b8 sp=0xc00105f250 pc=x91e337
github.com/nats-io/nats-server/v2/server.(*client).deliverMsg(0xc0015f8000, 0xc003034f00, 0x41642f, {0xc000246969, 0x4b6166, 0x697}, {0xc0002469a9, 0x4b60be, 0x657}, {0xc0015f9480, ...}, ...)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3180 +0xbb0 fp=0xc00105f530 sp=0xc00105f2b8 pc=0x764470
github.com/nats-io/nats-server/v2/server.(*client).processMsgResults(0xc0015f8000, 0x8cd7a5, 0xc0089fb440, {0xc0002469c4, 0x92, 0x63c}, {0x0, 0x0, 0x4}, {0xc000246969, ...}, ...)
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:4163 +0x9af fp=0xc00105fa48 sp=0xc00105f530 pc=0x769e4f
github.com/nats-io/nats-server/v2/server.(*client).processInboundRoutedMsg(0xc0015f8000, {0xc0002469c4, 0xc0015f8220, 0x63c})
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:443 +0x159 fp=0xc00105fae8 sp=0xc00105fa48 pc=0x8ce299
github.com/nats-io/nats-server/v2/server.(*client).processInboundMsg(0xc0015f8000, {0xc0002469c4, 0x92, 0x79e})
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3493 +0x36 fp=0xc00105fb18 sp=0xc00105fae8 pc=0x765c76
github.com/nats-io/nats-server/v2/server.(*client).parse(0xc0015f8000, {0xc000246800, 0x800, 0xc087258a5d30c937})
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/parser.go:497 +0x246a fp=0xc00105fd98 sp=0xc00105fb18 pc=0x8a4f6a
github.com/nats-io/nats-server/v2/server.(*client).readLoop(0xc0015f8000, {0x0, 0x0, 0x0})"
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1227 +0xe1f fp=0xc00105ffb0 sp=0xc00105fd98 pc=0x75841f
github.com/nats-io/nats-server/v2/server.(*Server).createRoute.func1()
	/home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:1372 +0x25 fp=0xc00105ffe0 sp=0xc00105ffb0 pc=0x8d46a5
runtime.goexit
```

Writting a test showed the data race:
```
==================
WARNING: DATA RACE
Read at 0x00c0008ea240 by goroutine 62:
  runtime.mapaccess1_faststr()
      /usr/local/go/src/runtime/map_faststr.go:12 +0x0
  github.com/nats-io/nats-server/v2/server.(*consumer).processNextMsgRequest()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/consumer.go:2567 +0xa64
(...)
Previous write at 0x00c0008ea240 by goroutine 15:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:300 +0x0
  github.com/nats-io/nats-server/v2/server.(*Account).checkForReverseEntry()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1759 +0x61c
  github.com/nats-io/nats-server/v2/server.(*client).unsubscribe()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:2838 +0xa27
(...)
```

After fixing this data race, another showed up:
```
==================
WARNING: DATA RACE
Read at 0x00c000352200 by goroutine 99:
  github.com/nats-io/nats-server/v2/server.(*Account).checkForReverseEntry()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1752 +0x4b3
  github.com/nats-io/nats-server/v2/server.(*client).unsubscribe()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:2838 +0xa27
(...)
Previous write at 0x00c000352200 by goroutine 92:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:284 +0x0
  github.com/nats-io/nats-server/v2/server.(*Account).checkForReverseEntry()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1737 +0x871
  github.com/nats-io/nats-server/v2/server.(*Account).removeRespServiceImport()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/accounts.go:1622 +0x24c
(...)
```

This PR addresses both.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-30 13:51:52 -06:00
Derek Collison
eb16c35016 OrderedConsumer was very conservative with slow start and small max outstanding bytes. This is increasing perf for longer rtt.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-30 05:08:36 -07:00
Derek Collison
bfc1462fb3 Merge pull request #2973 from nats-io/issue-2936
[IMPROVED] Consumer snapshot logic in clustered mode and disk usage.
2022-03-29 18:29:31 -07:00
Derek Collison
607858f213 Improved consumer snapshot logic in clustered mode and disk usage.
Also fixed a bug that could cause memory based replicated consumers to no longer work after snapshots and server restarts.

The snapshot logic would allow non-state changing updates to continously grow the raft logs. We also were too conservative on when we snapshotted and why.
Also added in ability to have FileStore.Compact() reclaim space from the block file from the head of last changed block.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-29 18:02:49 -07:00
Ivan Kozlovic
e1c581334e [CHANGED] JetStream: lower default consumer's maximum ack pending
The default value is lowered from 20,000 to 1,000. This does not
seem to have a performance degradation impact, but may help
with scalability at scale.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-29 15:30:40 -06:00
Matthias Hanel
1aeaaf0ca3 Adding server limits (max ack pending/dedupe window) to js config (#2967)
* Adding server limits (max ack pending/dedupe window) to js config

Also shifting consumer config check to jsConsumerCreate as in clustered
mode this was enforced in the wrong place

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-29 13:19:36 -04:00
Matthias Hanel
0c5f3688a7 [ADDED] Tiered limits and fix limit issues on updates (#2945)
* Adding tiered limits and fix limit issues on updates

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-28 20:47:54 -04:00
Ivan Kozlovic
25886e8819 [FIXED] JetStream: sampling not updated during consumer update
Resolves #2941

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-28 10:58:58 -06:00
Ivan Kozlovic
5e89374ee9 Fixed another possible lock inversion consumer->stream
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-25 12:21:51 -06:00
Ivan Kozlovic
4739eebfc4 [FIXED] JetStream: possible deadlock during consumer leadership change
Would possibly show up when a consumer leader changes for a consumer
that had redelivered messages and for instance messages were inbound
on the stream.

Resolves #2912

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-25 12:21:51 -06:00
Derek Collison
ef8f543ea5 Improve memory usage through JetStream storage layer.
Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoimd copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.

The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.

Also fixed some flappers and a bug where de-dupe might not be reformed correctly.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-24 17:45:15 -06:00
Ivan Kozlovic
2253bb6f1a JS: BackOff list caused too frequent checkPending() calls
Since the "next" timer value is set to the AckWait value, which
is the first element in the BackOff list if present, the check
would possibly happen at this interval, even when we were past
the first redelivery and the backoff interval had increased.

The end-user would still see the redelivery be done at the durations
indicated by the BackOff list, but internally, we would be checking
at the initial BackOff's ack wait.

I added a test that uses the store's interface to detect how many
times the checkPending() function is invoked. For this test it
should have been invoked twice, but without the fix it was invoked
15 times.

Also fixed an unrelated test that could possibly deadlock causing
tests to be aborted due to inactivity on Travis.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-23 12:46:17 -06:00
Ivan Kozlovic
c3da392832 Changes to IPQueues
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have beend pop()'ed. When recycle() is invoked this count
goes down.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-17 17:53:06 -06:00
Ivan Kozlovic
fe6d7b305f Merge pull request #2898 from nats-io/js_cons_ack_processing
[CHANGED] JetStream: Redeliveries may be delayed if necessary
2022-03-17 10:57:22 -06:00
Derek Collison
dbfa47f9b1 Improve state preservation for consumers, specifically DeliverNew variants when no activity has been present.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-16 20:55:14 -07:00
Derek Collison
3216eb5ee5 When a consumer has no state we are now compacting the log, but were not snapshotting.
This caused issues on leader change and losing quorum.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-09 07:21:25 -05:00
Derek Collison
58da4b917a Made improvements to scale up and down for streams and consumers.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 16:59:02 -08:00
Derek Collison
31a19729b0 When removing a stream peer with an attached durable consumer, the consumer could become inconsistent.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-06 05:42:22 -08:00
Ivan Kozlovic
804ce102ac [CHANGED] JetStream: Redeliveries may be delayed if necessary
We have seen situations where when a lot of pending messages accumulate,
there is a contention between the processing of the ACKs and the
checking of the pending map.

Decision is made to abort checking of pending list if processing of
ack(s) would be delayed because of that. The result is that a
redelivery may be post-poned.

Internally, the ACKs are also now using a queue to prevent processing
of them from the network handler, which could cause head-of-line
blocking, especially bad for routes.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-03 10:32:35 -07:00
Derek Collison
1c8f7de848 On filtered subjects when consumers were staggered we need to disqualify a filtered consumer if not applicable.
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-16 18:24:27 -08:00