Commit Graph

5799 Commits

Author SHA1 Message Date
Derek Collison
906eb332fc Make sure consumer store is memory based when selected
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-17 18:48:27 -07:00
Derek Collison
bb9e942208 Merge pull request #3129 from nats-io/jetstream-republish
Enable republishing of messages once stored in a stream.
2022-05-17 15:54:22 -07:00
Derek Collison
e3249d8b6c Move cfg check for republish to common func
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-17 15:33:43 -07:00
Derek Collison
c166c9b199 Enable republishing of messages once stored in a stream.
This enables lightweight distribution of messages to a very large number of NATS subscribers.
We add metadata as headers that allows for gap detection, which enables initial values (via JetStream, maybe KV) plus realtime NATS core updates, all globally ordered.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-17 15:18:54 -07:00
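The gap detection the commit above describes can be sketched from the subscriber side. This is an illustrative sketch only: it assumes each republished message carries a last-sequence header (the name `Nats-Last-Sequence` is an assumption here; check the server documentation for the exact metadata), and `detectGap` is a hypothetical helper, not server code.

```go
package main

import (
	"fmt"
	"strconv"
)

// detectGap reports whether any republished messages were missed, by
// comparing the last sequence we processed against the message's
// last-sequence header (header name is an assumption, see lead-in).
func detectGap(lastSeen uint64, lastSeqHeader string) (bool, error) {
	lastSeq, err := strconv.ParseUint(lastSeqHeader, 10, 64)
	if err != nil {
		return false, err
	}
	// A gap exists when the stream's previous sequence does not match
	// the last sequence this subscriber processed.
	return lastSeq != lastSeen, nil
}

func main() {
	gap, _ := detectGap(41, "41") // contiguous: no gap
	fmt.Println(gap)
	gap, _ = detectGap(41, "43") // messages after 41 were missed
	fmt.Println(gap)
}
```

On a detected gap, a subscriber could fall back to fetching the missed range from the stream (the "initial value via JetStream" path the commit mentions).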
Derek Collison
2d30d12cd2 Merge pull request #3128 from nats-io/consumer_explicit
Allow explicit consumer configuration of replica count and memory storage
2022-05-16 19:12:25 -07:00
Derek Collison
50be0a6599 Allow explicit configuration of consumer's replica count and allow a consumer to force memory storage.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-16 19:03:56 -07:00
Derek Collison
77b3e0ce82 Merge pull request #3126 from nats-io/pull_max_bytes
Support for MaxBytes for pull requests.
2022-05-16 09:41:02 -07:00
Derek Collison
6bbc5f627c Support for MaxBytes for pull requests.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-16 08:43:33 -07:00
Derek Collison
3aa7965ad7 Bump to 2.8.3-beta.2
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-12 16:03:44 -07:00
Derek Collison
b6ebe34734 Merge pull request #3121 from nats-io/issue-3114
General improvements to accounting for the filestore.
2022-05-12 16:01:25 -07:00
Derek Collison
35e373f6e6 Merge pull request #3122 from nats-io/issue-3119
Fix for #3119
2022-05-12 16:01:00 -07:00
Derek Collison
bcecae42ac Fix for #3119
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-12 15:45:29 -07:00
Derek Collison
4291433a46 General improvements to accounting for the filestore. This is in response to tracking issue #3114.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-12 15:43:11 -07:00
Ivan Kozlovic
2cb2a8ebbc Merge pull request #3120 from nats-io/js_data_race
[FIXED] JetStream: Some data races
2022-05-11 19:30:35 -06:00
Ivan Kozlovic
e304589da4 [FIXED] JetStream: Some data races
We were getting a data race checking the js.clustered field in
updateUsage() following the fix for lock inversion in PR #3092.
```
=== RUN   TestJetStreamClusterKVMultipleConcurrentCreate
==================
WARNING: DATA RACE
Read at 0x00c0009db5d8 by goroutine 195:
  github.com/nats-io/nats-server/v2/server.(*jsAccount).updateUsage()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:1681 +0x8f
  github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2927 +0x1d9
  github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2905 +0x7d
  github.com/nats-io/nats-server/v2/server.(*fileStore).removeMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2158 +0x14f7
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x18f
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2770 +0x39
Previous write at 0x00c0009db5d8 by goroutine 128:
  github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:604 +0xfae
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
  github.com/nats-io/nats-server/v2/server.(*Server).Start()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
  github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4269()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
Goroutine 195 (running) created at:
  time.goFunc()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/time/sleep.go:180 +0x49
Goroutine 128 (finished) created at:
  github.com/nats-io/nats-server/v2/server.RunServer()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x278
  github.com/nats-io/nats-server/v2/server.RunServerWithConfig()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:112 +0x44
  github.com/nats-io/nats-server/v2/server.(*cluster).restartServer()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_helpers_test.go:1004 +0x1d5
  github.com/nats-io/nats-server/v2/server.TestJetStreamClusterKVMultipleConcurrentCreate()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster_test.go:8463 +0x64b
  testing.tRunner()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1306 +0x47
==================
```

Running that test with adding some delay in several places also showed another race:
```
==================
WARNING: DATA RACE
Read at 0x00c00016adb8 by goroutine 160:
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x106
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2771 +0x39

Previous write at 0x00c00016adb8 by goroutine 32:
  github.com/nats-io/nats-server/v2/server.(*fileStore).UpdateConfig()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:360 +0x1c8
  github.com/nats-io/nats-server/v2/server.(*stream).update()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/stream.go:1360 +0x852
  github.com/nats-io/nats-server/v2/server.(*jetStream).processClusterCreateStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2704 +0x4a4
  github.com/nats-io/nats-server/v2/server.(*jetStream).processStreamAssignment()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2452 +0xad9
  github.com/nats-io/nats-server/v2/server.(*jetStream).applyMetaEntries()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:1407 +0x7e4
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:887 +0xc75
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:813 +0x39

Goroutine 160 (running) created at:
  time.goFunc()
      /usr/local/go/src/time/sleep.go:180 +0x49

Goroutine 32 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:3013 +0x86
  github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:612 +0x1092
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
  github.com/nats-io/nats-server/v2/server.(*Server).Start()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
  github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4275()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
==================
```

Both are now addressed, either with proper locking, or with the use of an atomic in the place
where we cannot get the lock (without re-introducing the lock inversion issue).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-11 19:09:24 -06:00
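The "use of an atomic in the place where we cannot get the lock" that this commit describes follows a common Go pattern. A minimal sketch, with illustrative names (`jsState`, `clustered`) that do not match the server's actual fields:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// jsState models a flag set once during setup and read on hot paths.
// Reading it with sync/atomic avoids taking the outer lock, which is
// what would reintroduce the lock inversion the earlier fix removed.
type jsState struct {
	mu        sync.Mutex // protects other state, not the flag below
	clustered int32      // accessed only via atomic operations
}

func (js *jsState) setClustered(on bool) {
	var v int32
	if on {
		v = 1
	}
	atomic.StoreInt32(&js.clustered, v)
}

func (js *jsState) isClustered() bool {
	return atomic.LoadInt32(&js.clustered) == 1
}

func main() {
	js := &jsState{}
	js.setClustered(true)
	fmt.Println(js.isClustered())
}
```

Pairing atomic stores with atomic loads on the same field is what satisfies the race detector; mixing an atomic read with a plain write would still race.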
Ivan Kozlovic
7da46d546d Merge pull request #3118 from nats-io/fix_3117
[FIXED] JetStream: panic processing cluster consumer create
2022-05-11 13:15:06 -06:00
Ivan Kozlovic
5c3be1ee68 [FIXED] JetStream: panic processing cluster consumer create
Before PR #3099, `waitQueue.isEmpty()` returned `wq.len() == 0`,
and `waitQueue.len()` protected against the pointer being
nil (returning 0 in that case).

The change in #3099 caused `waitQueue.isEmpty()` to return `wq.n == 0`,
which means that if `wq` was nil, it would crash.

This PR restores `waitQueue.isEmpty()` to return `wq.len() == 0` and
adds the nil protection back to `len()`, similar to
how it was prior to PR #3099.

Resolves #3117

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-11 11:03:50 -06:00
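The nil guard this commit restores relies on Go allowing method calls on a nil pointer receiver. A simplified model (the server's real `waitQueue` carries more state than this):

```go
package main

import "fmt"

type waitQueue struct {
	n int // number of queued pull requests
}

// len guards against a nil receiver, so callers need not nil-check.
func (wq *waitQueue) len() int {
	if wq == nil {
		return 0
	}
	return wq.n
}

// isEmpty is safe even before the queue is allocated. Reading wq.n
// directly here would panic on a nil wq, which is the crash this
// commit fixed.
func (wq *waitQueue) isEmpty() bool {
	return wq.len() == 0
}

func main() {
	var wq *waitQueue // nil: consumer has no waiting requests yet
	fmt.Println(wq.isEmpty())
}
```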
Ivan Kozlovic
56d06fd8eb Bump version to 2.8.3-beta.1
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-10 17:35:32 -06:00
Ivan Kozlovic
18f9013b10 Merge pull request #3115 from nats-io/raft_lock_issue
[FIXED] JetStream: possible lockup due to a return prior to unlock
2022-05-10 17:32:13 -06:00
Ivan Kozlovic
2ce1dc1561 [FIXED] JetStream: possible lockup due to a return prior to unlock
This would happen in a situation where a node (the current leader)
receives an append entry with a term higher than its own.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-10 17:11:57 -06:00
Ivan Kozlovic
17cc205293 Merge pull request #3112 from nats-io/fix_3108
[FIXED] Accounts Export/Import isolation with overlap subjects
2022-05-10 14:38:47 -06:00
Matthias Hanel
f87c7d8441 altered move unit test to test tiered/non tiered setup (#3113)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-05-09 19:49:22 -04:00
Ivan Kozlovic
c4adf0ffed [FIXED] Accounts Export/Import isolation with overlap subjects
I tracked down this issue to have been introduced with PR #2369,
but the relevant code was also touched by PR #1891 and PR #3088.

I added a test as described in issue #3108 but did not need
JetStream to demonstrate the issue. With the proposed fix, all
tests that were added in aforementioned PRs still pass, including
the new test.

Resolves #3108

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-09 12:59:12 -06:00
Derek Collison
88ebfdaee8 Merge pull request #3109 from nats-io/issue-3107-3069
[FIXED] Downstream sourced retention policy streams during restart have redelivered messages
2022-05-09 09:13:48 -07:00
Derek Collison
b35988adf9 Remember the last timestamp by not removing the last msgBlk when empty and, during purge, by pulling the last timestamp forward until new messages arrive.
When a downstream stream uses retention modes that delete messages, fall back to a time-based start time for the new source consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-09 09:04:19 -07:00
Derek Collison
b47de12bbd Merge pull request #3110 from nats-io/compact-panic
Fix for panic due to not loaded cache during compact
2022-05-07 12:52:02 -07:00
Derek Collison
6507cba2a9 Fix for race on recovery
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-07 12:42:56 -07:00
Derek Collison
fbc9e16253 Fix for panic due to not loaded cache during compact
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-07 09:25:32 -07:00
Ivan Kozlovic
f20fe2c2d8 Bump version to dev 2.8.3-beta
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-04 13:20:53 -06:00
Ivan Kozlovic
9e5d25b26d Merge pull request #3104 from nats-io/release_2_8_2
Release v2.8.2
2022-05-04 13:01:56 -06:00
Ivan Kozlovic
10c020ed44 Release v2.8.2
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-04 11:52:55 -06:00
Ivan Kozlovic
9e60fe2c4a Merge pull request #3103 from nats-io/revert-3091-DisallowBearerToken
Revert "[added] support for jwt operator option DisallowBearerToken"
2022-05-04 11:23:53 -06:00
Ivan Kozlovic
3cdbba16cb Revert "[added] support for jwt operator option DisallowBearerToken" 2022-05-04 11:11:25 -06:00
Ivan Kozlovic
12dd727310 Merge pull request #3091 from nats-io/DisallowBearerToken
[added] support for jwt operator option DisallowBearerToken
2022-05-04 10:57:22 -06:00
Derek Collison
39c8421d6a Merge pull request #3102 from nats-io/fs-blk-sz
Bump up default block sizes
2022-05-04 09:55:25 -07:00
Derek Collison
7246edc77d Bump up default block sizes
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-04 09:46:15 -07:00
Derek Collison
7c9a2d921a Bump to 2.8.2-beta.5
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-03 16:08:58 -07:00
Ivan Kozlovic
074efcaf93 Merge pull request #3101 from nats-io/js_max_per_sub_improvement
[IMPROVED] JetStream: check max-per-subject once
2022-05-03 17:08:26 -06:00
Ivan Kozlovic
5d90c8eac7 [IMPROVED] JetStream: check max-per-subject once
There was a case where we may have done a check for max-per-subject
limit twice per message. That would apply to streams that have
max-per-subject and also discard_new, which is what KV configures.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-03 16:57:26 -06:00
Derek Collison
3fef1025fe Merge pull request #3100 from nats-io/rc-improvements
Raft and cluster improvements.
2022-05-03 15:53:06 -07:00
Ivan Kozlovic
b7d65d6ba8 Merge pull request #3099 from nats-io/js_pull_consumer
[FIXED] JetStream: PullConsumer MaxWaiting==1 and Canceled requests
2022-05-03 16:48:59 -06:00
Derek Collison
6f54b032d6 Raft and cluster improvements.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-03 15:20:46 -07:00
Ivan Kozlovic
cadf921ed1 [FIXED] JetStream: PullConsumer MaxWaiting==1 and Canceled requests
There was an issue with MaxWaiting==1 that was causing a request
with expiration to not actually expire. This was because processWaiting
would not pick it up, since wq.rp was equal to wq.wp
(that is, the read pointer was equal to the write pointer for a slice
of capacity 1).

The other issue was that when reaching the maximum number of waiting pull
requests, a new request would evict an old one with a "408 Request Canceled".

There is no reason for that; instead, the server will first try to
find some existing expired requests (since some of the expiration
is done lazily), but if none is expired and the queue is full,
the server will return a "409 Exceeded MaxWaiting" to the new
request, and not a "408 Request Canceled" to an old one...

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-03 15:17:20 -06:00
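The rp == wp ambiguity described above is the classic ring-buffer full/empty problem, sharpest at capacity 1. A toy model (illustrative code, not the server's actual `waitQueue`), showing why an explicit count is needed alongside the pointers:

```go
package main

import "fmt"

// ring is a fixed-capacity queue. With capacity 1, the read pointer
// equals the write pointer both when the ring is empty and when it
// holds one element, so pointer equality alone cannot distinguish the
// two states; the explicit count n can.
type ring struct {
	buf    []int
	rp, wp int // read and write pointers
	n      int // number of elements currently stored
}

func newRing(capacity int) *ring {
	return &ring{buf: make([]int, capacity)}
}

// push returns false when the ring is full, judged by n, not pointers.
func (r *ring) push(v int) bool {
	if r.n == len(r.buf) {
		return false
	}
	r.buf[r.wp] = v
	r.wp = (r.wp + 1) % len(r.buf)
	r.n++
	return true
}

func main() {
	r := newRing(1)
	r.push(7)
	// rp == wp even though the ring is full; only n tells them apart.
	fmt.Println(r.rp == r.wp, r.n)
}
```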
Ivan Kozlovic
e5e3902e86 Merge pull request #3096 from nats-io/code_coverage
Fixed GithubAction code coverage to handle test panics
2022-05-02 16:06:01 -06:00
Ivan Kozlovic
00b18166b2 Merge pull request #3094 from nats-io/js_panic
[FIXED] JetStream: possible panic checking for group leader less
2022-05-02 15:31:51 -06:00
Ivan Kozlovic
92755d3329 Fixed GithubAction code coverage to handle test panics
We are ok with a flapper or two, because they should not affect
code coverage that much, so it is better to have those and publish
code coverage than to have to recycle the whole test suite until
we get no test failure.

However, if there is a test panic, then all other tests within that
package will NOT run, which could have a massive
impact on the code coverage percentage.

These changes ensure that the run fails if one of the code
coverage outputs is "empty" (it is actually not empty: the
initial content is "mode: atomic", and when the code coverage is
complete, it gets filled with actual code coverage data).
On failure, the push to Coveralls will not happen.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-02 14:51:19 -06:00
Ivan Kozlovic
c9df6374b8 [FIXED] JetStream: possible panic checking for group leader less
Got this stack:
```
goroutine 247945 [running]:
github.com/nats-io/nats-server/v2/server.(*jetStream).isGroupLeaderless(0xc004794e70, 0xc0031b0300)
	/home/runner/work/nats-server/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:661 +0xc2
github.com/nats-io/nats-server/v2/server.(*Server).jsMsgDeleteRequest(0xc001dc9388, 0xc003e6de30, 0xc00222b980, 0xc001454f70, {0xc000668930, 0x24}, {0xc0011dbdb8, 0x11}, {0xc000da93f0, 0xa6, ...})
	/home/runner/work/nats-server/src/github.com/nats-io/nats-server/server/jetstream_api.go:2335 +0x67d
github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch.func1()
	/home/runner/work/nats-server/src/github.com/nats-io/nats-server/server/jetstream_api.go:716 +0x85
created by github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch
	/home/runner/work/nats-server/src/github.com/nats-io/nats-server/server/jetstream_api.go:715 +0x5c5
```

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-02 13:43:40 -06:00
Ivan Kozlovic
94b9c9b406 Merge pull request #3092 from nats-io/js_lock_inversion
[FIXED] JetStream: possible lock inversion
2022-05-02 11:26:17 -06:00
Ivan Kozlovic
0557eafa8f Update locksordering.txt with some more ordering and jsa.usageMu
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-02 11:14:31 -06:00
Ivan Kozlovic
5050092468 [FIXED] JetStream: possible lock inversion
When updating usage, there was a lock inversion: the jetStream
lock was acquired while under the stream's (mset) lock, which is
not correct. Also, updateUsage was locking the jsAccount lock, which
again is not really correct, since jsAccount contains streams, so
the order should be jsAccount->stream, not the other way around.

Removed the locking of jetStream to check for clustered state, since
js.clustered is immutable.

Replaced the use of the jsAccount lock to update usage with a dedicated lock.

I originally moved all the update/limit fields in jsAccount to a new
structure to make sure that I would see all code updating
or reading those fields, and all the relevant functions, so that I could
make sure to use the new lock when calling them. Once that
work was done, and to reduce code changes, I put the fields back
into jsAccount (although I grouped them under the new usageMu mutex
field).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-02 09:50:32 -06:00
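The dedicated-lock approach this commit describes is a common way to break a lock-ordering cycle: give the frequently updated counters their own narrow mutex so touching them never requires the broader lock. A minimal sketch with hypothetical names (`account`, `usageMu`; the server's jsAccount is far larger):

```go
package main

import (
	"fmt"
	"sync"
)

// account models a structure whose usage counters get their own mutex,
// so updateUsage never takes the broader account lock and the
// account -> stream lock order is preserved.
type account struct {
	mu sync.Mutex // protects streams and other account state

	usageMu   sync.Mutex // protects only the usage fields below
	memUsed   int64
	storeUsed int64
}

// updateUsage can be called from under a stream's lock without
// inverting the lock order, because it only takes usageMu.
func (a *account) updateUsage(memDelta, storeDelta int64) {
	a.usageMu.Lock()
	a.memUsed += memDelta
	a.storeUsed += storeDelta
	a.usageMu.Unlock()
}

func main() {
	a := &account{}
	a.updateUsage(128, 512)
	fmt.Println(a.memUsed, a.storeUsed)
}
```

Grouping the guarded fields directly under the mutex in the struct declaration, as the commit ended up doing with usageMu, documents which lock covers which fields.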