Commit Graph

243 Commits

Author SHA1 Message Date
Derek Collison
adbb50fc21 Fixed dios capacity to 4 due to testing under heavy load.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-28 11:45:31 -08:00
Derek Collison
bee149b458 Only need server's rlock here.
Signed-off-by: Derek Collison <derek@nats.io>
2023-02-28 11:17:29 -08:00
Neil Twigg
68961ffedd Refactor ipQueue to use generics, reduce allocations 2023-02-21 14:50:09 +00:00
Derek Collison
f62d929018 Consumer must match replica of parent stream if interest based policy.
Signed-off-by: Derek Collison <derek@nats.io>
2023-01-23 20:16:42 -08:00
Derek Collison
b6ef2c8910 Auto cleanup dangling messages from interest policy streams on server start.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-14 15:02:22 -08:00
Ivan Kozlovic
abcfe2e7ac Add the pending msgs/bytes on 409 Shutdown
This is related to PR #3572 and PR #3576

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-27 13:59:21 -06:00
Derek Collison
15dc72db50 Removed migration of ephemerals, added proper signaling for pul consumers pending requests.
Signed-off-by: Derek Collison <derek@nats.io>
2022-10-25 14:35:20 -07:00
Ivan Kozlovic
bec51ed52b [FIXED] JetStream: User given named ephemeral lost after migration
If an ephemeral was given a name by the user, if the consumer leader
was then shutdown, the ephemeral would be migrated using a server
generated new name instead of keeping the user given name.

Also, in some cases the migration would not even occur. This was
likely due to the fact that RAFT node(s) were shutdown prior to
the ephemeral migration code was invoked.

Resolves #3550

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-14 15:20:45 -06:00
Derek Collison
91edd1a8d0 With snapshots both streams are present on restart so sources or mirrors that have a subject change from the origin would not recover.
We now suppress that if we know we are recovering an existing stream.

Signed-off-by: Derek Collison <derek@nats.io>
2022-09-29 09:17:15 -06:00
Derek Collison
52b5cd12bb Allow meta layer to snapshot on a clean shutdown.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-29 09:17:12 -06:00
Ivan Kozlovic
8c1c6951dc JetStream: R1 durables were incorrectly migrated on shutdown
This could happen for stream with R>1 but with a durable that
has an override of R=1.

Fixed a test to make sure assets have an elected leader.

Also fixed a gateway test that would cause a data race.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-07 19:49:16 -06:00
Derek Collison
fbf2233e4a Only complain about leaderless group with previous leader if we know jetstream has been running for some threshold.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-07 08:47:55 -07:00
Ivan Kozlovic
88ece75765 [FIXED] JetStream: Some nodes may never be reported as offline
In some rare situations, it is possible that nodes are added
to the cluster but are not properly tracked and not shown as
offline when they exit the cluster.

Relates to #3258

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-01 12:48:12 -06:00
Ivan Kozlovic
c8af84d3d1 [FIXED] JetStream: expired accounts were counted toward limits
Fix from @JulienVdG in PR #3421. Adding simply a test and make
sure that tests run.

Resolves #3420

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-31 16:40:11 -06:00
Derek Collison
56e177c329 Allow stream msgs to be compressed within the raft layer and during upper layer catchups.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-30 16:10:57 -07:00
Ivan Kozlovic
9a6a2c31ee [ADDED] JetStream: Ability to configure the per server max catchup bytes
The original value was hardcoded to 128MB and 32MB per stream. The
per-server limit is lowered to 32MB but is configurable with
a new configuration parameter:
```
jetstream {
   max_catchup: 8MB
}
```

The per-stream limit was also lowered from 32MB/128,000msgs to
8MB/32,000 messages.

Tests have shown no difference in performance for fast links.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-30 13:46:13 -06:00
Derek Collison
9508276b98 Make kek function based on review feedback
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-16 12:49:03 -07:00
Derek Collison
ef91d67708 Support auto-conversion
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-16 08:41:39 -07:00
Derek Collison
827b34a77a Add support for AES cipher encryption for filestore.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-15 14:21:37 -07:00
Matthias Hanel
c6e37cf7af Fix race between stream stop and monitorStream (#3350)
* Fix race between stream stop and monitorStream

monitorCluster stops the stream, when doing so, monitorStream
needs to be stopped to avoid miscounting of store size.
In a test stop and reset of store size happened first and then
was followed by storing more messages via monitorStream

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-08-10 19:01:21 +02:00
Derek Collison
8c04adc009 Improvements to filestore for large KVs.
Use better indexing for lookups, we used to do simple linear scan backwards, now track first and last block.
Will expire the fss cache at will to reduce memory usage.

Signed-off-by: Derek Collison <derek@nats.io>
2022-08-09 15:51:13 -05:00
Ivan Kozlovic
88424a89ef Remove io/ioutil
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 13:12:13 -06:00
Ivan Kozlovic
d90854a45f Merge pull request #3341 from nats-io/go_1_19
Move to Go 1.19, remote io/util, fix data race and a flapper
2022-08-05 12:49:06 -06:00
Matthias Hanel
c56f3b9fbd Adding account purge operation (#3319)
* Adding account purge operation

The new request is available for the system account.
The subject to send the request to is $JS.API.ACCOUNT.PURGE.*
With the name of the account to purge instead of the wildcard.

Also added directory cleanup code such that server do not
end up with empty streams directories and account dirs that
only contain streams

Also adding ACCOUNT to leaf node domain rewrite table

Addresses #3186 and #3306 by providing a way to
get rid of the streams for existing and non existing accounts

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-08-05 18:24:19 +02:00
Ivan Kozlovic
3c9a7cc6e5 Move to Go 1.19, remote io/util, fix data race and a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 09:55:37 -06:00
Derek Collison
748890adb1 Auto-set and upgrade AllowDirect when MaxMsgsPerSubject is set.
Also allow mirrors to inherit properly.

Signed-off-by: Derek Collison <derek@nats.io>
2022-08-03 12:36:52 -07:00
Ivan Kozlovic
39a0cfccca [FIXED] JetStream: servers may be reported as orphaned
In some situations, a server may report that a remote server is
detected as orphaned (and the node is marked as offline). This is
because the orphaned detection relies on conns update to be received,
however, servers would suppress the update if an account does not
have any connections attached.

This PR ensures that the update is sent regardless if the account
is JS configured (not necessarily enabled at the moment).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-07-11 16:15:01 -06:00
Derek Collison
1adb3ae3bf Fix for data race - issue 3188
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-13 18:02:57 -07:00
Derek Collison
405de254b6 Mark meta recovering state and use to suppress api responses and api audits during restarts.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-01 20:06:27 -07:00
Ivan Kozlovic
53e3c53d96 [FIXED] JetStream: consumer with deliver new may miss messages
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
step down or server restart.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-23 12:01:48 -06:00
Ivan Kozlovic
8ca3d2f7f5 [FIXED] JetStream: data race with account's jsLimits
Saw this data race report:
```
=== RUN   TestJetStreamJWTClusteredDeleteTierWithStreamAndMove
==================
WARNING: DATA RACE
Write at 0x00c000d88a60 by goroutine 86:
  github.com/nats-io/nats-server/v2/server.(*Server).updateAccountClaimsWithRefresh()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/accounts.go:3299 +0x4532
  github.com/nats-io/nats-server/v2/server.(*Server).UpdateAccountClaims()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/accounts.go:2932 +0x45
  github.com/nats-io/nats-server/v2/server.(*Server).updateAccountWithClaimJWT()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1490 +0x3b4
  github.com/nats-io/nats-server/v2/server.(*Server).updateAccount()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1463 +0x1f0
  github.com/nats-io/nats-server/v2/server.(*Server).lookupAccount()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1428 +0x168
  github.com/nats-io/nats-server/v2/server.(*Server).LookupAccount()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1448 +0x2b9
  github.com/nats-io/nats-server/v2/server.(*Server).getRequestInfo()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:834 +0x28d
  github.com/nats-io/nats-server/v2/server.(*Server).jsStreamCreateRequest()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1183 +0xca
  github.com/nats-io/nats-server/v2/server.(*Server).jsStreamCreateRequest-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1179 +0xcc
  github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch.func1()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:717 +0x125
Previous read at 0x00c000d88a60 by goroutine 60:
  github.com/nats-io/nats-server/v2/server.(*Server).configJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:638 +0x59
  github.com/nats-io/nats-server/v2/server.(*Server).updateAccountClaimsWithRefresh()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/accounts.go:3332 +0x48b3
  github.com/nats-io/nats-server/v2/server.(*Server).UpdateAccountClaims()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/accounts.go:2932 +0x45
  github.com/nats-io/nats-server/v2/server.(*Server).updateAccountWithClaimJWT()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1490 +0x3b4
  github.com/nats-io/nats-server/v2/server.(*Server).updateAccount()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1463 +0x1f0
  github.com/nats-io/nats-server/v2/server.(*Server).lookupAccount()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1428 +0x168
  github.com/nats-io/nats-server/v2/server.(*Server).LookupAccount()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1448 +0x2c8
  github.com/nats-io/nats-server/v2/server.(*jetStream).processStreamAssignment()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2422 +0x2b6
  github.com/nats-io/nats-server/v2/server.(*jetStream).applyMetaEntries()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:1416 +0x7e4
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:896 +0xc75
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:822 +0x39
Goroutine 86 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:716 +0x8fc
  github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:658 +0xcc
  github.com/nats-io/nats-server/v2/server.(*client).deliverMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3175 +0xbde
  github.com/nats-io/nats-server/v2/server.(*client).processMsgResults()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:4168 +0xf9e
  github.com/nats-io/nats-server/v2/server.(*client).processInboundRoutedMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:443 +0x2ce
  github.com/nats-io/nats-server/v2/server.(*client).processInboundMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3491 +0x79
  github.com/nats-io/nats-server/v2/server.(*client).parse()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/parser.go:497 +0x3886
  github.com/nats-io/nats-server/v2/server.(*client).readLoop()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1229 +0x1669
  github.com/nats-io/nats-server/v2/server.(*Server).createRoute.func1()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/route.go:1372 +0x37
Goroutine 60 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3013 +0x86
  github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:621 +0x108a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:401 +0x1168
  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:207 +0x651
  github.com/nats-io/nats-server/v2/server.(*Server).Start()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
  github.com/nats-io/nats-server/v2/server.RunServer·dwrap·3827()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
==================
    testing.go:1152: race detected during execution of test
--- FAIL: TestJetStreamJWTClusteredDeleteTierWithStreamAndMove (6.31s)
```

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-18 16:22:53 -06:00
Ivan Kozlovic
e304589da4 [FIXED] JetStream: Some data races
We were getting a data race checking the js.clustered field in
updateUsage() following fix for lock inversion in PR #3092.
```
=== RUN   TestJetStreamClusterKVMultipleConcurrentCreate
==================
WARNING: DATA RACE
Read at 0x00c0009db5d8 by goroutine 195:
  github.com/nats-io/nats-server/v2/server.(*jsAccount).updateUsage()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:1681 +0x8f
  github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2927 +0x1d9
  github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2905 +0x7d
  github.com/nats-io/nats-server/v2/server.(*fileStore).removeMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2158 +0x14f7
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x18f
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2770 +0x39
Previous write at 0x00c0009db5d8 by goroutine 128:
  github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:604 +0xfae
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
  github.com/nats-io/nats-server/v2/server.(*Server).Start()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
  github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4269()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
Goroutine 195 (running) created at:
  time.goFunc()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/time/sleep.go:180 +0x49
Goroutine 128 (finished) created at:
  github.com/nats-io/nats-server/v2/server.RunServer()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x278
  github.com/nats-io/nats-server/v2/server.RunServerWithConfig()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:112 +0x44
  github.com/nats-io/nats-server/v2/server.(*cluster).restartServer()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_helpers_test.go:1004 +0x1d5
  github.com/nats-io/nats-server/v2/server.TestJetStreamClusterKVMultipleConcurrentCreate()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster_test.go:8463 +0x64b
  testing.tRunner()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1306 +0x47
==================
```

Running that test with adding some delay in several places also showed another race:
```
==================
WARNING: DATA RACE
Read at 0x00c00016adb8 by goroutine 160:
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x106
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2771 +0x39

Previous write at 0x00c00016adb8 by goroutine 32:
  github.com/nats-io/nats-server/v2/server.(*fileStore).UpdateConfig()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:360 +0x1c8
  github.com/nats-io/nats-server/v2/server.(*stream).update()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/stream.go:1360 +0x852
  github.com/nats-io/nats-server/v2/server.(*jetStream).processClusterCreateStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2704 +0x4a4
  github.com/nats-io/nats-server/v2/server.(*jetStream).processStreamAssignment()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2452 +0xad9
  github.com/nats-io/nats-server/v2/server.(*jetStream).applyMetaEntries()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:1407 +0x7e4
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:887 +0xc75
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:813 +0x39

Goroutine 160 (running) created at:
  time.goFunc()
      /usr/local/go/src/time/sleep.go:180 +0x49

Goroutine 32 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:3013 +0x86
  github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:612 +0x1092
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
  github.com/nats-io/nats-server/v2/server.(*Server).Start()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
  github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4275()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
==================
```

Both are now addressed, either with proper locking, or with the use of an atomic in the place
where we cannot get the lock (without re-introducing the lock inversion issue).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-11 19:09:24 -06:00
Ivan Kozlovic
5050092468 [FIXED] JetStream: possible lock inversion
When updating usage, there is a lock inversion in that the jetStream
lock was acquired while under the stream's (mset) lock, which is
not correct. Also, updateUsage was locking the jsAccount lock, which
again, is not really correct since jsAccount contains streams, so
it should be jsAccount->stream, not the other way around.

Removed the locking of jetStream to check for clustered state since
js.clustered is immutable.

Replaced using jsAccount lock to update usage with a dedicated lock.

Originally moved all the update/limit fields in jsAccount to new
structure to make sure that I would see all code that is updating
or reading those fields, and also all functions so that I could
make sure that I use the new lock when calling these. Once that
works was done, and to reduce code changes, I put the fields back
into jsAccount (although I grouped them under the new usageMu mutex
field).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-02 09:50:32 -06:00
Matthias Hanel
d520a27c36 [fixed] step down timing, consumer stream seqno, clear redelivery (#3079)
Step down timing for consumers or streams.
Signals loss of leadership and sleeps before stepping down.
This makes it less likely that messages are being processed during step
down.

When becoming leader, consumer stream seqno got reset,
even though the consumer existed already.

Proper cleanup of redelivery data structures and timer

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-27 03:32:08 -04:00
Derek Collison
f702e279ab Fix for a consumer recovery issue.
Also update healthz to check all assets that are assigned, not just running.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-26 19:22:19 -07:00
Ivan Kozlovic
363849bf3a [FIXED] JetStream: Mirrors would fail to be recovered
This is a continuation of PR #3060, but extends to clustering.

Verified with manual test that a mirror created with v2.7.4 has
the duplicates window set and on restart with main would still
complain about use of dedup in cluster mode. The mirror stream
was recovered but showing as R1.
With this fix, a restart of the cluster - with existing data -
will properly recover the stream as an R3 and messages that
were published while in a bad state are synchronized.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Matthias Hanel mh@synadia.com
2022-04-21 10:59:23 -06:00
Matthias Hanel
9db2e91403 Fixes issue on restart where mirror config was invalid (#3060)
This issue was introduced by a bug that assigned a de dupe default value
do mirrors.
This is an invalid config that used to be checked on stream create.
Subsequently this logic was moved and then got executed on startup.
Ending up checking mirrors that got the bad default assigned.
Fix is to clear de dupe window for mirrors

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-20 13:58:30 -04:00
Ivan Kozlovic
2659b30113 [IMPROVED] JetStream: add file names for invalid checksums
On restart, we report when we find error in checksums, but we
did not report the name of the file.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-18 13:35:08 -06:00
Matthias Hanel
79b4374d01 [Fixed] limits enforcement issues (#3046)
* [Fixed] limits enforcement issues

stream create had checks that stream restore did not have.
Moved code into commonly used function checkStreamCfg.
Also introduced (cluster/non clustered) StreamLimitsCheck functions to
perform checks specific to clustered /non clustered data structures.

Checking for valid stream config and limits/reservations before
receiving all the data. Now fails the request right away.

Added a jetstream limit "max_request_batch" to limit fetch batch size

Shortened max name length from 256 to 255, more common file name limit

Added check for loop in cyclic source stream configurations

features related to limits

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-18 01:53:48 -04:00
Ivan Kozlovic
50c3986863 [FIXED] JetStream stream catchup issues
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-12 16:05:12 -06:00
Derek Collison
37cbac99e7 Improvements to the raft layer for general stability and support of scale up and down and asset move.
Also fixed a bug that would allow a leadership transfer when catching up.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:59:39 -07:00
Matthias Hanel
d9da66d67e returns -1 for new unlimited/unset limits and tests/fixes info counts (#3002)
iterates on tiered limits

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-05 12:25:55 -04:00
Matthias Hanel
569328bc18 fix crash on shutdown with meta being nil (#2999)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-04 17:40:19 -04:00
Ivan Kozlovic
19783a9f11 [CHANGED] Rate limit similar warnings
Some warnings, especially when dealing with JS limits that were
printed on a per-message basis, are now limited to ~1 per second
if the content of the warning is already found in a map.

This is also for "client" warnings, but the client porting of the
warning is not taken into account so that helps with reducing logging
for similar content, but coming from different clients.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-01 15:24:03 -06:00
Matthias Hanel
a77f95faa8 error handling and info when moving a stream from non existing tier (#2992)
adds unit test to test this scenario
improves reporting of correct error
only show info for non existing tiers where streams exist

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-04-01 14:21:35 -04:00
Matthias Hanel
92f4dc986a added max_ack_pending setting to js account limits (#2982)
* added max_ack_penind setting to js account limits

because of the addition, defaults now have to be set later (depend on
these new limits now)

also re-organized the code to closer track how stream create looks

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-31 14:17:16 -04:00
Matthias Hanel
1445153130 Adding max stream bytes check (#2970)
* Adding max stream bytes check

Also start checking on  stream update

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-30 15:50:28 -04:00
Matthias Hanel
1aeaaf0ca3 Adding server limits (max ack pending/dedupe window) to js config (#2967)
* Adding server limits (max ack pending/dedupe window) to js config

Also shifting consumer config check to jsConsumerCreate as in clustered
mode this was enforced in the wrong place

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-29 13:19:36 -04:00
Matthias Hanel
9ee03aedd1 update message size was too short when nothing needed to be sent (#2968)
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-28 23:37:30 -04:00
Matthias Hanel
0c5f3688a7 [ADDED] Tiered limits and fix limit issues on updates (#2945)
* Adding tiered limits and fix limit issues on updates

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-28 20:47:54 -04:00