Commit Graph

42 Commits

Author SHA1 Message Date
Derek Collison
6c5f0a669d Ensure we add in new consumers from a meta snapshot from the leader.
Signed-off-by: Derek Collison <derek@nats.io>
2023-01-04 22:18:31 -08:00
Neil Twigg
14d0ba1c65 Fix some lint errors after move to golangci-lint 2022-12-30 20:00:08 +00:00
Todd Beets
47c87eb71c fix and test for clustered mem store asset no-quorum if leader restarted 2022-12-14 16:16:08 -08:00
Derek Collison
dbc81b9c8b Merge pull request #3700 from mprimi/tests_temp_dir_cleanup
Temporary test files cleanup
2022-12-13 12:27:26 -08:00
Derek Collison
c2188a40ac Merge pull request #3709 from nats-io/zero-bug
Fix for regression in which peer re-assign to a former RG would zero state
2022-12-13 10:03:56 -08:00
Todd Beets
c0ca398b83 use jsz instead of struct direct in final state test 2022-12-12 20:00:14 -08:00
Marco Primi
f8a030bc4a Use testing.TempDir() where possible
Refactor tests to use go built-in temporary directory utility for tests.

Also avoid binding to default port (which may be in use)
2022-12-12 13:18:44 -08:00
Byron Ruth
566d1adfa7 Fix /healthz?js-enabled=true behavior
When js-enabled is set to true, the condition was only checked if
the `getJetStream()` call returned `nil`. However, if it was non-nil,
all remaining checks were executed, including assessing the health
of the assets (streams and consumers).

This change addresses two issues:

- Switch to use `js.isEnabled()` which will check whether the value
  is nil OR `js.disabled = true` which can occur if the subsystem
  is temporarily disabled (insufficient resources).
- Correctly exit the check after the assertion and before meta and
  asset checks are performed.

In addition, the option has been renamed to `js-enabled-only` to align
with the `js-server-only` naming. The previous `js-enabled` name still
works, but is mapped to this new option. A warning is emitted noting
the previous option is deprecated.

Fix #3703

Signed-off-by: Byron Ruth <b@devel.io>
2022-12-10 07:34:32 -05:00
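
The control flow described in the 566d1adfa7 message can be sketched roughly as follows. This is a minimal illustration, not the nats-server source: the `jetStream` struct, `enabledOnly`, `checkHealth`, and the error text are hypothetical names made up for the example. Only `isEnabled()`, the `disabled` flag, and the `js-enabled-only` / `js-enabled` option names come from the commit message. The behavior when the option is not set is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
)

// jetStream is a hypothetical stand-in for the server's JetStream handle.
type jetStream struct {
	disabled bool // set when the subsystem is temporarily disabled
}

// isEnabled is safe on a nil receiver: nil and disabled both mean "not enabled".
func (js *jetStream) isEnabled() bool {
	return js != nil && !js.disabled
}

// errJSNotEnabled is a hypothetical error value for this sketch.
var errJSNotEnabled = errors.New("JetStream is not enabled")

// enabledOnly reads the option from the query string. The old `js-enabled`
// name is still accepted and reported as deprecated.
func enabledOnly(q url.Values) (only, deprecated bool) {
	if v, err := strconv.ParseBool(q.Get("js-enabled-only")); err == nil && v {
		return true, false
	}
	if v, err := strconv.ParseBool(q.Get("js-enabled")); err == nil && v {
		return true, true
	}
	return false, false
}

// checkHealth returns right after the enabled assertion when only is set,
// so checkAssets (meta, streams, consumers) never runs in that mode.
func checkHealth(js *jetStream, only bool, checkAssets func() error) error {
	if only {
		if !js.isEnabled() {
			return errJSNotEnabled
		}
		return nil
	}
	if !js.isEnabled() {
		// Assumption of this sketch: without the option, a server that
		// does not run JetStream is still reported as healthy.
		return nil
	}
	return checkAssets()
}

func main() {
	q, _ := url.ParseQuery("js-enabled=true")
	only, deprecated := enabledOnly(q)
	if deprecated {
		fmt.Println("warning: js-enabled is deprecated, use js-enabled-only")
	}
	assets := func() error { return errors.New("stream has no quorum") }
	fmt.Println(checkHealth(&jetStream{}, only, assets))
	fmt.Println(checkHealth(&jetStream{disabled: true}, only, assets))
}
```

The point of the early return is that an unhealthy asset cannot fail a probe that only asks whether JetStream is enabled.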
Derek Collison
8c6418ee45 Extensive test in support of issue #3191.
Signed-off-by: Derek Collison <derek@nats.io>
2022-12-07 12:36:29 -08:00
Derek Collison
da2059a5f0 Merge pull request #3678 from nats-io/ut-replacepeer
tag policies not honored in reassignment after peer remove
2022-12-06 17:50:11 -08:00
Derek Collison
2f27438230 Make stream removal from a server consistent.
Signed-off-by: Derek Collison <derek@nats.io>
2022-12-06 17:11:43 -08:00
Derek Collison
29b614a057 Fix flapping test
Signed-off-by: Derek Collison <derek@nats.io>
2022-12-06 16:08:45 -08:00
Derek Collison
549b77ca2d Ensure that ephemeral consumers that are deleted on startup are properly removed from the system.
Signed-off-by: Derek Collison <derek@nats.io>
2022-12-06 15:07:46 -08:00
Todd Beets
bfd7d7f9ba improved test to remove false positive result 2022-12-06 12:48:53 -08:00
Todd Beets
3fdfb8a12f Merge branch 'main' into ut-replacepeer
# Conflicts:
#	server/jetstream_cluster_3_test.go
2022-12-06 10:51:22 -08:00
Derek Collison
74c373233f Update to test
Signed-off-by: Derek Collison <derek@nats.io>
2022-12-05 17:32:52 -08:00
Todd Beets
ef27d4d534 tag policies not honored in reassignment after peer remove 2022-12-04 20:39:11 -08:00
Derek Collison
5f7c8e21a2 Fixed issues with multiple concurrent stream create requests.
First issue was applications not getting any response.
However, there was also a more serious issue that would create multiple raft groups for each concurrent request.
The servers would only run one stream monitor loop. However, they would update the state to the new raft group's name, so on server restart the stream would be using a different raft group than existing servers.

Signed-off-by: Derek Collison <derek@nats.io>
2022-12-04 19:13:51 -08:00
Derek Collison
08c94096db Allow any type of activity to prolong auto cleanup of a consumer.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-15 17:25:18 -08:00
Ivan Kozlovic
6ffa6d1e4b [FIXED] JetStream: possible panic on stream info when leader not elected
It is possible that a stream info request would be handled at a
time where the raft group would not yet be set/created, causing
a panic.

Resolves #3626 (at least the panic reports there)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-15 11:56:41 -07:00
Derek Collison
47dd97e389 Fix logic bug that would prevent some messages from being deleted on an interest based stream.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-13 17:32:38 -08:00
Derek Collison
e008e015b3 Make sure to enforce HA asset limits during peer processing as well as assignment.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-09 16:24:54 -08:00
Ivan Kozlovic
ca237bdfa0 [FIXED] JetStream: Stream scale down while it has no quorum
If an R2 stream had one of its servers network-partitioned and, at
that time, the stream was edited to scale down to R1, the stream
would no longer have quorum even after the network partition was
resolved.

Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-04 09:08:31 -06:00
Ivan Kozlovic
c16ccd34c3 [FIXED] JetStream: Sources with OptStartTime gets redelivered
If the start-by time is before what we remember during recovery, use that instead.

Resolves #3559

Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-11-03 16:09:06 -06:00
Derek Collison
72ff2edb5f Fix for #3603.
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-03 12:46:41 -07:00
Derek Collison
56919ebc97 On stream proposal failures we could accidentally warn on high stream lag.
We were not taking the clfs into account.

Signed-off-by: Derek Collison <derek@nats.io>
2022-11-02 14:40:31 -07:00
Ivan Kozlovic
fe588dc9ea Fixing a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-28 13:45:58 -06:00
Derek Collison
ff2cd1d7f9 Fixed test and bug that would override consumer replicas.
Signed-off-by: Derek Collison <derek@nats.io>
2022-10-25 14:35:20 -07:00
Ivan Kozlovic
39f31b0dbe [FIXED] JetStream: InactivityThreshold updates not always working
This is based off @neilalexander's PR #3558.

It ensures that the timer is reset/canceled on configuration
update (by the leader only).

Also fixed the issue with a super-cluster where the delete timer
would always be reset at every gateway interval check.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-25 09:54:01 -06:00
Ivan Kozlovic
f8aa3ac11d [FIXED] JetStream: "first sequence mismatch" error on catchup with message expiration
When a server was restarted and expired messages, but the leader had a snapshot that
still held the old messages, we would reset the complete follower stream state. This fix
just skips over the expired messages as we prepare the request to the leader.

Resolves #3516

Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-17 17:02:08 -06:00
Ivan Kozlovic
90e9c89594 Added specific tests for using non system extended setup similar to NGS
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-17 10:42:03 -06:00
Ivan Kozlovic
bec51ed52b [FIXED] JetStream: User given named ephemeral lost after migration
If an ephemeral was given a name by the user and the consumer leader
was then shut down, the ephemeral would be migrated using a new
server-generated name instead of keeping the user-given name.

Also, in some cases the migration would not even occur. This was
likely because the RAFT node(s) were shut down before the ephemeral
migration code was invoked.

Resolves #3550

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-14 15:20:45 -06:00
Ivan Kozlovic
9bd11580e3 [FIXED] JetStream: User-defined ephemeral Name not used in cluster mode
If the user sends a CONSUMER.CREATE request with a configuration that
specifies the name that the user wants for the ephemeral consumer,
this would not work in cluster mode; that is, the server would still
pick a name instead of using the provided one.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-10 13:48:38 -06:00
Ivan Kozlovic
3c7aa554f7 [FIXED] JetStream: return error on negative replicas count
If a stream is created or updated with a negative replicas count,
an error is now returned. Same for consumers.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-10-10 12:32:41 -06:00
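
A minimal sketch of the kind of check the 3c7aa554f7 message describes. The `config` type, `validateReplicas`, and the error text are hypothetical and are not taken from the server code. Treating zero as valid (left to defaulting elsewhere) is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errNegativeReplicas is a hypothetical error value for this sketch.
var errNegativeReplicas = errors.New("replicas count cannot be negative")

// config holds only the field relevant here. The same check would apply
// to both stream and consumer configurations.
type config struct {
	Replicas int
}

// validateReplicas rejects negative counts on create or update.
// Zero is accepted here on the assumption that defaulting happens elsewhere.
func validateReplicas(cfg config) error {
	if cfg.Replicas < 0 {
		return errNegativeReplicas
	}
	return nil
}

func main() {
	for _, r := range []int{-1, 0, 3} {
		fmt.Printf("replicas=%d err=%v\n", r, validateReplicas(config{Replicas: r}))
	}
}
```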
Derek Collison
c88784dcad Test to make sure a consumer that is deleted while a server is down recovers correctly.
Signed-off-by: Derek Collison <derek@nats.io>
2022-10-07 09:24:54 -07:00
Derek Collison
52b5cd12bb Allow meta layer to snapshot on a clean shutdown.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-29 09:17:12 -06:00
Derek Collison
fef702a688 [FIXED] bug in consumer names paging, did not honor limits and returned duplicate results.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-29 06:14:00 -07:00
Ivan Kozlovic
e151cfcd57 [FIXED] JetStream: Scale down of consumer to R1 would not get a response
Updating a consumer configuration from say R3 to R1 would work
but no response was received by the client sending the request.

Resolves #3493

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-27 10:02:31 -06:00
Derek Collison
9774ad5641 Added check on publish error.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 07:13:57 -07:00
Derek Collison
61a3cff274 Also require MaxMsgsPerSubject to be set per peer review feedback.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 06:56:32 -07:00
Derek Collison
2d737edba6 Allow discard new per subject for certain KV type scenarios. Requires general DiscardNewPolicy.
Signed-off-by: Derek Collison <derek@nats.io>
2022-09-22 06:38:29 -07:00
Ivan Kozlovic
3fadccab38 Move new test to new jetstream_cluster_3_test.go file
Since the second batch was already past the 5min mark and a bit
longer than the first batch, it is a good opportunity to add
this new test in a new file. Updated runTestsOnTravis and travis.yml
accordingly.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-09-15 12:13:00 -06:00