Commit Graph

182 Commits

Author SHA1 Message Date
Derek Collison
cfbc69b12c Allow clustered JetStream to accept duplicate stream creation, as in single server mode.
Resolves #2528

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-15 20:18:44 -07:00
Derek Collison
4cfffc6fa6 We use u16 to encode header len when replicating JetStream messages.
Make sure to error if we exceed that limit.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-13 18:13:07 -07:00
Derek Collison
dadc3b9fae Fixed a bug when an interest retention stream with noack consumers is in clustered mode.
We were not properly propagating the ack state or cleaning up the stream messages.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-08 15:02:09 -07:00
Derek Collison
3099327697 During peer removal, try to remap any stream or consumer assets.
Also, if we do not have room, trap add peer and process there.
Fixed a bug that would treat ephemerals same as durables during remapping after peer removal.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-06 17:29:45 -07:00
Derek Collison
60e45ea3dd Return if pull subscriber and exists
Signed-off-by: Derek Collison <derek@nats.io>
2021-09-01 14:01:00 -07:00
Derek Collison
d809b02491 Fix for Issue #2397
When we had partial state due to server failure or an ungraceful shutdown, we could enter a stream reset state.
The stream reset state is harsh but worked, however there was a bug that would not restart consumers that were attached.
Also if no state exists, or state was truncated, we can detect that and not go through a full reset.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-01 07:04:50 -07:00
Ivan Kozlovic
9f2e3d335b [FIXED] JetStream: possible deadlock due to lock inversion
The locking is jetStream->Server, not the other way around. There
were a few places where lock inversion could have caused deadlock.

Also, a change made recently to solve a deadlock was causing
a race that is demonstrated with TestJetStreamRaceOnRAFTCreate.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-08-30 16:16:56 -06:00
Derek Collison
bbc4e43489 Fixed creating raft groups when we had a js->s lock pattern.
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-25 13:44:30 -07:00
Derek Collison
3f099f6719 Add in warn for error on catchup
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-16 08:03:53 -07:00
Derek Collison
9405b77e46 Added in last active reporting for consumers for delivered and ack floor.
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-14 11:36:27 -07:00
R.I.Pienaar
76ab1b8d17 attempt to improve UX of the error system
Previously we had a few confusing functions like NewT
and similar that were quite fragile to use due to minimal
validation and a panic in go stdlib string Replacer.

Now we generate helper methods for every string, these
are used to access errors, fill in templates and conditional
returns of error type using the new Unless() option

We now get compile time errors for some common mistakes
and have better IDE helpers for arguments etc

Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-08-10 16:08:28 +02:00
Derek Collison
29536629eb Simplified flow control, avoid stalls due to msg loss
Signed-off-by: Derek Collison <derek@nats.io>
2021-08-09 20:13:17 -07:00
Derek Collison
925a6fe6b2 Fix for #2388. Leafnodes with no JS can seamlessly access a HUB with JS.
This is the reverse of the early work to have LNs extend a non-JS cluster.
Also have mixed mode tests as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-01 14:57:47 -07:00
Derek Collison
9b0158daf9 Allow delivery policy of DeliverLastPerSubject, which is helpful for scoped watchers for K/V.
Signed-off-by: Derek Collison <derek@nats.io>
2021-07-28 12:49:02 -07:00
Derek Collison
f13fa767c2 Remove the swapping of accounts during processing of service imports.
When processing service imports we would swap out the accounts during processing.
With the addition of internal subscriptions and internal clients publishing in JetStream we had an issue with the wrong account being used.
This was specific to delayed pull subscribers trying to unsubscribe due to max of 1 while other JetStream API calls were running concurrently.
2021-07-26 07:57:10 -07:00
Derek Collison
6337198119 Fix for multiple concurrent ephemeral consumer requests in clustered mode with max consumers set.
Signed-off-by: Derek Collison <derek@nats.io>
2021-07-08 07:02:09 -07:00
Derek Collison
6eef31c0fc Fixed peer info reports that had large last active values.
Also put in safety for lag going upside down as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-07-06 10:14:43 -07:00
R.I.Pienaar
709e256d64 Add error codes for all consumer creation errors
I wanted to suppress some logging of consumer create
errors that just isn't needed and would be really
annoying on large networks, so I added many constants
and updated all errors.

I think only JSConsumerStoreFailedErrF is worth logging
on large networks else there would be quite a lot of
logs generated that one just cannot act on

Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-07-06 14:51:03 +02:00
Derek Collison
99fed910f0 Improvements to large numbers of JetStream R1 consumers per stream.
1. We were holding open FDs longer than we should for consumers causing issues with open FD limits. We now do not hold them open and cap updates a bit better.

2. When doing a stream delete, consumer delete was repeating a lot of work that was not necessary, causing longer delays. This has been optimized a bit, still more improvements to be made.

3. We cover all JS under a single export, but that was also trapping GetNext for pull based consumers, and since this was a no-op (it is handled at user account level) we were creating a lot of garbage service import responses and reverse map entries that had to be garbage collected. We have a fix in to avoid this but are still looking for a better one.

4. Still had some lingering references to all exports vs single JS export.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-29 05:45:55 -07:00
Derek Collison
08197de9e0 FIXED max consumers was not enforced when set on stream
Signed-off-by: Derek Collison <derek@nats.io>
2021-06-25 11:45:36 -07:00
Derek Collison
9398c3ca28 Allow for more advanced purge operations that filter by subject, specify the sequence or number of messages to keep.
Signed-off-by: Derek Collison <derek@nats.io>
2021-06-19 07:04:44 -07:00
R.I.Pienaar
a0fcf0bb65 further tagged error confusion cleanups
Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-06-18 20:11:09 +02:00
Derek Collison
08cdb2d2ea Make filtered consumers in large mixed streams more efficient.
Allow wider scoped filtered subjects.

We introduce a per subject information tracking to filestore to optimize for large mux'd streams and more efficient filtered consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-15 04:44:05 -07:00
R.I.Pienaar
ee9d10f40b restore old error constants for backwards compat
Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-05-26 08:04:50 +02:00
R.I.Pienaar
0d391b02eb richer api errors proposal
Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-05-26 08:04:50 +02:00
Derek Collison
8888ab51f4 Fix for #2243. We were not allowing replicated acks processing for workqueues properly, only interest retention.
Signed-off-by: Derek Collison <derek@nats.io>
2021-05-24 09:53:31 -07:00
Derek Collison
308355a2fd Fix for #2242.
When we had a duplicate detected in R>1 mode we set the skip sequence indicator but were not using that when dealing with underlying store.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-24 08:21:41 -07:00
Derek Collison
9ccc843382 Removing peers should wait for RemovePeer entry replication.
Signed-off-by: Derek Collison <derek@nats.io>
2021-05-19 18:58:19 -07:00
Derek Collison
6e17b7a303 Fix for #2213
We do not want to report consumers that were created for the purpose of sources or mirrors.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-12 07:51:53 -07:00
Derek Collison
06fc2f3f06 Fix data race
Signed-off-by: Derek Collison <derek@nats.io>
2021-05-10 17:29:24 -07:00
Derek Collison
9a517194a1 Merge pull request #2191 from nats-io/raft_catchup_snap
[FIXED] Raft groups could continually spin trying to catch up.
2021-05-07 14:20:37 -07:00
Derek Collison
70a2521f95 For interest or workqueue streams with ephemerals we need to not reduce replication to 1.
We need the consumer state on the stream leader.
Also if we can't find the store yet for a consumer, fall back to calculating needsAck.

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-07 12:07:27 -07:00
R.I.Pienaar
b5f846a719 add domain in JS advisories
Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-05-07 19:35:46 +02:00
Derek Collison
c2fcc114a5 Update based on PR feedback, moved to validateOptions
Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 20:10:44 -07:00
Derek Collison
8499376575 Add in support for JetStream domains.
This allows a domain to be set in the JetStream server block that sets a domain name.
Once set this signals that any leafnode connections should operate as separate JetStream domains.
Each domain <NAME> is accessible via "$JS.<NAME>.API.>", even when connected to the same domain.
Also for mixed mode you can set a jetstream block now that defines a domain but specifies "enabled: false".

Signed-off-by: Derek Collison <derek@nats.io>
2021-05-06 18:46:32 -06:00
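Editor's note on 8499376575: the per-domain API prefix described in that commit can be sketched as a small subject builder. This is a sketch, not the server's or any client's API: the helper name `domainAPISubject`, its validation rules, and the domain and stream names are hypothetical. The "$JS.<NAME>.API" form comes from the commit message, and "$JS.API" is the default JetStream API prefix used when no domain is set.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// domainAPISubject is a hypothetical helper. It returns the JetStream API
// subject for the given domain, e.g. "$JS.hub.API.STREAM.INFO.ORDERS".
// With an empty domain it falls back to the default "$JS.API" prefix.
func domainAPISubject(domain, api string) (string, error) {
	if domain == "" {
		return "$JS.API." + api, nil
	}
	// Assumption for this sketch: a domain must be a single subject token,
	// so it may not contain separators, wildcards, or whitespace.
	if strings.ContainsAny(domain, ".*> \t") {
		return "", errors.New("invalid domain name")
	}
	return "$JS." + domain + ".API." + api, nil
}

func main() {
	subj, err := domainAPISubject("hub", "STREAM.INFO.ORDERS")
	fmt.Println(subj, err)
}
```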
Derek Collison
8bf99224c5 This adds ability to have a single node server with a system leafnode expand an existing JetStream cluster domain.
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-30 16:20:32 -07:00
Derek Collison
2ac05785c3 Do not persist or snapshot consumer state after a restore.
This can lead to a data race and is not needed after being applied.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-21 18:50:38 -07:00
Derek Collison
c9c70dea33 Fix race
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-21 16:17:58 -07:00
Derek Collison
3418847881 Merge pull request #2146 from nats-io/chblock
Make sure to not have the raft layer block on apply channel on exit.
2021-04-21 15:58:50 -07:00
Derek Collison
0678e649d3 Make sure to not have the raft layer block on apply channel on exit.
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-21 15:52:54 -07:00
Derek Collison
50fabe261d Check for overlapping subjects on stream update.
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-21 15:38:38 -07:00
Derek Collison
a181238cf0 Fix for consumer on restore being deleted
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-21 06:54:54 -07:00
Derek Collison
518ff9be14 Concurrent multiple durable subscribers would cause unpredictable behaviors.
Upgraded to current Go client.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-20 19:50:24 -07:00
Derek Collison
902b9dec12 Merge pull request #2131 from nats-io/updates
General Updates and Stability Improvements
2021-04-20 13:52:39 -07:00
Derek Collison
68ddd519d2 Process upstream missing messages for mirrors better.
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-19 20:15:21 -07:00
Matthias Hanel
b73be52862 [fixed] only become observer if the leaf config has raft not restricted (#2125)
If a subject in the system account's leafnode deny_imports matches $NRG.>
then jetstream is explicitly disconnected and the server can become
leader.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-19 13:10:49 -04:00
Derek Collison
542adc4bc3 Make sure clseq does not fall below lseq
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-18 18:47:33 -07:00
Derek Collison
6a7f3a3153 Cleanup error handling, fix deadlock in test
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-16 13:56:54 -07:00
Derek Collison
f6a82a7c98 When messages were no longer available in an upstream stream a mirror could wedge and not resolve.
This fixes that scenario by detecting the situation and inserting skip msgs to catch up.

Signed-off-by: Derek Collison <derek@nats.io>
2021-04-13 11:46:03 -07:00
Derek Collison
755ef74855 When a cluster of leafnodes connects to a cluster or supercluster hub and they share the system account, make the leafnode servers observers.
Signed-off-by: Derek Collison <derek@nats.io>
2021-04-12 17:00:55 -07:00