Commit Graph

5265 Commits

Author SHA1 Message Date
Matthias Hanel
78bbcd791f [Adding] support for JS MaxBytesRequired
Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-01-12 22:57:34 -05:00
Derek Collison
08ff14a24e Merge pull request #2771 from nats-io/overflow
Implement overflow placement for JetStream streams.
2022-01-07 11:03:15 -08:00
Derek Collison
d02ad88297 Only report peers that we have seen a stats/usage update for
Signed-off-by: Derek Collison <derek@nats.io>
2022-01-07 10:42:06 -08:00
Derek Collison
16f5c95785 Update atomics placements based on feedback
Signed-off-by: Derek Collison <derek@nats.io>
2022-01-07 09:50:19 -08:00
Derek Collison
de5022ad7e Make cluster placement log more detailed
Signed-off-by: Derek Collison <derek@nats.io>
2022-01-07 07:44:30 -08:00
Derek Collison
52da55c8c6 Implement overflow placement for JetStream streams.
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.

We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves.

Signed-off-by: Derek Collison <derek@nats.io>
2022-01-06 19:33:08 -08:00
Ivan Kozlovic
ccc9e1621d Merge pull request #2769 from nats-io/ws_x_forwarded_for_changes
Add X-Forwarded-For IP to the client's remote address
2022-01-04 10:14:49 -07:00
Ivan Kozlovic
8d6eacc245 Add X-Forwarded-For IP to the client's remote address
Instead of replacing connection's host with value specified by
this header, we will simply add the address to the logging only.
So instead of having something like:
```
192.168.1.1:5678 - wid:10 - Client connection created
```
we could have:
```
1.2.3.4/192.168.1.1:5678 - wid:10 - Client connection created
```
As seen above, this PR simply prefixes the connection's remote address
with the header's value (if a valid IP).
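
A minimal sketch of the prefixing rule described above, not the server's actual code. The function name `prefixRemoteAddr` and its parameters are hypothetical; it assumes the header value has already been reduced to a single candidate address.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// prefixRemoteAddr is a hypothetical helper (not part of nats-server).
// It returns the address to use in log lines: the forwarded IP, a "/",
// then the connection's remote address. If forwardedIP does not parse
// as an IP, the remote address is returned unchanged.
func prefixRemoteAddr(remoteAddr, forwardedIP string) string {
	forwardedIP = strings.TrimSpace(forwardedIP)
	if net.ParseIP(forwardedIP) == nil {
		return remoteAddr
	}
	return forwardedIP + "/" + remoteAddr
}

func main() {
	fmt.Println(prefixRemoteAddr("192.168.1.1:5678", "1.2.3.4"))
	fmt.Println(prefixRemoteAddr("192.168.1.1:5678", "not-an-ip"))
}
```

The connection's own host is never replaced in this sketch; only the string used for logging changes.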

Related to #2734
Resolves #2767

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-01-03 11:56:12 -07:00
Derek Collison
fbcf1aba30 Merge pull request #2766 from kfabryczny/fix_doc_link
FIX: Fix broken link to monitoring documentation
2021-12-30 08:38:25 -08:00
Klaudiusz Fabryczny
b2b33110e2 FIX: Fix broken link to monitoring documentation
2021-12-30 14:04:12 +01:00
Derek Collison
4c640113bb Merge pull request #2765 from nats-io/race-usage
Avoid race condition
2021-12-29 15:09:57 -08:00
Derek Collison
5932fa1852 Avoid deadlock, release js lock
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-29 10:46:53 -08:00
Derek Collison
1a37f0963a Avoid race condition
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-29 08:26:10 -08:00
Derek Collison
bd495f3b18 Merge pull request #2764 from nats-io/issue-2742
Large number of ephemeral consumers could exhaust Go runtime's max OS threads.
2021-12-29 07:42:30 -08:00
Derek Collison
c5fbb63614 JetStream ephemeral consumers could create a situation where the server would exhaust the OS thread limit - default 10k.
Under certain situations a large number of consumers that are racing to update state or delete their stores during a delete
would start taking up OS threads due to blocking disk IO. When this happened and there were a bunch of Go routines becoming
runnable the Go runtime would create extra OS threads to fill in the runnable pool and would exhaust the max thread setting.

This code places a channel as a simple semaphore to limit the number of disk IO blocking OS threads.
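
The channel-as-semaphore technique can be sketched roughly as follows. This is an illustration only: the names `ioSem`, `withDiskIO`, and `peakConcurrency`, and the limit of 4, are assumptions and are not taken from the server code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// maxBlockingIO is an illustrative limit, not the value used by the server.
const maxBlockingIO = 4

// ioSem is a buffered channel used as a counting semaphore: a send takes
// a slot and blocks once maxBlockingIO slots are held; a receive frees one.
var ioSem = make(chan struct{}, maxBlockingIO)

// withDiskIO runs fn while holding a semaphore slot, so at most
// maxBlockingIO goroutines can be inside fn at the same time.
func withDiskIO(fn func()) {
	ioSem <- struct{}{}
	defer func() { <-ioSem }()
	fn()
}

// peakConcurrency starts n goroutines that each call withDiskIO and
// returns the highest number observed inside the guarded section at once.
func peakConcurrency(n int) int64 {
	var cur, peak int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			withDiskIO(func() {
				c := atomic.AddInt64(&cur, 1)
				// Record the high-water mark with a compare-and-swap loop.
				for {
					p := atomic.LoadInt64(&peak)
					if c <= p || atomic.CompareAndSwapInt64(&peak, p, c) {
						break
					}
				}
				atomic.AddInt64(&cur, -1)
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak within limit:", peakConcurrency(100) <= maxBlockingIO)
}
```

Goroutines waiting on the channel send are parked by the Go scheduler rather than blocked in a system call, so they do not each hold an OS thread while they wait.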

Signed-off-by: Derek Collison <derek@nats.io>
2021-12-29 07:05:34 -08:00
Derek Collison
36d34492cd Bump to 2.7.0-beta
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-27 12:04:39 -08:00
Derek Collison
34555aecca Merge pull request #2761 from nats-io/fs_partial_err
Fix for when consumer would stop working due to errPartialCache returned from fileStore.
2021-12-27 12:03:31 -08:00
Derek Collison
b7c61cd0bf Stabilize filestore to eliminate sporadic errPartialCache errors under certain situations. Related to #2732
The filestore would release a msgBlock lock while trying to load a cache block if it thought it needed to flush pending data.
With async false, this should be very rare but was possible after careful inspection.

I constructed an artificial test with sleeps throughout the filestore code to reproduce.
It involved having 2 Go routines that were through and waiting on the last msg block, and another one that was writing.
After the write, but before we flushed after releasing the lock we would also artificially sleep.
This would lead to the second read seeing the cache load was already in progress and return no error.
If the load was for a sequence before the current write sequence, and async was false, the cache fseq would be higher than what was requested.
This would cause the errPartialCache to be returned.

Once returned to the consumer level in loopAndGather, it would exit that Go routine and the consumer would cease to function.

This change removed the unlock of a msgBlock to perform a flush, ensuring that two cacheLoads would not yield the errPartialCache.

I also updated the consumer in the case this does happen in the future to not exit the loopAndGather Go routine.

Signed-off-by: Derek Collison <derek@nats.io>
2021-12-27 09:54:02 -08:00
Matthias Hanel
42ae3f5325 Merge pull request #2757 from nats-io/sys-acc-err
Fixed system account issue where the wrong struct got updated
2021-12-23 12:13:25 -05:00
Derek Collison
89b94ae650 Improved selectMsgBlock with lots of messages. Also have fetchMsg return hint about clearing cache.
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-22 17:45:12 -08:00
Matthias Hanel
fe5f47f43b Fixed system account issue where the wrong struct got updated
s.fetchAccount should not be used for the system account,
 as it creates a new struct

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-22 16:18:00 -05:00
Derek Collison
91042b399f Merge pull request #2755 from nats-io/acc_config_limits
Added in ability to have account limits configured in server config.
2021-12-21 19:50:38 -08:00
Derek Collison
3619241326 Merge pull request #2756 from nats-io/xacc_interest
Added test to show cross account interest for push consumers works.
2021-12-21 19:49:25 -08:00
Derek Collison
c4198d603c Added test to show cross account interest for push consumers works
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-21 19:30:35 -08:00
Derek Collison
b43cb5b352 Added in ability to have account limits configured in server config.
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-21 18:31:07 -08:00
Ivan Kozlovic
8203a083d6 Merge pull request #2753 from nats-io/fs_remove_last
[FIXED] JetStream: stream first/last sequence possibly reset
2021-12-21 09:15:55 -07:00
Ivan Kozlovic
7c3c9ef1ee [FIXED] JetStream: stream first/last sequence possibly reset
A low-level Filestore issue would cause a new block to be created
when the last block was empty, but the index for the new block
would not be forced to be written on disk.

The observed issue could be that with a stream with a WorkQueue
retention policy, its first/last sequence values could be reset
after a pull subscriber would have consumed all messages and
the server was restarted without a clean shutdown.
This would cause the pull subscriber to "stall" until enough
new messages are sent to reach a stream sequence that catches
up with the consumer's view of the stream first sequence prior
to the restart.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-20 19:08:08 -07:00
Derek Collison
0968a94265 Merge pull request #2752 from nats-io/memstore_lid
Memstore tracking of interior deletes improved.
2021-12-20 18:06:06 -08:00
Derek Collison
af4d7dbe52 Memory store tracked interior deletes for stream state, but under KV semantics this could be very large.
Actually faster to not track at all and generate on the fly. Saves lots of memory too.

When we update the stream state to include runs, etc., we will update this as well.
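
As a rough illustration of generating interior deletes on the fly instead of tracking them, consider the sketch below. The function `interiorDeletes` and the map-based message store are assumptions made for the example, not the memory store's actual data structures.

```go
package main

import "fmt"

// interiorDeletes is a hypothetical helper. Given the set of sequences
// still present in a stream and the stream's first and last sequence,
// it computes the deleted sequences strictly between first and last on
// demand, so no separate list of deletes has to be kept in memory.
func interiorDeletes(msgs map[uint64][]byte, first, last uint64) []uint64 {
	if last <= first {
		return nil
	}
	var deleted []uint64
	for seq := first + 1; seq < last; seq++ {
		if _, ok := msgs[seq]; !ok {
			deleted = append(deleted, seq)
		}
	}
	return deleted
}

func main() {
	msgs := map[uint64][]byte{1: nil, 2: nil, 5: nil, 6: nil}
	fmt.Println(interiorDeletes(msgs, 1, 6))
}
```

The trade-off in this sketch is a scan over the sequence range when the detail is requested, in exchange for no per-delete bookkeeping on the write path.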

Signed-off-by: Derek Collison <derek@nats.io>
2021-12-20 17:37:16 -08:00
Derek Collison
490acf5f29 Full stream state with interior delete details not needed by recipient of snapshot
Signed-off-by: Derek Collison <derek@nats.io>
2021-12-20 17:37:07 -08:00
Ivan Kozlovic
4d71296cd7 Merge pull request #2751 from nats-io/js_raft_apply_commit_error
[FIXED] JetStream: stream blocked recovering snapshot
2021-12-20 12:06:37 -07:00
Ivan Kozlovic
299b6b53eb [FIXED] JetStream: stream blocked recovering snapshot
If a node fell behind, when catching up with the rest of the
cluster, it is possible that a lot of append entries accumulate
and the server would print warnings such as:
```
[WRN] RAFT [jZ6RvVRH - S-R3F-CQw2ImK6] <some number> append entries pending
```
It would then continuously print the following warning:
```
AppendEntry failed to be placed on internal channel
```
When that happens, this node would always be shown as running the
same number of operations behind (using `nats s info`) if there are
no new messages added to the stream, or an increasing number of
operations if there is still activity.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-20 11:41:34 -07:00
Ivan Kozlovic
6810e48874 Merge pull request #2750 from nats-io/fix_2745
[FIXED] JetStream: interest across gateways
2021-12-20 09:45:50 -07:00
Ivan Kozlovic
3053039ff3 [FIXED] JetStream: interest across gateways
If the interest existed prior to the initial creation of the
consumer, the gateway "watcher" would not be started, which means
that interest moving across the super-cluster after that would
not be detected.

The watcher runs every second, and it is not clear whether this is costly
or not, so we may want to take a different approach of having a separate
interest change channel that would be specific to gateways. But this
means adding a new sublist where the interest would be registered
and that sublist would need to be updated when processing GW RSub
and RUnsub?

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-16 17:20:16 -07:00
Matthias Hanel
3e8b66286d Js leaf deny (#2693)
Along a leaf node connection, unless the system account is shared AND the JetStream domain name is identical, the default JetStream traffic (without a domain set) will be denied.

As a consequence, any client that wants to access a domain other than the one in the server it is connected to must specify a domain name.
Affected by this change are setups where a leaf node had no local JetStream OR the server the leaf node connected to had no local JetStream.
One of the two accounts that are connected via a leaf node remote, must have no JetStream enabled.
The side that does not have JetStream enabled will lose JetStream access and its clients must set `nats.Domain` manually.

For workarounds on how to restore the old behavior, look at:
https://github.com/nats-io/nats-server/pull/2693#issuecomment-996212582

New config values added:
`default_js_domain` is a mapping from account to domain, settable when JetStream is not enabled in an account.
`extension_hint` are hints for a non-clustered server to start in clustered mode (and be usable to extend)
`js_domain` is a way to set the JetStream domain to use for mqtt.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-16 16:53:20 -05:00
Ivan Kozlovic
575bb4eee0 Merge pull request #2747 from nats-io/fix_tls_map_check
[FIXED] TLS map: panic for existing user but conn type not allowed
2021-12-15 12:15:32 -07:00
Ivan Kozlovic
8e5dff3e30 [FIXED] TLS map: panic for existing user but conn type not allowed
For TLS configuration with `verify_and_map` set to true, if a
connection connects and has a certificate with ID that matches
a user, but that user's `allowed_connection_types` is specified
and does not have the connection type in its list, then the
server will panic.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-15 10:09:18 -07:00
Ivan Kozlovic
ad4e14ffb0 Merge pull request #2744 from nats-io/fix_no_auth_check
[FIXED] Check for no_auth_user
2021-12-14 16:13:23 -07:00
Ivan Kozlovic
69525f3083 [FIXED] Check for no_auth_user
Check for a no_auth_user should be done only when no authentication
at all is provided by the user. This was not the case. For instance,
if the user provided a token, the server would still check for
no_auth_user if users are defined. It was not really an issue since
the admin cannot configure users AND token, but it is better for
the application to fail if providing a token that is actually not
being used. If the admin configures a no_auth_user, this should
be used only when no authentication is provided.
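
The intended rule can be sketched as below. The `connectOpts` type, its fields, and `shouldUseNoAuthUser` are hypothetical and only illustrate the decision; they are not the server's types.

```go
package main

import "fmt"

// connectOpts is a hypothetical stand-in for the credentials a client
// may send when connecting; it is not the server's actual type.
type connectOpts struct {
	Username string
	Password string
	Token    string
	Nkey     string
	JWT      string
}

// hasAuth reports whether the client supplied any credentials at all.
func (c connectOpts) hasAuth() bool {
	return c.Username != "" || c.Password != "" ||
		c.Token != "" || c.Nkey != "" || c.JWT != ""
}

// shouldUseNoAuthUser returns true only when a no_auth_user is configured
// AND the client provided no authentication of any kind.
func shouldUseNoAuthUser(c connectOpts, noAuthUser string) bool {
	return noAuthUser != "" && !c.hasAuth()
}

func main() {
	fmt.Println(shouldUseNoAuthUser(connectOpts{}, "guest"))
	fmt.Println(shouldUseNoAuthUser(connectOpts{Token: "s3cret"}, "guest"))
}
```

Under this rule a client that sends a token the server does not use fails authentication instead of silently falling back to the no_auth_user.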

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-14 10:00:54 -07:00
R.I.Pienaar
de3e7cab50 Merge pull request #2743 from ripienaar/jsz_panic
fixes a nil panic on jsz
2021-12-13 18:11:41 +01:00
R.I.Pienaar
1146e66f30 fixes a nil panic on jsz
It appears that getPublicConsumers() is called, which
produces a list of consumers, and that between the
time the list is made and Info() is called the
ephemeral was removed.

Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-12-13 11:51:33 +01:00
Matthias Hanel
628251d11d Merge pull request #2739 from nats-io/list-missing
Adding missing entry to stream/consumer list
2021-12-09 14:35:02 -05:00
Matthias Hanel
0ba2544c5a removed suffix from "missing" list
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-08 19:33:35 -05:00
Ivan Kozlovic
be066b7a21 Merge pull request #2738 from nats-io/fix_2720
[FIXED] JetStream: panic "could not decode consumer snapshot"
2021-12-08 17:16:51 -07:00
Matthias Hanel
dd735f4a18 Adding missing entry to stream/consumer list
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-08 18:44:40 -05:00
Ivan Kozlovic
1b8878138a [FIXED] JetStream: panic "could not decode consumer snapshot"
Resolves #2720

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-08 12:22:03 -07:00
Ivan Kozlovic
f55ee21941 Merge pull request #2735 from nats-io/mqtt_ws
[ADDED] MQTT: Support for Websocket
2021-12-07 09:09:27 -07:00
Ivan Kozlovic
2e07c3f614 [ADDED] MQTT: Support for Websocket
Clients will need to connect to the Websocket port and have `/mqtt`
as the URL path.

Resolves #2433

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-06 16:13:13 -07:00
Ivan Kozlovic
67c345270c Merge pull request #2734 from nats-io/fix_2514
[IMPROVED] Websocket: added client IP from X-Forwarded-For header
2021-12-06 16:11:17 -07:00
Ivan Kozlovic
833f823efb [IMPROVED] Websocket: added client IP from X-Forwarded-For header
This is for cases when there is a proxy (Nginx, HAProxy, etc.)
between the client and the NATS Server. If this header is present,
the first IP is the one of the originating client and will be
used as the host/IP in server's representation of the client host.
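
Picking the originating client from the header could look roughly like this. The helper `firstForwardedIP` is hypothetical; it assumes the conventional comma-separated form of the `X-Forwarded-For` value.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// firstForwardedIP is a hypothetical helper (not part of nats-server).
// It takes the raw X-Forwarded-For header value, which proxies build as
// a comma-separated list, and returns the first entry if it is a valid
// IP. The boolean is false when the header is empty or malformed.
func firstForwardedIP(header string) (string, bool) {
	first := strings.TrimSpace(strings.SplitN(header, ",", 2)[0])
	if net.ParseIP(first) == nil {
		return "", false
	}
	return first, true
}

func main() {
	ip, ok := firstForwardedIP("1.2.3.4, 10.0.0.1, 10.0.0.2")
	fmt.Println(ip, ok)
}
```

The header is supplied by whatever sits in front of the server, so in this sketch it is only trusted as far as the proxy setting it is trusted.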

Resolves #2514

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-06 15:00:22 -07:00