This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.
We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves.
Signed-off-by: Derek Collison <derek@nats.io>
Instead of replacing connection's host with value specified by
this header, we will simply add the address to the logging only.
So instead of having something like:
```
192.168.1.1:5678 - wid:10 - Client connection created
```
we could have:
```
1.2.3.4/192.168.1.1:5678 - wid:10 - Client connection created
```
As seen above, this PR simply prefixes the connection's remote address
with the header's value (if a valid IP).
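
A minimal sketch of the prefixing described above, not the server's actual code. The function name `logPrefix` and its parameters are hypothetical, and taking the first entry of a comma-separated header value is an assumption about how a proxy chain would be represented:

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// logPrefix returns the address string to use in log lines.
// headerValue is the raw value of the proxy header (hypothetical parameter),
// remoteAddr is the connection's real remote address.
func logPrefix(headerValue, remoteAddr string) string {
	// Assumption: the header may hold a comma-separated list and the
	// first entry is the originating client.
	first := strings.TrimSpace(strings.Split(headerValue, ",")[0])
	// Only prefix when the value parses as an IP; otherwise leave the
	// remote address untouched.
	if net.ParseIP(first) == nil {
		return remoteAddr
	}
	return first + "/" + remoteAddr
}

func main() {
	fmt.Println(logPrefix("1.2.3.4", "192.168.1.1:5678"))
	fmt.Println(logPrefix("not-an-ip", "192.168.1.1:5678"))
}
```

The connection's own host is never replaced here; an invalid header value simply results in the unprefixed address.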
Related to #2734
Resolves #2767
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Under certain situations a large number of consumers that are racing to update state or delete their stores during a delete
would start taking up OS threads due to blocking disk IO. When this happened and there were a bunch of Go routines becoming
runnable, the Go runtime would create extra OS threads to fill in the runnable pool and would exhaust the max thread setting.
This code places a channel as a simple semaphore to limit the number of disk IO blocking OS threads.
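
As an illustration of the technique only, here is a sketch of a buffered channel used as a counting semaphore. The names `ioSem` and `runLimited` and the limits used are hypothetical and are not taken from the server code:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// ioSem is a hypothetical counting semaphore backed by a buffered channel.
// The channel capacity is the maximum number of concurrent holders.
type ioSem chan struct{}

func (s ioSem) acquire() { s <- struct{}{} } // blocks once the buffer is full
func (s ioSem) release() { <-s }

// runLimited starts `tasks` goroutines that each call work() while holding
// the semaphore, and returns the peak number that ran work() at once.
func runLimited(limit, tasks int, work func()) int64 {
	sem := make(ioSem, limit)
	var wg sync.WaitGroup
	var cur, peak int64
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem.acquire()
			defer sem.release()
			n := atomic.AddInt64(&cur, 1)
			// Record the highest concurrency level observed.
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			work() // stands in for the blocking disk IO
			atomic.AddInt64(&cur, -1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	peak := runLimited(4, 64, func() { time.Sleep(time.Millisecond) })
	fmt.Println("peak within limit:", peak <= 4)
}
```

Goroutines that cannot acquire a slot park on the channel send, which does not tie up an OS thread the way a blocking syscall does.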
Signed-off-by: Derek Collison <derek@nats.io>
The filestore would release a msgBlock lock while trying to load a cache block if it thought it needed to flush pending data.
With async false, this should be very rare but was possible after careful inspection.
I constructed an artificial test with sleeps throughout the filestore code to reproduce.
It involved having 2 Go routines that were through and waiting on the last msg block, and another one that was writing.
After the write, but before the flush that followed releasing the lock, we would also artificially sleep.
This would lead to the second read seeing the cache load was already in progress and return no error.
If the load was for a sequence before the current write sequence, and async was false, the cache fseq would be higher than what was requested.
This would cause the errPartialCache to be returned.
Once returned to the consumer level in loopAndGather, it would exit that Go routine and the consumer would cease to function.
This change removes the unlock of a msgBlock to perform a flush, ensuring that two cacheLoads will not yield the errPartialCache.
I also updated the consumer in the case this does happen in the future to not exit the loopAndGather Go routine.
Signed-off-by: Derek Collison <derek@nats.io>
A low-level Filestore issue would cause a new block to be created
when the last block was empty, but the index for the new block
would not be forced to be written on disk.
The observed issue could be that with a stream with a WorkQueue
retention policy, its first/last sequence values could be reset
after a pull subscriber would have consumed all messages and
the server was restarted without a clean shutdown.
This would cause the pull subscriber to "stall" until enough
new messages are sent to reach a stream sequence that catches
up with the consumer's view of the stream first sequence prior
to the restart.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
It is actually faster to not track at all and generate on the fly. This saves lots of memory too.
When we update the stream state to include runs, etc., we will update this as well.
Signed-off-by: Derek Collison <derek@nats.io>
If a node fell behind, when catching up with the rest of the
cluster, it is possible that a lot of append entries accumulate
and the server would print warnings such as:
```
[WRN] RAFT [jZ6RvVRH - S-R3F-CQw2ImK6] <some number> append entries pending
```
It would then continuously print the following warning:
```
AppendEntry failed to be placed on internal channel
```
When that happens, this node would always be shown as running the
same number of operations behind (using `nats s info`) if there are
no new messages added to the stream, or an increasing number of
operations if there is still activity.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If the interest existed prior to the initial creation of the
consumer, the gateway "watcher" would not be started, which means
that interest moving across the super-cluster after that would
not be detected.
The watcher runs every second and it is not clear whether this is
costly, so we may want to take a different approach and have a
separate interest change channel that would be specific to gateways.
But this means adding a new sublist where the interest would be
registered, and that sublist would need to be updated when processing
GW RSub and RUnsub?
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Along a leaf node connection, unless the system account is shared AND the JetStream domain name is identical, the default JetStream traffic (without a domain set) will be denied.
As a consequence, all clients that want to access a domain other than the one in the server they are connected to must specify a domain name.
Affected by this change are setups where: a leaf node had no local JetStream OR the server the leaf node connected to had no local JetStream.
One of the two accounts that are connected via a leaf node remote must have no JetStream enabled.
The side that does not have JetStream enabled will lose JetStream access and its clients must set `nats.Domain` manually.
For workarounds on how to restore the old behavior, look at:
https://github.com/nats-io/nats-server/pull/2693#issuecomment-996212582
New config values added:
`default_js_domain` is a mapping from account to domain, settable when JetStream is not enabled in an account.
`extension_hint` is a hint for a non-clustered server to start in clustered mode (and be usable to extend).
`js_domain` is a way to set the JetStream domain to use for mqtt.
Signed-off-by: Matthias Hanel <mh@synadia.com>
For TLS configuration with `verify_and_map` set to true, if a
connection connects and has a certificate with ID that matches
a user, but that user's `allowed_connection_types` is specified
and does not have the connection type in its list, then the
server will panic.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Check for a no_auth_user should be done only when no authentication
at all is provided by the user. This was not the case. For instance,
if the user provided a token, the server would still check for
no_auth_user if users are defined. It was not really an issue since
the admin cannot configure users AND token, but it is better for
the application to fail if providing a token that is actually not
being used. If the admin configures a no_auth_user, this should
be used only when no authentication is provided.
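
The intended rule can be sketched as follows. This is an illustration under assumptions, not the server's implementation: the `connectOpts` type, its fields, and `useNoAuthUser` are hypothetical names.

```go
package main

import "fmt"

// connectOpts is a hypothetical, trimmed-down view of the credentials a
// client may send when connecting.
type connectOpts struct {
	Username string
	Password string
	Token    string
	Nkey     string
	JWT      string
}

// noAuthProvided reports whether the client sent no credentials at all.
func (c connectOpts) noAuthProvided() bool {
	return c.Username == "" && c.Password == "" &&
		c.Token == "" && c.Nkey == "" && c.JWT == ""
}

// useNoAuthUser reports whether the configured no_auth_user should be used.
// It applies only when one is configured AND the client provided nothing.
func useNoAuthUser(c connectOpts, noAuthUser string) bool {
	return noAuthUser != "" && c.noAuthProvided()
}

func main() {
	// No credentials: fall back to the no_auth_user.
	fmt.Println(useNoAuthUser(connectOpts{}, "guest"))
	// A token was provided: do not fall back, let normal auth fail or succeed.
	fmt.Println(useNoAuthUser(connectOpts{Token: "s3cr3t"}, "guest"))
}
```

With this check a client that sends an unused token fails authentication instead of silently being mapped to the no_auth_user.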
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
It appears that getPublicConsumers() is called, which
produces a list of consumers, and that between the time
the list is made and the time Info() is called, the
ephemeral consumer was removed.
Signed-off-by: R.I.Pienaar <rip@devco.net>
This is for cases when there is a proxy (Nginx, HAProxy, etc..)
between the client and the NATS Server. If this header is present,
the first IP is the one of the originating client and will be
used as the host/IP in server's representation of the client host.
Resolves #2514
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>