Under load, a message could be committed to the underlying store while a consumer was being created, and num pending would then be increased again when the stream signaled the consumers.
This fix remembers the last seq of the state when we calculate sgap, and the stream code tests against it before adding.
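
The guard described above can be sketched roughly as follows. This is a minimal illustration and not the actual server code: the `consumer` and `streamState` types, the `lss` field, and both method names are hypothetical; only `sgap` comes from the description.

```go
package main

// consumer is a hypothetical, stripped-down stand-in for a consumer.
type consumer struct {
	sgap uint64 // number of pending messages
	lss  uint64 // last stream seq already covered by sgap (assumed field name)
}

// streamState is a hypothetical snapshot of the underlying store.
type streamState struct {
	Msgs    uint64
	LastSeq uint64
}

// setInitialPending computes pending from a store snapshot and remembers
// the last sequence that the snapshot already accounts for.
func (o *consumer) setInitialPending(st streamState) {
	o.sgap = st.Msgs
	o.lss = st.LastSeq
}

// signalNewMessage is what the stream would call for each stored message.
// A message at or below the remembered sequence was already counted when
// sgap was calculated, so it must not be added again.
func (o *consumer) signalNewMessage(seq uint64) {
	if seq <= o.lss {
		return
	}
	o.sgap++
}
```

With a snapshot of 3 messages ending at seq 3, a late signal for seq 3 leaves `sgap` at 3, while a signal for seq 4 raises it to 4.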
Signed-off-by: Derek Collison <derek@nats.io>
Some operations could cause the route to block because a lock was
held during store operations. On macOS, having lots of streams/consumers
and restarting the cluster would cause lots of concurrent IO, which
would cause the lock to be held for too long, causing head-of-line
blocking in processing of messages from a route.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When we want to track service import response interest across a leafnode, we need to send sub and unsub for all response _R_ subjects rather than using a wildcard.
Signed-off-by: Derek Collison <derek@nats.io>
Rename function
More easily read math
merged functions together
Changed from predefining error
Fix empty string issue
use same function for max mem store
Cleaned up code, made more consistent, utilize loopAndGather.
Allow pull consumers to have AckAll as well as AckExplicit.
Signed-off-by: Derek Collison <derek@nats.io>
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.
We now track account limits and server limits more distinctly, and do not reserve server resources based on account limits themselves.
Signed-off-by: Derek Collison <derek@nats.io>
Instead of replacing the connection's host with the value specified by
this header, we will simply add the address to the logging only.
So instead of having something like:
```
192.168.1.1:5678 - wid:10 - Client connection created
```
we could have:
```
1.2.3.4/192.168.1.1:5678 - wid:10 - Client connection created
```
As seen above, this PR simply prefixes the connection's remote address
with the header's value (if a valid IP).
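
As a rough sketch of the prefixing rule (not the server's actual code), the helper below is hypothetical: the name `remoteAddrForLog` and its parameters are assumptions, and it relies only on the standard library's `net.ParseIP` to decide whether the header's value is a valid IP.

```go
package main

import (
	"net"
	"strings"
)

// remoteAddrForLog is a hypothetical helper. It returns the string used to
// identify a connection in log lines: the remote address, prefixed with the
// header's value only when that value parses as an IP address.
func remoteAddrForLog(headerVal, remoteAddr string) string {
	headerVal = strings.TrimSpace(headerVal)
	if net.ParseIP(headerVal) == nil {
		// Missing or invalid value: log the real remote address only.
		return remoteAddr
	}
	return headerVal + "/" + remoteAddr
}
```

Calling `remoteAddrForLog("1.2.3.4", "192.168.1.1:5678")` yields `1.2.3.4/192.168.1.1:5678`, while a non-IP value leaves the remote address unchanged.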
Related to #2734
Resolves #2767
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Under certain situations, a large number of consumers racing to update state or delete their stores during a delete
would start taking up OS threads due to blocking disk IO. When this happened and there were a bunch of Go routines becoming
runnable, the Go runtime would create extra OS threads to fill in the runnable pool and would exhaust the max thread setting.
This code places a channel as a simple semaphore to limit the number of disk IO blocking OS threads.
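
A buffered channel used as a counting semaphore can be sketched as below. This is an illustration under assumptions rather than the server's code: `maxBlockingIO`, `dios`, `withDiskIO`, and `peakConcurrency` are hypothetical names, and the limit of 4 is arbitrary.

```go
package main

import (
	"sync"
	"time"
)

// maxBlockingIO is an assumed limit on concurrent blocking disk operations.
const maxBlockingIO = 4

// dios acts as a counting semaphore: its capacity bounds how many
// goroutines can be inside a guarded section at the same time.
var dios = make(chan struct{}, maxBlockingIO)

// withDiskIO runs fn while holding one semaphore slot.
func withDiskIO(fn func() error) error {
	dios <- struct{}{}        // acquire; blocks while all slots are taken
	defer func() { <-dios }() // release
	return fn()
}

// peakConcurrency launches n goroutines through withDiskIO and reports the
// highest number that were inside the guarded section at once.
func peakConcurrency(n int) int {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		cur, hi int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = withDiskIO(func() error {
				mu.Lock()
				cur++
				if cur > hi {
					hi = cur
				}
				mu.Unlock()
				time.Sleep(time.Millisecond) // stand-in for blocking disk IO
				mu.Lock()
				cur--
				mu.Unlock()
				return nil
			})
		}()
	}
	wg.Wait()
	return hi
}
```

Goroutines waiting on the channel send are parked by the Go scheduler instead of sitting in a blocking syscall, so they do not pin OS threads; only up to `maxBlockingIO` goroutines can be blocked in disk IO at any moment.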
Signed-off-by: Derek Collison <derek@nats.io>
The filestore would release a msgBlock lock while trying to load a cache block if it thought it needed to flush pending data.
With async false, this should be very rare, but careful inspection showed it was possible.
I constructed an artificial test with sleeps throughout the filestore code to reproduce.
It involved having 2 Go routines that were through and waiting on the last msg block, and another one that was writing.
After the write and after releasing the lock, but before the flush, we would also artificially sleep.
This would lead to the second read seeing that the cache load was already in progress and returning no error.
If the load was for a sequence before the current write sequence, and async was false, the cache fseq would be higher than what was requested.
This would cause the errPartialCache to be returned.
Once returned to the consumer level in loopAndGather, it would exit that Go routine and the consumer would cease to function.
This change removes the unlock of a msgBlock to perform a flush, ensuring that two cacheLoads will not yield the errPartialCache.
I also updated the consumer in the case this does happen in the future to not exit the loopAndGather Go routine.
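
The consumer-side hardening could look roughly like the sketch below. It is a simplified stand-in, not the real `loopAndGather`: the function signature, the `next` and `deliver` callbacks, and `errClosed` are hypothetical, and `errPartialCache` is redefined locally for illustration.

```go
package main

import "errors"

var (
	// errPartialCache mirrors the error named above; this local value is assumed.
	errPartialCache = errors.New("partial cache")
	// errClosed is a hypothetical terminal error used to stop the loop.
	errClosed = errors.New("consumer closed")
)

// loopAndGather is a simplified stand-in for the consumer delivery loop.
// It keeps running when next reports errPartialCache and only exits on
// any other error. It returns how many times it retried.
func loopAndGather(next func() (string, error), deliver func(string)) (retries int) {
	for {
		msg, err := next()
		switch {
		case err == nil:
			deliver(msg)
		case errors.Is(err, errPartialCache):
			// Treated as transient here: try again instead of exiting
			// the Go routine.
			retries++
		default:
			return retries
		}
	}
}
```

A production loop would normally wait or back off before retrying; the sketch omits that to keep the control flow visible.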
Signed-off-by: Derek Collison <derek@nats.io>