When we triggered a filestore msg block compact, we were not properly dealing with interior deletes.
Subsequent lookups past the skipped messages would cause an error and stop delivering messages.
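For illustration, a minimal sketch of the idea with simplified types rather than the actual filestore code: the compact keeps the interior-delete map, so a lookup past a deleted sequence is reported as deleted instead of failing.

```go
package main

import "fmt"

// msgBlock is a simplified stand-in for a filestore message block:
// a contiguous range of sequences with some interior sequences deleted.
type msgBlock struct {
	first, last uint64
	msgs        map[uint64][]byte   // stored messages by sequence
	dmap        map[uint64]struct{} // interior deletes
}

// compact rewrites the block keeping only live messages, but must carry
// the delete map forward so later lookups can tell "deleted" apart from
// "missing". Dropping dmap here is the kind of bug being fixed.
func (mb *msgBlock) compact() {
	live := make(map[uint64][]byte, len(mb.msgs))
	for seq, m := range mb.msgs {
		if _, deleted := mb.dmap[seq]; !deleted {
			live[seq] = m
		}
	}
	mb.msgs = live
}

// lookup reports a deleted sequence as such (not an error), so delivery
// can simply skip it and continue.
func (mb *msgBlock) lookup(seq uint64) ([]byte, bool, error) {
	if seq < mb.first || seq > mb.last {
		return nil, false, fmt.Errorf("sequence %d out of range", seq)
	}
	if _, deleted := mb.dmap[seq]; deleted {
		return nil, true, nil
	}
	m, ok := mb.msgs[seq]
	if !ok {
		return nil, false, fmt.Errorf("sequence %d not found", seq)
	}
	return m, false, nil
}

func main() {
	mb := &msgBlock{
		first: 1, last: 3,
		msgs: map[uint64][]byte{1: []byte("a"), 2: []byte("b"), 3: []byte("c")},
		dmap: map[uint64]struct{}{2: {}},
	}
	mb.compact()
	for seq := uint64(1); seq <= 3; seq++ {
		m, deleted, err := mb.lookup(seq)
		fmt.Println(seq, string(m), deleted, err)
	}
}
```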
Signed-off-by: Derek Collison <derek@nats.io>
With the availability of a "max messages per subject" limit for a given
stream, it is possible to replace the individual streams that were
created per session with a single stream that holds all sessions,
one message per subject, where the subject is composed of
the session client ID hash.
The first time the new stream is created for a given account,
all existing MQTT session streams will be transferred to the
new mux'ed MQTT session stream.
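As a rough illustration using the nats.go client API (the stream name, subject layout, and hash value below are made up, not the server's internal naming), a single stream with a per-subject limit of one keeps exactly one session record per client ID hash:

```go
package main

import (
	"fmt"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// One stream for all sessions; "sessions" and the subject layout
	// are illustrative only.
	if _, err := js.AddStream(&nats.StreamConfig{
		Name:              "sessions",
		Subjects:          []string{"sessions.>"},
		MaxMsgsPerSubject: 1, // keep only the latest record per session
	}); err != nil {
		log.Fatal(err)
	}

	// Each session record lives under a subject derived from the client
	// ID hash, so publishing a new record replaces the previous one.
	clientIDHash := "ab12cd34" // placeholder
	if _, err := js.Publish(fmt.Sprintf("sessions.%s", clientIDHash), []byte(`{"clean":false}`)); err != nil {
		log.Fatal(err)
	}
}
```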
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also, if we do not have room, trap the add peer and process it there.
Fixed a bug that would treat ephemerals the same as durables during remapping after peer removal.
Signed-off-by: Derek Collison <derek@nats.io>
Under load and pressure from concurrent publishing and consuming with multiple consumers, the filestore would
return a partial or no cache error to the upper layers. For consumers this could result in skipping a stream sequence when we should not.
This change stabilizes the filestore and removes the flush state for msg blocks. I also found some bugs that did not track the last sequence properly
after snapshot / restore.
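A simplified sketch of the delivery-side behavior this is after, with stand-in types and error values rather than the real filestore ones: only advance past a sequence the store definitively reports as deleted, and retry on a transient cache error instead of skipping.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	errPartialCache = errors.New("partial cache")   // transient: cache being rebuilt
	errDeleted      = errors.New("message deleted") // permanent: safe to skip
)

// memStore is a stand-in for the filestore lookup path.
type memStore struct {
	msgs    map[uint64][]byte
	deleted map[uint64]bool
	flaky   int // simulate a few transient cache errors
}

func (s *memStore) LoadMsg(seq uint64) ([]byte, error) {
	if s.flaky > 0 {
		s.flaky--
		return nil, errPartialCache
	}
	if s.deleted[seq] {
		return nil, errDeleted
	}
	if m, ok := s.msgs[seq]; ok {
		return m, nil
	}
	return nil, fmt.Errorf("sequence %d not found", seq)
}

// nextMsg only moves past sequences the store reports as deleted, and
// retries on transient cache errors instead of skipping a live message.
func nextMsg(s *memStore, seq uint64) ([]byte, uint64, error) {
	for {
		msg, err := s.LoadMsg(seq)
		switch {
		case err == nil:
			return msg, seq, nil
		case errors.Is(err, errDeleted):
			seq++ // permanently gone, move on
		case errors.Is(err, errPartialCache):
			time.Sleep(time.Millisecond) // transient, retry same sequence
		default:
			return nil, seq, err
		}
	}
}

func main() {
	s := &memStore{
		msgs:    map[uint64][]byte{2: []byte("hello")},
		deleted: map[uint64]bool{1: true},
		flaky:   2,
	}
	msg, seq, err := nextMsg(s, 1)
	fmt.Println(string(msg), seq, err) // hello 2 <nil>
}
```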
Signed-off-by: Derek Collison <derek@nats.io>
Updated some tests based on this change, and also added missing deferred
connection close or server shutdown calls.
Fixed how the OCSP run go routine would shut down: shutdown would
never complete because grWG was not decremented by this go routine
prior to invoking s.Shutdown().
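A minimal generic sketch of the pattern, with hypothetical types rather than the server's actual ones: a goroutine tracked by a WaitGroup must mark itself done before calling a shutdown path that waits on that same WaitGroup, otherwise shutdown never completes.

```go
package main

import (
	"fmt"
	"sync"
)

type server struct {
	grWG sync.WaitGroup
	quit chan struct{}
}

// startGoRoutine mirrors the pattern of tracking goroutines in a WaitGroup.
func (s *server) startGoRoutine(f func()) {
	s.grWG.Add(1)
	go f()
}

// shutdown signals everyone to stop and waits for all tracked goroutines.
func (s *server) shutdown() {
	close(s.quit)
	s.grWG.Wait()
}

func main() {
	s := &server{quit: make(chan struct{})}
	done := make(chan struct{})
	s.startGoRoutine(func() {
		// A fatal condition is hit inside the tracked goroutine. It must
		// release its own WaitGroup slot *before* initiating shutdown,
		// otherwise shutdown() deadlocks waiting on this very goroutine.
		s.grWG.Done()
		s.shutdown()
		close(done)
	})
	<-done
	fmt.Println("shutdown complete")
}
```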
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The RootCAs was not properly set, which could prevent the server
from creating a TLS connection to the account resolver, with an error
such as:
```
x509: certificate signed by unknown authority
```
Resolves #1207
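A hedged sketch of what setting the root CAs amounts to when the resolver is reached over HTTPS; the CA file path and resolver URL below are placeholders:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// Load the CA bundle that signed the resolver's certificate.
	caPEM, err := os.ReadFile("/etc/nats/ca.pem") // placeholder path
	if err != nil {
		log.Fatal(err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		log.Fatal("no certificates found in CA file")
	}

	// Without RootCAs set, a resolver behind a certificate signed by this
	// CA fails with "x509: certificate signed by unknown authority".
	client := &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{RootCAs: pool},
		},
	}

	resp, err := client.Get("https://resolver.example.com/jwt/v1/accounts/") // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("resolver responded with %d bytes", len(body))
}
```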
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When expiring requests, the server would send 408 if interest was
still present, which can happen for pull subscribe implementations
that maintain interest for the duration of the pull subscription.
Let's keep the 408 for when a request is "force expired", that
is, when a request is removed from the queue because the queue was
full but interest is still found.
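A simplified sketch of the distinction, with hypothetical types rather than the server's request queue: a naturally expired request with interest still present is dropped silently, while a request evicted because the queue was full gets the 408.

```go
package main

import "fmt"

type pullRequest struct {
	reply string
}

// hasInterest would normally check the sublist for the reply subject.
func hasInterest(reply string) bool { return true }

func sendStatus(reply, status string) { fmt.Println(reply, "<-", status) }

// expireRequest handles a request leaving the queue. forced means it was
// evicted to make room, not that its TTL ran out.
func expireRequest(r *pullRequest, forced bool) {
	if !hasInterest(r.reply) {
		return // nothing listening, nothing to report
	}
	if forced {
		// Queue was full: let the client know its pull was dropped.
		sendStatus(r.reply, "408 Request Timeout")
		return
	}
	// Natural expiry with interest still present (a long-lived pull
	// subscription): stay quiet instead of sending a spurious 408.
}

func main() {
	expireRequest(&pullRequest{reply: "_INBOX.abc"}, false) // silent
	expireRequest(&pullRequest{reply: "_INBOX.def"}, true)  // 408
}
```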
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When we had partial state due to a server failure or an ungraceful shutdown, we could enter a stream reset state.
The stream reset state is harsh but worked; however, there was a bug that would not restart consumers that were attached.
Also, if no state exists, or state was truncated, we can detect that and avoid going through a full reset.
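A rough sketch of the detection idea, with made-up fields and decisions rather than the actual stream recovery code: compare the recovered store state against the last applied state, and only fall back to the full reset when they truly diverge.

```go
package main

import "fmt"

// streamState is a simplified stand-in for the recovered store state.
type streamState struct {
	FirstSeq, LastSeq, Msgs uint64
}

// needsFullReset decides whether the recovered state really requires the
// harsh full reset path, or whether the stream can simply catch up.
// applied is the last state we know was applied before the failure.
func needsFullReset(applied, recovered streamState) bool {
	// No state at all: nothing to reconcile, just catch up from the leader.
	if recovered.Msgs == 0 && recovered.LastSeq == 0 {
		return false
	}
	// Truncated state: the tail is missing but what remains is consistent,
	// so replaying the missing messages is enough.
	if recovered.LastSeq < applied.LastSeq {
		return false
	}
	// Anything else that diverges still goes through the full reset.
	return recovered.LastSeq != applied.LastSeq
}

func main() {
	applied := streamState{FirstSeq: 1, LastSeq: 100, Msgs: 100}
	fmt.Println(needsFullReset(applied, streamState{}))                                     // false: no state
	fmt.Println(needsFullReset(applied, streamState{FirstSeq: 1, LastSeq: 90, Msgs: 90}))   // false: truncated
	fmt.Println(needsFullReset(applied, streamState{FirstSeq: 1, LastSeq: 120, Msgs: 120})) // true: diverged
}
```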
Signed-off-by: Derek Collison <derek@nats.io>
The lock order seems to be jetStream -> jsAccount. In JetStreamUsage()
we were doing the opposite. Moved the js.mu.RLock() to encompass
the whole gathering of statistics from jsa. We could do this in two phases:
get js's RLock to get the cluster info (if clustered), then get
jsa's RLock for the rest, and set the cluster info with what we gathered
in the first phase.
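A minimal sketch of the ordering constraint with simplified types: the jetStream read lock is taken first and held across the jsAccount read, never the reverse.

```go
package main

import (
	"fmt"
	"sync"
)

type jetStream struct {
	mu      sync.RWMutex
	cluster string
}

type jsAccount struct {
	mu                 sync.RWMutex
	memUsed, storeUsed int64
}

// usage follows the jetStream -> jsAccount lock order: the outer js read
// lock is acquired first and held while reading from jsa.
func usage(js *jetStream, jsa *jsAccount) (string, int64, int64) {
	js.mu.RLock()
	defer js.mu.RUnlock()
	cluster := js.cluster

	jsa.mu.RLock()
	defer jsa.mu.RUnlock()
	return cluster, jsa.memUsed, jsa.storeUsed
}

func main() {
	js := &jetStream{cluster: "C1"}
	jsa := &jsAccount{memUsed: 1024, storeUsed: 4096}
	c, m, s := usage(js, jsa)
	fmt.Println(c, m, s)
}
```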
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The locking order is jetStream -> Server, not the other way around. There
were a few places where lock inversion could have caused a deadlock.
Also, a change made recently to solve a deadlock was causing
a race that is demonstrated by TestJetStreamRaceOnRAFTCreate.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Note that since the hub will currently disconnect on a subscription from a soliciting leaf, we still do strict checks there.
We always properly check if data can flow, so we could remove the sub checks altogether.
I did look into ways of returning a scoped subject for explicit allow subscriptions when presented with a wildcard; however, this would have meant resolving to multiple items.
E.g. allow ['foo', 'bar', 'foo.bar']
With a sub of '*' that would have to expand to ['foo', 'bar']
With a sub of '>' that would have to expand to ['foo', 'bar', 'foo.bar']
With a sub of 'foo.*' that would have to expand to ['foo.bar']
I may sleep on this and revisit if I think I can get it to work properly.
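For illustration of the expansion described above, a self-contained sketch using plain token matching (not the server's sublist code): given an allow list and a wildcard subscription, collect the allowed subjects the wildcard covers.

```go
package main

import (
	"fmt"
	"strings"
)

// covers reports whether the wildcard subscription subject matches the
// literal allowed subject, using '*' for one token and '>' for the tail.
func covers(sub, allowed string) bool {
	st, at := strings.Split(sub, "."), strings.Split(allowed, ".")
	for i, tok := range st {
		if tok == ">" {
			return true // matches the rest of the subject
		}
		if i >= len(at) {
			return false
		}
		if tok != "*" && tok != at[i] {
			return false
		}
	}
	return len(st) == len(at)
}

// scope expands a wildcard sub into the allowed subjects it would cover.
func scope(sub string, allow []string) []string {
	var out []string
	for _, a := range allow {
		if covers(sub, a) {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	allow := []string{"foo", "bar", "foo.bar"}
	fmt.Println(scope("*", allow))     // [foo bar]
	fmt.Println(scope(">", allow))     // [foo bar foo.bar]
	fmt.Println(scope("foo.*", allow)) // [foo.bar]
}
```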
Signed-off-by: Derek Collison <derek@nats.io>