This PR backports the OCSP Peer feature option (as in the 2.10 train) and
includes two fixes and one improvement for the existing OCSP Staple feature.
OCSP Staple:
1. Fixed and clarified how NATS Server determines its own Issuer CA when
obtaining and validating an OCSP Response for subsequent stapling
2. Eliminated the problematic assumption that all node peers are issued by the
same CA when NATS Server validates ROUTE and GATEWAY peer nodes
3. Added OCSP Response effectivity checks on ROUTE and GATEWAY
peer-presented staples
Note for #3: The allowed host clock skew between node peers is set at
30 seconds. If the OCSP Response contains an empty assertion for
NextUpdate, NATS Server will default to 1-hour validity (after
ThisUpdate). It is recommended that the CA's OCSP Responder assert
NextUpdate.
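A minimal sketch of the kind of effectivity check described in this note, using the 30-second skew and 1-hour fallback mentioned above; the helper name and signature are hypothetical, not the server's actual implementation:
```go
package main

import (
	"fmt"
	"time"
)

// Values taken from the note above: 30s allowed clock skew between peers,
// and a 1-hour fallback validity window when NextUpdate is not asserted.
const (
	allowedClockSkew = 30 * time.Second
	defaultValidity  = time.Hour
)

// checkOCSPEffectivity is a hypothetical helper: it reports whether an OCSP
// response is currently effective given its ThisUpdate/NextUpdate times.
func checkOCSPEffectivity(thisUpdate, nextUpdate, now time.Time) error {
	if now.Add(allowedClockSkew).Before(thisUpdate) {
		return fmt.Errorf("OCSP response not yet valid (ThisUpdate %v)", thisUpdate)
	}
	// If the responder did not assert NextUpdate, fall back to 1 hour
	// after ThisUpdate, as described in the note.
	if nextUpdate.IsZero() {
		nextUpdate = thisUpdate.Add(defaultValidity)
	}
	if now.Add(-allowedClockSkew).After(nextUpdate) {
		return fmt.Errorf("OCSP response expired (NextUpdate %v)", nextUpdate)
	}
	return nil
}

func main() {
	now := time.Now()
	// Example: a staple with ThisUpdate 10 minutes ago and no NextUpdate.
	err := checkOCSPEffectivity(now.Add(-10*time.Minute), time.Time{}, now)
	fmt.Println("effective:", err == nil)
}
```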
On reload, some of the imports from the system account were going
missing; this adds them back after a reload:
```
$SYS.REQ.SERVER.PING.CONNZ
$SYS.REQ.ACCOUNT.PING.STATZ
$SYS.REQ.ACCOUNT.PING.CONNZ
```
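One way to sanity-check this after a reload, sketched below with nats.go (the URL and credentials are placeholders, and the connecting user must be in an account that can access these subjects), is to request each listed subject and confirm a response still arrives:
```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Placeholder URL and credentials; adjust for your deployment.
	nc, err := nats.Connect("nats://127.0.0.1:4222", nats.UserInfo("app", "secret"))
	if err != nil {
		panic(err)
	}
	defer nc.Close()

	// After a config reload, each of these subjects should still answer.
	subjects := []string{
		"$SYS.REQ.SERVER.PING.CONNZ",
		"$SYS.REQ.ACCOUNT.PING.STATZ",
		"$SYS.REQ.ACCOUNT.PING.CONNZ",
	}
	for _, subj := range subjects {
		msg, err := nc.Request(subj, nil, 2*time.Second)
		if err != nil {
			fmt.Printf("%s: no response (%v)\n", subj, err)
			continue
		}
		fmt.Printf("%s: %d bytes\n", subj, len(msg.Data))
	}
}
```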
This makes configuration files that are empty, or that are read and processed
by the parser but with no detected values, return an error.
Fixes #4343
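A minimal sketch of the behavior, using a hypothetical validation helper rather than the actual parser code: if parsing yields no values at all, surface an error instead of silently accepting the file.
```go
package main

import (
	"errors"
	"fmt"
)

// validateParsedConfig is a hypothetical illustration of the new behavior:
// a config file that is empty, or that parses to no detected values,
// is rejected with an error.
func validateParsedConfig(values map[string]interface{}) error {
	if len(values) == 0 {
		return errors.New("config has no values")
	}
	return nil
}

func main() {
	fmt.Println(validateParsedConfig(map[string]interface{}{}))              // config has no values
	fmt.Println(validateParsedConfig(map[string]interface{}{"port": 4222})) // <nil>
}
```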
Backport from dev branch
(https://github.com/nats-io/nats-server/pull/4347)
Signed-off-by: Waldemar Quevedo <wally@nats.io>
Do not hold onto no-interest subjects from a client in the unlocked cache.
If a client sends lots of different subjects, all with no interest, performance could be affected.
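A rough sketch of the idea, assuming a simple per-client result cache keyed by subject (the type and field names are hypothetical): only cache subjects that have matching interest, so a flood of distinct no-interest subjects cannot grow the cache.
```go
package main

import "fmt"

// subjectCache is a hypothetical per-client cache of subscription matches.
type subjectCache struct {
	results map[string][]string // subject -> matching subscriptions
}

// cacheResult stores a match result, but skips subjects with no interest so
// a client publishing many unique no-interest subjects cannot bloat the cache.
func (c *subjectCache) cacheResult(subject string, matches []string) {
	if len(matches) == 0 {
		return // do not hold onto no-interest subjects
	}
	if c.results == nil {
		c.results = make(map[string][]string)
	}
	c.results[subject] = matches
}

func main() {
	var c subjectCache
	c.cacheResult("orders.created", []string{"sub1"})
	c.cacheResult("noone.listens.here", nil)
	fmt.Println(len(c.results)) // 1
}
```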
Signed-off-by: Derek Collison <derek@nats.io>
Resolves #4341
Three issues were found and resolved.
1. Some purge replays after recovery could execute a full purge.
2. A callback was registered without holding the lock, which could lead to
skew (see the sketch after this list).
3. A cluster reset could stop the stream store and recreate it, which could
lead to double accounting.
Signed-off-by: Derek Collison <derek@nats.io>
When a lazy simple state has an outdated first sequence that needs to be
updated, the code would panic if fseq had already moved past it.
This was not common, but with the previous fix it can become more common,
which is why it showed up.
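A hedged sketch of the guard implied here, with hypothetical names: when refreshing an outdated first sequence from lazily maintained state, clamp it against fseq instead of assuming fseq is still ahead of it.
```go
package main

import "fmt"

// updateFirst is a hypothetical illustration: recompute the cached first
// sequence, but never let it fall behind fseq, which may have moved past it.
func updateFirst(cachedFirst, fseq uint64) uint64 {
	if cachedFirst < fseq {
		// fseq has moved past the outdated first; use fseq rather than
		// indexing with the stale value (which could panic).
		return fseq
	}
	return cachedFirst
}

func main() {
	fmt.Println(updateFirst(10, 25)) // 25
	fmt.Println(updateFirst(30, 25)) // 30
}
```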
Signed-off-by: Derek Collison <derek@nats.io>
This is a fix for a bad msg blk detected in the field that had sequence
holes.
The stream had a max msgs per subject of one and only a single subject, but
had lots of messages. The stream did not recover correctly, and upon further
inspection we determined that a msg blk had holes, which should not be
possible.
We now detect the holes and deal with the situation appropriately.
Heavily tested on the data dump from the field.
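A minimal sketch of the kind of hole detection described, assuming a hypothetical helper that scans the sorted message sequences recovered from a block:
```go
package main

import "fmt"

// findHoles is a hypothetical helper that reports gaps in a sorted list of
// message sequences recovered from a block; holes should not normally occur.
func findHoles(seqs []uint64) []uint64 {
	var holes []uint64
	for i := 1; i < len(seqs); i++ {
		for s := seqs[i-1] + 1; s < seqs[i]; s++ {
			holes = append(holes, s)
		}
	}
	return holes
}

func main() {
	// Sequences 3 and 4 are missing from the block.
	fmt.Println(findHoles([]uint64{1, 2, 5, 6})) // [3 4]
}
```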
Signed-off-by: Derek Collison <derek@nats.io>
Previously the Total in paged responses would always equal the size of
the first response; this would stall paged clients after the first page.
Now the total is set correctly so paging continues, and the test is improved
to verify these aspects of the report.
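A paging client relies on Total to keep requesting pages; a sketch of that loop is below, where the page type and fetch function are hypothetical stand-ins for the report API:
```go
package main

import "fmt"

// page is a hypothetical paged response carrying the overall Total.
type page struct {
	Total int
	Items []string
}

// fetchAll keeps requesting pages until the offset reaches Total; if Total
// were reported as the size of the first page, it would stop after one request.
func fetchAll(fetch func(offset int) page) []string {
	var all []string
	offset := 0
	for {
		p := fetch(offset)
		all = append(all, p.Items...)
		offset += len(p.Items)
		if len(p.Items) == 0 || offset >= p.Total {
			break
		}
	}
	return all
}

func main() {
	data := []string{"a", "b", "c", "d", "e"}
	fetch := func(offset int) page {
		end := offset + 2
		if end > len(data) {
			end = len(data)
		}
		return page{Total: len(data), Items: data[offset:end]}
	}
	fmt.Println(fetchAll(fetch)) // [a b c d e]
}
```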
Signed-off-by: R.I.Pienaar <rip@devco.net>
If we created lots of hashes beyond server names, such as for consumer or
stream NRG group names, these maps would grow and not release memory.
The performance hit is ~300ns per call, and we can use the string interning
trick at a future date if need be, since it is GC friendly.
Signed-off-by: Derek Collison <derek@nats.io>
Resolves #4289
In the benchmark on my machine, this added ~300ns per call, but I think that is acceptable for now versus the memory usage.
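A rough illustration of the trade-off, with a hypothetical hash function rather than the server's actual code: dropping the memoization map means recomputing the hash on every call (roughly the ~300ns mentioned), in exchange for not retaining an entry per group name.
```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// nameHash recomputes the hash on every call instead of memoizing results in
// a map keyed by name, so transient consumer/stream group names do not pin
// memory for the lifetime of the process.
func nameHash(name string) string {
	sum := sha256.Sum256([]byte(name))
	return base64.RawURLEncoding.EncodeToString(sum[:8])
}

func main() {
	fmt.Println(nameHash("S-R3F-abcdefgh")) // example NRG-style group name
}
```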
Signed-off-by: Derek Collison <derek@nats.io>
When service imports were reloaded on active accounts with lots of
traffic, the server could panic or lose data.
Signed-off-by: Derek Collison <derek@nats.io>