Commit Graph

7913 Commits

Author SHA1 Message Date
Neil Twigg
d7f76da597 Allow switching from limits-based to interest-based retention in stream update
Signed-off-by: Neil Twigg <neil@nats.io>
2023-08-09 11:46:49 +01:00
Neil
6eb77fd46b test: fix TestAccountImportCycle flake (#4381)
Add extra flushes to make test more precise and try to avoid timeouts

```
=== RUN   TestAccountImportCycle
    accounts_test.go:3447: require no error, but got: nats: timeout
--- FAIL: TestAccountImportCycle (1.01s)
```
2023-08-09 11:39:52 +01:00
Neil
617d69d6c7 Match --signal PIDs with globular-style expression. (#4370)
When multiple instances are running on the machine a PID argument
suffixed with a '*' character will signal all matching PIDs.

Example: `nats-server --signal reload=*`

 - [ ] Link to issue, e.g. `Resolves #NNN`
 - [ ] Documentation added (if applicable)
 - [X] Tests added
 - [X] Branch rebased on top of current ~~main~~ dev
 - [X] Changes squashed to a single commit (described
   [here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html))
 - [ ] Build is green in Travis CI
 - [X] You have certified that the contribution is your original work and
   that you license the work to the project under the [Apache 2
   license](https://github.com/nats-io/nats-server/blob/main/LICENSE)
2023-08-09 11:16:56 +01:00
Neil
1e3e88b528 test: fix TestMQTTTLSVerifyAndMap on Go 1.21 (#4380)
Reported error changed slightly in Go 1.21

```
=== RUN   TestMQTTTLSVerifyAndMap
=== RUN   TestMQTTTLSVerifyAndMap/no_filtering,_client_does_not_provide_cert
    mqtt_test.go:1033: Unexpected error: Error reading: remote error: tls: certificate required
--- FAIL: TestMQTTTLSVerifyAndMap (0.04s)
```
2023-08-09 10:44:50 +01:00
Waldemar Quevedo
14a56e28dd test: fix TestAccountImportCycle flake
add extra flushes to make test more precise

Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-08 23:41:18 -07:00
Waldemar Quevedo
e68c411b74 test: fix TestMQTTTLSVerifyAndMap on Go 1.21
reported error changed slightly in Go 1.21

```
=== RUN   TestMQTTTLSVerifyAndMap
=== RUN   TestMQTTTLSVerifyAndMap/no_filtering,_client_does_not_provide_cert
    mqtt_test.go:1033: Unexpected error: Error reading: remote error: tls: certificate required
--- FAIL: TestMQTTTLSVerifyAndMap (0.04s)
```

Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-08 23:10:29 -07:00
Waldemar Quevedo
6703bd7ee3 test: fix TestFileStoreNewWriteIndexInfo hanging (#4378)
`t.Fatalf` being called while holding a lock would sometimes leave
builds hanging until test timeout.

```
=== RUN   TestFileStoreNewWriteIndexInfo/AES-GCM-None
=== RUN   TestFileStoreNewWriteIndexInfo/AES-GCM-S2
    filestore_test.go:5483: require true, but got false
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
```
2023-08-08 17:40:24 -07:00
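The hang described in 6703bd7ee3 follows from how `t.Fatalf` works: it stops the test goroutine immediately, so a lock taken without a deferred unlock stays held and any later cleanup that needs it blocks. The sketch below shows one common way to avoid this, not necessarily the change made in that commit: inspect state under the lock, return the result as an error, and fail only after the lock is released. The `store` type and `checkClean` method are hypothetical names used for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a type whose state is guarded by a mutex.
type store struct {
	mu    sync.Mutex
	dirty bool
}

// checkClean inspects state under the lock but reports the outcome as an
// error. The deferred unlock runs before the caller sees the result, so the
// caller can fail the test without the lock being held.
func (s *store) checkClean() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.dirty {
		return errors.New("expected store to be clean")
	}
	return nil
}

func main() {
	s := &store{dirty: true}
	if err := s.checkClean(); err != nil {
		// In a real test this would be t.Fatalf; the lock is already free here.
		fmt.Println("check failed:", err)
	}

	// Cleanup that needs the lock can still proceed instead of blocking.
	s.mu.Lock()
	s.dirty = false
	s.mu.Unlock()
	fmt.Println("cleanup done")
}
```

In a test, the call site would be `if err := s.checkClean(); err != nil { t.Fatalf("%v", err) }`, which keeps the failure report outside the critical section.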
Waldemar Quevedo
1492cf717f test: fix TestFileStoreNewWriteIndexInfo hanging
t.Fatalf being called while holding a lock would
sometimes leave builds hanging.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-08 16:41:15 -07:00
Waldemar Quevedo
961c0d7187 Add Go 1.20 to Travis and Nightly images (#4336)
Picks up https://github.com/nats-io/nats-server/pull/4297 into main.
Includes:

- Using Go 1.20 for the nightly images and Travis tests
- Drop Go 1.18
- Updates to GitHub Actions
- Upgrade to golang-ci
2023-08-08 10:36:23 -07:00
Waldemar Quevedo
0ffd455e32 test: update TestNoRaceJetStreamServiceImportAccountSwapIssue flake (#4376)
Let pull consumer in test fetch messages for slightly longer instead of
at the same time as the producer, to avoid failing due to missing a few
messages:

```
=== RUN   TestNoRaceJetStreamServiceImportAccountSwapIssue
    norace_test.go:1194: Expected to receive 14982 msgs, only got 14981
--- FAIL: TestNoRaceJetStreamServiceImportAccountSwapIssue (3.03s)
```
2023-08-08 02:01:44 -07:00
Tomasz Pietrek
b57675b24d Fix race in consumer create (#4377)
This fixes the race condition in consumer create API by adding a missing
return statement, probably introduced while solving conflicts.

Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-08-08 10:36:57 +02:00
Waldemar Quevedo
b081f8c2ea test: update TestNoRaceJetStreamServiceImportAccountSwapIssue flake
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-08 01:07:19 -07:00
Tomasz Pietrek
54fe8cb14f Fix race in consumer create
Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-08-08 09:16:44 +02:00
Sylvain Rabot
64b2f5b364 Add Go 1.20 to Travis
- Use golang-ci in go test workflow

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-07 17:12:20 -07:00
Waldemar Quevedo
ab5eeff1c3 test: bump timeout from TestAccountReloadServiceImportPanic (#4374)
It can take slightly longer in Travis close to the deadline so bumping
it for this test:

```
=== RUN   TestAccountReloadServiceImportPanic
--- PASS: TestAccountReloadServiceImportPanic (10.60s)
=== RUN   TestAccountReloadServiceImportPanic
    accounts_test.go:3621: Have not received all responses, want 187876 got 182649
--- FAIL: TestAccountReloadServiceImportPanic (14.09s)
```
2023-08-07 17:05:08 -07:00
Waldemar Quevedo
59b82198b6 test: fix TestClusterTLSMixedIPAndDNS test in +go1.20 (#4373)
Test would fail now with the leafnode not being able to connect due to
the following:

```
[4257] [INF] 127.0.0.1:63538 - lid:6 - Leafnode connection created for account: $G 
[4257] [INF] 127.0.0.1:63547 - lid:6 - Leafnode connection created 
[4257] [DBG] 127.0.0.1:63547 - lid:6 - Starting TLS leafnode server handshake
[4257] [DBG] 127.0.0.1:63538 - lid:6 - Starting TLS leafnode client handshake
[4257] [ERR] 127.0.0.1:63538 - lid:6 - TLS leafnode handshake error: x509: certificate is not valid for any names, but wanted to match localhost
[4257] [INF] 127.0.0.1:63538 - lid:6 - Leafnode connection closed: TLS Handshake Failure - Account: $G
[4257] [ERR] 127.0.0.1:63547 - lid:6 - TLS leafnode handshake error: remote error: tls: bad certificate
[4257] [INF] 127.0.0.1:63547 - lid:6 - Leafnode connection closed: TLS Handshake Failure
```
2023-08-07 17:04:27 -07:00
Waldemar Quevedo
2630e9b597 test: bump timeout from TestAccountReloadServiceImportPanic
It can take slightly longer in a testing environment.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-07 16:42:12 -07:00
Waldemar Quevedo
9d43fb9606 test: fix TestClusterTLSMixedIPAndDNS test on +go1.20
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-07 15:11:49 -07:00
Jason Volk
9c4ae764a1 Match --signal PIDs with globular-style expression.
When multiple instances are running on the machine a PID argument suffixed with
a '*' character will signal all matching PIDs.

Example: `nats-server --signal reload=*`

Signed-off-by: Jason Volk <jason@zemos.net>
2023-08-07 10:16:05 -07:00
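The matching rule described in 9c4ae764a1 can be sketched as follows. This is a minimal illustration that assumes a trailing `*` turns the argument into a prefix pattern over the decimal PID strings; `matchPIDs` is a hypothetical helper, not the function used in nats-server, and the list of running PIDs is supplied by hand rather than discovered from the system.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// matchPIDs returns the PIDs selected by expr. If expr ends in '*', every PID
// whose decimal form starts with the text before the '*' is selected;
// otherwise expr must equal the PID exactly.
// Hypothetical helper for illustration only.
func matchPIDs(expr string, running []int) []int {
	glob := strings.HasSuffix(expr, "*")
	prefix := strings.TrimSuffix(expr, "*")

	var matched []int
	for _, pid := range running {
		s := strconv.Itoa(pid)
		if (glob && strings.HasPrefix(s, prefix)) || (!glob && s == prefix) {
			matched = append(matched, pid)
		}
	}
	return matched
}

func main() {
	running := []int{4257, 4258, 5120}
	fmt.Println(matchPIDs("42*", running))  // [4257 4258]
	fmt.Println(matchPIDs("*", running))    // [4257 4258 5120]
	fmt.Println(matchPIDs("5120", running)) // [5120]
}
```

A bare `*`, as in the `reload=*` example, has an empty prefix and therefore selects every running instance.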
Derek Collison
6ca7887992 [IMPROVED] Delete blocks performance (#4371)
Track deleted with single avl.SeqSet dmap for now vs old method for
memory store.

For fileStore, we were trying to be too smart to save space at the
expense of encoding time, so revert to the simple version, which is much
faster (sometimes 100x).

Size of encoding may be a bit bigger than we wanted, but we want to
prefer speed over size.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-07 09:18:48 -07:00
Waldemar Quevedo
abe0791313 Fixes to service system imports on reload also when using custom system account (#4372)
Adds back the fix from #4369 and also fixes the export that was going
missing in dev branch when a custom system account was being used.
2023-08-07 09:02:48 -07:00
Neil
c3f256ded6 Add consumer api action (#4217)
Add distinction between create and update to consumer API

The server has a single API for both creating and updating consumers.
If clients want to guard their users against overriding an existing
consumer with a create operation, or accidentally creating one with an
update, they need to rely on calling `Info` first.
That adds latency, traffic and load on the server and is still racy,
as state on the server can change between the info and create calls.

This PR adds `Action` to CreateConsumerRequest, a non-breaking change
that allows clients to state their intent without splitting the
Consumer API into create and update.

This is not a perfect solution, but, unlike such a split, it is
non-breaking and does not require a new API version.

TODO:
- [x] Add concrete error types to errors.json and use them
- [ ] Add ADR (after LGTM)

Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-08-07 10:55:57 +01:00
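The idea behind c3f256ded6 can be sketched as follows: the request carries the caller's intent, and the server checks existence and applies the change in one step, so no separate `Info` round trip is needed. The type, constant, and error names below are hypothetical and simplified; they are not taken from the nats-server source, and the in-memory `registry` stands in for server state accessed from a single goroutine.

```go
package main

import (
	"errors"
	"fmt"
)

// Action expresses what the caller intends. Hypothetical names.
type Action int

const (
	ActionCreateOrUpdate Action = iota // legacy behavior: either is fine
	ActionCreate                       // fail if the consumer already exists
	ActionUpdate                       // fail if the consumer does not exist
)

var (
	errExists   = errors.New("consumer already exists")
	errNotFound = errors.New("consumer does not exist")
)

// registry is a toy stand-in for server-side consumer state.
type registry struct {
	consumers map[string]string // name -> config
}

// apply checks the intent against current state and writes the config in the
// same step, which is what removes the info-then-create race for callers.
func (r *registry) apply(name, config string, action Action) error {
	_, exists := r.consumers[name]
	switch {
	case action == ActionCreate && exists:
		return errExists
	case action == ActionUpdate && !exists:
		return errNotFound
	}
	r.consumers[name] = config
	return nil
}

func main() {
	r := &registry{consumers: map[string]string{}}
	fmt.Println(r.apply("c1", "v1", ActionUpdate)) // consumer does not exist
	fmt.Println(r.apply("c1", "v1", ActionCreate)) // <nil>
	fmt.Println(r.apply("c1", "v2", ActionCreate)) // consumer already exists
	fmt.Println(r.apply("c1", "v2", ActionUpdate)) // <nil>
}
```

Making the zero value mean "create or update" is what keeps such a field non-breaking in this sketch: older clients that never set it get the previous behavior.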
Jean-Noël Moyne
2d5c5d68ce Adds a few tests to verify that addConsumerWithAction also works for named ephemeral consumers as well as for durables
Signed-off-by: Jean-Noël Moyne <jnmoyne@gmail.com>
2023-08-07 08:28:21 +02:00
Tomasz Pietrek
d105e68c96 Add consumer api action for create and update
Signed-off-by: Tomasz Pietrek <tomasz@nats.io>
2023-08-07 08:28:21 +02:00
Waldemar Quevedo
6b9008c1f4 Fixes to service imports on reload
Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-05 18:21:01 -07:00
Derek Collison
75e1171bdd No longer compacting multiple blocks, so remove test check
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-05 13:20:38 -07:00
Derek Collison
3b235059fa We were trying to be too smart to save space at the expense of encoding time for filestore.
Revert back to very simple but way faster method. Sometimes 100x faster and only ~8% size increase.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-05 12:33:30 -07:00
Derek Collison
1f00d0e3f2 Track deleted with single avl.SeqSet dmap for now vs old method.
Size of encoding may be a bit bigger than we wanted, but still way better than old method and very fast.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-05 12:32:29 -07:00
Waldemar Quevedo
0e7394a788 Remove reload fix from main (#4369)
The fix from #4360 will not work for v2.10 branch features so removing
from dev and working on a different PR.
2023-08-04 17:29:54 -07:00
Waldemar Quevedo
eecb8af997 Remove reload fix from main
This workaround will not work for v2.10 branch features

Signed-off-by: Waldemar Quevedo <wally@nats.io>
2023-08-04 16:57:39 -07:00
Derek Collison
c0c9633024 Fix for flapping test
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-04 15:13:44 -07:00
Derek Collison
20532c28dd Merge branch 'main' into dev 2023-08-04 12:03:13 -07:00
Derek Collison
f2c7a9d37f Fix for flapping test
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-04 12:02:59 -07:00
Derek Collison
3c57adcfe5 Bump to 2.10.0-beta.49
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-04 10:16:09 -07:00
Derek Collison
8079495903 Merge branch 'main' into dev
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-04 10:15:35 -07:00
Derek Collison
b2e7725aed Release v2.9.21 (#4368) 2023-08-04 07:46:27 -07:00
Byron Ruth
c1d1f11a18 Release v2.9.21
Signed-off-by: Byron Ruth <byron@nats.io>
2023-08-04 10:11:06 -04:00
Derek Collison
8c6055babc Bump to 2.9.21-RC.6
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-03 13:25:14 -07:00
Derek Collison
087e14782d [IMPROVED] Also reset clseq to avoid possible immediate sequence mismatch (#4366)
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-03 13:24:24 -07:00
Derek Collison
cbe85c826a Also reset clseq to avoid immediate sequence mismatch
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-03 12:40:17 -07:00
Derek Collison
d522f4656c Bump to 2.9.21-RC.5
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-03 11:17:51 -07:00
Derek Collison
34199ab6a8 [IMPROVED] When taking over make sure to sync and reset clfs for clustered streams. (#4365)
If the failed state of clfs drifts between leaders and followers,
replicas could discard and skip messages possibly incorrectly. This will
force sync if we have a non-zero clfs state when a leader takes over.

Signed-off-by: Derek Collison <derek@nats.io>
2023-08-03 11:17:03 -07:00
Derek Collison
66a8e81d49 Bump Go to 1.19.12 (#4364) 2023-08-03 10:45:39 -07:00
Derek Collison
081140ee67 When taking over make sure to sync and reset clfs for clustered streams.
Signed-off-by: Derek Collison <derek@nats.io>
2023-08-03 10:41:10 -07:00
Byron Ruth
af52adb1ee Bump Go to 1.19.12
Signed-off-by: Byron Ruth <byron@nats.io>
2023-08-03 11:24:58 -04:00
Derek Collison
9de5e3e64d OCSP backports and adds (#4362)
This PR backports the OCSP Peer feature option (as in 2.10 train) and
includes two fixes for the existing OCSP Staple feature.

OCSP Staple: 

1. Fixed and clarified how NATS Server determines its own Issuer CA when
obtaining and validating an OCSP Response for subsequent staple
2. Eliminated problematic assumption that all node peers are issued by
same CA when NATS Server validates ROUTE and GATEWAY peer nodes
3. Added OCSP Response effectivity checks on ROUTE and GATEWAY
peer-presented staple

Note for #3: The allowed host clock skew between node peers is set at
30 seconds. If the OCSP Response contains an empty assertion for
NextUpdate, NATS Server will default to 1-hour validity (after
ThisUpdate). It is recommended that the CA OCSP Responder assert
NextUpdate.
2023-08-02 18:10:24 -07:00
Todd Beets
ac43a8d4eb Enhance OCSP peer validation for GATEWAY and ROUTE connections. Nodes no longer required to have same CA issuer. OCSP response effectivity now checked using default clock skew and default validity period if not asserted by responder. 2023-08-02 16:09:21 -07:00
Todd Beets
1f0b70d5fc Fixed local issuer determination for OCSP Staple, issue #3773 2023-08-02 11:52:36 -07:00
Todd Beets
209fcd70eb OCSP Peer Feature 2023-08-02 11:25:48 -07:00
Derek Collison
5577d18c67 Fix some system service imports going missing after reload (#4360)
On reload, some of the imports from the system account were going
missing; this adds them back after a reload:

```
$SYS.REQ.SERVER.PING.CONNZ
$SYS.REQ.ACCOUNT.PING.STATZ
$SYS.REQ.ACCOUNT.PING.CONNZ
```
2023-08-02 10:14:07 -07:00