This will allow a leafnode remote connection to prevent unwanted
messages to be received, or prevent local messages to be sent
to the remote server.
Configuration will be something like:
```
leafnodes {
remotes: [
{
url: "nats://localhost:6222"
deny_imports: ["foo.*", "bar"]
deny_exports: ["baz.*", "bat"]
}
]
}
```
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles.
This will have the server who completes the loop be the one that detects the error soley based on
its own loop detection subject.
Otehr changes are just to fix tests that were not waiting for the new LDS sub.
Signed-off-by: Derek Collison <derek@nats.io>
This is in addition to checking if the own subscription comes back.
The duplicated lds subscription must come from a different client.
Added unit tests.
Also prefixed lds with '$' to mark it as system subject going forward.
This moves the loop detection check past other checks.
These checks should not trigger in cases where a loop is initially detected.
Fixes#1305
Signed-off-by: Matthias Hanel <mh@synadia.com>
- All writes will now be done by the writeLoop, unless when the
writeLoop has not been started yet (likely in connection init).
- Slow consumers for non CLIENT connections will be reported but
not failed. The idea is that routes, gateway, etc.. connections
should stay connected as much as possible. However if a flush
operation times out and no data at all has been written, the
connection will be closed (regardless of type).
- Slow consumers due to max pending is only for CLIENT connections.
This allows sending of SUBs through routes, etc.. to not have
to be chunked.
- The backpressure to CLIENT connections is increased (up to 1sec)
based on the sub's connection pending bytes level.
- Connection is flushed on close from the writeLoop as to not block
the "fast path".
Some tests have been fixed and adapted since now closeConnection()
is not flushing/closing/removing connection in place.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Updated all tests that use "async" clients.
- start the writeLoop (this is in preparation for changes in the
server that will not do send-in-place for some protocols, such
as PING, etc..)
- Added missing defers in several tests
- fixed an issue in client.go where test was wrong possibly causing
a panic.
- Had to skip a test for now since it would fail without server code
change.
The next step will be ensure that all protocols are sent through
the writeLoop and that the data is properly flushed on close (important
for -ERR for instance).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Call flushOutbound() for SYSTEM connections
- Flush in place in internalSendLoop when sending the shutdown event
- Fix some tests:
- missing defer client connection Close()
- ensure subs are registered and messages received before shutdown
of leafnode server to check disconnected event's stats.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Running test suite on a Windows VM, I notice several failures.
Updated the compute of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since setting the rtt
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.
Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Risk of deadlock when checking if issuer claim are trusted. There
was a RLock() in one thread, then a request for Lock() in another
that was waiting for RLock() to return, but the first thread was
then doing RLock() which was not acquired because this was blocked
by the Lock() request (see e2160cc571)
- Use proper account/locking mode when checking if stream/service
exports/signer have changed.
- Account registration race (regression from https://github.com/nats-io/nats-server/pull/890)
- Move test from #890 to "no race" test since only then could it detect
the double registration.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Ensure that lookupAccount does not hold server lock during
updateAccount and fetchAccount.
Updating the account cannot have the server lock because it is
possible that during updateAccountClaims(), clients are being
removed, which would try to get the server lock (deep down in
closeConnection/s.removeClient).
Added a test that would have show the deadlock prior to changes
in this PR.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is the first pass at introducing exported services to the system account for generally debugging of blackbox systems.
The first service reports number of subscribers for a given subject. The payload of the request is the subject, and optional queue group, and can contain wildcards.
Signed-off-by: Derek Collison <derek@nats.io>
The timer was not set with the proper variable, which caused the
check to always think that a new timer should be created, which
would lead to more and more timers being created which translated
to updates being sent more and more frequently.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- TestSystemAccountConnectionUpdatesStopAfterNoLocal: I believe that
the check on number of notifications was wrong. Since we did not
consume the ones for the connect, the expected count after the
disconnect is 8 instead of 4.
- Possible fix GW tests complaining about number of outbound/inbound
I think that it may be possible that connection does not succeed
right away (remote to fully started, etc) and due to dial timeout
and reconnect attempt delay, I suspect that when given a max time
of 1sec to complete, it may not be enough.
Quick change for now is to override to 2secs for now in the
wait helpers. If that proves conclusive, we could remove the
timeout given to these helpers.
- TestGatewaySendAllSubsBadProtocol: used a t.Fatalf() in checkFor
instead of return fmt.Errorf().
- TestLeafNodeResetsMSGProto: this test is not about change to
interest mode only, so to avoid possible mix of protos, delay
a bit creation of gateway after creation of leaf node.
- Some defer s.Shutdown() were missing
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
What is not completed:
1. TLS
2. config to bind local account.
3. Info updates for solicitor to track topology changes like a client.
4. CONNECT sent after INFO for nonce authroization.
5. Authorization
6. Services and Streams tests.
7. config file parsing.
Signed-off-by: Derek Collison <derek@nats.io>
Converted to authorization error events on different subject.
Add cluster name if gateways are configured and pass in INFO to clients.
Signed-off-by: Derek Collison <derek@nats.io>
Changed account lookup and validation failures to be more understandable by users.
Changed limits to be -1 for unlimited to match jwt pkg.
The limits changed exposed problems with options holding real objects causing issues with reload tests under race mode.
Longer term this code should be reworked such that options only hold config data, not real structs, etc.
Signed-off-by: Derek Collison <derek@nats.io>
Removed the code getting matching subscriptions and trying
to exclude non internal interest since as soon as there is
routing and/or gateway, it is likely that server would end-up
generating the payload and sending. May need to revisit.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added update to parse and load operator JWTs.
Changed to add in signing keys from operator JWT to list of trusted keys.
Added URL account resolver.
Added account claim updates by system messages.
Signed-off-by: Derek Collison <derek@nats.io>
Specifically this is to support distributed tracking of number of account connections across clusters.
Gateways may not work yet based on attempts to only generate payloads when we know there is outside interest.
Signed-off-by: Derek Collison <derek@nats.io>