We need to send the unique LDS subject to all leafnodes to properly detect setups like triangles.
This will have the server who completes the loop be the one that detects the error soley based on
its own loop detection subject.
Otehr changes are just to fix tests that were not waiting for the new LDS sub.
Signed-off-by: Derek Collison <derek@nats.io>
This is needed to put those accounts into interest only mode for inbound gateway connections. Also added code
to make sure we were doing proper account tracking and would track the global account as well, which used to
be excluded.
Fixes#977
Signed-off-by: Derek Collison <derek@nats.io>
This is in addition to checking if the own subscription comes back.
The duplicated lds subscription must come from a different client.
Added unit tests.
Also prefixed lds with '$' to mark it as system subject going forward.
This moves the loop detection check past other checks.
These checks should not trigger in cases where a loop is initially detected.
Fixes#1305
Signed-off-by: Matthias Hanel <mh@synadia.com>
On connect message resend reset timer with setFirstPingTimer, so RTT can
be obtained quicker.
Disable short first ping in default server options for client_test.
In log_test prevent immediate scheduling by setting ping interval.
Signed-off-by: Matthias Hanel <mh@synadia.com>
For issue #1256, we cleared the possibly saved tlsName on Hanshake failure.
However, this meant that for normal use cases, if a reconnect failed for
any reason we would not be able to reconnect if it is an IP until we get
back to the URL that contained the hostname.
We now clear only if the handshake error is of x509.HostnameError type,
which include errors such as:
```
"x509: Common Name is not a valid hostname: <x>"
"x509: cannot validate certificate for <x> because it doesn't contain any IP SANs"
"x509: certificate is not valid for any names, but wanted to match <x>"
"x509: certificate is valid for <x>, not <y>"
```
Applied the same logic to solicited gateway connections, and fixed the fact
that the tlsConfig should be cloned (since we set the ServerName).
I have also made a change for leafnode connections similar to what we are
doing for gateway connections, which is to use the saved tlsName only if
tlsConfig.ServerName is empty, which may not be the case for users that
embed NATS Server and pass directly tls configuration. In other words,
if the option TLSConfig.ServerName is not empty, always use this value.
Relates to #1256
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When an account is switched to interest-only mode due to no interest,
it was not possible to switch that account more than once. But the
function switchAccountToInterestMode() that triggers a switch could
possibly doing it more than once. This should not cause problems
but increased the number of traces in a big super cluster.
Also fixed some flappers and a data race.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- All writes will now be done by the writeLoop, unless when the
writeLoop has not been started yet (likely in connection init).
- Slow consumers for non CLIENT connections will be reported but
not failed. The idea is that routes, gateway, etc.. connections
should stay connected as much as possible. However if a flush
operation times out and no data at all has been written, the
connection will be closed (regardless of type).
- Slow consumers due to max pending is only for CLIENT connections.
This allows sending of SUBs through routes, etc.. to not have
to be chunked.
- The backpressure to CLIENT connections is increased (up to 1sec)
based on the sub's connection pending bytes level.
- Connection is flushed on close from the writeLoop as to not block
the "fast path".
Some tests have been fixed and adapted since now closeConnection()
is not flushing/closing/removing connection in place.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Updated all tests that use "async" clients.
- start the writeLoop (this is in preparation for changes in the
server that will not do send-in-place for some protocols, such
as PING, etc..)
- Added missing defers in several tests
- fixed an issue in client.go where test was wrong possibly causing
a panic.
- Had to skip a test for now since it would fail without server code
change.
The next step will be ensure that all protocols are sent through
the writeLoop and that the data is properly flushed on close (important
for -ERR for instance).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Running test suite on a Windows VM, I notice several failures.
Updated the compute of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since setting the rtt
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.
Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is needed for mapped gateway replies. We had used an extra
token when implementing the new prefix, but it was then removed,
but the leafnode subscription on _GR_.*.*.*.> was not updated.
We now subscribe on _GR_.>
There was a test that was passing because we were using inboxes
that caused the pattern to match. Replaced with single token
reply so that it would have caught this bug.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Risk of deadlock when checking if issuer claim are trusted. There
was a RLock() in one thread, then a request for Lock() in another
that was waiting for RLock() to return, but the first thread was
then doing RLock() which was not acquired because this was blocked
by the Lock() request (see e2160cc571)
- Use proper account/locking mode when checking if stream/service
exports/signer have changed.
- Account registration race (regression from https://github.com/nats-io/nats-server/pull/890)
- Move test from #890 to "no race" test since only then could it detect
the double registration.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We had too much special processing, so reduced to a single wildcard
which will propagate across routes and gateways and is consistent
with gateway handling of globally routed subjects and timeouts.
Signed-off-by: Derek Collison <derek@nats.io>
Changed code on Windows to not use svc code if running in interactive
mode. The original code was running svc.debug.Run() which uses service
code (Execute()) but from the command line. We don't need that.
Also reduced salt on bcrypt password for a config file that started
to cause failures due to test taking too long to finish.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- New prefix that includes origin server for the request
- Mapping done if request is service import or requestor has
recent subscription
- Subscription considered recent if less than 250ms
- Destination server strip GW prefix before giving to client
and restore when getting a reply on that subject
- Mapping removed aftert 250ms
- Server rejects client publish on "$GNR." (the new prefix)
- Cluster and server hash are now 8 chars long and from base 62
alphabets
- Mapped replies need to be sent to leafnode servers due to race
(cluster B sends RS+ on GW inbound then RMSG on outbound, the
RS+ may be processed later and cluster A may have given message
to LN before RS+ on reply subject. So LN needs to accept the
mapped reply but will strip to give to client and reassemble
before sending it back)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Allow disabling of short first ping timer for clients.
Adjust names so that full test suite results are aligned.
Removed the account lookup, we use sync.Map but also a no-lock cache.
Signed-off-by: Derek Collison <derek@nats.io>
Ivan had the idea of using the CONNECT to establish a first estimate of RTT
without additional PING/PONGs.
Signed-off-by: Derek Collison <derek@nats.io>
If a client RTT for a requestor is longer than a service RTT, the requestor latency was often zero.
We now wait for the RTT (if zero) before sending out the metric.
Signed-off-by: Derek Collison <derek@nats.io>
This is achieved by subscribing to a unique subject. If the LS+
protocol is coming back for the same subject on the same account,
then this indicates a loop.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
1. Accounts will show up in connection info if auth=1.
2. You can filter by user (?auth=1&user=ivan) or account (?auth=1&acc=eng)
Signed-off-by: Derek Collison <derek@nats.io>