First, the test should be done only for the initial INFO and only
for solicited connections. Based on the content of INFO coming
from different "listen ports", use the CID and LeafNodeURLs for
the indication that we are connected to the proper port.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
For issue #1256, we cleared the possibly saved tlsName on Hanshake failure.
However, this meant that for normal use cases, if a reconnect failed for
any reason we would not be able to reconnect if it is an IP until we get
back to the URL that contained the hostname.
We now clear only if the handshake error is of x509.HostnameError type,
which include errors such as:
```
"x509: Common Name is not a valid hostname: <x>"
"x509: cannot validate certificate for <x> because it doesn't contain any IP SANs"
"x509: certificate is not valid for any names, but wanted to match <x>"
"x509: certificate is valid for <x>, not <y>"
```
Applied the same logic to solicited gateway connections, and fixed the fact
that the tlsConfig should be cloned (since we set the ServerName).
I have also made a change for leafnode connections similar to what we are
doing for gateway connections, which is to use the saved tlsName only if
tlsConfig.ServerName is empty, which may not be the case for users that
embed NATS Server and pass directly tls configuration. In other words,
if the option TLSConfig.ServerName is not empty, always use this value.
Relates to #1256
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When the server logs information related to a connection, it uses
the connection IP and remote port as a prefix. When it was an IPv6
address, the square brackets would be missing.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also make the wait bound to 3secs after which writeLoop will attempt
to flush. Will log if it timed out on the wait and entering with
fsp > 0. Limit the report to once every 10 minutes
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This related to PR #1233.
The computation of the time to stall a fast producer was bogus. Fixed
that and added a unit test for the function computing this stalled
duration.
Also, in PR #1233, I had removed Gosched() when a call to flushOutbound()
realizes that the flag is already set. It was forgetting that readLoop
in some cases will call flushOutbound() in place. So there is still
value in unlock/gosched/lock again in that function.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When an account is switched to interest-only mode due to no interest,
it was not possible to switch that account more than once. But the
function switchAccountToInterestMode() that triggers a switch could
possibly doing it more than once. This should not cause problems
but increased the number of traces in a big super cluster.
Also fixed some flappers and a data race.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
On config reload, the URL account resolver was recreated and a
Fetch() with empty account was done. Move the empty fetch test
in NewServer() instead.
Added a test that shows that fetch is no longer invoked on reload
but server reports failure on startup.
Resolves#1229
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- All writes will now be done by the writeLoop, unless when the
writeLoop has not been started yet (likely in connection init).
- Slow consumers for non CLIENT connections will be reported but
not failed. The idea is that routes, gateway, etc.. connections
should stay connected as much as possible. However if a flush
operation times out and no data at all has been written, the
connection will be closed (regardless of type).
- Slow consumers due to max pending is only for CLIENT connections.
This allows sending of SUBs through routes, etc.. to not have
to be chunked.
- The backpressure to CLIENT connections is increased (up to 1sec)
based on the sub's connection pending bytes level.
- Connection is flushed on close from the writeLoop as to not block
the "fast path".
Some tests have been fixed and adapted since now closeConnection()
is not flushing/closing/removing connection in place.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Updated all tests that use "async" clients.
- start the writeLoop (this is in preparation for changes in the
server that will not do send-in-place for some protocols, such
as PING, etc..)
- Added missing defers in several tests
- fixed an issue in client.go where test was wrong possibly causing
a panic.
- Had to skip a test for now since it would fail without server code
change.
The next step will be ensure that all protocols are sent through
the writeLoop and that the data is properly flushed on close (important
for -ERR for instance).
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Call flushOutbound() for SYSTEM connections
- Flush in place in internalSendLoop when sending the shutdown event
- Fix some tests:
- missing defer client connection Close()
- ensure subs are registered and messages received before shutdown
of leafnode server to check disconnected event's stats.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This could happen if the remote server is running but not dequeueing
from the socket. TLS connection Close() may send/read and so we
need to protect with a deadline.
For non client/leaf connection, do not call flushOutbound().
Set the write deadline regardless of handshakeComplete flag, and
set it to a low value.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Prevent sending an A- for a given account if the server has this
account registered and an internal service reply subscription.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Allow auto-rotation of log files when the size is greater than
configured limit.
The backup files have the same name than the original log file
name with the following suffix:
<log name>.yyyy.mm.dd.hh.mm.ss.micros
where:
- yyyy is the year
- mm is the month
- dd is the day
- hh is the hour
- mm is the minute
- ss is the second
- micros is the number of microseconds (to help avoid duplicates)
Resolves#1197
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Running test suite on a Windows VM, I notice several failures.
Updated the compute of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since setting the rtt
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.
Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
With this test, and with previous cloneArg() code, the parsing
will cause subject to become "fff" because c.pa.subject points
to somewhere in the underlying buffer that is now overwritten
with the body payload. With proper cloneArg(), the c.pa.subject
(and other) point to the copy of argBuf.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is needed for mapped gateway replies. We had used an extra
token when implementing the new prefix, but it was then removed,
but the leafnode subscription on _GR_.*.*.*.> was not updated.
We now subscribe on _GR_.>
There was a test that was passing because we were using inboxes
that caused the pattern to match. Replaced with single token
reply so that it would have caught this bug.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>