Got a data race:
```
==================
WARNING: DATA RACE
Write at 0x00c001c736b0 by goroutine 605:
runtime.mapassign_faststr()
/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/map_faststr.go:202 +0x0
github.com/nats-io/nats-server/v2/server.(*Account).addServiceImport()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/accounts.go:1868 +0xb7b
github.com/nats-io/nats-server/v2/server.(*Account).AddServiceImportWithClaim()
...
Previous read at 0x00c001c736b0 by goroutine 301:
runtime.mapaccess2_faststr()
/home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/map_faststr.go:107 +0x0
github.com/nats-io/nats-server/v2/server.(*Server).registerSystemImports()
/home/travis/gopath/src/github.com/nats-io/nats-server/server/events.go:1577 +0x284
github.com/nats-io/nats-server/v2/server.(*Server).updateAccountClaimsWithRefresh()
...
```
Also, remove some condition in gateway.go on how we were checking
if a subject was a serviec reply, which was causing a test to flap.
Finally, used AckSync() in a rest (instead of m.Respond(nil)) to
prevent it from flapping.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Gateway connection will be closed and error reported if a remote
has a name that is a duplicate of the local cluster.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.
Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- When detecting duplicate route, it was possible that a server
would lose track of the peer's gateway URL, which would prevent
it from gossiping that URL to inbound gateway connections
- When a server has gateways enabled and has as a remote its
own gateway, the monitoring endpoint `/varz` would include it
but without the "urls" array.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
* Redact URLs before logging or returning in error
This does not affect strings which failed to parse, and in such a scenario
there's a mix of "which evil" to accept; we can't sanely find what should be
redacted in those cases, so we leave them alone for debugging.
The JWT library returns some errors for Operator URLs, but it rejects URLs
which contain userinfo, so there can't be passwords in those and they're safe.
Fixes#2597
* Test the URL redaction auxiliary functions
* End-to-end tests for secrets in debug/trace
Create internal/testhelper and move DummyLogger there, so it can be used from
the test/ sub-dir too.
Let DummyLogger optionally accumulate all log messages, not just retain the
last-seen message.
Confirm no passwords logged by TestLeafNodeBasicAuthFailover.
Change TestNoPasswordsFromConnectTrace to check all trace messages, not just the
most recent.
Validate existing trace redaction in TestRouteToSelf.
* Test for password in solicited route reconnect debug
Updated some tests based on this change but also missing defer
connection close or server shutdown.
Fixed how the OCSP run go routine would shutdown, which would
never complete because grWG was not decremented by this go routine
prior to invoking s.Shutdown()
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Some issues that have been fixed would manifest by timeouts on
connect, unexpected memory usage on high publish message rate.
Some details:
- Replies were not always GW routed properly because we were looking
at the wrong connection's rsubs
- GW routed replies would not be found because they were tracked
in the subscription's client object, which may not be the same used
to send the reply
- Increased the mqtt timeout to wait for JS replies since in some
tests it was sometimes taking more than the original 2 seconds
- Incoming gateway messages destined for an MQTT internal subscription
may have been rejected as a no interest if the account had service imports
- Don't use time.After(), instead create explicit timer so it can
be stopped when not timing out.
- Unnecessary copy of a slice since we were converting to a string anyway.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".
This change checks for more filesystem related errors and calls t.Fatal
if there is an error.
If a gateway is configured with an authorization block containing
username and password and accepts an unknown Gateway connection,
when initiating the outbound connection, it should use the
gateway authorization's user/pass information.
Resolves#1912
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
PR #1412 had a fix for races during implicit GW reconnection.
However, the fix was a bit too simplistic in that it was checking
only if there was any inbound gateway to decide to try to reconnect
an implicit disconnected GW. We need to check the name, not only
presence of inbound GW connections.
Related to #1412
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Presence of TLS config in any remote gateway or leafnode would
cause the config reload to fail (because TLS config internal
content may change which fails the DeepEqual check).
This PR excludes the TLS configs in such case to check for
changes in gateways and leafnodes.
Although GW and LN config reload is technically supported, this
PR updates the internal remotes' TLS configuration so that
changes/updates to TLS certificates would take effect after
a configuration reload.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Connections normally suppress sending PINGs if there was some
activity. We now force GATEWAY connections to send PINGs at the
configured interval or 15 seconds, whichever is the smallest.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Inhibit Go's default TCP keepalive settings for NATS
Go 1.13 changed the semantics of the tuning parameters for TCP keepalives, including the default value. This affects all TCP listeners. The NATS protocol has its own L7 keepalive system (PING/PONG) and the Go defaults are not a good fit for some valid deployment scenarios, while Go doesn't directly expose a working API for tuning these.
Rather than add a configuration knob and pull in another dependency (with portability issues) just disable TCP keepalives for all listeners used for speaking the NATS protocol.
Change the tests so we test the same logic. Do not change HTTP monitoring, profiling, or the websocket API listeners.
Change KeepAlive on client connections too.
This was discovered with the test TestLeafNodeWithGatewaysServerRestart
that was sometimes failing. Investigation showed that when cluster B
was shutdown, one of the server on A that had a connection from B
that just broke tried to reconnect (as part of reconnect retries of
implicit gateways) to a server in B that was in the process of shuting down.
The connection had been accepted but createGateway not called because
the server's running boolean had been set to false as part of the shutdown.
However, the connection was not closed so the server on A had a valid
connection to a dead server from cluster B. When the B cluster (now single
server) was restarted and a LeafNode connection connected to it, then
the gateway from B to A was created, that server on A did not create outbound
connection to that B server because it already had one (the zombie one).
So this PR strengthens the starting of accept loops and also make sure
that if a connection (all type of connections) is not accepted because
the server is shuting down, that connection is properly closed.
Since all accept loops had almost same code, made a generic function
that accept functions to call specific create connection functions.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Added cluster names as required for prep work for clustered JetStream. System can dynamically pick a cluster name and settle on one even in large clusters.
Signed-off-by: Derek Collison <derek@nats.io>
Say server in cluster A accepts a connection from a server in
cluster B.
The gateway is implicit, in that A does not have a configured
remote gateway to B.
Then the server in B is shutdown, which A detects and initiate
a single reconnect attempt (since it is implicit and if the
reconnect retries is not set).
While this happens, a new server in B is restarted and connects
to A. If that happens before the initial reconnect attempt
failed, A will register that new inbound and do not attempt to
solicit because it has already a remote entry for gateway B.
At this point when the reconnect to old server B fails, then
the remote GW entry is removed, and A will not create an outbound
connection to the new B server.
We fix that by checking if there is a registered inbound when
we get to the point of removing the remote on a failed implicit
reconnect. If there is one, we try the reconnection.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Websocket support can be enabled with a new websocket
configuration block:
```
websocket {
# Specify a host and port to listen for websocket connections
# listen: "host:port"
# It can also be configured with individual parameters,
# namely host and port.
# host: "hostname"
# port: 4443
# This will optionally specify what host:port for websocket
# connections to be advertised in the cluster
# advertise: "host:port"
# TLS configuration is required
tls {
cert_file: "/path/to/cert.pem"
key_file: "/path/to/key.pem"
}
# If same_origin is true, then the Origin header of the
# client request must match the request's Host.
# same_origin: true
# This list specifies the only accepted values for
# the client's request Origin header. The scheme,
# host and port must match. By convention, the
# absence of port for an http:// scheme will be 80,
# and for https:// will be 443.
# allowed_origins [
# "http://www.example.com"
# "https://www.other-example.com"
# ]
# This enables support for compressed websocket frames
# in the server. For compression to be used, both server
# and client have to support it.
# compression: true
# This is the total time allowed for the server to
# read the client request and write the response back
# to the client. This include the time needed for the
# TLS handshake.
# handshake_timeout: "2s"
}
```
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This contains a rewrite to the services layer for exporting and importing. The code this merges to already had a first significant rewrite that moved from special interest processing to plain subscriptions.
This code changes the prior version's dealing with reverse mapping which was based mostly on thresholds and manual pruning, with some sporadic timer usage. This version uses the jetstream branch's code that understands interest and failed deliveries. So this code is much more tuned to reacting to interest changes. It also removes thresholds and goes only by interest changes or expirations based around a new service export property, response thresholds. This allows a service provider to provide semantics on how long a response should take at a maximum.
This commit also introduces formal support for service export streamed and chunked response types send an empty message to signify EOF.
This commit also includes additions to the service latency tracking such that errors are now sent, not only successful interactions. We have added a Status field and an optional Error fields to ServiceLatency.
We support the following Status codes, these are directly from HTTP.
400 Bad Request (request did not have a reply subject)
408 Request Timeout (when system detects request interest went away, old request style to make dependable)..
503 Service Unavailable (no service responders running)
504 Service Timeout (The new response threshold expired)
Signed-off-by: Derek Collison <derek@nats.io>
If a node in the cluster goes away, an async INFO is sent to
inbound gateway connections so they get a chance to update their
list of remote gateway URLs. Same happens when a node is added
to the cluster.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When an account is switched to interest-only mode due to no interest,
it was not possible to switch that account more than once. But the
function switchAccountToInterestMode() that triggers a switch could
possibly doing it more than once. This should not cause problems
but increased the number of traces in a big super cluster.
Also fixed some flappers and a data race.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- All writes will now be done by the writeLoop, unless when the
writeLoop has not been started yet (likely in connection init).
- Slow consumers for non CLIENT connections will be reported but
not failed. The idea is that routes, gateway, etc.. connections
should stay connected as much as possible. However if a flush
operation times out and no data at all has been written, the
connection will be closed (regardless of type).
- Slow consumers due to max pending is only for CLIENT connections.
This allows sending of SUBs through routes, etc.. to not have
to be chunked.
- The backpressure to CLIENT connections is increased (up to 1sec)
based on the sub's connection pending bytes level.
- Connection is flushed on close from the writeLoop as to not block
the "fast path".
Some tests have been fixed and adapted since now closeConnection()
is not flushing/closing/removing connection in place.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This could happen if the remote server is running but not dequeueing
from the socket. TLS connection Close() may send/read and so we
need to protect with a deadline.
For non client/leaf connection, do not call flushOutbound().
Set the write deadline regardless of handshakeComplete flag, and
set it to a low value.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Prevent sending an A- for a given account if the server has this
account registered and an internal service reply subscription.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Running test suite on a Windows VM, I notice several failures.
Updated the compute of the RTT to be at least 1ns. I think that
this is just an issue with the VM I am running, but that change
will have no impact for normal situations (since setting the rtt
to the very minimum duration (1ns) instead of 0) and will prevent
some tests from failing.
Because of those same timer granularity issues, I had to add some
delays between some actions in order for time.Sub()/Since() to
actually report something more than 0.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We had too much special processing, so reduced to a single wildcard
which will propagate across routes and gateways and is consistent
with gateway handling of globally routed subjects and timeouts.
Signed-off-by: Derek Collison <derek@nats.io>
Use centralized sync map to gather *client that have GW replies.
Tested with concurrent receiving clients and perf is as good as
with timer per client but reduces need of that timer per client
object.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Add atomic in client to skip check in processInboundClientMsg()
if value is 0. Avoids getting the lock in fast path if not needed.
- Have a timer per client instead of the global server list that
was expiring: noticed a lot of contention there when running
some perf/profiling tests. The timer is also not reset for
every timestamp that is not yet expired since this too affects
performance. Instead fires are regular interval and cleared
when map is empty after a cycle.
- Move processing of gw map rely on its own function (in inbound msg).
I have verified that this is inlined same way as when code was
directly in processInboundClientMsg.
- Use string(subj[]) for prefix detection: I have verified that
it is actually faster.
- Builds the RMSG with appends to local buffer in handleGatewayReply()
instead of using fmt.Sprintf().
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- New prefix that includes origin server for the request
- Mapping done if request is service import or requestor has
recent subscription
- Subscription considered recent if less than 250ms
- Destination server strip GW prefix before giving to client
and restore when getting a reply on that subject
- Mapping removed aftert 250ms
- Server rejects client publish on "$GNR." (the new prefix)
- Cluster and server hash are now 8 chars long and from base 62
alphabets
- Mapped replies need to be sent to leafnode servers due to race
(cluster B sends RS+ on GW inbound then RMSG on outbound, the
RS+ may be processed later and cluster A may have given message
to LN before RS+ on reply subject. So LN needs to accept the
mapped reply but will strip to give to client and reassemble
before sending it back)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If cluster A configures a gateway to cluster B, the server on A
tries to connect to that server URL. If there is no server on B
at that address, but a server on B with different address connects
to server on cluster A, that server should be able to create its
outbound connection in response.
That was not the case because the configured URLs were snapshot
before the loop of trying to connect. When accepting an inbound
connection and updating the array, this new URL was not being used.
The issue is only if the server on A had no outbound connection
at that time.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
I noticed that TestNoRaceRoutedQueueAutoUnsubscribe started to
fail a lot on Travis. Running locally I could see a 45 to 50%
failures. After investigation I realized that the issue was that
we have wrongly re-used `subscription.nm` and set to -1 on unsubscribe
however, I believe that it was possible that when subscription was
closed, the server may have already picked that consumer for a delivery
which then causes nm==-1 to be bumped to 0, which was wrong.
Commenting out the subscription.close() that sets nm to -1, I could
not get the test to fail on macOS but would still get 7% failure on
Linux VM. Adding the check to see if sub is closed in deliverMsg()
completely erase the failures, even on Linux VM.
We could still use `nm` set to -1 but check on deliverMsg(), the
same way I use the closed int32 now.
Fixed some flappers.
Updated .travis.yml to failfast if one of the command in the
`script` fails. User `set -e` and `set +e` as recommended in
https://github.com/travis-ci/travis-ci/issues/1066
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced in PR#930. The first commit had the route's
check if the flushOutbound() returned false, and if so would
locally unlock/lock the connection's lock. Unfortunately, this
was replaced in the second commit (a6aeed3a6b)
to the flushOutbound() function itself.
This causes the function closeConnection() to possibly unlock
the connection while calling flushOutbound(), which if the
connection is closed due to both a tls timeout for instance
and explicitly, it would result in the connection being scheduled
for a reconnect (if explicit gateway connection, possibly route).
Added defensive code in Gateway to register a unique outbound gateway.
Fixed a test that was now failing with newer Go version in which
they fixed url.Parse()
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When a leafnode connection is created, the server forces all
gateway inbound connections to switch to InterestMode. Do this only
once, regardless of how many times the LN (re)connects.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>