Made changes to processSub() to accept subscription properties,
including the icb callback so that it is set prior to add the
subscription to the account's sublist, which prevent races.
Fixed some other racy conditions, notably in addServiceImportSub()
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This change allows the removal of the connection and update of
the server state to be done "in place" but still delay the flushing
of and close of tcp connection to the writeLoop. With ref counting
we ensure that the reconnect happens after the flushing but not
before the state has been updated.
Had to fix some places where we may have called closeConnection()
from under the server lock since it now would deadlock for sure.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We cannot call c.closeConnection() under the server lock because
closeConnection() can invoke server lock in some cases.
Created a test that should run without `-race` to reproduce the deadlock
(which it does) but sometimes would fail because cluster would not be
formed. This unconvered an issue with conflict resolution which
test TestRouteClusterNameConflictBetweenStaticAndDynamic() can reproduce
easily. The issue was that we were not updating a dynamic name with
the remote if the remote was non dynamic.
Resolves#1543
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The new option Websocket.NoTLS would have to be set to true
to disable the server check that enforces TLS configuration.
Resolves#1529
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Resolves#1532
Instead of the fetched account we create a dummy account that is
expired. Any client connecting will trigger a fetch of the actual
account jwt.
This also avoids one fetch, thus the unit test was changed to reflect
this.
Unlike other resolver the memory resolver does not depend on external
systems. It is purely based on server configuration. Therefore, fetch
can be done and not finding an account means there is a configuration issue.
The check that an account has to be signed by a configured operator is
done after fetch as well. As a consequence an account claim will never
become an Account in memory.
The original check during client or leaf authentication is left in
place.
Adding unit tests.
Modifying existing tests to not rely on an account but it's name instead.
Signed-off-by: Matthias Hanel <mh@synadia.com>
This is handy for client libraries that start the server as
external executable and pass command line arguments. Without
specifying the cluster name, routes can take time to establish
and cause some tests to fail.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Previously unlimited accounts - ones who inherit server values - would
be unable to publish any messags at all
Signed-off-by: R.I.Pienaar <rip@devco.net>
If some servers in the cluster have the same connect URLs (due
to the use of client advertise), then it would be possible to
have a server sends the connect_urls INFO update to clients with
missing URLs.
Resolves#1515
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
The call sendProtoNow() should not normally be used (only when
setting up a connection when the writeloop is not yet started and
server needs to send something before being able to start the
writeLoop.
Instead, code should use enqueueProto(). For this particular
case though, use queueOutbound() directly and add to the
producer's pcd map.
Also fixed other places where we were using queueOutbound() +
flushSignal() which is what enqueueProto is doing.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
That is, if the server receives "SUB foo 1" more than once from
the same client, we would register in the client map this subscription
only once, and add to the account's sublist only once, however we
would have updated shadow subscriptions and route/gateway maps for
each SUB protocol, which would result in inability to send unsubscribe
to routes when the client goes away or unsubscribes.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If an IPv6 address contains some "%" characters, this was causing
the connection name in log statement to mess up the Sprintf formatting.
The solution is to escape those "%" characters.
Resolves#1505
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was discovered with the test TestLeafNodeWithGatewaysServerRestart
that was sometimes failing. Investigation showed that when cluster B
was shutdown, one of the server on A that had a connection from B
that just broke tried to reconnect (as part of reconnect retries of
implicit gateways) to a server in B that was in the process of shuting down.
The connection had been accepted but createGateway not called because
the server's running boolean had been set to false as part of the shutdown.
However, the connection was not closed so the server on A had a valid
connection to a dead server from cluster B. When the B cluster (now single
server) was restarted and a LeafNode connection connected to it, then
the gateway from B to A was created, that server on A did not create outbound
connection to that B server because it already had one (the zombie one).
So this PR strengthens the starting of accept loops and also make sure
that if a connection (all type of connections) is not accepted because
the server is shuting down, that connection is properly closed.
Since all accept loops had almost same code, made a generic function
that accept functions to call specific create connection functions.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
In some tests, async clients are used with a running server (which
was not original intent). This resulted in the client writeLoop
to be started twice.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>