When a leafnode connection is created, the server forces all
gateway inbound connections to switch to InterestMode. Do this only
once, regardless of how many times the LN (re)connects.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Both sides will log when an account is switched to interest-only
mode. There are 2 traces (start/complete) per account.
They are logged at [INF] level.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Such endpoint will list the gateway/cluster name, address and port
then list of outbound/inbound connections.
For each remote gateway there will be at most one outbound connection.
There can be 0 or more inbound connections for the same remote
gateway.
For each of these outbound/inbound connection, the connection info
similar to Connz is reported. Optionally, one can include the
interest mode/stats for each account.
Here are possible options:
* No specific options
http://host:port/gatewayz
* Limit to specific remote gateway, say name "B":
http://host:port/gatewayz/gw_name=B
* Include accounts (default limit to 1024 accounts)
http://host:port/gatewayz/accs=1
* Specific limit, say 200 (note accs=1 in this case is optional)
http://host:port/gatewayz/accs=1&accs_limit=200
* Specific account, say "acc_1". Note that accs=1 is not required then
http://host:port/gatewayz/acc_name=acc_1
* Above options can be mixed: specific remote gateway (B), with 100
accounts reported
http://host:port/gatewayz/gw_name=B&accs_limit=200
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- TestSystemAccountConnectionUpdatesStopAfterNoLocal: I believe that
the check on number of notifications was wrong. Since we did not
consume the ones for the connect, the expected count after the
disconnect is 8 instead of 4.
- Possible fix GW tests complaining about number of outbound/inbound
I think that it may be possible that connection does not succeed
right away (remote to fully started, etc) and due to dial timeout
and reconnect attempt delay, I suspect that when given a max time
of 1sec to complete, it may not be enough.
Quick change for now is to override to 2secs for now in the
wait helpers. If that proves conclusive, we could remove the
timeout given to these helpers.
- TestGatewaySendAllSubsBadProtocol: used a t.Fatalf() in checkFor
instead of return fmt.Errorf().
- TestLeafNodeResetsMSGProto: this test is not about change to
interest mode only, so to avoid possible mix of protos, delay
a bit creation of gateway after creation of leaf node.
- Some defer s.Shutdown() were missing
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Suppose two servers, SA in cluster A and SB in cluster B. If SA
sends a message to SB on an account for which there is no interest
at all (account not known or no subscription), SB will send an A-
and keep track that it sent an A- for this account.
When a queue subscription is created on SB, SB will send and RS+
to A because A needs to have perfect knowledge of all queue subs
in all clusters.
If then a regular subscription is also created on SB, SB will
think that it needs to send an A+ because it had sent an A- for
this account. However, SA had an entry for this account for the
queue sub. The A+ would clear the entry in the map and would cause
SA to not send messages to SB even if they would have been a
match for the queue sub on SB.
We fix this in two ways:
- Clear the possible A- in SB when sending an RS+ for queue sub
- Processing of A-/A+ to be aware of a possible entry in the map
due to queue subs.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This addresses the following race:
- client connection creates a subscription on a reply subject
- client connection sends a request
- server sends the subscription to inbound gateway
- server sends the message to outbound gateway (those may be
to different servers)
- receiving server sends to sub interested in request subject
- app sends reply
- its server then check for interest on the reply's subject
In interestOnly mode, there is a possibility that this server
has not received the interest on the reply subject yet and would
then drop the reply.
This PR detects above scenario and will prefix the reply subject
to identify the origin cluster if it is detected that the last
subscription from the sending connection was created less than
a second ago.
Once the destination has this prefix, the destination cluster
will always send back that message to origin cluster even if
there is no registered interest.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Checks that if not provided server fails to connect to remote
gateway. Once set to expected hostname ("localhost"), connection
works.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We now send A- if an account does not exists, or if there is no
interest on a given subject and no existing subscription.
An A+ is sent if an A- was previously sent and a subscription
for this account is registered.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Otherwise, this may be sent to servers in the cluster and to other
gateways which may result in attempt to connect to self which
in case of TLS would produce error.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This is not complete solution and is a bit hacky but is a start
to be able to have service import work at least in some basic
cases.
Also fixed a bug where replySub would not be removed from
connection's list of subs after delivery.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Changed account lookup and validation failures to be more understandable by users.
Changed limits to be -1 for unlimited to match jwt pkg.
The limits changed exposed problems with options holding real objects causing issues with reload tests under race mode.
Longer term this code should be reworked such that options only hold config data, not real structs, etc.
Signed-off-by: Derek Collison <derek@nats.io>
We can't use a simple sync.Map here because the noInterest map
for inbound gateway connections are used concurrently. Indeed,
whenever an account would have been registered or a new sub created
this could trigger an update of that map in order to clear the
fact that we had sent an A-/RS- and now are sending an A+/RS+.
So changed to simple map but protected by gw connection's lock.
Without this change, server would panic if there are messages
published to cluster A that are sent to server B while a sub
is then created on matching subject on B.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Need to make sure message is received before unsub'ing because
otherwise it would be possible that the unsub happens before
message is delivered, which would have resulted in an RS- while
we were expecting the message to not cause one.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- If/when splitting buffer to pass to queueOutbound(), it has to
be include full protocol.
- Fix counting of total queue subs
- Fix tests
- Send RS- if no plain sub interest even if there is queue sub
interest.
- Removed a one-liner function
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Solve RS+ with wildcards
- Solve issue with messages not send to remote gateways queue subs
if there was a qsub on local server.
- Made rcache a perAccountCache since it is now used by routes and
gateways
- Order outbound gateways only on RTT updates
- Print a server's gateway name on startup
- Augment/add some tests
- Update TLS handling: when connecting, use hostname for ServerName
if url is not IP, otherwise use a hostname that we saved when
parsing/adding URLs for the remote gateway.
- Send big buffer in chunks if needed.
- Add caching for qsubs match
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>