Commit Graph

54 Commits

Author SHA1 Message Date
Derek Collison
d246359dc8 Merge pull request #1028 from nats-io/leaf_gw_si
Bug fix for service import with leafnodes and gws
2019-05-31 11:29:33 -07:00
Derek Collison
3cf6f6a5d2 Bug fix for service import with leafnodes and gws
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-31 11:22:02 -07:00
Ivan Kozlovic
37f4e71246 Fixed race due to use of byte slice instead of string
The go routine that is started during interest mode switch was
using the accName (which was a byte slice) instead of account,
which was a string copy of that byte slice. It meant that when
printing the notice, the underlying buffer may have been overwritten
by the readloop.

Changing accName to a string - since we were doing a copy anyway,
better change it at the function param level.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-30 18:43:01 -06:00
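The kind of race fixed in 37f4e71246 can be sketched as follows. This is a minimal illustration, not the server's code: `snapshotName` and the buffer contents are hypothetical names chosen for the example. It shows that a string conversion takes a copy, so a go routine holding the string is unaffected when the original byte slice is reused.

```go
package main

import "fmt"

// snapshotName (hypothetical helper) copies the bytes into an immutable
// string, so later writes to buf cannot change what a go routine observes.
func snapshotName(buf []byte) string {
	return string(buf)
}

func main() {
	buf := []byte("ACC_A") // stands in for a buffer owned by a read loop
	account := snapshotName(buf)

	done := make(chan string)
	go func() {
		// Uses the string copy, never the shared byte slice.
		done <- fmt.Sprintf("switching account %q", account)
	}()

	copy(buf, "XXXXX") // the read loop reuses its buffer
	fmt.Println(<-done)
}
```

Passing the string at the function parameter level, as the commit describes, means no caller can accidentally hand the shared slice to the go routine.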
Ivan Kozlovic
37b3546e7b Switch gateway to InterestMode only once
When a leafnode connection is created, the server forces all
gateway inbound connections to switch to InterestMode. Do this only
once, regardless of how many times the LN (re)connects.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-30 17:21:15 -06:00
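One way to get the "only once" behavior described in 37b3546e7b is `sync.Once`. The sketch below is an assumption about how such a guard could look; the commit does not say which mechanism the server uses, and the `gateway` type and its fields are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// gateway is a hypothetical stand-in for the server's gateway state.
type gateway struct {
	switchOnce sync.Once
	switches   int // counts how many times the switch actually ran
}

// onLeafNodeConnect would be called for every leaf node (re)connect.
// The switch to interest-only mode runs on the first call only.
func (g *gateway) onLeafNodeConnect() {
	g.switchOnce.Do(func() {
		g.switches++
	})
}

func main() {
	g := &gateway{}
	for i := 0; i < 3; i++ {
		g.onLeafNodeConnect()
	}
	fmt.Println("switches performed:", g.switches)
}
```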
Ivan Kozlovic
66f5325cee Merge pull request #1018 from nats-io/gw_log_interest_switch
Added logging of account interest mode switch for gateways
2019-05-28 15:33:06 -06:00
Ivan Kozlovic
f5991e8a2b Merge pull request #1015 from nats-io/restore_conn_error_default_attempts_to_one
Update to connect/reconnect error reports logic
2019-05-28 14:57:29 -06:00
Ivan Kozlovic
2d4c3dd38f Added logging of account interest mode switch for gateways
Both sides will log when an account is switched to interest-only
mode. There are 2 traces (start/complete) per account.
They are logged at [INF] level.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-28 14:55:45 -06:00
Ivan Kozlovic
5478eaf01e Added /gatewayz endpoint
This endpoint lists the gateway/cluster name, address and port,
then the list of outbound/inbound connections.
For each remote gateway there will be at most one outbound connection.
There can be 0 or more inbound connections for the same remote
gateway.

For each of these outbound/inbound connections, the connection info
similar to Connz is reported. Optionally, one can include the
interest mode/stats for each account.

Here are possible options:

* No specific options

http://host:port/gatewayz

* Limit to specific remote gateway, say name "B":

http://host:port/gatewayz?gw_name=B

* Include accounts (default limit to 1024 accounts)

http://host:port/gatewayz?accs=1

* Specific limit, say 200 (note accs=1 in this case is optional)

http://host:port/gatewayz?accs=1&accs_limit=200

* Specific account, say "acc_1". Note that accs=1 is not required then

http://host:port/gatewayz?acc_name=acc_1

* Above options can be mixed: specific remote gateway (B), with up to
  200 accounts reported

http://host:port/gatewayz?gw_name=B&accs_limit=200

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-28 12:41:09 -06:00
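The /gatewayz options listed in 5478eaf01e can be combined programmatically. The sketch below assumes the options are passed as ordinary URL query parameters; `gatewayzURL` is a hypothetical helper and `localhost:8222` is only a placeholder for the monitoring host and port.

```go
package main

import (
	"fmt"
	"net/url"
)

// gatewayzURL (hypothetical helper) builds a /gatewayz monitoring URL.
// Parameters with an empty value are skipped.
func gatewayzURL(hostPort string, params map[string]string) string {
	q := url.Values{}
	for k, v := range params {
		if v != "" {
			q.Set(k, v)
		}
	}
	u := url.URL{
		Scheme:   "http",
		Host:     hostPort,
		Path:     "/gatewayz",
		RawQuery: q.Encode(), // Encode sorts the keys alphabetically
	}
	return u.String()
}

func main() {
	// No specific options.
	fmt.Println(gatewayzURL("localhost:8222", nil))
	// Specific remote gateway with an account limit.
	fmt.Println(gatewayzURL("localhost:8222", map[string]string{
		"gw_name":    "B",
		"accs_limit": "200",
	}))
}
```

Because `url.Values.Encode` sorts keys, the parameter order in the result may differ from the order in which they were supplied; query parameter order does not normally matter to an HTTP handler.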
Ivan Kozlovic
d2578f9e05 Update to connect/reconnect error reports logic
Changed the recently introduced option and added a new one. The idea
is to be able to differentiate between never-connected and reconnect
events. The never-connected situation will be logged at the first attempt
and every hour (by default, configurable).
However, once connected and if trying to reconnect, the server will report
every attempt by default, but this is configurable too.

These two options are supported for config reload.

Related to #1000
Related to #1001
Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-26 17:51:01 -06:00
Ivan Kozlovic
b325cf1e4a Fixed loss of queue subscription interest across Gateways in some cases
Suppose two servers, SA in cluster A and SB in cluster B. If SA
sends a message to SB on an account for which there is no interest
at all (account not known or no subscription), SB will send an A-
and keep track that it sent an A- for this account.

When a queue subscription is created on SB, SB will send an RS+
to A because A needs to have perfect knowledge of all queue subs
in all clusters.

If then a regular subscription is also created on SB, SB will
think that it needs to send an A+ because it had sent an A- for
this account. However, SA had an entry for this account for the
queue sub. The A+ would clear the entry in the map and would cause
SA to not send messages to SB even if they would have been a
match for the queue sub on SB.

We fix this in two ways:
- Clear the possible A- in SB when sending an RS+ for queue sub
- Processing of A-/A+ to be aware of a possible entry in the map
  due to queue subs.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-25 16:27:00 -06:00
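The second part of the fix in b325cf1e4a (making A+ processing aware of queue subs) can be pictured with a toy model. Everything below is invented for illustration: the `remoteState` and `accountEntry` types and their methods are not the server's data structures, only a sketch of the idea that an A+ must not discard an entry that still carries queue interest.

```go
package main

import "fmt"

// accountEntry is a toy view of what SA remembers about one account on SB.
type accountEntry struct {
	noInterest bool            // SB sent an A- for this account
	queueSubs  map[string]bool // queue interest learned through RS+
}

type remoteState struct {
	accounts map[string]*accountEntry
}

func (r *remoteState) entry(acc string) *accountEntry {
	e := r.accounts[acc]
	if e == nil {
		e = &accountEntry{queueSubs: map[string]bool{}}
		r.accounts[acc] = e
	}
	return e
}

func (r *remoteState) processAMinus(acc string) { r.entry(acc).noInterest = true }

func (r *remoteState) processQueueRSPlus(acc, subject string) {
	r.entry(acc).queueSubs[subject] = true
}

// processAPlus clears the no-interest flag but keeps the entry when it
// still holds queue subs. Deleting it unconditionally is the bug pattern.
func (r *remoteState) processAPlus(acc string) {
	e := r.accounts[acc]
	if e == nil {
		return
	}
	if len(e.queueSubs) > 0 {
		e.noInterest = false
		return
	}
	delete(r.accounts, acc)
}

func (r *remoteState) hasQueueInterest(acc, subject string) bool {
	e := r.accounts[acc]
	return e != nil && e.queueSubs[subject]
}

func main() {
	r := &remoteState{accounts: map[string]*accountEntry{}}
	r.processAMinus("acc")             // no interest at all on SB
	r.processQueueRSPlus("acc", "foo") // queue sub created on SB
	r.processAPlus("acc")              // regular sub created on SB
	fmt.Println("queue interest kept:", r.hasQueueInterest("acc", "foo"))
}
```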
Ivan Kozlovic
55597a7e8b [ADDED] URLs to cluster{} in /varz and update of gateway ones
In varz's cluster{} section, there was no URLs field. This PR adds
it and displays the routes defined in the cluster{} config section.
The value gets updated should there be a config reload following
addition/removal of a URL from "routes".

If config had 1 route to "nats://127.0.0.1:1234", here is what
it would look like now:
```
"cluster": {
    "addr": "0.0.0.0",
    "cluster_port": 6222,
    "auth_timeout": 1,
    "urls": [
      "127.0.0.1:1234"
    ]
  },
```
Adding route to "127.0.0.1:4567" and doing config reload:
```
"cluster": {
    "addr": "0.0.0.0",
    "cluster_port": 6222,
    "auth_timeout": 1,
    "urls": [
      "127.0.0.1:1234",
      "127.0.0.1:4567"
    ]
  },
```
Note that due to how we handle discovered servers in the cluster,
new urls dynamically discovered will not show in above output.
This could be done, but would need some changes in how we store
things (actually in this case, new urls are not stored, just
attempted to be connected. Once they connect, they would be visible
in /routez).

For gateways, however, this PR displays the combination of the
URLs defined in config and the ones that are discovered after
a connection is made to a given cluster. So say cluster A has a single
url to one server in cluster B, when connecting to that server,
the server on A will get the list of the gateway URLs that one
can connect to, and these will be reflected in /varz. So this is
a different behavior than for routes. As explained above, we could
harmonize the behavior in a future PR.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 13:42:41 -06:00
Ivan Kozlovic
48c3f7f846 Fixed some flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 09:53:35 -06:00
Ivan Kozlovic
97ee89cc67 Check inbound GW connection connected state in parser
If the first protocol for an inbound gateway connection is not
CONNECT, reject with auth violation.

Fixes #1006

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-22 12:31:16 -06:00
Ivan Kozlovic
1cdc3eb41f Better randomize solicited Gateway URLs
Shuffle the array created when iterating through the gateways URLs
map since map iteration may not be well randomized with small maps.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-21 09:28:59 -06:00
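The approach in 1cdc3eb41f, collecting the map keys into a slice and then shuffling it explicitly, might look like the sketch below. `shuffledURLs` and the sample addresses are hypothetical; the commit does not show the server's actual function.

```go
package main

import (
	"fmt"
	"math/rand"
)

// shuffledURLs (hypothetical helper) copies the map keys into a slice and
// shuffles it, instead of relying on map iteration order for randomness.
func shuffledURLs(urls map[string]struct{}) []string {
	out := make([]string, 0, len(urls))
	for u := range urls {
		out = append(out, u)
	}
	rand.Shuffle(len(out), func(i, j int) {
		out[i], out[j] = out[j], out[i]
	})
	return out
}

func main() {
	urls := map[string]struct{}{
		"nats://10.0.0.1:7222": {},
		"nats://10.0.0.2:7222": {},
		"nats://10.0.0.3:7222": {},
	}
	fmt.Println(shuffledURLs(urls))
}
```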
Ivan Kozlovic
7272e4e317 Make the error report attempts configurable
This is a continuation of #1000. Added a configuration to specify
the number of attempts at which the repeated error is reported.
The algo is now to print only the 1st attempt and when current
attempt % <this config param> == 0.

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 16:28:48 -06:00
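The reporting rule from 7272e4e317 (print the 1st attempt, then whenever the attempt number is a multiple of the configured value) can be written as a small predicate. `shouldReport` and the parameter name `every` are illustrative names, not the server's option names.

```go
package main

import "fmt"

// shouldReport (hypothetical helper) returns true for the first attempt and
// for every attempt that is a multiple of `every`. A non-positive `every`
// is treated here as "only report the first attempt", which is an
// assumption of this sketch.
func shouldReport(attempt, every int) bool {
	if attempt == 1 {
		return true
	}
	return every > 0 && attempt%every == 0
}

func main() {
	for attempt := 1; attempt <= 12; attempt++ {
		if shouldReport(attempt, 5) {
			fmt.Println("reporting failed attempt", attempt)
		}
	}
}
```

With `every` set to 5, the loop above reports attempts 1, 5 and 10.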
Ivan Kozlovic
03930ba0e4 [UPDATED] Reduce report of failed connection attempts
This applies to routes, gateways and leaf node connections.
The failed attempts will be printed at the first attempt, after the
first minute and then every hour.
The connect/error statements now include the attempt number.

Note that in debug mode, all attempts are traced, so you may get
double trace (one for debug, one for info/error).

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 10:13:56 -06:00
Derek Collison
1c8d4b4b6e Make sure we are set to RMSG for send to Gateways
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-01 15:31:54 -07:00
Ivan Kozlovic
dce9d672c1 Fixed panic with leafnode and gateway when no interest registered
Say there are 2 clusters, A and B. A client connects to A and
publishes messages on an account that B has no interest in.
Then a leaf node server connects to B (using the same account as
the one B has no interest in). Cluster B will ask cluster
A to switch to interest-only mode for the leaf node account. This
would cause a panic.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-01 13:40:17 -06:00
Ivan Kozlovic
9f497a6cd4 Revert to use Sublist but use the SublistNoCache version.
Remove sub from rsubs sublist when user UNSUBs.

Fix bench test that was not actually creating a SUB per request
in the Benchmark_Gateways_Requests_CreateOneSubForEach test.
Also UNSUBs older SUBs after a certain threshold to simulate
actual req/reply.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-23 14:13:13 -06:00
Ivan Kozlovic
41436fb787 Updates based on comments
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-22 20:00:21 -06:00
Ivan Kozlovic
bb4e8ae0f9 Gateways: Fix race for request reply
This addresses the following race:
- client connection creates a subscription on a reply subject
- client connection sends a request
- server sends the subscription to inbound gateway
- server sends the message to outbound gateway (those may be
  to different servers)
- receiving server sends to sub interested in request subject
- app sends reply
- its server then checks for interest on the reply's subject

In interestOnly mode, there is a possibility that this server
has not received the interest on the reply subject yet and would
then drop the reply.

This PR detects above scenario and will prefix the reply subject
to identify the origin cluster if it is detected that the last
subscription from the sending connection was created less than
a second ago.
Once the destination has this prefix, the destination cluster
will always send back that message to origin cluster even if
there is no registered interest.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-22 20:00:21 -06:00
Ivan Kozlovic
bf07862140 Fixed invocations of startGoRoutine
Resolves #960

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-18 09:51:56 -06:00
Ivan Kozlovic
d8098c134b Reduce startup memory for gateways
Similar to #956 but for gateways code.
Also fixing route test TestLargeClusterMem.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-17 15:18:46 -06:00
Ivan Kozlovic
172eca6110 Merge pull request #922 from danielsdeleo/dan/allow-explicit-server-name-in-gateway
Allow explicit server name in tls.Config
2019-04-12 11:20:55 -06:00
Ivan Kozlovic
540b9be8e5 Reworked gateway processing of RS+ and RS-
Invoke updateInterestForAccountOnGateway() as a defer after all
locks have been released.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-25 14:04:34 -06:00
Derek Collison
bacb73a403 First pass at leaf nodes. Basic functionality working, including gateways.
What is not completed:
1. TLS
2. config to bind local account.
3. Info updates for solicitor to track topology changes like a client.
4. CONNECT sent after INFO for nonce authorization.
5. Authorization
6. Services and Streams tests.
7. config file parsing.

Signed-off-by: Derek Collison <derek@nats.io>
2019-03-25 08:54:47 -07:00
Ivan Kozlovic
65cc218cba [FIXED] Allow use of custom auth with config reload
Resolves #923

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-20 15:45:17 -06:00
danielsdeleo
8ba3abef6a Allow explicit server name in tls.Config
Signed-off-by: Daniel DeLeo <dan@chef.io>
2019-03-11 15:28:32 -07:00
Alexei Volkov
83aefdc714 [ADDED] Cluster tls insecure configuration
Based on @softkbot PR #913.
Removed the command line parameter, which then removes the need for Options.Cluster.TLSInsecure.
Added a test with config reload.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-11 14:48:22 -06:00
Ivan Kozlovic
3e24d70ea4 Revert moving e.Lock()/e.Unlock()
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-02-28 14:01:53 -07:00
Ivan Kozlovic
ba748302c4 Gateways: some optimizations
Check sublist only when required.
Send the subs list in place instead of in a go routine (gateways have
different outbound/inbound connections so they don't suffer from the
same issue as routes).
Bump the default array size when collecting gateway connections

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-02-28 11:16:05 -07:00
Ivan Kozlovic
18399a3808 Gateways: Rework Account Sub/Unsub
We now send A- if an account does not exist, or if there is no
interest on a given subject and no existing subscription.
An A+ is sent if an A- was previously sent and a subscription
for this account is registered.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-02-26 18:34:30 -07:00
Derek Collison
af78552549 Move ints to proper sizes for all
Signed-off-by: Derek Collison <derek@nats.io>
2019-02-05 15:19:59 -08:00
Ivan Kozlovic
7ad4498a09 Gateways: Remove unused permissions options
Permissions were configured but not implemented. Removing for now.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-01-10 09:49:36 -07:00
Ivan Kozlovic
b075c00103 [FIXED] Memory usage for failed TLS connections
Moving some of the connection initialization post TLS handshake
to avoid temporary memory growth when getting repeated failed
connections to any of the client, route and gateway ports.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-01-09 15:50:23 -07:00
Ivan Kozlovic
4719c618b3 Add some comments
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-11 07:12:33 -08:00
Ivan Kozlovic
4b70cdfc89 Fix Gateways with Service Imports
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-11 00:27:40 -08:00
Ivan Kozlovic
efd891d2ae Fix performance degradation introduced by GW code
This impacted even non gateway traffic

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-08 17:44:32 -07:00
Ivan Kozlovic
dd1b598121 Add more tracing for Gateway connect/ip
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-08 11:13:54 -07:00
Ivan Kozlovic
6eaa1dc351 Resolve IP if gateway listen is 0.0.0.0 or ::
Otherwise, this may be sent to servers in the cluster and to other
gateways, which may result in an attempt to connect to self, which
in the case of TLS would produce an error.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-07 17:28:21 -07:00
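Detecting the "0.0.0.0 or ::" case from 6eaa1dc351 can be done with the standard `net` package. In this sketch `isUnspecified` and `advertiseHost` are hypothetical helpers, and the resolved address is supplied by the caller rather than looked up from the machine's interfaces.

```go
package main

import (
	"fmt"
	"net"
)

// isUnspecified reports whether host is the IPv4 or IPv6 wildcard address.
func isUnspecified(host string) bool {
	ip := net.ParseIP(host)
	return ip != nil && ip.IsUnspecified()
}

// advertiseHost (hypothetical helper) returns the address to send to other
// servers: the listen host itself, or the resolved one when the listen
// host is a wildcard that peers could not meaningfully connect to.
func advertiseHost(listenHost, resolved string) string {
	if isUnspecified(listenHost) {
		return resolved
	}
	return listenHost
}

func main() {
	fmt.Println(advertiseHost("0.0.0.0", "192.168.1.10"))
	fmt.Println(advertiseHost("::", "192.168.1.10"))
	fmt.Println(advertiseHost("10.0.0.5", "192.168.1.10"))
}
```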
Ivan Kozlovic
a9b045498a Update based on comments
Do the swapping to outbound connection only on send.
It means that those subs are stored in the inbound connection and
those are the only type of subs stored there. So on connection close
it is easy to clean them up.
Also instead of having processMsgResults have to return this sub,
simply check the size of r.psubs and if 1, the type of client
associated with it. If gateway, we know we have to do the direct
send.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-06 09:32:39 -07:00
Ivan Kozlovic
111e050d32 Allow service import to work with Gateways
This is not a complete solution and is a bit hacky but is a start
to be able to have service import work at least in some basic
cases.

Also fixed a bug where replySub would not be removed from
connection's list of subs after delivery.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-05 20:35:43 -07:00
Ivan Kozlovic
0ba587249a Fixing setting of default gateway TLS Timeout
Moved setting to the default value in setBaselineOptions()
so that config reload does not fail.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-03 18:20:15 -07:00
Ivan Kozlovic
e7b6c5731e Update based on comments
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-03 17:17:55 -07:00
Ivan Kozlovic
a23ef5b740 Switch to send-all-subs when number of RS- gets too big
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-03 13:15:11 -07:00
Derek Collison
e2ce2c0cff Change to RawURLEncoding
Signed-off-by: Derek Collison <derek@nats.io>
2018-11-29 17:04:58 -08:00
Derek Collison
2a19de7963 Merge pull request #819 from nats-io/sys
Allow servers to send and receive messages directly
2018-11-29 15:40:21 -08:00
Ivan Kozlovic
f011db47c7 Fixed race issue with lookup/update of the sent no-interest map
We can't use a simple sync.Map here because the noInterest map
for inbound gateway connections is used concurrently. Indeed,
whenever an account would have been registered or a new sub created
this could trigger an update of that map in order to clear the
fact that we had sent an A-/RS- and now are sending an A+/RS+.
So changed to simple map but protected by gw connection's lock.

Without this change, server would panic if there are messages
published to cluster A that are sent to server B while a sub
is then created on matching subject on B.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-29 14:22:56 -07:00
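The pattern that f011db47c7 settles on, a plain map guarded by the connection's lock, can be sketched as follows. The `inboundGW` type and its methods are invented for the example; only the idea of doing lookup and update under one mutex comes from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// inboundGW is a hypothetical stand-in for an inbound gateway connection.
type inboundGW struct {
	mu         sync.Mutex
	noInterest map[string]struct{} // accounts for which an A- was sent
}

// markNoInterest records that an A- was sent for the account.
func (g *inboundGW) markNoInterest(acc string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.noInterest[acc] = struct{}{}
}

// clearNoInterest removes the record and reports whether it existed, which
// tells the caller whether an A+ needs to be sent. Lookup and delete happen
// under the same lock, so they form one atomic step.
func (g *inboundGW) clearNoInterest(acc string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	_, ok := g.noInterest[acc]
	delete(g.noInterest, acc)
	return ok
}

func main() {
	g := &inboundGW{noInterest: map[string]struct{}{}}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			acc := fmt.Sprintf("acc_%d", i)
			g.markNoInterest(acc)
			g.clearNoInterest(acc)
		}(i)
	}
	wg.Wait()
	fmt.Println("entries left:", len(g.noInterest))
}
```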
Derek Collison
574fd62e01 Allow servers to send and receive messages directly
Signed-off-by: Derek Collison <derek@nats.io>
2018-11-29 12:15:08 -08:00
Ivan Kozlovic
086b26f14a Gateways: Ignore reference to self
Allows the use of a global include for all gateways and each
gateway will ignore its own reference.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-28 14:24:28 -07:00
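The behavior in 086b26f14a, sharing one gateway list across all clusters and having each one skip its own entry, amounts to a filter by name. The `remoteGateway` type and `withoutSelf` function below are hypothetical and only illustrate the idea.

```go
package main

import "fmt"

// remoteGateway is a hypothetical, simplified gateway config entry.
type remoteGateway struct {
	Name string
	URLs []string
}

// withoutSelf returns the configured gateways minus the one whose name
// matches this server's own gateway name.
func withoutSelf(self string, all []remoteGateway) []remoteGateway {
	out := make([]remoteGateway, 0, len(all))
	for _, gw := range all {
		if gw.Name == self {
			continue // ignore the reference to self
		}
		out = append(out, gw)
	}
	return out
}

func main() {
	// The same list could be included in the config of A, B and C.
	shared := []remoteGateway{
		{Name: "A", URLs: []string{"nats://a.example:7222"}},
		{Name: "B", URLs: []string{"nats://b.example:7222"}},
		{Name: "C", URLs: []string{"nats://c.example:7222"}},
	}
	for _, gw := range withoutSelf("B", shared) {
		fmt.Println("will solicit gateway", gw.Name)
	}
}
```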