Commit Graph

47 Commits

Author SHA1 Message Date
Ivan Kozlovic
c73be88ac0 Updated based on comments
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-01-06 16:57:48 -07:00
Ivan Kozlovic
947798231b [UPDATED] TCP Write and SlowConsumer handling
- All writes will now be done by the writeLoop, except when the
  writeLoop has not been started yet (likely during connection init).
- Slow consumers for non CLIENT connections will be reported but
  not failed. The idea is that routes, gateway, etc.. connections
  should stay connected as much as possible. However if a flush
  operation times out and no data at all has been written, the
  connection will be closed (regardless of type).
- Slow consumer detection due to max pending applies only to CLIENT
  connections. This allows SUBs to be sent through routes, etc.
  without having to be chunked.
- The backpressure to CLIENT connections is increased (up to 1sec)
  based on the sub's connection pending bytes level.
- Connection is flushed on close from the writeLoop so as not to
  block the "fast path".
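
As a rough illustration of the backpressure point above, here is a minimal Go sketch of a stall duration that grows with the subscriber connection's pending bytes and is capped at 1sec. The function name `stallFor`, the 50% threshold and the linear ramp are assumptions made for this example; the commit message only states the 1sec upper bound.

```go
package main

import (
	"fmt"
	"time"
)

// maxStall is the upper bound named in the commit message ("up to 1sec").
const maxStall = time.Second

// stallFor returns how long a publisher could be delayed, given the pending
// bytes of the subscriber's connection. The 50% threshold and the linear
// ramp are assumptions for this sketch, not the server's actual curve.
func stallFor(pendingBytes, maxPending int64) time.Duration {
	half := maxPending / 2
	if maxPending <= 0 || pendingBytes <= half {
		return 0
	}
	if pendingBytes >= maxPending {
		return maxStall
	}
	ratio := float64(pendingBytes-half) / float64(maxPending-half)
	return time.Duration(ratio * float64(maxStall))
}

func main() {
	for _, p := range []int64{0, 50, 75, 100, 200} {
		fmt.Printf("pending=%d stall=%v\n", p, stallFor(p, 100))
	}
}
```

The stall stays at zero while the connection is mostly drained, so the fast path is unaffected until pending data builds up.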

Some tests have been fixed and adapted since closeConnection()
no longer flushes, closes, or removes the connection in place.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-12-31 15:06:27 -07:00
Derek Collison
07253c0517 Merge pull request #1196 from nats-io/daisy
Allow interest propagation with daisy-chained leafnodes
2019-11-17 17:46:23 -08:00
Derek Collison
07da68ce56 Allow interest propagation with daisy chained leafnodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-11-17 17:35:20 -08:00
Ivan Kozlovic
e0bc81d0ed Make the Leafnode internal sub on _GR_.>
This is needed for mapped gateway replies. We had used an extra
token when implementing the new prefix. That token was later
removed, but the leafnode subscription on _GR_.*.*.*.> was not
updated. We now subscribe on _GR_.>
A test was passing only because it used inboxes, which caused
the pattern to match. It now uses a single-token reply, which
would have caught this bug.
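
To see why an inbox reply hid the bug, here is a small Go sketch of NATS-style subject matching ("*" matches exactly one token, a trailing ">" matches one or more). The `matches` helper and the sample subjects are hypothetical, and the layout of two tokens after `_GR_` is an assumption inferred from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether subject matches a NATS-style pattern where
// "*" matches exactly one token and a trailing ">" matches one or more.
func matches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// ">" must be the last pattern token and needs at least one token.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	return len(p) == len(s)
}

func main() {
	// Assumed layout: _GR_.<cluster>.<server>.<original reply subject>
	inbox := "_GR_.clusterA.server1._INBOX.abc" // two-token reply
	single := "_GR_.clusterA.server1.reply"     // single-token reply

	fmt.Println(matches("_GR_.*.*.*.>", inbox))  // old pattern, inbox reply
	fmt.Println(matches("_GR_.*.*.*.>", single)) // old pattern, single token
	fmt.Println(matches("_GR_.>", single))       // new pattern, single token
}
```

With a two-token inbox the old pattern still finds a token for ">", but a single-token reply leaves nothing for ">" to match.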

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-11-17 17:37:09 -07:00
Ivan Kozlovic
aa843945c9 Work on Gateways reply mapping
- New prefix that includes origin server for the request
- Mapping done if request is service import or requestor has
  recent subscription
- Subscription considered recent if less than 250ms
- Destination server strips GW prefix before giving to client
  and restores it when getting a reply on that subject
- Mapping removed after 250ms
- Server rejects client publish on "$GNR." (the new prefix)
- Cluster and server hash are now 8 chars long and from base 62
  alphabets
- Mapped replies need to be sent to leafnode servers due to race
  (cluster B sends RS+ on GW inbound then RMSG on outbound, the
  RS+ may be processed later and cluster A may have given message
  to LN before RS+ on reply subject. So LN needs to accept the
  mapped reply but will strip to give to client and reassemble
  before sending it back)
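
For the "8 chars long and from base 62 alphabets" point, one way such a hash could be derived is sketched below in Go. The name `shortHash`, the use of SHA-256 and the byte-to-character mapping are assumptions for illustration; the commit message only specifies the length and the alphabet.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// base62 is the 62-character alphabet (digits, upper case, lower case).
const base62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// shortHash returns an 8-character base 62 string derived from name.
// Using the first 8 bytes of a SHA-256 digest is an assumption of this sketch.
func shortHash(name string) string {
	sum := sha256.Sum256([]byte(name))
	out := make([]byte, 8)
	for i := range out {
		out[i] = base62[int(sum[i])%len(base62)]
	}
	return string(out)
}

func main() {
	fmt.Println(shortHash("clusterA"), shortHash("server1"))
}
```

A fixed-length hash keeps the prefix at a known size, which makes it cheap to strip before delivery to the client and to restore on the reply.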

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-11-06 16:06:49 -07:00
Ivan Kozlovic
cbbc21ac25 Some update to leafnode subscription handling
- Send all subs in place if smap is small
- Skip sending update until after sendAllLeafSubs() is done

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-30 20:01:49 -06:00
Ivan Kozlovic
51f83220c6 Fix race introduced in #1170
Code for leafnode loop detection had a data race.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-29 19:09:21 -06:00
Ivan Kozlovic
6bcb717722 Updates following code review
- Make "lds." a constant
- Create remote's get/reset functions for loop delay
- Bump loop delay to 30 seconds

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-29 17:59:15 -06:00
Ivan Kozlovic
279cab2aaf [FIXED] Detect loop between LeafNode servers
This is achieved by subscribing to a unique subject. If the LS+
protocol is coming back for the same subject on the same account,
then this indicates a loop.
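
The detection idea can be sketched in Go as follows. The `leafServer` type and its methods are hypothetical names for this example, and the "lds." prefix is borrowed from the follow-up commit in this list; the real server's bookkeeping is not shown in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// loopDetectPrefix mirrors the "lds." constant named in the follow-up commit.
const loopDetectPrefix = "lds."

// leafServer tracks, per account, the unique loop-detection subject this
// server subscribed to (hypothetical structure for this sketch).
type leafServer struct {
	id   string
	sent map[string]string // account -> loop-detection subject
}

func newLeafServer(id string) *leafServer {
	return &leafServer{id: id, sent: make(map[string]string)}
}

// loopSubject returns and records the unique subject for an account.
func (s *leafServer) loopSubject(account string) string {
	subj := loopDetectPrefix + s.id
	s.sent[account] = subj
	return subj
}

// isLoop reports whether an incoming LS+ for (account, subject) is this
// server's own loop-detection subscription coming back.
func (s *leafServer) isLoop(account, subject string) bool {
	if !strings.HasPrefix(subject, loopDetectPrefix) {
		return false
	}
	own, ok := s.sent[account]
	return ok && own == subject
}

func main() {
	s := newLeafServer("serverA")
	subj := s.loopSubject("ACC")
	fmt.Println(s.isLoop("ACC", subj))      // own subject, same account
	fmt.Println(s.isLoop("ACC", "foo.bar")) // ordinary subscription
}
```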

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-29 16:14:35 -06:00
Ivan Kozlovic
18a1702ba2 [ADDED] Basic auth for leafnodes
Added a way to specify which account an accepted leafnode connection
should be bound to when using simple auth (user/password).

Singleton:
```
leafnodes {
  port: ...
  authorization {
    user: leaf
    password: secret
    account: TheAccount
  }
}
```
With the above configuration, if a soliciting server creates a LN connection
with url: `nats://leaf:secret@host:port`, then the accepting server
will bind the leafnode connection to the account "TheAccount". This account
needs to exist, otherwise the connection will be rejected.

Multi:
```
leafnodes {
  port: ...
  authorization {
    users = [
      {user: leaf1, password: secret, account: account1}
      {user: leaf2, password: secret, account: account2}
    ]
  }
}
```
With the above, if a server connects using `leaf1:secret@host:port`, then
the accepting server will bind the connection to account `account1`.

If user/password (either singleton or multi) is defined, then the connecting
server MUST provide the proper credentials otherwise the connection will
be rejected.

If no user/password info is provided, it is still possible to provide the
account the connection should be associated with:
```
leafnodes {
  port: ...
  authorization {
    account: TheAccount
  }
}
```
With the above, a connection without credentials will be bound to the
account "TheAccount".

If credentials are used (jwt, nkey or other), then the server will attempt
to authenticate and, if successful, associate the connection with the account
for that specific user. If the user authentication fails (wrong password, no
such user, etc.), the connection will also be rejected.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-30 19:42:11 -06:00
Derek Collison
52430c304a System level services for debugging.
This is the first pass at introducing exported services to the system account for general debugging of black-box systems.
The first service reports the number of subscribers for a given subject. The payload of the request is the subject, and an optional queue group, and can contain wildcards.

Signed-off-by: Derek Collison <derek@nats.io>
2019-09-17 09:37:35 -07:00
Ivan Kozlovic
cd9f898eb0 Made a server's helper to set first ping timer
Defaults to 1sec but will be opts.PingInterval if value is lower.
All non-client connections invoke this function for the first
PING.
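
The rule described above is small enough to sketch directly in Go. The function name `firstPingInterval` is hypothetical; the parameter stands in for opts.PingInterval.

```go
package main

import (
	"fmt"
	"time"
)

// firstPingInterval returns the delay before the first PING: 1 second by
// default, or the configured ping interval when that is lower.
func firstPingInterval(pingInterval time.Duration) time.Duration {
	const defaultFirstPing = time.Second
	if pingInterval < defaultFirstPing {
		return pingInterval
	}
	return defaultFirstPing
}

func main() {
	fmt.Println(firstPingInterval(2 * time.Minute))
	fmt.Println(firstPingInterval(250 * time.Millisecond))
}
```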

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 10:21:43 -06:00
Ivan Kozlovic
90d592e163 Leaf and Route RTT
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which allows the RTT to be computed
reasonably soon (since the PingInterval could be user configured
and set much higher).

For Route in PR #1101, I was sending the PING on receiving the
INFO, which required changing a bunch of tests. Changed that to
also use the first timer interval of 1sec and reverted the changes
to route tests.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 09:34:17 -06:00
Ivan Kozlovic
7ca8723942 [FIXED] Some Leafnode issues
- On startup, verify that the local account in leafnode (if
  specified) can be found, otherwise fail startup.
- At runtime, print error and continue trying to reconnect.
  Will need to decide a better approach.
- When using basic auth (user/password), it was possible for a
  solicited Leafnode connection to not use user/password when
  trying a URL that was discovered through gossip. The server
  now saves the credentials of a configured URL to use with
  the discovered ones.

Updated RouteRTT test to handle the case where the RTT does not
appear to be updated because the same value is always returned.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-23 14:08:07 -06:00
Derek Collison
8f5bc503e5 Add ability for cross account import services to return streams as well as singletons.
Take into account tracking of response maps that are created and do proper cleanup.
Also fixes #1089 which was discovered while working on this.

Signed-off-by: Derek Collison <derek@nats.io>
2019-08-06 14:15:40 -07:00
Ivan Kozlovic
0873b46f67 [FIXED] LeafNode urls may be missing in INFO sent to LN connections
When servers in a cluster have routes to each other, there
is a chance that the list of leafnode URLs maintained on each
server is not complete. This would result in LN servers connecting
to this cluster not getting the full list of possible URLs they
could reconnect to.

Also fixed a DATA RACE that appeared when running the updated
TestLeafNodeInfoURLs test. Fixed the race and added specific
test that easily demonstrated the race: TestLeafNodeNoRaceGeneratingNonce

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-12 19:15:30 -06:00
Ivan Kozlovic
0a72993d80 Add warning for TLS insecure setting on LeafNodes
Also fix for #1071 in that we need to check remote gateways TLS
config even if main gateway section is not configured with TLS.

Related to #1071

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-12 17:22:57 -06:00
Derek Collison
a795920dc3 Report authorization error and use TLS hostname for IPs on leafnodes.
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-12 13:57:16 -07:00
Derek Collison
951ae49100 Prevent multiple solicited leafnodes from forming cycles.
When a solicited leafnode comes from multiple servers that themselves are a cluster, cycles were formed.
This change allows solicited leafnodes to behave similarly to gateways in that each server of a cluster
is expected to have a solicited leafnode per destination account and cluster.

We no longer forward subscription interest or messages to a cluster from a server that has a solicited leafnode.

Signed-off-by: Derek Collison <derek@nats.io>
2019-07-10 20:16:47 -07:00
Derek Collison
10d4f1ab7a Convert leafnode solicited remotes to array
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-10 11:53:34 -07:00
Derek Collison
49707317a1 Make sure we route responses across leafnodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-08 16:20:40 -07:00
Ivan Kozlovic
156511bba7 [FIXED] Check of maxpayload could be bypassed if size overruns int32
One could craft a PUB protocol to cause server to panic. This can
happen if the size in the PUB protocol overruns an int32.

(note that if authorization is enabled, the user would need to
authenticate first, limiting the impact).
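
The commit message does not show the fix itself. As a hedged illustration of the class of bug, the Go sketch below parses the size field while refusing any value above math.MaxInt32, so an oversized number cannot wrap around and slip past the max payload check. `parseSize` and `payloadAllowed` are names chosen for this example, not the server's functions.

```go
package main

import (
	"fmt"
	"math"
)

// parseSize parses ASCII decimal digits. It returns -1 on empty or
// non-numeric input, or as soon as the value would exceed math.MaxInt32.
func parseSize(d []byte) int {
	if len(d) == 0 {
		return -1
	}
	var n int64
	for _, c := range d {
		if c < '0' || c > '9' {
			return -1
		}
		n = n*10 + int64(c-'0')
		if n > math.MaxInt32 {
			return -1 // stop early so n itself can never overflow
		}
	}
	return int(n)
}

// payloadAllowed rejects negative (invalid) sizes and sizes above the limit.
func payloadAllowed(size, maxPayload int) bool {
	return size >= 0 && size <= maxPayload
}

func main() {
	for _, in := range []string{"12", "2147483647", "2147483648", "12a"} {
		size := parseSize([]byte(in))
		fmt.Printf("%q -> size=%d allowed=%v\n", in, size, payloadAllowed(size, 1024*1024))
	}
}
```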

Thank you to Aviv Sasson and Ariel Zelivansky from Twistlock
for the security report!

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-01 15:06:08 -06:00
Derek Collison
0f20592fb3 Made leafnode connect a Debugf to be consistent, added first connect Noticef.
Signed-off-by: Derek Collison <derek@nats.io>
2019-06-29 19:11:02 -07:00
Derek Collison
d1a782e014 Messages not distributed evenly when sourced from leafnode.
When messages came from a leafnode they were not being distributed evenly to the destination cluster.

Signed-off-by: Derek Collison <derek@nats.io>
2019-06-11 20:37:49 -07:00
Derek Collison
3cf6f6a5d2 Bug fix for service import with leafnodes and gws
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-31 11:22:02 -07:00
Derek Collison
257b670ae2 Cleaned up logging for leafnodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-30 15:53:14 -07:00
Ivan Kozlovic
d2578f9e05 Update to connect/reconnect error reports logic
Changed the newly introduced option and added a new one. The idea
is to be able to differentiate between never-connected and reconnect
events. The never-connected situation will be logged at the first
attempt and every hour (by default, configurable).
However, once connected and if trying to reconnect, every attempt
is reported by default, but this is configurable too.

These two options are supported for config reload.

Related to #1000
Related to #1001
Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-26 17:51:01 -06:00
Ivan Kozlovic
7272e4e317 Make the error report attempts configurable
This is a continuation of #1000. Added a configuration to specify
the number of attempts at which the repeated error is reported.
The algo is now to print only the 1st attempt and when current
attempt % <this config param> == 0.
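
The reporting rule above can be written as a one-line predicate. This Go sketch uses the hypothetical name `shouldReport`; treating a non-positive setting as "first attempt only" is an assumption, since the commit message does not say how such values are handled.

```go
package main

import "fmt"

// shouldReport returns true for the first attempt and for every attempt
// that is a multiple of the configured value. A non-positive "every" only
// reports the first attempt (an assumption of this sketch).
func shouldReport(attempt, every int) bool {
	if attempt == 1 {
		return true
	}
	return every > 0 && attempt%every == 0
}

func main() {
	for attempt := 1; attempt <= 10; attempt++ {
		if shouldReport(attempt, 5) {
			fmt.Println("report attempt", attempt)
		}
	}
}
```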

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 16:28:48 -06:00
Ivan Kozlovic
03930ba0e4 [UPDATED] Reduce report of failed connection attempts
This applies to routes, gateways and leaf node connections.
The failed attempts will be printed on the first attempt, after the
first minute, and then every hour.
The connect/error statements now include the attempt number.

Note that in debug mode, all attempts are traced, so you may get
double trace (one for debug, one for info/error).

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 10:13:56 -06:00
Derek Collison
042e5a539a Optimize updates for leaf node smaps.
Previously we would walk all clients bound to an account to
collect the leaf nodes for updating of the subscription maps.

Signed-off-by: Derek Collison <derek@nats.io>
2019-05-09 17:25:17 -07:00
Derek Collison
bed56ab9cc Make sure remotes send existing sub interest
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-02 17:05:02 -07:00
Derek Collison
90211e5b39 Be safer on gw and sl access
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-02 15:14:47 -07:00
Derek Collison
5292ec1598 Various fixes, init smap for leafnodes with gateways too
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-02 14:22:51 -07:00
Derek Collison
1d736ccc61 Make sure we use correct MSG prefix when mixing between leafnodes and routes.
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-01 15:08:20 -07:00
Derek Collison
2d0abd66af Bump to RC7, remove conditional panic
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-25 17:09:53 -07:00
Derek Collison
2ec3eaeaa9 Leafnode account based connections limits
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-25 14:40:59 -07:00
Ivan Kozlovic
9f497a6cd4 Revert to use Sublist but use the SublistNoCache version.
Remove sub from rsubs sublist when user UNSUBs.

Fix bench test that was not actually creating a SUB per request
in the Benchmark_Gateways_Requests_CreateOneSubForEach test.
Also UNSUBs older SUBs after a certain threshold to simulate
actual req/reply.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-23 14:13:13 -06:00
Ivan Kozlovic
41436fb787 Updates based on comments
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-22 20:00:21 -06:00
Ivan Kozlovic
bb4e8ae0f9 Gateways: Fix race for request reply
This addresses the following race:
- client connection creates a subscription on a reply subject
- client connection sends a request
- server sends the subscription to inbound gateway
- server sends the message to outbound gateway (those may be
  to different servers)
- receiving server sends to sub interested in request subject
- app sends reply
- its server then checks for interest on the reply's subject

In interestOnly mode, there is a possibility that this server
has not received the interest on the reply subject yet and would
then drop the reply.

This PR detects above scenario and will prefix the reply subject
to identify the origin cluster if it is detected that the last
subscription from the sending connection was created less than
a second ago.
Once the destination has this prefix, the destination cluster
will always send back that message to origin cluster even if
there is no registered interest.
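
A minimal Go sketch of the decision described above: the reply subject is prefixed only when the sending connection's last subscription is less than a second old. The `mapReply` function and the "ORIGIN." prefix are placeholders invented for this example; the actual prefix used by the server is not given in this commit message.

```go
package main

import (
	"fmt"
	"time"
)

// recentSubWindow is the "less than a second ago" threshold from the text.
const recentSubWindow = time.Second

// originPrefix is a placeholder, not the server's real prefix.
const originPrefix = "ORIGIN."

// mapReply prefixes reply with the origin cluster when the connection's
// last subscription was created within recentSubWindow.
func mapReply(reply, originCluster string, sinceLastSub time.Duration) string {
	if sinceLastSub >= 0 && sinceLastSub < recentSubWindow {
		return originPrefix + originCluster + "." + reply
	}
	return reply
}

func main() {
	fmt.Println(mapReply("_INBOX.abc", "A", 100*time.Millisecond))
	fmt.Println(mapReply("_INBOX.abc", "A", 5*time.Second))
}
```

Limiting the prefix to recent subscriptions keeps the common case untouched, since older subscriptions have had time to propagate as interest.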

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-22 20:00:21 -06:00
Ivan Kozlovic
ac49f715c4 Fixed invocations of startGoRoutine (continued)
The start of the leafnode go routines for readloop and writeloop was
missing from PR #961

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-18 10:31:23 -06:00
Derek Collison
bfef3bd5a6 Fix for service import processing across routes for leaf nodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-17 14:37:09 -07:00
Ivan Kozlovic
515ca5e70f LeafNode: do hostname resolution and get random one from result
This is similar to what we do with Gateways.
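
The resolve-then-pick-at-random step might look like the Go sketch below. `pickAddr` is a hypothetical helper; the lookup and the random choice are passed in as functions so the example runs without network access. In real code the standard library's net.LookupHost and rand.Intn have matching signatures.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// pickAddr resolves host with lookup and returns one of the resulting
// addresses, chosen by pick (which must return an index in [0, n)).
func pickAddr(host string, lookup func(string) ([]string, error), pick func(n int) int) (string, error) {
	addrs, err := lookup(host)
	if err != nil {
		return "", err
	}
	if len(addrs) == 0 {
		return "", errors.New("no addresses for " + host)
	}
	return addrs[pick(len(addrs))], nil
}

func main() {
	// Fake resolver standing in for net.LookupHost.
	fake := func(string) ([]string, error) {
		return []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}, nil
	}
	addr, err := pickAddr("leaf.example.com", fake, rand.Intn)
	fmt.Println(addr, err)
}
```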

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-09 16:33:19 -06:00
Ivan Kozlovic
6b1918efb4 LeafNode: support for advertise
A server that creates a LeafNode connection to a remote cluster
will now be notified of all possible LeafNode URLs in that cluster.
The list is updated when nodes in the cluster come and go.

Also support for advertise address, similar to cluster, gateway, etc..

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-08 10:54:39 -06:00
Derek Collison
25f51884a2 add route map updates back in
Signed-off-by: Derek Collison <derek@nats.io>
2019-03-26 09:50:53 -07:00
Derek Collison
19c4ccecb8 Better handling of inline info, bug fix for gw and leafnode interest ref count
Signed-off-by: Derek Collison <derek@nats.io>
2019-03-25 15:15:11 -07:00
Derek Collison
bacb73a403 First pass at leaf nodes. Basic functionality working, including gateways.
What is not completed:
1. TLS
2. config to bind local account.
3. Info updates for solicitor to track topology changes like a client.
4. CONNECT sent after INFO for nonce authorization.
5. Authorization
6. Services and Streams tests.
7. config file parsing.

Signed-off-by: Derek Collison <derek@nats.io>
2019-03-25 08:54:47 -07:00