Commit Graph

361 Commits

Author SHA1 Message Date
Ivan Kozlovic
0873b46f67 [FIXED] LeafNode urls may be missing in INFO sent to LN connections
When a cluster of servers are having routes to each other, there
is a chance that the list of leafnode URLs maintained on each
server is not complete. This would result in LN servers connecting
to this cluster to not get the full list of possible URLs the
server could reconnect to.

Also fixed a DATA RACE that appeared when running the updated
TestLeafNodeInfoURLs test. Fixed the race and added specific
test that easily demonstrated the race: TestLeafNodeNoRaceGeneratingNonce

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-12 19:15:30 -06:00
Derek Collison
18a2c357e4 Merge pull request #1072 from nats-io/handshake
Report authorization error and use TLS hostname for IPs on leafnodes.
2019-07-12 14:11:53 -07:00
Derek Collison
a795920dc3 Report authorization error and use TLS hostname for IPs on leafnodes.
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-12 13:57:16 -07:00
Ivan Kozlovic
37d08a6c56 [FIXED] Allow TLS InsecureSkipVerify again
This has an effect only on connections created by the server,
so routes and gateways (explicit and implicit).
Make sure that an explicit warning is printed if the insecure
property is set, but otherwise allow it.

Resolves #1062

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-12 12:10:28 -06:00
Derek Collison
951ae49100 Prevent multiple solicited leafnodes from forming cycles.
When a solicited leafnode comes from multiple servers that themselves are a cluster, cycles were formed.
This change allows solicited leafnodes to behave similar to gateways in that each server of a cluster
is expected to have a solicted leafnode per destination account and cluster.

We no longer forward subscription interest or messages to a cluster from a server that has a solicited leafnode.

Signed-off-by: Derek Collison <derek@nats.io>
2019-07-10 20:16:47 -07:00
Derek Collison
10d4f1ab7a Convert leafnode solicited remotes to array
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-10 11:53:34 -07:00
Derek Collison
a61d32a82c Test for staggered leafnodes and sub/pub. Verifies fix for #1066
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-10 09:57:43 -07:00
Derek Collison
074c87d49e Merge pull request #1060 from nats-io/gr
Make sure we route  responses across leafnodes
2019-07-08 17:07:57 -07:00
Derek Collison
49707317a1 Make sure we route responses across leafnodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-08 16:20:40 -07:00
Derek Collison
f76a6b9a5c When a bound account's maxpayload is not the same make sure we send it to clients that can do async INFO.
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-08 15:20:23 -07:00
Ivan Kozlovic
156511bba7 [FIXED] Check of maxpayload could be bypassed if size overruns int32
One could craft a PUB protocol to cause server to panic. This can
happen if the size in the PUB protocol overruns an int32.

(note that if authorization is enabled, the user would need to
authenticate first, limiting the impact).

Thank you to Aviv Sasson and Ariel Zelivansky from Twistlock
for the security report!

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-07-01 15:06:08 -06:00
Derek Collison
e83e0a7f5c Merge pull request #1048 from nats-io/ping
Stager first ping from server and suppress pings if a ping was received.
2019-07-01 12:06:32 -07:00
Derek Collison
e11a959584 Send ping when RTT update needed
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-01 11:58:06 -07:00
Derek Collison
8a3db71ad5 Updates from comments
Signed-off-by: Derek Collison <derek@nats.io>
2019-07-01 08:47:13 -07:00
Derek Collison
100d0d2b02 Use default port for leafnode remote if not specified
Signed-off-by: Derek Collison <derek@nats.io>
2019-06-29 17:50:21 -07:00
Derek Collison
ebd4deb8b9 Stager first ping from server and suppress pings if a ping was received.
Signed-off-by: Derek Collison <derek@nats.io>
2019-06-29 15:43:15 -07:00
Derek Collison
5b42b99dc1 Allow operator to be inline JWT. Also preloads just warn on validation issues, do not stop starting or reloads.
We issue validation warnings now to the log.

Signed-off-by: Derek Collison <derek@nats.io>
2019-06-24 16:46:22 -07:00
Derek Collison
d1a782e014 Messages not distributed evenly when sourced from leafnode.
When messages came from a leafnode there were not being distributed evenly to the destination cluster.

Signed-off-by: Derek Collison <derek@nats.io>
2019-06-11 20:37:49 -07:00
Ivan Kozlovic
ed1901c792 Update go.mod to satisfy v2 requirements
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-06-03 19:45:47 -06:00
Derek Collison
2a8e630bf1 Fix for leafnode and dq selection over GWs
Signed-off-by: Derek Collison <derek@nats.io>
2019-06-01 16:43:54 -07:00
Derek Collison
adba6dc023 Add in leafnode bound account events for accounting
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-31 16:58:27 -07:00
Derek Collison
3cf6f6a5d2 Bug fix for service import with leafnodes and gws
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-31 11:22:02 -07:00
Derek Collison
874f06a212 Fix bugs on reloadAuthorization
When tls is on routes it can cause reloadAuthorization to be called.
We were assuming configured accounts, but did not copy the remote map.
This copies the remote map when transferring for configured accounts
and also handles operator mode. In operator mode we leave the accounts
in place, and if we have a memory resolver we will remove accounts that
are not longer defined or have bad claims.

Signed-off-by: Derek Collison <derek@nats.io>
2019-05-29 13:19:58 -07:00
Ivan Kozlovic
4ed08dde07 Merge pull request #1013 from nats-io/fix_gw_qinterest_loss
Fixed loss of queue subscription interest across Gateways in some cases
2019-05-26 18:23:06 -06:00
Ivan Kozlovic
ce1e6defab Fix flappers
- TestSystemAccountConnectionUpdatesStopAfterNoLocal: I believe that
  the check on number of notifications was wrong. Since we did not
  consume the ones for the connect, the expected count after the
  disconnect is 8 instead of 4.

- Possible fix GW tests complaining about number of outbound/inbound
  I think that it may be possible that connection does not succeed
  right away (remote to fully started, etc) and due to dial timeout
  and reconnect attempt delay, I suspect that when given a max time
  of 1sec to complete, it may not be enough.
  Quick change for now is to override to 2secs for now in the
  wait helpers. If that proves conclusive, we could remove the
  timeout given to these helpers.

- TestGatewaySendAllSubsBadProtocol: used a t.Fatalf() in checkFor
  instead of return fmt.Errorf().

- TestLeafNodeResetsMSGProto: this test is not about change to
  interest mode only, so to avoid possible mix of protos, delay
  a bit creation of gateway after creation of leaf node.

- Some defer s.Shutdown() were missing

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-26 17:17:08 -06:00
Ivan Kozlovic
b325cf1e4a Fixed loss of queue subscription interest across Gateways in some cases
Suppose two servers, SA in cluster A and SB in cluster B. If SA
sends a message to SB on an account for which there is no interest
at all (account not known or no subscription), SB will send an A-
and keep track that it sent an A- for this account.

When a queue subscription is created on SB, SB will send and RS+
to A because A needs to have perfect knowledge of all queue subs
in all clusters.

If then a regular subscription is also created on SB, SB will
think that it needs to send an A+ because it had sent an A- for
this account. However, SA had an entry for this account for the
queue sub. The A+ would clear the entry in the map and would cause
SA to not send messages to SB even if they would have been a
match for the queue sub on SB.

We fix this in two ways:
- Clear the possible A- in SB when sending an RS+ for queue sub
- Processing of A-/A+ to be aware of a possible entry in the map
  due to queue subs.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-25 16:27:00 -06:00
Ivan Kozlovic
48c3f7f846 Fixed some flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 09:53:35 -06:00
Ivan Kozlovic
97ee89cc67 Check inbound GW connection connected state in parser
If the first protocol for an inbound gateway connection is not
CONNECT, reject with auth violation.

Fixes #1006

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-22 12:31:16 -06:00
Derek Collison
d7140a0fd1 Update for client rename
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-10 15:11:30 -07:00
Derek Collison
acfe372d63 Changes for rename from gnatsd -> nats-server
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-06 15:04:24 -07:00
Derek Collison
bed56ab9cc Make sure remotes send existing sub interest
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-02 17:05:02 -07:00
Derek Collison
5292ec1598 Various fixes, init smap for leafnodes with gateways too
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-02 14:22:51 -07:00
Derek Collison
1d736ccc61 Make sure we use correct MSG prefix when mixing between leafnodes and routes.
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-01 15:08:20 -07:00
Ivan Kozlovic
dce9d672c1 Fixed panic with leafnode and gateway when no interest registered
Say there are 2 clusters, A and B. A client connects to A and
publishes messages on an account that B has no interest in.
Then a leaf node server connects to B (using same account than
the no-interest is for). Cluster B will ask cluster
A to switch to interest mode only for leaf node account. This
would cause a panic.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-01 13:40:17 -06:00
Derek Collison
1a6c9819f3 Added account export/import benchmark
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-26 12:03:00 -07:00
Derek Collison
2ec3eaeaa9 Leafnode account based connections limits
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-25 14:40:59 -07:00
Derek Collison
f320f318b7 Fixed merge conflict
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-23 17:28:42 -07:00
Derek Collison
bfe83aff81 Make account lookup faster with sync.Map
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-23 17:13:23 -07:00
Ivan Kozlovic
9f497a6cd4 Revert to use Sublist but use the SublistNoCache version.
Remove sub from rsubs sublist when user UNSUBs.

Fix bench test that was not actually creating a SUB per request
in the Benchmark_Gateways_Requests_CreateOneSubForEach test.
Also UNSUBs older SUBs after a certain threshold to simulate
actual req/reply.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-23 14:13:13 -06:00
Ivan Kozlovic
bb4e8ae0f9 Gateways: Fix race for request reply
This addresses the following race:
- client connection creates a subscription on a reply subject
- client connection sends a request
- server sends the subscription to inbound gateway
- server sends the message to outbound gateway (those may be
  to different servers)
- receiving server sends to sub interested in request subject
- app sends reply
- its server then check for interest on the reply's subject

In interestOnly mode, there is a possibility that this server
has not received the interest on the reply subject yet and would
then drop the reply.

This PR detects above scenario and will prefix the reply subject
to identify the origin cluster if it is detected that the last
subscription from the sending connection was created less than
a second ago.
Once the destination has this prefix, the destination cluster
will always send back that message to origin cluster even if
there is no registered interest.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-22 20:00:21 -06:00
Waldemar Quevedo
c3ee84a543 Support using SANs present in TLS cert for auth.
Also try multiple email and SANs found in cert until one valid
otherwise, default to the subject in the cert.

```
authorization {
  users [
    { user = "app.nats.dev", permissions = {
	publish {
	  allow = ["sandbox.>"]
	}
	subscribe {
	  allow = ["sandbox.>"]
	}
      }
    }
  ]
}
```

Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2019-04-20 00:59:45 +09:00
Derek Collison
b1d0ec10c6 Merge pull request #959 from nats-io/add_leafnode_test
Test for leafnodes, service imports and clusters
2019-04-17 15:13:15 -07:00
Derek Collison
bfef3bd5a6 Fix for service import processing across routes for leaf nodes
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-17 14:37:09 -07:00
Ivan Kozlovic
d8098c134b Reduce startup memory for gateways
Similar to #956 but for gateways code.
Also fixing route test TestLargeClusterMem.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-17 15:18:46 -06:00
Derek Collison
f1d06d6c5b Reduce startup memory for cluster
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-17 11:05:29 -07:00
Ivan Kozlovic
bda267ec2c Add LeafNode import/export test with routes
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-16 19:26:28 -06:00
Ivan Kozlovic
4dd1b26cc5 Add a warning if cluster's insecure setting is enabled
For cluster, we allow to skip hostname verification from certificate.
We now print a warning when this option is enabled, both on startup
or if the property is enabled on config reload.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-09 17:37:53 -06:00
Ivan Kozlovic
98161722dc Merge pull request #930 from nats-io/route_send_subs_go_routine_threshold
Conditional send of routed subs from a go routine
2019-04-08 14:03:41 -06:00
Ivan Kozlovic
6b1918efb4 LeafNode: support for advertise
A server that creates a LeafNode connection to a remote cluster
will now be notified of all possible LeafNode URLs in that cluster.
The list is updated when nodes in the cluster come and go.

Also support for advertise address, similar to cluster, gateway, etc..

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-08 10:54:39 -06:00
Ivan Kozlovic
2a86112a30 Conditional send of routed subs from a go routine
When a route is established, it is possible that each server sends
its list of subscriptions to each other at the same time. Doing
it in place from the readLoop could then cause problems because
each side could reach a point where the outbound socket buffer
is full and no one is dequeuing data (since readLoop is doing
the send of the subs list).
We changed sending this list from a go routine. However, for small
number of subscriptions, it is not required and was causing some
of the tests to fail because of timing issues.

We will now send in place if the estimated size of all protocols
is below a give threshold (1MB).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-26 17:21:33 -06:00