Commit Graph

62 Commits

Author SHA1 Message Date
Derek Collison
dc55356096 Have events look at whether or not a leaf is a hub, regardless of solicit
Signed-off-by: Derek Collison <derek@nats.io>
2020-04-13 15:25:21 -07:00
Derek Collison
82f585d83a Updated to also resend leafnode connect on GW connect via first INFO
Signed-off-by: Derek Collison <derek@nats.io>
2020-04-08 09:55:19 -07:00
Derek Collison
43fbe0ffed This commit allows new servers in a supercluster to be informed of accounts with active leafnode connections.
This is needed to put those accounts into interest only mode for inbound gateway connections. Also added code
to make sure we were doing proper account tracking and would track the global account as well, which used to
be excluded.

Fixes #977

Signed-off-by: Derek Collison <derek@nats.io>
2020-04-07 16:22:15 -07:00
Matthias Hanel
6a1c3fc29b Moving inbound tracing to the caller (client.parse)
Tracing for outgoing operations is always done while
holding the client lock.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-03-04 17:31:18 -05:00
Matthias Hanel
f5bd07b36c [FIXED] trace/debug/sys_log reload will affect existing clients
Fixed #1296, by altering client state on reload

Detect a trace level change on reload and update all clients.
To avoid data races, read client.trace while holding the lock, and
pass the value into functions that trace while not holding the lock.
Delete unused client.debug.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-03-04 13:54:15 -05:00
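
The locking pattern described in f5bd07b36c can be sketched as follows. This is a minimal illustration, not the server's code: the `client` type, `setTrace`, `traceEnabled`, and `traceInOp` are hypothetical names, and the trace output format is invented. The flag is snapshotted under the lock and passed by value to code that runs without the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for a server-side connection.
type client struct {
	mu    sync.Mutex
	trace bool
}

// setTrace is what a config reload would call on every existing client.
func (c *client) setTrace(on bool) {
	c.mu.Lock()
	c.trace = on
	c.mu.Unlock()
}

// traceEnabled snapshots the flag while holding the lock.
func (c *client) traceEnabled() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.trace
}

// traceInOp runs without the lock. It never touches c.trace and relies
// only on the value the caller read under the lock.
func traceInOp(trace bool, op string) string {
	if !trace {
		return ""
	}
	return "<<- [" + op + "]"
}

func main() {
	c := &client{}
	fmt.Printf("%q\n", traceInOp(c.traceEnabled(), "PING"))
	c.setTrace(true) // simulated reload
	fmt.Printf("%q\n", traceInOp(c.traceEnabled(), "PING"))
}
```

Passing the value rather than the client keeps the unlocked function free of shared state, which is why a reload can flip the flag at any time without a race.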
Matthias Hanel
bf952a3807 Adding option to enable tracing the system account. (default: false)
Use sys_trace option in config file or --sys_trace on the command line

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-03-01 19:42:40 -05:00
Ivan Kozlovic
b78ca2f63b Fixes for system events
- Call flushOutbound() for SYSTEM connections
- Flush in place in internalSendLoop when sending the shutdown event
- Fix some tests:
  - missing defer client connection Close()
  - ensure subs are registered and messages received before shutdown
    of leafnode server to check disconnected event's stats.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-12-04 20:55:55 -07:00
Derek Collison
07ab23af0d Need return when acc not found
Signed-off-by: Derek Collison <derek@nats.io>
2019-11-17 09:24:34 -08:00
Derek Collison
7b1bea61e2 Merge pull request #1192 from nats-io/load_account
Do not fetch accounts on system events.
2019-11-16 18:33:23 -08:00
Derek Collison
093b57ed40 Do not fetch accounts on system events.
Noticed we would lookup accounts, but would also fetch them when tracking remote connections, etc.

Signed-off-by: Derek Collison <derek@nats.io>
2019-11-16 18:05:42 -08:00
Derek Collison
6ad8287bbe Introduced wildcard handling of _R_ mapped replies.
We had too much special processing, so reduced to a single wildcard
which will propagate across routes and gateways and is consistent
with gateway handling of globally routed subjects and timeouts.

Signed-off-by: Derek Collison <derek@nats.io>
2019-11-16 12:50:53 -08:00
Derek Collison
3330820502 Fixed a bug where we leaked service imports. Prior to this fix, subscriptions would have leaked as well.
Signed-off-by: Derek Collison <derek@nats.io>
2019-11-14 13:29:17 -08:00
Ivan Kozlovic
8a8695d07c Backward compatibility with previous servers
Want to keep this commit separate so that we can easily remove
when we no longer want to support both prefixes.

- If this server receives a "$GR." message, it takes the subject
  and tries to process this locally. If there is no cluster race,
  the reply may be received OK (as before).
- If this server sends a routed reply, it detects if sending to
  an older server (then uses $GR.) or not (then uses $GNR)
- Gateway INFO has a new field that indicates if the server is
  using the new prefix.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-11-08 16:22:34 -07:00
Ivan Kozlovic
9b7dab0548 Updates based on code review
- Add atomic in client to skip check in processInboundClientMsg()
  if value is 0. Avoids getting the lock in fast path if not needed.
- Have a timer per client instead of the global server list that
  was expiring: noticed a lot of contention there when running
  some perf/profiling tests. The timer is also not reset for
  every timestamp that is not yet expired since this too affects
  performance. Instead, it fires at a regular interval and is cleared
  when the map is empty after a cycle.
- Move processing of the gw mapped reply into its own function (in inbound msg).
  I have verified that this is inlined same way as when code was
  directly in processInboundClientMsg.
- Use string(subj[]) for prefix detection: I have verified that
  it is actually faster.
- Builds the RMSG with appends to local buffer in handleGatewayReply()
  instead of using fmt.Sprintf().

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-11-08 15:56:28 -07:00
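
Two of the techniques listed in 9b7dab0548 (an atomic counter that lets the fast path skip the lock, and building a protocol line with appends instead of fmt.Sprintf) might look roughly like this. All names here (`client`, `addMapping`, `lookup`, `buildHeader`) are hypothetical, and the header layout is only illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
	"sync/atomic"
)

// client is a hypothetical stand-in; field names are illustrative.
type client struct {
	mu      sync.Mutex
	mapped  int32             // number of live mappings, read atomically
	replies map[string]string // mapped reply subject -> original subject
}

// addMapping records a mapping and publishes the new count atomically.
func (c *client) addMapping(mappedSubj, orig string) {
	c.mu.Lock()
	if c.replies == nil {
		c.replies = make(map[string]string)
	}
	c.replies[mappedSubj] = orig
	atomic.StoreInt32(&c.mapped, int32(len(c.replies)))
	c.mu.Unlock()
}

// lookup returns early without locking when no mappings exist.
func (c *client) lookup(subj string) (string, bool) {
	if atomic.LoadInt32(&c.mapped) == 0 {
		return "", false // fast path: no lock taken
	}
	c.mu.Lock()
	orig, ok := c.replies[subj]
	c.mu.Unlock()
	return orig, ok
}

// buildHeader assembles a protocol line by appending to a caller-supplied
// buffer, avoiding the formatting overhead of fmt.Sprintf.
func buildHeader(buf []byte, account, subject string, size int) []byte {
	buf = append(buf, "RMSG "...)
	buf = append(buf, account...)
	buf = append(buf, ' ')
	buf = append(buf, subject...)
	buf = append(buf, ' ')
	buf = strconv.AppendInt(buf, int64(size), 10)
	return append(buf, "\r\n"...)
}

func main() {
	c := &client{}
	_, ok := c.lookup("mapped.reply")
	fmt.Println(ok)
	c.addMapping("mapped.reply", "_INBOX.1")
	orig, ok := c.lookup("mapped.reply")
	fmt.Println(orig, ok)
	var scratch [64]byte
	fmt.Printf("%q\n", buildHeader(scratch[:0], "ACC", "foo", 5))
}
```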
Ivan Kozlovic
aa843945c9 Work on Gateways reply mapping
- New prefix that includes origin server for the request
- Mapping done if request is service import or requestor has
  recent subscription
- Subscription considered recent if less than 250ms
- Destination server strip GW prefix before giving to client
  and restore when getting a reply on that subject
- Mapping removed after 250ms
- Server rejects client publish on "$GNR." (the new prefix)
- Cluster and server hash are now 8 chars long and from base 62
  alphabets
- Mapped replies need to be sent to leafnode servers due to race
  (cluster B sends RS+ on GW inbound then RMSG on outbound, the
  RS+ may be processed later and cluster A may have given message
  to LN before RS+ on reply subject. So LN needs to accept the
  mapped reply but will strip to give to client and reassemble
  before sending it back)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-11-06 16:06:49 -07:00
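
As a rough sketch of the ideas in aa843945c9, the code below derives 8-character hashes over a base-62 alphabet, prepends them to a reply subject under the "$GNR." prefix, and strips them again before delivery. The hashing scheme (SHA-256 bytes reduced modulo 62), the exact token layout, and the function names are assumptions made for illustration. The 250ms expiry is not modeled.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

const (
	base62      = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	hashLen     = 8
	replyPrefix = "$GNR."
)

// hash8 maps a name to 8 characters drawn from the base-62 alphabet.
// Assumption: the real server may derive its hashes differently.
func hash8(name string) string {
	sum := sha256.Sum256([]byte(name))
	out := make([]byte, hashLen)
	for i := range out {
		out[i] = base62[int(sum[i])%len(base62)]
	}
	return string(out)
}

// mapReply prefixes a reply subject with the origin cluster and server
// hashes. The layout is hypothetical.
func mapReply(cluster, server, reply string) string {
	return replyPrefix + hash8(cluster) + "." + hash8(server) + "." + reply
}

// stripReply removes the prefix and both hash tokens, returning the
// subject the client originally used.
func stripReply(subj string) (string, bool) {
	if !strings.HasPrefix(subj, replyPrefix) {
		return subj, false
	}
	rest := subj[len(replyPrefix):]
	skip := 2 * (hashLen + 1) // two hashes, each followed by a dot
	if len(rest) <= skip || rest[hashLen] != '.' || rest[2*hashLen+1] != '.' {
		return subj, false
	}
	return rest[skip:], true
}

func main() {
	mapped := mapReply("clusterA", "server1", "_INBOX.abc")
	fmt.Println(mapped)
	orig, ok := stripReply(mapped)
	fmt.Println(orig, ok)
}
```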
Derek Collison
13f217635f Wait on requestor RTT when tracking latency.
If a client RTT for a requestor is longer than a service RTT, the requestor latency was often zero.
We now wait for the RTT (if zero) before sending out the metric.

Signed-off-by: Derek Collison <derek@nats.io>
2019-10-31 08:02:45 -07:00
Ivan Kozlovic
0da1afaf88 Fixed data race
Resolves #1176

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-30 20:10:37 -06:00
Ivan Kozlovic
12eb1f5b00 [ADDED] Server name in the RouteStat for statsz
Add the remote server name for a route in the statsz event

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-10-25 16:34:07 -06:00
R.I.Pienaar
bcf96fa1de Allows a descriptive server_name to be set
This adds a new config option server_name that
when set will be exposed in varz, events and more
as a descriptive name for the server.

If unset though the server_name will default to the pk

Signed-off-by: R.I.Pienaar <rip@devco.net>
2019-10-17 18:51:19 +02:00
Ivan Kozlovic
150d47cab3 [FIXED] Locking issue around account lookup/updates
Ensure that lookupAccount does not hold server lock during
updateAccount and fetchAccount.
Updating the account cannot have the server lock because it is
possible that during updateAccountClaims(), clients are being
removed, which would try to get the server lock (deep down in
closeConnection/s.removeClient).
Added a test that would have shown the deadlock prior to changes
in this PR.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-17 18:48:23 -06:00
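
The deadlock described in 150d47cab3 comes from holding the server lock while running code that may need the same lock again. A minimal sketch of the safe ordering, with hypothetical `server` and `account` types and simplified method bodies, is shown below. Go's sync.Mutex is not reentrant, so calling `updateAccount` with `s.mu` held would block forever in `removeClient`.

```go
package main

import (
	"fmt"
	"sync"
)

type account struct{ name string }

// server is a hypothetical stand-in with one lock guarding both maps.
type server struct {
	mu       sync.Mutex
	accounts map[string]*account
	clients  map[string]struct{}
}

// removeClient needs the server lock, as client teardown does.
func (s *server) removeClient(id string) {
	s.mu.Lock()
	delete(s.clients, id)
	s.mu.Unlock()
}

// updateAccount may close clients, so it must run without s.mu held.
func (s *server) updateAccount(a *account) {
	s.removeClient("c1") // simulated client removal during the update
}

// lookupAccount reads the map under the lock, then releases the lock
// before doing the update.
func (s *server) lookupAccount(name string) *account {
	s.mu.Lock()
	a := s.accounts[name]
	s.mu.Unlock() // release first: updateAccount re-acquires s.mu
	if a != nil {
		s.updateAccount(a)
	}
	return a
}

func main() {
	s := &server{
		accounts: map[string]*account{"A": {name: "A"}},
		clients:  map[string]struct{}{"c1": {}},
	}
	a := s.lookupAccount("A")
	fmt.Println(a.name, len(s.clients))
}
```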
Derek Collison
52430c304a System level services for debugging.
This is the first pass at introducing exported services to the system account for general debugging of black-box systems.
The first service reports number of subscribers for a given subject. The payload of the request is the subject, and optional queue group, and can contain wildcards.

Signed-off-by: Derek Collison <derek@nats.io>
2019-09-17 09:37:35 -07:00
Derek Collison
94f143ccce Latency tracking updates.
Will now break out the internal NATS latency to show requestor client RTT, responder client RTT, and any internal latency caused by hopping between servers, etc.

Signed-off-by: Derek Collison <derek@nats.io>
2019-09-11 16:43:19 -07:00
Ivan Kozlovic
4253b31dcf [FIXED] Circular account service import dependency
If account A imports from B and B from A, when the account A
is built, it causes B to be fetched, but since B imports from A,
A was fetched/built again in an infinite loop.

Resolves #1117

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-10 18:05:21 -06:00
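
One generic way to break the cycle described in 4253b31dcf is to register an account before resolving its imports, so a recursive request for the same account finds the partially built entry instead of fetching again. The sketch below illustrates that idea only. It is not necessarily how the fix was implemented, and the `resolver` type and its fields are hypothetical.

```go
package main

import "fmt"

type account struct {
	name    string
	imports []*account
}

// resolver is a hypothetical builder: defs lists each account's imports,
// built caches accounts, fetches counts how often a definition is loaded.
type resolver struct {
	defs    map[string][]string
	built   map[string]*account
	fetches int
}

func (r *resolver) build(name string) *account {
	if a, ok := r.built[name]; ok {
		return a // already built or in progress: do not fetch again
	}
	r.fetches++
	a := &account{name: name}
	r.built[name] = a // register before resolving imports
	for _, dep := range r.defs[name] {
		a.imports = append(a.imports, r.build(dep))
	}
	return a
}

func main() {
	r := &resolver{
		defs:  map[string][]string{"A": {"B"}, "B": {"A"}},
		built: map[string]*account{},
	}
	a := r.build("A")
	fmt.Println(a.name, a.imports[0].name, r.fetches)
}
```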
Derek Collison
7989118c3f First pass latency tracking for exported services
Signed-off-by: Derek Collison <derek@nats.io>
2019-08-30 10:52:48 -07:00
Guangming Wang
927991321d Cleanup: fix some typos in code comment
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-22 21:36:37 +08:00
Derek Collison
8f5bc503e5 Add ability for cross account import services to return streams as well as singletons.
Take into account tracking of response maps that are created and do proper cleanup.
Also fixes #1089 which was discovered while working on this.

Signed-off-by: Derek Collison <derek@nats.io>
2019-08-06 14:15:40 -07:00
Ivan Kozlovic
ed1901c792 Update go.mod to satisfy v2 requirements
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-06-03 19:45:47 -06:00
Ivan Kozlovic
7f2620904c Fixed setting timer for account connection updates
The timer was not set with the proper variable, which caused the
check to always think that a new timer should be created, which
would lead to more and more timers being created which translated
to updates being sent more and more frequently.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-29 14:28:26 -06:00
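
The class of bug described in 7f2620904c (checking one variable but storing the timer in another, so the check never sees an existing timer) can be illustrated with a small sketch. The `connTracker` type, its fields, and the interval are hypothetical. The point is that the nil check and the assignment use the same field.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// updateInterval is an arbitrary value chosen for the example.
const updateInterval = 30 * time.Second

// connTracker is a hypothetical stand-in for per-account update state.
type connTracker struct {
	mu      sync.Mutex
	timer   *time.Timer
	created int // how many timers were ever created
}

// scheduleUpdate arms a timer only if none is pending.
func (ct *connTracker) scheduleUpdate(send func()) {
	ct.mu.Lock()
	defer ct.mu.Unlock()
	if ct.timer != nil {
		return // a timer is already pending
	}
	ct.created++
	// Store into the same field the check above reads.
	ct.timer = time.AfterFunc(updateInterval, func() {
		ct.mu.Lock()
		ct.timer = nil
		ct.mu.Unlock()
		send()
	})
}

// stop cancels any pending timer.
func (ct *connTracker) stop() {
	ct.mu.Lock()
	if ct.timer != nil {
		ct.timer.Stop()
		ct.timer = nil
	}
	ct.mu.Unlock()
}

func main() {
	ct := &connTracker{}
	for i := 0; i < 3; i++ {
		ct.scheduleUpdate(func() {})
	}
	fmt.Println(ct.created) // prints 1
	ct.stop()
}
```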
Derek Collison
6584a9a828 lint updates
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-06 15:41:38 -07:00
Derek Collison
acfe372d63 Changes for rename from gnatsd -> nats-server
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-06 15:04:24 -07:00
Derek Collison
5292ec1598 Various fixes, init smap for leafnodes with gateways too
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-02 14:22:51 -07:00
Derek Collison
2ec3eaeaa9 Leafnode account based connections limits
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-25 14:40:59 -07:00
Derek Collison
bfe83aff81 Make account lookup faster with sync.Map
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-23 17:13:23 -07:00
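
For context on bfe83aff81, a lookup table backed by sync.Map might look like the sketch below. The `server` and `account` types and the method names are hypothetical. sync.Map suits this kind of read-mostly table because loads of existing keys generally avoid lock contention.

```go
package main

import (
	"fmt"
	"sync"
)

type account struct{ name string }

// server is a hypothetical stand-in holding accounts in a sync.Map.
type server struct {
	accounts sync.Map // account name (string) -> *account
}

// registerAccount stores the account unless one already exists under
// that name, and returns whichever instance is in the map.
func (s *server) registerAccount(name string) *account {
	v, _ := s.accounts.LoadOrStore(name, &account{name: name})
	return v.(*account)
}

// lookupAccount reads without taking an explicit lock.
func (s *server) lookupAccount(name string) *account {
	if v, ok := s.accounts.Load(name); ok {
		return v.(*account)
	}
	return nil
}

func main() {
	s := &server{}
	first := s.registerAccount("A")
	second := s.registerAccount("A")
	fmt.Println(first == second, s.lookupAccount("A").name, s.lookupAccount("B") == nil)
}
```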
Derek Collison
bacb73a403 First pass at leaf nodes. Basic functionality working, including gateways.
What is not completed:
1. TLS
2. config to bind local account.
3. Info updates for solicitor to track topology changes like a client.
4. CONNECT sent after INFO for nonce authorization.
5. Authorization
6. Services and Streams tests.
7. config file parsing.

Signed-off-by: Derek Collison <derek@nats.io>
2019-03-25 08:54:47 -07:00
Derek Collison
c5510d616e Remove delay for global statsz, bump RC version
Signed-off-by: Derek Collison <derek@nats.io>
2019-02-12 15:17:29 -08:00
Derek Collison
c385834f96 Some cleanup on outbound and flush
Signed-off-by: Derek Collison <derek@nats.io>
2019-02-08 19:12:51 -08:00
Derek Collison
af78552549 Move ints to proper sizes for all
Signed-off-by: Derek Collison <derek@nats.io>
2019-02-05 15:19:59 -08:00
Ivan Kozlovic
7449e9ac53 Replace megacheck with staticcheck
Fixed issues reported by staticcheck

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-01-09 14:14:47 -07:00
Derek Collison
cc5873cd72 Added start time to Statsz from server.
Added in more debug for imports processing.
Changed subs reporting for Statsz.

Signed-off-by: Derek Collison <derek@nats.io>
2018-12-19 13:19:00 -08:00
Derek Collison
7fb2886098 Add total to account conn updates
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-08 18:52:04 -08:00
Derek Collison
2ab23ca307 Make public for tooling
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-08 18:33:23 -08:00
Derek Collison
a92ef0252c Should not send disconnect events on account $G.
Converted to authorization error events on different subject.
Add cluster name if gateways are configured and pass in INFO to clients.

Signed-off-by: Derek Collison <derek@nats.io>
2018-12-08 16:07:02 -08:00
Derek Collison
c83d7f8851 Support server ping for statusz
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-07 08:42:01 -08:00
Derek Collison
c5ee8b2cff Server sequences outbound may not appear sequential to other listening servers.
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-06 16:52:13 -08:00
Derek Collison
7b0f2426fa Internal clients aren't weighed against limits
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-06 14:23:59 -08:00
Derek Collison
ef5764eea0 Bump version, add RTT to StatsZ
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-06 11:46:14 -08:00
Derek Collison
18bca5603f Added server version and cluster name to statsz.
Fixed account connection accounting sending after local connections is 0.

Signed-off-by: Derek Collison <derek@nats.io>
2018-12-06 10:57:39 -08:00
Derek Collison
b9aa2a3da4 Enforce account limits on system account too
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-06 08:37:22 -08:00
Derek Collison
2d54fc3ee7 Account lookup failures, account and client limits, options reload.
Changed account lookup and validation failures to be more understandable by users.
Changed limits to be -1 for unlimited to match jwt pkg.

The limits change exposed problems with options holding real objects, causing issues with reload tests under race mode.
Longer term this code should be reworked such that options only hold config data, not real structs, etc.

Signed-off-by: Derek Collison <derek@nats.io>
2018-12-05 14:25:40 -08:00
Derek Collison
53c70e6ce1 Use atomic.Load
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-04 09:09:27 -08:00