Commit Graph

2057 Commits

Author SHA1 Message Date
Derek Collison
376eee46e9 Merge pull request #1022 from nats-io/sc
Add chunk and total bytes to slow consumer log
2019-05-30 10:17:52 -07:00
Derek Collison
42a7797a50 Add chunk and total bytes to slow consumer log
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-30 09:15:20 -07:00
Derek Collison
eb6689143d Merge pull request #1020 from nats-io/bug
Fix for reloadAuthorization bugs
2019-05-29 14:20:58 -07:00
Ivan Kozlovic
44554057d1 Merge pull request #1019 from nats-io/warn_readloop_busy
Print warning if code in readloop execute for more than threshold
2019-05-29 15:16:31 -06:00
Derek Collison
a94fe78c22 Check for resolver reload to no resolver
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-29 14:00:51 -07:00
Ivan Kozlovic
8b78b97a67 Merge pull request #1021 from nats-io/fix_acc_conn_update_timer
Fixed setting timer for account connection updates
2019-05-29 15:00:49 -06:00
Ivan Kozlovic
7f2620904c Fixed setting timer for account connection updates
The timer was not set with the proper variable, which caused the
check to always think that a new timer should be created, which
would lead to more and more timers being created which translated
to updates being sent more and more frequently.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-29 14:28:26 -06:00
Derek Collison
874f06a212 Fix bugs on reloadAuthorization
When tls is on routes it can cause reloadAuthorization to be called.
We were assuming configured accounts, but did not copy the remote map.
This copies the remote map when transferring for configured accounts
and also handles operator mode. In operator mode we leave the accounts
in place, and if we have a memory resolver we will remove accounts that
are not longer defined or have bad claims.

Signed-off-by: Derek Collison <derek@nats.io>
2019-05-29 13:19:58 -07:00
Ivan Kozlovic
33505e4849 Print warning if code in readloop execute for more than threshold
Issue a warning in readLoop if execution of code after connection
Read() until end of for loop reaches a certain threshold.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-28 18:33:48 -06:00
Derek Collison
bd589fb20c Warn on no random client
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-28 16:17:25 -07:00
Ivan Kozlovic
66f5325cee Merge pull request #1018 from nats-io/gw_log_interest_switch
Added logging of account interest mode switch for gateways
2019-05-28 15:33:06 -06:00
Ivan Kozlovic
f5991e8a2b Merge pull request #1015 from nats-io/restore_conn_error_default_attempts_to_one
Update to connect/reconnect error reports logic
2019-05-28 14:57:29 -06:00
Ivan Kozlovic
2d4c3dd38f Added logging of account interest mode switch for gateways
Both sides will log when an account is switched to interest-only
mode. There are 2 traces (start/complete) per account.
They are logged at [INF] level.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-28 14:55:45 -06:00
Ivan Kozlovic
fed8ee0857 Merge pull request #1016 from nats-io/gatewayz
[ADDED] Gatewayz endpoint
2019-05-28 12:56:15 -06:00
Ivan Kozlovic
5478eaf01e Added /gatewayz endpoint
Such endpoint will list the gateway/cluster name, address and port
then list of outbound/inbound connections.
For each remote gateway there will be at most one outbound connection.
There can be 0 or more inbound connections for the same remote
gateway.

For each of these outbound/inbound connection, the connection info
similar to Connz is reported. Optionally, one can include the
interest mode/stats for each account.

Here are possible options:

* No specific options

http://host:port/gatewayz

* Limit to specific remote gateway, say name "B":

http://host:port/gatewayz/gw_name=B

* Include accounts (default limit to 1024 accounts)

http://host:port/gatewayz/accs=1

* Specific limit, say 200 (note accs=1 in this case is optional)

http://host:port/gatewayz/accs=1&accs_limit=200

* Specific account, say "acc_1". Note that accs=1 is not required then

http://host:port/gatewayz/acc_name=acc_1

* Above options can be mixed: specific remote gateway (B), with 100
  accounts reported

http://host:port/gatewayz/gw_name=B&accs_limit=200

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-28 12:41:09 -06:00
Ivan Kozlovic
4ed08dde07 Merge pull request #1013 from nats-io/fix_gw_qinterest_loss
Fixed loss of queue subscription interest across Gateways in some cases
2019-05-26 18:23:06 -06:00
Ivan Kozlovic
a16a53441a Merge pull request #1014 from nats-io/fix_flappers
Fix flappers
2019-05-26 18:16:27 -06:00
Ivan Kozlovic
d2578f9e05 Update to connect/reconnect error reports logic
Changed the introduced new option and added a new one. The idea
is to be able to differentiate between never connected and reconnected
event. The never connected situation will be logged at first attempt
and every hour (by default, configurable).
However, once connected and if trying to reconnect, will report every
attempts by default, but this is configurable too.

These two options are supported for config reload.

Related to #1000
Related to #1001
Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-26 17:51:01 -06:00
Ivan Kozlovic
ce1e6defab Fix flappers
- TestSystemAccountConnectionUpdatesStopAfterNoLocal: I believe that
  the check on number of notifications was wrong. Since we did not
  consume the ones for the connect, the expected count after the
  disconnect is 8 instead of 4.

- Possible fix GW tests complaining about number of outbound/inbound
  I think that it may be possible that connection does not succeed
  right away (remote to fully started, etc) and due to dial timeout
  and reconnect attempt delay, I suspect that when given a max time
  of 1sec to complete, it may not be enough.
  Quick change for now is to override to 2secs for now in the
  wait helpers. If that proves conclusive, we could remove the
  timeout given to these helpers.

- TestGatewaySendAllSubsBadProtocol: used a t.Fatalf() in checkFor
  instead of return fmt.Errorf().

- TestLeafNodeResetsMSGProto: this test is not about change to
  interest mode only, so to avoid possible mix of protos, delay
  a bit creation of gateway after creation of leaf node.

- Some defer s.Shutdown() were missing

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-26 17:17:08 -06:00
Ivan Kozlovic
b325cf1e4a Fixed loss of queue subscription interest across Gateways in some cases
Suppose two servers, SA in cluster A and SB in cluster B. If SA
sends a message to SB on an account for which there is no interest
at all (account not known or no subscription), SB will send an A-
and keep track that it sent an A- for this account.

When a queue subscription is created on SB, SB will send and RS+
to A because A needs to have perfect knowledge of all queue subs
in all clusters.

If then a regular subscription is also created on SB, SB will
think that it needs to send an A+ because it had sent an A- for
this account. However, SA had an entry for this account for the
queue sub. The A+ would clear the entry in the map and would cause
SA to not send messages to SB even if they would have been a
match for the queue sub on SB.

We fix this in two ways:
- Clear the possible A- in SB when sending an RS+ for queue sub
- Processing of A-/A+ to be aware of a possible entry in the map
  due to queue subs.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-25 16:27:00 -06:00
Ivan Kozlovic
7e11b982aa Merge pull request #1012 from nats-io/update_route_gws_urls_in_varz
[ADDED] URLs to cluster{} in /varz and update of gateway ones
2019-05-24 14:43:37 -06:00
Ivan Kozlovic
a3996cbd29 Shorten help function name
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 14:21:56 -06:00
Ivan Kozlovic
55597a7e8b [ADDED] URLs to cluster{} in /varz and update of gateway ones
In varz's cluster{} section, there was no URLs field. This PR adds
it and displays the routes defined in the cluster{} config section.
The value gets updated should there be a config reload following
addition/removal of an url from "routes".

If config had 1 route to "nats://127.0.0.1:1234", here is what
it would look like now:
```
"cluster": {
    "addr": "0.0.0.0",
    "cluster_port": 6222,
    "auth_timeout": 1,
    "urls": [
      "127.0.0.1:1234"
    ]
  },
```
Adding route to "127.0.0.1:4567" and doing config reload:
```
"cluster": {
    "addr": "0.0.0.0",
    "cluster_port": 6222,
    "auth_timeout": 1,
    "urls": [
      "127.0.0.1:1234",
      "127.0.0.1:4567"
    ]
  },
```
Note that due to how we handle discovered servers in the cluster,
new urls dynamically discovered will not show in above output.
This could be done, but would need some changes in how we store
things (actually in this case, new urls are not stored, just
attempted to be connected. Once they connect, they would be visible
in /routez).

For gateways, however, this PR displays the combination of the
URLs defined in config and the ones that are discovered after
a connection is made to a give cluster. So say cluster A has a single
url to one server in cluster B, when connecting to that server,
the server on A will get the list of the gateway URLs that one
can connect to, and these will be reflected in /varz. So this is
a different behavior that for routes. As explained above, we could
harmonize the behavior in a future PR.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 13:42:41 -06:00
Ivan Kozlovic
795c1cf9cc Merge pull request #1011 from nats-io/fix_flappers
Fixed some flappers
2019-05-24 10:08:14 -06:00
Ivan Kozlovic
48c3f7f846 Fixed some flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 09:53:35 -06:00
Ivan Kozlovic
d779a0bd38 Merge pull request #1009 from nats-io/rc14_pre_release
Pre-Release 2.0.0-RC14
v2.0.0-RC14
2019-05-22 14:07:13 -06:00
Ivan Kozlovic
614240178c Pre-Release 2.0.0-RC14
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-22 12:58:47 -06:00
Ivan Kozlovic
68e3fb6344 Merge pull request #1008 from nats-io/fix_gw_panic
Check inbound GW connection connected state in parser
2019-05-22 12:40:23 -06:00
Ivan Kozlovic
97ee89cc67 Check inbound GW connection connected state in parser
If the first protocol for an inbound gateway connection is not
CONNECT, reject with auth violation.

Fixes #1006

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-22 12:31:16 -06:00
Ivan Kozlovic
698d9e642c Merge pull request #1007 from nats-io/readme-updates
Update go-nats to nats.go in readme
2019-05-22 07:11:51 -06:00
Waldemar Quevedo
f960378487 Update go-nats to nats.go in readme 2019-05-22 09:59:25 +02:00
Derek Collison
32b95a2f28 Merge pull request #1005 from nats-io/flappers
Fixes for flappers
2019-05-21 17:52:01 -07:00
Derek Collison
933f5d0df4 Add in TestGatewayServiceExportWithWildcards
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-21 15:28:42 -07:00
Derek Collison
67bb08af8b Fixes for a few flappers.
TestJWTAccountImportActivationExpires
TestGatewayServiceImportWithQueue

Signed-off-by: Derek Collison <derek@nats.io>
2019-05-21 15:12:31 -07:00
Derek Collison
c00226af3e Merge pull request #1004 from nats-io/race
Fix for reload race on global account
2019-05-21 12:13:32 -07:00
Derek Collison
ecfd1a2c85 Max flapper less so
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-21 11:59:34 -07:00
Derek Collison
8a614b49e1 Fix for reload race on global account
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-21 11:37:15 -07:00
Ivan Kozlovic
19ef4e47a3 Merge pull request #1003 from nats-io/randomize_gw_urls
Better randomize solicited Gateway URLs
2019-05-21 10:23:34 -06:00
Ivan Kozlovic
1cdc3eb41f Better randomize solicited Gateway URLs
Shuffle the array created when iterating through the gateways URLs
map since map iteration may not be well randomized with small maps.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-21 09:28:59 -06:00
Ivan Kozlovic
0144e27afd Merge pull request #1001 from nats-io/connect_error_reports
Make the error report attempts configurable
2019-05-21 08:35:13 -06:00
Ivan Kozlovic
7272e4e317 Make the error report attempts configurable
This is a continuation of #1000. Added a configuration to specify
the number of attempts at which the repeated error is reported.
The algo is now to print only the 1st attempt and when current
attempt % <this config param> == 0.

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 16:28:48 -06:00
Ivan Kozlovic
36fd47df5d Merge pull request #1000 from nats-io/reduce_connect_err_report_frequency
[UPDATED] Reduce report of failed connection attempts
2019-05-20 10:52:52 -06:00
Ivan Kozlovic
03930ba0e4 [UPDATED] Reduce report of failed connection attempts
This applies to routes, gateways and leaf node connections.
The failed attempts will be printed at the first, after the first
minute and then every hour.
The connect/error statements now include the attempt number.

Note that in debug mode, all attempts are traced, so you may get
double trace (one for debug, one for info/error).

Resolves #969

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-20 10:13:56 -06:00
Ivan Kozlovic
25cd64b891 Merge pull request #996 from nats-io/goget-mkpasswd
moving mkpasswd to its own directory to enable go get to install the tool
2019-05-16 15:56:29 -06:00
Alberto Ricart
164a9abcc2 Updated README.md 2019-05-16 16:34:44 -05:00
Alberto Ricart
9e2107c860 moving mkpasswd to its own directory to enable go get to install the tool. 2019-05-16 10:06:26 -05:00
Waldemar Quevedo
031276c22e Merge pull request #995 from nats-io/dockerfile-alpine-symlink
Add symlink to gnatsd on alpine image
2019-05-13 15:16:30 -05:00
Waldemar Quevedo
21d6875b6e Add symlink to gnatsd on alpine image
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2019-05-13 12:44:19 -07:00
Ivan Kozlovic
565efbbbc4 Merge pull request #994 from nats-io/fix_gw_test_race
Fixed gateway test race report
2019-05-13 12:32:34 -06:00
Waldemar Quevedo
edf0e39b62 Merge pull request #993 from nats-io/dockerfile-fixes
Fixes to alpine Dockerfile
2019-05-13 13:24:43 -05:00