Commit Graph

79 Commits

Author SHA1 Message Date
Ivan Kozlovic
55597a7e8b [ADDED] URLs to cluster{} in /varz and update of gateway ones
In varz's cluster{} section, there was no URLs field. This PR adds
it and displays the routes defined in the cluster{} config section.
The value gets updated should there be a config reload following
addition/removal of an url from "routes".

If config had 1 route to "nats://127.0.0.1:1234", here is what
it would look like now:
```
"cluster": {
    "addr": "0.0.0.0",
    "cluster_port": 6222,
    "auth_timeout": 1,
    "urls": [
      "127.0.0.1:1234"
    ]
  },
```
Adding route to "127.0.0.1:4567" and doing config reload:
```
"cluster": {
    "addr": "0.0.0.0",
    "cluster_port": 6222,
    "auth_timeout": 1,
    "urls": [
      "127.0.0.1:1234",
      "127.0.0.1:4567"
    ]
  },
```
Note that due to how we handle discovered servers in the cluster,
new urls dynamically discovered will not show in above output.
This could be done, but would need some changes in how we store
things (actually in this case, new urls are not stored, just
attempted to be connected. Once they connect, they would be visible
in /routez).

For gateways, however, this PR displays the combination of the
URLs defined in config and the ones that are discovered after
a connection is made to a give cluster. So say cluster A has a single
url to one server in cluster B, when connecting to that server,
the server on A will get the list of the gateway URLs that one
can connect to, and these will be reflected in /varz. So this is
a different behavior that for routes. As explained above, we could
harmonize the behavior in a future PR.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-24 13:42:41 -06:00
Ivan Kozlovic
c014211318 [FIXED] Changes to Varz content and fixed race conditions
----------------------------------------------------------------
Backward-incompatibility note:

Varz used to embed *Info and *Options which are other server objects.
However, Info is a struct that servers used to send protocols to other
servers or clients and its content must contain json tags since we
need to marshal those to be sent over. The problem is that it made
those fields now accessible to users calling Varz() and also visible
to the http /varz output. Some fields in Info were introduced in the
2.0 branch that clashed with json tag in Options, which made cluster{}
for instance disappear in the /varz output - because a Cluster string
in Info has the same json tag, and Cluster in Info is empty in some
cases.
For users that embed NATS and were using Server.Varz() directly,
without the use of the monitoring endpoint, they were then given
access (which was not the intent) to server internals (Info and Options).
Fields that were in Info or Options or directly in Varz that did not
clash with each other could be referenced directly, for instace, this
is you could access the server ID:

v, _ := s.Varz(nil)
fmt.Println(v.ID)

Another way would be:

fmt.Println(v.Info.ID)

Same goes for fields that were brought from embedding the Options:

fmt.Println(v.MaxConn)

or

fmt.Println(v.Options.MaxConn)

We have decided to explicitly define fields in Varz, which means
that if you previously accessed fields through v.Info or v.Options,
you will have to update your code to use the corresponding field
directly: v.ID or v.MaxConn for instance.

So fields were also duplicated between Info/Options and Varz itself
so depending on which one your application was accessing, you may
have to update your code.
---------------------------------------------------------------

Other issues that have been fixed is races that were introduced
by the fact that the creation of a Varz object (pointing to
some server data) was done under server lock, but marshaling not
being done under that lock caused races.

The fact that object returned to user through Server.Varz() also
had references to server internal objects had to be fixed by
returning deep copy of those internal objects.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-09 14:33:04 -06:00
Ivan Kozlovic
5e01570ad4 Fixed failed configuration reload due to present of leafnode with TLS
We don't support reload of leafnode config yet, but we need to make
sure it does not fail the reload process if nothing has been changed.
(it would fail because TLSConfig internally do change in some cases)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-05-02 15:49:56 -06:00
Derek Collison
26929d3e4b Fixed description
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-23 18:56:27 -07:00
Derek Collison
bfe83aff81 Make account lookup faster with sync.Map
Signed-off-by: Derek Collison <derek@nats.io>
2019-04-23 17:13:23 -07:00
Ivan Kozlovic
4dd1b26cc5 Add a warning if cluster's insecure setting is enabled
For cluster, we allow to skip hostname verification from certificate.
We now print a warning when this option is enabled, both on startup
or if the property is enabled on config reload.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-04-09 17:37:53 -06:00
Ivan Kozlovic
81eb065391 Ensure leafnode listen port set to -1 does not prevent config reload
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-25 15:04:52 -06:00
Ivan Kozlovic
65cc218cba [FIXED] Allow use of custom auth with config reload
Resolves #923

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-03-20 15:45:17 -06:00
Derek Collison
007c98dc03 Support reload of max_control_line by updating connected clients
Signed-off-by: Derek Collison <derek@nats.io>
2019-02-06 14:33:34 -08:00
Derek Collison
af78552549 Move ints to proper sizes for all
Signed-off-by: Derek Collison <derek@nats.io>
2019-02-05 15:19:59 -08:00
Ivan Kozlovic
d654b18476 Fixed reload of boolean flags
PR #874 caused an issue in case logtime was actually not configured
and not specified in the command line. A reload would then remove
logtime.

Revisited the fix for that and included other boolean flags, such
as debug, trace, etc..

Related to #874

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-01-14 19:18:00 -07:00
Ivan Kozlovic
7449e9ac53 Replace megacheck with staticcheck
Fixed issues reported by staticcheck

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-01-09 14:14:47 -07:00
Derek Collison
eb4a7156ca Hold Rlock on client remapping
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-05 16:21:20 -08:00
Derek Collison
f3f623565c fixes
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-05 16:00:30 -08:00
Derek Collison
2d54fc3ee7 Account lookup failures, account and client limits, options reload.
Changed account lookup and validation failures to be more understandable by users.
Changed limits to be -1 for unlimited to match jwt pkg.

The limits changed exposed problems with options holding real objects causing issues with reload tests under race mode.
Longer term this code should be reworked such that options only hold config data, not real structs, etc.

Signed-off-by: Derek Collison <derek@nats.io>
2018-12-05 14:25:40 -08:00
Ivan Kozlovic
4f8100ebc8 Fix config reload that failed because of Gateways
Although Gateways reload is not supported at the moment, I had
to add the trap in the switch statement because it would find
a difference. The reason is the TLSConfig object that is likely
to not pass the reflect.DeepEqual test. So for now, I exclude this
from the deep equal test and fail the reload only if the user
has explicitly changed the configuration.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-04 19:25:59 -07:00
Ivan Kozlovic
0ba587249a Fixing setting of default gateway TLS Timeout
Moved setting to the default value in setBaselineOptions()
so that config reload does not fail.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-12-03 18:20:15 -07:00
Derek Collison
744795ead5 Allow servers to send system events.
Specifically this is to support distributed tracking of number of account connections across clusters.
Gateways may not work yet based on attempts to only generate payloads when we know there is outside interest.

Signed-off-by: Derek Collison <derek@nats.io>
2018-12-01 13:54:25 -08:00
Ivan Kozlovic
10fd3ca0c6 Gateways [WIP]
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-27 19:00:03 -07:00
Derek Collison
e14acf9f4e Single server limits
Implemented single server account claim limits for subscriptions and active connections and message payload.

Signed-off-by: Derek Collison <derek@nats.io>
2018-11-25 15:37:53 -08:00
Ivan Kozlovic
49105b7294 Fix data race on config reload with routes and accounts
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-08 18:12:04 -07:00
Derek Collison
5077025801 Make assiging global account consistent
Signed-off-by: Derek Collison <derek@nats.io>
2018-11-07 09:52:29 -08:00
Derek Collison
b2ec5b3a98 Added more tests, e.g. reload
Signed-off-by: Derek Collison <derek@nats.io>
2018-11-06 19:58:42 -08:00
Derek Collison
1ce1a434b0 Fix for #792
Allow deny clauses for subscriptions to still allow wildcard subscriptions but do not deliver the messages themselves.

Signed-off-by: Derek Collison <derek@nats.io>
2018-11-06 15:00:21 -08:00
Derek Collison
47963303f8 First pass at new cluster design
Signed-off-by: Derek Collison <derek@nats.io>
2018-10-24 21:29:29 -07:00
Ivan Kozlovic
9a1cb08394 Updates based on comments
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-10-24 11:05:14 -06:00
Ivan Kozlovic
d35bb56d11 Added support for Accounts reload
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-10-23 14:58:53 -06:00
Derek Collison
14cdda8cd4 Updates from comments
Signed-off-by: Derek Collison <derek@nats.io>
2018-09-30 09:36:32 -07:00
Derek Collison
1cbfbfa071 Basic account support
Signed-off-by: Derek Collison <derek@nats.io>
2018-09-29 13:04:19 +02:00
Ivan Kozlovic
178aab096d Updates based on comments
- Removed un-needed lock/unlock
- Buffer SUBs/UNSUBs protocols and ensure flushing when buffer
  gets to a certain size (otherwise route would get disconnected
  with high number of subscriptions)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-09-24 16:06:34 -06:00
Ivan Kozlovic
178766d6c9 [ADDED] Support for route permissions config reload
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-09-18 18:28:40 -06:00
Ivan Kozlovic
b1bb181f3d Ensure URLs are compared using reflect.DeepEqual
I don't think it is a good thing to compare the pointers and we
should use the DeepEqual instead.
When comparing a solicited route's URL to the URL that was created
during the parsing of the configuration, the pointers maybe the
same and so u1 == u2 would work. However, there are cases where
the URL is built on the fly based on the received route INFO protocol
so I think it is safer to use a function that does a reflect.DeepEqual
instead.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-08-16 09:48:32 -06:00
Ivan Kozlovic
d98d51c8cc [FIXED] Possible cluster Authorization Error during config reload
When changing something in the cluster, such as Timeout and doing
a config reload, the route could be closed with an `Authorization
Error` report. Moreover, the route would not try to reconnect,
even if specified as an explicit route.

There were 2 issues:
- When checking if a solicited route is still valid, we need to
  check the Routes' URL against the URL that we try to connect
  to but not compare the pointers, but either do a reflect
  deep equal, or compare their String representation (this is
  what I do in the PR).
- We should check route authorization only if this is an accepted
  route, not an explicit one. The reason is that we a server
  explicitly connect to another server, it does not get the remote
  server's username and password. So the check would always fail.

Note: It is possible that a config reload even without any change
in the cluster triggers the code checking if routes are properly
authorized, and that happens if there is TLS specified. When
the reload code checks if config has changed, the TLSConfig
between the old and new seem to indicate a change, eventhough there
is apparently none. Another reload does not detect a change. I
suspect some internal state in TLSConfig that causes the
reflect.DeepEqual() to report a difference.

Note2: This commit also contains fixes to regex that staticcheck
would otherwise complain about (they did not have any special
character), and I have removed printing the usage on startup when
getting an error. The usage is still correctly printed if passing
a parameter that is unknown.

Resolves #719

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-08-15 18:20:29 -06:00
Alberto Ricart
456c09855e fmt 2018-07-02 15:50:03 -05:00
Derek Collison
ec8e2636de Track closed connections and reason for closing
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-25 17:56:07 -07:00
Alberto Ricart
c35607cd95 [ADD] internal option to write a ports file --ports_file_dir
The added option writes a file in the specified directory called <exename>_<pid>.ports which
contains a JSON representation of ports that the gnatsd has opened.

This change is intended to facilitate testing by having ports be specified with a -1, so
they are auto assigned and allow tests to locate and connect to the launched gnatsd(s).
2018-06-22 16:15:39 -05:00
Derek Collison
17fecd4c9b Support CID in client INFO, allow filtering /connz by CID
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-21 15:23:15 -07:00
Ivan Kozlovic
b0ebdbed7d Fixed typo 2018-04-09 09:14:21 -06:00
Ivan Kozlovic
40cf0107d6 Ensure sig handler routine returns on shutdown, turn it off in most tests
I noticed that when running the test suite, there would be a file
server/log1.txt left. This file is created by one of the config
reload test. Running this test individually was doing the proper
cleanup. I noticed that the Signal test that was checking
that files could be rotated was causing this side effect.
It turns out that none of the config reload tests were disabling
the signal handler (NoSigs=true), and since the go routine would
be left running, running the TestSignalToReOpenLogFile() test
would interact with an already finished test.

I put a thread dump in handleSignals() to track all tests that
were causing this function to start the go routine because NoSigs
was not set to true. I fixed all those tests. At this time, there
are only 2 tests that need to start the signal handler.

I have also fixed the code so that the signal handler routine select
on a server quitCh that is closed on shutdown so that this go routine
exit and is waiting on using the grWG wait group.
2018-04-06 17:14:02 -06:00
Ivan Kozlovic
fb972bd0fc Remove ssl_required references 2018-03-23 13:40:10 -06:00
Derek Collison
00901acc78 Update license to Apache 2 2018-03-15 22:31:07 -07:00
Ivan Kozlovic
1acf330e07 [ADDED] Notification to clients when servers leave the cluster
Until now, a server would only notify clients of servers that join
the cluster. More than that, a server would send ot its clients only
information if new servers were added.
This PR changes this by sending to clients that support async INFO
the list of URLs for all servers in the cluster any time that there
is a change (joining or leaving the cluster).
As of now, clients will not be affected by the change (and will not
take benefit of this: removing servers from their server pool). This
will be addressed in each supported client once this is merged.
2018-02-27 14:22:13 -07:00
Ivan Kozlovic
acf4a31e4b Major updates + support for config reload of client/cluster advertise 2018-02-05 20:15:36 -07:00
Ivan Kozlovic
2befd973cc Fixed DATA RACE and ensure route is not created/accepted on shutdown
- Created a setter for the closed flag.
- Check if route is closed under lock and set a boolean if so,
  so we don't check c.route outside of c's mutex.
- Ensure that we do not create a route on shutdown, which would
  leave a connection hanging (was seen in some config reload tests).
2017-07-19 10:42:18 -06:00
Tyler Treat
2ed9c64f66 Merge branch 'master' of github.com:nats-io/gnatsd into enable_config_reload 2017-06-28 14:42:11 -05:00
Tyler Treat
901a5c7122 Address CR feedback 2017-06-28 11:05:02 -05:00
Tyler Treat
032b0b4cd7 Address CR feedback 2017-06-27 16:53:18 -05:00
Tyler Treat
84d00a0395 Add comment about random map iteration 2017-06-27 16:09:14 -05:00
Tyler Treat
dd3ad77ea8 Replace reloaded varz field with config_load_time 2017-06-27 14:33:06 -05:00
Tyler Treat
9adfae11a2 Add reload count to server for monitoring 2017-06-23 10:03:01 -05:00