Fixes#1314 by:
There was a data race with a write during reloadAuthorization.
Locking was added to all places where it was missing.
In situations were it appeared feasible, access was moved into existing
lock/unlock.
Where it was added, the lock order was already established.
Signed-off-by: Matthias Hanel <mh@synadia.com>
Fixed#1296, by altering client state on reload
Detect a trace level change on reload and update all clients.
To avoid data races, read client.trace while holding the lock,
pass the value into functionis that trace while not holding the lock.
Delete unused client.debug.
Signed-off-by: Matthias Hanel <mh@synadia.com>
A new config section allows to specify specific TLS parameters for
the account resolver:
```
resolver_tls {
cert_file: ...
key_file: ...
ca_file: ...
}
```
Resolves#1271
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
On config reload, the URL account resolver was recreated and a
Fetch() with empty account was done. Move the empty fetch test
in NewServer() instead.
Added a test that shows that fetch is no longer invoked on reload
but server reports failure on startup.
Resolves#1229
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- All writes will now be done by the writeLoop, unless when the
writeLoop has not been started yet (likely in connection init).
- Slow consumers for non CLIENT connections will be reported but
not failed. The idea is that routes, gateway, etc.. connections
should stay connected as much as possible. However if a flush
operation times out and no data at all has been written, the
connection will be closed (regardless of type).
- Slow consumers due to max pending is only for CLIENT connections.
This allows sending of SUBs through routes, etc.. to not have
to be chunked.
- The backpressure to CLIENT connections is increased (up to 1sec)
based on the sub's connection pending bytes level.
- Connection is flushed on close from the writeLoop as to not block
the "fast path".
Some tests have been fixed and adapted since now closeConnection()
is not flushing/closing/removing connection in place.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Ensure that lookupAccount does not hold server lock during
updateAccount and fetchAccount.
Updating the account cannot have the server lock because it is
possible that during updateAccountClaims(), clients are being
removed, which would try to get the server lock (deep down in
closeConnection/s.removeClient).
Added a test that would have show the deadlock prior to changes
in this PR.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When tls is on routes it can cause reloadAuthorization to be called.
We were assuming configured accounts, but did not copy the remote map.
This copies the remote map when transferring for configured accounts
and also handles operator mode. In operator mode we leave the accounts
in place, and if we have a memory resolver we will remove accounts that
are not longer defined or have bad claims.
Signed-off-by: Derek Collison <derek@nats.io>
Changed the introduced new option and added a new one. The idea
is to be able to differentiate between never connected and reconnected
event. The never connected situation will be logged at first attempt
and every hour (by default, configurable).
However, once connected and if trying to reconnect, will report every
attempts by default, but this is configurable too.
These two options are supported for config reload.
Related to #1000
Related to #1001Resolves#969
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
In varz's cluster{} section, there was no URLs field. This PR adds
it and displays the routes defined in the cluster{} config section.
The value gets updated should there be a config reload following
addition/removal of an url from "routes".
If config had 1 route to "nats://127.0.0.1:1234", here is what
it would look like now:
```
"cluster": {
"addr": "0.0.0.0",
"cluster_port": 6222,
"auth_timeout": 1,
"urls": [
"127.0.0.1:1234"
]
},
```
Adding route to "127.0.0.1:4567" and doing config reload:
```
"cluster": {
"addr": "0.0.0.0",
"cluster_port": 6222,
"auth_timeout": 1,
"urls": [
"127.0.0.1:1234",
"127.0.0.1:4567"
]
},
```
Note that due to how we handle discovered servers in the cluster,
new urls dynamically discovered will not show in above output.
This could be done, but would need some changes in how we store
things (actually in this case, new urls are not stored, just
attempted to be connected. Once they connect, they would be visible
in /routez).
For gateways, however, this PR displays the combination of the
URLs defined in config and the ones that are discovered after
a connection is made to a give cluster. So say cluster A has a single
url to one server in cluster B, when connecting to that server,
the server on A will get the list of the gateway URLs that one
can connect to, and these will be reflected in /varz. So this is
a different behavior that for routes. As explained above, we could
harmonize the behavior in a future PR.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
----------------------------------------------------------------
Backward-incompatibility note:
Varz used to embed *Info and *Options which are other server objects.
However, Info is a struct that servers used to send protocols to other
servers or clients and its content must contain json tags since we
need to marshal those to be sent over. The problem is that it made
those fields now accessible to users calling Varz() and also visible
to the http /varz output. Some fields in Info were introduced in the
2.0 branch that clashed with json tag in Options, which made cluster{}
for instance disappear in the /varz output - because a Cluster string
in Info has the same json tag, and Cluster in Info is empty in some
cases.
For users that embed NATS and were using Server.Varz() directly,
without the use of the monitoring endpoint, they were then given
access (which was not the intent) to server internals (Info and Options).
Fields that were in Info or Options or directly in Varz that did not
clash with each other could be referenced directly, for instace, this
is you could access the server ID:
v, _ := s.Varz(nil)
fmt.Println(v.ID)
Another way would be:
fmt.Println(v.Info.ID)
Same goes for fields that were brought from embedding the Options:
fmt.Println(v.MaxConn)
or
fmt.Println(v.Options.MaxConn)
We have decided to explicitly define fields in Varz, which means
that if you previously accessed fields through v.Info or v.Options,
you will have to update your code to use the corresponding field
directly: v.ID or v.MaxConn for instance.
So fields were also duplicated between Info/Options and Varz itself
so depending on which one your application was accessing, you may
have to update your code.
---------------------------------------------------------------
Other issues that have been fixed is races that were introduced
by the fact that the creation of a Varz object (pointing to
some server data) was done under server lock, but marshaling not
being done under that lock caused races.
The fact that object returned to user through Server.Varz() also
had references to server internal objects had to be fixed by
returning deep copy of those internal objects.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
We don't support reload of leafnode config yet, but we need to make
sure it does not fail the reload process if nothing has been changed.
(it would fail because TLSConfig internally do change in some cases)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
For cluster, we allow to skip hostname verification from certificate.
We now print a warning when this option is enabled, both on startup
or if the property is enabled on config reload.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
PR #874 caused an issue in case logtime was actually not configured
and not specified in the command line. A reload would then remove
logtime.
Revisited the fix for that and included other boolean flags, such
as debug, trace, etc..
Related to #874
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Changed account lookup and validation failures to be more understandable by users.
Changed limits to be -1 for unlimited to match jwt pkg.
The limits changed exposed problems with options holding real objects causing issues with reload tests under race mode.
Longer term this code should be reworked such that options only hold config data, not real structs, etc.
Signed-off-by: Derek Collison <derek@nats.io>
Although Gateways reload is not supported at the moment, I had
to add the trap in the switch statement because it would find
a difference. The reason is the TLSConfig object that is likely
to not pass the reflect.DeepEqual test. So for now, I exclude this
from the deep equal test and fail the reload only if the user
has explicitly changed the configuration.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Specifically this is to support distributed tracking of number of account connections across clusters.
Gateways may not work yet based on attempts to only generate payloads when we know there is outside interest.
Signed-off-by: Derek Collison <derek@nats.io>
Implemented single server account claim limits for subscriptions and active connections and message payload.
Signed-off-by: Derek Collison <derek@nats.io>
Allow deny clauses for subscriptions to still allow wildcard subscriptions but do not deliver the messages themselves.
Signed-off-by: Derek Collison <derek@nats.io>
- Removed un-needed lock/unlock
- Buffer SUBs/UNSUBs protocols and ensure flushing when buffer
gets to a certain size (otherwise route would get disconnected
with high number of subscriptions)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
I don't think it is a good thing to compare the pointers and we
should use the DeepEqual instead.
When comparing a solicited route's URL to the URL that was created
during the parsing of the configuration, the pointers maybe the
same and so u1 == u2 would work. However, there are cases where
the URL is built on the fly based on the received route INFO protocol
so I think it is safer to use a function that does a reflect.DeepEqual
instead.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>