100 Commits

Author SHA1 Message Date
Ivan Kozlovic
67498af2dc [ADDED] LeafNode: Support for s2 compression
This is similar to PR #4115 but for LeafNodes.
Compression mode can be set on both side (the accept and in remotes).
```
leafnodes {
   port: 7422
   compression: s2_best
   remotes [
       {
         url: "nats://host2:74222"
         compression: s2_better
       }
   ]
}
```
Possible modes are similar than for routes (described in PR #4115),
except that when not defined we default to `s2_auto`.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-05-15 17:42:39 -06:00
Ivan Kozlovic
d6fe9d4c2d [ADDED] Support for route S2 compression
The new field `compression` in the `cluster{}` block allows to
specify which compression mode to use between servers.

It can be simply specified as a boolean or a string for the
simple modes, or as an object for the "s2_auto" mode where
a list of RTT thresholds can be specified.

By default, if no compression field is specified, the server
will use the s2_auto mode with default RTT thresholds of
10ms, 50ms and 100ms for the "uncompressed", "fast", "better"
and "best" modes.

```
cluster {
..
  # Possible values are "disabled", "off", "enabled", "on",
  # "accept", "s2_fast", "s2_better", "s2_best" or "s2_auto"
  compression: s2_fast
}
```

To specify a different list of thresholds for the s2_auto,
here is how it would look like:
```
cluster {
..
  compression: {
    mode: s2_auto
    # This means that for RTT up to 5ms (included), then
    # the compression level will be "uncompressed", then
    # from 5ms+ to 15ms, the mode will switch to "s2_fast",
    # then from 15ms+ to 50ms, the level will switch to
    # "s2_better", and anything above 50ms will result
    # in the "s2_best" compression mode.
    rtt_thresholds: [5ms, 15ms, 50ms]
  }
}
```

Note that the "accept" mode means that a server will accept
compression from a remote and switch to that same compression
mode, but will otherwise not initiate compression. That is,
if 2 servers are configured with "accept", then compression
will actually be "off". If one of the server had say s2_fast
then they would both use this mode.

If a server has compression mode set (other than "off") but
connects to an older server, there will be no compression between
those 2 routes.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-27 17:59:25 -06:00
Ivan Kozlovic
105237cba8 [ADDED] Multiple routes and ability to have per-account routes
New configuration fields:
```
cluster {
   ...
   pool_size: 5
   accounts: ["A", "B"]
}
```

The configuration `pool_size` in the example above means that this
server will create 5 routes to a remote server, assuming that that
server has the same `pool_size` setting.

Accounts (which are not part of the `accounts[]` configuration)
are assigned a specific route in this pool, and this will be the
same route on all servers in the cluster.

Accounts that are defined in the `accounts` field will each have
a dedicated route connection. This will allow suppression of the
account name in some of the route protocols, reducing bytes transmitted
which may increase performance.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2023-04-03 09:32:25 -06:00
Derek Collison
ff79afef39 Merge branch 'main' into dev 2022-12-30 12:23:23 -08:00
Neil Twigg
14d0ba1c65 Fix some lint errors after move to golangci-lint 2022-12-30 20:00:08 +00:00
Derek Collison
3877ee2411 Merge branch 'main' into dev 2022-12-13 13:08:35 -08:00
Marco Primi
f8a030bc4a Use testing.TempDir() where possible
Refactor tests to use go built-in temporary directory utility for tests.

Also avoid binding to default port (which may be in use)
2022-12-12 13:18:44 -08:00
Derek Collison
baf9f42d9f Fix tests
Signed-off-by: Derek Collison <derek@nats.io>
2022-11-27 19:49:52 -08:00
Marco Primi
f1883561ee Use testing.TB interface instead of *T
Using interface allows reusing helper function in benchmarks
2022-08-31 14:52:45 -07:00
Ivan Kozlovic
3c9a7cc6e5 Move to Go 1.19, remote io/util, fix data race and a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 09:55:37 -06:00
Derek Collison
a0a2e32185 Remove dynamic account behaviors.
We used these in tests and for experimenting with sandboxed environments like the demo network.

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-04 13:32:18 -08:00
Phil Pennock
fc6df0fbbc Redact URLs before logging or returning in error (#2643)
* Redact URLs before logging or returning in error

This does not affect strings which failed to parse, and in such a scenario
there's a mix of "which evil" to accept; we can't sanely find what should be
redacted in those cases, so we leave them alone for debugging.

The JWT library returns some errors for Operator URLs, but it rejects URLs
which contain userinfo, so there can't be passwords in those and they're safe.

Fixes #2597

* Test the URL redaction auxiliary functions

* End-to-end tests for secrets in debug/trace

Create internal/testhelper and move DummyLogger there, so it can be used from
the test/ sub-dir too.

Let DummyLogger optionally accumulate all log messages, not just retain the
last-seen message.

Confirm no passwords logged by TestLeafNodeBasicAuthFailover.

Change TestNoPasswordsFromConnectTrace to check all trace messages, not just the
most recent.

Validate existing trace redaction in TestRouteToSelf.

* Test for password in solicited route reconnect debug
2021-10-27 12:44:59 -04:00
Matthias Hanel
5d1f36dd17 [Fixed] leaf node subscription permission negotiation.
On connect all subscription where sent by the soliciting leaf node.
If creds contains sub deny permissions, the leaf node would be
disconnected.
This waits for the permissions to be exchanged and checks permissions
before sending subscriptions.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-04-09 16:53:06 -04:00
Jaime Piña
d929ee1348 Check errors when removing test directories and files
Currently in tests, we have calls to os.Remove and os.RemoveAll where we
don't check the returned error. This hides useful error messages when
tests fail to run, such as "too many open files".

This change checks for more filesystem related errors and calls t.Fatal
if there is an error.
2021-04-07 11:09:47 -07:00
Jaime Piña
e44275b963 Consolidate temporary test files and directories
Currently, temporary test files and directories are written in lots of
different paths within the OS's temp dir. This makes it hard to know
which files are from nats-server and which are unrelated. This in turn
makes it hard to clean up nats-server test files.
2021-04-06 10:42:55 -07:00
Derek Collison
6c805eebc7 Properly support leadnode clusters.
Leafnodes that formed clusters were partially supported. This adds proper support for origin cluster, subscription suppression and data message no echo for the origin cluster.

Signed-off-by: Derek Collison <derek@nats.io>
2020-06-26 09:03:22 -07:00
Derek Collison
dd61535e5a Cluster names are now required.
Added cluster names as required for prep work for clustered JetStream. System can dynamically pick a cluster name and settle on one even in large clusters.

Signed-off-by: Derek Collison <derek@nats.io>
2020-06-12 15:48:38 -07:00
Derek Collison
f5ceab339a Server support for headers between routes
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-19 14:33:06 -07:00
Derek Collison
dd116fcfd4 JetStream first pass basics.
This is the first checkin for JetStream. Has some rudimentary basics working.

TODO
1. Push vs pull mode for observables. (work queues)
2. Disk/File store, memory only for now.
3. clustering code - design shaping up well.
4. Finalize account import semantics.
5. Lots of other little things.

Signed-off-by: Derek Collison <derek@nats.io>
2020-05-19 14:06:29 -07:00
Matthias Hanel
a8e6af30a3 On client connect, send first ping after ping interval.
On connect message resend reset timer with setFirstPingTimer, so RTT can
be obtained quicker.

Disable short first ping in default server options for client_test.
In log_test prevent immediate scheduling by setting ping interval.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-03-02 20:10:15 -05:00
Derek Collison
ebd4deb8b9 Stager first ping from server and suppress pings if a ping was received.
Signed-off-by: Derek Collison <derek@nats.io>
2019-06-29 15:43:15 -07:00
Ivan Kozlovic
ed1901c792 Update go.mod to satisfy v2 requirements
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-06-03 19:45:47 -06:00
Derek Collison
acfe372d63 Changes for rename from gnatsd -> nats-server
Signed-off-by: Derek Collison <derek@nats.io>
2019-05-06 15:04:24 -07:00
Derek Collison
bacb73a403 First pass at leaf nodes. Basic functionality working, including gateways.
What is not completed:
1. TLS
2. config to bind local account.
3. Info updates for solicitor to track topology changes like a client.
4. CONNECT sent after INFO for nonce authroization.
5. Authorization
6. Services and Streams tests.
7. config file parsing.

Signed-off-by: Derek Collison <derek@nats.io>
2019-03-25 08:54:47 -07:00
Derek Collison
e9c1d26f0b quick enablement of logging and tracing
Signed-off-by: Derek Collison <derek@nats.io>
2018-12-09 18:07:56 -08:00
Derek Collison
f4f3d3baf1 Updates for operator based configurations.
Added update to parse and load operator JWTs.
Changed to add in signing keys from operator JWT to list of trusted keys.
Added URL account resolver.
Added account claim updates by system messages.

Signed-off-by: Derek Collison <derek@nats.io>
2018-12-02 20:34:33 -08:00
Ivan Kozlovic
10fd3ca0c6 Gateways [WIP]
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-11-27 19:00:03 -07:00
Derek Collison
a2e310ffc1 Vendor jwt, fixes for nkey, jwt changes
Signed-off-by: Derek Collison <derek@nats.io>
2018-11-21 19:22:04 -08:00
Derek Collison
47963303f8 First pass at new cluster design
Signed-off-by: Derek Collison <derek@nats.io>
2018-10-24 21:29:29 -07:00
Ivan Kozlovic
d98d51c8cc [FIXED] Possible cluster Authorization Error during config reload
When changing something in the cluster, such as Timeout and doing
a config reload, the route could be closed with an `Authorization
Error` report. Moreover, the route would not try to reconnect,
even if specified as an explicit route.

There were 2 issues:
- When checking if a solicited route is still valid, we need to
  check the Routes' URL against the URL that we try to connect
  to but not compare the pointers, but either do a reflect
  deep equal, or compare their String representation (this is
  what I do in the PR).
- We should check route authorization only if this is an accepted
  route, not an explicit one. The reason is that we a server
  explicitly connect to another server, it does not get the remote
  server's username and password. So the check would always fail.

Note: It is possible that a config reload even without any change
in the cluster triggers the code checking if routes are properly
authorized, and that happens if there is TLS specified. When
the reload code checks if config has changed, the TLSConfig
between the old and new seem to indicate a change, eventhough there
is apparently none. Another reload does not detect a change. I
suspect some internal state in TLSConfig that causes the
reflect.DeepEqual() to report a difference.

Note2: This commit also contains fixes to regex that staticcheck
would otherwise complain about (they did not have any special
character), and I have removed printing the usage on startup when
getting an error. The usage is still correctly printed if passing
a parameter that is unknown.

Resolves #719

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2018-08-15 18:20:29 -06:00
Derek Collison
3b953ce838 Allow localhost to not be defined, only need 127.0.0.1
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-28 16:10:19 -07:00
Derek Collison
26dafe464b Don't send route unsub with max
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Derek Collison
df574ce951 varz cluster empty when not defined
Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Derek Collison
50a99241ea Slow consumer updates and latency improvements.
Use pending bytes as slow consumer trigger, so reintroduce max_pending.
Improve latency with inplace flush calls when appropriate. Utilize simple
time budget for readLoop routine.

Signed-off-by: Derek Collison <derek@nats.io>
2018-06-04 17:45:05 -07:00
Ivan Kozlovic
fb972bd0fc Remove ssl_required references 2018-03-23 13:40:10 -06:00
Derek Collison
00901acc78 Update license to Apache 2 2018-03-15 22:31:07 -07:00
Ivan Kozlovic
72bbd2b982 Restore DefaultTestOptions port to 4222
This is used by RunDefaultServer() and some external projects tests
may rely on the fact that this runs on the default port.

Our tests that want to use ephemeral ports to avoid port conflicts
should be updated to not use these default options and/or RunDefaultServer().
2018-03-07 12:12:19 -07:00
Ivan Kozlovic
0ac50bda76 Try reduce port conflicts in tests 2018-03-06 08:49:45 -07:00
Ivan Kozlovic
20926a6176 Added megacheck
This tool combines staticcheck, gosimple and unused.
Fixed reports from unused.
2017-08-11 17:28:18 -06:00
Derek Collison
76de921f65 Cleanup for Auth 2017-04-20 12:41:48 -07:00
Colin Sullivan
31dc3a7e11 Increase the client dial timeout.
* Windows and stressed machines require more time to connect; this was causing flappers.
2017-01-06 15:35:41 -07:00
Ivan Kozlovic
5f471b6e7f Replace GetListenEndpoint() with ReadyForConnections()
The RunServer() function (and the various variants)
call Server.Start() in a go-routine, but do not return until
it has verified that the server is ready to accept connections.
To do so, it use GetListenEndpoint() to get a suitable connect
address (replacing "0.0.0.0" or "::" with localhost - important
on Windows). It then creates a raw TCP connection to ensure the
server is started, repeating the process in case of failure up
to 10 seconds.

This PR replaces this with a function that checks that client
listener, and route listener if configured, are set. This removes
the need to get a connect address and create test tcp connections.

The reason for this change is that NATS Streaming when starting
the NATS Server (unless configured to connect to a remote one)
calls RunServerWithAuth(), which when getting "localhost" from
GetListenEndpoint(), would fail trying to resolve it. This happened
for the NATS Streaming Docker image built with Go 1.7+.
2016-12-09 14:03:45 -07:00
Derek Collison
8fbacaaea1 Cleanup for cluster opts 2016-12-02 14:29:22 -08:00
Ivan Kozlovic
fda5bd7ac7 [ADDED] Server sends INFO with cluster URLs to clients with support
Clients that will be at the ClientProtoInfo protocol level (or above)
will now receive an asynchronous INFO protocol when the server
they connect to adds a *new* route. This means that when the cluster
adds a new server, all clients in the cluster should now be notified
of this new addition.
2016-07-26 10:55:55 -06:00
Derek Collison
e3b5713ab9 Check for control line violations and memory attacks 2016-07-11 12:03:49 -07:00
Derek Collison
46a9e6f0bc First pass at multi-user support 2016-05-13 12:27:57 -07:00
Ivan Kozlovic
ad1198db85 Add code coverage
-Test coverage was no longer triggered due to the check for BUILD_GOOS
 environment variable that was removed. Removed the check.
-Re-run test package with server code coverage.
-Remove unused functions in test.go.
-Add test for a function in test.go.
-Add missing parse +OK test.
2016-04-22 13:03:04 -06:00
Derek Collison
3e2c3714bc Fix race in interest propogation to new routes 2016-04-15 13:16:13 -07:00
Derek Collison
a4c46694ff gosimple fixes 2016-03-31 07:33:36 -07:00
Colin Sullivan
2baac47820 Address issues found by golint.
* No functional changes
* Did not address the ALL_CAPS issues
* Did not modify public APIs and field names.
2016-03-15 15:21:13 -06:00