Commit Graph

598 Commits

Author SHA1 Message Date
Ivan Kozlovic
25bd5ca352 [FIXED] Unsubscribe may not be propagated through a leaf node
There is a race between the processing of a subscription
and the init/send of subscriptions when accepting a leaf node
connection. It may cause a subscription's subject to be counted
multiple times internally, which would then prevent the send of
an LS- when the subscription's interest goes away.

Imagine this sequence of events, where each side represents a
"thread" of execution:
```
client readLoop                         leaf node readLoop
----------------------------------------------------------
recv SUB foo 1
sub added to account's sublist

                                         recv CONNECT
                                     auth, added to acc.

updateSmap
smap["foo"]++ -> 1
no LS+ because !allSubsSent

                                         init smap
                                    finds sub in acc sl
                                    smap["foo"]++ -> 2
                                        sends LS+ foo
                                    allSubsSent == true

recv UNSUB 1
updateSmap
smap["foo"]-- -> 1
no LS- because count != 0
----------------------------------------------------------
```
Equivalent result but with slightly different execution:
```
client readLoop                         leaf node readLoop
----------------------------------------------------------
recv SUB foo 1
sub added to account's sublist

                                         recv CONNECT
                                     auth, added to acc.

                                         init smap
                                    finds sub in acc sl
                                    smap["foo"]++ -> 1
                                        sends LS+ foo
                                    allSubsSent == true

updateSmap
smap["foo"]++ -> 2
no LS+ because count != 1

recv UNSUB 1
updateSmap
smap["foo"]-- -> 1
no LS- because count != 0
----------------------------------------------------------
```

The approach for the fix is to delay the creation of the smap
until we actually initialize the map and send the subs on processing
of the CONNECT.
In the meantime, as soon as the LN connection is registered
and available in updateSmap, we check whether smap is nil.
If it is nil, we do nothing.

In "init smap" we keep track of the subscriptions that have been
added to smap. This map will be short lived, just enough to
protect against races above.

In updateSmap, when smap is not nil and we are adding, we need to
check that the subscription has not already been handled.
The temporary subscription map will ultimately be emptied/set to
nil with the use of a timer (if not emptied in place when
processing smap updates).
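
The bookkeeping described above can be sketched as follows. This is a minimal, single-goroutine illustration and not the server's code: the `leaf` and `sub` types, the `tsub` field, and the `initSmap`/`updateSmap` signatures are hypothetical names chosen for the example, and locking and the cleanup timer are left out.

```go
package main

import "fmt"

// sub is a hypothetical stand-in for a client subscription.
type sub struct{ subject string }

// leaf is a hypothetical stand-in for a leaf node connection.
type leaf struct {
	smap map[string]int    // subject -> interest count; nil until CONNECT is processed
	tsub map[*sub]struct{} // short-lived set of subs already counted by initSmap
	sent []string          // protocol lines "sent" to the remote, kept for inspection
}

// initSmap models the "init smap" step: it creates smap, counts every
// subscription found in the account, and remembers which ones it counted.
func (l *leaf) initSmap(accountSubs []*sub) {
	l.smap = make(map[string]int)
	l.tsub = make(map[*sub]struct{})
	for _, s := range accountSubs {
		l.smap[s.subject]++
		l.tsub[s] = struct{}{}
		if l.smap[s.subject] == 1 {
			l.sent = append(l.sent, "LS+ "+s.subject)
		}
	}
}

// updateSmap models a client readLoop applying a +1 or -1 change for s.
func (l *leaf) updateSmap(s *sub, delta int) {
	if l.smap == nil {
		// initSmap has not run yet; it will find this sub in the account.
		return
	}
	if delta > 0 {
		if _, seen := l.tsub[s]; seen {
			// Already counted by initSmap: drop the marker, do not count twice.
			delete(l.tsub, s)
			return
		}
	}
	n, ok := l.smap[s.subject]
	if !ok && delta < 0 {
		return // nothing was counted for this subject
	}
	n += delta
	if n <= 0 {
		delete(l.smap, s.subject)
		l.sent = append(l.sent, "LS- "+s.subject)
		return
	}
	l.smap[s.subject] = n
	if delta > 0 && n == 1 {
		l.sent = append(l.sent, "LS+ "+s.subject)
	}
}

// updateFirst replays the first interleaving: updateSmap runs before initSmap.
func updateFirst() []string {
	l, s := &leaf{}, &sub{subject: "foo"}
	l.updateSmap(s, 1)     // smap is still nil, nothing is counted
	l.initSmap([]*sub{s})  // counts foo once and sends LS+ foo
	l.updateSmap(s, -1)    // UNSUB: count drops to 0 and LS- foo is sent
	return l.sent
}

// initFirst replays the second interleaving: initSmap runs before updateSmap.
func initFirst() []string {
	l, s := &leaf{}, &sub{subject: "foo"}
	l.initSmap([]*sub{s})  // counts foo once and sends LS+ foo
	l.updateSmap(s, 1)     // sub already handled by initSmap, skipped
	l.updateSmap(s, -1)    // UNSUB: count drops to 0 and LS- foo is sent
	return l.sent
}

func main() {
	fmt.Println(updateFirst())
	fmt.Println(initFirst())
}
```

In both interleavings the subject is counted exactly once, so the unsubscribe brings the count to zero and the LS- goes out.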

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-05 10:07:15 -06:00
Derek Collison
c969e7e424 Do proper unsubscribe when shutting off restore endpoint
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-04 08:58:14 -07:00
Derek Collison
164f44ed18 Require reply subjects for restore chunks
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-04 06:56:07 -07:00
Derek Collison
660ea3c807 Snapshot restore now works across leafnodes.
This also introduces the ability to have flow control inbound for restoring a stream.
If the system detects a reply subject it will respond with a nil payload.
For the last EOF message if a reply is present it will respond with a stream info response or error.

Signed-off-by: Derek Collison <derek@nats.io>
2020-06-03 20:00:59 -07:00
Ivan Kozlovic
1e149f4041 Merge pull request #1440 from nats-io/jwt2
Update imports for jwt/v2
2020-06-02 11:10:21 -06:00
Derek Collison
afc7fc367b Remove hdrs for now, find better way to deliver in client
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-02 07:10:23 -07:00
R.I.Pienaar
920dd4269a fix argument order in snapshots
Signed-off-by: R.I.Pienaar <rip@devco.net>
2020-06-02 13:51:50 +02:00
Derek Collison
07ef71ff98 Avoid parsing large sizes for messages
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-01 16:54:41 -07:00
aricart
e7590f3065 jwt2 testbed
2020-06-01 18:00:13 -04:00
Derek Collison
f8d6dd992b Fix JetStream benchmark test
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-01 13:30:34 -07:00
Derek Collison
05e38ae527 Merge branch 'master' into sys-acc
2020-06-01 11:53:14 -07:00
Derek Collison
f6ce833751 Fix flapper
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-01 11:37:06 -07:00
Derek Collison
4d62a7237d Allow redelivery for AckAll policy, general upgrades to pending behaviors. Fixes #1436
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-31 07:50:50 -07:00
Derek Collison
e584d4efee Merge pull request #1435 from nats-io/js-hdrs
First pass header support for JetStream
2020-05-31 06:01:01 -07:00
Derek Collison
dbde2479c2 Add in headers to consumer delivered messages
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-30 15:03:54 -07:00
Derek Collison
eca04c6fce First pass header support for JetStream
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-30 10:04:23 -07:00
Derek Collison
2bd7553c71 System Account on by default.
Most of the changes are to turn it off for tests that were watching subscriptions and such.

Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 17:56:45 -07:00
Derek Collison
e12907ffa6 Allow snapshots to optionally check all message checksums
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 09:57:33 -07:00
Derek Collison
0a206b4c64 Snapshot performance tweaks
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 08:07:31 -07:00
Derek Collison
10e49ca1c4 Fix more flappers
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-28 14:19:11 -07:00
Derek Collison
4ca05d9719 Fix gap test from flapping
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-28 13:43:15 -07:00
Derek Collison
625129f20a Fix flapper test where no messages to receive at end
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-28 13:23:00 -07:00
Ivan Kozlovic
b0e43b6aa9 Fix flappers
- TestResponsePermissions: ensure subscription for service is
registered by server before sending requests.
- TestReloadDoesNotWipeAccountsWithOperatorMode: wait for subject
propagation.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-28 13:41:02 -06:00
Ivan Kozlovic
5d949bf1ea Merge pull request #1424 from nats-io/fix_1421
[FIXED] Possible removal of interest on queue subs with leaf nodes
2020-05-28 11:16:28 -06:00
Ivan Kozlovic
e9805a3109 [FIXED] Possible removal of interest on queue subs with leaf nodes
Server was incorrectly processing a queue subscription removal
as both a plain sub and queue sub, which may have resulted in
drop of interest even when some queue subs remained.

Resolves #1421

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-28 10:21:51 -06:00
Derek Collison
d3ac95a5e6 Add in a terminate delivery for https://github.com/nats-io/jetstream/issues/189
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-28 08:32:34 -07:00
Derek Collison
18822ab866 Merge pull request #1422 from nats-io/snapshot
JetStream Snapshots
2020-05-28 06:59:15 -07:00
Derek Collison
bc0fedbaba Updates based on PR feedback
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-28 06:08:35 -07:00
Derek Collison
fa59cff105 Add in snapshot and restore JSApi
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-27 20:01:30 -07:00
Waldemar Quevedo
625dd18974 Add support for SPIFFE x.509 SVIDs for auth
This can be enabled by using `verify_and_map`.

```
tls {
 cert_file: "server.pem"
 key_file: "server.key"
 ca_file: "ca.pem"
 timeout: 5
 verify_and_map: true
}

authorization {
  users = [
    {
      user = "spiffe://localhost/my-nats-service/user-a"
    },
    {
      user = "spiffe://localhost/my-nats-service/user-b",
      permissions = { subscribe = { deny = ">" }}
    },
  ]
}
```
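
For illustration, an X.509 SVID carries its SPIFFE ID as a URI subject alternative name, so mapping a certificate to one of the users above amounts to reading that URI and looking it up. The sketch below uses only the Go standard library. The `spiffeID`, `lookupUser`, and `certWithURI` helpers and the `users` table are hypothetical and are not the server's implementation.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"net/url"
)

// users is a hypothetical table keyed by SPIFFE ID, mirroring the
// authorization block in the configuration above.
var users = map[string]string{
	"spiffe://localhost/my-nats-service/user-a": "default permissions",
	"spiffe://localhost/my-nats-service/user-b": "subscribe denied on >",
}

// spiffeID returns the first URI SAN with the "spiffe" scheme, if any.
func spiffeID(cert *x509.Certificate) (string, bool) {
	for _, u := range cert.URIs {
		if u != nil && u.Scheme == "spiffe" {
			return u.String(), true
		}
	}
	return "", false
}

// lookupUser maps a verified client certificate to an entry in users.
func lookupUser(cert *x509.Certificate) (string, bool) {
	id, ok := spiffeID(cert)
	if !ok {
		return "", false
	}
	perms, ok := users[id]
	return perms, ok
}

// certWithURI builds a bare certificate value carrying a single URI SAN.
// It stands in for a certificate obtained from a verified TLS handshake.
func certWithURI(raw string) (*x509.Certificate, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	return &x509.Certificate{URIs: []*url.URL{u}}, nil
}

func main() {
	cert, err := certWithURI("spiffe://localhost/my-nats-service/user-b")
	if err != nil {
		panic(err)
	}
	perms, ok := lookupUser(cert)
	fmt.Println(perms, ok)
}
```

In a real server the certificate would come from the TLS connection state after chain verification. The lookup only makes sense once the chain has been validated against the configured CA.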

Signed-off-by: Waldemar Quevedo <wally@synadia.com>
2020-05-27 13:10:42 -07:00
Derek Collison
8727315eb9 Updated snapshots, added restore, generic hashes
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-26 19:53:16 -07:00
Derek Collison
4c91b69c4f Merge pull request #1413 from nats-io/fix_flappers
Fix flappers
2020-05-26 09:01:04 -07:00
Derek Collison
710ef00383 Don't allow JetStream on system account. Warn when accounts configured but no JS
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 12:17:18 -07:00
Derek Collison
3caf6265d4 Properly recover ephemeral consumers after restart
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 11:06:55 -07:00
Derek Collison
54aa40b352 Wait a bit longer to get subs
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 09:30:36 -07:00
Derek Collison
e27f94eea2 Flush the sub interest
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 09:25:06 -07:00
Derek Collison
ceb7e723c9 Don't let bad rtt estimate fail tests
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 09:20:36 -07:00
Ivan Kozlovic
e5d6bf0c29 Wait for sub propagation on some NewRouteServiceImport
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-25 06:58:23 -07:00
Derek Collison
9dae2cd80f Fixed flapper, will fix bug in other PR
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 06:58:23 -07:00
Derek Collison
79ea95fe44 Fix flapper, wait for sub to propagate
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 06:58:23 -07:00
Derek Collison
b26d389d5e Use old request style, no pause between new sends
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 06:58:23 -07:00
Ivan Kozlovic
46b45b3148 Ensure route INFO is processed before starting queue test
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-25 06:58:23 -07:00
Ivan Kozlovic
7d575e3af9 Remove a test that offers no value but keeps failing
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-25 06:58:23 -07:00
Derek Collison
a693b677c6 Give a bit of room for slow proxy
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 06:58:23 -07:00
Derek Collison
413884d87f Update start time for readloop started, check RTT on flapper test
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 06:58:23 -07:00
Derek Collison
a695d7aeb7 Ignore if we do not have minimum measurements
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-25 06:58:23 -07:00
Ivan Kozlovic
e976e63099 Fixing some flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-25 06:58:23 -07:00
Ivan Kozlovic
5dba3cdd75 [FIXED] Race condition during implicit Gateway reconnection
Say server in cluster A accepts a connection from a server in
cluster B.
The gateway is implicit, in that A does not have a configured
remote gateway to B.
Then the server in B is shut down, which A detects, and A initiates
a single reconnect attempt (since it is implicit, and assuming
reconnect retries is not set).
While this happens, a new server in B is restarted and connects
to A. If that happens before the initial reconnect attempt
fails, A will register that new inbound and will not attempt to
solicit because it already has a remote entry for gateway B.
At this point, when the reconnect to the old server B fails,
the remote GW entry is removed, and A will not create an outbound
connection to the new B server.

We fix that by checking if there is a registered inbound when
we get to the point of removing the remote on a failed implicit
reconnect. If there is one, we try the reconnection.
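
A minimal sketch of that check, under stated assumptions: the `gateways` type, its `remotes` and `inbound` fields, and the `implicitReconnectFailed` method are hypothetical names for this example and do not reflect the server's actual data structures.

```go
package main

import "fmt"

// gateways is a hypothetical, simplified view of a server's gateway state.
type gateways struct {
	remotes map[string]struct{} // cluster name -> implicit remote entry exists
	inbound map[string]int      // cluster name -> registered inbound connections
}

func newGateways() *gateways {
	return &gateways{
		remotes: make(map[string]struct{}),
		inbound: make(map[string]int),
	}
}

// implicitReconnectFailed is called when the single reconnect attempt to an
// implicit gateway has failed. It reports whether another attempt should be
// made. The remote entry is removed only when no inbound connection from
// that cluster is registered.
func (g *gateways) implicitReconnectFailed(name string) bool {
	if g.inbound[name] > 0 {
		// A server from that cluster has connected to us in the meantime:
		// keep the remote entry and try to connect again.
		return true
	}
	delete(g.remotes, name)
	return false
}

func main() {
	g := newGateways()
	g.remotes["B"] = struct{}{}

	// A new server in B connected before the reconnect attempt failed.
	g.inbound["B"] = 1
	fmt.Println(g.implicitReconnectFailed("B"))

	// No inbound from B remains: the remote entry is dropped.
	g.inbound["B"] = 0
	fmt.Println(g.implicitReconnectFailed("B"))
}
```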

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-22 13:01:17 -06:00
Ivan Kozlovic
9715848a8e [ADDED] Websocket support
Websocket support can be enabled with a new websocket
configuration block:

```
websocket {
    # Specify a host and port to listen for websocket connections
    # listen: "host:port"

    # It can also be configured with individual parameters,
    # namely host and port.
    # host: "hostname"
    # port: 4443

    # This optionally specifies the host:port for websocket
    # connections to be advertised in the cluster
    # advertise: "host:port"

    # TLS configuration is required
    tls {
      cert_file: "/path/to/cert.pem"
      key_file: "/path/to/key.pem"
    }

    # If same_origin is true, then the Origin header of the
    # client request must match the request's Host.
    # same_origin: true

    # This list specifies the only accepted values for
    # the client's request Origin header. The scheme,
    # host and port must match. By convention, the
    # absence of port for an http:// scheme will be 80,
    # and for https:// will be 443.
    # allowed_origins [
    #    "http://www.example.com"
    #    "https://www.other-example.com"
    # ]

    # This enables support for compressed websocket frames
    # in the server. For compression to be used, both server
    # and client have to support it.
    # compression: true

    # This is the total time allowed for the server to
    # read the client request and write the response back
    # to the client. This includes the time needed for the
    # TLS handshake.
    # handshake_timeout: "2s"
}
```
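
The `allowed_origins` matching rule in the comments above (scheme, host and port must match, with port 80 implied for http:// and 443 for https://) can be sketched as below. This is an illustration using Go's standard `net/url` package; the `origin` type and the `parseOrigin` and `originAllowed` helpers are hypothetical and not taken from the server.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// origin is the (scheme, host, port) triple used for comparison.
type origin struct{ scheme, host, port string }

// parseOrigin splits an Origin value into its parts and fills in the
// conventional default port when none is given.
func parseOrigin(raw string) (origin, bool) {
	u, err := url.Parse(raw)
	if err != nil || u.Hostname() == "" {
		return origin{}, false
	}
	o := origin{
		scheme: strings.ToLower(u.Scheme),
		host:   strings.ToLower(u.Hostname()),
		port:   u.Port(),
	}
	if o.port == "" {
		switch o.scheme {
		case "http":
			o.port = "80"
		case "https":
			o.port = "443"
		}
	}
	return o, true
}

// originAllowed reports whether the request's Origin header matches one of
// the configured entries on scheme, host and port.
func originAllowed(header string, allowed []string) bool {
	got, ok := parseOrigin(header)
	if !ok {
		return false
	}
	for _, a := range allowed {
		if want, ok := parseOrigin(a); ok && want == got {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"http://www.example.com", "https://www.other-example.com"}
	for _, h := range []string{
		"http://www.example.com:80",
		"https://www.example.com",
		"http://www.example.com:8080",
	} {
		fmt.Println(h, originAllowed(h, allowed))
	}
}
```

Normalizing both sides before comparing means an entry written without a port still matches a header that spells out the default port, and the reverse.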

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-20 11:14:39 -06:00
Derek Collison
915e3cd74e Header support for Leafnodes
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-19 14:33:56 -07:00