Commit Graph

1753 Commits

Author SHA1 Message Date
Ivan Kozlovic
cd6d71deaa [ADDED] lame_duck_grace_period option
The grace period used to be hardcoded at 10 seconds.
This option allows the user to configure the amount of time the
server will wait before initiating the closing of client connections.

Note that the grace period needs to be strictly lower than the overall
lame_duck_duration. The server deducts the grace period from that
overall duration and spreads the closing of connections during
that time.
For instance, if there are 1000 connections and the lame duck
duration is set to 30 seconds and grace period to 10, then
the server will use 30-10 = 20 seconds to spread the closing
of those 1000 connections, so say roughly 50 clients per second.

Resolves #1459.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-08 11:43:25 -06:00
Derek Collison
d0f65c8a74 Don't leak service import subs on claim updates
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-05 13:28:40 -07:00
Ivan Kozlovic
82968f64d4 Merge pull request #1455 from nats-io/fix_ln_sub_interest_propagation
[FIXED] Unsubscribe may not be propagated through a leaf node
2020-06-05 11:11:26 -06:00
Ivan Kozlovic
25bd5ca352 [FIXED] Unsubscribe may not be propagated through a leaf node
There is a race between the processing of a subscription and the
init/send of subscriptions when accepting a leaf node connection.
This race may cause a subscription's subject to be counted several
times internally, which would then prevent the sending of an LS-
when the subscription's interest goes away.

Imagine this sequence of events, each side represents a "thread"
of execution:
```
client readLoop                         leaf node readLoop
----------------------------------------------------------
recv SUB foo 1
sub added to account's sublist

                                         recv CONNECT
                                     auth, added to acc.

updateSmap
smap["foo"]++ -> 1
no LS+ because !allSubsSent

                                         init smap
                                    finds sub in acc sl
                                    smap["foo"]++ -> 2
                                        sends LS+ foo
                                    allSubsSent == true

recv UNSUB 1
updateSmap
smap["foo"]-- -> 1
no LS- because count != 0
----------------------------------------------------------
```
Equivalent result but with a slightly different execution:
```
client readLoop                         leaf node readLoop
----------------------------------------------------------
recv SUB foo 1
sub added to account's sublist

                                         recv CONNECT
                                     auth, added to acc.

                                         init smap
                                    finds sub in acc sl
                                    smap["foo"]++ -> 1
                                        sends LS+ foo
                                    allSubsSent == true

updateSmap
smap["foo"]++ -> 2
no LS+ because count != 1

recv UNSUB 1
updateSmap
smap["foo"]-- -> 1
no LS- because count != 0
----------------------------------------------------------
```

The approach for the fix is to delay the creation of the smap
until we actually initialize the map and send the subs on processing
of the CONNECT.
In the meantime, as soon as the LN connection is registered
and available in updateSmap, we check whether smap is nil.
If it is nil, we do nothing.

In "init smap" we keep track of the subscriptions that have been
added to smap. This map will be short lived, just enough to
protect against races above.

In updateSmap, when smap is not nil, we need to check, if we
are adding, that the subscription has not already been handled.
The temporary subscription map will ultimately be emptied/set to
nil with the use of a timer (if not emptied in place when
processing smap updates).
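
The approach described above can be sketched roughly as follows. This is a simplified model written for illustration, not the server's code: the types `sub` and `leaf`, the field `tsub`, and the method signatures are assumptions, protocols are returned as strings instead of being written to a connection, and the timer that eventually clears the temporary map is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// sub and leaf are hypothetical, simplified stand-ins for the server's types.
type sub struct{ subject string }

type leaf struct {
	mu   sync.Mutex
	smap map[string]int    // nil until the CONNECT has been processed
	tsub map[*sub]struct{} // short-lived: subs already counted by initSmap
}

// initSmap creates smap, counts the subs found in the account's sublist and
// remembers them in tsub. It returns the LS+ protocols that would be sent.
func (l *leaf) initSmap(accSubs []*sub) []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.smap = make(map[string]int)
	l.tsub = make(map[*sub]struct{})
	var protos []string
	for _, s := range accSubs {
		l.smap[s.subject]++
		l.tsub[s] = struct{}{}
		if l.smap[s.subject] == 1 {
			protos = append(protos, "LS+ "+s.subject)
		}
	}
	return protos
}

// updateSmap applies delta for the sub and returns the protocol to send, or
// an empty string when nothing needs to be sent.
func (l *leaf) updateSmap(s *sub, delta int) string {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.smap == nil {
		return "" // smap not initialized yet: do nothing
	}
	if delta > 0 {
		if _, seen := l.tsub[s]; seen {
			delete(l.tsub, s) // already counted by initSmap
			return ""
		}
	}
	old := l.smap[s.subject]
	n := old + delta
	switch {
	case n <= 0:
		delete(l.smap, s.subject)
		if old > 0 {
			return "LS- " + s.subject
		}
	case old == 0:
		l.smap[s.subject] = n
		return "LS+ " + s.subject
	default:
		l.smap[s.subject] = n
	}
	return ""
}

func main() {
	// Second interleaving from the message: init smap runs before updateSmap.
	s := &sub{subject: "foo"}
	l := &leaf{}
	fmt.Println(l.initSmap([]*sub{s}))
	fmt.Printf("%q\n", l.updateSmap(s, 1))
	fmt.Printf("%q\n", l.updateSmap(s, -1))
}
```

In this model both interleavings shown earlier end with an `LS-` for `foo`: in the first, the early updateSmap call is a no-op because smap is still nil; in the second, the add is skipped because the subscription is found in the temporary map.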

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-05 10:07:15 -06:00
Derek Collison
19fcafa175 Merge pull request #1454 from nats-io/sys
Add default system account back to accounts after reload
2020-06-05 09:06:57 -07:00
Derek Collison
4ea9c12d23 Add default system account back to accounts after reload
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-05 08:59:04 -07:00
Derek Collison
b91c8879ad Consolidate subDetail
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-05 08:37:18 -07:00
Derek Collison
c1ffd48638 Add account details to subsz.
Also allow ability to filter based on account.

Signed-off-by: Derek Collison <derek@nats.io>
2020-06-05 05:53:01 -07:00
Derek Collison
c969e7e424 Do proper unsubscribe when shutting off restore endpoint
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-04 08:58:14 -07:00
Derek Collison
f7f40f16a5 Bumped version
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-04 07:15:14 -07:00
Derek Collison
f07533c823 Merge pull request #1448 from nats-io/restore
Snapshot restore now works across leafnodes.
2020-06-04 07:13:12 -07:00
Derek Collison
164f44ed18 Require reply subjects for restore chunks
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-04 06:56:07 -07:00
Derek Collison
012d517ba1 Better error handling and reporting for failures
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-04 06:17:50 -07:00
Derek Collison
660ea3c807 Snapshot restore now works across leafnodes.
This also introduces the ability to have flow control inbound for restoring a stream.
If the system detects a reply subject it will respond with a nil payload.
For the last EOF message if a reply is present it will respond with a stream info response or error.

Signed-off-by: Derek Collison <derek@nats.io>
2020-06-03 20:00:59 -07:00
Ivan Kozlovic
5fe60d3084 Removed double connection close trace
In master, this is what happens when a connection is closed
and the server runs with `-D`:
```
[95023] 2020/06/03 14:55:28.395532 [DBG] 127.0.0.1:54077 - cid:2 - Client connection created
[95023] 2020/06/03 14:55:29.164118 [DBG] 127.0.0.1:54077 - cid:2 - Client connection closed: Client Closed
[95023] 2020/06/03 14:55:29.164191 [DBG] 127.0.0.1:54077 - cid:2 - Client connection closed
```
Notice the trace of connection closed with the reason, and then
the bare connection closed statement.

This double trace was introduced by mistake during the JS branch
work (dd116fcfd4 (diff-853eb184ac73cf9597d7833f6b89e9c9R3547))

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-03 15:00:34 -06:00
Ivan Kozlovic
30bc47b87a Merge pull request #1446 from nats-io/ldm_websocket
LameDuckMode takes into account websocket accept loop
2020-06-02 18:30:03 -06:00
Ivan Kozlovic
98ea70a590 LameDuckMode takes into account websocket accept loop
This is related to #1408.
Make sure that we close the websocket "accept loop" if configured
before proceeding with the lame duck mode.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-02 17:49:38 -06:00
Matthias Hanel
0a3e89c64a Incorporating comments
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-06-02 18:38:17 -04:00
Matthias Hanel
cf6fcda75c Added default_permissions to accounts and account jwt
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-06-02 16:06:01 -04:00
Phil Pennock
939bc01423 Bump version for server to beta.13 2020-06-02 15:21:34 -04:00
Matthias Hanel
d5180025f5 Fix flapper by making the channel buffered
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-06-02 13:31:41 -04:00
Ivan Kozlovic
1e149f4041 Merge pull request #1440 from nats-io/jwt2
Update imports for jwt/v2
2020-06-02 11:10:21 -06:00
Matthias Hanel
2d61507bb7 Moving nats.go unit test and updating go modules
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-06-02 12:44:00 -04:00
Derek Collison
afc7fc367b Remove hdrs for now, find better way to deliver in client
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-02 07:10:23 -07:00
Derek Collison
8e9462dea4 Make arg order same for Snapshot
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-02 06:24:46 -07:00
Derek Collison
0d2ca9ba54 Fix race, can't clear direct memory since shared
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-02 06:19:04 -07:00
R.I.Pienaar
920dd4269a fix argument order in snapshots
Signed-off-by: R.I.Pienaar <rip@devco.net>
2020-06-02 13:51:50 +02:00
R.I.Pienaar
c57b86128d publish audit advisories to the correct subject
Signed-off-by: R.I.Pienaar <rip@devco.net>
2020-06-02 12:40:03 +02:00
R.I.Pienaar
3fc5c9284a send stream advisories using a helper 2020-06-02 08:48:11 +02:00
Derek Collison
b5dfb984e9 Fixes for race detections under GHA
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-01 18:34:18 -07:00
Derek Collison
07ef71ff98 Avoid parsing large sizes for messages
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-01 16:54:41 -07:00
Matthias Hanel
547afa47d6 Pulling in updated jwtv2 and using server version stored in operator
Signed-off-by: Matthias Hanel <mh@synadia.com>
2020-06-01 18:08:50 -04:00
aricart
e7590f3065 jwt2 testbed 2020-06-01 18:00:13 -04:00
Derek Collison
05e38ae527 Merge branch 'master' into sys-acc 2020-06-01 11:53:14 -07:00
Derek Collison
5f130128ce Read INFO used bufio.Reader
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-01 11:46:18 -07:00
Derek Collison
5271be49ee Fix race for snapshots
Signed-off-by: Derek Collison <derek@nats.io>
2020-06-01 09:21:22 -07:00
Ivan Kozlovic
c27f717ef6 [FIXED] Log file size limit not honored after re-open signal
When the logfile_size_limit option is specified, the logfile will
be automatically rotated. However, if the user still sends the re-open
signal (SIGUSR1), the size limit is no longer applied to the
log file.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-06-01 09:17:32 -06:00
Derek Collison
4d62a7237d Allow redelivery for AckAll policy, general upgrades to pending behaviors. Fixes #1436
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-31 07:50:50 -07:00
Derek Collison
e584d4efee Merge pull request #1435 from nats-io/js-hdrs
First pass header support for JetStream
2020-05-31 06:01:01 -07:00
Derek Collison
dbde2479c2 Add in headers to consumer delivered messages
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-30 15:03:54 -07:00
Derek Collison
9156fa6c03 Merge pull request #1432 from nats-io/snapshot-check
Allow snapshots to optionally check all message checksums
2020-05-30 12:53:40 -07:00
Derek Collison
8e407f8db4 Do snapshot setup in go routine as well for checkMsgs
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-30 11:28:43 -07:00
Derek Collison
eca04c6fce First pass header support for JetStream
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-30 10:04:23 -07:00
Derek Collison
2bd7553c71 System Account on by default.
Most of the changes are to turn it off for tests that were watching subscriptions and such.

Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 17:56:45 -07:00
Ivan Kozlovic
44e78a1fb6 Fixed some tests
- A race test may have consumed a lot of fds going in TIME_WAIT
  that could cause some issues for other tests
- Missing defer filestore.Stop() that would leave flushLoop()
  routines
- A defer for the from server in a LeafNode test
- Rework [Re]ConnectErrorReports that was failing often for me
  locally (probably due to exhaustion of fds - too many TIME_WAIT).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-05-29 17:47:08 -06:00
Derek Collison
e12907ffa6 Allow snapshots to optionally check all message checksums
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 09:57:33 -07:00
Derek Collison
0a206b4c64 Snapshot performance tweaks
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-29 08:07:31 -07:00
Derek Collison
10e49ca1c4 Fix more flappers
Signed-off-by: Derek Collison <derek@nats.io>
2020-05-28 14:19:11 -07:00
Derek Collison
f6a9d3bc3c Merge pull request #1429 from kingkorf/master
First check bcrypt '$' prefix before performing regex on password
2020-05-28 14:18:03 -07:00
Jacob
c1848a997c First check $ prefix 2020-05-28 22:54:20 +02:00
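
The idea behind the last two entries, a cheap prefix test ahead of a regular expression match, can be sketched as below. This is a hedged illustration only: the function name `looksLikeBcrypt` and the pattern are assumptions chosen for the example and are not taken from the server's source.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// bcryptPattern is an illustrative pattern for bcrypt-style hashes
// (for example "$2a$11$..."); the pattern the server uses may differ.
var bcryptPattern = regexp.MustCompile(`^\$2[abxy]\$\d{2}\$`)

// looksLikeBcrypt is a hypothetical helper. It checks the '$' prefix first so
// that plain-text passwords never reach the more expensive regex match.
func looksLikeBcrypt(password string) bool {
	if !strings.HasPrefix(password, "$") {
		return false
	}
	return bcryptPattern.MatchString(password)
}

func main() {
	fmt.Println(looksLikeBcrypt("s3cr3t"))
	fmt.Println(looksLikeBcrypt("$2a$11$abcdefghijklmnopqrstuv"))
}
```

The prefix test does not change the result of the check; it only avoids running the regular expression on inputs that cannot match.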