302 Commits

Author SHA1 Message Date
Ivan Kozlovic
eaf5de05e9 Fixed data race caused by moving some code inside startGoRoutine
startGoRoutine will execute the closed function as a go routine,
so passing copyBytes(msg) as the argument caused a race. The
copy needs to be done before startGoRoutine, as it was before
being changed in https://github.com/nats-io/nats-server/pull/2925

Here is the race observed:
```
==================
WARNING: DATA RACE
Write at 0x00c0001dd930 by goroutine 367:
  runtime.racewriterange()
      <autogenerated>:1 +0x29
  internal/poll.ignoringEINTRIO()
      /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/internal/poll/fd_unix.go:582 +0x454
  internal/poll.(*FD).Read()
      /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/internal/poll/fd_unix.go:163 +0x26
  net.(*netFD).Read()
      /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/net/fd_posix.go:56 +0x50
  net.(*conn).Read()
      /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/net/net.go:183 +0xb0
  net.(*TCPConn).Read()
      <autogenerated>:1 +0x64
  github.com/nats-io/nats-server/v2/server.(*client).readLoop()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1188 +0x8f7
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func1()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:904 +0x5d
Previous read at 0x00c0001dd930 by goroutine 93:
  runtime.slicecopy()
      /home/travis/.gimme/versions/go1.17.8.linux.amd64/src/runtime/slice.go:284 +0x0
  github.com/nats-io/nats-server/v2/server.copyBytes()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/util.go:282 +0x10b
  github.com/nats-io/nats-server/v2/server.(*Server).jsStreamListRequest.func1()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1613 +0x26
Goroutine 367 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3017 +0x86
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:904 +0x1b08
  github.com/nats-io/nats-server/v2/server.(*Server).startLeafNodeAcceptLoop.func1()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:604 +0x4b
  github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections.func1()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:2122 +0x58
Goroutine 93 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:3017 +0x86
  github.com/nats-io/nats-server/v2/server.(*Server).jsStreamListRequest()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1613 +0xbf1
  github.com/nats-io/nats-server/v2/server.(*Server).jsStreamListRequest-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:1554 +0xcc
  github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:680 +0xcf0
  github.com/nats-io/nats-server/v2/server.(*jetStream).apiDispatch-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_api.go:652 +0xcc
  github.com/nats-io/nats-server/v2/server.(*client).deliverMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3181 +0xbde
  github.com/nats-io/nats-server/v2/server.(*client).processMsgResults()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:4164 +0xe1e
  github.com/nats-io/nats-server/v2/server.(*client).processInboundLeafMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:2183 +0x7eb
  github.com/nats-io/nats-server/v2/server.(*client).processInboundMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:3498 +0xb1
  github.com/nats-io/nats-server/v2/server.(*client).parse()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/parser.go:497 +0x3886
  github.com/nats-io/nats-server/v2/server.(*client).readLoop()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/client.go:1228 +0x1669
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func1()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/leafnode.go:904 +0x5d
==================
    testing.go:1152: race detected during execution of test
```

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-25 12:51:52 -06:00
Derek Collison
ef8f543ea5 Improve memory usage through JetStream storage layer.
Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoimd copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.

The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.

Also fixed some flappers and a bug where de-dupe might not be reformed correctly.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-24 17:45:15 -06:00
Ivan Kozlovic
c3da392832 Changes to IPQueues
Removed the warnings, instead have a sync.Map where they are
registered/unregistered and can be inspected with an undocumented
monitor page.
Added the notion of "in progress" which is the number of messages
that have beend pop()'ed. When recycle() is invoked this count
goes down.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-17 17:53:06 -06:00
Derek Collison
e4ebc4648e When a stream or consumer was offline we would not properly respond to a delete.
We also would hang if no stream info requests were sent during a stream list due to the asset being offline.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-15 21:11:23 -07:00
Ivan Kozlovic
b4128693ed Ensure file path is correct during stream restore
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.

Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:31:51 -07:00
Matthias Hanel
9a2da9ed8c Adding denies $KV.>/$OBJ.> along leaf connections on differing domain (#2916)
* Adding denies $KV.>/$OBJ.> along leaf connections on differing domain

Signed-off-by: Matthias Hanel <mh@synadia.com>
2022-03-09 13:17:59 -05:00
Ivan Kozlovic
dfe96944d2 [FIXED] JetStream stream info consumers count in clustered mode
In clustering mode, the number of consumers in stream info may be
wrong in presence of non durable consumers. Ephemeral are handled
by specific nodes. The StreamInfo response would contain only the
consumer count that the stream leader is handling.

This fix overrides the stream's state consumers count with the
number of consumers from the stream assignment record.

Resolves #2895

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-03 09:46:35 -07:00
Derek Collison
5a93b0e9d8 Allow pull requests to specify a heartbeat when idle to detect when a request is invalidated.
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-11 09:51:51 -08:00
Derek Collison
da9046b2e6 Snapshot initial consumer info when needed.
Signed-off-by: Derek Collison <derek@nats.io>
2022-02-09 15:23:53 -08:00
Derek Collison
6a3cf0f71e Added in ability to get number of subjects from StreamInfo, and optionally details per subject on how many messages each subject has.
This can also be filtered, meaning you can filter out the subjects when asking for details.

Signed-off-by: Derek Collison <derek@nats.io>
2022-02-02 08:51:13 -08:00
Derek Collison
6be9925127 Update config error
Signed-off-by: Derek Collison <derek@nats.io>
2022-01-24 15:02:41 -08:00
Derek Collison
65b168aa8b Updates based on feedback. MaxDeliver needs to be set properly now to be > len(BackOff) but if larger we will reuse last value in BackOff array.
Signed-off-by: Derek Collison <derek@nats.io>
2022-01-24 15:02:39 -08:00
Ivan Kozlovic
84f6cbb760 Pooling pubMsg and jsPubMsg objects
This should help with GC pressure, however, it may have an effect
on performance (based on some benchmark). Calling sync.Pool.Get/Put
too often has a performance impact...

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-01-13 13:14:25 -07:00
Ivan Kozlovic
62a07adeb9 Replaced catchup and stream restore channels
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-01-13 13:09:49 -07:00
Ivan Kozlovic
23ebf9d2f8 Adapted jsOutQ
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-01-13 13:05:27 -07:00
Derek Collison
16f5c95785 Update atomics placements based on feedback
Signed-off-by: Derek Collison <derek@nats.io>
2022-01-07 09:50:19 -08:00
Derek Collison
52da55c8c6 Implement overflow placement for JetStream streams.
This allows stream placement to overflow to adjacent clusters.
We also do more balanced placement based on resources (store or mem). We can continue to expand this as well.
We also introduce an account requirement that stream configs contain a MaxBytes value.

We now track account limits and server limits more distinctly, and do not reserver server resources based on account limits themselves.

Signed-off-by: Derek Collison <derek@nats.io>
2022-01-06 19:33:08 -08:00
Matthias Hanel
3e8b66286d Js leaf deny (#2693)
Along a leaf node connection, unless the system account is shared AND the JetStream domain name is identical, the default JetStream traffic (without a domain set) will be denied.

As a consequence, all clients that wants to access a domain that is not the one in the server they are connected to, a domain name must be specified.
Affected from this change are setups where: a leaf node had no local JetStream OR the server the leaf node connected to had no local JetStream. 
One of the two accounts that are connected via a leaf node remote, must have no JetStream enabled.
The side that does not have JetStream enabled, will loose JetStream access and it's clients must set `nats.Domain` manually.

For workarounds on how to restore the old behavior, look at:
https://github.com/nats-io/nats-server/pull/2693#issuecomment-996212582

New config values added:
`default_js_domain` is a mapping from account to domain, settable when JetStream is not enabled in an account.
`extension_hint` are hints for non clustered server to start in clustered mode (and be usable to extend)
`js_domain` is a way to set the JetStream domain to use for mqtt.

Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-16 16:53:20 -05:00
Matthias Hanel
0ba2544c5a removed suffix from "missing" list
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-08 19:33:35 -05:00
Matthias Hanel
dd735f4a18 Adding missing entry to stream/consumer list
Signed-off-by: Matthias Hanel <mh@synadia.com>
2021-12-08 18:44:40 -05:00
Ivan Kozlovic
9f30bf00e0 [FIXED] Corrupted headers receiving from consumer with meta-only
When a consumer is configured with "meta-only" option, and the
stream was backed by a memory store, a memory corruption could
happen causing the application to receive corrupted headers.

Also replaced most of usage of `append(a[:0:0], a...)` to make
copies. This was based on this wiki:
https://github.com/go101/go101/wiki/How-to-efficiently-clone-a-slice%3F

But since Go 1.15, it is actually faster to call make+copy instead.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-12-01 10:50:15 -07:00
R.I.Pienaar
270ff87beb allow streams api to be filtered like list api
Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-11-18 13:59:12 +01:00
Derek Collison
6f7deaaed5 Only pass through to system account for account info api
Signed-off-by: Derek Collison <derek@nats.io>
2021-11-03 12:41:36 -07:00
Derek Collison
6df5f350c7 Allow system account to respond with jetstream not enabled.
Signed-off-by: Derek Collison <derek@nats.io>
2021-11-03 05:34:29 -07:00
Derek Collison
c78d700e90 Fix for #2658
Signed-off-by: Derek Collison <derek@nats.io>
2021-11-02 15:23:15 -07:00
Derek Collison
678469b40b Fix for #2644
Signed-off-by: Derek Collison <derek@nats.io>
2021-10-25 13:12:37 -07:00
Derek Collison
aff0f62106 In addition to sealed we add in other stream perms to control purge, msg deletes and rollups.
Signed-off-by: Derek Collison <derek@nats.io>
2021-10-05 17:43:24 -07:00
Derek Collison
1338a167e3 Merge pull request #2584 from nats-io/sealed-stream
Allow streams to be sealed through a stream update.
2021-09-29 16:02:58 -07:00
Derek Collison
5fc2cc5754 Allow streams to be sealed through a stream update.
Sealed streams can not accept new messages, allow you to delete or purge messages, or have messages expire due to age.
Sealed stream can not be unsealed through an update.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-29 15:25:38 -07:00
R.I.Pienaar
98ef6cb03c fixes the msg get audit event
We resused msg variable for both the request
and later the data loaded from the stream.

The data thus replaces the request and the audit
had the data as both request and response

Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-09-29 17:36:38 +02:00
Derek Collison
9918f16914 ConsumerInfo in clustered mode would return not found if assigned but not properly setup with a leader.
This change returns an info in those states now to better support client logic when binding consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-16 10:56:34 -07:00
Ivan Kozlovic
47359c453e Merge pull request #2533 from nats-io/js_cons_fc_hb
[CHANGED] JetStream: flow control requires heartbeats
2021-09-16 10:20:48 -06:00
Derek Collison
cfbc69b12c Allow clustered JetStream to allow duplicate stream creation like single server mode.
Resolves #2528

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-15 20:18:44 -07:00
Ivan Kozlovic
108d4060ff [CHANGED] JetStream: flow control requires heartbeats
The server will now reject the creation of a push consumer with
flow control if no heartbeat is set.

Resolves #2520

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-09-15 09:50:15 -06:00
Ivan Kozlovic
d6f9a06f05 Merge pull request #2530 from nats-io/mqtt_del_sess
MQTT: delete the session record even on restart with clean flag
2021-09-15 09:01:49 -06:00
Ivan Kozlovic
4cc2677573 MQTT: delete the session record even on restart with clean flag
When a session record exists and is currently not connected, if
the user reconnects with the same client ID but with now the
clean flag set, we are required to clear the state. In earlier
implementation (where we were using a stream per session), we
were not deleting the stream to be created right away. Since now
we just have a message per session, it is ok to delete that
message when clearing the existing session that is going to be
replaced.

Also made apiDispatch execute in place for any connection that
is not ROUTER or GATEWAY.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-09-14 16:33:48 -06:00
Derek Collison
ebc581012e Fixed a condition where JetStream assets could be created in multiple leafnodes.
Also added in optional Domain to StreamInfo.

Signed-off-by: Derek Collison <derek@nats.io>
2021-09-14 14:17:49 -07:00
Derek Collison
8615820a87 Avoid potential race
Signed-off-by: Derek Collison <derek@nats.io>
2021-09-01 14:00:31 -07:00
Derek Collison
57ef5fd528 Set consumer config defaults early on to avoid a race condition.
Signed-off-by: Derek Collison <derek@nats.io>
2021-09-01 13:06:08 -07:00
R.I.Pienaar
76ab1b8d17 attempt to improve UX of the error system
Previously we had a few confusing functions like NewT
and similar that were quite fragile to use due to minimal
validation and a panic in go stdlib string Replacer.

Now we generate helper methods for every string, these
are used to access errors, fill in templates and conditional
returns of error type using the new Unless() option

We now get compile time errors for some common mistakes
and have better IDE helpers for arguments etc

Signed-off-by: R.I.Pienaar <rip@devco.net>
2021-08-10 16:08:28 +02:00
Derek Collison
925a6fe6b2 Fix for #2388. Leafnodes with no JS can seamlessly access a HUB with JS.
This is the reverse of the early work to have LNs extend a non-JS cluster.
Also have mixed mode tests as well.

Signed-off-by: Derek Collison <derek@nats.io>
2021-08-01 14:57:47 -07:00
Derek Collison
9b0158daf9 Allow delivery policy of DeliverLastPerSubject, which is helpful for scoped watchers for K/V.
Signed-off-by: Derek Collison <derek@nats.io>
2021-07-28 12:49:02 -07:00
Derek Collison
98970ac03d Add descriptions to JetStream streams and consumers. 2021-07-27 14:05:25 -07:00
Derek Collison
f13fa767c2 Remove the swapping of accounts during processing of service imports.
When processing service imports we would swap out the accounts during processing.
With the addition of internal subscriptions and internal clients publishing in JetStream we had an issue with the wrong account being used.
This was specific to delyaed pull subscribers trying to unsubscribe due to max of 1 while other JetStream API calls were running concurrently.
2021-07-26 07:57:10 -07:00
Ivan Kozlovic
a7e0eeaf50 Removed debug.PrintStack() from apiDispatch [ci skip]
It was introduced by mistake in PR #2324 which I had reported
but failed to be actually removed before PR was merged:
https://github.com/nats-io/nats-server/pull/2324#pullrequestreview-695163635

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2021-07-13 16:49:44 -06:00
Derek Collison
894f26d149 Fix for #2352
Signed-off-by: Derek Collison <derek@nats.io>
2021-07-08 06:13:36 -07:00
Derek Collison
ad4685c84f Fix for crash in test run
Signed-off-by: Derek Collison <derek@nats.io>
2021-06-30 19:30:37 -07:00
Derek Collison
99fed910f0 Improvements to large numbers of JetStream R1 consumers per stream.
1. We were holding open FDs longer than we should for consumers causing issues with open FD limits. We now do not hold them open and cap updates a bit better.

2. When doing a stream delete, consumer delete was repeating alot of work that was not necessary, causing longer delays. This has been optimized a bit, still more improvements to be made.

3. We cover all JS under a single export, but that was also trapping GetNext for pull based consumers, and since this was a no-op (is handled at user account level) we were creating alot of garbage service import responses and reverse map entries that had to be garbage collected. We have a fix in to avoind this but still looking for a better one.

4. Still had some lingering references to all exports vs single JS export.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-29 05:45:55 -07:00
Derek Collison
c0e47966ab Added in Stream get last message by subject.
This is to aid in K/V overlay for simple Get ops vs creating a full blown consumer.

Signed-off-by: Derek Collison <derek@nats.io>
2021-06-24 13:21:39 -07:00
Derek Collison
9398c3ca28 Allow for more advanced purge operations that filter by subject, specify the sequence or number of messages to keep.
Signed-off-by: Derek Collison <derek@nats.io>
2021-06-19 07:04:44 -07:00