Commit Graph

246 Commits

Author SHA1 Message Date
Derek Collison
35135948a0 Make sure llts update under lock, fss can be force expired so remove.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-17 14:54:35 -07:00
Derek Collison
d48ccf4c5a When filestore is used for raft layer do not attempt to track subject metadata.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-17 13:46:13 -07:00
Derek Collison
9508276b98 Make kek function based on review feedback
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-16 12:49:03 -07:00
Derek Collison
ef91d67708 Support auto-conversion
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-16 08:41:39 -07:00
Derek Collison
827b34a77a Add support for AES cipher encryption for filestore.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-15 14:21:37 -07:00
Derek Collison
d7534dff5f Make sure when SubjectState is called we have loaded fss state.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-12 07:14:39 -05:00
Derek Collison
6bc82bb4e6 Fix a data race
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-09 17:42:02 -05:00
Derek Collison
8c04adc009 Improvements to filestore for large KVs.
Use better indexing for lookups, we used to do simple linear scan backwards, now track first and last block.
Will expire the fss cache at will to reduce memory usage.

Signed-off-by: Derek Collison <derek@nats.io>
2022-08-09 15:51:13 -05:00
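The indexing change described in 8c04adc009 (track the first and last block instead of scanning backwards) might look roughly like the sketch below. The `span` and `subjectIndex` types are hypothetical names chosen for illustration; they are not the filestore's actual structures.

```go
package main

import "fmt"

// span records the first and last block index holding messages for a subject.
// Hypothetical type, for illustration only.
type span struct{ first, last int }

// subjectIndex maps a subject to the range of blocks that may contain it.
type subjectIndex map[string]span

// record notes that a message for subj was stored in block blk.
func (si subjectIndex) record(subj string, blk int) {
	s, ok := si[subj]
	if !ok {
		si[subj] = span{first: blk, last: blk}
		return
	}
	if blk < s.first {
		s.first = blk
	}
	if blk > s.last {
		s.last = blk
	}
	si[subj] = s
}

// lastBlock returns the block where a "last message for subject" lookup
// can start, avoiding a backwards scan over every block.
func (si subjectIndex) lastBlock(subj string) (int, bool) {
	s, ok := si[subj]
	return s.last, ok
}

func main() {
	si := subjectIndex{}
	si.record("kv.a", 0)
	si.record("kv.b", 3)
	si.record("kv.a", 7)
	blk, ok := si.lastBlock("kv.a")
	fmt.Println(blk, ok)
}
```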
Ivan Kozlovic
3c9a7cc6e5 Move to Go 1.19, remove io/ioutil, fix data race and a flapper
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-08-05 09:55:37 -06:00
Derek Collison
6450301cfc Improved speed of storing new messages when lots of messages are present in KV mode.
Signed-off-by: Derek Collison <derek@nats.io>
2022-08-01 14:02:19 -07:00
Derek Collison
717969510d Make sure to reset block encryption counter when clearing block but holding state for tracking sequences.
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-31 07:59:19 -07:00
Derek Collison
8dc1e4b6de When compact would reclaim head of block space, we needed to update block key for counter for new writes.
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-30 13:05:41 -07:00
Ivan Kozlovic
d0ee9a1252 Add comment that this is to silence the race detector.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-07-27 15:43:38 -06:00
Ivan Kozlovic
880e854637 Fixed data race between UpdateConfig() and subjString
A message block is checking the filestore's cfg.Subjects to see
if it can "intern" the subject or not. The problem is that this
is done under the message block's lock, but not the filestore.
However, during a stream configuration update, the filestore's
cfg field is switched to a new one, causing the datarace.

By making sure we do the switch under all message blocks' locks,
we remove the data race (that could be reproduced by running the
test TestJetStreamClusterMoveCancel with -count=10).

We investigated the use of a string interning library but it
showed a little performance degradation that this approach does
not suffer from.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-07-27 14:39:34 -06:00
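The locking approach from 880e854637 can be sketched as follows, assuming simplified stand-in types (`store`, `block`, `config` are hypothetical and much smaller than the real filestore types). A block reads the store's config while holding only its own lock, so the writer takes every block lock before swapping the pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// config, block and store are simplified, hypothetical stand-ins.
type config struct{ Subjects []string }

type block struct {
	mu sync.RWMutex
	fs *store
}

type store struct {
	mu   sync.RWMutex
	cfg  *config
	blks []*block
}

// numSubjects reads the store's cfg while holding only the block lock.
func (b *block) numSubjects() int {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return len(b.fs.cfg.Subjects)
}

// updateConfig swaps cfg while holding the store lock and every block lock,
// so no block can observe the pointer mid-switch.
func (s *store) updateConfig(c *config) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, b := range s.blks {
		b.mu.Lock()
	}
	s.cfg = c
	for _, b := range s.blks {
		b.mu.Unlock()
	}
}

func main() {
	s := &store{cfg: &config{Subjects: []string{"foo.*"}}}
	for i := 0; i < 4; i++ {
		s.blks = append(s.blks, &block{fs: s})
	}
	var wg sync.WaitGroup
	for _, b := range s.blks {
		wg.Add(1)
		go func(b *block) {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				_ = b.numSubjects()
			}
		}(b)
	}
	s.updateConfig(&config{Subjects: []string{"foo.*", "bar.*"}})
	wg.Wait()
	fmt.Println(s.blks[0].numSubjects())
}
```

Running this sketch with `go run -race` should report no race, since every read and the write of `cfg` share at least one block lock.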
Derek Collison
52f7765322 When msgs were expired on restart recovery we could lose track on subsequent restart of starting sequence with no additional activity.
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-23 17:15:16 -07:00
Derek Collison
3a10456e68 Short index write could lead to loss of stream sequence for empty stream
Signed-off-by: Derek Collison <derek@nats.io>
2022-07-22 06:37:19 -07:00
Derek Collison
148877b2f0 In the presence of many subjects in a stream and a wildcard filter subject, fall back to linear scan if too many.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-11 10:12:40 -07:00
Derek Collison
e1c8f9fb55 This improves when a server is under load or low on resources like FDs and a user is trying to delete a stream with lots of consumers.
Signed-off-by: Derek Collison <derek@nats.io>
2022-06-04 16:49:17 -07:00
Derek Collison
6cc14ff84d When stores and loads of the last message for a subject were concurrent and competing for the same msg, we could fail to retrieve a newly placed message.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-31 18:04:05 -07:00
Ivan Kozlovic
53e3c53d96 [FIXED] JetStream: consumer with deliver new may miss messages
This could happen when a consumer had not sent anything to the
attached NATS subscription and there was a consumer leader
step down or server restart.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-23 12:01:48 -06:00
Derek Collison
41cca8d6c4 Allow proper mix and match of consumer stores and stream stores.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-18 12:51:48 -07:00
Derek Collison
4291433a46 General improvements to accounting for the filestore. This in response to tracking issue #3114.
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-12 15:43:11 -07:00
Ivan Kozlovic
e304589da4 [FIXED] JetStream: Some data races
We were getting a data race checking the js.clustered field in
updateUsage() following fix for lock inversion in PR #3092.
```
=== RUN   TestJetStreamClusterKVMultipleConcurrentCreate
==================
WARNING: DATA RACE
Read at 0x00c0009db5d8 by goroutine 195:
  github.com/nats-io/nats-server/v2/server.(*jsAccount).updateUsage()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:1681 +0x8f
  github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2927 +0x1d9
  github.com/nats-io/nats-server/v2/server.(*stream).storeUpdates-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/stream.go:2905 +0x7d
  github.com/nats-io/nats-server/v2/server.(*fileStore).removeMsg()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2158 +0x14f7
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x18f
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/filestore.go:2770 +0x39
Previous write at 0x00c0009db5d8 by goroutine 128:
  github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:604 +0xfae
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
  github.com/nats-io/nats-server/v2/server.(*Server).Start()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
  github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4269()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
Goroutine 195 (running) created at:
  time.goFunc()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/time/sleep.go:180 +0x49
Goroutine 128 (finished) created at:
  github.com/nats-io/nats-server/v2/server.RunServer()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x278
  github.com/nats-io/nats-server/v2/server.RunServerWithConfig()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/server_test.go:112 +0x44
  github.com/nats-io/nats-server/v2/server.(*cluster).restartServer()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_helpers_test.go:1004 +0x1d5
  github.com/nats-io/nats-server/v2/server.TestJetStreamClusterKVMultipleConcurrentCreate()
      /home/travis/gopath/src/github.com/nats-io/nats-server/server/jetstream_cluster_test.go:8463 +0x64b
  testing.tRunner()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /home/travis/.gimme/versions/go1.17.9.linux.amd64/src/testing/testing.go:1306 +0x47
==================
```

Running that test with adding some delay in several places also showed another race:
```
==================
WARNING: DATA RACE
Read at 0x00c00016adb8 by goroutine 160:
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2777 +0x106
  github.com/nats-io/nats-server/v2/server.(*fileStore).expireMsgs-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:2771 +0x39

Previous write at 0x00c00016adb8 by goroutine 32:
  github.com/nats-io/nats-server/v2/server.(*fileStore).UpdateConfig()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/filestore.go:360 +0x1c8
  github.com/nats-io/nats-server/v2/server.(*stream).update()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/stream.go:1360 +0x852
  github.com/nats-io/nats-server/v2/server.(*jetStream).processClusterCreateStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2704 +0x4a4
  github.com/nats-io/nats-server/v2/server.(*jetStream).processStreamAssignment()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:2452 +0xad9
  github.com/nats-io/nats-server/v2/server.(*jetStream).applyMetaEntries()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:1407 +0x7e4
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:887 +0xc75
  github.com/nats-io/nats-server/v2/server.(*jetStream).monitorCluster-fm()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:813 +0x39

Goroutine 160 (running) created at:
  time.goFunc()
      /usr/local/go/src/time/sleep.go:180 +0x49

Goroutine 32 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:3013 +0x86
  github.com/nats-io/nats-server/v2/server.(*jetStream).setupMetaGroup()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:612 +0x1092
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStreamClustering()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream_cluster.go:514 +0x20a
  github.com/nats-io/nats-server/v2/server.(*Server).enableJetStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:400 +0x1168
  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/jetstream.go:206 +0x651
  github.com/nats-io/nats-server/v2/server.(*Server).Start()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1746 +0x1804
  github.com/nats-io/nats-server/v2/server.RunServer·dwrap·4275()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server_test.go:90 +0x39
==================
```

Both are now addressed, either with proper locking, or with the use of an atomic in the place
where we cannot get the lock (without re-introducing the lock inversion issue).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-11 19:09:24 -06:00
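The "atomic where we cannot get the lock" part of e304589da4 can be illustrated with a small sketch. The `engine` type and its methods are hypothetical; only the pattern (writer stores atomically, lock-free reader loads atomically) reflects the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// engine is a hypothetical stand-in for a type with a flag that some
// callers must read without taking mu (to avoid a lock inversion).
type engine struct {
	mu        sync.Mutex
	clustered int32 // accessed atomically; 1 means clustered
}

// setClustered runs during setup, under the lock, but still stores
// atomically so lock-free readers stay race free.
func (e *engine) setClustered() {
	e.mu.Lock()
	defer e.mu.Unlock()
	atomic.StoreInt32(&e.clustered, 1)
}

// isClustered can be called from paths that must not acquire mu.
func (e *engine) isClustered() bool {
	return atomic.LoadInt32(&e.clustered) == 1
}

func main() {
	e := &engine{}
	fmt.Println(e.isClustered())
	e.setClustered()
	fmt.Println(e.isClustered())
}
```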
Derek Collison
88ebfdaee8 Merge pull request #3109 from nats-io/issue-3107-3069
[FIXED] Downstream sourced retention policy streams during restart have redelivered messages
2022-05-09 09:13:48 -07:00
Derek Collison
b35988adf9 Remember the last timestamp by not removing last msgBlk when empty and during purge pull last timestamp forward until new messages arrive.
When a downstream stream uses retention modes that delete messages, fall back to time-based start time for the new source consumers.

Signed-off-by: Derek Collison <derek@nats.io>
2022-05-09 09:04:19 -07:00
Derek Collison
fbc9e16253 Fix for panic due to not loaded cache during compact
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-07 09:25:32 -07:00
Derek Collison
7246edc77d Bump up default block sizes
Signed-off-by: Derek Collison <derek@nats.io>
2022-05-04 09:46:15 -07:00
Ivan Kozlovic
5d90c8eac7 [IMPROVED] JetStream: check max-per-subject once
There was a case where we may have done a check for max-per-subject
limit twice per message. That would apply to streams that have
max-per-subject and also discard_new, which is what KV configures.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-05-03 16:57:26 -06:00
Derek Collison
0d928c0338 Merge pull request #3085 from nats-io/small-fss-improvement
Small improvement with fss processing
2022-04-28 13:56:53 -07:00
Ivan Kozlovic
d4d37e67f4 [FIXED] JetStream: file store compact and when to write index
When deciding to compact a file, we need to remove from the raw
bytes the empty records, otherwise, for small messages, we would
end-up calling compact() too many times.

When removing a message from the stream, in FIFO cases we would
write the index every 2 seconds at most when doing it in place,
whereas when dealing with out of order deletes, we would do it for
every single delete, which can be costly. We are now writing
only every 500ms for non FIFO cases.

Also fixed some unrelated code:
- Decision to install a snapshot was based on incorrect logical
expression
- In checkPending(), protect against the timer being nil which
could happen when the consumer is stopped or leadership changes.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-28 12:35:19 -06:00
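The index write throttling described in d4d37e67f4 (at most one write every 500ms for non-FIFO deletes) could be sketched like this. `indexWriter` and `countIndexWrites` are hypothetical helpers; a real implementation would also need a timer to flush an index left dirty, which this sketch omits.

```go
package main

import (
	"fmt"
	"time"
)

// indexWriter is a hypothetical throttle around an index write.
type indexWriter struct {
	minInterval time.Duration
	last        time.Time
	writes      int
	dirty       bool
}

// maybeWrite performs the (simulated) index write only if minInterval has
// elapsed since the previous write; otherwise it marks the index dirty.
func (w *indexWriter) maybeWrite(now time.Time) bool {
	if !w.last.IsZero() && now.Sub(w.last) < w.minInterval {
		w.dirty = true
		return false
	}
	w.last = now
	w.dirty = false
	w.writes++
	return true
}

// countIndexWrites simulates n deletes spaced `spacing` apart and reports
// how many index writes the throttle lets through.
func countIndexWrites(n int, spacing, minInterval time.Duration) int {
	w := &indexWriter{minInterval: minInterval}
	now := time.Unix(1000, 0) // arbitrary non-zero start time
	for i := 0; i < n; i++ {
		w.maybeWrite(now)
		now = now.Add(spacing)
	}
	return w.writes
}

func main() {
	// Ten out-of-order deletes, 100ms apart.
	fmt.Println("unthrottled:", countIndexWrites(10, 100*time.Millisecond, 0))
	fmt.Println("500ms throttle:", countIndexWrites(10, 100*time.Millisecond, 500*time.Millisecond))
}
```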
Derek Collison
9a96bef4c7 Small improvement with fss processing
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-28 10:23:30 -07:00
Derek Collison
f702e279ab Fix for a consumer recovery issue.
Also update healthz to check all assets that are assigned, not just running.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-26 19:22:19 -07:00
Ivan Kozlovic
50c3986863 [FIXED] JetStream stream catchup issues
- A stream could become leader when it should not, causing
messages to be lost.
- A catchup could stall because the server sending data
could bail out of the runCatchup routine but still send
the EOF signal.
- Deadlock with monitoring of Jsz

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-12 16:05:12 -06:00
Derek Collison
e7ff38a4ca Add consumerMemStore impl to allow proper replication of state.
Resolves #3006

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-10 08:01:13 -07:00
Derek Collison
ef9728997d During recovery check our guess on the last block.
Signed-off-by: Derek Collison <derek@nats.io>
2022-04-05 19:20:31 -07:00
Derek Collison
ab5e2344e0 When loading blocks in, use len(mb.fss) to determine if we can use sfilter optimization.
Also check fs.lmb when the stream config is updated.

Signed-off-by: Derek Collison <derek@nats.io>
2022-04-05 18:49:21 -07:00
Ivan Kozlovic
371ce36712 [IMPROVED] Stream with multiple subjects and consumer with filter
This is more of a regression introduced in v2.7.3 (with PR #2848).
When the store has a list of subjects, finding the next message
to deliver would go through the subjects map and have to match
to find out if it is a subset (if the filter had a wildcard).
In situations where there were lots of subjects (for instance 1
message per subject), but the consumer did not filter on anything
specific, then this processing was becoming slow.

We now check that if the stream has a single subject (even with
wildcard) and the consumer filters on that exact subject, then
we can do a linear scan. We also do a linear scan if the number
of messages in the block is 1/2 the number of subjects in the
subjects map.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-04-05 18:19:17 -06:00
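The scan decision described in 371ce36712 can be written as a small predicate. This is only a sketch: `useLinearScan` and its parameters are hypothetical, and "1/2 the number of subjects" is read here as "at most half", which is an assumption about the commit's wording.

```go
package main

import "fmt"

// useLinearScan reports whether a plain linear scan of the block is
// preferable to walking the per-subject map. Hypothetical helper.
func useLinearScan(streamSubjects []string, filter string, blockMsgs, blockSubjects int) bool {
	// Single stream subject (possibly a wildcard) and the consumer filters
	// on exactly that subject: every message matches, so just scan.
	if len(streamSubjects) == 1 && streamSubjects[0] == filter {
		return true
	}
	// Assumption: scan when the block holds at most half as many messages
	// as there are entries in the subjects map.
	return blockMsgs*2 <= blockSubjects
}

func main() {
	fmt.Println(useLinearScan([]string{"foo.*"}, "foo.*", 1000, 1000))
	fmt.Println(useLinearScan([]string{"foo.*", "bar"}, "foo.*", 1000, 1000))
	fmt.Println(useLinearScan([]string{"foo.*", "bar"}, "foo.*", 10, 100))
}
```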
Derek Collison
607858f213 Improved consumer snapshot logic in clustered mode and disk usage.
Also fixed a bug that could cause memory based replicated consumers to no longer work after snapshots and server restarts.

The snapshot logic would allow non-state changing updates to continuously grow the raft logs. We also were too conservative on when we snapshotted and why.
Also added in ability to have FileStore.Compact() reclaim space from the block file from the head of last changed block.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-29 18:02:49 -07:00
Derek Collison
780d4c0dd8 Merge pull request #2960 from nats-io/mem_pool
Additional improvements to memory pooling and management.
2022-03-28 17:10:16 -07:00
Derek Collison
bd0a0b28c7 When recycling blocks we could potentially place partials into a tier. This would possibly cause the load code to thrash since it would not be big enough for a full block and we would need to recycle again and make a new one.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-28 16:46:46 -07:00
Ivan Kozlovic
f82eda30aa Fix map init
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-28 17:46:01 -06:00
Ivan Kozlovic
909c6754cb Changed subjString to accept a byte slice
This may prevent memory copies when not necessary. Also fixed a bug
there that would check twice if there was only 1 subject and that
subject did not match (say configured subject is foo.* and key is
foo.bar).

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-28 17:37:28 -06:00
Derek Collison
5e5aab378e Additional improvements to memory pooling and management. Also logic fix for firstMatching that did unnecessary work when matching all.
During contention to the head write blk, the system could perform worse memory wise compared to simple go runtime.
Also had some references for the subject of messages bloating memory.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-28 10:15:23 -07:00
Derek Collison
04d4f08e8c Under heavy contention skip combined with remove could result in index being stamped with underflow for number of messages.
We had a report of a panic on server restart with 2.8.0-beta.1. The panic was trying to malloc the size of a load block based off of the number of messages we thought the block had from the index.
Before, SkipMsg would decrement and when we added the record via writeMsgRecord we would add it back in. However we did release the lock, meaning other things could run.
If in between the decrement, say to 0 (we did protect against underflow there), then a remove and subsequent writeIndexInfo would stamp an underflow.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-26 11:05:38 -07:00
Derek Collison
ef8f543ea5 Improve memory usage through JetStream storage layer.
Previously we would rely more heavily on Go's garbage collector since when we loaded a block for an underlying stream we would pass references upward to avoid copies.
Now we always copy when passing back to the upper layers which allows us to not only expire our cache blocks but pool and reuse them.

The upper layers also had changes made to allow the pooling layer at that level to interoperate with the storage layer optionally.

Also fixed some flappers and a bug where de-dupe might not be reformed correctly.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-24 17:45:15 -06:00
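The copy-then-pool idea from ef8f543ea5 can be sketched as below. `blockPool`, `cache`, `loadBlock` and the 1024-byte block size are all hypothetical; the point is only that returning a copy to callers lets the backing buffer be recycled safely.

```go
package main

import (
	"fmt"
	"sync"
)

const blockSize = 1024 // hypothetical fixed block size

// blockPool recycles block buffers instead of leaving them to the GC.
var blockPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, blockSize)
		return &b
	},
}

// cache is a hypothetical loaded block backed by a pooled buffer.
type cache struct {
	buf *[]byte
	n   int
}

// loadBlock copies data (at most blockSize bytes) into a pooled buffer.
func loadBlock(data []byte) *cache {
	bp := blockPool.Get().(*[]byte)
	n := copy(*bp, data)
	return &cache{buf: bp, n: n}
}

// msg returns a copy of bytes [start, end), never a reference into the
// pooled buffer, so callers are unaffected when the buffer is reused.
func (c *cache) msg(start, end int) []byte {
	if start < 0 || end > c.n || start > end {
		return nil
	}
	out := make([]byte, end-start)
	copy(out, (*c.buf)[start:end])
	return out
}

// expire returns the buffer to the pool for reuse.
func (c *cache) expire() {
	blockPool.Put(c.buf)
	c.buf, c.n = nil, 0
}

func main() {
	c := loadBlock([]byte("hello world"))
	m := c.msg(0, 5)
	c.expire()
	_ = loadBlock([]byte("XXXXXXXXXXX")) // may reuse the same buffer
	fmt.Println(string(m))               // m is a copy, so it is unchanged
}
```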
Derek Collison
dbfa47f9b1 Improve state preservation for consumers, specifically DeliverNew variants when no activity has been present.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-16 20:55:14 -07:00
Ivan Kozlovic
b4128693ed Ensure file path is correct during stream restore
Also had to change all references from `path.` to `filepath.` when
dealing with files, so that it works properly on Windows.

Fixed also lots of tests to defer the shutdown of the server
after the removal of the storage, and fixed some config files
directories to use the single quote `'` to surround the file path,
again to work on Windows.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2022-03-09 13:31:51 -07:00
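The `path.` to `filepath.` change in b4128693ed matters because `path` always joins with forward slashes, while `path/filepath` uses the operating system's separator. A minimal sketch, where `blockFile` and the directory layout are hypothetical:

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// blockFile builds an on-disk file name using the OS-specific separator.
// The "msgs" directory and "%d.blk" naming are illustrative assumptions.
func blockFile(storeDir string, index int) string {
	return filepath.Join(storeDir, "msgs", fmt.Sprintf("%d.blk", index))
}

func main() {
	// path.Join is meant for slash-separated paths such as URLs.
	fmt.Println(path.Join("data", "msgs", "1.blk"))
	// filepath.Join yields data\msgs\1.blk on Windows, data/msgs/1.blk elsewhere.
	fmt.Println(blockFile("data", 1))
}
```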
Derek Collison
3216eb5ee5 When a consumer has no state we are now compacting the log, but were not snapshotting.
This caused issues on leader change and losing quorum.

Signed-off-by: Derek Collison <derek@nats.io>
2022-03-09 07:21:25 -05:00
Derek Collison
b759ff481f Some users reporting checksums don't match and "no message cache" on recovery.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 11:50:15 -08:00
Derek Collison
1b5f651c22 Fixed bug that would not recover a stream after non-clean shutdown with deleted messages.
Signed-off-by: Derek Collison <derek@nats.io>
2022-03-04 10:48:10 -08:00