This can help sync on restarts and improve ghost ephemerals. Also added more code to suppress respnses and API audits when we know we are recovering.
Signed-off-by: Derek Collison <derek@nats.io>
In cases where we had a large subject space, a filestore with many msg blocks, and a filtered consumer with a wildcard filtered subject, creating a consumer could take more memory and time then we wanted.
This improvement works when the consumer is DeliverAll and we used the upper layer in memory psim structure to scan but only in memory and avoid a file read for each msg block.
Signed-off-by: Derek Collison <derek@nats.io>
We noticed this was being called alot in user environments.
When the consumer was filtered with a wilcard and the stream had a high cardinality of subjects and was falling behind this could take a substantial amount of time.
Signed-off-by: Derek Collison <derek@nats.io>
When a stream had a large number of consumers on a server that were sparse, the signaling mechanism would do a linear scan to signal matching consumers. As usage patterns have continued to have more consumers that are filteres and sparse, meaning a message is destined for a single or small number of consumers.
This change moves selection to a sublist that tracks only active consumer leaders for selection, which optimizes selection of consumers to signal when the number of consumers is large.
Signed-off-by: Derek Collison <derek@nats.io>
The bug was when a timestamp for the pending state was exactly -1 which could happen based on timing of the redlivered pending items which would set pending.Timestamp into the future potentially and the timing on the encodeConsumerState call.
Minor fixes to raft.
Signed-off-by: Derek Collison <derek@nats.io>
When we deleted a consumer from an interest policy stream we would make sure to clean up any unacked messages.
However we only based start from the ack floor for the consumer and did not take into account the first sequence of the stream.
Signed-off-by: Derek Collison <derek@nats.io>
A request to `$SYS.REQ.SERVER.PING.JSZ` would now return something
like this:
```
...
"meta_cluster": {
"name": "local",
"leader": "A",
"peer": "NUmM6cRx",
"replicas": [
{
"name": "B",
"current": true,
"active": 690369000,
"peer": "b2oh2L6w"
},
{
"name": "Server name unknown at this time (peerID: jZ6RvVRH)",
"current": false,
"offline": true,
"active": 0,
"peer": "jZ6RvVRH"
}
],
"cluster_size": 3
}
```
Note the "peer" field following the "leader" field that contains
the server name. The new field is the node ID, which is a hash of
the server name.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Code change:
- Do not start the processMirrorMsgs and processSourceMsgs go routine
if the server has been detected to be shutdown. This would otherwise
leave some go routine running at the end of some tests.
- Pass the fch and qch to the consumerFileStore's flushLoop otherwise
in some tests this routine could be left running.
Tests changes:
- Added missing defer NATS connection close
- Added missing defer server shutdown
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Test TestNoRaceJetStreamClusterCorruptWAL() would start to flap
because of the snapshot on cluster shutdown. Disable the snapshot
on exit for this test by stopping the raft node before shutdown.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- didRemove in applyMetaEntries() could be reset when processing
multiple entries
- change "no race" test names to include JetStream
- separate raft nodes leader stepdown and stop in server
shutdown process
- in InstallSnapshot, call wal.Compact() with lastIndex+1
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Use better indexing for lookups, we used to do simple linear scan backwards, now track first and last block.
Will expire the fss cache at will to reduce memory usage.
Signed-off-by: Derek Collison <derek@nats.io>
- Send snapshot only if leader
- When processing snapshot, start with a smaller inactivity interval
that will double up to 10sec or use 10sec directly once we get a
message. Reason for that is that it is possible that the request
for snapshot is sent while the leader has not yet setup the subscription
that receives the requests (or subscription has not fully reached the
cluster).
- Don't remember snapfile on err.
- Do not consider current if we have not had any activity.
- Stabilize stream scale up under active heavy publishing.
- Due to the publish pressure move the check for followers direct subs spinning up til after we stop publishing.
Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If there are more stream names that the current limit of 1024, getting
the list of names would return them all instead of using pagination.
For "stream infos", the Total amount returned would be the API limit
instead of the actual number of streams.
Resolves https://github.com/nats-io/natscli/issues/541
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
If a JS API request is received from a non client connection, it
was processed in its own go routine. To reduce the number of
such go routine, we were limiting the number of outstanding routines
to 4096. However, in some situations, it was possible to issue
many requests at the same time that would then cause those requests
to be dropped.
(an example was an MQTT benchmark tool that would create 5000
sessions, each with one QoS1 R1 consumer (with the use of consumer_replicas=1).
On abrupt exit of the tool, the consumers and their sessions needed
to be deleted. Since would cause fast incoming delete consumer requests
which would cause the original code to drop some of them)
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
There was an issue with MaxWaiting==1 that was causing a request
with expiration to actually not expire. This was because processWaiting
would not pick it up because wq.rp was actually equal to wq.wp
(that is, the read pointer was equal to write pointer for a slice
of capacity of 1).
The other issue was that when reaching the maximum of waiting pull
requests, a new request would evict an old one with a "408 Request Canceled".
There is no reason for that, instead the server will first try to
find some existing expired requests (since some of the expiration
is lazily done), but if none is expired, and the queue is full,
the server will return a "409 Exceeded MaxWaiting" to the new
request, and not a "408 Request Canceled" to an old one...
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
When deciding to compact a file, we need to remove from the raw
bytes the empty records, otherwise, for small messages, we would
end-up calling compact() too many times.
When removing a message from the stream, in FIFO cases we would
write the index every 2 seconds at most when doing it in place,
when when dealing with out of order deletes, we would do it for
every single delete, which can be costly. We are now writing
only every 500ms for non FIFO cases.
Also fixed some unrelated code:
- Decision to install a snapshot was based on incorrect logical
expression
- In checkPending(), protect against the timer being nil which
could happen when consumer is stopped or leadership change.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Remove code coverage from Travis and add it to a GitHub Action
that will be run as a nightly.
- Use tag builds to exclude some tests, such as the "norace" or
JS tests. Since "go test" does not support "negative" regexs, there
is no other way.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>