Derek Collison 783edaa36d [FIXED] Race condition in some leader failover scenarios leading to messages being potentially sourced more than once. (#4592)
- [X] Tests added
- [X] Branch rebased on top of current main (`git pull --rebase origin
main`)
- [X] Changes squashed to a single commit (described
[here](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html))
 - [x] Build is green in Travis CI
- [X] You have certified that the contribution is your original work and
that you license the work to the project under the [Apache 2
license](https://github.com/nats-io/nats-server/blob/main/LICENSE)

### Changes proposed in this pull request:

Fixes a race condition in some leader failover scenarios leading to
messages being potentially sourced more than once.

In some failure scenarios where the current leader of a stream sourcing
from other stream(s) gets shutdown while publications are happening on
the stream(s) being sourced leads to `setLeader(true)` being called on
the new leader for the sourcing stream before all the messages having
been sourced by the previous leader are completely processed such that
when the new leader does it's reverse scan from the last message in it's
view of the stream in order to know what sequence number to start the
consumer for the stream being sourced from, such that the last
message(s) sourced by the previous leader get sourced again, leading to
some messages being sourced more than once.

The existing `TestNoRaceJetStreamSuperClusterSources` test would
sidestep the issue by relying on the deduplication window in the
sourcing stream. Without deduplication the test is a flapper.

This avoid the race condition by adding a small delay before scanning
for the last message(s) having been sourced and starting the sources'
consumer(s). Now the test (without using the deduplication window) never
fails because more messages than expected have been received in the
sourcing stream.

(Also adds a guard to give up if `setupSourceConsumers()` is called and
we are no longer the leader for the stream (that check was already
present in `setupMirrorConsumer()` so assuming it was forgotten for
`setupSourceConsumers()`)
2023-09-28 11:22:20 -07:00
2023-09-15 16:12:46 -04:00
2021-07-13 10:07:31 +02:00
2022-08-03 14:01:55 -07:00
2023-09-06 13:20:10 -04:00
2018-03-15 11:38:25 -07:00
2022-08-16 16:48:00 -06:00
2023-09-27 09:55:38 -07:00
2023-09-27 09:55:38 -07:00
2023-09-26 07:37:57 -04:00
2018-03-15 22:31:07 -07:00
2023-03-17 15:09:50 -07:00
2023-09-26 07:37:57 -04:00
2023-09-26 07:37:57 -04:00

NATS Logo

NATS is a simple, secure and performant communications system for digital systems, services and devices. NATS is part of the Cloud Native Computing Foundation (CNCF). NATS has over 40 client language implementations, and its server can run on-premise, in the cloud, at the edge, and even on a Raspberry Pi. NATS can secure and simplify design and operation of modern distributed systems.

License Build Release Slack Coverage Docker Downloads CII Best Practices

Documentation

Contact

  • Twitter: Follow us on Twitter!
  • Google Groups: Where you can ask questions
  • Slack: Click here to join. You can ask question to our maintainers and to the rich and active community.

Contributing

If you are interested in contributing to NATS, read about our...

Roadmap

The NATS product roadmap can be found here.

Security

Security Audit

A third party security audit was performed by Cure53, you can see the full report here.

Reporting Security Vulnerabilities

If you've found a vulnerability or a potential vulnerability in the NATS server, please let us know at nats-security.

License

Unless otherwise noted, the NATS source files are distributed under the Apache Version 2.0 license found in the LICENSE file.

Description
No description provided
Readme Apache-2.0 33 MiB
Languages
Go 99.6%
Shell 0.4%