I noticed that TestNoRaceRoutedQueueAutoUnsubscribe started to
fail a lot on Travis. Running locally I could see a 45 to 50%
failure rate. After investigation I realized that the issue was
that we had wrongly re-used `subscription.nm` and set it to -1 on
unsubscribe. However, I believe it was possible that when the
subscription was closed, the server had already picked that consumer
for a delivery, which then caused nm==-1 to be bumped to 0, which
was wrong.
With the subscription.close() that sets nm to -1 commented out, I
could not get the test to fail on macOS but would still get 7%
failures on a Linux VM. Adding a check in deliverMsg() to see if the
sub is closed completely eliminated the failures, even on the Linux VM.
We could still use `nm` set to -1 but check it in deliverMsg(), the
same way I use the closed int32 now.
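A minimal sketch of the closed check described above. The
`subscription` type here is a pared-down, hypothetical stand-in (only
`nm` and `closed` are named in this message), and `deliverMsg()` is
reduced to the part that matters: it refuses delivery once the closed
flag is set, so `nm` is never touched after close.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// subscription is a hypothetical stand-in for the server's type.
// Only the fields relevant to this fix are modeled.
type subscription struct {
	nm     int64 // number of messages delivered so far
	closed int32 // set to 1 atomically once the subscription is closed
}

// close marks the subscription as closed without overloading nm.
func (s *subscription) close() {
	atomic.StoreInt32(&s.closed, 1)
}

// isClosed reports whether close() has been called.
func (s *subscription) isClosed() bool {
	return atomic.LoadInt32(&s.closed) == 1
}

// deliverMsg is a simplified illustration: it skips a subscription
// that was closed after it had already been picked for delivery.
// It reports whether the message was counted as delivered.
func deliverMsg(s *subscription) bool {
	if s.isClosed() {
		return false
	}
	atomic.AddInt64(&s.nm, 1)
	return true
}

func main() {
	sub := &subscription{}
	fmt.Println(deliverMsg(sub)) // true
	sub.close()
	fmt.Println(deliverMsg(sub)) // false
	fmt.Println(sub.nm)          // 1
}
```

Keeping the closed state in its own field means a late delivery
cannot turn a sentinel value of `nm` back into a valid count.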
Fixed some flappers.
Updated .travis.yml to fail fast if one of the commands in the
`script` fails. Use `set -e` and `set +e` as recommended in
https://github.com/travis-ci/travis-ci/issues/1066
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
This was introduced in PR #930. In the first commit, the route code
checked whether flushOutbound() returned false and, if so, would
locally unlock/lock the connection's lock. Unfortunately, in the
second commit (a6aeed3a6b) this unlock/lock was moved
into the flushOutbound() function itself.
As a result, closeConnection() may unlock the connection while
calling flushOutbound(). If the connection is then closed both
explicitly and, for instance, due to a TLS timeout, it could end
up being scheduled for a reconnect (if it is an explicit gateway
connection, and possibly a route).
Added defensive code in Gateway to register a unique outbound gateway.
Fixed a test that was now failing with a newer Go version in which
url.Parse() was fixed.
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Take into account tracking of response maps that are created and do proper cleanup.
Also fixes #1089, which was discovered while working on this.
Signed-off-by: Derek Collison <derek@nats.io>
When a route receives a message, it uses a thread local cache to
find the account and matching subscriptions for a given subject.
When not found, an entry is added to this cache. The problem is
that this cache will reference subscriptions that in turn
reference connections.
When the subscriptions/connections are closed, this thread local
cache cannot be purged of those closed subscriptions (since it is
thread local - no locking is used).
The real issue is that the connection's buffer was not set to nil
on close, which could then cause more memory than expected to
remain referenced. Setting the buffer to nil will help reduce the
memory being used.
When an entry is added to the cache, the cache may reach a size
that will cause the server to prune some entries. From time to
time, the cache will be scanned to look for entries that contain
only closed subscriptions and remove those.
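The two cleanup paths above can be sketched roughly as follows. This
is an illustration under assumptions, not the server's code: the
`routeCache` type, its `maxSize`/`pruneN` fields and the `purgeClosed()`
name are all hypothetical, and the cache is assumed to be owned by a
single goroutine, so no lock is taken.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// subscription is a hypothetical stand-in; only the closed flag matters here.
type subscription struct {
	closed int32
}

func (s *subscription) isClosed() bool {
	return atomic.LoadInt32(&s.closed) == 1
}

// routeCache is a hypothetical per-route cache keyed by subject.
// It is only touched by the goroutine that owns it.
type routeCache struct {
	entries map[string][]*subscription
	maxSize int // size above which some entries are pruned
	pruneN  int // how many entries to drop when pruning
}

// add stores an entry and prunes arbitrary entries if the cache grew too big.
func (c *routeCache) add(subject string, subs []*subscription) {
	c.entries[subject] = subs
	if len(c.entries) > c.maxSize {
		n := 0
		for k := range c.entries {
			delete(c.entries, k)
			if n++; n >= c.pruneN {
				break
			}
		}
	}
}

// purgeClosed removes entries whose subscriptions are all closed and
// returns how many entries were removed. Entries with no subscriptions
// are kept in this sketch, since they hold no references.
func (c *routeCache) purgeClosed() int {
	removed := 0
	for k, subs := range c.entries {
		allClosed := len(subs) > 0
		for _, s := range subs {
			if !s.isClosed() {
				allClosed = false
				break
			}
		}
		if allClosed {
			delete(c.entries, k)
			removed++
		}
	}
	return removed
}

func main() {
	live, dead := &subscription{}, &subscription{closed: 1}
	c := &routeCache{entries: map[string][]*subscription{}, maxSize: 10, pruneN: 2}
	c.add("a", []*subscription{dead})
	c.add("b", []*subscription{live, dead})
	fmt.Println(c.purgeClosed()) // 1
	fmt.Println(len(c.entries))  // 1
}
```

The periodic scan only reads the atomic closed flag, which is what
lets it run without taking any subscription or connection lock.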
Resolves #1082
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>