Files
nats-server/server/leafnode.go
Ivan Kozlovic 406dc7ee56 Fixed data race on leafnode check for remote cluster
A newly introduced test (TestLeafNodeTwoRemotesBindToSameAccount)
had a server creating two remotes to the same server/account.
This test quite often show the data race:
```
go test -race -v -run=TestLeafNodeTwoRemotesBindToSameAccount ./server -count 100 --failfast
=== RUN   TestLeafNodeTwoRemotesBindToSameAccount
==================
WARNING: DATA RACE
Write at 0x00c000168790 by goroutine 34:
  github.com/nats-io/nats-server/v2/server.(*client).processLeafNodeConnect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1177 +0x314
  github.com/nats-io/nats-server/v2/server.(*client).processConnect()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:1719 +0x9e4
  github.com/nats-io/nats-server/v2/server.(*client).parse()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/parser.go:870 +0xf88
  github.com/nats-io/nats-server/v2/server.(*client).readLoop()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:1052 +0x7a5
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func4()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0x52

Previous read at 0x00c000168790 by goroutine 32:
  github.com/nats-io/nats-server/v2/server.(*client).remoteCluster()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1203 +0x42d
  github.com/nats-io/nats-server/v2/server.(*Server).updateLeafNodes()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1375 +0x2cf
  github.com/nats-io/nats-server/v2/server.(*client).processLeafSub()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:1619 +0x858
  github.com/nats-io/nats-server/v2/server.(*client).parse()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/parser.go:624 +0x5031
  github.com/nats-io/nats-server/v2/server.(*client).readLoop()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/client.go:1052 +0x7a5
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode.func4()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0x52

Goroutine 34 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:2627 +0xc7
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0xf7a
  github.com/nats-io/nats-server/v2/server.(*Server).startLeafNodeAcceptLoop.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:474 +0x5e
  github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1784 +0x57

Goroutine 32 (running) created at:
  github.com/nats-io/nats-server/v2/server.(*Server).startGoRoutine()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:2627 +0xc7
  github.com/nats-io/nats-server/v2/server.(*Server).createLeafNode()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:872 +0xf7a
  github.com/nats-io/nats-server/v2/server.(*Server).startLeafNodeAcceptLoop.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/leafnode.go:474 +0x5e
  github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections.func1()
      /Users/ivan/dev/go/src/github.com/nats-io/nats-server/server/server.go:1784 +0x57
==================
    testing.go:965: race detected during execution of test
--- FAIL: TestLeafNodeTwoRemotesBindToSameAccount (0.05s)
```

This is because as soon as a LEAF is registered with the account, it is available
in the account's lleafs map, even before the CONNECT for this connectio is processed.
If another LEAF connection is processing a LSUB, the code goes over all leaf connections
for the account and may find the new connection that is in the process of connecting.
The check accesses c.leaf.remoteCluster unlocked which is also set unlocked during
the CONNECT. The fix is to have the set and check on that particular location using
the client's lock.

Ideally I believe that the connection should not have been in the account's lleafs,
or at least not used until the CONNECT for this leaf connection is fully processed.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2020-11-24 15:42:30 -07:00

57 KiB