2234 Commits

Author SHA1 Message Date
Derek Collison
52430c304a System level services for debugging.
This is the first pass at introducing exported services to the system account for general debugging of blackbox systems.
The first service reports the number of subscribers for a given subject. The payload of the request is the subject and an optional queue group; the subject can contain wildcards.

Signed-off-by: Derek Collison <derek@nats.io>
2019-09-17 09:37:35 -07:00
Derek Collison
8fdf6a4e3a Merge pull request #1125 from nats-io/json
Shorter names for latency tracking JSON
2019-09-12 15:19:50 -07:00
Derek Collison
26db43001f Shorter names for latency tracking JSON
Signed-off-by: Derek Collison <derek@nats.io>
2019-09-12 15:11:43 -07:00
Ivan Kozlovic
6f1c5d1179 Merge pull request #1124 from nats-io/update_travis_yml
Updates to Travis
2019-09-12 11:48:40 -06:00
Ivan Kozlovic
3d2a961c5a Updates to the script
- Replace EXCLUDE_VENDOR with GO_LIST since `go list` already
  excludes vendor directory.
- Changed misspell invocation because it was not doing what
  it was supposed to.
- Use `./...` in the test commands.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-12 10:24:56 -06:00
Derek Collison
b0122f736d Merge pull request #1122 from nats-io/latency-v2
Latency tracking updates.
2019-09-11 17:41:59 -07:00
Derek Collison
25d5cb337d Make json tags consistent
Signed-off-by: Derek Collison <derek@nats.io>
2019-09-11 17:30:01 -07:00
Derek Collison
94f143ccce Latency tracking updates.
Will now break out the internal NATS latency to show requestor client RTT, responder client RTT and any internal latency caused by hopping between servers, etc.

Signed-off-by: Derek Collison <derek@nats.io>
2019-09-11 16:43:19 -07:00
Ivan Kozlovic
d125f06eaf Merge pull request #1121 from nats-io/fix_max_pending
[FIXED] MaxPending > MaxInt32 causes client to be disconnected
2019-09-11 14:48:11 -06:00
Ivan Kozlovic
effa30ce4a [FIXED] MaxPending > MaxInt32 causes client to be disconnected
Changed some of client.outbound fields to int64.
Moved fields around to minimize size of struct (checked with
unsafe.Sizeof())
Checked benchmark results before/after
Added test

Resolves #1118

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-11 14:29:02 -06:00
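
The two points in the commit above (a value larger than MaxInt32 not fitting in an int32 field, and field order affecting struct size as reported by unsafe.Sizeof()) can be illustrated with a minimal sketch. The `padded` and `packed` structs and the `fitsInt32` and `sizes` helpers are hypothetical and are not the server's client.outbound type.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// padded interleaves small and large fields, so the compiler
// inserts padding after each bool to align the following int64.
// Hypothetical struct, for illustration only.
type padded struct {
	a bool
	b int64
	c bool
	d int64
}

// packed holds the same fields, largest first, so the two bools
// share the trailing bytes and less padding is needed.
type packed struct {
	b int64
	d int64
	a bool
	c bool
}

// fitsInt32 reports whether v can be stored in an int32 without overflow.
func fitsInt32(v int64) bool {
	return v >= math.MinInt32 && v <= math.MaxInt32
}

// sizes returns the sizes in bytes of padded and packed on this platform.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(packed{})
}

func main() {
	p, q := sizes()
	fmt.Println("padded:", p, "packed:", q)

	// A limit one past MaxInt32 needs a 64-bit field.
	fmt.Println("fits in int32:", fitsInt32(math.MaxInt32+1))
}
```

The exact sizes depend on the platform's alignment rules, but the reordered struct is expected to be the smaller of the two.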
Ivan Kozlovic
d19b13d093 Merge pull request #1119 from nats-io/fix_1117
[FIXED] Circular account service import dependency
2019-09-10 18:31:13 -06:00
Ivan Kozlovic
4253b31dcf [FIXED] Circular account service import dependency
If account A imports from B and B from A, when the account A
is built, it causes B to be fetched, but since B imports from A,
A was fetched/built again in an infinite loop.

Resolves #1117

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-10 18:05:21 -06:00
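
A common way to break this kind of loop is to mark an account as seen before following its imports. The sketch below shows that general technique only; the `resolver` type, its fields and `buildAll` are hypothetical names and do not reflect how the server actually resolves accounts.

```go
package main

import "fmt"

// resolver is a hypothetical stand-in for account resolution.
// imports maps an account name to the accounts it imports from.
type resolver struct {
	imports map[string][]string
	built   map[string]bool
	fetches int
}

// build fetches an account once and then follows its imports.
// The account is marked before recursing, so a cycle such as
// A -> B -> A stops at the second visit of A.
func (r *resolver) build(name string) {
	if r.built[name] {
		return // already built or currently being built
	}
	r.built[name] = true
	r.fetches++
	for _, dep := range r.imports[name] {
		r.build(dep)
	}
}

// buildAll builds root and everything reachable from it and
// returns how many distinct accounts were fetched.
func buildAll(imports map[string][]string, root string) int {
	r := &resolver{imports: imports, built: map[string]bool{}}
	r.build(root)
	return r.fetches
}

func main() {
	circular := map[string][]string{"A": {"B"}, "B": {"A"}}
	fmt.Println("accounts fetched:", buildAll(circular, "A"))
}
```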
Waldemar Quevedo
390afecd92 Update ISSUE_TEMPLATE.md
2019-09-10 10:24:52 -07:00
Derek Collison
97f89ffd3f Merge pull request #1115 from nats-io/update-system-account
Update SYS account name
2019-09-05 19:09:42 +03:00
Jaime Piña
176a19de75 Update SYS account name
Currently, the $SYSTEM subject is used in this repo, but it seems like this
subject name is out of date.

This change updates the code to use $SYS to match the documentation.
2019-09-04 13:49:59 -05:00
Derek Collison
4042465eda Merge pull request #1112 from nats-io/prune
Prune remote reply tracking
2019-08-30 17:39:38 -07:00
Derek Collison
67470911fe Prune remote reply tracking
Signed-off-by: Derek Collison <derek@nats.io>
2019-08-30 17:35:20 -07:00
Derek Collison
bb11f7bd2d Merge pull request #1111 from nats-io/latency
Track latency for exported services
2019-08-30 11:02:36 -07:00
Derek Collison
7989118c3f First pass latency tracking for exported services
Signed-off-by: Derek Collison <derek@nats.io>
2019-08-30 10:52:48 -07:00
Ivan Kozlovic
8ca58c7de0 Merge pull request #1110 from nats-io/fix_flush_outbound
Fixed flushOutbound
2019-08-29 13:32:37 -06:00
Ivan Kozlovic
2a8973a62b Fixed flushOutbound
With Go 1.12 (strangely was not able to reproduce with Go 1.11)
the test TestRouteNoCrashOnAddingSubToRoute() would frequently
lock up and consume all available CPUs on the machine. Running
this test with GOMAXPROCS=2 you would see server.test CPU usage
pegged at 200% (assuming you have at least 2 CPUs).
The reason was that the writeLoop was spinning because another
routine was already in flushOutbound() and stack trace would
show that it was stuck in system calls. It seems that the
writeLoop releasing the lock but grabbing it again right away
was not allowing the syscall to complete.

So decided to put the unlock/gosched/lock back in flushOutbound()
when flag is already set, but then protect the closeConnection()
with its own flag (similar to clearConnection) to not re-introduce
issue fixed in #1092.

Had to fix the benchmark test RoutedInterestGraph because after a
route is accepted, the initial PING will be sent after 1sec which
was breaking this test.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-29 12:59:27 -06:00
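
The unlock/gosched/lock pattern described in the commit above can be sketched as follows. This is a simplified illustration and not the server's flushOutbound(): the `conn` type, its `flushing` flag and the `runFlushers` helper are hypothetical, and the actual socket write is only indicated by a comment.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// conn is a hypothetical stand-in for a connection with an outbound buffer.
type conn struct {
	mu       sync.Mutex
	flushing bool // set while one goroutine is writing to the socket
	flushed  int  // number of completed flushes
}

// flushLocked must be called with c.mu held and returns with it held.
// If another goroutine is already flushing, it releases the lock,
// yields the processor, re-acquires the lock and reports false, so
// the caller does not spin while holding the lock.
func (c *conn) flushLocked() bool {
	if c.flushing {
		c.mu.Unlock()
		runtime.Gosched()
		c.mu.Lock()
		return false
	}
	c.flushing = true
	c.mu.Unlock()

	// The (simulated) write to the socket would happen here, outside the lock.

	c.mu.Lock()
	c.flushing = false
	c.flushed++
	return true
}

// runFlushers starts n goroutines that each retry until their flush
// succeeds, and returns the total number of completed flushes.
func runFlushers(n int) int {
	c := &conn{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.mu.Lock()
			for !c.flushLocked() {
			}
			c.mu.Unlock()
		}()
	}
	wg.Wait()
	return c.flushed
}

func main() {
	fmt.Println("completed flushes:", runFlushers(8))
}
```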
Ivan Kozlovic
8a0120d1b8 Merge pull request #1108 from nats-io/add_leafz
[ADDED] /leafz endpoint
2019-08-26 12:46:44 -06:00
Ivan Kozlovic
cd4b8d3fad [ADDED] /leafz endpoint
Resolves #1061

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 12:00:24 -06:00
Ivan Kozlovic
6c4a88f34e Merge pull request #1107 from nats-io/leaf_route_rtt
Leaf and Route RTT
2019-08-26 11:57:54 -06:00
Ivan Kozlovic
cd9f898eb0 Made a server's helper to set first ping timer
Defaults to 1sec but will be opts.PingInterval if value is lower.
All non-client connections invoke this function for the first
PING.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 10:21:43 -06:00
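
The rule stated in the commit above (1sec by default, or the configured ping interval if that is lower) amounts to taking the smaller of two durations. A minimal sketch, where `firstPingInterval` and `firstPingDefault` are hypothetical names rather than the server's helper:

```go
package main

import (
	"fmt"
	"time"
)

// firstPingDefault is the default delay before the first PING.
const firstPingDefault = time.Second

// firstPingInterval returns the delay to use for the first PING:
// the default of one second, or pingInterval if that is lower.
func firstPingInterval(pingInterval time.Duration) time.Duration {
	if pingInterval < firstPingDefault {
		return pingInterval
	}
	return firstPingDefault
}

func main() {
	fmt.Println(firstPingInterval(2 * time.Minute))
	fmt.Println(firstPingInterval(250 * time.Millisecond))
}
```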
Ivan Kozlovic
90d592e163 Leaf and Route RTT
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which allows the RTT to be computed
reasonably soon (since the PingInterval could be user configured
and set much higher).

For Route in PR #1101, I was sending the PING on receiving the
INFO which required changing a bunch of tests. Changing that to
also use the first timer interval of 1sec and reverted changes
to route tests.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 09:34:17 -06:00
Ivan Kozlovic
5518cbe070 Merge pull request #1106 from nats-io/fix_leafnode
[FIXED] Some Leafnode issues
2019-08-26 09:17:36 -06:00
Ivan Kozlovic
7ca8723942 [FIXED] Some Leafnode issues
- On startup, verify that local account in leafnode (if specified)
  can be found, otherwise fail startup.
- At runtime, print error and continue trying to reconnect.
  Will need to decide a better approach.
- When using basic auth (user/password), it was possible for a
  solicited Leafnode connection to not use user/password when
  trying a URL that was discovered through gossip. The server
  now saves the credentials of a configured URL to use with
  the discovered ones.

Updated RouteRTT test in case RTT does not seem to be updated
because it always gets the same value.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-23 14:08:07 -06:00
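
The credentials fix in the commit above (reusing the user/password of a configured URL for URLs discovered through gossip) can be sketched with the standard net/url package. The `withConfiguredCreds` helper, the host names and the port are assumptions made for illustration; the server's own bookkeeping is not shown in the commit message.

```go
package main

import (
	"fmt"
	"net/url"
)

// withConfiguredCreds returns the discovered URL with the user info of
// the configured URL applied, unless the discovered URL already
// carries its own user info. Hypothetical helper, for illustration.
func withConfiguredCreds(configured, discovered string) (string, error) {
	cu, err := url.Parse(configured)
	if err != nil {
		return "", err
	}
	du, err := url.Parse(discovered)
	if err != nil {
		return "", err
	}
	if du.User == nil {
		du.User = cu.User
	}
	return du.String(), nil
}

func main() {
	u, err := withConfiguredCreds("nats://user:pwd@host1:7422", "nats://host2:7422")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```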
Derek Collison
452a98572a Merge pull request #1105 from ethan-daocloud/typo-terminator
Cleanup: fix some typos in code comment
2019-08-22 07:57:29 -07:00
Guangming Wang
927991321d Cleanup: fix some typos in code comment
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-22 21:36:37 +08:00
Ivan Kozlovic
2959b982ea Merge pull request #1101 from nats-io/route_rtt
[ADDED] RTT in routez's route info
2019-08-20 17:23:18 -06:00
Ivan Kozlovic
41a7457ecf Merge pull request #1103 from nats-io/flappers
Fix flappers
2019-08-20 17:21:56 -06:00
Ivan Kozlovic
77c63dbce1 Fix flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-20 17:07:22 -06:00
Ivan Kozlovic
02acbb6419 Merge pull request #1102 from nats-io/fix_sub_close
Fixed subscription close
2019-08-20 15:14:29 -06:00
Ivan Kozlovic
2f48ad5150 Fixed subscription close
I noticed that TestNoRaceRoutedQueueAutoUnsubscribe started to
fail a lot on Travis. Running locally I could see a 45 to 50%
failure rate. After investigation I realized that the issue was that
we had wrongly re-used `subscription.nm` and set it to -1 on unsubscribe.
However, I believe that it was possible that when the subscription was
closed, the server may have already picked that consumer for a delivery,
which then caused nm==-1 to be bumped to 0, which was wrong.
Commenting out the subscription.close() that sets nm to -1, I could
not get the test to fail on macOS but would still get 7% failure on
Linux VM. Adding the check to see if sub is closed in deliverMsg()
completely erased the failures, even on Linux VM.

We could still use `nm` set to -1 but check in deliverMsg(), the
same way I use the closed int32 now.

Fixed some flappers.
Updated .travis.yml to failfast if one of the commands in the
`script` fails. Use `set -e` and `set +e` as recommended in
https://github.com/travis-ci/travis-ci/issues/1066

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-20 14:39:23 -06:00
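
The approach described in the commit above, a separate closed flag checked at delivery time instead of overloading the message counter, can be sketched as below. The `subscription` type here is a reduced, hypothetical version, and `deliverMsg` is a free function rather than the server's method; in a real server the counter would also be protected by the client lock.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// subscription is a reduced, hypothetical version of a server subscription.
type subscription struct {
	nm     int64 // number of messages delivered
	closed int32 // set to 1 once closed; accessed atomically
}

// close marks the subscription as closed without touching nm.
func (s *subscription) close() {
	atomic.StoreInt32(&s.closed, 1)
}

// isClosed reports whether close has been called.
func (s *subscription) isClosed() bool {
	return atomic.LoadInt32(&s.closed) == 1
}

// deliverMsg counts a delivery unless the subscription is closed,
// so a subscription picked just before it was closed is skipped.
func deliverMsg(s *subscription) bool {
	if s.isClosed() {
		return false
	}
	s.nm++
	return true
}

func main() {
	s := &subscription{}
	fmt.Println("delivered before close:", deliverMsg(s))
	s.close()
	fmt.Println("delivered after close:", deliverMsg(s))
	fmt.Println("messages counted:", s.nm)
}
```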
Ivan Kozlovic
89dd13f134 [ADDED] RTT in routez's route info
Added the RTT field to each route reported in routez.
Ensure that when a route is accepted, we send a PING to compute
the first RTT and don't have to wait for the ping timer to fire.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-20 14:16:07 -06:00
Ivan Kozlovic
25e15d7162 Merge pull request #1100 from nats-io/remove_skip_email_notifications
Removing skipping email notification [ci skip]
2019-08-15 12:37:17 -06:00
Ivan Kozlovic
ef65ccbfd4 Removing skipping email notification [ci skip]
I had put that in place when testing the release process.
Removing now.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-15 11:00:34 -06:00
Ivan Kozlovic
c8ca58efb4 Merge pull request #1095 from nats-io/goreleaser
Prepare for v2.0.4 with goreleaser
v2.0.4
2019-08-15 10:02:32 -06:00
Ivan Kozlovic
a44e3f037c Change checksum file name to SHA256SUMS so it can be used with rget
See:
https://github.com/merklecounty/rget/blob/master/Documentation/integrations.md
6c02b98031/.goreleaser.yml (L30)

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-15 09:07:20 -06:00
Ivan Kozlovic
e230e7fde9 Attempt at fixing flapper again
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-15 09:06:56 -06:00
Ivan Kozlovic
fc8087daa7 Updates based on comments
- add sha256 algo
- move some mem hungry tests while running with -race to the norace
- remove GOGC=10

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-15 09:06:56 -06:00
Ivan Kozlovic
07e3db6b8e Prepare for v2.0.4 with goreleaser
Also fixed some flappers

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-15 09:06:56 -06:00
Derek Collison
0fd42cffcb Merge pull request #1096 from nats-io/flap
Fix for flapping test
2019-08-15 07:50:41 -07:00
Derek Collison
2657092a04 Merge pull request #1098 from ethan-daocloud/patch-1
cleanup: fix word errors in errors.go
2019-08-15 07:36:26 -07:00
Guangming Wang
09954eee5c cleanup: fix word errors in errors.go
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-15 22:12:57 +08:00
Derek Collison
93313a149e Fix for flapping test
Signed-off-by: Derek Collison <derek@nats.io>
2019-08-14 23:52:49 -07:00
Derek Collison
c185a88146 Merge pull request #1094 from nats-io/cover
tmp disable of coveralls testing due to site down
2019-08-13 20:26:30 -07:00
Derek Collison
1fc7e99e76 tmp disable of coveralls testing due to site down
Signed-off-by: Derek Collison <derek@nats.io>
2019-08-13 20:16:18 -07:00
Derek Collison
e76f66c1cd Merge pull request #1093 from wallyqs/typo
Fix typo
2019-08-13 20:05:08 -07:00