2247 Commits

Author SHA1 Message Date
Derek Collison
43324271ca Friendly version 2019-09-18 11:58:36 -07:00
Derek Collison
25c04069fd Merge pull request #1133 from nats-io/http
Use multiple connections to amortize TLS
2019-09-18 11:51:44 -07:00
Derek Collison
7cf211b056 Use multiple connections to amortize TLS
Signed-off-by: Derek Collison <derek@nats.io>
2019-09-18 11:40:00 -07:00
Derek Collison
fe3c0b03be Update to project ID 2019-09-18 10:50:08 -07:00
Derek Collison
70f526548a Create sponsorship button 2019-09-18 10:48:41 -07:00
Derek Collison
d6f0622c1f Merge pull request #1132 from nats-io/jwt_latency
Add in JWT support for tracking latency
2019-09-18 08:58:42 -07:00
Derek Collison
0551371b31 Add in JWT support for tracking latency
Signed-off-by: Derek Collison <derek@nats.io>
2019-09-18 08:51:43 -07:00
Derek Collison
b98b75b166 Merge pull request #1127 from nats-io/sysdebug
System level services for debugging.
2019-09-17 09:45:53 -07:00
Derek Collison
52430c304a System level services for debugging.
This is the first pass at introducing exported services to the system account for general debugging of black-box systems.
The first service reports the number of subscribers for a given subject. The payload of the request is the subject, plus an optional queue group, and can contain wildcards.

Signed-off-by: Derek Collison <derek@nats.io>
2019-09-17 09:37:35 -07:00
Ivan Kozlovic
0ede25e064 Merge pull request #1129 from nats-io/fix-1128
[FIXED] Allow command line `-cluster` to accept -1 for port
2019-09-17 09:01:01 -06:00
Alberto Ricart
eb56ad22ea review comment 2019-09-17 09:56:03 -05:00
Alberto Ricart
af97b5b9df FIX #1128 - Modified the cluster listenstr parsing to allow cluster urls that have
a -1 for a port. This re-enables the ability to create clusters on a random
port for testing.
2019-09-16 10:45:27 -05:00
Ivan Kozlovic
5eebc42f47 Merge pull request #1126 from nats-io/fix_acc_lock_issue
Fixed a lock inversion issue with account
2019-09-13 15:11:02 -06:00
Ivan Kozlovic
15201a19cd Fixed a lock inversion issue with account
In updateRouteSubscriptionMap(), when a queue sub is added/removed,
the code locks the account and then the route to send the update.
However, when a route is accepted and the subs are sent, the
opposite (locking wise) occurs. The route is locked, then the account.

This lock inversion is possible because a route is registered (added
to the server's map) and then the subs are sent.

Use a special lock to protect the send, but don't hold the acc.mu
lock while getting the route's lock.

The tests that were created for the original missed queue updates
issue, namely TestClusterLeaksSubscriptions() and
TestQueueSubWeightOrderMultipleConnections() pass with this change.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-13 14:30:00 -06:00
Derek Collison
8fdf6a4e3a Merge pull request #1125 from nats-io/json
Shorter names for latency tracking JSON
2019-09-12 15:19:50 -07:00
Derek Collison
26db43001f Shorter names for latency tracking JSON
Signed-off-by: Derek Collison <derek@nats.io>
2019-09-12 15:11:43 -07:00
Ivan Kozlovic
6f1c5d1179 Merge pull request #1124 from nats-io/update_travis_yml
Updates to Travis
2019-09-12 11:48:40 -06:00
Ivan Kozlovic
3d2a961c5a Updates to the script
- Replace EXCLUDE_VENDOR with GO_LIST since `go list` already
  excludes vendor directory.
- Changed misspell invocation because it was not doing what
  it was supposed to.
- Use `./...` in the test commands.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-12 10:24:56 -06:00
Derek Collison
b0122f736d Merge pull request #1122 from nats-io/latency-v2
Latency tracking updates.
2019-09-11 17:41:59 -07:00
Derek Collison
25d5cb337d Make json tags consistent
Signed-off-by: Derek Collison <derek@nats.io>
2019-09-11 17:30:01 -07:00
Derek Collison
94f143ccce Latency tracking updates.
Will now break out the internal NATS latency to show requestor client RTT, responder client RTT and any internal latency caused by hopping between servers, etc.

Signed-off-by: Derek Collison <derek@nats.io>
2019-09-11 16:43:19 -07:00
Ivan Kozlovic
d125f06eaf Merge pull request #1121 from nats-io/fix_max_pending
[FIXED] MaxPending > MaxInt32 causes client to be disconnected
2019-09-11 14:48:11 -06:00
Ivan Kozlovic
effa30ce4a [FIXED] MaxPending > MaxInt32 causes client to be disconnected
Changed some of client.outbound fields to int64.
Moved fields around to minimize size of struct (checked with
unsafe.Sizeof())
Checked benchmark results before/after
Added test

Resolves #1118

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-11 14:29:02 -06:00
Ivan Kozlovic
d19b13d093 Merge pull request #1119 from nats-io/fix_1117
[FIXED] Circular account service import dependency
2019-09-10 18:31:13 -06:00
Ivan Kozlovic
4253b31dcf [FIXED] Circular account service import dependency
If account A imports from B and B from A, when the account A
is built, it causes B to be fetched, but since B imports from A,
A was fetched/built again in an infinite loop.

Resolves #1117

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-09-10 18:05:21 -06:00
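One common way to break this kind of cycle is to register an account before following its imports, so a second visit returns the existing entry instead of recursing. The sketch below illustrates that general technique; it is not taken from the server, and the `resolver` and `account` types and the `fetch` method are hypothetical.

```go
package main

import "fmt"

// account is a hypothetical, simplified account with service imports.
type account struct {
	name    string
	imports []string
}

// resolver builds accounts by name and remembers those already seen.
type resolver struct {
	defs    map[string][]string // account name -> accounts it imports from
	built   map[string]*account
	fetches int
}

// fetch builds the named account and, recursively, the accounts it imports.
// The account is registered before its imports are followed, so a cycle
// such as A -> B -> A ends at the second visit to A.
func (r *resolver) fetch(name string) *account {
	if acc, ok := r.built[name]; ok {
		return acc
	}
	r.fetches++
	acc := &account{name: name, imports: r.defs[name]}
	r.built[name] = acc
	for _, dep := range acc.imports {
		r.fetch(dep)
	}
	return acc
}

func main() {
	r := &resolver{
		defs:  map[string][]string{"A": {"B"}, "B": {"A"}},
		built: map[string]*account{},
	}
	r.fetch("A")
	fmt.Println("fetches:", r.fetches) // fetches: 2
}
```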
Waldemar Quevedo
390afecd92 Update ISSUE_TEMPLATE.md 2019-09-10 10:24:52 -07:00
Derek Collison
97f89ffd3f Merge pull request #1115 from nats-io/update-system-account
Update SYS account name
2019-09-05 19:09:42 +03:00
Jaime Piña
176a19de75 Update SYS account name
Currently, the $SYSTEM subject is used in this repo, but it seems like this
subject name is out of date.

This change updates the code to use $SYS to match the documentation.
2019-09-04 13:49:59 -05:00
Derek Collison
4042465eda Merge pull request #1112 from nats-io/prune
Prune remote reply tracking
2019-08-30 17:39:38 -07:00
Derek Collison
67470911fe Prune remote reply tracking
Signed-off-by: Derek Collison <derek@nats.io>
2019-08-30 17:35:20 -07:00
Derek Collison
bb11f7bd2d Merge pull request #1111 from nats-io/latency
Track latency for exported services
2019-08-30 11:02:36 -07:00
Derek Collison
7989118c3f First pass latency tracking for exported services
Signed-off-by: Derek Collison <derek@nats.io>
2019-08-30 10:52:48 -07:00
Ivan Kozlovic
8ca58c7de0 Merge pull request #1110 from nats-io/fix_flush_outbound
Fixed flushOutbound
2019-08-29 13:32:37 -06:00
Ivan Kozlovic
2a8973a62b Fixed flushOutbound
With Go 1.12 (strangely, I was not able to reproduce with Go 1.11),
the test TestRouteNoCrashOnAddingSubToRoute() would frequently
lock up and consume all available CPUs on the machine. Running
this test with GOMAXPROCS=2, you would see server.test CPU usage
pegged at 200% (assuming you have at least 2 CPUs).
The reason was that the writeLoop was spinning because another
routine was already in flushOutbound(), and the stack trace would
show that it was stuck in system calls. It seems that even though
the writeLoop does release the lock, grabbing it again right away
was not allowing the syscall to complete.

So decided to put the unlock/gosched/lock back in flushOutbound()
when the flag is already set, but then protect closeConnection()
with its own flag (similar to clearConnection) to not re-introduce
the issue fixed in #1092.

Had to fix the benchmark test RoutedInterestGraph because after a
route is accepted, the initial PING will be sent after 1sec which
was breaking this test.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-29 12:59:27 -06:00
Ivan Kozlovic
8a0120d1b8 Merge pull request #1108 from nats-io/add_leafz
[ADDED] /leafz endpoint
2019-08-26 12:46:44 -06:00
Ivan Kozlovic
cd4b8d3fad [ADDED] /leafz endpoint
Resolves #1061

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 12:00:24 -06:00
Ivan Kozlovic
6c4a88f34e Merge pull request #1107 from nats-io/leaf_route_rtt
Leaf and Route RTT
2019-08-26 11:57:54 -06:00
Ivan Kozlovic
cd9f898eb0 Made a server helper to set the first ping timer
Defaults to 1sec but will be opts.PingInterval if that value is lower.
All non-client connections invoke this function for the first
PING.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 10:21:43 -06:00
Ivan Kozlovic
90d592e163 Leaf and Route RTT
When a leaf or route connection is created, set the first ping
timer to fire at 1sec, which allows the RTT to be computed
reasonably soon (since the PingInterval could be user configured
and set much higher).

For Route in PR #1101, I was sending the PING on receiving the
INFO, which required changing a bunch of tests. Changed that to
also use the first timer interval of 1sec and reverted the changes
to route tests.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-26 09:34:17 -06:00
Ivan Kozlovic
5518cbe070 Merge pull request #1106 from nats-io/fix_leafnode
[FIXED] Some Leafnode issues
2019-08-26 09:17:36 -06:00
Ivan Kozlovic
7ca8723942 [FIXED] Some Leafnode issues
- On startup, verify that the local account in leafnode (if specified)
  can be found; otherwise fail startup.
- At runtime, print error and continue trying to reconnect.
  Will need to decide a better approach.
- When using basic auth (user/password), it was possible for a
  solicited Leafnode connection to not use user/password when
  trying a URL that was discovered through gossip. The server
  now saves the credentials of a configured URL to use with
  the discovered ones.

Updated RouteRTT test in case RTT does not seem to be updated
because it always gets the same value.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-23 14:08:07 -06:00
Derek Collison
452a98572a Merge pull request #1105 from ethan-daocloud/typo-terminator
Cleanup: fix some typos in code comment
2019-08-22 07:57:29 -07:00
Guangming Wang
927991321d Cleanup: fix some typos in code comment
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-22 21:36:37 +08:00
Ivan Kozlovic
2959b982ea Merge pull request #1101 from nats-io/route_rtt
[ADDED] RTT in routez's route info
2019-08-20 17:23:18 -06:00
Ivan Kozlovic
41a7457ecf Merge pull request #1103 from nats-io/flappers
Fix flappers
2019-08-20 17:21:56 -06:00
Ivan Kozlovic
77c63dbce1 Fix flappers
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-20 17:07:22 -06:00
Ivan Kozlovic
02acbb6419 Merge pull request #1102 from nats-io/fix_sub_close
Fixed subscription close
2019-08-20 15:14:29 -06:00
Ivan Kozlovic
2f48ad5150 Fixed subscription close
I noticed that TestNoRaceRoutedQueueAutoUnsubscribe started to
fail a lot on Travis. Running locally I could see a 45 to 50%
failure rate. After investigation I realized that the issue was
that we had wrongly re-used `subscription.nm` and set it to -1 on
unsubscribe. I believe it was possible that, when the subscription
was closed, the server may have already picked that consumer for a
delivery, which then caused nm==-1 to be bumped to 0, which was wrong.
With the subscription.close() that sets nm to -1 commented out, I
could not get the test to fail on macOS but would still get a 7%
failure rate on a Linux VM. Adding the check to see if the sub is
closed in deliverMsg() completely erased the failures, even on the
Linux VM.

We could still use `nm` set to -1 but check on deliverMsg(), the
same way I use the closed int32 now.

Fixed some flappers.
Updated .travis.yml to fail fast if one of the commands in the
`script` fails. Use `set -e` and `set +e` as recommended in
https://github.com/travis-ci/travis-ci/issues/1066

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-20 14:39:23 -06:00
Ivan Kozlovic
89dd13f134 [ADDED] RTT in routez's route info
Added the RTT field to each route reported in routez.
Ensure that when a route is accepted, we send a PING to compute
the first RTT and don't have to wait for the ping timer to fire.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
2019-08-20 14:16:07 -06:00
Ivan Kozlovic
25e15d7162 Merge pull request #1100 from nats-io/remove_skip_email_notifications
Removing skipping email notification [ci skip]
2019-08-15 12:37:17 -06:00