
Add NATS on K8S contents (#17)

* Add NATS on K8S contents

Signed-off-by: Waldemar Quevedo <wally@synadia.com>
Author: Waldemar Quevedo, 2019-12-18 11:46:41 -08:00, committed by GitHub
parent 94db70a982
commit ccb05cdf92
6 changed files with 1340 additions and 0 deletions

SUMMARY.md

@@ -186,3 +186,10 @@
* [Developing a Client](nats-protocol/nats-protocol/nats-client-dev.md)
* [NATS Cluster Protocol](nats-protocol/nats-server-protocol.md)
## NATS on Kubernetes
* [Introduction](nats-kubernetes/README.md)
* [NATS Streaming Cluster with FT Mode](nats-kubernetes/stan-ft-k8s-aws.md)
* [NATS + Prometheus Operator](nats-kubernetes/prometheus-and-nats-operator.md)
* [NATS + Cert Manager](nats-kubernetes/nats-cluster-and-cert-manager.md)
* [Securing a NATS Cluster with cfssl](nats-kubernetes/operator-tls-setup-with-cfssl.md)

nats-kubernetes/README.md (new file)

@@ -0,0 +1,58 @@
# NATS on Kubernetes
In this section you can find several examples of how to deploy NATS, NATS Streaming
and other tools from the NATS ecosystem on Kubernetes.
* [Getting Started](README.md#getting-started)
* [Creating a NATS Streaming Cluster in k8s with FT mode](stan-ft-k8s-aws.md)
* [NATS + Prometheus Operator](prometheus-and-nats-operator.md)
* [NATS + Cert Manager in k8s](nats-cluster-and-cert-manager.md)
* [Securing a NATS Cluster using cfssl](operator-tls-setup-with-cfssl.md)
## Running NATS on K8S
### Getting started
The fastest and easiest way to get started is with just one shell command:
```sh
curl -sSL https://nats-io.github.io/k8s/setup.sh | sh
```
*In case you don't have a cluster already, you can find some notes on how to create a small cluster using one of the hosted Kubernetes providers [here](https://github.com/nats-io/k8s/docs/create-k8s-cluster.md).*
This will run a `nats-setup` container with the [required policy](https://github.com/nats-io/k8s/blob/master/setup/bootstrap-policy.yml)
and deploy a NATS cluster on Kubernetes with external access, TLS and
decentralized authorization.
[![asciicast](https://asciinema.org/a/282135.svg)](https://asciinema.org/a/282135)
By default, the installer will deploy the [Prometheus Operator](https://github.com/coreos/prometheus-operator) and the
[Cert Manager](https://github.com/jetstack/cert-manager) for metrics and TLS support, and the NATS instances will
also bind the 4222 host port for external access.
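As a quick check after the installer finishes, the deployed components should show up as running pods; the exact pod and service names below are assumptions and will vary with the installer version, so treat this as a sketch:
```sh
# List the components deployed by the installer
# (pod and service names depend on the installer version)
kubectl get pods
kubectl get svc
```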
You can customize the installer to run without TLS or without auth for a simpler setup, as follows:
```sh
# Disable TLS
curl -sSL https://nats-io.github.io/k8s/setup.sh | sh -s -- --without-tls
# Disable Auth and TLS (also disables NATS surveyor and NATS Streaming)
curl -sSL https://nats-io.github.io/k8s/setup.sh | sh -s -- --without-tls --without-auth
```
**Note**: Since [NATS Streaming](https://github.com/nats-io/nats-streaming-server) runs as a [leafnode](https://github.com/nats-io/docs/tree/master/leafnodes) to NATS
(under the STAN account) and [NATS Surveyor](https://github.com/nats-io/nats-surveyor)
requires the [system account](../nats-server/nats_admin/sys_accounts) to monitor events, disabling auth also disables NATS Streaming and NATS Surveyor based monitoring.
The monitoring dashboard set up using NATS Surveyor can be accessed via port-forward:
```sh
kubectl port-forward deployments/nats-surveyor-grafana 3000:3000
```
Next, open the following URL in your browser:
```text
http://127.0.0.1:3000/d/nats/nats-surveyor?refresh=5s&orgId=1
```
![surveyor](https://user-images.githubusercontent.com/26195/69106844-79fdd480-0a24-11ea-8e0c-213f251fad90.gif)

nats-kubernetes/nats-cluster-and-cert-manager.md (new file)

@@ -0,0 +1,166 @@
# NATS Cluster and Cert Manager
First, install cert-manager into the cluster:
```sh
kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/cert-manager.yaml
```
```yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
name: selfsigning
spec:
selfSigned: {}
```
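A minimal sketch of applying the manifest, assuming it was saved locally as `selfsigning-issuer.yaml` (a hypothetical filename); the remaining manifests in this guide can be applied the same way:
```sh
# Apply the self-signing ClusterIssuer and confirm it exists
# (the filename is an assumption; use whatever file you saved the manifest to)
kubectl apply -f selfsigning-issuer.yaml
kubectl get clusterissuer selfsigning
```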
```text
clusterissuer.certmanager.k8s.io/selfsigning unchanged
```
``` yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
name: nats-ca
spec:
secretName: nats-ca
duration: 8736h # 1 year
renewBefore: 240h # 10 days
issuerRef:
name: selfsigning
kind: ClusterIssuer
commonName: nats-ca
organization:
- Your organization
isCA: true
```
```text
certificate.certmanager.k8s.io/nats-ca configured
```
``` yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: Issuer
metadata:
name: nats-ca
spec:
ca:
secretName: nats-ca
```
```text
issuer.certmanager.k8s.io/nats-ca created
```
``` yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
name: nats-server-tls
spec:
secretName: nats-server-tls
duration: 2160h # 90 days
renewBefore: 240h # 10 days
issuerRef:
name: nats-ca
kind: Issuer
organization:
- Your organization
commonName: nats.default.svc.cluster.local
dnsNames:
- nats.default.svc
```
```text
certificate.certmanager.k8s.io/nats-server-tls created
```
``` yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
name: nats-routes-tls
spec:
secretName: nats-routes-tls
duration: 2160h # 90 days
renewBefore: 240h # 10 days
issuerRef:
name: nats-ca
kind: Issuer
organization:
- Your organization
commonName: "*.nats-mgmt.default.svc.cluster.local"
dnsNames:
- "*.nats-mgmt.default.svc"
```
```text
certificate.certmanager.k8s.io/nats-routes-tls configured
```
``` yaml
apiVersion: "nats.io/v1alpha2"
kind: "NatsCluster"
metadata:
name: "nats"
spec:
# Number of nodes in the cluster
size: 3
version: "1.4.1"
tls:
# Certificates to secure the NATS client connections:
serverSecret: "nats-server-tls"
# Name of the CA in serverSecret
serverSecretCAFileName: "ca.crt"
# Name of the key in serverSecret
serverSecretKeyFileName: "tls.key"
# Name of the certificate in serverSecret
serverSecretCertFileName: "tls.crt"
# Certificates to secure the routes.
routesSecret: "nats-routes-tls"
# Name of the CA in routesSecret
routesSecretCAFileName: "ca.crt"
# Name of the key in routesSecret
routesSecretKeyFileName: "tls.key"
# Name of the certificate in routesSecret
routesSecretCertFileName: "tls.crt"
```
```text
natscluster.nats.io/nats created
```
``` sh
kubectl get pods -o wide
```
```text
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
nats-1 1/1 Running 0 4s 172.17.0.8 minikube <none>
nats-2 1/1 Running 0 3s 172.17.0.9 minikube <none>
nats-3 1/1 Running 0 2s 172.17.0.10 minikube <none>
```
``` sh
kubectl logs nats-1
```
```text
[1] 2019/05/08 22:35:11.192781 [INF] Starting nats-server version 1.4.1
[1] 2019/05/08 22:35:11.192819 [INF] Git commit [3e64f0b]
[1] 2019/05/08 22:35:11.192952 [INF] Starting http monitor on 0.0.0.0:8222
[1] 2019/05/08 22:35:11.192981 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2019/05/08 22:35:11.192987 [INF] TLS required for client connections
[1] 2019/05/08 22:35:11.192989 [INF] Server is ready
[1] 2019/05/08 22:35:11.193123 [INF] Listening for route connections on 0.0.0.0:6222
[1] 2019/05/08 22:35:12.487758 [INF] 172.17.0.9:49444 - rid:1 - Route connection created
[1] 2019/05/08 22:35:13.450067 [INF] 172.17.0.10:46286 - rid:2 - Route connection created
```

nats-kubernetes/operator-tls-setup-with-cfssl.md (new file)

@@ -0,0 +1,422 @@
# Secure NATS Cluster in Kubernetes using the NATS Operator
## Features
- Clients TLS setup
- TLS based auth certs via secret
  - Reloading supported by only updating secret
- Routes TLS setup
- Advertising public IP per NATS server for external access
### Creating the Certificates
#### Generating the Root CA Certs
Create the CSR configuration for the root CA and save it as `certs/ca-csr.json`:
```js
{
"CN": "nats.io",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"OU": "nats.io"
}
]
}
```
```sh
(
cd certs
# CA certs
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
)
```
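Optionally, the generated root CA can be inspected with openssl as a sanity check (not part of the original steps):
```sh
# Print the subject, issuer and validity period of the root CA
openssl x509 -in certs/ca.pem -noout -subject -issuer -dates
```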
Set up the profiles for the root CA. We will have three main profiles: one for the clients connecting, one for the servers, and another for the full mesh routing connections between the servers. Save the following as `certs/ca-config.json`:
```js
{
"signing": {
"default": {
"expiry": "43800h"
},
"profiles": {
"server": {
"expiry": "43800h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
},
"client": {
"expiry": "43800h",
"usages": [
"signing",
"key encipherment",
"client auth"
]
},
"route": {
"expiry": "43800h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
```
#### Generating the NATS server certs
First, we generate the certificates for the server. Save the following CSR configuration as `certs/server.json`:
```js
{
"CN": "nats.io",
"hosts": [
"localhost",
"*.nats-cluster.default.svc",
"*.nats-cluster-mgmt.default.svc",
"nats-cluster",
"nats-cluster-mgmt",
"nats-cluster.default.svc",
"nats-cluster-mgmt.default.svc",
"nats-cluster.default.svc.cluster.local",
"nats-cluster-mgmt.default.svc.cluster.local",
"*.nats-cluster.default.svc.cluster.local",
"*.nats-cluster-mgmt.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"OU": "Operator"
}
]
}
```
```sh
(
# Generating the peer certificates
cd certs
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
)
```
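Optionally, verify that the server certificate contains the expected DNS SANs for the cluster services (again, just a sanity check):
```sh
# List the Subject Alternative Names embedded in the server certificate
openssl x509 -in certs/server.pem -noout -text | grep -A1 "Subject Alternative Name"
```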
#### Generating the NATS server routes certs
We will also be setting up TLS for the full mesh routes. Save the following CSR configuration as `certs/route.json`:
```js
{
"CN": "nats.io",
"hosts": [
"localhost",
"*.nats-cluster.default.svc",
"*.nats-cluster-mgmt.default.svc",
"nats-cluster",
"nats-cluster-mgmt",
"nats-cluster.default.svc",
"nats-cluster-mgmt.default.svc",
"nats-cluster.default.svc.cluster.local",
"nats-cluster-mgmt.default.svc.cluster.local",
"*.nats-cluster.default.svc.cluster.local",
"*.nats-cluster-mgmt.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"OU": "Operator"
}
]
}
```
```sh
# Generating the peer certificates
(
cd certs
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=route route.json | cfssljson -bare route
)
```
#### Generating the certs for the clients (CNCF and ACME)
Save the following CSR configuration as `certs/client.json`:
```js
{
"CN": "nats.io",
"hosts": [""],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"OU": "CNCF"
}
]
}
```
```sh
(
cd certs
# Generating NATS client certs
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client
)
```
#### Creating the Kubernetes secrets with kubectl
```sh
cd certs
kubectl create secret generic nats-tls-example --from-file=ca.pem --from-file=server-key.pem --from-file=server.pem
kubectl create secret generic nats-tls-routes-example --from-file=ca.pem --from-file=route-key.pem --from-file=route.pem
kubectl create secret generic nats-tls-client-example --from-file=ca.pem --from-file=client-key.pem --from-file=client.pem
```
### Create the Auth secret
Save the following authorization configuration as `users.json`:
```js
{
"users": [
{ "username": "CN=nats.io,OU=ACME" },
{ "username": "CN=nats.io,OU=CNCF",
"permissions": {
"publish": ["hello.*"],
"subscribe": ["hello.world"]
}
}
],
"default_permissions": {
"publish": ["SANDBOX.*"],
"subscribe": ["PUBLIC.>"]
}
}
```
```sh
kubectl create secret generic nats-tls-users --from-file=users.json
```
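At this point, all of the secrets referenced later by the NatsCluster spec should exist; a quick way to confirm:
```sh
# These secret names match the ones created earlier in this guide
kubectl get secrets nats-tls-example nats-tls-routes-example nats-tls-client-example nats-tls-users
```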
### Create a cluster with TLS
```sh
echo '
apiVersion: "nats.io/v1alpha2"
kind: "NatsCluster"
metadata:
name: "nats-cluster"
spec:
size: 3
# Using custom edge nats server image for TLS verify and map support.
serverImage: "wallyqs/nats-server"
version: "edge-2.0.0-RC5"
tls:
enableHttps: true
# Certificates to secure the NATS client connections:
serverSecret: "nats-tls-example"
# Certificates to secure the routes.
routesSecret: "nats-tls-routes-example"
auth:
tlsVerifyAndMap: true
clientsAuthSecret: "nats-tls-users"
# How long to wait for authentication
clientsAuthTimeout: 5
pod:
# To be able to reload the secret changes
enableConfigReload: true
reloaderImage: connecteverything/nats-server-config-reloader
# Bind the port 4222 as the host port to allow external access.
enableClientsHostPort: true
# Initializer container that resolves the external IP from the
# container where it is running.
advertiseExternalIP: true
# Image of container that resolves external IP from K8S API
bootconfigImage: "wallyqs/nats-boot-config"
bootconfigImageTag: "0.5.0"
# Service account required to be able to find the external IP
template:
spec:
serviceAccountName: "nats-server"
' | kubectl apply -f -
```
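To check that the operator created the server pods, list them and tail the logs of one of them; the `nats_cluster` label selector and the `nats-cluster-1` pod name below are assumptions about how the NATS Operator labels and names pods:
```sh
# List the pods of the cluster (label selector is an assumption)
kubectl get pods -l nats_cluster=nats-cluster -o wide

# Check that TLS is required for client connections (pod name is an assumption)
kubectl logs nats-cluster-1
```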
### Create an app using the certs
#### Adding a new pod which uses the certificates
The following Go client application connects to the cluster using the generated client certificates:
```go
package main
import (
"encoding/json"
"flag"
"fmt"
"log"
"time"
"github.com/nats-io/go-nats"
"github.com/nats-io/nuid"
)
func main() {
var (
serverList string
rootCACertFile string
clientCertFile string
clientKeyFile string
)
	flag.StringVar(&serverList, "s", "tls://nats-1.nats-cluster.default.svc:4222", "List of NATS servers available")
flag.StringVar(&rootCACertFile, "cacert", "./certs/ca.pem", "Root CA Certificate File")
flag.StringVar(&clientCertFile, "cert", "./certs/client.pem", "Client Certificate File")
flag.StringVar(&clientKeyFile, "key", "./certs/client-key.pem", "Client Private key")
flag.Parse()
log.Println("NATS endpoint:", serverList)
log.Println("Root CA:", rootCACertFile)
log.Println("Client Cert:", clientCertFile)
log.Println("Client Key:", clientKeyFile)
// Connect options
rootCA := nats.RootCAs(rootCACertFile)
clientCert := nats.ClientCert(clientCertFile, clientKeyFile)
alwaysReconnect := nats.MaxReconnects(-1)
var nc *nats.Conn
var err error
for {
nc, err = nats.Connect(serverList, rootCA, clientCert, alwaysReconnect)
if err != nil {
log.Printf("Error while connecting to NATS, backing off for a sec... (error: %s)", err)
time.Sleep(1 * time.Second)
continue
}
break
}
nc.Subscribe("discovery.*.status", func(m *nats.Msg) {
log.Printf("[Received on %q] %s", m.Subject, string(m.Data))
})
discoverySubject := fmt.Sprintf("discovery.%s.status", nuid.Next())
info := struct {
InMsgs uint64 `json:"in_msgs"`
OutMsgs uint64 `json:"out_msgs"`
Reconnects uint64 `json:"reconnects"`
CurrentServer string `json:"current_server"`
Servers []string `json:"servers"`
}{}
for range time.NewTicker(1 * time.Second).C {
stats := nc.Stats()
info.InMsgs = stats.InMsgs
info.OutMsgs = stats.OutMsgs
info.Reconnects = stats.Reconnects
info.CurrentServer = nc.ConnectedUrl()
info.Servers = nc.Servers()
payload, err := json.Marshal(info)
if err != nil {
log.Printf("Error marshalling data: %s", err)
}
err = nc.Publish(discoverySubject, payload)
if err != nil {
log.Printf("Error during publishing: %s", err)
}
nc.Flush()
}
}
```
The application can be containerized with the following Dockerfile:
```dockerfile
FROM golang:1.11.0-alpine3.8 AS builder
COPY . /go/src/github.com/nats-io/nats-kubernetes/examples/nats-cluster-routes-tls/app
WORKDIR /go/src/github.com/nats-io/nats-kubernetes/examples/nats-cluster-routes-tls/app
RUN apk add --update git
RUN go get -u github.com/nats-io/go-nats
RUN go get -u github.com/nats-io/nuid
RUN CGO_ENABLED=0 go build -o nats-client-app -v -a ./client.go
FROM scratch
COPY --from=builder /go/src/github.com/nats-io/nats-kubernetes/examples/nats-cluster-routes-tls/app/nats-client-app /nats-client-app
ENTRYPOINT ["/nats-client-app"]
```
```sh
docker build . -t wallyqs/nats-client-app
docker run wallyqs/nats-client-app
docker push wallyqs/nats-client-app
```
Pod spec:
```sh
echo '
apiVersion: apps/v1beta2
kind: Deployment
# The name of the deployment
metadata:
name: nats-client-app
spec:
# This selector has to match the template.metadata.labels section
# which is below in the PodSpec
selector:
matchLabels:
name: nats-client-app
# Number of instances
replicas: 1
# PodSpec
template:
metadata:
labels:
name: nats-client-app
spec:
volumes:
- name: "client-tls-certs"
secret:
secretName: "nats-tls-client-example"
containers:
- name: nats-client-app
        command: ["/nats-client-app", "-s", "tls://nats-cluster.default.svc:4222", "-cacert", "/etc/nats-client-tls-certs/ca.pem", "-cert", "/etc/nats-client-tls-certs/client.pem", "-key", "/etc/nats-client-tls-certs/client-key.pem"]
image: wallyqs/nats-client-app:latest
imagePullPolicy: Always
volumeMounts:
- name: "client-tls-certs"
mountPath: "/etc/nats-client-tls-certs/"
' | kubectl apply -f -
```
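Once the deployment is running, its logs should show the app connecting over TLS and publishing the periodic discovery messages (a sketch; the exact output will differ):
```sh
# Tail the logs of the client application
kubectl logs deployment/nats-client-app
```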

nats-kubernetes/prometheus-and-nats-operator.md (new file)

@@ -0,0 +1,297 @@
# Prometheus Operator + NATS Operator
## Installing the Operators
Install the NATS Operator:
``` sh
$ kubectl apply -f https://raw.githubusercontent.com/nats-io/nats-operator/master/deploy/00-prereqs.yaml
$ kubectl apply -f https://raw.githubusercontent.com/nats-io/nats-operator/master/deploy/10-deployment.yaml
```
Install the Prometheus Operator along with its RBAC definition (prometheus-operator service account):
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.30.0
name: prometheus-operator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus-operator
subjects:
- kind: ServiceAccount
name: prometheus-operator
namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.30.0
name: prometheus-operator
rules:
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- '*'
- apiGroups:
- monitoring.coreos.com
resources:
- alertmanagers
- prometheuses
- prometheuses/finalizers
- alertmanagers/finalizers
- servicemonitors
- podmonitors
- prometheusrules
verbs:
- '*'
- apiGroups:
- apps
resources:
- statefulsets
verbs:
- '*'
- apiGroups:
- ""
resources:
- configmaps
- secrets
verbs:
- '*'
- apiGroups:
- ""
resources:
- pods
verbs:
- list
- delete
- apiGroups:
- ""
resources:
- services
- services/finalizers
- endpoints
verbs:
- get
- create
- update
- delete
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- namespaces
verbs:
- get
- list
- watch
---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.30.0
name: prometheus-operator
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
template:
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.30.0
spec:
containers:
- args:
- --kubelet-service=kube-system/kubelet
- --logtostderr=true
- --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
- --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.30.0
image: quay.io/coreos/prometheus-operator:v0.30.0
name: prometheus-operator
ports:
- containerPort: 8080
name: http
resources:
limits:
cpu: 200m
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
nodeSelector:
beta.kubernetes.io/os: linux
securityContext:
runAsNonRoot: true
runAsUser: 65534
serviceAccountName: prometheus-operator
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.30.0
name: prometheus-operator
namespace: default
---
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.30.0
name: prometheus-operator
namespace: default
spec:
clusterIP: None
ports:
- name: http
port: 8080
targetPort: http
selector:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
```
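A sketch of applying the manifest above and waiting for the operator to become ready, assuming it was saved as `prometheus-operator.yaml` (a hypothetical filename):
```sh
# Apply the Prometheus Operator manifests and wait for the rollout to finish
kubectl apply -f prometheus-operator.yaml
kubectl rollout status deployment/prometheus-operator
```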
## Create a NATS Cluster Instance
```yaml
apiVersion: "nats.io/v1alpha2"
kind: "NatsCluster"
metadata:
name: "nats-cluster"
spec:
size: 3
version: "1.4.1"
pod:
enableMetrics: true
metricsImage: "synadia/prometheus-nats-exporter"
metricsImageTag: "0.3.0"
```
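Once the NATS pods are up, the exporter sidecar can be checked directly; the `nats-cluster-1` pod name and the exporter's 7777 port are assumptions:
```sh
# Port-forward to one of the NATS pods and fetch the exporter metrics
# (pod name and exporter port are assumptions)
kubectl port-forward nats-cluster-1 7777:7777 &
curl -s http://127.0.0.1:7777/metrics | head
```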
## Create a Prometheus instance
### Create RBAC for the Prometheus instance
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: prometheus
rules:
- apiGroups: [""]
resources:
- nodes
- services
- endpoints
- pods
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources:
- configmaps
verbs: ["get"]
- nonResourceURLs: ["/metrics"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: prometheus
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus
subjects:
- kind: ServiceAccount
name: prometheus
namespace: default
```
### Create the Prometheus instance
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
name: prometheus
spec:
serviceAccountName: prometheus
serviceMonitorSelector:
matchLabels:
app: nats
nats_cluster: nats-cluster
resources:
requests:
memory: 400Mi
enableAdminAPI: true
```
## Create the ServiceMonitor for the NATS cluster
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: nats-cluster
labels:
app: nats
nats_cluster: nats-cluster
spec:
selector:
matchLabels:
app: nats
nats_cluster: nats-cluster
endpoints:
- port: metrics
```
## Confirm
```sh
kubectl port-forward prometheus-prometheus-0 9090:9090
```
Results:
<img height="400" width="1000" src="https://user-images.githubusercontent.com/26195/59470419-2066fd80-8e27-11e9-9e3e-250296a091da.png">
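With the port-forward from above still running, the Prometheus HTTP API can also be queried to confirm that the NATS endpoints were discovered as scrape targets:
```sh
# List the health of the active scrape targets via the Prometheus API
curl -s http://127.0.0.1:9090/api/v1/targets | grep -o '"health":"[a-z]*"'
```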

nats-kubernetes/stan-ft-k8s-aws.md (new file)

@@ -0,0 +1,390 @@
# Creating a NATS Streaming cluster in K8S with FT mode
## Preparation
First, we need a Kubernetes cluster on a provider that offers storage with `ReadWriteMany` access. In this short guide,
we will create the cluster on AWS and use EFS for that filesystem:
```sh
# Create a 3-node Kubernetes cluster
eksctl create cluster --name nats-eks-cluster \
--nodes 3 \
--node-type=t3.large \
--region=us-east-2
# Get the credentials for your cluster
eksctl utils write-kubeconfig --name nats-eks-cluster --region us-east-2
```
For FT mode to work, we will need to create an EFS volume that
can be shared by more than one pod. Go into the [AWS console](https://us-east-2.console.aws.amazon.com/efs/home?region=us-east-2#/wizard/1) and create one, making sure that it is in a security group that the k8s nodes can access. For clusters created via eksctl, this will be the security group named `ClusterSharedNodeSecurityGroup`:
<img width="1063" alt="Screen Shot 2019-12-04 at 11 25 08 AM" src="https://user-images.githubusercontent.com/26195/70177488-5ef0bd00-16d2-11ea-9cf3-e0c3196bc7da.png">
<img width="1177" alt="Screen Shot 2019-12-04 at 12 40 13 PM" src="https://user-images.githubusercontent.com/26195/70179769-9497a500-16d6-11ea-9e18-2a8588a71819.png">
### Creating the EFS provisioner
Take note of the FileSystemId and the DNS name of the EFS volume; we will use those values to create an EFS provisioner controller within the K8S cluster:
<img width="852" alt="Screen Shot 2019-12-04 at 12 08 35 PM" src="https://user-images.githubusercontent.com/26195/70177502-657f3480-16d2-11ea-9d00-b9a8c2f5320b.png">
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: efs-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: efs-provisioner-runner
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: run-efs-provisioner
subjects:
- kind: ServiceAccount
name: efs-provisioner
# replace with namespace where provisioner is deployed
namespace: default
roleRef:
kind: ClusterRole
name: efs-provisioner-runner
apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-efs-provisioner
rules:
- apiGroups: [""]
resources: ["endpoints"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-efs-provisioner
subjects:
- kind: ServiceAccount
name: efs-provisioner
# replace with namespace where provisioner is deployed
namespace: default
roleRef:
kind: Role
name: leader-locking-efs-provisioner
apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
name: efs-provisioner
data:
file.system.id: fs-c22a24bb
aws.region: us-east-2
provisioner.name: synadia.com/aws-efs
dns.name: ""
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: efs-provisioner
spec:
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
app: efs-provisioner
spec:
serviceAccountName: efs-provisioner
containers:
- name: efs-provisioner
image: quay.io/external_storage/efs-provisioner:latest
env:
- name: FILE_SYSTEM_ID
valueFrom:
configMapKeyRef:
name: efs-provisioner
key: file.system.id
- name: AWS_REGION
valueFrom:
configMapKeyRef:
name: efs-provisioner
key: aws.region
- name: DNS_NAME
valueFrom:
configMapKeyRef:
name: efs-provisioner
key: dns.name
- name: PROVISIONER_NAME
valueFrom:
configMapKeyRef:
name: efs-provisioner
key: provisioner.name
volumeMounts:
- name: pv-volume
mountPath: /efs
volumes:
- name: pv-volume
nfs:
server: fs-c22a24bb.efs.us-east-2.amazonaws.com
path: /
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: aws-efs
provisioner: synadia.com/aws-efs
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: efs
annotations:
volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Mi
```
Result of deploying the manifest:
```sh
serviceaccount/efs-provisioner created
clusterrole.rbac.authorization.k8s.io/efs-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-efs-provisioner created
role.rbac.authorization.k8s.io/leader-locking-efs-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-efs-provisioner created
configmap/efs-provisioner created
deployment.extensions/efs-provisioner created
storageclass.storage.k8s.io/aws-efs created
persistentvolumeclaim/efs created
```
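Before moving on, it is worth confirming that the `efs` PersistentVolumeClaim was bound by the provisioner:
```sh
# The claim should report a STATUS of Bound once the provisioner has created the volume
kubectl get pvc efs
```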
### Setting up the NATS Streaming cluster
Now create a NATS Streaming cluster with FT mode enabled, running NATS in embedded mode and mounting the EFS volume:
```yaml
---
apiVersion: v1
kind: Service
metadata:
name: stan
labels:
app: stan
spec:
selector:
app: stan
clusterIP: None
ports:
- name: client
port: 4222
- name: cluster
port: 6222
- name: monitor
port: 8222
- name: metrics
port: 7777
---
apiVersion: v1
kind: ConfigMap
metadata:
name: stan-config
data:
stan.conf: |
http: 8222
cluster {
port: 6222
routes [
nats://stan:6222
]
cluster_advertise: $CLUSTER_ADVERTISE
connect_retries: 10
}
streaming {
id: test-cluster
store: file
dir: /data/stan/store
ft_group_name: "test-cluster"
file_options {
buffer_size: 32mb
sync_on_flush: false
slice_max_bytes: 512mb
parallel_recovery: 64
}
store_limits {
max_channels: 10
max_msgs: 0
max_bytes: 256gb
max_age: 1h
max_subs: 128
}
}
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: stan
labels:
app: stan
spec:
selector:
matchLabels:
app: stan
serviceName: stan
replicas: 3
volumeClaimTemplates:
- metadata:
name: efs
annotations:
volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
accessModes: [ "ReadWriteMany" ]
resources:
requests:
storage: 1Mi
template:
metadata:
labels:
app: stan
spec:
# STAN Server
terminationGracePeriodSeconds: 30
containers:
- name: stan
image: nats-streaming:latest
ports:
# In case of NATS embedded mode expose these ports
- containerPort: 4222
name: client
- containerPort: 6222
name: cluster
- containerPort: 8222
name: monitor
args:
- "-sc"
- "/etc/stan-config/stan.conf"
# Required to be able to define an environment variable
# that refers to other environment variables. This env var
# is later used as part of the configuration file.
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: CLUSTER_ADVERTISE
value: $(POD_NAME).stan.$(POD_NAMESPACE).svc
volumeMounts:
- name: config-volume
mountPath: /etc/stan-config
- name: efs
mountPath: /data/stan
resources:
requests:
cpu: 0
livenessProbe:
httpGet:
path: /
port: 8222
initialDelaySeconds: 10
timeoutSeconds: 5
- name: metrics
image: synadia/prometheus-nats-exporter:0.5.0
args:
- -connz
- -routez
- -subz
- -varz
- -channelz
- -serverz
- http://localhost:8222
ports:
- containerPort: 7777
name: metrics
volumes:
- name: config-volume
configMap:
name: stan-config
```
Your cluster will now look something like this:
```sh
kubectl get pods
NAME READY STATUS RESTARTS AGE
efs-provisioner-6b7866dd4-4k5wx 1/1 Running 0 21m
stan-0 2/2 Running 0 6m35s
stan-1 2/2 Running 0 4m56s
stan-2 2/2 Running 0 4m42s
```
If everything was set up properly, one of the servers will be the active node:
```sh
$ kubectl logs stan-0 -c stan
[1] 2019/12/04 20:40:40.429359 [INF] STREAM: Starting nats-streaming-server[test-cluster] version 0.16.2
[1] 2019/12/04 20:40:40.429385 [INF] STREAM: ServerID: 7j3t3Ii7e2tifWqanYKwFX
[1] 2019/12/04 20:40:40.429389 [INF] STREAM: Go version: go1.11.13
[1] 2019/12/04 20:40:40.429392 [INF] STREAM: Git commit: [910d6e1]
[1] 2019/12/04 20:40:40.454212 [INF] Starting nats-server version 2.0.4
[1] 2019/12/04 20:40:40.454360 [INF] Git commit [c8ca58e]
[1] 2019/12/04 20:40:40.454522 [INF] Starting http monitor on 0.0.0.0:8222
[1] 2019/12/04 20:40:40.454830 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2019/12/04 20:40:40.454841 [INF] Server id is NB3A5RSGABLJP3WUYG6VYA36ZGE7MP5GVQIQVRG6WUYSRJA7B2NNMW57
[1] 2019/12/04 20:40:40.454844 [INF] Server is ready
[1] 2019/12/04 20:40:40.456360 [INF] Listening for route connections on 0.0.0.0:6222
[1] 2019/12/04 20:40:40.481927 [INF] STREAM: Starting in standby mode
[1] 2019/12/04 20:40:40.488193 [ERR] Error trying to connect to route (attempt 1): dial tcp: lookup stan on 10.100.0.10:53: no such host
[1] 2019/12/04 20:40:41.489688 [INF] 192.168.52.76:40992 - rid:6 - Route connection created
[1] 2019/12/04 20:40:41.489788 [INF] 192.168.52.76:40992 - rid:6 - Router connection closed
[1] 2019/12/04 20:40:41.489695 [INF] 192.168.52.76:6222 - rid:5 - Route connection created
[1] 2019/12/04 20:40:41.489955 [INF] 192.168.52.76:6222 - rid:5 - Router connection closed
[1] 2019/12/04 20:40:41.634944 [INF] STREAM: Server is active
[1] 2019/12/04 20:40:41.634976 [INF] STREAM: Recovering the state...
[1] 2019/12/04 20:40:41.655526 [INF] STREAM: No recovered state
[1] 2019/12/04 20:40:41.671435 [INF] STREAM: Message store is FILE
[1] 2019/12/04 20:40:41.671448 [INF] STREAM: Store location: /data/stan/store
[1] 2019/12/04 20:40:41.671524 [INF] STREAM: ---------- Store Limits ----------
[1] 2019/12/04 20:40:41.671527 [INF] STREAM: Channels: 10
[1] 2019/12/04 20:40:41.671529 [INF] STREAM: --------- Channels Limits --------
[1] 2019/12/04 20:40:41.671531 [INF] STREAM: Subscriptions: 128
[1] 2019/12/04 20:40:41.671533 [INF] STREAM: Messages : unlimited
[1] 2019/12/04 20:40:41.671535 [INF] STREAM: Bytes : 256.00 GB
[1] 2019/12/04 20:40:41.671537 [INF] STREAM: Age : 1h0m0s
[1] 2019/12/04 20:40:41.671539 [INF] STREAM: Inactivity : unlimited *
[1] 2019/12/04 20:40:41.671541 [INF] STREAM: ----------------------------------
[1] 2019/12/04 20:40:41.671546 [INF] STREAM: Streaming Server is ready
```
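To exercise the FT failover, delete the currently active pod and watch one of the standby servers take over; which standby becomes active is not deterministic, so check the logs of both:
```sh
# Delete the active server...
kubectl delete pod stan-0

# ...and check the standby servers; one of them should eventually log
# "STREAM: Server is active" once it takes over
kubectl logs stan-1 -c stan | tail -n 5
kubectl logs stan-2 -c stan | tail -n 5
```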