Installing InfluxDB with High Availability

Foreword

It is advisable to have high availability when serving data. This guide assumes Debian GNU/Linux 10. With InfluxDB, replication is achieved by paying for a managed InfluxDB instance, buying the Enterprise version, or running an influx-relay, which offers bare-minimum high availability.

In this guide we will create 3 nodes (the minimum to have high availability) serving influxdb by means of a load balancer and an influx-relay.

Installing InfluxDB

A minimal set of instructions to install InfluxDB version 1.8.3 (as of this writing, v2.0 is still in beta):

wget https://dl.influxdata.com/influxdb/releases/influxdb_1.8.3_amd64.deb
dpkg -i influxdb_1.8.3_amd64.deb

Open influx cli and create your user

Start the InfluxDB service:

systemctl start influxd

Then call the influx CLI from the terminal and enter the following statement, choosing a strong password:

CREATE USER root WITH PASSWORD '<password>' WITH ALL PRIVILEGES

Basic configuration

We need to edit /etc/influxdb/influxdb.conf to enable authentication. Look for the [http] block and set the following options:

[http]
  # Determines whether HTTP endpoint is enabled.
  # enabled = true
...
  # The bind address used by the HTTP service.
  bind-address = ":8086"
...
  # Determines whether user authentication is enabled over HTTP/HTTPS.
  auth-enabled = true 
...
  # Determines whether HTTP request logging is enabled.
  log-enabled = true
...
  # If influxd is unable to access the specified path, it will log an error and fall back to writing
  # the request log to stderr.
  access-log-path = "/var/log/influxdb/access.log"

Once edited, restart the InfluxDB service for the new configuration to take effect:

systemctl restart influxd.service

Now install InfluxDB on the remaining 2 nodes (3 is the minimum number of replicas to be considered “High Availability”) and configure them exactly the same way (even with the same root user in InfluxDB).

Configuring hosts

We need to modify each host’s /etc/hosts file to add a reference to the other hosts. The hosts file should have a section like this:

# InfluxDB nodes
xxx.xxx.xxx.xxx influxdb-node
xxx.xxx.xxx.xxx influxdb-node-02
xxx.xxx.xxx.xxx influxdb-node-03

where xxx.xxx.xxx.xxx represents the IP of the corresponding host.
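Since the same block has to end up on every node, it can help to generate it once instead of typing it three times. The following is only a sketch: the function name render_hosts_section is made up for illustration, and the addresses are placeholders from the 192.0.2.0/24 documentation range, not the real IPs of any node.

```python
# Placeholder addresses (192.0.2.0/24 is reserved for documentation);
# replace them with the real IPs of your nodes.
NODES = {
    "influxdb-node": "192.0.2.11",
    "influxdb-node-02": "192.0.2.12",
    "influxdb-node-03": "192.0.2.13",
}


def render_hosts_section(nodes):
    """Return the '# InfluxDB nodes' block to append to /etc/hosts."""
    lines = ["# InfluxDB nodes"]
    for hostname, ip in nodes.items():
        lines.append(f"{ip} {hostname}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the block so it can be reviewed before appending it to /etc/hosts.
    print(render_hosts_section(NODES), end="")
```

The script only prints the block; appending it to /etc/hosts is left as a manual step so the result can be checked first.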

Installing the influx-relay

As mentioned in the foreword, an influx-relay offers bare-minimum high availability by performing distributed writes among the different InfluxDB replicas. We will use https://github.com/influxdata/influxdb-relay.

You can either install Go on the nodes or precompile the binaries elsewhere. I made use of my existing local Go installation and followed these steps:

  • Install: go get -u github.com/influxdata/influxdb-relay
  • Copy: cp $GOPATH/src/github.com/influxdata/influxdb-relay/sample.toml ./relay.toml

Then I copied $GOPATH/bin/influxdb-relay (it was compiled for amd64) to all the nodes that will become InfluxDB replicas.

Now, according to the readme of influx-relay,

The architecture is fairly simple and consists of a load balancer, two or more InfluxDB Relay processes and two or more InfluxDB processes. The load balancer should point UDP traffic and HTTP POST requests with the path /write to the two relays while pointing GET requests with the path /query to the two InfluxDB servers.

        ┌─────────────────┐                 
        │writes & queries │                 
        └─────────────────┘                 
         ┌───────────────┐                  
         │               │                  
┌────────│ Load Balancer │─────────┐        
│        │               │         │        
│        └──────┬─┬──────┘         │        
│               │ │                │        
│               │ │                │        
│        ┌──────┘ └────────┐       │        
│        │ ┌─────────────┐ │       │┌──────┐
│        │ │/write or UDP│ │       ││/query│
│        ▼ └─────────────┘ ▼       │└──────┘
│  ┌──────────┐      ┌──────────┐  │        
│  │ InfluxDB │      │ InfluxDB │  │        
│  │ Relay    │      │ Relay    │  │        
│  └──┬────┬──┘      └────┬──┬──┘  │        
│     │    |              |  │     │        
│     |  ┌─┼──────────────┘  |     │        
│     │  │ └──────────────┐  │     │        
│     ▼  ▼                ▼  ▼     │        
│  ┌──────────┐      ┌──────────┐  │        
│  │          │      │          │  │        
└─▶│ InfluxDB │      │ InfluxDB │◀─┘        
   │          │      │          │           
   └──────────┘      └──────────┘           

We will pick a master node and install an nginx upstream there, pointing to the influx relays and the InfluxDB instances. To do so, I first copied both the compiled binary and the configuration to the target hosts.

I edited relay.toml to look like this:

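# relay.toml
[[http]]
name = "influxdb-node-http"
# Listening on 127.0.0.1 makes this relay reachable only from the local host.
# A load balancer running on another node needs an address it can reach (e.g. "0.0.0.0:9086").
bind-addr = "127.0.0.1:9086"
output = [
    { name="influxdb-node-01", location = "http://127.0.0.1:8086/write" },
    { name="influxdb-node-02", location = "http://influxdb-node-02:8086/write" },
    { name="influxdb-node-03", location = "http://influxdb-node-03:8086/write" },
]

[[udp]]
name = "influxdb-node-udp"
bind-addr = "127.0.0.1:9086"
read-buffer = 0 # default
# Each location must be a UDP listener on the InfluxDB side. InfluxDB 1.x ships with its
# [[udp]] input disabled and bound to :8089, so port 8086 here assumes a listener set up on it.
output = [
    { name="influxdb-node-01", location="127.0.0.1:8086", mtu=512 },
    { name="influxdb-node-02", location="influxdb-node-02:8086", mtu=1024 },
    { name="influxdb-node-03", location="influxdb-node-03:8086", mtu=1024 },
]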
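Each node gets its own relay.toml in which the local InfluxDB is addressed through 127.0.0.1 and the other two by hostname. A small sketch of how the [[http]] output list could be generated per node follows. The function name http_outputs and the NODES mapping are hypothetical; the mapping assumes that the relay name influxdb-node-01 corresponds to the host called influxdb-node in /etc/hosts.

```python
# Relay output name -> hostname from /etc/hosts (assumed mapping).
NODES = {
    "influxdb-node-01": "influxdb-node",
    "influxdb-node-02": "influxdb-node-02",
    "influxdb-node-03": "influxdb-node-03",
}


def http_outputs(nodes, local, port=8086):
    """Build the 'output = [...]' list of the [[http]] section for one node.

    `local` is the output name of the node the relay runs on; that node is
    addressed through 127.0.0.1, the others through their hostname.
    """
    entries = []
    for name, hostname in nodes.items():
        host = "127.0.0.1" if name == local else hostname
        entries.append(
            f'    {{ name="{name}", location = "http://{host}:{port}/write" }},'
        )
    return "output = [\n" + "\n".join(entries) + "\n]"


if __name__ == "__main__":
    # Output list for the relay running on the second node.
    print(http_outputs(NODES, local="influxdb-node-02"))
```

Calling it with local="influxdb-node-01" reproduces the output list shown above; the [[udp]] section would follow the same pattern.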

As the docs say, we need a load balancer to route queries to the databases (which should all hold the same data, since the relay performs each write against all of them). An nginx configuration that appears to work is:

  upstream influxdb_relay {
          server 127.0.0.1:9086;
          server influxdb-node-02:9086;
          server influxdb-node-03:9086;
  }

  upstream influxdb {
          server 127.0.0.1:8086;
          server influxdb-node-02:8086;
          server influxdb-node-03:8086;
  }

  server {
    listen 80 default_server;
    listen [::]:80 default_server;

    server_name _;

    location ~* ^/write?(.+)$ {
      if ($query_string) {
        proxy_pass http://influxdb_relay/write?$query_string;
      }
    }

    location ~* ^/query?(.+)$ {
      if ($query_string) {
        proxy_pass http://influxdb/query?$query_string;
      }
    }

    # We can use this location to debug the server. 
    # Feel free to remove it once you are convinced of how the locations work
    location ~* ^/debug?(.+)$ {
      add_header Content-Type text/plain;
      
      if ($query_string) {
        return 200 "query_string=$query_string";
      }
    }
  }
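One detail of these locations is easy to miss: nginx matches a regex location against the request path only, without the query string, so the "?" in ^/write?(.+)$ is a regex quantifier that makes the final "e" optional, not the query-string separator. The sketch below replays the three patterns with Python's re module standing in for the PCRE engine nginx uses (for patterns this simple they behave the same). The route function is a hypothetical helper, not part of nginx.

```python
import re

# Same patterns as in the nginx config; "~*" means case-insensitive matching.
# nginx tries regex locations in the order they appear in the file.
LOCATIONS = [
    ("influxdb_relay", re.compile(r"^/write?(.+)$", re.IGNORECASE)),
    ("influxdb", re.compile(r"^/query?(.+)$", re.IGNORECASE)),
    ("debug", re.compile(r"^/debug?(.+)$", re.IGNORECASE)),
]


def route(path):
    """Return the name of the first location whose regex matches `path`.

    `path` is the request path without the query string, which is what
    nginx tests a regex location against.
    """
    for name, pattern in LOCATIONS:
        if pattern.search(path):
            return name
    return None


if __name__ == "__main__":
    for path in ("/write", "/WRITE", "/writx", "/writ", "/query", "/ping"):
        print(f"{path} -> {route(path)}")
```

The path /write matches only because the regex engine backtracks and lets (.+) consume the "e"; as a side effect, paths such as /writx or /writes/anything match too. An exact-match location (location = /write) would be stricter, at the cost of changing the configuration shown above.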

Note that requests without a query string will produce 404 responses. Also note that the relay only handles writes (it only broadcasts writes). If we eventually wanted to broadcast requests that the relay does not allow, we could rely (<- “winks”) on the ngx_http_mirror_module (available since nginx 1.13.4):

server {
    location / {
        proxy_pass http://127.0.0.1:8000;
        mirror /s1;
        mirror /s2;
        mirror /s3;
    }
    location /s1 { internal; proxy_pass http://127.0.0.1:8001$request_uri; }
    location /s2 { internal; proxy_pass http://127.0.0.1:8002$request_uri; }
    location /s3 { internal; proxy_pass http://127.0.0.1:8003$request_uri; }
}

nginx will:

  • send the same request to all servers
  • wait for all of them to finish
  • respond with the http://127.0.0.1:8000 response (and ignore the others)

InfluxRelay service unit for systemd

Create the directory /opt/influxdb-relay and put the compiled executable influxdb-relay and the configuration file relay.toml there.

Create /lib/systemd/system/influxdb-relay.service with the following contents:

[Unit]
Description=InfluxDB Relay service
After=network.target

[Service]
User=root
Type=simple
WorkingDirectory=/opt/influxdb-relay
ExecStart=/opt/influxdb-relay/influxdb-relay -config /opt/influxdb-relay/relay.toml
Restart=on-failure

[Install]
WantedBy=multi-user.target

Load it, enable it and start it:

systemctl daemon-reload
systemctl enable influxdb-relay.service
systemctl start influxdb-relay.service

Verify that the service unit is running:

root@influxdb-node:/opt/influxdb-relay# systemctl status influxdb-relay.service 
● influxdb-relay.service - InfluxDB Relay service
   Loaded: loaded (/lib/systemd/system/influxdb-relay.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-10-27 17:42:59 UTC; 31s ago
 Main PID: 8135 (influxdb-relay)
    Tasks: 6 (limit: 2377)
   Memory: 6.7M
   CGroup: /system.slice/influxdb-relay.service
           └─8135 /opt/influxdb-relay/influxdb-relay -config /opt/influxdb-relay/relay.toml

Oct 27 17:42:59 influxdb-node systemd[1]: Started InfluxDB Relay service.
Oct 27 17:42:59 influxdb-node influxdb-relay[8135]: 2020/10/27 17:42:59 starting relays...
Oct 27 17:42:59 influxdb-node influxdb-relay[8135]: 2020/10/27 17:42:59 Starting UDP relay "influxdb-node-udp" o
Oct 27 17:42:59 influxdb-node influxdb-relay[8135]: 2020/10/27 17:42:59 Starting HTTP relay "influxdb-node-http"

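I’ve verified that the relay works by sending a couple of requests directly to it. The ports in the commands below (19086 for the relay, 18086 for InfluxDB) are the ones I used in my environment; with the configuration files above they would be 9086 and 8086.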

curl -i -XPOST http://localhost:19086/write --data-urlencode "q=CREATE DATABASE mydb"

This fails because the db parameter was not passed.

curl -i -XPOST 'http://localhost:19086/write?db=mydb' --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000' -u 'user:password'

This fails because the database mydb does not exist. We can verify that the relay and InfluxDB return the same response. InfluxDB itself answers:

curl -i -XPOST 'http://localhost:18086/write?db=mydb' --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000' -u 'user:password'

HTTP/1.1 404 Not Found
Content-Type: application/json
Request-Id: b05c293a-188d-11eb-aa70-1c1b0deb4028
X-Influxdb-Build: OSS
X-Influxdb-Error: database not found: "mydb"
X-Influxdb-Version: 1.8.3
X-Request-Id: b05c293a-188d-11eb-aa70-1c1b0deb4028
Date: Tue, 27 Oct 2020 19:50:39 GMT
Content-Length: 41

{"error":"database not found: \"mydb\""}

The relay answers the same way:

curl -i -XPOST 'http://localhost:19086/write?db=mydb' --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000' -u 'user:password'
HTTP/1.1 404 Not Found
Content-Length: 41
Date: Tue, 27 Oct 2020 19:16:21 GMT
Content-Type: text/plain; charset=utf-8

{"error":"database not found: \"mydb\""}
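For repeating these checks from a script, the request that the curl commands send can be assembled in a few lines. This is a minimal sketch under stated assumptions: build_point and write_url are hypothetical helper names, only a single float field called value is supported, and the base URL is the one from my environment. Sending the request and handling authentication are deliberately left out.

```python
from urllib.parse import urlencode


def escape_tag(text):
    """Escape commas, equals signs and spaces, which are special in tags."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def build_point(measurement, tags, value, timestamp_ns):
    """Build one line-protocol point with a single float field 'value'."""
    # Commas and spaces must be escaped in measurement names.
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(
        f",{escape_tag(key)}={escape_tag(val)}"
        for key, val in sorted(tags.items())
    )
    return f"{name}{tag_part} value={float(value)} {timestamp_ns}"


def write_url(base, db):
    """Return the /write URL for database `db` behind `base`."""
    return f"{base}/write?{urlencode({'db': db})}"


if __name__ == "__main__":
    point = build_point(
        "cpu_load_short",
        {"host": "server01", "region": "us-west"},
        0.64,
        1434055562000000000,
    )
    print(write_url("http://localhost:19086", "mydb"))
    print(point)
```

With the arguments shown, the script prints the same URL and payload as the curl example, so pointing it at the relay port and then at the InfluxDB port gives the two requests compared above.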