Setting up a Consul cluster
Service discovery with Consul
Consul does service discovery and health checking, and maintains the state of the system through leader election using the Raft consensus protocol. It uses Serf, an implementation of the gossip protocol, which spreads information (node-alive, node-dead, node-stale) like wildfire within the cluster. It exposes the discovered information via DNS and HTTP. Reference : get the whole picture of why we use consul
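Once the cluster described below is running, both interfaces are easy to try; a minimal sketch (postgres is a hypothetical service name here, and 8500/8600 are Consul's default HTTP and DNS ports):
# ask the HTTP API for everything Consul has discovered
curl http://localhost:8500/v1/catalog/services
# ask Consul's DNS server directly for a service's address
dig @127.0.0.1 -p 8600 postgres.service.consul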
In the code repository you cloned, there is a consul folder which also contains the scripts that follow.
Installing Consul on the master instance:
We are going to use the Consul Docker image, but we need some variables and configuration in place before starting Consul.
# Datacenter name can be anything; organizations usually have a name for each datacenter
# Why : Consul is datacenter-aware, so we need this parameter to start Consul
DATACENTER=dc1
# Domain Name Server for our AWS instances, something like 10.29.0.1
# Why : we need to route AWS internal DNS traffic to this server
AWS_DNS=$(grep -i nameserver /etc/resolv.conf | head -n1 | cut -d ' ' -f2)
# Docker bridge IP, something like 172.17.0.1
# Why : this gives us a DNS server address that is also reachable from inside containers
BRIDGE_IP=$(docker run --rm alpine sh -c "ip ro get 8.8.8.8 | awk '{print \$3}'")
# Domain name of AWS instances, something like ec2.internal
# Why : we need to know which traffic should go to the AWS DNS server specified in /etc/resolv.conf
DOMAIN_NAME=$(curl -s http://169.254.169.254/latest/meta-data/local-hostname | cut -d "." -f 2-)
# Private IP address of the instance (NOTE this down, it will be useful when installing agents)
# Why : we can now advertise ourselves in the Consul cluster
IP_ADDRESS=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
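A quick sanity check before moving on; all five variables should print non-empty values:
echo "DATACENTER=$DATACENTER AWS_DNS=$AWS_DNS BRIDGE_IP=$BRIDGE_IP DOMAIN_NAME=$DOMAIN_NAME IP_ADDRESS=$IP_ADDRESS"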
Consul runs an internal DNS server, by default on the non-standard port 8600, but the standard DNS port is 53 (privileged). So we route all traffic for the consul domain, like postgres.service.consul, from 53 to 8600. dnsmasq does this job efficiently. We run it as a Docker container that restarts whenever it stops. Reference : dnsmasq man page.
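The flags do the heavy lifting here:
# -q logs queries, -R ignores the host's /etc/resolv.conf, -h ignores /etc/hosts
# -S $AWS_DNS forwards ordinary queries upstream to the AWS DNS server
# -S /consul/$BRIDGE_IP#8600 sends anything under the consul domain to Consul's port 8600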
docker run --restart=always -d --name=dnsmasq --net=host andyshinn/dnsmasq:2.76 -u root --log-facility=- -q -R -h -S $AWS_DNS -S /consul/$BRIDGE_IP#8600
When you run docker logs dnsmasq you will see something like:
# using nameserver 172.17.0.1#8600 for domain consul
# using nameserver 10.27.0.2#53
Now we overwrite /etc/resolv.conf using Bash's here-document capability [ <<-EOF .... EOF ].
sudo tee /etc/resolv.conf <<-EOF
nameserver $BRIDGE_IP
search $DOMAIN_NAME
EOF
/etc/resolv.conf is the file the operating system's resolver uses to convert names like wiki into the numeric IP address where the server or page can be found. nameserver tells the resolver to ask that address whenever a name needs an IP. search appends the listed domain and retries, so a name like wiki becomes wiki.ec2.internal. The configuration above lets EC2-internal names like ec2.internal resolve correctly and sends all requests through the Docker bridge IP, so that consul domain requests reach port 8600 via dnsmasq.
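A quick way to see the search directive in action (the instance hostname below is illustrative):
# short names get the search domain appended before lookup,
# so this is resolved as ip-10-29-0-5.$DOMAIN_NAME via dnsmasq
getent hosts ip-10-29-0-5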
Consul stores the data that is discovered and logs all events, so we need a data directory:
# Setup a data directory
sudo mkdir -p /consul/{data,config}
sudo chmod -R 777 /consul
Here comes the Consul configuration (a simple JSON file):
cat <<EOF > /consul/config/00_server.json
{
  "advertise_addr": "$IP_ADDRESS",
  "bootstrap_expect": 1,
  "client_addr": "0.0.0.0",
  "datacenter": "$DATACENTER",
  "disable_remote_exec": true,
  "dns_config": {
    "node_ttl": "10s",
    "allow_stale": true,
    "max_stale": "10s",
    "service_ttl": {
      "*": "10s"
    }
  },
  "log_level": "INFO",
  "leave_on_terminate": true,
  "server": true,
  "ui": true
}
EOF
advertise_addr is the address this node advertises to the rest of the cluster
bootstrap_expect is the number of server nodes expected in the cluster before it bootstraps and a leader can be elected
client_addr controls which addresses the client interfaces (HTTP, DNS) bind to, i.e. who may talk to this node
server set to true makes this node act as a server (default : false)
ui set to true gives us the Consul web UI (default url : http://localhost:8500/ui/)
Reference : all agent options
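A malformed config will stop the agent from booting, so it is worth validating the file before starting the container; a minimal check, assuming python is available on the instance:
python -m json.tool /consul/config/00_server.json
# prints the parsed config on success, a parse error otherwise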
Run the Consul Docker container, named consul:
# Run the consul container with /consul volume-mounted inside the container so that all our configuration is available inside it
docker run --restart=always -d --name=consul -v /consul:/consul --net=host consul:v0.7.0 agent
Check the installation:
docker exec -it consul consul members
# you will see only one server as a member of the cluster
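Since dnsmasq is already routing the consul domain, you can also verify the DNS chain end to end; consul.service.consul is a name Consul registers for its own servers, so it should resolve to $IP_ADDRESS:
dig +short consul.service.consul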
Installing Consul on the agent/slave instances:
We are going to use the same Consul Docker image, and again we need some variables and configuration in place before starting Consul.
DATACENTER=dc1
AWS_DNS=$(grep -i nameserver /etc/resolv.conf | head -n1 | cut -d ' ' -f2)
BRIDGE_IP=$(docker run --rm alpine sh -c "ip ro get 8.8.8.8 | awk '{print \$3}'")
DOMAIN_NAME=$(curl -s http://169.254.169.254/latest/meta-data/local-hostname | cut -d "." -f 2-)
IP_ADDRESS=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
TODO : hardcoded as of now; to fetch this dynamically, use the new retry-join-ec2 tag feature (a sketch follows below)
The agent needs to know which server to register with and join on startup; this can also be a list of IPs
Reference : a post which gives two methods for agents to join the cluster
CONSUL_SERVER_IP=<Private ip of the master instance>
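Until the retry-join-ec2 flags are wired in, one way to avoid hardcoding the IP is to look it up from an EC2 tag; a sketch, assuming the master instance is tagged Role=consul-server and this instance has the AWS CLI plus the ec2:DescribeInstances permission:
CONSUL_SERVER_IP=$(aws ec2 describe-instances \
  --filters "Name=tag:Role,Values=consul-server" "Name=instance-state-name,Values=running" \
  --query "Reservations[0].Instances[0].PrivateIpAddress" \
  --output text)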
docker run --restart=always -d --name=dnsmasq --net=host andyshinn/dnsmasq:2.76 -u root --log-facility=- -q -R -h -S $AWS_DNS -S /consul/$BRIDGE_IP#8600
sudo tee /etc/resolv.conf <<-EOF
nameserver $BRIDGE_IP
search $DOMAIN_NAME
EOF
Consul stores the data that is discovered and logs all events, so we need a data directory:
# Setup a data directory
sudo mkdir -p /consul/{data,config}
sudo chmod -R 777 /consul
Here comes the Consul agent configuration (a simple JSON file):
# Setup consul agent
cat <<EOF > /consul/config/00_server.json
{
  "bind_addr": "$IP_ADDRESS",
  "client_addr": "0.0.0.0",
  "retry_join": ["$CONSUL_SERVER_IP"],
  "datacenter": "$DATACENTER",
  "log_level": "INFO"
}
EOF
bind_addr is the address the node binds to for internal cluster communication
client_addr controls which addresses the client interfaces (HTTP, DNS) bind to, i.e. who may talk to this node
retry_join is the list of server IP addresses to join on startup, retried until it succeeds
Reference : all agent options
Run the Consul Docker container, named consul:
# Run the consul container with /consul volume-mounted inside the container so that all our configuration is available inside it
docker run --restart=always -d --name=consul -v /consul:/consul --net=host consul:v0.7.0 agent
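Before checking the member list, you can confirm from the agent's own logs that retry_join succeeded (the exact wording varies across Consul versions):
docker logs consul | grep -i join
# expect a line along the lines of: agent: (LAN) joined: 1 Err: <nil>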
Check the installation:
docker exec -it consul consul members
# you will see all members of the cluster
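With one server and one agent joined, the output looks roughly like this (node names and addresses are illustrative):
# Node           Address          Status  Type    Build  Protocol  DC
# ip-10-29-0-5   10.29.0.5:8301   alive   server  0.7.0  2         dc1
# ip-10-29-0-7   10.29.0.7:8301   alive   client  0.7.0  2         dc1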
To dig deeper into leader election, look at the logs on the master instance:
docker logs consul
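With bootstrap_expect set to 1, the server elects itself as soon as it starts; look for Raft lines roughly like these (addresses are illustrative):
# raft: Node at 10.29.0.5:8300 [Candidate] entering Candidate state
# raft: Election won. Tally: 1
# raft: Node at 10.29.0.5:8300 [Leader] entering Leader state
# consul: cluster leadership acquired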
Play around!