How to Install HAProxy HTTP Load Balancer on CentOS
Installing HAProxy on CentOS 7
As a fast-developing open source application, the HAProxy available in the default CentOS repositories might not be the latest release. To see which version the repositories currently offer, use the command below.
sudo yum info haproxy
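To make the comparison explicit, the sketch below reads the packaged version from ‘yum info’ and compares it with the 1.7.8 release built in this guide. The ‘version_lt’ helper and the ‘wanted’ and ‘packaged’ variables are hypothetical names chosen for illustration, and the comparison assumes ‘sort -V’ from GNU coreutils is available.

```shell
# Hypothetical helper: succeeds when version $1 is strictly older than version $2.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

wanted="1.7.8"   # release built in this guide

# Extract the "Version : x.y.z" field; falls back to 0 if yum or the package is unavailable.
packaged="$(yum info haproxy 2>/dev/null | awk '/^Version/ {print $3; exit}')"
packaged="${packaged:-0}"

if version_lt "$packaged" "$wanted"; then
    echo "Repository version ($packaged) is older than $wanted; build from source."
else
    echo "Repository version ($packaged) is recent enough."
fi
```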
We will be using the latest stable version, 1.7, which is not currently available in the standard repositories.
You will need to install it from the source.
To start, ensure that you have the prerequisites to download and compile the program.
sudo yum install gcc pcre-static pcre-devel -y
You can download the source code with the following command. To check whether a newer version has been released, see the HAProxy download page: ‘http://www.haproxy.org/#down’
wget https://www.haproxy.org/download/1.7/src/haproxy-1.7.8.tar.gz -O ~/haproxy.tar.gz
After the download is completed, you can extract the files with the following command.
tar xzvf ~/haproxy.tar.gz -C ~/
Change into the extracted source directory.
cd ~/haproxy-1.7.8
Then compile the program for your system.
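A minimal sketch of the compile step, assuming HAProxy 1.7 on the stock CentOS 7 kernel (3.10), for which the ‘linux2628’ build target applies. The ‘build_haproxy’ wrapper and the ‘SRC_DIR’ variable are hypothetical names used for illustration, and the default path assumes the archive was extracted to ‘~/haproxy-1.7.8’.

```shell
# Assumed location of the extracted source tree.
SRC_DIR="${SRC_DIR:-$HOME/haproxy-1.7.8}"

build_haproxy() {
    # Run in a subshell so the caller's working directory is left unchanged.
    # TARGET=linux2628 selects the HAProxy 1.7 build target for Linux 2.6.28 and newer.
    # USE_PCRE=1 could be appended to build against the pcre-devel package installed earlier.
    ( cd "$1" && make TARGET=linux2628 )
}

if [ -d "$SRC_DIR" ]; then
    build_haproxy "$SRC_DIR"
else
    echo "Source directory $SRC_DIR not found; extract the archive first." >&2
fi
```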
Lastly, install HAProxy itself.
sudo make install
After completing the steps above, HAProxy is installed but needs a few more steps to become operational. Continue below to set up the software and services.
Setting up HAProxy for your VPS
For this next step, you will need to add the following directories and the statistics file for the HAProxy records.
sudo mkdir -p /etc/haproxy
sudo mkdir -p /run/haproxy
sudo mkdir -p /var/lib/haproxy
sudo touch /var/lib/haproxy/stats
Create a symbolic link to the binary so that you can run HAProxy commands as a normal user.
sudo ln -s /usr/local/sbin/haproxy /usr/sbin/haproxy
If you would like to add the proxy as a service to the system, copy the file ‘haproxy.init’ from the examples to the ‘/etc/init.d’ directory. You will need to change the file permissions to make the script executable before reloading the ‘systemd’ daemon.
sudo cp ~/haproxy-1.7.8/examples/haproxy.init /etc/init.d/haproxy
sudo chmod 755 /etc/init.d/haproxy
sudo systemctl daemon-reload
As a general good practice, we recommend adding a dedicated account for HAProxy to run under.
sudo useradd -r haproxy
Next, you can double-check the installed version number with the command below.
haproxy -v
HA-Proxy version 1.7.8 2017/07/07
Copyright 2000-2017 Willy Tarreau <willy@haproxy.org>
In this case, you should have the version 1.7.8 as shown in the example output above.
By default, the CentOS 7 firewall is too restrictive for this project. Use the commands below to allow the required services and reload the firewall.
sudo firewall-cmd --permanent --zone=public --add-service=http
sudo firewall-cmd --permanent --zone=public --add-port=8181/tcp
sudo firewall-cmd --reload
Configuring the load balancer
Setting up HAProxy for load balancing is a fairly straightforward procedure. All you have to do is tell HAProxy what kind of connections it should listen for and where those connections should be relayed to.
This is done by creating a configuration file ‘/etc/haproxy/haproxy.cfg’ that contains the defining settings.
Load balancing at layer 4
To start off with a simple setup, make a new configuration file, for example using ‘vi’ with the following command.
sudo vi /etc/haproxy/haproxy.cfg
Add the following sections to the file. Replace each ‘<server name>’ with the name you want the server to have on the statistics page, and each ‘<private IP>’ with the private IP of a server that the web traffic should be directed to.
You can check the private IPs of your servers in your UpCloud Control Panel, on the Private Network tab under the Network menu.
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http_front
    bind *:80
    stats uri /haproxy?stats
    default_backend http_back

backend http_back
    balance roundrobin
    server <server name> <private IP>:80 check
    server <server name> <private IP>:80 check
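This defines the basic load balancer for this section: a frontend named ‘http_front’ listens on port 80 and directs all traffic to the default backend named ‘http_back’, without inspecting request content to choose a backend. Note that ‘mode http’ in the defaults section means HAProxy still handles the traffic as HTTP; a strict layer 4 (TCP) setup would use ‘mode tcp’ instead. The additional stats URI ‘/haproxy?stats’ enables the statistics page at that address.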
Different load balancing algorithms
Listing the servers in the backend section allows HAProxy to use them for load balancing, according to the ‘roundrobin’ algorithm, whenever they are available.
The balancing algorithm decides which backend server each connection is sent to.
Some of the useful options include the following:
Roundrobin: Each server is used in turn, according to its weight. This is the smoothest and fairest algorithm when the servers’ processing time remains equally distributed. The algorithm is dynamic, which allows server weights to be adjusted on the fly.
Leastconn: The server with the lowest number of connections is chosen, and round-robin is performed between servers with the same load. This algorithm is a good choice for long sessions, such as LDAP, SQL, TSE, and others. It is less well suited to short sessions like HTTP.
First: The first server with available connection slots receives the connection. Servers are chosen from the lowest numeric identifier to the highest, which defaults to the server’s position in the farm. Once a server reaches its ‘maxconn’ value, the next server is used.
Source: The source IP address is hashed and divided by the total weight of the running servers to determine which server receives the request. This way, the same client IP address always reaches the same server as long as the set of servers stays the same.
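To illustrate the idea behind the ‘source’ algorithm, here is a simplified sketch that assumes all servers have equal weight. It checksums the client IP with ‘cksum’ and takes the result modulo the number of servers. This is not HAProxy’s actual hash function, and the ‘pick_server’ name and the example addresses are hypothetical.

```shell
# pick_server CLIENT_IP SERVER...
# Simplified source hashing with equal weights:
# checksum the client IP, then map it onto one of the listed servers.
pick_server() {
    ip="$1"
    shift
    hash="$(printf '%s' "$ip" | cksum | cut -d ' ' -f 1)"
    # Skip (hash mod server count) entries; the next one is the chosen server.
    shift $(( hash % $# ))
    echo "$1"
}

# The same client IP maps to the same server while the server list is unchanged.
pick_server 192.0.2.10 web1 web2 web3
pick_server 192.0.2.10 web1 web2 web3
```

Because the result depends on the number of servers, adding or removing a server in this sketch can move a client to a different host, which matches the caveat that stickiness holds only while the set of servers stays the same.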
Configuring load balancing for layer 7
Another option is to configure the load balancer to work on layer 7. This is useful when parts of your web application are located on different hosts, and it is accomplished by making the connection transfer conditional, for example on the URL.
Start with opening the HAProxy config file with a text editor.
sudo vi /etc/haproxy/haproxy.cfg
Next, set up the frontend and backend sections as shown in the following example.
frontend http_front
    bind *:80
    stats uri /haproxy?stats
    acl url_blog path_beg /blog
    use_backend blog_back if url_blog
    default_backend http_back

backend http_back
    balance roundrobin
    server <server name> <private IP>:80 check
    server <server name> <private IP>:80 check

backend blog_back
    server <server name> <private IP>:80 check
This declares a frontend ACL rule called ‘url_blog’ that applies to all connections whose paths start with ‘/blog’. The ‘use_backend’ line sends connections matching the ‘url_blog’ condition to the backend named ‘blog_back’, while all other requests are handled by the default backend.
On the backend side, the configuration sets up two server groups: ‘http_back’ as before, and the new ‘blog_back’, which specifically serves connections to ‘example.com/blog’.
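The routing decision made by the ‘url_blog’ rule can be pictured with a short sketch. It is only an illustration of prefix matching in the style of ‘path_beg’; HAProxy performs this matching internally, and the ‘route_request’ function is a hypothetical name.

```shell
# route_request PATH
# Prints the backend a request path would be sent to under the rule
# "acl url_blog path_beg /blog" with "default_backend http_back".
route_request() {
    case "$1" in
        /blog*) echo "blog_back" ;;   # path begins with /blog
        *)      echo "http_back" ;;   # everything else goes to the default backend
    esac
}

route_request /blog/first-post   # prints blog_back
route_request /index.html        # prints http_back
```

Note that a prefix match also catches paths such as ‘/blogger’; if that is not wanted, a stricter pattern would be needed.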
After the configurations have been made, save the file and restart HAProxy with the below command.
sudo systemctl restart haproxy
If any errors or warnings appear, check the configuration again for typos and confirm that all of the required files and directories exist, then try restarting again.
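It can help to validate the configuration before restarting, so that a typo does not take the service down. The sketch below uses HAProxy’s check mode (‘-c’ together with ‘-f’) and restarts only if the check passes. The ‘safe_restart’ function and the ‘HAPROXY_BIN’ variable are hypothetical names; the default path is the configuration file used in this guide.

```shell
# safe_restart [CONFIG_FILE]
# Validate the configuration and restart HAProxy only if it parses cleanly.
safe_restart() {
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    # "haproxy -c -f FILE" checks the configuration and exits without starting the proxy.
    if "${HAPROXY_BIN:-haproxy}" -c -f "$cfg"; then
        sudo systemctl restart haproxy
    else
        echo "Configuration check failed for $cfg; HAProxy was not restarted." >&2
        return 1
    fi
}

# Example call (commented out so that nothing is restarted by accident):
# safe_restart /etc/haproxy/haproxy.cfg
```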
Testing the setup
Once HAProxy is configured and running, open your load balancer server’s public IP in a web browser and make sure that you get connected to your backend correctly.
The ‘stats uri’ parameter in the configuration enables the statistics page at the defined address.
http://<load balancer public IP>/haproxy?stats
If the statistics page loads and all of your servers are listed in green, your configuration was successful.
The statistics page will have useful information to help keep track of the web hosts including up times, down times, and session counts.
If one of the servers is listed in red, check that the server is powered on and that you can ping it from the load balancer machine.
If your load balancer does not reply, check that HTTP connections are not being blocked by the firewall. You can also confirm that HAProxy is running with the following command.
sudo systemctl status haproxy
Password protecting the statistics page
Having the statistics page listed at the frontend makes it publicly viewable by anyone, which is usually not a good idea. Instead, you can move it to its own port by adding the following example to the end of your ‘haproxy.cfg’ file. Replace the username and password with something secure.
listen stats
    bind *:8181
    stats enable
    stats uri /
    stats realm Haproxy\ Statistics
    stats auth username:password
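As one possible way to produce a password for the ‘stats auth’ line, the sketch below reads random bytes from ‘/dev/urandom’ and keeps only letters and digits. Restricting the output this way avoids the ‘:’ character, which separates the username from the password. The ‘gen_password’ function name and the default length of 24 are assumptions made for illustration.

```shell
# gen_password [LENGTH]
# Print a random alphanumeric password (default length: 24 characters).
gen_password() {
    length="${1:-24}"
    # Keep only letters and digits from the random stream, then cut to length.
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
    echo
}

gen_password 24
```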
Once you have added the new listen group, you can then remove the old reference to the stats uri from the frontend group. After this is done, save the file and restart HAProxy again.
sudo systemctl restart haproxy
Next, open the load balancer again with your new port number and log in with the username and password you have chosen in the configuration file.
http://<load balancer public IP>:8181
Check that your servers are still all reporting green, then open the load balancer IP without any port number in your web browser.
http://<load balancer public IP>/
If your backend servers have slightly different landing pages, you will notice that each time you reload the page you get a reply from a different host. You can also try out the different balancing algorithms.
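The same check can be made from a terminal. The sketch below requests the front page several times and prints the first line of each reply, which should alternate between hosts under ‘roundrobin’ if the landing pages differ. The ‘probe_lb’ function and the ‘FETCH’ override are hypothetical names; by default the sketch assumes ‘curl’ is installed.

```shell
# probe_lb URL [COUNT]
# Request URL COUNT times (default 4) and print the first line of each reply.
probe_lb() {
    url="$1"
    count="${2:-4}"
    # FETCH may be overridden with another command; it is split into words on purpose.
    fetch="${FETCH:-curl -s --max-time 5}"
    i=1
    while [ "$i" -le "$count" ]; do
        reply="$($fetch "$url" 2>/dev/null | head -n 1)"
        printf 'request %s: %s\n' "$i" "${reply:-no reply}"
        i=$(( i + 1 ))
    done
}

# Example call; replace the placeholder with your load balancer's public IP:
# probe_lb "http://<load balancer public IP>/" 6
```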