Why do we need to fine-tune network settings?
Usually the default network parameters supplied with the OS can handle regular traffic. But if you are managing a high-traffic server and your application feels sluggish, it is recommended to do network performance tuning of your Linux operating system.
TCP Connection Establishment
As you know, web servers and application servers generally use the Transmission Control Protocol (TCP) for their client-server communication. TCP is a connection-oriented protocol, which means the sender and receiver need to establish a reliable connection before transmitting data. As the first step of establishing the connection, the sender sends a connection request (a segment with the SYN bit set) to the receiver. If the receiver is ready to accept data, it sends back an acknowledgement (ACK) with its own SYN bit set. The sender then acknowledges the receiver's initial sequence number, completing the three-way handshake, and starts its data transfer.
Since the sender and receiver may not have the same network speed, TCP uses a flow control mechanism called the sliding window protocol, so that the sender does not transmit data faster than the receiver can accept it. The receiver advertises the amount of data it can accept using a TCP header field called the receive window, updating the field with the amount of data it can currently buffer.
Upon seeing this value, the sender adjusts its data transmission so that it does not send more than this window size until an acknowledgement is received from the receiver. Once an acknowledgement is received and a new receive window size is declared by the receiver, the sender can transmit the next set of data. Originally, the maximum receive window size that could be expressed in a TCP header was 65,535 bytes. With a feature called Window Scaling, the limit is increased to a maximum of 1,073,725,440 bytes (about 1 GB).
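The roughly 1 GB figure follows directly from how window scaling works: the 16-bit window field can be shifted left by at most 14 bits. A quick sketch of the arithmetic:

```shell
# Maximum unscaled window: a 16-bit field
echo $(( 65535 ))         # 65535 bytes (64 KB)

# With the window scale option (maximum shift of 14 bits):
echo $(( 65535 << 14 ))   # 1073725440 bytes, the ~1 GB limit
```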
Bandwidth Delay Product (BDP) is the amount of data in transit between two hosts at any moment, equal to Bandwidth * RTT,
or in other words,
BDP (bytes) = total bandwidth (KBytes/sec) x round trip time (ms)
The achievable network throughput <= TCP buffer size / RTT
The TCP window size needs to be large enough to accommodate network bandwidth x maximum expected delay, that is,
TCP window size >= BW * RTT
On a 100 Mbps network with a round trip time (RTT) of 150 ms and a TCP buffer size of 128 KB, the Bandwidth Delay Product is 1.88 MB, and the maximum throughput will be <= 6.99 Mbps. To fully use the 100 Mbps link at 150 ms RTT, the TCP buffer size should be >= 1831.1 KB.
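The figures above can be reproduced with shell arithmetic; the inputs (100 Mbps, 150 ms, 128 KB) are the ones assumed in this example:

```shell
BW_BPS=$(( 100 * 1000 * 1000 ))   # 100 Mbps expressed in bits/sec
RTT_MS=150                        # round trip time in milliseconds
BUF_BYTES=$(( 128 * 1024 ))       # 128 KB TCP buffer

# BDP in bytes = bandwidth (bits/s) / 8 * RTT (s)
echo $(( BW_BPS / 8 * RTT_MS / 1000 ))      # 1875000 bytes ~= 1.88 MB

# Max throughput in bits/s = buffer size (bits) / RTT (s)
echo $(( BUF_BYTES * 8 * 1000 / RTT_MS ))   # 6990506 bits/s ~= 6.99 Mbps
```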
In the above network, the default 64 KB window wastes roughly 1815 KB of the required window size (1880 - 65, in KB). So we need to enable the Window Scaling feature. We can modify the window scaling parameter on Linux by editing the sysctl.conf file. You need to set the below parameter to 1.
net.ipv4.tcp_window_scaling = 1
You can do the same by executing the below command,
echo 'net.ipv4.tcp_window_scaling = 1' >> /etc/sysctl.conf
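Appending to /etc/sysctl.conf only takes effect when the settings are reloaded. To apply the setting to the running kernel immediately and verify it, a sketch (assuming root privileges):

```shell
sysctl -w net.ipv4.tcp_window_scaling=1   # apply now, without a reboot
sysctl -n net.ipv4.tcp_window_scaling     # print the current value
```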
Obtain TCP Memory Values
Now obtain the TCP memory values by executing the below commands,
To view receive socket memory size, please execute the below two commands,
cat /proc/sys/net/core/rmem_max
cat /proc/sys/net/core/rmem_default
To view the send socket memory size, please execute the below two commands. The first command will give its maximum value and the second command will provide you its default value.
cat /proc/sys/net/core/wmem_max
cat /proc/sys/net/core/wmem_default
To view the maximum amount of option memory buffers, please execute the below command,
cat /proc/sys/net/core/optmem_max
If the receive socket memory size is small, the sender will only be able to send data equal to the receiver's socket memory size. So we need to increase this value to something higher, say 32 MB. Likewise, the send socket memory size also needs to be large, say 32 MB. For a 10 Gbps network with an RTT of 100 ms, the value can be as high as 64 MB. If the RTT value is 50 ms, then it can be increased to 128 MB.
echo 'net.core.wmem_max=33554432' >> /etc/sysctl.conf
echo 'net.core.rmem_max=33554432' >> /etc/sysctl.conf
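The value 33554432 used above is simply 32 MB expressed in bytes, a useful check when adapting the setting to a different buffer size:

```shell
echo $(( 32 * 1024 * 1024 ))   # 33554432, the 32 MB value written above
echo $(( 64 * 1024 * 1024 ))   # 67108864, the equivalent value for 64 MB
```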
The next step is to increase the Linux autotuning TCP buffer limits to 16 MB. Here we can set the minimum receive window size that each TCP connection is guaranteed even when the server is under high load, and the default value allocated to each TCP connection. Since we are employing the window scaling feature, the window size will grow dynamically up to the maximum receive window size, set in bytes to 16777216. For a 10 Gbps network with an RTT of 100 ms, the value can be as high as 32 MB. If the RTT value is 50 ms, then it can be increased to 128 MB.
echo 'net.ipv4.tcp_rmem = 4096 87380 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 65536 16777216' >> /etc/sysctl.conf
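Each of these two tunables holds three values: the minimum, default, and maximum buffer size per socket. The third value, 16777216, is 16 MB in bytes:

```shell
# tcp_rmem / tcp_wmem format: min  default  max  (bytes per socket)
echo $(( 16 * 1024 * 1024 ))   # 16777216, the autotuning ceiling set above
```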
It is also recommended to set net.ipv4.tcp_timestamps and net.ipv4.tcp_sack to 1. Selective acknowledgements (SACK) reduce unnecessary retransmissions on lossy links, and timestamps give the kernel more accurate RTT measurements for its retransmission timers.
echo 'net.ipv4.tcp_timestamps = 1' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_sack = 1' >> /etc/sysctl.conf
View congestion control algorithms
To view the list of congestion control algorithms available on your machine, please execute the below command,
sysctl net.ipv4.tcp_available_congestion_control
It is recommended to set htcp as the congestion control mechanism, provided it is available on your kernel.
To set htcp as your congestion control algorithm, please execute the below command,
sysctl -w net.ipv4.tcp_congestion_control=htcp
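sysctl -w changes only the running kernel and is lost on reboot. To persist the choice, the same sysctl.conf pattern used earlier in this article applies. Note that htcp may need its kernel module loaded first; the module name tcp_htcp is assumed here and can vary by distribution:

```shell
# Load the H-TCP module if it is not built into the kernel
modprobe tcp_htcp
# Persist the congestion control choice across reboots
echo 'net.ipv4.tcp_congestion_control = htcp' >> /etc/sysctl.conf
```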
It is also recommended to increase the incoming connection backlog queue. net.core.netdev_max_backlog sets the maximum number of packets queued on the INPUT side when the interface receives packets faster than the kernel can process them.
echo 'net.core.netdev_max_backlog = 65536' >> /etc/sysctl.conf
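To judge whether the backlog is actually overflowing, /proc/net/softnet_stat can be inspected; the second hex column counts packets dropped per CPU because the backlog queue was full. A rough sketch (field layout can vary slightly by kernel version):

```shell
sysctl -n net.core.netdev_max_backlog   # current backlog limit
# Second column = packets dropped per CPU because the backlog was full (hex)
awk '{ print "cpu " NR-1 ": dropped 0x" $2 }' /proc/net/softnet_stat
```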
View the performance tuning done
To save the changes and reload the settings, please execute the below command,
sysctl -p
We can use tcpdump to watch the traffic on eth1 and observe the effect of the changes, if eth1 is your NIC.
tcpdump -ni eth1