QoS

pfSense

HFSC burst is broken in pfSense. "It's a kernel issue with dummynet in pf."

HFSC General Information

If you put ACK packets in a high-bandwidth queue, they reach the remote system quickly to confirm that data was received, so transfers keep flowing even when the link is saturated.
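As a minimal sketch in pf.conf ALTQ syntax (the interface macro $ext_if and the queue names are hypothetical), pf's two-queue assignment sends empty TCP ACKs and low-delay packets to the second queue listed:

  altq on $ext_if hfsc bandwidth 10Mb queue { q_ack, q_std }
  queue q_ack bandwidth 20% hfsc( realtime 10% )
  queue q_std bandwidth 80% hfsc( default )

  # TCP ACKs with no payload (and low-delay ToS packets) ride q_ack;
  # everything else matching this rule goes to q_std
  pass out on $ext_if proto tcp from any to any flags S/SA keep state queue( q_std, q_ack )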

You can give certain services priority, keeping their speed high and their latency low.

You can serve a burst of data out quickly while slowing long-lived transfers: you decide you want to serve data quickly at the beginning of a connection and slow it down after a few seconds. This is called a nonlinear service curve (NLSC or just SC).[1]

bandwidth

  • the parent queue sets the maximum bandwidth for the entire interface
  • a child queue gets a percentage or a hard number, and cannot exceed the parent queue (see the sketch below)
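A minimal pf.conf sketch of the parent/child relationship, assuming a 25Mb link on a hypothetical interface $ext_if; the child percentages together stay within the parent:

  altq on $ext_if hfsc bandwidth 25Mb queue { web, mail, std }
  queue web  bandwidth 50% hfsc
  queue mail bandwidth 20% hfsc
  queue std  bandwidth 30% hfsc( default )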

priority: specifies the order in which a service is to occur relative to other queues; it is used in CBQ and PRIQ, but not HFSC. Priority does _not_ define an amount of bandwidth, but the order in which packets are buffered before being sent out of the interface. The default is 1.[1]
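Since priority only takes effect under CBQ and PRIQ, here is a short PRIQ sketch (interface and queue names hypothetical) where the higher number is served first:

  altq on $ext_if priq bandwidth 10Mb queue { q_high, q_low }
  queue q_high priority 7
  queue q_low  priority 1 priq( default )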

qlimit: the number of packets to buffer and queue when the amount of available bandwidth has been exceeded. This value is 50 packets by default. When the total amount of upload bandwidth has been reached on the outgoing interface, or other queues are taking up all of the bandwidth, no more data can be sent. The qlimit will put the packets the queue cannot send out into slots in memory in the order that they arrive. When bandwidth is available, the qlimit slots are emptied in the order the packets arrived: first in, first out (FIFO). If the queue is full, i.e. the number of buffered packets has reached the qlimit, any further packets are dropped.[1]

Look at qlimit slots as "emergency use only," but as a better alternative to dropping packets outright. Understand that dropping packets is the proper way TCP knows it needs to reduce bandwidth, so dropping packets is not bad. The problem is that the TCP Tahoe and Reno methods slow the connection down too severely, and it takes a while to ramp back up after a dropped packet. A small qlimit buffer helps smooth out the connection, but "buffer bloat" works against TCP's congestion control. Also, do not think that setting the qlimit really high will solve the problem of bandwidth starvation and packet drops. What you want to do is set up a queue with the proper bandwidth boundaries so that packets only go into the qlimit slots for a short time (no more than a second), if ever.[1]

Calculating qlimit: if the qlimit is too large you will run into a common issue called buffer bloat; search for "buffer bloat" for more information. A good approach is to set the qlimit to the number of packets you want to buffer (not drop) within a given amount of time. Take the total upload bandwidth of your connection; let's say that is a 25 megabit upload speed. Now decide how long you are willing to buffer packets before they get sent out; let's say 0.5 seconds, which is quite long. 25 megabits divided by 8 is 3.125 megabytes per second. The average maximum segment size (MSS) is 1460 bytes, so 3.125 MB/sec divided by 0.00146 MB is 2140.41 packets per second. Since we decided to queue 0.5 seconds of traffic, that is 2140.41 packets per second times 0.5 seconds, or roughly 1070 packets. Thus, we set the qlimit to 1070. 1070 packets at an MSS of 1460 bytes is a 1.562 megabyte buffer. This is just a rough model, but you get the idea. We prefer to set our buffer a little high so that network spikes get buffered for 0.5 to one (1) second and then sent out. This method smooths out upload spikes, but does add some buffer bloat to our external network connection. In _our_ tests on _our_ network a larger buffer worked better in the real world than the default qlimit of 50 packets set by OpenBSD. Do your own tests and make an informed decision.
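Applying the figures worked out above, a sketch (the queue layout is hypothetical) that sets the computed 1070-slot buffer on each queue of a 25Mb uplink:

  altq on $ext_if hfsc bandwidth 25Mb queue { web, std }
  queue web bandwidth 50% qlimit 1070 hfsc
  queue std bandwidth 50% qlimit 1070 hfsc( default )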

realtime: the amount of bandwidth that is guaranteed to the queue no matter what any other queue needs. Realtime can be set from 0% to 80% of total connection bandwidth. Let's say you want to make sure that your web server gets 25KB/sec of bandwidth no matter what; setting the realtime value will give the web server queue the bandwidth it needs even if other queues want to share its bandwidth.
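For the web server example, a one-line sketch (queue name and percentages are illustrative); 25KB/sec is 200Kb/sec:

  queue web bandwidth 25% hfsc( realtime 200Kb )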

upperlimit: the amount of bandwidth the queue can _never_ exceed. For example, say you want to set up a new mail server and you want to make sure that the server never takes up more than 50% of your available bandwidth. Or let's say you have a p2p user you need to limit. Using the upperlimit value will keep them from abusing the connection.
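Sketches for both cases, with hypothetical queue names and limits:

  # the mail queue may never exceed half the link
  queue mail bandwidth 20% hfsc( upperlimit 50% )
  # pin a p2p user's queue to a hard ceiling
  queue p2p  bandwidth 5%  hfsc( upperlimit 1Mb )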

linkshare (m2): this value has the exact same use as "bandwidth" above. If you use both "bandwidth" and "linkshare" in the same rule, pf (OpenBSD) will override the bandwidth directive and use "linkshare m2". This may cause more confusion than it is worth, especially if you have two different values in each. For this reason we are not going to use linkshare in our rules. The only reason you may want to use linkshare _instead of_ bandwidth is if you want to enable a nonlinear service curve.
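A sketch of the conflict (values illustrative); the linkshare m2 value silently wins:

  # linkshare m2 (15%) overrides bandwidth (10%)
  queue web bandwidth 10% hfsc( linkshare 15% )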

nonlinear service curve (NLSC or just SC): the directives realtime, upperlimit and linkshare can all take advantage of an NLSC. In the sketch below this option is applied to a "web" queue. The format for a service curve specification is (m1, d, m2): m2 controls the bandwidth assigned to the queue; m1 and d are optional and can be used to control the initial bandwidth assignment. For the first d milliseconds the queue gets the bandwidth given as m1, afterwards the value given in m2.
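The "web" queue sketch, with illustrative values for (m1, d, m2):

  # 600Kb/s for the first 3000 ms the queue is backlogged (m1, d),
  # then 300Kb/s steady state (m2)
  queue web bandwidth 10% hfsc( linkshare (600Kb 3000 300Kb) )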

default: the default queue. Any data connections or rules which are not specifically put into another queue will be put into the default queue. This directive must appear in exactly one rule; you can _not_ have two (2) default directives in two (2) different rules.
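A sketch; exactly one queue may carry the default flag:

  queue std bandwidth 30% hfsc( default )
  # a second queue marked default elsewhere would be rejected by pfctl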

ecn: In ALTQ, ECN (Explicit Congestion Notification) works in conjunction with RED (Random early detection). ECN allows end-to-end notification of network congestion without dropping packets.
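A sketch enabling RED with ECN marking on a hypothetical default queue:

  queue std bandwidth 30% hfsc( default red ecn )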

ECN is an optional feature which is used when both endpoints support it and are willing to use it. OpenBSD has ecn disabled by default and Ubuntu has it turned on only if the remote system asks for it first. Traditionally, TCP/IP networks signal congestion by dropping packets. When ECN is successfully negotiated, an ECN-aware router may set a mark in the IP header instead of dropping a packet in order to signal impending congestion. The receiver of the packet echoes the congestion indication to the sender, which must react as though a packet was dropped. ALTQ's version of RED is similar to Weighted RED (WRED) and RED In/Out (RIO) which provide early detection when used with ECN. The end result is a more stable TCP connection over congested networks.

Be very careful when enabling ECN on your machines. Remember that any router or ECN-enabled device can notify both the client and server to slow the connection down. If a machine in your path is configured to send ECN marks when its congestion is low, your connection's speed will suffer greatly. For example, telling clients to slow their connections when the link is 90% saturated would be reasonable; the connection would have a 10% safety buffer instead of dropping packets. Some routers are configured incorrectly and will send ECN when they are only 10%-50% utilized. This means your throughput will be painfully low even though there is plenty of bandwidth available. Truthfully, we do not use ECN or RED due to the ability of routers, misconfigured or not, to abuse congestion notification.
  1. https://calomel.org/pf_hfsc.html