My hardware:
IBM x335 server with 2 Xeon 2.4GHz CPUs, 2GB RAM, 2×36GB SCSI hard drives.
Note that I will not use RAID, because I only have 2 hard drives and I need ’em both for different roles to maximize performance. Plus, I have a cluster of squid servers, so even if one of them goes down, it won’t be as painful.
If you decide to use RAID, Tony (the guy who wrote the helpful guide I mentioned earlier) suggests this:
I highly recommend that your machine is set up with a separate pair of squid disks, or worst case, on a separate partition on the host OS drives, utilizing a decently fast RAID level (think RAID10 here, don’t bother going near RAID5, you’ll get major I/O lag on writes). I’d recommend going for FAST disks (stay away from IDE here, or you’ll be in a world of pain).
1. Install Ubuntu 8.04 server edition, with ssh server.
disk a: 34GB - /, ext3.
2GB - swap area.
disk b: 36GB - /var/spool/squid, reiserfs
After the install completes, edit /etc/fstab, comment out the existing disk b line, and add this one instead:
/dev/sdb1 /var/spool/squid reiserfs defaults,notail,noatime 1 2
Reason for choosing ReiserFS:
Probably the most important thing to note when deploying squid, is that in 99% of cases, you will have many thousands – if not millions – of very small files; due to this, you need to choose a file-system that is able to deal very well with reading/writing many small files concurrently.
Enter ReiserFS.
Having tried both XFS – very poor performance over time -, and ext3 – better performance, but still lags a lot under load -, I switched over to ReiserFS, and have found that this lives up very well to its reputation of being good with many-small-files and many-reads/writes-per/sec.
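To confirm that the notail and noatime options from the fstab line in step 1 actually took effect after a reboot, you can look the mount point up in /proc/mounts. This is only a minimal sketch: the helper name mount_opts and its optional second argument are my own invention, not part of any standard tool.

```shell
#!/bin/sh
# mount_opts MOUNTPOINT [TABLE]
# Print the options field (4th column) for MOUNTPOINT.
# TABLE defaults to /proc/mounts (the live mounts); pass /etc/fstab
# to inspect the configured options instead. Commented-out lines are skipped.
mount_opts() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $4 }' "${2:-/proc/mounts}"
}

# Example, after the reboot in step 11:
#   mount_opts /var/spool/squid
#   mount_opts /var/spool/squid /etc/fstab
```

Note that /proc/mounts lists the effective options, so it will not show the literal word "defaults".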
2.
sudo nano /etc/network/interfaces
enter the right config.
3.
sudo nano /etc/apt/sources.list
uncomment all the deb lines
add
deb http://download.webmin.com/download/repository sarge contrib
4.
sudo apt-get update
sudo apt-get upgrade
5.
sudo apt-get install build-essential webmin htop
6.
wget http://www.squid-cache.org/Versions/v2/2.7/squid-2.7.STABLE5.tar.gz
or upload from a local dir.
7.
wget http://squirm.foote.com.au/squirm-1.26.tgz
or upload from a local dir.
squirm is a redirector, which we use for rewriting dynamic links, so that squid will be able to cache them. I’m not sure you need it, but in case you do, I’m not going to cut it out of the guide.
8.
Some kernel optimizations:
sudo nano /etc/sysctl.conf
enter the values below. You might want to take a look at what you’re doing, just google those - there’s plenty of info.
fs.file-max = 131072
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 65536 8388608
net.ipv4.tcp_mem = 194976 259968 389952
net.ipv4.tcp_low_latency = 1
net.core.netdev_max_backlog = 4000
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_syn_backlog = 16384
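The values in /etc/sysctl.conf are loaded at boot; running "sudo sysctl -p" loads them without a reboot. Either way, it is worth checking that the kernel really holds the values you asked for. Here is a rough sketch that compares a sysctl.conf-style file against /proc/sys. The function name sysctl_diff and its second argument are assumptions of mine, made up for illustration.

```shell
#!/bin/sh
# sysctl_diff FILE [ROOT]
# For every "key = value" line in FILE, read ROOT/<key with dots as slashes>
# and print the keys whose live value differs. ROOT defaults to /proc/sys.
sysctl_diff() {
    root="${2:-/proc/sys}"
    # Drop comments and blank lines, then split each line on the first "=".
    grep -Ev '^[[:space:]]*(#|;|$)' "$1" | while IFS='=' read -r key want; do
        key=$(echo "$key" | tr -d '[:space:]')
        path="$root/$(echo "$key" | tr . /)"
        # Unquoted echo collapses tabs and spaces, so "4096<TAB>87380"
        # compares equal to "4096 87380".
        want=$(echo $want)
        have=$(echo $(cat "$path" 2>/dev/null))
        [ "$want" = "$have" ] || echo "$key: want '$want', have '$have'"
    done
}

# Example:
#   sysctl_diff /etc/sysctl.conf
```

No output means every listed key matches. A key the kernel does not know shows up with an empty "have" value.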
9.
sudo nano /etc/security/limits.conf
enter this line before the “end of file”:
* - nofile 65535
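The limits.conf entry is applied by PAM at login, so it only shows up in a fresh session (and, as far as I know, not necessarily in daemons started from init scripts). A small sketch for checking it from a new shell; nofile_ok is a hypothetical helper name, not an existing command.

```shell
#!/bin/sh
# nofile_ok MIN
# Succeed if this shell's open-file limit (ulimit -n) is at least MIN.
nofile_ok() {
    cur=$(ulimit -n)
    [ "$cur" = "unlimited" ] || [ "$cur" -ge "$1" ]
}

# Example, from a fresh login session after step 9:
#   nofile_ok 65535 && echo "limit OK" || echo "limit too low: $(ulimit -n)"
```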
10.
sudo sysctl -w net.ipv4.route.flush=1
11.
sudo reboot
12.
tar xzf squid-2.7.STABLE5.tar.gz
13.
cd squid-2.7.STABLE5
14.
While you can install the squid package via “apt-get install squid”, you really shouldn’t as you’re gonna miss on some options that aren’t in the package, and since some of’em are pretty necessary for our setup, let’s take some more time and compile it from source. Plus, it will let us understand better what we’re up against here.
The backslashes at the end of each line are intentional line continuations, although you can just type it all on one line with a space between each option. Also, I believe most of the options will suit you just fine, except -march and CHOST, which you might want to change depending on your CPU architecture. This is a good resource to find out which options to use.
sudo CHOST="i686-pc-linux-gnu" \
CFLAGS="-DNUMTHREADS=128 \
-march=pentium4 \
-O3 \
-pipe \
-fomit-frame-pointer \
-funroll-loops \
-ffast-math \
-fno-exceptions" \
./configure \
--prefix=/usr \
--enable-async-io=128 \
--enable-coss-aio-ops \
--enable-icmp \
--enable-useragent-log \
--enable-snmp \
--enable-cache-digests \
--enable-follow-x-forwarded-for \
--enable-storeio="ufs,aufs,coss" \
--with-large-files \
--enable-removal-policies="heap,lru" \
--with-maxfd=32768 \
--enable-epoll \
--disable-ident-lookups \
--enable-truncate \
--exec-prefix=/usr \
--bindir=/usr/sbin \
--libexecdir=/usr/lib/squid \
--localstatedir=/var \
--datadir=/usr/share/squid \
--sysconfdir=/etc/squid \
--srcdir=.
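If you are not sure what to put in -march and CHOST, it helps to first see what CPU you actually have. The sketch below reads /proc/cpuinfo on an x86 Linux box and reports the model name and whether the CPU advertises the "lm" (64-bit long mode) flag. The function name cpu_summary is hypothetical, and the output is only a hint for your own choice, not a rule for picking flags.

```shell
#!/bin/sh
# cpu_summary [CPUINFO]
# Print the first CPU's model name and whether it lists the "lm" flag.
# CPUINFO defaults to /proc/cpuinfo (x86 layout assumed).
cpu_summary() {
    info="${1:-/proc/cpuinfo}"
    awk -F': *' '
        /^model name/ && !m { print "model: " $2; m = 1 }
        /^flags/ && !f      { print "64-bit capable: " ($2 ~ /(^| )lm( |$)/ ? "yes" : "no"); f = 1 }
    ' "$info"
}

# Example:
#   cpu_summary
#   gcc -dumpmachine   # the target triplet your installed compiler uses
```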
So what do we have here?
Note the -DNUMTHREADS=128; you can easily run with 30 on a 500MHz machine, so set this according to your hardware. This CFLAG controls the number of threads squid is able to run when using asynchronous I/O. The rest of the CFLAGS heavily optimize the resulting binaries.
I recommend building with the ./configure line as above. Obviously, if you change it, YMMV!
Here’s a rundown of what those options do:
--enable-async-io: enables asynchronous I/O – this is really important, as it stops squid from blocking on disk reads/writes
--enable-icmp: optional, squid uses this to determine the closest cache-peer, and then utilizes the most responsive one based off the ping time. Disable this if you don’t have cache peers.
--enable-useragent-log: causes squid to print the useragent in log entries – useful when you’re using lynx to debug squid speed.
--enable-snmp: you’ll want this enabled if you want to proxy SNMP requests to squid and graph the output.
--enable-cache-digests: required if you want to use cache peering
--enable-follow-x-forwarded-for: We have multi-level proxying happening as packets come through to squid, so to stop squid from seeing every request as from the load balancers, we enable this so squid reads the X-Forwarded-For header and picks up the real IP of the client that’s making the request.
--enable-storeio="ufs,aufs,coss": YMMV if you are utilizing an alternate storage I/O method. AUFS is asynchronous, and has significant performance gains over UFS or diskd. (These lines are obviously copied from Tony’s guide, but as you will see, I’m actually gonna use coss, as it is also asynchronous but better suited to small files, which is what matters most in my case.)
--enable-removal-policies="heap,lru": heap removal policies outperform the LRU policy, and we personally utilize “heap LFUDA”. If you want to use LRU, YMMV.
--with-maxfd=32768: File descriptors can play hell with squid; I’ve set this high to stop squid from either being killed or blocking when it’s under load. The default squid maxfd is (I believe) 4096, and I’ve seen squid hit this numerous times.
--enable-epoll: Enables epoll() over select(), as this increases performance.
--disable-ident-lookups: Stops squid from performing an ident lookup for every connection. This also removes a possible DoS vulnerability, whereby a malicious user could take down your squid server by opening thousands of connections.
--enable-truncate: Forces squid to use truncate() instead of unlink() when removing cache files. The squid docs claim that this can cause problems when used with async I/O, but so far I haven’t seen this be the case. A side effect of this is that squid will utilize more inodes on disk.
15.
sudo make
16.
sudo make install
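Once the install finishes, "squid -v" prints the version along with the configure options the binary was built with, which is a handy sanity check that the options from step 14 really made it in. A minimal sketch for scanning that output; missing_opts is a helper name I made up, and the options in the example are just a few picked from the configure line above.

```shell
#!/bin/sh
# missing_opts OPTION...
# Read "squid -v" output on stdin and print every OPTION that does not
# appear in it. No output means all requested options were found.
missing_opts() {
    out=$(cat)
    for opt in "$@"; do
        case "$out" in
            *"$opt"*) ;;            # option present, nothing to report
            *) echo "$opt" ;;       # option absent from the build
        esac
    done
}

# Example:
#   /usr/sbin/squid -v | missing_opts --enable-epoll --with-maxfd=32768 --enable-storeio
```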
Let’s move on.

