
Squid Caching Proxy Server Notes


Problem solving

Resolving name problems

Does http://brian/wherever/whatever fail to resolve in Squid? This happens because Squid runs its own DNS resolver instead of using gethostbyname(3). It pulls the IPs of the name servers out of resolv.conf(5) but, by default, does not apply the search domains listed there. Add a line like this to your squid.conf:
append_domain .yourdomain.tla

Any hostname without a dot in it will have that domain appended to it, so short names resolve again.
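The rule is easy to model. The following is a minimal sketch of the behaviour described above, not Squid's own code; the function name apply_append_domain is made up for illustration, and .yourdomain.tla is the placeholder from the squid.conf line.

```python
def apply_append_domain(host: str, append_domain: str = ".yourdomain.tla") -> str:
    """Return the name that would be looked up for `host`.

    Names that already contain a dot are left alone; dotless names
    get `append_domain` added to the end.
    """
    if "." in host:
        return host
    return host + append_domain


print(apply_append_domain("brian"))            # brian.yourdomain.tla
print(apply_append_domain("www.example.com"))  # www.example.com
```

Note that the configured value starts with a dot, so the two parts can simply be joined.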

Always get Connection Refused for any website

This probably means that Squid has run out of disk space.

It can also be the default configuration, in which all users are denied access. If you simply want to allow all users to use Squid, and are installing it only to save bandwidth, edit /etc/squid/squid.conf, change the line http_access deny all to http_access allow all, and restart Squid, typically with /etc/init.d/squid restart

"Unable to load page" error

Microsoft InternetExplorer 6 SP1 has a bug: if you are using "Basic" auth (e.g. with Squid), the first page after authenticating displays an "Unable to load page" error. This is because MSIE tries to reuse an already closed TCP connection. See KB:331906.

Caching / Proxying Microsoft Windows Update

Windows Update caching works just fine, for the most part. If you have an authenticated proxy, you might want to add ".microsoft.com", ".windowsupdate.com" and ".akamai.net" to an auth-bypass whitelist.

As of mid-December 2004, Windows Update (under XP at least) changed the way it works: it ignores proxy settings and attempts to make direct connections to a pool of servers. This is fairly annoying, because if your workstations have no default route set (a sensible security measure), they can no longer run Windows Update.

The subnets in question are 207.46.0.0/16 and (I think) 64.2.21.0/24. It seems that the only solution is to allow direct access to these subnets through your firewall.

To use Windows Update via a proxy, you must configure the proxy with proxycfg:

proxycfg -u

This imports your proxy server settings from Internet Explorer.

Allowing SVN through Squid

To allow access to an Apache-based SVN server, add this to your squid.conf:

  extension_methods REPORT MERGE MKACTIVITY CHECKOUT

Add-on utilities for Squid

Log Analysis (sarg)

sarg is a log file analyser for Squid. It's partially useful: it is a reasonably nice tool for generating reports from your Squid logs, but I currently have two problems with it.

  • Dates on reports spanning weeks or months are often wrong - all the data is there but the title of the report says it only covers 2-5 days.
  • Only shows reports of the percentage of traffic that was/was not served from the cache. Does not give an actual byte count. Sure it is easy to calculate it from the total but it would be even easier if it did it for me.
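The calculation mentioned in the second point is simple enough to script. This is a minimal sketch, assuming you copy the total byte count and the cache hit percentage from a report by hand. The function name and the figures are made up for illustration and are not part of sarg.

```python
def cache_bytes(total_bytes: int, hit_percent: float) -> tuple:
    """Split a total byte count into (bytes served from cache, bytes fetched)."""
    hit_bytes = round(total_bytes * hit_percent / 100)
    return hit_bytes, total_bytes - hit_bytes


# Hypothetical report: 2 GB of traffic, 25% of it served from the cache.
hits, misses = cache_bytes(2_000_000_000, 25.0)
print(hits, misses)  # 500000000 1500000000
```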

Log Analysis (srg)

SRG is a fast and flexible log analyser written in C/C++. It was written by MattBrown while working for CRCnet because none of the existing log analysis programs, such as sarg, were adequate. In particular, SRG can generate reports right down to the level of each file requested from a site. Reports can be generated in plain HTML, or in PHP so that you can integrate them with your Squid authentication system to restrict access to all or parts of the report. Another useful feature of SRG is the ability to send an email every time a report is generated, summarising the traffic used during the reporting period.

SRG is released under the GPL and is under active development.

Find out more about srg at http://www.crc.net.nz/software/srg.php

Graphing Squid data

Here are some other notes on Squid, SNMP and MRTG. This shows sample MRTG config options for graphing some of the info. Note that you can get MRTG to talk directly to Squid's non-standard SNMP port (3401 by default).

Content Blocking

Investigate the following blacklists:

(Note from Daniel Barron, DG author: the SG clause is in violation of the GPL and thus is invalid. The DG license is fully 100% within the GPL. What is asked for is that commercial users pay to download DG. I just thought I'd clarify the FUD.)


Useful configurations and tips

Proxy Auto Detection

To set things up so that your web browsers auto detect your proxy server, investigate WPAD, the Web Proxy Auto Detection script.

Filtering - ACLs in squid

When specifying ACLs, don't put more than one type of ACL on a single acl line. Squid ignores the extra types. For example:

 acl lab proxy_auth labuser src 192.168.2.0/32
 acl denylab proxy_auth labuser
 ....
 http_access allow lab
 http_access deny denylab

doesn't work. Instead:

 acl labuser proxy_auth labuser
 acl labmachines src 192.168.2.0/24
 ....
 http_access allow labuser labmachines
 http_access deny labuser

will do the trick.
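The reason the second form works is that every ACL named on one http_access line must match (a logical AND), and the first line that matches decides the outcome. The following is a rough model of that evaluation, not Squid's implementation. The request dictionaries are hypothetical, and the lab is assumed to be the 192.168.2.0/24 subnet.

```python
import ipaddress

# Each ACL is a single test on one property of the request.
ACLS = {
    "labuser": lambda req: req["user"] == "labuser",
    "labmachines": lambda req: ipaddress.ip_address(req["src"])
    in ipaddress.ip_network("192.168.2.0/24"),
}

# http_access lines, in order: (action, ACL names that must ALL match).
RULES = [
    ("allow", ["labuser", "labmachines"]),
    ("deny", ["labuser"]),
]


def check_access(req):
    """Return the action of the first rule whose ACLs all match, else None."""
    for action, names in RULES:
        if all(ACLS[name](req) for name in names):
            return action
    return None  # no line matched; Squid's fallback is not modelled here


print(check_access({"user": "labuser", "src": "192.168.2.17"}))  # allow
print(check_access({"user": "labuser", "src": "10.0.0.5"}))      # deny
```

So labuser is allowed from a lab machine and denied from anywhere else, which is the intent of the two http_access lines.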

URL Blocking

 acl restrictedmachine src ip.ad.dr.ess/255.255.255.255
 acl restrictedmachinesites dstdomain "/etc/squid/list-of-sites"

 http_access allow restrictedmachine restrictedmachinesites
 http_access deny restrictedmachine

list-of-sites takes the form

 # allowed sites list (the rules above deny everything else)
 host.domain.com
 # or
 .domain.com
 # for everything in domain.com
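The two entry forms behave differently: a plain name matches that host only, while a leading dot matches the domain itself and everything under it. A small sketch of that matching follows, assuming a list file in the format above. The helper names and the sample entries are invented for illustration, and this is not Squid's own matcher.

```python
def load_site_list(text: str) -> list:
    """Return the entries of a site list, skipping blank and comment lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line.lower())
    return entries


def dstdomain_match(host: str, entry: str) -> bool:
    """Check one host name against one list entry."""
    host = host.lower().rstrip(".")
    if entry.startswith("."):
        # ".example.org" covers example.org and any host below it
        return host == entry[1:] or host.endswith(entry)
    return host == entry


def is_listed(host: str, entries: list) -> bool:
    return any(dstdomain_match(host, entry) for entry in entries)


SAMPLE = """
# sites list (hypothetical contents)
host.domain.com
.example.org
"""
entries = load_site_list(SAMPLE)
print(is_listed("host.domain.com", entries))   # True
print(is_listed("other.domain.com", entries))  # False
print(is_listed("www.example.org", entries))   # True
```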

Alternatively, an external redirector such as ufdbGuard can be used to block URL categories.

 redirect_program  /local/squid/bin/ufdbGuard -c /local/squid/etc/ufdbGuard.conf
 redirect_children 2

Authentication and transparent proxying

Proxy Auth with NTLM?

This is a full working example of having a Squid proxy pick up user information via NTLM from a MicrosoftWindows ActiveDirectory. It allows anyone in the AD group "Internet" full access to the internet, and anyone in "Domain Users" (and not in "Internet") access only to the sites listed in the "/etc/squid/allowedsites" file.

If you are using InternetExplorer or newer Mozilla browsers (on MicrosoftWindows), this will work transparently using NTLM Authentication. If you're using another browser (or are running Linux), you'll be prompted for a username and password.

The format for authentication helpers has changed as of Samba 3. This example works with Squid 2.5STABLE3 and Samba 3.0.10.

Initially we tried to use transparent proxying AND NTLM auth, as all indications were that this should work. In practice it does not - see below.

After installing all packages and config files, join Samba to the domain with the command net join -U Administrator, which prompts you for the admin password. Then teach Winbind the domain credentials: wbinfo --set-auth-user Administrator%password.

Winbind must be started at every boot. Packages do this for you automatically.

Config files:

squid.conf

# This configuration file is set up for NTLM authentication
#
# Set NTLM parameters
auth_param ntlm program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
auth_param ntlm children 5
auth_param ntlm max_challenge_reuses 0
auth_param ntlm max_challenge_lifetime 2 minutes

# Set basic parameters
auth_param basic program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-basic
auth_param basic children 5
auth_param basic realm Squid proxy-caching web server
auth_param basic credentialsttl 2 hours

# Don't query neighbours for dynamic pages
hierarchy_stoplist cgi-bin ?

# Don't cache replies on dynamic pages
acl QUERY urlpath_regex cgi-bin \?
no_cache deny QUERY

# Define ACLs
acl all src 0.0.0.0/0.0.0.0
acl allsites dst 0/0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl localnet src 192.168.99.0/255.255.255.0

acl allowedsites        url_regex       "/etc/squid/allowedsites"
external_acl_type       ntgroup         %LOGIN  /usr/lib/squid/wbinfo_group.pl
acl fullusers           external        ntgroup         "/etc/squid/fullusers"

http_access allow localhost
http_access allow localnet allowedsites
http_access allow fullusers
http_access deny all

# Allow ICP queries from all
icp_access allow all

# Hostname
visible_hostname firewall.example.co.nz

/etc/squid/allowedsites

.foo.bar
.foo.bar.baz

/etc/squid/fullusers

Internet

(These are checked against groups only).

/etc/smb.conf

[global]
   # general options
   workgroup = EXAMPLE
   netbios name = FIREWALL

   # winbindd configuration
   # default winbind separator is \, which is good if you
   # use mod_ntlm since that is the character it uses.
   # users only need to know the one syntax
   # winbind separator = \

   # idmap uid and idmap gid are aliases for
   # winbind uid and winbind gid, respectively
   idmap uid = 10000-20000
   idmap gid = 10000-20000
   winbind enum users = yes
   winbind enum groups = yes
   # makes wbinfo able to see groups
   client schannel = no

   security = ads
   realm = example.co.nz
   password server = 10.7.x.x

You will also need to allow the user ID Squid is running as to write to the /var/lib/samba/winbindd_privileged directory or you will get authentication failures (with errors written to cache.log).

Transparent proxy and authentication

This can't work. An excellent post on the topic to the Squid users list summarises why:

HTTP specifies two "authentication required" error codes. One for a HTTP server (401), the other for a HTTP proxy (407). When a browser connects to a server requiring authentication, the server examines the HTTP header supplied in the request. If it includes the correct authentication information (username and password) the request is honoured and the server sends back a return code of 200. If the authentication information is not present in the header, the server responds with a return code of 401. When the browser sees this it pops up the authentication window where you type your username and password. The browser then re-submits the original request this time containing the authentication information it just collected. All future requests to the server will contain the authentication information.

Proxy authentication is handled in a similar manner. A browser that knows it's using a proxy (in transparent proxying, this is NOT the case) makes a connection to the proxy and issues an HTTP request. That request can contain proxy authentication information. Note that this is in a different part of the HTTP request to the web server authentication information. If the proxy requires authentication and the proxy-auth HTTP header is empty, the proxy responds with a return code of 407. When the browser receives this it pops up a window asking for the proxy username and password. Once you've typed it in, the browser resubmits the original request this time containing the proxy authentication information. All further requests to the proxy will contain the authentication information.

If a browser is not configured to use a proxy, it will quite rightly ignore any return code of 407. Why should it give away your proxy username and password to anyone who asks for it?

In your case you have browser->transparent proxy->auth proxy. The auth proxy can certainly request authentication of the transparent proxy. The cache_peer config line supports this with the "login=user:password" option. However, all that does is authenticate the proxy with its parent. There is no way to make the transparent proxy authenticate individual users. Even if the 407 sent by the auth proxy could be passed from the transparent proxy to the browser (it can't, because the transparent proxy traps it), you cannot make the browser respond because, as far as it knows, it isn't using a proxy.

As has been stated many, many times on this list

transparency, authentication, pick one.
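The browser behaviour the post describes can be modelled in a few lines. This is an illustrative sketch, not code from any browser, and header_to_resend is a hypothetical name. The header names are the standard HTTP ones: Authorization answers a 401, and Proxy-Authorization answers a 407.

```python
def header_to_resend(status: int, proxy_configured: bool):
    """Return the request header a browser would fill in after `status`.

    Returns None when the browser would not prompt for credentials.
    """
    if status == 401:
        # Challenge from the web server itself.
        return "Authorization"
    if status == 407 and proxy_configured:
        # Challenge from a proxy the browser knows it is using.
        return "Proxy-Authorization"
    # Includes a 407 seen by a browser with no proxy configured,
    # which is the transparent proxy case.
    return None


print(header_to_resend(407, proxy_configured=True))   # Proxy-Authorization
print(header_to_resend(407, proxy_configured=False))  # None
```

The second call is the transparent case: the browser has no proxy configured, so the 407 never leads to a credentials prompt.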