It's very important to get your SOA ResourceRecord correct for a zone.
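Use YYYYMMDDnn (the date followed by a two-digit revision number) as the serial for your zone. The serial is a 32-bit unsigned integer, so a longer format such as YYYYmmddHHxx will not fit. You will forget to update the serial sooner or later, and a format where it is obvious that you have forgotten makes the mistake easy to notice. This convention is very popular around the internet.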
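As a rough illustration of the date-based convention, here is a minimal sketch that bumps a serial, assuming the ten-digit YYYYMMDDnn layout (which fits in the 32-bit serial field). The function name `next_serial` is hypothetical and not part of any DNS tool.

```python
from datetime import date


def next_serial(current: int, today: date) -> int:
    """Return the next zone serial in YYYYMMDDnn form."""
    # Today's date with revision counter 00, e.g. 2024030500.
    base = int(today.strftime("%Y%m%d")) * 100
    if current < base:
        # First edit today: jump to today's date.
        return base
    # Already edited today (or the serial is ahead of the date): add one.
    # Caveat: a 100th edit on the same day spills into the next day's date.
    return current + 1
```

For example, `next_serial(2024030401, date(2024, 3, 5))` gives `2024030500`, and a second edit on the same day gives `2024030501`.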
The refresh value is how often secondaries poll the primary to check for new changes. Values should be between 3600 (1 hour) and about 43200 seconds (12 hours). We recommend 3600. A refresh check causes a minuscule amount of network traffic, so the extra load of secondaries checking every hour is vanishingly small.
The retry value is how often secondaries will try again if a refresh fails. Values from about 120 to 7200 seconds are reasonable. Depending on how important it is that your information is correct and up to date, we recommend either 600 (10 minutes) or 3600 (1 hour).
The expire value is how long a secondary will continue to serve information if it has been unable to contact the primary name server. Once this time has expired the server will no longer return authoritative results and will be considered lame.
A good value for this is about 1209600 (2 weeks) to 2419200 seconds (4 weeks). Setting it too low will cause your secondary to become lame prematurely: if your primary is down for an extended outage, you want your secondary to continue to serve records. Setting it too high is also a problem: if the secondary is for some reason unable to contact your primary, you eventually want it to stop sending stale, incorrect information to clients. We recommend a value of 2419200 seconds (4 weeks).
The last value in the SOA is the minimum TTL. It was originally the minimum TTL for records returned from this zone, and it was used when no TTL was specified on a record; hence it is sometimes incorrectly referred to as the "Default TTL" for a zone. More recent RFCs suggest you use the $TTL directive for this. Most modern BIND implementations will moan if a $TTL is not there.
Most modern DNS implementations are willing to give out replies with a TTL lower than this value, so its use as originally defined is no longer that important.
However, name servers that cache NXDOMAIN responses will use this value as the length of time they cache the result. Hence it is called the "Negative cache TTL".
RFC 2308 recommends 3600s (1 hour) to 86400s (1 day). We suggest that you use 86400.
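The ranges above can be collected into a small sanity check. This is only a sketch: the `RECOMMENDED` table copies the ranges quoted in this page, and the function name `check_soa_timers` and the dictionary keys are hypothetical.

```python
# Ranges quoted in the text above, in seconds: (low, high) per SOA field.
RECOMMENDED = {
    "refresh": (3600, 43200),
    "retry": (120, 7200),
    "expire": (1209600, 2419200),
    "minimum": (3600, 86400),
}


def check_soa_timers(timers: dict) -> list:
    """Return a warning string for each value outside the suggested range."""
    warnings = []
    for field, (low, high) in RECOMMENDED.items():
        value = timers[field]
        if not low <= value <= high:
            warnings.append(f"{field}={value} is outside {low}-{high}")
    return warnings
```

With the values recommended here (`refresh` 3600, `retry` 600, `expire` 2419200, `minimum` 86400) the function returns an empty list.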
Try to have names for services; then, if a service is moved to another machine, you can update the name for the service without having to reconfigure all the clients to point at another machine. If machines are configured to use a search path correctly, then applications that search for a machine (such as "news") will function without configuration.
Some often used names are:
|Name|Service
|cvs|CVS server (viewcvs available over http, pserver and ssh access)
|ftp|FTP server
|www|Web server
|mail|Smarthost for internal clients/MX
|MXn|Machine for external reception of mail
|NSn|Nameserver. Note that DJBDNS prefers(?) nameservers be called a.ns and b.ns
|proxy|Web proxy
|wpad|Proxy autodiscovery
|pop|Server used by MUAs to check mail
|imap|Server used by MUAs to check mail
|news|News server
|dhcp-n|DHCP assigned leases
All IP addresses that you are authoritative for should be given reverse lookups, even DHCP ranges, where you can use the $GENERATE directive.
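In BIND, a line such as `$GENERATE 100-200 $ PTR dhcp-$.example.com.` in the reverse zone expands to one PTR record per address. For name servers without that directive, the same records can be produced by a script. The sketch below is only illustrative: the range 100-200, the `dhcp-N` naming and the function name `ptr_records` are assumptions.

```python
def ptr_records(first: int, last: int, domain: str = "example.com.") -> list:
    """Build PTR lines for the last octet range first..last (inclusive)."""
    lines = []
    for host in range(first, last + 1):
        # Owner is the last octet, relative to the reverse zone's origin.
        lines.append(f"{host}\tPTR\tdhcp-{host}.{domain}")
    return lines
```

The matching forward zone needs an A record for each `dhcp-N` name so that forward and reverse lookups agree.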
All IPs that have a reverse lookup should have a forward lookup for the same name that returns the same IP.
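One way to test this rule is sketched below, using Python's standard `socket` module. It relies on the system resolver (which may also consult local files such as /etc/hosts) and only handles IPv4. The function name `forward_confirms_reverse` and the optional `reverse`/`forward` parameters are hypothetical; the parameters exist so the lookups can be replaced when testing.

```python
import socket


def forward_confirms_reverse(ip, reverse=None, forward=None):
    """Return True if the PTR name for ip resolves back to the same ip."""
    # Default lookups go through the system resolver.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        name = reverse(ip)          # IP -> name
        return ip in forward(name)  # name -> list of IPs
    except OSError:
        # Lookup failures (no PTR, no A record) count as a mismatch.
        return False
```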
Remember the . at the end of the domain name. Remember that the NS, MX, CNAME, DNAME records all require a name on the right hand side, and will not accept an IP address. Consider running a script from cron to check for blindingly obvious mistakes.
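A cron-driven check for these mistakes could look something like the sketch below. It is deliberately simple and makes several assumptions: one record per line, TTLs written as plain numbers, and no parsing of multi-line records. The function name `lint_zone` is hypothetical.

```python
import ipaddress

# Record types whose right-hand side must be a name, not an address.
NAME_TYPES = {"NS", "MX", "CNAME", "DNAME"}
CLASSES = {"IN", "CH", "HS"}


def _is_ip(text):
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def lint_zone(text):
    """Return a list of problems found in zone file text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split(";", 1)[0].split()  # drop comments
        # A line that starts with whitespace has no owner name.
        rest = fields if line[:1].isspace() else fields[1:]
        # Skip the optional TTL and class before the record type.
        while rest and (rest[0].isdigit() or rest[0].upper() in CLASSES):
            rest = rest[1:]
        if len(rest) < 2 or rest[0].upper() not in NAME_TYPES:
            continue
        rtype, target = rest[0].upper(), rest[-1]
        if _is_ip(target):
            problems.append(f"line {number}: {rtype} target {target} is an IP address")
        elif "." in target and not target.endswith("."):
            problems.append(f"line {number}: {rtype} target {target} has no trailing dot")
    return problems
```

A target with dots but no trailing dot is only reported as suspicious, since a relative name such as `web` or `a.ns` can be intentional.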
Avoid _, / and % in names. For instance, _ is valid in DNS but is not valid in a hostname.
Try to give a machine the fewest names possible. While this contradicts the advice above to have one name per service (since one machine often hosts multiple services), at least reusing the name for a service is a good idea. For instance, if you host 5 domains, have them all use "ns.example.com" as their primary nameserver.
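A hostname check along these lines is sketched below. It assumes the usual hostname rules (labels of letters, digits and hyphens, 1 to 63 characters, not starting or ending with a hyphen, at most 253 characters in total); the function name `is_valid_hostname` is hypothetical.

```python
import re

# One label: letters, digits, hyphens; no hyphen at either end; 1-63 chars.
LABEL = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Return True if name is acceptable as a hostname."""
    if name.endswith("."):
        name = name[:-1]  # allow a fully qualified name with trailing dot
    if not 0 < len(name) <= 253:
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))
```

A name such as `my_host.example.com` is rejected because of the underscore, even though a DNS server would accept it as a record owner.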
so that no-one will ever accept mail that claims to be from that domain.
If you are running a nameserver in authoritative mode, avoid using it as a nameserver for stub resolvers (i.e., don't allow recursion for any host through it). This avoids problems where the nameserver configuration is out of date, and prevents issues with people intentionally (or unintentionally) poisoning your authoritative nameserver.
For best performance out of a DNS server, try to use one name for it, i.e., call your nameserver "ns1.example.com" in ALL of your zones. Also try to make sure that the TTLs for the NS records, the A records of your nameserver, and any other related glue are at least 432000 seconds (5 days). This makes sure that if anything goes wrong higher up in the hierarchy, your customers can still get to your site for approximately 2 days, giving you time to get the issue fixed. Since queries will still flow directly to your nameserver, you will be able to return other names (such as "www") directly even if the higher up zones are having issues.
You may want to use the same idea for MX records. Be aware that long TTLs make it difficult to migrate nameservers in the future, so remember to turn your TTLs down before you migrate.
If the flags line in the header of the output (for example, from dig) contains 'aa' (for authoritative answer), then the nameserver is authoritative for that domain.
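The same flag can be read programmatically from a raw DNS response. The sketch below assumes the response bytes were already obtained by sending a query to the server (not shown here); the function name `is_authoritative` is hypothetical.

```python
import struct

# "Authoritative answer" bit in the 16-bit flags field of the DNS header.
AA_BIT = 0x0400


def is_authoritative(response: bytes) -> bool:
    """Return True if a raw DNS response has the AA flag set."""
    if len(response) < 12:
        raise ValueError("DNS header is 12 bytes; response is too short")
    # The header starts with a 16-bit ID followed by the 16-bit flags.
    _ident, flags = struct.unpack("!HH", response[:4])
    return bool(flags & AA_BIT)
```

A flags value of 0x8500 (response, authoritative, recursion desired) has the bit set, while 0x8180 (a typical answer from a recursive cache) does not.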
For nameservers that are supposed to handle recursive lookups for stub resolvers, limit the IP ranges that can issue requests aggressively. People who can do recursive queries through your nameservers can end up with bad entries being cached.
The general consensus is "don't". You shouldn't run a recursive nameserver and an authoritative nameserver in the same process, due to the possibility of cache poisoning. DJB has a relatively clear explanation.
See also: NamedNotes