Troubleshooting DNS problems with DIG

February 6, 2020 by Jonas Lejon

One of the most powerful tools to have at the command line for working with DNS records is the dig command.

dig (Domain Information Groper) is the Swiss Army knife of DNS administrators. Let’s have a look at how we can use it to troubleshoot the most common problems.

Standard output in dig

By default, the output generated by dig is pretty verbose. Here’s an example where we look up the MX records for our domain hostdns.com.

$ dig hostdns.com MX

; <<>> DiG 9.10.6 <<>> hostdns.com MX
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34904
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;hostdns.com. IN MX

;; ANSWER SECTION:
hostdns.com. 3599 IN MX 1 aspmx.l.google.com.
hostdns.com. 3599 IN MX 10 aspmx2.googlemail.com.
hostdns.com. 3599 IN MX 10 aspmx3.googlemail.com.
hostdns.com. 3599 IN MX 5 alt1.aspmx.l.google.com.
hostdns.com. 3599 IN MX 5 alt2.aspmx.l.google.com.

;; Query time: 47 msec
;; SERVER: 192.168.126.5#53(192.168.126.5)
;; WHEN: Thu Oct 17 08:55:03 CEST 2019
;; MSG SIZE rcvd: 170

Wow, there’s a lot in there!

Let’s unpack this and see what’s relevant.

Querying a domain and a specific resource type

Our command was `dig hostdns.com MX`. It follows the format of `dig <domain> <resource-type>`. A resource type is any of the possible DNS record types, like `A`, `AAAA`, `CNAME`, …

Our output section reflects the query we just asked. It tells us a lot.

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34904
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

In the very first block, we see the query status: we received a valid reply (`NOERROR`). This could also have been an `NXDOMAIN`, which indicates that the (sub)domain name we were querying does not exist.

Below that are a series of interesting _flags_. The most common ones you’ll find are:

– `qr`: _query response_, indicates that what is shown next is the response to our DNS query (_`dig` can also show us the query, not just the response_).
– `rd`: _recursion desired_, we asked the nameserver to resolve the query recursively, meaning it may ask its upstream nameservers on our behalf if it doesn’t have the answer itself.
– `ra`: _recursion available_, the nameserver that received our query indicated that it _can_ do a recursive lookup (this can be explicitly denied in the config).
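
To pull just the status and flags out of a response, a small filter can help. This is a minimal sketch, not part of dig itself: `dig_header_summary` is a hypothetical helper name, and it assumes the header layout shown in the example output above.

```shell
# Hypothetical helper: summarize the header of a dig response read from stdin.
# Prints the status and the flag list on separate lines.
dig_header_summary() {
  awk '
    /->>HEADER<<-/ {
      # Find the word after "status:" and drop its trailing comma.
      for (i = 1; i <= NF; i++)
        if ($i == "status:") { s = $(i + 1); sub(/,$/, "", s); print "status=" s }
    }
    /^;; flags:/ {
      # Everything between "flags:" and the first ";" is the flag list.
      line = $0
      sub(/^;; flags: */, "", line)
      sub(/;.*/, "", line)
      print "flags=" line
    }
  '
}

# Usage (needs network access):
#   dig hostdns.com MX | dig_header_summary
```

If `ra` is missing from the flag list, the server you asked does not offer recursion to you.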

The next most important section is the actual reply. In our example, it looked like this.

;; ANSWER SECTION:
hostdns.com. 3599 IN MX 1 aspmx.l.google.com.
hostdns.com. 3599 IN MX 10 aspmx2.googlemail.com.
hostdns.com. 3599 IN MX 10 aspmx3.googlemail.com.
hostdns.com. 3599 IN MX 5 alt1.aspmx.l.google.com.
hostdns.com. 3599 IN MX 5 alt2.aspmx.l.google.com.

This indicates that our query (the `MX` records of the domain `hostdns.com`) could be found. There were 5 results, spread over three MX priorities (1, 5 and 10). The lower the number, the more preferred the mailserver.

The format of the reply is as follows:

<domain> <TTL> IN <type> <answer>

The `TTL` (Time To Live) indicates how long, in seconds, the DNS response should be cached (or considered “valid”). `IN` is the record class (Internet). The data in `<answer>` is what we’re actually after: for an `MX` record it contains the priority and the mailserver responsible for our domain.
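
Because every answer line follows that fixed format, it is easy to process with standard tools. As a rough sketch (the function name `mx_by_priority` is made up for this example), this lists the mailservers in the order a sending server would try them:

```shell
# Hypothetical helper: read dig output on stdin and list MX targets,
# lowest priority value (most preferred) first.
mx_by_priority() {
  # Answer line fields: 1=name 2=TTL 3=class 4=type 5=priority 6=mailserver.
  # Comment lines starting with ";" never match and are skipped.
  awk '$3 == "IN" && $4 == "MX" { print $5, $6 }' | sort -k1,1n
}

# Usage (needs network access):
#   dig hostdns.com MX | mx_by_priority
```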

Below that, we see some debug & diagnostic data:

;; Query time: 47 msec
;; SERVER: 192.168.126.5#53(192.168.126.5)

The nameserver took 47 msec to reply. Since we didn’t explicitly tell `dig` which nameserver to query, it used our OS default. In this case the query was answered by a nameserver on IP `192.168.126.5`.

Querying the full hierarchy of nameservers

Whenever you’re debugging DNS issues, it’s wise to do a full “trace” of the records. DNS works hierarchically: it has a set of root nameservers, below that some TLD-specific ones, then the nameservers for your own domain, … The _tree_ for `hostdns.com` looks like this.

root nameservers:
  a.root-servers.net.
  b.root-servers.net.
  ...
  \_ .com nameservers:
       a.gtld-servers.net.
       b.gtld-servers.net.
       ...
       \_ hostdns.com nameservers:
            ns-47.awsdns-05.com.
            ns-934.awsdns-52.net.

We can see this information too when we query using the `+trace` flag. If we then only ask for the `NS` (nameserver) records, our response is filtered to just what we need.

$ dig +trace hostdns.com NS
. 84482 IN NS a.root-servers.net.
. 84482 IN NS b.root-servers.net.
[...]

com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
[...]

hostdns.com. 172800 IN NS ns-47.awsdns-05.com.
hostdns.com. 172800 IN NS ns-934.awsdns-52.net.
[...]

Why is this useful? Your local nameserver (or your ISP’s nameservers) might be caching old records. With `+trace`, `dig` bypasses that cache: it starts at the root nameservers and follows each delegation itself, so every answer comes straight from the nameservers responsible for that level of the tree. If you get the correct result with `+trace` and the wrong one without that flag, there’s a good chance DNS caching (and long TTLs) is involved and you need to wait things out. Once the TTL has expired, your resolver will fetch the updated records.
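
Real `+trace` output is a lot longer than the excerpt above, so it can help to reduce it to the chain of delegations. The filter below is only a sketch: `trace_zones` is a hypothetical name, and it assumes the `<domain> <TTL> IN NS <answer>` line format shown in this post.

```shell
# Hypothetical helper: read `dig +trace` output on stdin and print each zone
# that returned NS records, in the order the delegations were followed.
trace_zones() {
  # Print a zone name the first time an NS record for it appears.
  awk '$3 == "IN" && $4 == "NS" && !seen[$1]++ { print $1 }'
}

# Usage (needs network access):
#   dig +trace hostdns.com NS | trace_zones
```

For `hostdns.com` you would expect three levels: the root (`.`), `com.` and `hostdns.com.`. If the chain stops early, the delegation at that level is where to look.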

Querying a specific nameserver

Another useful trick is to query a specific nameserver. Let’s say you have 4 nameservers. How do you know all 4 give the same reply? You can use the `@` flag in `dig` to query a nameserver you pick.

$ dig +short +noshort @ns-47.awsdns-05.com. hostdns.com A
hostdns.com. 3600 IN A 206.189.247.1
$ dig +short +noshort @ns-1693.awsdns-19.co.uk. hostdns.com A
hostdns.com. 3600 IN A 206.189.247.1
$ dig +short +noshort @ns-47.awsdns-05.com. hostdns.com A
hostdns.com. 3600 IN A 206.189.247.1
$ dig +short +noshort @ns-934.awsdns-52.net. hostdns.com A
hostdns.com. 3600 IN A 206.189.247.1

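Notice the use of `@ns-47.awsdns-05.com.` in our `dig` command? That tells our `dig` client to query that specific nameserver instead of the OS default.

The extra flags `+short +noshort` are a useful addition: `+short` switches off the verbose parts of the output, and `+noshort` then brings back the full record format. The result is short and to the point. No debug or verbose output, just the query response we asked for. (`+noall +answer` is a more commonly seen way to get similar output.)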
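
Checking nameservers one by one gets tedious, so the comparison can be scripted. The following is a minimal sketch under a few assumptions: `check_ns_consistency` is a hypothetical function name, it uses plain `+short` so that only the record data is compared, and it treats the first nameserver's answer as the reference.

```shell
# Hypothetical helper: ask each listed nameserver for the same record and
# report any that disagree with the first one.
# Arguments: <domain> <type> <nameserver>...
check_ns_consistency() {
  domain=$1
  rtype=$2
  shift 2
  first=1
  mismatches=0
  for ns in "$@"; do
    # Sort so that a different record order is not reported as a mismatch.
    answer=$(dig +short "@$ns" "$domain" "$rtype" | sort)
    if [ "$first" -eq 1 ]; then
      reference=$answer
      first=0
    fi
    if [ "$answer" = "$reference" ]; then
      echo "OK       $ns"
    else
      echo "MISMATCH $ns"
      mismatches=$((mismatches + 1))
    fi
  done
  # Exit status is 0 only when every nameserver agreed.
  [ "$mismatches" -eq 0 ]
}

# Usage (needs network access):
#   check_ns_consistency hostdns.com A ns-47.awsdns-05.com. ns-934.awsdns-52.net.
```

A `MISMATCH` line usually means one nameserver has not picked up a zone change yet.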

With these tips, you should be able to get some basic troubleshooting going. You’ll be able to better understand the output of the `dig` command, see the entire _tree_ of nameservers that can reply and query specific nameservers for troubleshooting.

If you have any other debugging tips, let us know in the comments below!

Guest blog post by Mattias Geniar