Domain Name System: End to End
We all interact with the Domain Name System (DNS) every day. Every time we load a web page, or click a link, our software relies on DNS to figure out what address to send requests to. In this post, I will dig into how DNS works by tracing through the machinery that happens behind the scenes.
What is the purpose of DNS?
It is first worth calling out what DNS is and why anybody should care about it. A good place to start is RFC 1035, "Domain Names - Implementation and Specification". In the introduction, it says:
The goal of domain names is to provide a mechanism for naming resources
in such a way that the names are usable in different hosts, networks,
protocol families, internets, and administrative organizations.
Domain names are supposed to be a convenient, general-purpose way to refer to resources. Convenient in this case most likely means convenient relative to referring to things by an actual host address. Practically, this means DNS turns "google.com" into an IP address like "184.108.40.206". This provides a number of benefits. "google.com" is much easier to remember than a series of four numbers. Also, a web service can change which host address it uses without end users even noticing.
DNS Resolution "by hand"
Now that we know the goal of DNS, I want to demonstrate how DNS works by crafting DNS queries using dig. I will ignore a lot of pieces of the system for this illustration (like caching and recursive resolution). I will try to resolve "pcarleton.com".
In order to figure out what IP address is serving "pcarleton.com.", I can work my way down the hierarchy. There are servers at each level of the hierarchy that can tell me where to find information for the level below. The domain name "pcarleton.com" becomes ["", "com", "pcarleton"] in the hierarchy, where the empty string is the implied "root" domain.
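As a minimal sketch of that split, the labels can be produced with a few lines of Python. The helper name name_to_hierarchy is my own invention for illustration and is not part of any DNS library.

```python
def name_to_hierarchy(name: str) -> list[str]:
    """Split a domain name into labels ordered from the root down.

    The root is represented by the empty string, matching the
    ["", "com", "pcarleton"] notation used in the text.
    """
    # Drop one trailing dot, if present, so "pcarleton.com." and
    # "pcarleton.com" give the same result.
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".") if name else []
    # Reverse so the most general label (the TLD) comes first,
    # then prepend the implied root.
    return [""] + labels[::-1]


print(name_to_hierarchy("pcarleton.com"))  # ['', 'com', 'pcarleton']
```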
A name server at the root level can tell me which name servers are responsible for each Top Level Domain (TLD) like ".com", ".net", etc. In order to query a root server, I need to know one of its IP addresses. I can pick one from the IANA website, so I'll pick a.root-servers.net and use its IP address. I can then query this root server for the responsible name servers with dig -t ns pcarleton.com @220.127.116.11. This gives me a list of servers which are responsible for the .com TLD.
The root server gave a list of .com nameservers (and IP addresses) with names like a.gtld-servers.net, lettered A through M. For instance, a.gtld-servers.net has address 18.104.22.168. I can then query this address to ask the .com name server which name server is responsible for the pcarleton domain name with dig -t ns pcarleton.com @22.214.171.124.
Namecheap DNS servers
This query reveals that dns1.registrar-servers.com has information for pcarleton.com (and has IP 126.96.36.199). This name server is run by Namecheap, which is where I registered pcarleton.com. If we query this one with dig pcarleton.com @188.8.131.52, we get the IP address of this site, which we can then use.
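The three dig queries follow one pattern: ask a server, and either get an answer or get pointed at the next server down. Here is a rough Python sketch of that loop against a toy, in-memory set of servers. The server names come from the walkthrough, but every IP address is a placeholder from the reserved documentation ranges, and the query and resolve helpers are hypothetical stand-ins rather than a real DNS client.

```python
# Toy, in-memory stand-in for the three servers queried in the walkthrough.
# All IP addresses are placeholders from the documentation ranges (RFC 5737).
ROOT_SERVER = "192.0.2.1"

SERVERS = {
    # Root server: knows who is responsible for each TLD.
    "192.0.2.1": {"referrals": {"com.": ("a.gtld-servers.net", "192.0.2.2")}},
    # .com TLD server: knows which name server handles each registered domain.
    "192.0.2.2": {"referrals": {"pcarleton.com.": ("dns1.registrar-servers.com", "192.0.2.3")}},
    # Authoritative server: holds the actual address record.
    "192.0.2.3": {"answers": {"pcarleton.com.": "203.0.113.10"}},
}


def query(server_ip, name):
    """Simulate one query: return ("answer", ip) or ("referral", (ns_name, ns_ip))."""
    zone = SERVERS[server_ip]
    if name in zone.get("answers", {}):
        return "answer", zone["answers"][name]
    # Refer the client to the first delegation whose suffix matches the name.
    for suffix, ns in zone.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", ns
    raise LookupError(f"{server_ip} knows nothing about {name}")


def resolve(name):
    """Follow referrals from the root until some server returns an address."""
    if not name.endswith("."):
        name += "."  # make the implied root explicit
    server_ip = ROOT_SERVER
    path = []
    for _ in range(10):  # guard against referral loops
        path.append(server_ip)
        kind, data = query(server_ip, name)
        if kind == "answer":
            return data, path
        _ns_name, server_ip = data  # move one level down the hierarchy
    raise LookupError("too many referrals")


address, path = resolve("pcarleton.com")
print(address, path)
```

The returned path lists the servers in the order they were contacted: root, then TLD, then the authoritative server, mirroring the three dig commands.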
This example demonstrated how to interact with the DNS system, but it did not show how changes to that information would propagate. To understand that, let's look back at what data was required:
Changes to the list of root IPs should essentially never happen. If one did, system admins all over the world would have to push a file listing the new IPs to all running DNS servers.
Changes to the mapping of TLDs to authoritative name servers do not happen frequently, and are administered by ICANN. They are communicated to all the root servers, which update their records and serve them in response to requests.
The TLD servers' mapping (the Registry) changes more frequently. Every time a new domain name is registered, the TLD's servers need to know which name servers hold the IP addresses for that domain. A Registrar makes a request to the Registry to update these records.
Changes to the final DNS servers' mapping of domain name to IP address can happen much more frequently, since the mapping only needs to be updated on the two DNS servers listed. I can change this by making a request to Namecheap.
This example showed how some of the pieces of the DNS system work, but it is not typically how a DNS request goes. In reality, the client usually makes a request to a DNS server which has a lot of information cached (like the .com TLD servers) and will make requests to the appropriate servers rather than telling the client which nameservers to look at.
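To illustrate why a caching server saves work, here is a small sketch of a resolver that answers from its cache and only does the full lookup on a miss. Everything in it is an assumption for illustration: the CachingResolver class, the fake_lookup stand-in, the placeholder address, and the fixed five-minute lifetime (real DNS records carry their own time-to-live).

```python
import time


class CachingResolver:
    """Toy resolver that answers from its cache and only looks upstream on a miss."""

    def __init__(self, lookup, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = lookup      # hypothetical function doing the full walk from the root
        self._ttl = ttl_seconds    # assumed fixed lifetime for every cached entry
        self._clock = clock        # injectable so expiry can be tested without waiting
        self._cache = {}           # name -> (address, expiry time)
        self.upstream_calls = 0

    def resolve(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]       # cache hit: no other server is contacted
        self.upstream_calls += 1
        address = self._lookup(name)  # cache miss: the resolver does the work for the client
        self._cache[name] = (address, now + self._ttl)
        return address


def fake_lookup(name):
    # Stand-in for the root -> TLD -> authoritative walk; the address is a placeholder.
    return {"pcarleton.com": "203.0.113.10"}[name]


resolver = CachingResolver(fake_lookup)
first = resolver.resolve("pcarleton.com")
second = resolver.resolve("pcarleton.com")
print(first, second, resolver.upstream_calls)
```

The second call is served from the cache, so the upstream lookup runs only once for the two requests.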
Here's a list of resources to look for further information about DNS:
You can test this out locally by running dig google.com from the command line. If you put this IP address into your browser, it will load google.com. ↩︎
It is not difficult to imagine how a bad actor could cause trouble with this too. If they were to compromise DNS information, they could cause clients to send information to a location where they can intercept it. For this post, I won't get into that, and I will pretend that everybody on the internet is behaving nicely. ↩︎
pcarleton.com is implicitly re-written as "pcarleton.com." (note the trailing period) where the root domain is the empty string at the end. ↩︎
Ordinary websites do not load "iana.org" in order to figure out root servers. These 13 name servers are like internet constants in that their IP addresses almost never change. Domain name servers usually have these IP addresses in a file somewhere. ↩︎
I verified this was Namecheap's by going to whois.com. This info can also be obtained with the command whois dns1.registrar-servers.com --host whois.enom.com (I got the whois.enom.com part from first issuing the command without the --host argument). Enom must be the registrar that Namecheap used to register its domain. That leads me to wonder if there are any circular registrar dependencies, but it looks like Enom used itself to register "enom.com". ↩︎
This is assuming that the request is "recursive". A recursive request asks that the server make requests to other DNS servers if it doesn't have the answer. An "iterative" request by contrast asks that the server tell the client what server will have the answer if it does not have the answer. A DNS server gets to decide whether it supports "recursive" requests. ↩︎
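On the wire, the difference between the two kinds of request in the last note comes down to a single bit in the query header, the "recursion desired" (RD) flag from RFC 1035. The sketch below builds the raw bytes of a query with and without that bit. The build_query helper and the fixed query ID are illustrative assumptions, and the code only constructs the packet without sending it.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit in the header flags field (RFC 1035, section 4.1.1)
QTYPE_A = 1       # host address record
QCLASS_IN = 1     # the Internet class


def build_query(name, recursive, query_id=0x1234):
    """Build the bytes of a DNS query for an A record, with or without the RD flag."""
    flags = RD_FLAG if recursive else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    # (six 16-bit big-endian fields, 12 bytes in total).
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Question name: each label prefixed with its length, ending with a
    # zero-length label that stands for the root.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPE_A, QCLASS_IN)


recursive_query = build_query("pcarleton.com", recursive=True)
iterative_query = build_query("pcarleton.com", recursive=False)
print(recursive_query.hex())
print(iterative_query.hex())
```

The two packets differ only in the flags field. dig sets this bit by default and clears it when run with +norecurse.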