What happens when we type a URL in our browser and press Enter?

1. Parse the URL

To visit any website on the Internet we need to know the address of the server that is hosting that website. That address is a number called the IP address. IP stands for Internet Protocol. A protocol is a set of rules that defines a method of exchanging information over a computer network. That is what the internet really is: a network made up of billions of connected computers, each with its own IP address.

For example, one of google.com’s IP addresses is 172.217.5.101, so typing http://172.217.5.101 in your browser takes you to google.com.

The first thing the browser does is check if what we entered is an IP. If it is, then it will try to connect to it. In reality though, it’s impossible for us to remember all the IPs of the many websites we visit, so we remember their URLs instead. But URLs are not addresses and our browser needs the exact address. So how does it figure that out? Naturally, it looks it up!
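The check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular browser implements it; the function name classify_host is made up for this example.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_host(url):
    """Return (host, is_ip) for the host part of a URL."""
    host = urlsplit(url).hostname or ""
    try:
        # ip_address() raises ValueError for anything that is not a valid IP
        ipaddress.ip_address(host)
        return host, True
    except ValueError:
        return host, False


print(classify_host("http://172.217.5.101"))
print(classify_host("https://mail.google.com/mail"))
```

The first call reports an IP that can be connected to directly, while the second reports a name that still has to be looked up.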

2. Lookup Address

Local Lookup

There are several different locations where the IP address can be found. First the browser checks these two locations on its own computer.

Browser Cache

The browser searches its cache. A cache (pronounced “cash”) is a hardware or software component that stores data so that future requests for that data can be served faster. The Chrome browser, for example, caches the IPs of the domains you visit for a short time (a minute or less), after which the entries expire. If you use Chrome, you can open chrome://net-internals/#dns to reach its DNS cache page.
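To make the idea of an expiring cache concrete, here is a minimal sketch of a lookup table whose entries go stale after a fixed time. The class name DnsCache, the 30-second lifetime and the address 192.0.2.10 are all assumptions chosen for illustration; a real browser cache is more involved. A fake clock is passed in so the expiry can be shown without actually waiting.

```python
import time


class DnsCache:
    """Hypothetical hostname -> IP cache with a fixed time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # hostname -> (ip, expiry time)

    def put(self, hostname, ip):
        self.entries[hostname] = (ip, self.clock() + self.ttl)

    def get(self, hostname):
        entry = self.entries.get(hostname)
        if entry is None:
            return None
        ip, expires_at = entry
        if self.clock() >= expires_at:
            # Entry is too old: forget it so the caller looks it up again
            del self.entries[hostname]
            return None
        return ip


now = [0.0]  # fake clock, in seconds
cache = DnsCache(ttl_seconds=30, clock=lambda: now[0])
cache.put("example.com", "192.0.2.10")
fresh = cache.get("example.com")   # looked up right away
now[0] = 31.0                      # pretend 31 seconds have passed
stale = cache.get("example.com")   # looked up after the entry expired
print(fresh, stale)
```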

Hosts File

Second, the browser looks in the operating system’s hosts file. The hosts file is a text document that has a list of IPs with their associated domains.

Windows:
C:\Windows\System32\drivers\etc\hosts
Mac and Linux:
/etc/hosts
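The hosts file format is simple enough to read by hand: each line holds an IP followed by one or more names, and anything after a # is a comment. The sketch below parses a made-up sample in that format rather than your real file; parse_hosts and the sample entries are assumptions for illustration.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


sample = """\
# sample hosts file
127.0.0.1   localhost
::1         localhost ip6-localhost
192.0.2.10  intranet.example   # made-up entry
"""

hosts = parse_hosts(sample)
print(hosts)
```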

External Lookup

If the IP address cannot be found locally, the browser must do an external search. This is where DNS comes to the rescue!
DNS stands for Domain Name System, a database hosted on multiple servers around the world that contains records of the domains on the internet and their IP address(es). DNS makes browsing the internet human-friendly: we no longer have to remember any numbers, only the domain name, and DNS gives us the IP. DNS lookup proceeds as follows:

Router Cache

If the IP is not found on the browser’s computer, the query goes to the router on the local network, which may keep its own DNS cache. If the record isn’t found there either, the query is passed on to the ISP.

ISP DNS Cache

ISP stands for Internet Service Provider, which is the company you pay monthly to have Internet access.
ISPs have DNS cache servers in their data centers. Those servers are the next thing to get checked. If the record is still not found, the ISP’s DNS server initiates a recursive DNS query.

Recursive DNS Query


Simply put, a recursive DNS query is one in which the ISP’s DNS server does the legwork for us: it asks a chain of other DNS servers, each pointing it closer to the answer, until it finds the correct record, which is then returned to the browser.

Let’s say we’re trying to find the IP of mail.google.com. The ISP’s DNS server first asks a root DNS server, which refers it to the DNS server for the .com top-level domain, which in turn refers it to the google.com DNS server. That server returns the IP of mail.google.com, which is passed all the way back to the browser.

Note that even though we may have to go through all these steps to get the IP, it all happens extremely quickly and we usually don’t notice any wait. Also, each DNS server caches any IP it did not already have for a certain amount of time, so it can be returned quickly the next time it’s requested.
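The chain of referrals for mail.google.com can be imitated with a toy model in which each "server" is just a dictionary. Nothing here touches the network: the server names, the root entry used as the starting point, and the address 192.0.2.25 are invented for this sketch, and real DNS servers exchange far richer records.

```python
# Each toy server either holds the final record or refers us to another server.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}},
    "tld-com": {"referrals": {"google.com": "ns-google"}},
    "ns-google": {"records": {"mail.google.com": "192.0.2.25"}},
}


def resolve(hostname, start="root"):
    """Follow referrals from server to server; return (ip, servers_asked)."""
    server = start
    asked = [server]
    while True:
        zone = SERVERS[server]
        if hostname in zone.get("records", {}):
            return zone["records"][hostname], asked
        next_server = None
        for suffix, target in zone.get("referrals", {}).items():
            if hostname == suffix or hostname.endswith("." + suffix):
                next_server = target
                break
        if next_server is None:
            raise LookupError(f"no record for {hostname}")
        server = next_server
        asked.append(server)


ip, asked = resolve("mail.google.com")
print(ip, asked)
```

The returned list shows the order in which the servers were asked, which mirrors the walk from the top of the hierarchy down to the server that actually knows the answer.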

3. Connect to Server

Initiate TCP/IP Connection

Once the browser (which is the client in a server/client relationship) receives the IP, it will attempt to establish a connection to the website’s hosting server using TCP (Transmission Control Protocol).

Before we reach the server we usually pass through some network components, most importantly the load balancer and the firewall described below.

The connection is established using a TCP three-way handshake:

1. Client asks the server if it is open for connections by sending it a SYN (synchronize) packet.

2. If the server can accept the connection it responds with an ACK (Acknowledgement) of the SYN by sending a SYN/ACK packet.

3. Client receives SYN/ACK packet from server and acknowledges by sending another ACK packet.

Now we have an established TCP/IP connection and we can transfer data back and forth!
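In application code we never send the SYN and ACK packets ourselves; the operating system performs the handshake when we ask for a connection. The sketch below shows this using Python’s socket module with a throwaway server on the local machine (127.0.0.1 and an OS-chosen port are assumptions made so the example needs no internet access).

```python
import socket

# A throwaway server on the loopback interface; port 0 lets the OS pick a port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

# create_connection() returns once the OS has completed SYN, SYN/ACK, ACK.
client = socket.create_connection(("127.0.0.1", port), timeout=5)
conn, peer = server.accept()

# With the connection established, data can flow.
client.sendall(b"hello")
received = conn.recv(1024)
print(received)

for s in (client, conn, server):
    s.close()
```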

Load Balancer

The IP address we get for the website we’re trying to visit often belongs to a server called the load balancer. The load balancer does what you would imagine: it splits the load, or web traffic, across multiple servers. Large websites that receive a lot of traffic need to do this because one server is not enough for the potential millions of users trying to connect.
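One simple way to split traffic is round robin, where each new request goes to the next server in the list. This is only one of several strategies a load balancer might use, and the class name and backend addresses below are made up for the sketch.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands out backends in rotation."""

    def __init__(self, backends):
        self._backends = cycle(backends)

    def pick(self):
        return next(self._backends)


balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
assignments = [balancer.pick() for _ in range(5)]
print(assignments)
```

After the third request the rotation wraps around, so the fourth request lands on the first server again.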

Firewall

Sometimes the IP we get belongs to a firewall which is a very important security component in the network stack. A firewall can be implemented as software or hardware. Firewalls exist at different locations in the network, either before or after load balancers. The firewall is a barrier between a trusted and an untrusted network. It has rules that define who is allowed to access the network and who will be blocked.
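The rules a firewall applies can be pictured as an ordered list that is checked from top to bottom, with the first match deciding the outcome. The sketch below assumes a tiny made-up rule set (block one address range, allow everything else over IPv4) and is far simpler than a real firewall, which also looks at ports, protocols and connection state.

```python
import ipaddress

# Hypothetical rules, checked in order; the first matching rule wins.
RULES = [
    ("deny", ipaddress.ip_network("203.0.113.0/24")),
    ("allow", ipaddress.ip_network("0.0.0.0/0")),
]


def decide(source_ip, rules=RULES):
    """Return 'allow' or 'deny' for a source address."""
    addr = ipaddress.ip_address(source_ip)
    for action, network in rules:
        if addr in network:
            return action
    return "deny"  # nothing matched: block by default


print(decide("203.0.113.7"))
print(decide("198.51.100.4"))
```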

Database

Many websites store their information in an SQL database. Larger websites split their database across multiple servers. SQL (Structured Query Language) is used to query and manipulate the database. SQL commands such as SELECT, INSERT, UPDATE, DELETE, CREATE, and DROP accomplish almost everything one needs to do with a database.
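The SQL commands mentioned above can be tried out with SQLite, a small database engine that ships with Python. The users table and its rows are invented for this example, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # temporary database that lives in RAM

db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
db.execute("INSERT INTO users (name, email) VALUES (?, ?)", ("Ada", "ada@example.com"))
db.execute("INSERT INTO users (name, email) VALUES (?, ?)", ("Bob", "bob@example.com"))
db.execute("UPDATE users SET email = ? WHERE name = ?", ("ada@new.example", "Ada"))
db.execute("DELETE FROM users WHERE name = ?", ("Bob",))

rows = db.execute("SELECT name, email FROM users ORDER BY id").fetchall()
print(rows)

db.execute("DROP TABLE users")
db.close()
```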

4. HTTP(S) Request

HTTP

HTTP is the Hypertext Transfer Protocol. It is the underlying protocol used by the World Wide Web to define how messages are formatted and transmitted between a Web server and a browser.

HTTPS

If you look closely at the URL displayed in your browser, you will often see it start with https rather than http. HTTPS is the secure version of the HTTP protocol. That means that the connection between us and the server is encrypted and can’t be deciphered by anyone listening in on our communication. This is very important when you are dealing with sensitive information. You don’t want anyone finding out your banking, credit card or social security number, for example.

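SSL/TLS

HTTPS uses TLS (Transport Layer Security), the successor to the older SSL (Secure Sockets Layer) protocol, to establish an encrypted connection between a web server and a browser. The name SSL has stuck, which is why the server’s certificate is still commonly called an SSL certificate. A certificate is necessary to create the encrypted connection. You can see the certificates being used by Chrome by clicking on Manage certificates under the Privacy and security preferences.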
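Client programs rely on the same certificate checks that a browser performs. As a small illustration, Python’s ssl module (which, despite its name, negotiates TLS) can create a default context whose settings show what a careful client insists on. This sketch only inspects the settings and does not open a connection.

```python
import ssl

# The default context is configured for a client talking to a server.
context = ssl.create_default_context()

# The server must present a certificate that validates against trusted CAs...
requires_certificate = context.verify_mode == ssl.CERT_REQUIRED
# ...and the certificate must match the hostname we asked for.
checks_hostname = context.check_hostname

print(requires_certificate, checks_hostname)
```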

GET Request

Once the TCP/IP connection is established, the browser uses HTTP to send a GET request to the web server. The server runs web server software, such as Apache or Nginx, that processes the GET request and sends the requested information back.
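A GET request is just a few lines of text sent over the connection. The sketch below builds a bare-bones HTTP/1.1 request by hand to show its shape; the helper name build_get_request and the host example.com are assumptions for this example, and a real browser sends many more headers.

```python
def build_get_request(host, path="/"):
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path and protocol version
        f"Host: {host}",         # which website on this server we want
        "Connection: close",     # ask the server to close when it is done
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("example.com", "/index.html")
print(request.decode("ascii"))
```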

POST Request

Sometimes we need to send information to the server, like login info, or we might be submitting a form. In this case the browser will send an HTTP POST request to the server instead of a GET request. To see the client/server communication happening in the background when you browse the internet, check out the Network tab of your browser’s developer tools (Firebug, the Firefox plugin once used for this, has been discontinued). A lot more communication is happening than you’d expect!
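Unlike a GET, a POST request carries a body after its headers. The sketch below assumes a simple form submission encoded the way HTML forms are by default; the helper name, the /login path and the field values are all made up for illustration.

```python
from urllib.parse import urlencode


def build_post_request(host, path, fields):
    """Build the bytes of a minimal HTTP/1.1 POST carrying form fields."""
    body = urlencode(fields).encode("ascii")  # e.g. b"user=ada&..."
    head = "\r\n".join([
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",  # tells the server how much body follows
        "",
        "",
    ]).encode("ascii")
    return head + body


post = build_post_request(
    "example.com", "/login", {"user": "ada", "password": "s3cret pass"}
)
print(post.decode("ascii"))
```

Note that the form data travels in readable form inside the request, which is one reason login forms should only be submitted over HTTPS.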

5. Server Response

Server Response Codes

A server has multiple kinds of responses to the GET/POST requests it receives. Each response carries a 3-digit status code, the first digit of which tells us what family of messages the response belongs to. The one most of us are familiar with is the famous 404 Not Found.

It breaks down like this:

1xx: Informational response
2xx: Success
3xx: Redirection
4xx: Client error
5xx: Server error
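Since the family is determined by the first digit, sorting a code into its family takes one integer division. The sketch below uses the list above plus Python’s built-in table of standard status codes; the function name describe is an assumption for this example.

```python
from http import HTTPStatus

FAMILIES = {
    1: "Informational response",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code):
    """Return a readable description of an HTTP status code."""
    family = FAMILIES.get(code // 100, "Unknown")  # first digit picks the family
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "unrecognized code"
    return f"{code} {phrase} ({family})"


for code in (200, 301, 404, 500):
    print(describe(code))
```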

6. Display Webpage

Once the basic HTML code is received, it is processed and displayed by the browser. As it does so, the browser sends GET requests for other items that the HTML code refers to, like CSS stylesheets, JavaScript files, images and videos. That’s why it often seems to us that the page loads in parts, with the largest components finishing last.
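To see where those extra GET requests come from, we can scan a page for the items it refers to. The sketch below uses Python’s built-in HTML parser on a tiny made-up page; the class name ResourceFinder and the file names are assumptions, and a real browser recognizes many more kinds of references.

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collect URLs of stylesheets, scripts, images and videos in a page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(attrs["href"])
        elif tag in ("script", "img", "video") and attrs.get("src"):
            self.urls.append(attrs["src"])


page = """<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body><img src="/logo.png"></body></html>"""

finder = ResourceFinder()
finder.feed(page)
print(finder.urls)
```

Each URL in the resulting list would trigger its own request to the server.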