• What happens when we type a URL in our browser and press Enter?

  • 1. Parse the URL

  • 2. Lookup Address

  • 3. Connect to Server

  • 4. HTTP(S) Request

  • 5. Server Response

  • 6. Display Webpage

    • To visit any website on the Internet, we need to know the address of the server that hosts it. That address is a number called the IP address. IP stands for Internet Protocol. A protocol is a set of rules that defines a method of exchanging information over a computer network. That is what the Internet really is: a network made up of billions of connected computers, each with its own IP address.

      For example, one of google.com’s IPs is 172.217.5.101, so typing http://172.217.5.101 in your browser takes you to google.com.

      The first thing the browser does is check whether what we entered is an IP. If it is, the browser tries to connect to it directly. In practice, though, nobody can remember the IPs of all the websites they visit, so we remember their URLs instead. But a URL is not an address, and the browser needs the exact address. So how does it figure that out? Naturally, it looks it up!
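
      The check described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not how any particular browser implements it, and the function name host_is_ip is made up for the example.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(url: str) -> bool:
    """Return True if the host part of the URL is a literal IP address."""
    host = urlsplit(url).hostname  # e.g. "172.217.5.101" or "google.com"
    if host is None:
        return False  # no host found (for example, the scheme is missing)
    try:
        ipaddress.ip_address(host)  # raises ValueError for names like "google.com"
        return True
    except ValueError:
        return False


print(host_is_ip("http://172.217.5.101"))
print(host_is_ip("http://google.com"))
```

      When the function returns False, the host is a name and has to be looked up before the browser can connect.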

    • Local Lookup

      There are several places where the IP address can be found. The browser first checks two locations on its own computer: the browser cache and the hosts file.

    • External Lookup

      If the IP address cannot be found locally, the browser must do an external search. This is where DNS comes to the rescue!
      DNS stands for Domain Name System. It is a database distributed across many servers around the world, which together hold the records that map each domain on the Internet to its IP address(es). DNS makes browsing the Internet human-friendly: we no longer have to remember any numbers, only the domain name in the URL, and DNS gives us the IP. DNS lookup proceeds as follows:

    • Initiate TCP/IP Connection

      Once the browser (which is the client in a server/client relationship) receives the IP, it will attempt to establish a connection to the website’s hosting server using TCP (Transmission Control Protocol).

      The connection is established using the TCP three-way handshake.
      Before we get to the server, we usually pass through some other network components, most importantly the load balancer and the firewall described below.

    • Load Balancer

      The IP address we get for the website we’re trying to visit usually belongs to a server called the load balancer. The load balancer does what its name suggests: it splits the load, or web traffic, across multiple servers. Large websites that receive a lot of traffic need to do this because one server is not enough for the potentially millions of users trying to connect.
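
      One simple way to split traffic is round-robin: hand each new request to the next server in the list and start over at the end. The sketch below assumes that strategy and uses made-up backend names; real load balancers offer several strategies and also track server health.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(tuple(backends))

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Hypothetical backend names, for illustration only.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
print([balancer.pick() for _ in range(4)])
```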

      Firewall

      Sometimes the IP we get belongs to a firewall, which is a very important security component in the network stack. A firewall can be implemented in software or hardware. Firewalls sit at different locations in the network, either before or after load balancers. The firewall is a barrier between a trusted and an untrusted network. It has rules that define who is allowed to access the network and who is blocked.
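
      A rule set like the one described can be pictured as an ordered list that is checked from top to bottom. The following is a minimal sketch under that assumption; the rules and addresses are invented (taken from ranges reserved for documentation), and real firewalls also match on ports, protocols and connection state.

```python
import ipaddress

# Hypothetical rules, checked in order; the first match decides.
RULES = [
    ("deny", ipaddress.ip_network("203.0.113.0/24")),
    ("allow", ipaddress.ip_network("0.0.0.0/0")),
]


def is_allowed(source_ip, rules=RULES):
    """Return True if the first rule matching source_ip is an allow rule."""
    addr = ipaddress.ip_address(source_ip)
    for action, network in rules:
        if addr in network:
            return action == "allow"
    return False  # nothing matched: block by default


print(is_allowed("203.0.113.9"))
print(is_allowed("198.51.100.4"))
```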

      Database

      Many websites store their information in an SQL database. Larger websites split their database across multiple servers. SQL (Structured Query Language) is used to query and manipulate the database. SQL commands such as SELECT, INSERT, UPDATE, DELETE, CREATE, and DROP accomplish almost everything one needs to do with a database.
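
      To make the six commands concrete, here is a small sketch that runs each of them against an in-memory SQLite database from Python’s standard library. The users table and its contents are invented for the example; a production website would typically talk to a separate database server.

```python
import sqlite3


def run_demo():
    """Run CREATE, INSERT, UPDATE, DELETE, SELECT and DROP once each."""
    conn = sqlite3.connect(":memory:")  # throwaway database held in RAM
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES (?)", ("ada",))
    conn.execute("INSERT INTO users (name) VALUES (?)", ("bob",))
    conn.execute("UPDATE users SET name = ? WHERE name = ?", ("alice", "ada"))
    conn.execute("DELETE FROM users WHERE name = ?", ("bob",))
    rows = conn.execute("SELECT id, name FROM users").fetchall()
    conn.execute("DROP TABLE users")
    conn.close()
    return rows


print(run_demo())
```

      The ? placeholders let the database driver insert the values safely instead of building the SQL text by hand.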

    • HTTP

      HTTP is the Hypertext Transfer Protocol. It is the underlying protocol used by the World Wide Web to define how messages are formatted and transmitted between a Web server and a browser.

      HTTPS

      If you look closely at the URL displayed in your browser, you will often see it start with https rather than http. HTTPS is the secure version of the HTTP protocol. It means that the connection between us and the server is encrypted and can’t be read by anyone listening in on our communication. This is very important when you are dealing with sensitive information. You don’t want anyone finding out your banking, credit card or social security number, for example.

      SSL/TLS

      HTTPS uses TLS (Transport Layer Security), the successor to the older SSL (Secure Sockets Layer) protocol, to establish an encrypted connection between a web server and a browser. The name SSL is still widely used for both. The server needs a certificate, commonly called an SSL certificate, to create such a connection. You can see the certificates known to Chrome by clicking on Manage certificates under the Privacy and security preferences.
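
      As a rough illustration of what a client requires before it trusts a server, the snippet below creates the default client-side settings in Python’s ssl module (which, despite its name, negotiates TLS). It opens no connection; it only shows that certificate and host name checks are switched on by default.

```python
import ssl

# Default client settings: verify the server's certificate against the
# system's trusted authorities and check that it matches the host name.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```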

      GET Request

      Once the TCP connection is established, the browser uses HTTP to send a GET request to the server. The server runs software called a web server, such as Apache or Nginx, that processes the GET request and sends the requested information back.
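
      A GET request is plain text sent over the connection. The sketch below builds a minimal HTTP/1.1 request by hand to show its shape; the function name and the example.com host are placeholders, and real browsers send many more headers.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # HTTP/1.1 requires the Host header
        "Connection: close",     # ask the server to close after responding
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("example.com", "/index.html").decode("ascii"))
```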

      POST Request

      Sometimes we need to send information to the server, such as login details or the contents of a form. In this case the browser sends an HTTP POST request to the server instead of a GET request. To see the client/server communication happening in the background while you browse, open the Network tab of your browser’s developer tools (Firefox’s Firebug plugin was long the popular choice, but it has been discontinued in favor of the built-in tools). A lot more communication is happening than you’d expect!
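
      The difference from GET is that a POST carries a body. This sketch prepares, but does not send, a form-style POST with Python’s urllib; the URL, field names and function name are invented for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(url, username, password):
    """Prepare (without sending) a form-encoded POST request."""
    body = urlencode({"username": username, "password": password}).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Supplying data makes urllib use the POST method instead of GET.
    return Request(url, data=body, headers=headers)


request = build_login_request("https://example.com/login", "ada", "s3cret")
print(request.get_method())
print(request.data)
```

      Login details like these should only ever travel over HTTPS, since the body of a plain HTTP request is not encrypted.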

    • Server Response Codes

      A server can respond in several ways to the GET/POST requests it receives. Each response carries a 3-digit status code, the first digit of which tells us what family of messages the response belongs to. The one most of us are familiar with is the famous 404 Not Found.

      It breaks down like this:

      1xx: Informational response
      2xx: Success
      3xx: Redirection
      4xx: Client error
      5xx: Server error
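
      The family of a code can be read off its first digit. The helper below is a small sketch (the name describe is made up) that combines the table above with the standard reason phrases shipped in Python’s http module.

```python
from http import HTTPStatus

FAMILIES = {
    1: "Informational response",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code):
    """Return the code, its standard reason phrase and its family."""
    family = FAMILIES[code // 100]    # the first digit selects the family
    phrase = HTTPStatus(code).phrase  # e.g. "Not Found" for 404
    return f"{code} {phrase} ({family})"


print(describe(404))
print(describe(200))
```

      Codes that Python does not know raise a ValueError here; a fuller version would fall back to the family name alone.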

    • Once the basic HTML code is received, the browser processes and displays it. It then sends further GET requests for the other items the HTML refers to, such as CSS stylesheets, JavaScript files, images and videos. That’s why a page often seems to load in parts, with the largest components tending to appear last.
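
      To see which follow-up requests a page triggers, we can scan its HTML for references to other files. This is a simplified sketch using Python’s built-in HTML parser on a made-up page; a real browser handles many more tags and attributes.

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collect URLs of stylesheets, scripts, images and videos."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(attrs["href"])
        elif tag in ("script", "img", "video") and attrs.get("src"):
            self.urls.append(attrs["src"])


def find_resources(html):
    finder = ResourceFinder()
    finder.feed(html)
    return finder.urls


# Hypothetical page, for illustration only.
PAGE = (
    '<html><head><link rel="stylesheet" href="style.css">'
    '<script src="app.js"></script></head>'
    '<body><img src="logo.png"></body></html>'
)
print(find_resources(PAGE))
```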

    • Browser Cache

      First, the browser searches its own cache. A cache (pronounced “cash”) is a hardware or software component that stores data so that future requests for that data can be served faster. Chrome, for example, caches the IPs of the domains you visit for 30 seconds, after which the entry expires. If you use Chrome, you can see your cached addresses at chrome://net-internals/#dns.
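
      The idea of an expiring cache can be sketched as a dictionary that remembers when each entry stops being valid. The class name and the injectable clock are choices made for this example, not a description of Chrome’s internals; the 30-second default simply mirrors the figure quoted above.

```python
import time


class DnsCache:
    """Remember hostname -> IP mappings for a limited time."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}   # hostname -> (ip, expiry time)

    def put(self, hostname, ip):
        self._entries[hostname] = (ip, self._clock() + self._ttl)

    def get(self, hostname):
        """Return the cached IP, or None if it is missing or expired."""
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]  # expired: forget it
            return None
        return ip


cache = DnsCache()
cache.put("google.com", "172.217.5.101")
print(cache.get("google.com"))
```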

    • Hosts File

      Second, the browser looks in the operating system’s hosts file. The hosts file is a text document that has a list of IPs with their associated domains.

      Windows:
      C:\Windows\System32\drivers\etc\hosts
      Mac and Linux:
      /etc/hosts
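
      Each line of a hosts file holds an IP followed by one or more names, and anything after a # is a comment. The parser below is a minimal sketch of that format; the sample entries are invented, and it assumes the first entry for a name wins.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # assumption: first entry wins
    return table


# Hypothetical file contents, for illustration only.
SAMPLE = """
# local names
127.0.0.1      localhost
192.168.1.10   intranet.example intranet  # two names for one IP
"""
print(parse_hosts(SAMPLE))
```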

    • Router Cache

      If the IP is not found on the browser’s computer, the next place checked is the router on the local network, which may keep a DNS cache of its own. If the record is not there either, the query goes on to the ISP.

    • ISP DNS Cache

      ISP stands for Internet Service Provider, which is the company you pay monthly to have Internet access.
      ISPs have DNS cache servers in their data centers. Those servers are the next thing to get checked. If the record is still not found, the ISP’s DNS server initiates a recursive DNS query.

    • Recursive DNS Query


      Simply put, a recursive DNS query passes through multiple DNS servers, each pointing to the next, until the correct record is found and returned to the browser.

      Let’s say we’re trying to find the IP of mail.google.com. The ISP’s DNS server first asks a root DNS server, which refers it to the DNS server for the top-level domain .com. That server refers it to the google.com DNS server, which returns the IP of mail.google.com, and the answer is passed all the way back to the browser.

      Note that even though we have to go through all these steps to get the IP, it all happens extremely quickly and we usually don’t notice any wait. Also, the DNS servers cache any IP they did not already have for a certain amount of time, so it can be served quickly the next time it’s requested.
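
      The chain of referrals can be imitated with a few dictionaries. Everything in this sketch is made up for illustration: the server names, the zone data and the IP (taken from a range reserved for documentation). It starts at a root server, as real resolvers do, follows referrals until some server holds the record, and caches the answer.

```python
# Hypothetical, hard-coded data; real DNS servers hold far more records.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}},
    "tld-com": {"referrals": {"google.com": "ns-google"}},
    "ns-google": {"records": {"mail.google.com": "203.0.113.7"}},
}


def resolve(hostname, cache):
    """Return (ip, servers_asked); servers_asked is empty on a cache hit."""
    if hostname in cache:
        return cache[hostname], []
    server, asked = "root", []
    while True:
        asked.append(server)
        data = SERVERS[server]
        if hostname in data.get("records", {}):
            cache[hostname] = data["records"][hostname]  # remember the answer
            return cache[hostname], asked
        # Otherwise follow the referral for the zone that contains hostname.
        for zone, next_server in data.get("referrals", {}).items():
            if hostname == zone or hostname.endswith("." + zone):
                server = next_server
                break
        else:
            raise LookupError(f"no record found for {hostname}")


cache = {}
print(resolve("mail.google.com", cache))
print(resolve("mail.google.com", cache))
```

      The second call returns an empty list of servers because the answer comes straight from the cache.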

    • 1. The client asks the server if it is open for connections by sending it a SYN (synchronize) packet.

      2. If the server can accept the connection, it acknowledges the SYN and sends its own by replying with a SYN/ACK packet.

      3. The client receives the SYN/ACK packet from the server and acknowledges it by sending an ACK packet.

      Now we have an established TCP/IP connection and we can transfer data back and forth!
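
      The three packets can be modeled to show how each side acknowledges the other. In TCP every SYN carries a starting sequence number, and the matching acknowledgement is that number plus one. The Segment class and the starting numbers below are invented for this sketch; nothing is sent over a real network.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flags: str    # "SYN", "SYN/ACK" or "ACK"
    seq: int      # sequence number of this segment
    ack: int = 0  # acknowledgement number (0 when nothing is acknowledged)


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged to open a connection."""
    # 1. Client -> server: SYN with the client's initial sequence number.
    syn = Segment("SYN", seq=client_isn)
    # 2. Server -> client: its own SYN plus an ACK of the client's SYN.
    syn_ack = Segment("SYN/ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. Client -> server: ACK of the server's SYN.
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```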
