The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. To display a page, the browser fetches the corresponding file from a web server. Parts omitted from the URL take default values: the http protocol, port 80, and a default file name.

(Computer Networks: HTTP Protocol. Professor Richard Harris, School of Engineering and Advanced Technology (SEAT).)
The Hypertext Transfer Protocol (HTTP) is an application-level protocol. The specification defines the protocol referred to as “HTTP/”, and is an update to RFC . HTTP is an asymmetric request-response client-server protocol: the client sends a request and the server returns a response. Media types carried in HTTP messages include "video/mpeg", "application/msword", and "application/pdf".

A simplified view of the protocol:

    session     ::= request response
    request     ::= requestLine header+ [body]
    requestLine ::= method path
    response    ::= status header+
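The simplified grammar above (request ::= requestLine header+ [body], requestLine ::= method path) can be turned into a small parser. The following is a minimal sketch, not a complete HTTP parser: the function name `parse_request`, the CRLF line endings, and the example host are assumptions made for illustration.

```python
def parse_request(raw: str) -> dict:
    """Parse `request ::= requestLine header+ [body]` from the simplified grammar.

    Assumes CRLF line endings and a blank line between the headers and the
    optional body (an assumption; the simplified grammar does not say).
    """
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # requestLine ::= method path
    method, path = lines[0].split(" ", 1)

    # header+ : one or more "Name: value" lines
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if not headers:
        raise ValueError("the grammar requires at least one header")

    return {"method": method, "path": path, "headers": headers, "body": body or None}
```

For instance, `parse_request("GET /index.html\r\nHost: example.com\r\n\r\n")` yields the method `GET`, the path `/index.html`, one header, and no body.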
The browser then parses this file, making additional requests corresponding to execution scripts, layout information (CSS) to display, and sub-resources contained within the page (usually images and videos). The Web browser then combines these resources to present a complete document, the Web page, to the user. Scripts executed by the browser can fetch more resources in later phases, and the browser updates the Web page accordingly.
A Web page is a hypertext document. This means some parts of the displayed text are links that can be activated (usually by a click of the mouse) to fetch a new Web page, allowing the user to direct their user-agent and navigate through the Web.

The Web server

On the opposite side of the communication channel is the server, which serves the document as requested by the client.
A server appears as only a single machine virtually: it may actually be a collection of servers sharing the load (load balancing), or a complex piece of software interrogating other computers (like a cache, a DB server, or e-commerce servers), totally or partially generating the document on demand.

A server is not necessarily a single machine; conversely, several server software instances can be hosted on the same machine.
Between the client and the server, numerous computers and machines relay the HTTP messages. Due to the layered structure of the Web stack, most of these operate at the transport, network or physical levels, becoming transparent at the HTTP layer while potentially having a significant impact on performance. Those operating at the application layer are generally called proxies. These can be transparent, forwarding the requests they receive without altering them in any way, or non-transparent, in which case they change the request in some way before passing it along to the server.
HTTP messages can be read and understood by humans, providing easier testing for developers, and reduced complexity for newcomers. New functionality can even be introduced by a simple agreement between a client and a server about a new header's semantics.
HTTP is stateless, but not sessionless

HTTP is stateless: there is no link between two requests being successively carried out on the same connection. This can be problematic for users attempting to interact with certain pages coherently, for example when using e-commerce shopping baskets.

HTTP doesn't require the underlying transport protocol to be connection-based; it only requires it to be reliable, that is, not to lose messages (or, at minimum, to present an error when it does).
HTTP therefore relies on the TCP standard, which is connection-based, even though a connection is not always required. Opening a separate TCP connection for each request/response pair is less efficient than sharing a single TCP connection when multiple requests are sent in close succession. Experiments are in progress to design a better transport protocol more suited to HTTP.

Caching and authentication methods were functions handled early in HTTP history. The ability to relax the origin constraint, by contrast, was added only much later. Here is a list of common features controllable with HTTP.
Caching: How documents are cached can be controlled by HTTP. The server can instruct proxies and clients about what to cache and for how long. The client can instruct intermediate cache proxies to ignore the stored document.

Relaxing the origin constraint: To prevent snooping and other privacy invasions, Web browsers enforce strict separation between Web sites. Only pages from the same origin can access all the information of a Web page.
Although this constraint is a burden to the server, HTTP headers can relax the strict separation on the server side, allowing a document to become a patchwork of information sourced from different domains; there can even be security-related reasons to do so.

Authentication: Some pages may be protected so that only specific users can access them.
Proxy and tunneling: Servers or clients are often located on intranets and hide their true IP address from other computers. HTTP requests then go through proxies to cross this network barrier. Not all proxies are HTTP proxies. Other protocols, like ftp, can be handled by these proxies.

Sessions: Using HTTP cookies allows requests to be linked with state kept on the server. This creates sessions, despite basic HTTP being a stateless protocol. This is useful not only for e-commerce shopping baskets, but also for any site allowing user configuration of the output.
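One common way to create such sessions on top of a stateless protocol is with HTTP cookies. The sketch below illustrates the idea using Python's standard `http.cookies` module. The cookie name `session_id`, the in-memory `SESSIONS` store, and the `handle_request` helper are hypothetical names chosen for this example, not part of any real server.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory store: session id -> per-user state kept on the server.
SESSIONS = {}


def handle_request(cookie_header):
    """Return (session_state, set_cookie_value_or_None) for one request.

    `cookie_header` is the value of the request's Cookie header, or None.
    """
    cookie = SimpleCookie()
    if cookie_header:
        cookie.load(cookie_header)

    morsel = cookie.get("session_id")
    if morsel is not None and morsel.value in SESSIONS:
        # Known client: link this request to the stored state.
        return SESSIONS[morsel.value], None

    # New client: create state and ask the browser to send the id back.
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"basket": []}
    reply = SimpleCookie()
    reply["session_id"] = session_id
    reply["session_id"]["httponly"] = True
    return SESSIONS[session_id], reply["session_id"].OutputString()
```

The first call returns a value for a `Set-Cookie` response header. When the browser echoes the cookie back on later requests, the same basket is found again, even though each HTTP request is independent.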
HTTP flow

When a client wants to communicate with a server, either the final server or an intermediate proxy, it performs the following steps:

Open a TCP connection: The TCP connection is used to send a request, or several, and receive an answer. The client may open a new connection, reuse an existing connection, or open several TCP connections to the servers.
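The first step, together with the request and response exchange it enables, can be sketched with plain sockets. To keep the example runnable without network access, it talks to a tiny local server that always sends the same canned reply. The names `CannedHandler`, `fetch`, and `demo`, as well as the reply itself, are illustrative assumptions.

```python
import socket
import socketserver
import threading

# Hypothetical fixed reply used by the local test server.
CANNED_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"hello"
)


class CannedHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Read the request head up to the blank line, then answer.
        while self.rfile.readline() not in (b"\r\n", b"\n", b""):
            pass
        self.wfile.write(CANNED_RESPONSE)


def fetch(host, port, path="/"):
    """Open a TCP connection, send one GET request, and read the whole reply."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body


def demo():
    """Start the local server on a free port, fetch once, then shut down."""
    with socketserver.TCPServer(("127.0.0.1", 0), CannedHandler) as server:
        port = server.server_address[1]
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            return fetch("127.0.0.1", port)
        finally:
            server.shutdown()
```

Here the client asks for `Connection: close`, so one connection carries exactly one request. Reusing a connection for several requests would require reading the reply by its `Content-Length` rather than until the connection closes.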
The server can also choose to encode the document before returning it to the client, to reduce the transmission time. The server must set the response header "Content-Encoding" to inform the client that the returned document is encoded. Common encoding methods include "x-gzip".

Connection: Close|Keep-Alive - The client can use this header to tell the server whether to close the connection after this request, or to keep the connection alive for another request.
Referer: referer-URL - The client can use this header to indicate the referrer of this request. If you click a link on web page 1 to visit web page 2, web page 1 is the referrer for the request to web page 2. All major browsers set this header, which can be used to track where the request comes from, for web advertising or content customization. Nonetheless, this header is not reliable and can be easily spoofed. Note that "Referrer" is misspelled as "Referer" in the header name; unfortunately, you have to use the misspelled form too.

User-Agent: browser-type - Identifies the type of browser used to make the request.
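The server can use this information to return a different document depending on the type of browser.

Cache-Control: no-cache - Asks caches not to serve a stored copy of the document without checking with the server. An HTTP/1.0 server does not recognize this header. Instead, it uses "Pragma: no-cache".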
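Authorization - Used by the client to supply its credentials for protected resources. This header will be described in a later chapter on authentication.

Cookie - Used by the client to return cookies set earlier by the server. This header will be discussed in a later chapter on state management.

If-Modified-Since: date - Tells the server to send the page only if it has been modified after the specified date.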
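The server-side decision behind If-Modified-Since can be sketched as follows. This is a minimal illustration using the standard `email.utils` date parser. The function name `conditional_status` is hypothetical, and the choice to ignore an unparsable date is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the resource is unchanged since the client's date, else 200.

    `last_modified` must be a timezone-aware datetime; `if_modified_since`
    is the raw header value, or None when the header is absent.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparsable date: behave as if the header were absent
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    modified = last_modified.replace(microsecond=0)
    return 200 if modified > since else 304
```

A 304 reply carries no body, which tells the client to keep using its stored copy. A 200 reply carries the full, updated page.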
GET Request for Directory

Suppose that a directory called "testdir" is present in the document base directory "htdocs". If a client requests this directory and it contains a default index file, the server returns that file. Otherwise, the server returns the directory listing, if directory listing is enabled in the server configuration.
Otherwise, the server returns a "Page Not Found" error.

The following trace was captured using telnet. A connection is established with the proxy server, and a GET request is issued. An absolute request-URI is used in the request line.

A HEAD request is similar to a GET request. However, the server returns only the response header, without the response body that contains the actual document.
Sometimes, HEAD is not listed.

Based on the data submitted, the server takes an appropriate action and produces a customized response. Once users fill in the requested data and hit the submit button, the browser packs the form data and submits it to the server, using either a GET request or a POST request. Each field has a name and can take on a specified value. The browser joins the fields as name=value pairs separated by "&"; this is known as a query string. With a GET request, the browser sends the query string to the server as part of the request URL.
Special characters are not allowed inside the query string; they must be URL-encoded.

With a GET request, the amount of data that can be sent is limited. If it exceeds a server-specific threshold, the server returns a "Request URI too Large" error. In addition, the URL-encoded query string appears in the address box of the browser. The POST method overcomes these drawbacks.

If the POST request method is used, the query string is sent in the body of the request message, where the amount of data is not limited. The request headers Content-Type and Content-Length notify the server of the type and the length of the query string. The POST method will be discussed later.
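The difference between the two submission methods can be sketched by building both requests from the same form fields. This is an illustrative sketch: the helper names `build_get` and `build_post`, the path `/login`, the host `example.com`, and the field values are all hypothetical.

```python
from urllib.parse import urlencode


def build_get(path, fields, host):
    """GET: the URL-encoded query string is appended to the request URL."""
    query = urlencode(fields)
    return (
        f"GET {path}?{query} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


def build_post(path, fields, host):
    """POST: the same query string travels in the request body instead."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Hypothetical form data; the space is URL-encoded as "+".
fields = {"username": "Peter Lee", "password": "secret"}
print(build_get("/login", fields, "example.com").decode("ascii"))
print(build_post("/login", fields, "example.com").decode("ascii"))
```

In both cases the data is only URL-encoded, not encrypted. Moving it to the body keeps it out of the address box but does not protect it on the wire.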
Suppose the user enters "Peter Lee" as the username and "" as the password, and clicks the submit button. You should never send your password without proper encryption.

Hostname: The DNS domain name of the server.
Port: The TCP port number on which the server listens for incoming requests from clients.
Path-and-file-name: The name and location of the requested resource, under the server document base directory.
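These URL parts can be extracted with Python's standard `urllib.parse` module. The sketch below is illustrative only: the `url_parts` helper, the `DEFAULT_PORTS` table, and the example URLs are assumptions, with port 80 used as the http default when the URL omits the port.

```python
from urllib.parse import urlsplit

# Default ports applied when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_parts(url):
    """Split a URL into protocol, hostname, port, and path-and-file-name."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
    }


print(url_parts("http://example.com/docs/index.html"))
print(url_parts("http://example.com:8000"))
```

The first URL omits the port, so the default for http is filled in. The second names port 8000 explicitly and has no path, so the root path "/" is used.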
URL rewriting for session management.

POST vs GET for Submitting Form Data

As mentioned in the previous section, a POST request has the following advantages over a GET request when sending the query string:

The amount of data that can be posted is unlimited, as the data is kept in the request body, which is often sent to the server in a separate data stream.
The query string is not shown in the address box of the browser.

Note, however, that the request body is not encrypted by HTTP itself. Hence, sending a password using a POST request is absolutely not secure.

When the user clicks the submit button, the browser sends the form data and the content of the selected file(s).
The original local file name can be supplied as a "filename" parameter in the "Content-Disposition: form-data" header. Read "Uploading Files in Servlet 3.

The CONNECT request method is often used to make a connection through a proxy. Extension methods (as well as error codes and headers) can be defined to extend the functionality of the HTTP protocol.