Listen on a server port (80 by default)
Accept GET/HEAD/POST requests
Map resource name (URL) to a local resource
Retrieve local resource and send it back to client
Web browsers initiate network communications with servers by sending them URLs. A URL can specify one of two things:
the address of a data file stored on the server that is to be sent to the client
a program stored on the server that the client wants executed, with the output of the program returned to the client
When a Web server begins execution, it informs the operating system under which it is running that it is now ready to accept incoming network connections through a specific port on the machine.
While in this running state, the server runs as a background process in the operating system environment.
A Web client, or browser,
opens a network connection to a Web server
sends information requests, and possibly data, to the server
receives information from the server, and
closes the connection
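The four client steps above can be sketched with a plain socket. This is a minimal illustration, not how a real browser is built; the function names build_request and fetch are hypothetical, and HTTP/1.0 is assumed so that the server closes the connection once the reply is complete.

```python
import socket


def build_request(host, path):
    """Build the text of a minimal HTTP/1.0 GET request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"


def fetch(host, port, path):
    """Open a connection, send a GET request, read the reply, close."""
    # 1. Open a network connection to the Web server
    with socket.create_connection((host, port), timeout=10) as sock:
        # 2. Send the information request
        sock.sendall(build_request(host, path).encode("ascii"))
        # 3. Receive information until the server closes its end
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    # 4. The connection is closed on leaving the "with" block
    return b"".join(chunks)
```

Calling fetch("localhost", 8000, "/") against a server running on port 8000 would return the raw status line, headers and body as bytes.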
Of course, other machines exist between the Web servers and clients
e.g. network routers and domain-name servers
The primary task of a Web server is to monitor a communication port on its host machine, accept HTTP commands, and perform the operations specified by the commands.
All HTTP commands include a URL, which includes
the specification of the host machine
a filename, or
a program name (e.g. .py, .asp)
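As a small illustration of these URL components, Python's standard urllib.parse module can split a URL into its host and path parts. The URL used here is the lecture schedule address that appears elsewhere in these notes.

```python
from urllib.parse import urlsplit

url = "http://online.mq.edu.au/pub/COMP249/lectureschedule.html"
parts = urlsplit(url)

print(parts.scheme)    # protocol used to reach the host
print(parts.hostname)  # specification of the host machine
print(parts.path)      # file (or program) name on that host
```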
A Web server typically has two root directories
document root - its file hierarchy stores the Web documents that are served to clients
Many servers allow secondary storage areas outside the document root directory, or even on another machine connected via a LAN; the server is configured to direct request URLs with a particular file path to such an area
server root - along with its descendant directories, stores the server and its support software
http://online.mq.edu.au/pub/COMP249/lectureschedule.html
Resource name: /pub/COMP249/lectureschedule.html
Mapped to a local file system:
/home/httpd/html/pub/COMP249/lectureschedule.html
C:\Web\httpd\html\pub\COMP249\lectureschedule.html
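The mapping from resource name to local file can be sketched as follows. This is one possible approach, assuming a Unix-style document root of /home/httpd/html as in the example; the function name resource_to_path is hypothetical. The path is normalised first so that ".." segments cannot climb out of the document root.

```python
import posixpath
from urllib.parse import unquote

DOCUMENT_ROOT = "/home/httpd/html"  # assumed document root


def resource_to_path(resource, root=DOCUMENT_ROOT):
    """Map a URL path onto a file path below the document root.

    Returns None for resource names that are not absolute paths.
    """
    # Decode %xx escapes, then collapse "." and ".." segments
    clean = posixpath.normpath(unquote(resource))
    if not clean.startswith("/"):
        return None
    # Join the cleaned path onto the document root
    return posixpath.join(root, clean.lstrip("/"))


print(resource_to_path("/pub/COMP249/lectureschedule.html"))
```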
http://online.mq.edu.au/pub/COMP249/
Resource name: /pub/COMP249/
Server must look for a default name in the given directory: index.html, index.htm, etc.
Settings are dependent on server configuration
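The default-name lookup might be sketched like this. The list of names and their order are assumptions here, since real servers take them from their configuration; find_index is a hypothetical helper name.

```python
import os

INDEX_NAMES = ["index.html", "index.htm"]  # assumed names, tried in order


def find_index(directory, names=INDEX_NAMES):
    """Return the first default file that exists in directory, or None."""
    for name in names:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

If no default file is found, a server typically either generates a directory listing or returns an error, depending on its configuration.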
http://www.ics.mq.edu.au/~cassidy/
Resource name: /~cassidy/
Refers to the personal directory of a user
Look in the user's home directory for a given subdirectory: html (in OCS), public_html (also common).
Permissions:
Server runs as an untrusted user
Needs to be able to read and perhaps execute files in your html directory.
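A rough sketch of the personal-directory mapping and the permission check follows. It assumes that home directories live under /home and that the subdirectory is public_html; both are configuration choices, and the function names are hypothetical.

```python
import os
import posixpath


def user_resource_to_path(resource, home_base="/home", subdir="public_html"):
    """Map /~user/rest onto <home_base>/user/<subdir>/rest."""
    if not resource.startswith("/~"):
        return None
    user, _, rest = resource[2:].partition("/")
    if user in ("", ".", ".."):
        return None
    # Collapse ".." segments so the path stays inside the user's area
    rest = posixpath.normpath("/" + rest).lstrip("/")
    return posixpath.join(home_base, user, subdir, rest)


def server_can_read(path):
    """True if the current process (the server's user) may read path."""
    return os.access(path, os.R_OK)


print(user_resource_to_path("/~cassidy/"))
```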
http://www.smh.com.au/articles/2005/03/13/1110649055094.html
http://slashdot.org/article.pl?sid=05/03/13/1853233&tid=133&tid=186&tid=159
Server is free to find a resource any way it chooses
This includes finding it in a database or running a program to generate it.
In the SMH case the stories are likely to be stored in a database and served as needed; other content is added on the fly.
The Slashdot URL refers to a Perl script which will be run to generate the content. The text after the ? is a set of form variables encoded for a GET request.
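To see how a script might recover the form variables from such a URL, here is a short sketch using Python's standard urllib.parse module (the Slashdot script itself is Perl; Python is used here only for illustration). Note that tid occurs three times, so each variable maps to a list of values.

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://slashdot.org/article.pl"
       "?sid=05/03/13/1853233&tid=133&tid=186&tid=159")

parts = urlsplit(url)
fields = parse_qs(parts.query)

print(parts.path)     # the script to run
print(fields["sid"])  # single-valued variable
print(fields["tid"])  # variable given three times
```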
http://ad.doubleclick.net/click;h=v2|30d0|0|0|%2a|l
;7516609;0-0;0;8856706;3454-728|90;4719404|4737300|1;
;%3fhttp://www.sun.com/emrkt/sunfirev20z/

http://ad.au.doubleclick.net/click%3Bh=v5|33ae|3|0|%2a
|h%3B27111491%3B0-0%3B0%3B12619400%3B1-468|60%3B14797496
|14815392|1%3B%3B%7Esscs%3D%3fhttp://www.energy.com.au/onit
Note that these are folded onto multiple lines for display purposes. Note the use of escape codes like %3B (;) to include characters in the URL that aren't otherwise allowed. Other special characters are written the same way, as % followed by the hexadecimal ASCII code of the character: e.g. %20 or + (space), %21 (!).
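The escaping scheme can be tried out with Python's standard urllib.parse functions. This is only an illustration of the encoding; the sample strings are made up, apart from the escaped fragment taken from the second URL above.

```python
from urllib.parse import quote, quote_plus, unquote

# Encoding: unsafe characters become % plus their hexadecimal code
print(quote("a b!"))       # space -> %20, ! -> %21
print(quote_plus("a b"))   # form-style encoding writes space as +
print(quote(";"))          # the %3B seen in the URLs above

# Decoding: hex digits may be upper or lower case
print(unquote("%7Esscs%3D%3f"))
```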
Some MIME types:
text/html, image/jpeg, audio/mpeg, application/xml, application/xhtml+xml, text/plain, application/cybercash, video/mp4, text/x-vcard, text/css, multipart/digest, chemical/x-genbank, video/quicktime, application/pdf
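Servers commonly choose the MIME type from the file extension. Python's standard mimetypes module does the same kind of lookup; the file names below are hypothetical examples.

```python
import mimetypes

for filename in ["lectureschedule.html", "photo.jpg", "notes.pdf"]:
    # guess_type returns a (type, encoding) pair
    mime_type, encoding = mimetypes.guess_type(filename)
    print(filename, mime_type)
```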
GET /~cassidy/ HTTP/1.1
Host: www.ics.mq.edu.au
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.7.12)
Gecko/20050922 Firefox/1.0.7 (Ubuntu package 1.0.7)
Accept: text/xml,application/xml,application/xhtml+xml,
text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: UserTrack=63B08C38-1234-0000-0000-00000000000000;
Try it at http://web-sniffer.net/ or install Live HTTP Headers, a Firefox add-on.
Note lines folded for display.
What does each of these headers mean? Which are required? Many are defined in the HTTP standard but others can be defined via the HTTP extension framework.
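As a starting point for reading one of these headers: the q values in the Accept header rank the media types the client prefers, with a missing q meaning 1.0. The sketch below parses the Accept value from the request above; parse_accept is a hypothetical helper and it ignores parameters other than q.

```python
def parse_accept(value):
    """Return (media_type, q) pairs, most preferred first."""
    prefs = []
    for item in value.split(","):
        fields = [f.strip() for f in item.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not wanted"
        prefs.append((media_type, q))
    # sorted() is stable, so equal weights keep their original order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


accept = ("text/xml,application/xml,application/xhtml+xml,"
          "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5")
for media_type, q in parse_accept(accept):
    print(media_type, q)
```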
HTTP/1.x 200 OK
Date: Mon, 20 Mar 2006 05:33:32 GMT
Server: Apache/2.0
Accept-Ranges: bytes
Content-Length: 4111
Keep-Alive: timeout=15, max=499
Connection: Keep-Alive
Content-Type: text/html
Content-Language: en
POST /~steve/form.html HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.7.12)
Gecko/20050922 Firefox/1.0.7 (Ubuntu package 1.0.7)
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,
text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost/~steve/form.html
Content-Type: application/x-www-form-urlencoded
Content-Length: 106

name=Steve+Cassidy&interests=This+is+a+field+with%0D%0Aquite+a+bit+
of+text%0D%0Athat+has+linebreaks.%0D%0A
Note lines folded for display.
This is a POST request. Note how the data is encoded in the request body, which follows the headers.
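The body encoding can be reproduced with urlencode from Python's standard library. The field values below are reconstructed from the encoded body shown above; the result has the same length as the Content-Length header in the request.

```python
from urllib.parse import urlencode, parse_qs

fields = {
    "name": "Steve Cassidy",
    "interests": ("This is a field with\r\n"
                  "quite a bit of text\r\n"
                  "that has linebreaks.\r\n"),
}

# Spaces become +, CR and LF become %0D and %0A
body = urlencode(fields)
print(body)
print(len(body))  # the value to send as Content-Length

# The server side reverses the encoding
print(parse_qs(body)["name"])
```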
GET /~steve/form.html?name=Steve+Cassidy&interests=This+is+a+field+
with%0D%0Aquite+a+bit+of+text%0D%0Athat+has+linebreaks.%0D%0A HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.7.12)
Gecko/20050922 Firefox/1.0.7 (Ubuntu package 1.0.7)
Accept: text/xml,application/xml,application/xhtml+xml,
text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost/~steve/form.html
If-Modified-Since: Mon, 20 Mar 2006 06:22:29 GMT
If-None-Match: "4f42a9-fd-40f672edb1340"
Note lines folded for display.
This is the same form submitted via a GET request; here the data is encoded in the request URL. Note also the If-Modified-Since header in this request, sent because my browser has just asked for the same resource.
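On the server side, If-Modified-Since lets the server answer 304 Not Modified, with no body, when the resource has not changed since the date given. A minimal sketch of that decision, assuming the server knows the resource's last modification time as a timezone-aware datetime (status_for is a hypothetical name):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for(last_modified, if_modified_since):
    """Return 304 if the resource is unchanged since the header date."""
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


modified = datetime(2006, 3, 20, 6, 22, 29, tzinfo=timezone.utc)
print(status_for(modified, "Mon, 20 Mar 2006 06:22:29 GMT"))
```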
GET /~steve/ HTTP/1.1
Host: www.shlrc.mq.edu.au

HTTP/1.x 301 Moved Permanently
Date: Mon, 20 Mar 2006 06:32:36 GMT
Server: Apache/2.0.46 (Red Hat)
Location: http://www.ics.mq.edu.au/~cassidy/
Content-Length: 242
Connection: close
Content-Type: text/html; charset=iso-8859-1
Alternatively
<meta http-equiv="refresh"
content="0; URL=http://my.new.site.com/">
The HTTP redirect is a server response that can be used to indicate that a resource has moved to a new location. An alternative is to include the above meta tag in the head of a page to force a redirect from the current page; the number before the URL is the delay in seconds.
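For illustration, a server could assemble a redirect response like the one shown above as follows. This is a sketch only: redirect_response is a hypothetical helper, and the short HTML body is an assumption, included for clients that do not follow the Location header automatically.

```python
def redirect_response(location):
    """Build the bytes of a 301 response pointing at location."""
    body = ('<html><body><p>Moved to <a href="' + location + '">'
            + location + '</a></p></body></html>\n')
    head = ("HTTP/1.1 301 Moved Permanently\r\n"
            "Location: " + location + "\r\n"
            "Content-Type: text/html; charset=iso-8859-1\r\n"
            "Content-Length: " + str(len(body)) + "\r\n"
            "Connection: close\r\n"
            "\r\n")
    return (head + body).encode("iso-8859-1")


print(redirect_response("http://www.ics.mq.edu.au/~cassidy/").decode("iso-8859-1"))
```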
import socket

HOST = ''       # Symbolic name meaning all available interfaces
PORT = 50004    # Arbitrary non-privileged port

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)

# Wait for one connection and read the request
conn, addr = s.accept()
data = conn.recv(4096).decode('iso-8859-1')
words = data.split()

if len(words) > 0 and words[0] == "GET":
    # Echo the request back inside a small HTML page
    page = """<html>
<head><title>Hello</title></head>
<body><p>Your request was:</p>
<pre>""" + data + """</pre>
</body>
</html>
"""
    # Header lines end with CRLF; a blank line ends the header
    header = ("HTTP/1.0 200 OK\r\n"
              "Content-Length: " + str(len(page)) + "\r\n"
              "Content-Type: text/html\r\n"
              "\r\n")
else:
    header = "HTTP/1.0 404 Not Found\r\n\r\n"
    page = ""

print(header + page)
conn.sendall((header + page).encode('iso-8859-1'))
conn.close()
s.close()
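# CGIHTTPRequestHandler is deprecated since Python 3.13
from http.server import HTTPServer, CGIHTTPRequestHandler

# Listen on all interfaces, port 8000
server_address = ('', 8000)

# Requests under /cgi-bin run the named script and return its output;
# other paths are served as files from the current directory
handler = CGIHTTPRequestHandler
handler.cgi_directories = ['/cgi-bin']
httpd = HTTPServer(server_address, handler)

print("Starting server. Connect to http://localhost:8000/")

httpd.serve_forever()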