Skip to content

Lincong/simle-HTTP-server

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

87 Commits
 
 
 
 
 
 

Repository files navigation

This HTTPs server is created for the purpose of practicing coding. 
There are 3 checkpoints.
Checkpoint 1 is to create a server that echos data stream from the client
Checkpoint 2 is to modify the echo server to be a HTTP server
Checkpoint 3 is to add SSL and CGI to the HTTP server

To run checkpoint 1:
cd src
./make_run

// in another terminal tab
python cp1_checker.py 127.0.0.1 9999 5 50 100000 50

********************************************************************
Implementation Notes
********************************************************************

********************************************************************
CGI ****************************************************************
********************************************************************

2 parts
  1. meta-variables (describe a client's request)  
  2. programmer interface (between the script and the HTTP server)


script-URI = <scheme> "://" <server-name> ":" <server-port>
             <script-path> <extra-path> "?" <query-string>

script: 需要两个东西,一个是meta-variables, 一个是request data。request data可能不是以来就有的,需要HTTP server慢慢接受,但是meta-variables是用env variable直接设的,所以一来就有。

In the event of an error condition, the server can interrupt or
terminate script execution at any time and without warning.  That
could occur, for example, in the event of a transport failure between
the server and the client; so the script SHOULD be prepared to handle
abnormal termination.

CGI request:
Need to pipe stdin, pipe stdout, fork(), setup environment variables per the CGI specification, and 
execve() the executable (it should be executable)

If the CGI program fails in any way, return a 500 response to the client, otherwise
Send all bytes from the stdout of the spawned process to the requesting client.
The CGI application will produce headers and message body as it sees fit, you do not need to modify or 
inspect these bytes at all; you are a proxy...things have come full circle!

Meta-variables are identified by case-insensitive names.
Meta-variable values MUST be considered case-sensitive except as noted otherwise.

      meta-variable-name = "AUTH_TYPE" | "CONTENT_LENGTH" |
                           "CONTENT_TYPE" | "GATEWAY_INTERFACE" |
                           "PATH_INFO" | "PATH_TRANSLATED" |
                           "QUERY_STRING" | "REMOTE_ADDR" |
                           "REMOTE_HOST" | "REMOTE_IDENT" |
                           "REMOTE_USER" | "REQUEST_METHOD" |
                           "SCRIPT_NAME" | "SERVER_NAME" |
                           "SERVER_PORT" | "SERVER_PROTOCOL" |
                           "SERVER_SOFTWARE" | scheme |
                           protocol-var-name | extension-var-name
      protocol-var-name  = ( protocol | scheme ) "_" var-name
      scheme             = alpha *( alpha | digit | "+" | "-" | "." )
      var-name           = token
      extension-var-name = token

CONTENT_LENGTH: server MUST set this meta-variable if and only if the request is accompanied by a message-body entity.
CONTENT_TYPE: server MUST set this meta-variable if an HTTP Content-Type field is present in the client request header.
GATEWAY_INTERFACE: MUST be set to the dialect of CGI being used by the server to communicate with the script.
PATH_INFO: a path to be interpreted by the CGI script.  It identifies the resource or sub-resource to be returned by
           the CGI script, and is derived from the portion of the URI path hierarchy following the part that identifies 
           the script itself.

QUERY_STRING: it provides information to the CGI script to affect or refine the document to be returned by the script.
REMOTE_ADDR: The REMOTE_ADDR variable MUST be set to the network address of the client sending the request to the server.

REQUEST_METHOD: MUST be set to the method which should be used by the script to process the request.
      REQUEST_METHOD   = method
      method           = "GET" | "POST" | "HEAD" | extension-method
      extension-method = "PUT" | "DELETE" | token

SCRIPT_NAME: MUST be set to a URI path (not URL-encoded) which could identify the CGI script.

********************************************************************
HTTP ***************************************************************
********************************************************************

HTTP methods:
  1. GET
  2. HEAD
  3. POST

HTTP request format:
  1. Request line (3 fields: method field, URL field, HTTP version field)
  2. Header lines (Host: *, Connection: *, User-agent: *, Accept-language: *)
  3. Blank line
  4. Entity body


1. If request header size > 8192 bytes
    For now, send error message and disconnect.


In HTTP handlers:
  receive request --> generate response headers --> send headers back --> stream response entity body (file content)

**********************************
Structure:
**********************************

http_handle (function): (2 parts)
  // process input part
  1. read all data from the recv buffer and process it 
  2. generate Request
  3. Use Request object to generate Response object
  4. Put the response object in the back_log (queue)

  // generate output part
  5. check the state of the current "http_task"
  6. do corresponding actions
  7. if the current "http_task" is finished. Get another one from the queue and repeat from step 5 until the send buffer is filled up
  8. return

http_agent (object): // it is basically just the queue and each "peer_t" contains a pointer to one http_agent
  1. requests_backlog (a queue of <http_task>)
  2. parse_header_buffer // because we can only call "parse()" until '\r\r' are encountered in the data

http_task (object):
  1. current_state
  2. response_header_buffer (response header data as a whole string)
  3. some data structures holding file descriptors of files the content of which are going to be sent the client

********************************************************************
GET
********************************************************************

request:

response: 
  1. status line (200 OK)
  2. headers (k-v pair)
    1. Server : ..
    2. Date : ..
    3. Content-Length : ..
    4. Content-Type : ..
    5. Last-modified : ..
    6. Connection : Close/Keep-alive
    7. \r\n\r\n (clrf)
  3. body (usually file content)

********************************************************************
HEAD
********************************************************************

response: 
  1. status line (200 OK)
  2. headers (k-v pair)
    1. Server : ..
    2. Date : ..
    3. Content-Length : ..
    4. Content-Type : ..
    5. Last-modified : ..
    6. Connection : Close/Keep-alive
    7. \r\n\r\n (clrf)

********************************************************************
POST
********************************************************************

request: (related to CGI)

response: 
  1. status line (200 OK)
  2. \r\n\r\n (clrf)



handle_clients() -> handle_http_request() -> handle_dynamic_request()