Erlang Central

A fast web server demonstrating some undocumented Erlang features

Revision as of 11:50, 29 June 2006 by Theboss



This HOWTO describes a web server written for the day when even Yaws is not quick enough.

The web server presented is quite simple. Even so it is split into 5 modules. Some of these are dictated by the OTP framework, and others are split out for convenience. The 5 modules are:

  • iserve - API for managing URIs and callbacks
  • iserve_app - OTP Application behaviour
  • iserve_sup - OTP Supervisor
  • iserve_server - Gen_server to own the listening socket and create connections
  • iserve_socket - Process to handle a single HTTP connection for its lifetime

This HOWTO presents code and descriptions for each of these as they arise.

TCP Server Framework

A web server needs to support lots of connections, so at its heart it needs to be a multiple-connection TCP/IP server. There are any number of ways to arrange a set of Erlang processes into such a thing. My favourite method is to have a single gen_server which opens and owns the listen socket (the listening process). This spawns another process which waits in accept until a connection attempt is received. At that point the accepting process sends a message back to the listening process and goes on to handle the traffic itself. This avoids the need for gen_tcp:controlling_process/2 and its associated complexity.

On receipt of the message from the accepting process the listening process spawns a new accepting process and so on.

The listening process also traps exits, and if it receives a non-normal exit from the current accepting process it creates a new one. In this way the listening process supervises its acceptor.
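The pattern described above can be condensed into a short sketch (function names here are illustrative; the real modules follow below):

```erlang
%% Minimal sketch of the listen/accept handoff pattern. The listener
%% owns the listen socket; each acceptor calls gen_tcp:accept/1 itself,
%% so gen_tcp:controlling_process/2 is never needed.
listener(Port) ->
    process_flag(trap_exit, true),
    {ok, LSock} = gen_tcp:listen(Port, [binary, {active, false}]),
    Self = self(),
    spawn_link(fun() -> acceptor(Self, LSock) end),   %% first acceptor
    listener_loop(LSock).

listener_loop(LSock) ->
    Self = self(),
    receive
        accepted ->                        %% current acceptor took a connection
            spawn_link(fun() -> acceptor(Self, LSock) end),
            listener_loop(LSock);
        {'EXIT', _Pid, normal} ->          %% acceptor finished cleanly
            listener_loop(LSock);
        {'EXIT', _Pid, _Abnormal} ->       %% acceptor crashed; replace it
            spawn_link(fun() -> acceptor(Self, LSock) end),
            listener_loop(LSock)
    end.

acceptor(Listener, LSock) ->
    {ok, Sock} = gen_tcp:accept(LSock),
    Listener ! accepted,                   %% ask the listener for a new acceptor
    handle_connection(Sock).               %% go on to serve the traffic
```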

Common header file

The web server creates a #req{} record as it processes each request. This is used as part of the API into implementation callbacks and by the iserve_socket process. Here are the contents of iserve.hrl up front to get it out of the way:

Code listing 3.1

% This record characterises the connection from the browser to our server
% it is intended to be a consistent view derived from a bunch of different headers
% This record characterises the connection from the browser to our server.
% It is intended to be a consistent view derived from a bunch of different headers.
-record(req, {connection = keep_alive,    % keep_alive | close
              content_length,             % Integer
              vsn,                        % {Maj, Min}
              method,                     % 'GET' | 'POST'
              uri,                        % Truncated URI /index.html
              args = "",                  % Part of URI after ?
              headers,                    % [{Tag, Val}]
              body = <<>>}).              % Content body
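For illustration, an implementation callback on the far side of the API might pattern-match on this record like so (the module name iserve_example and its handle/1 function are hypothetical, not part of iserve):

```erlang
%% Hypothetical implementation callback dispatching on the #req{} record.
-module(iserve_example).
-include("iserve.hrl").
-export([handle/1]).

handle(#req{method = 'GET', uri = "/index.html"}) ->
    {ok, <<"<html><body>Hello</body></html>">>};
handle(#req{method = 'POST', uri = Uri, body = Body}) ->
    io:format("POST to ~s with ~p bytes~n", [Uri, byte_size(Body)]),
    {ok, <<"stored">>};
handle(#req{}) ->
    not_found.
```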

Listening Process

Here is the code for the listening process. It is a very basic gen_server:

Code listing 4.1



-module(iserve_server).
-behaviour(gen_server).

-export([start_link/1, create/2]).

%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2,
         code_change/3]).

-record(state, {listen_socket,
                port,
                acceptor}).

start_link(Port) when is_integer(Port) ->
    Name = list_to_atom(lists:flatten(io_lib:format("iserve_~w",[Port]))),
    gen_server:start_link({local, Name}, ?MODULE, Port, []).

%% Send a message to cause a new acceptor to be created
create(ServerPid, Pid) ->
    gen_server:cast(ServerPid, {create, Pid}).

%% Called by the gen_server framework at process startup. Create listening socket
init(Port) ->
    process_flag(trap_exit, true),
    case gen_tcp:listen(Port, [binary, {packet, http},
                               {active, false},
                               {backlog, 30}]) of
        {ok, Listen_socket} ->
            %% Create first accepting process
            Pid = iserve_socket:start_link(self(), Listen_socket, Port),
            {ok, #state{listen_socket = Listen_socket,
                        port = Port,
                        acceptor = Pid}};
        {error, Reason} ->
            {stop, Reason}
    end.

handle_call(_Request, _From, State) ->
    Reply = ok,
    {reply, Reply, State}.

%% Called by the gen_server framework when the cast message from create/2 is received
handle_cast({create, _Pid}, #state{listen_socket = Listen_socket} = State) ->
    New_pid = iserve_socket:start_link(self(), Listen_socket, State#state.port),
    {noreply, State#state{acceptor = New_pid}};

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({'EXIT', Pid, normal}, #state{acceptor=Pid} = State) ->
    {noreply, State};

%% The current acceptor has died; wait a little and try again
handle_info({'EXIT', Pid, _Abnormal}, #state{acceptor = Pid} = State) ->
    timer:sleep(2000),
    New_pid = iserve_socket:start_link(self(), State#state.listen_socket,
                                       State#state.port),
    {noreply, State#state{acceptor = New_pid}};

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, State) ->
    gen_tcp:close(State#state.listen_socket),
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

The notable thing about this code is the use of undocumented socket options to set up the initial state of connections made to the web server port.

  • {backlog, 30} specifies the length of the OS accept queue.
  • {packet, http} puts the socket into http mode. This makes the socket wait for an HTTP Request line; once one is received it immediately switches to receiving HTTP header lines. The socket stays in header mode until the end-of-header marker (CR,NL,CR,NL) is received, at which point it goes back to waiting for the next HTTP Request line.
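With a passive socket ({active, false}) in http mode, gen_tcp:recv/2 returns pre-parsed tuples rather than raw bytes. The following sketch (function names are illustrative) shows the shapes a well-formed request produces:

```erlang
%% Sketch: what a socket in {packet, http} mode hands back, assuming a
%% passive socket and a well-formed request.
read_request(Sock) ->
    {ok, {http_request, Method, Uri, Vsn}} = gen_tcp:recv(Sock, 0),
    %% e.g. Method = 'GET', Uri = {abs_path, "/index.html"}, Vsn = {1, 1}
    Headers = read_headers(Sock, []),
    {Method, Uri, Vsn, Headers}.

read_headers(Sock, Acc) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, {http_header, _, Name, _, Value}} ->
            read_headers(Sock, [{Name, Value} | Acc]);
        {ok, http_eoh} ->                  %% CR,NL,CR,NL seen: headers done
            lists:reverse(Acc)
    end.
```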

Acceptor/Socket process

It would be easy enough to create an abstraction of the Listen/Accept process structure and pass in the implementation function as another parameter. For this HOWTO however I'll stick with the most basic model - the acceptor process starts life as an acceptor and goes on to handle the traffic.

The acceptor process is implemented in a separate module, iserve_socket. It is in two parts - the first part sets up a bunch of defines and exports and then does the accepting. Here it is:

Code listing 5.1




-module(iserve_socket).

-export([start_link/3]).
-export([init/1]).

-include("iserve.hrl").

-define(not_implemented_501, "HTTP/1.1 501 Not Implemented\r\n\r\n").
-define(forbidden_403, "HTTP/1.1 403 Forbidden\r\n\r\n").
-define(not_found_404, "HTTP/1.1 404 Not Found\r\n\r\n").

-record(c,  {sock,
             port,
             peer_addr,
             peer_port}).

-define(server_idle_timeout, 30*1000).

start_link(ListenPid, ListenSocket, ListenPort) ->
    proc_lib:spawn_link(?MODULE, init, [{ListenPid, ListenSocket, ListenPort}]).

init({Listen_pid, Listen_socket, ListenPort}) ->
    case catch gen_tcp:accept(Listen_socket) of
        {ok, Socket} ->
            %% Send the cast message to the listener process to create a new acceptor
            iserve_server:create(Listen_pid, self()),
            {ok, {Addr, Port}} = inet:peername(Socket),
            C = #c{sock = Socket,
                   port = ListenPort,
                   peer_addr = Addr,
                   peer_port = Port},
            request(C, #req{});            %% Jump to state 'request'
        Else ->
            error_logger:error_report([{application, iserve},
                                       "Accept failed error",
                                       lists:flatten(io_lib:format("~p", [Else]))]),
            exit({error, accept_failed})
    end.

Note here that the process is started via the proc_lib:spawn_link/3 call. This wraps the plain spawn_link/3 BIF so that the same nice error reports are produced as for gen_servers, while allowing a totally unstructured process implementation.
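The shape of such a proc_lib-started process, reduced to its essentials, looks like this (module and message names are illustrative):

```erlang
%% Sketch: an unstructured process started through proc_lib, so that a
%% crash in loop/0 still produces a proper SASL crash report.
-module(plib_demo).
-export([start_link/0, init/1]).

start_link() ->
    proc_lib:spawn_link(?MODULE, init, [self()]).

init(Parent) ->
    Parent ! {ready, self()},
    loop().

loop() ->
    receive
        stop -> exit(normal);
        Msg  -> io:format("got ~p~n", [Msg]), loop()
    end.
```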

Web server state machine

The rest of this module contains the web server code. It is structured as a state machine which follows the state changes of the http socket mode. A single function models each state, and state transitions are simply implemented as a call to the function which owns the next state.

The states are:

  • request - wait for an HTTP Request line. Transition to state headers when one is received.
  • headers - collect HTTP headers. After the end-of-header marker, transition to state body.
  • body - collect the body of the HTTP request if there is one, then look up and call the implementation callback. Depending on whether the request is persistent, transition back to state request to await the next request, or exit.
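The three states sketch out as mutually calling functions; each function is a state and a tail call is a state transition (simplified; the real functions also thread the #c{} connection record):

```erlang
%% Skeleton of the state machine. The bodies are elided; only the
%% transitions are shown.
request(C, Req) ->
    %% ... receive the HTTP Request line, fill in method/uri/vsn ...
    headers(C, Req, []).

headers(C, Req, Acc) ->
    %% ... collect {http_header, ...} tuples into Acc until http_eoh ...
    body(C, Req#req{headers = lists:reverse(Acc)}).

body(C, Req) ->
    %% ... read Content-Length bytes, call the implementation callback ...
    case Req#req.connection of
        keep_alive -> request(C, #req{});  %% persistent: await next request
        close      -> ok                   %% transient: let the process exit
    end.
```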

The code for the state request is below. A blocking call is made to gen_tcp:recv/3 with a timeout. The http driver waits for a CRNL terminated line of the form GET / HTTP/1.0. If anything else is received an http_error indication is returned with the erroneous data.

Some broken clients include extra CR or CRNL sequences so these are skipped.
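Putting those rules together, the request state can be sketched as follows (a reconstruction consistent with the description above, since the original listing is cut off here):

```erlang
%% Sketch of state 'request': block in recv with a timeout, accept a
%% request line, and skip stray CR / CRNL sequences from broken clients.
request(C, Req) ->
    case gen_tcp:recv(C#c.sock, 0, ?server_idle_timeout) of
        {ok, {http_request, Method, Path, Version}} ->
            headers(C, Req#req{vsn = Version,
                               method = Method,
                               uri = Path}, []);
        {error, {http_error, "\r\n"}} ->   %% broken client: skip blank line
            request(C, Req);
        {error, {http_error, "\n"}} ->
            request(C, Req);
        _Other ->                          %% timeout or closed socket
            exit(normal)
    end.
```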