Erlang Central

Tracing with Onviso

Revision as of 13:17, 1 September 2009 by Olafura (Talk | contribs)




Tracing and debugging large systems, and even smaller ones, is often tricky and challenging. This howto introduces a wrapper called Onviso, which can be used to trace systems distributed across several nodes and to analyze the collected traces.

Note: Onviso is still very much a work in progress, i.e. things may change!


The Onviso application wraps the OTP tool Inviso in an easy-to-use API. It is designed to offer adequate defaults while retaining the advantages that Inviso offers when tracing in large system environments.

Using only two functions it is possible to set up tracing across multiple nodes and merge the resulting traces in any way you like. Additionally, for convenience when using Onviso as an ad-hoc tracing tool, it is also possible to retrieve the status of recently run traces and the configuration that was used.

For merging the traces a number of defaults are provided, the simplest of which writes every trace to a file in the order in which the traces were generated. The merge functionality can also be used for property checking or for profiling the system that was traced. What you can do is limited only by what you can express in the three funs that are passed to the merge function, as you will see soon.
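To give a feeling for the three-fun idea, here is a minimal sketch that counts calls per function. It is not Onviso code: the driver run/4, the fun arities and the {ok, State} return convention are all hypothetical stand-ins, since the real merge signature is not shown in this howto. The trace messages use the standard Erlang call trace format {trace, Pid, call, {M, F, Args}}.

```erlang
-module(merge_sketch).
-export([run/4, count_calls/1]).

%% Hypothetical driver (NOT the Onviso API): initialise a state, feed every
%% trace message to the work fun, then hand the final state to the end fun.
run(Traces, BeginFun, WorkFun, EndFun) ->
    {ok, State0} = BeginFun(),
    State = lists:foldl(fun(Trace, Acc) ->
                                {ok, NewAcc} = WorkFun(Trace, Acc),
                                NewAcc
                        end, State0, Traces),
    EndFun(State).

%% Count calls per {Module, Function, Arity} in a list of trace messages.
count_calls(Traces) ->
    BeginFun = fun() -> {ok, orddict:new()} end,
    WorkFun  = fun({trace, _Pid, call, {M, F, Args}}, Counts) ->
                       Key = {M, F, length(Args)},
                       {ok, orddict:update_counter(Key, 1, Counts)};
                  (_Other, Counts) ->
                       %% Ignore everything that is not a call trace.
                       {ok, Counts}
               end,
    EndFun   = fun(Counts) -> {ok, orddict:to_list(Counts)} end,
    run(Traces, BeginFun, WorkFun, EndFun).
```

The point of the split is that the work fun only ever sees one trace message and the accumulated state, so the same driver can be reused for logging, property checking or profiling by swapping the funs.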


Onviso merely wraps the functionality of Inviso. Inviso is part of Erlang/OTP, at the time of writing in version 0.6, and is still maturing. Onviso in itself doesn't add any new functionality to Inviso (apart from some information on the traces set). In essence, Onviso is an attempt to make Inviso easier to use.

Some useful resources:

Comparison to dbg

Onviso's strength lies in the advanced possibilities for merging the collected traces, as well as in making it easy to set up tracing across multiple nodes. Onviso can be used in a similar fashion to dbg; however, it is not intended to be a replacement.

Setting up a trace

To give some meaning to this guide we'll provide a system that we can trace on. In the following examples we'll play with two nodes: 'server@linux' and 'client@linux'. Our test node will be called 'inviso@linux'.

On 'server@linux' the following application will be running:


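-module(server).
-export([start/0, stop/0, loop/1]).

start() ->
    Pid = spawn(?MODULE,loop,[[]]),
    register(server, Pid).   % the client addresses the process by this name

stop() ->
    server ! stop.

loop(Data) ->
    receive
        {put,From,Ting} -> From ! ok,
                           loop(Data ++ [Ting]);   % assumed: store the new item
        {get,From}      -> From ! Data,
                           loop(Data);
        stop            -> stopped;
        clear           -> loop([])
    end.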

And our client node will have the following application running:


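-module(client).
-export([init/0, init/1, get/0, put/1]).

init() ->
    init(server_node()).   % assumed: the original body is missing

init(Node) ->
    erlang:set_cookie(node(), inviso),
    net_adm:ping(Node).    % assumed: connect to the server node

server_node() ->
    {ok,HostName} = inet:gethostname(),
    list_to_atom("server@" ++ HostName).

get() ->
    erlang:send({server,server_node()}, {get,self()}),
    receive Data -> Data
    after 1000   -> no_reply
    end.

put(Ting) ->
    erlang:send({server,server_node()}, {put,self(),Ting}),
    receive ok -> ok
    after 1000 -> no_reply
    end.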

If you have used dbg before, this will look very familiar. To set up a trace in Onviso you must specify what to trace; this is done by specifying modules, functions and possibly the arguments passed to the functions. Onviso uses the following format to specify a so-called pattern:

{module, function, arguments, matchspecification}

Say, for example, that we want to trace the 'put' function in the client module. The pattern would then look like:

{client, put, '_', []}

This pattern would match any calls made to the client:put/1 function. Furthermore, if we would like to match only when the function is passed the atom none as an argument then we can write the following:

{client, put, [none], []}

Wildcards can also be used to match all functions of a module:

{client, '_', '_', []}

Note however that

{client, '_', [none], []}

is illegal.
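The rules above can be summarised in a small check. This is only a sketch under our own assumptions: valid/1 is a hypothetical helper, not something Onviso exports, and Onviso's own validation may well accept or reject other shapes.

```erlang
-module(pattern_sketch).
-export([valid/1]).

%% Hypothetical helper mirroring the pattern rules described in this howto.
%% Note that '_' here is the literal atom used as a wildcard in patterns.

%% All functions of a module, any arguments.
valid({M, '_', '_', _MatchSpec}) when is_atom(M) ->
    true;
%% A wildcard function name cannot be combined with specific arguments.
valid({M, '_', _Args, _MatchSpec}) when is_atom(M) ->
    false;
%% A named function with any arguments.
valid({M, F, '_', _MatchSpec}) when is_atom(M), is_atom(F) ->
    true;
%% A named function with a specific argument list.
valid({M, F, Args, _MatchSpec}) when is_atom(M), is_atom(F), is_list(Args) ->
    true;
valid(_Other) ->
    false.
```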

Onviso exports one function to set up a trace, not surprisingly called trace.

trace(Patterns, Flags) % only traces on the local node
trace(Patterns, Nodes, Flags) % no overload protection is used
trace(Patterns, Nodes, Flags, OverloadProtection)

I won't go into details about match specifications here; they are described in the "Match Specifications in Erlang" chapter of the ERTS User's Guide. However, Onviso has a nice trick up its sleeve (honestly inspired by Mats Cronqvist's work on redbug), namely match specification shortcuts. Instead of typing:

{client, put, '_', [{'_',[],[{return_trace}]}]}

you simply type:

{client, put, '_', return} 

and you get the same result: a match specification which will give you the return information from the calls that are matched by the pattern. One can also use caller to get the calling function.
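Conceptually, a shortcut is just a name for a ready-made match specification. The sketch below illustrates this with a hypothetical expand/1 function (not part of Onviso). The expansion of return is the one shown above; the expansion of caller is our assumption, based on the standard {message, {caller}} match specification action.

```erlang
-module(shortcut_sketch).
-export([expand/1]).

%% Hypothetical expansion of shortcuts into full match specifications.

%% Taken from the example in this howto.
expand(return) ->
    [{'_', [], [{return_trace}]}];
%% Assumption: the standard way to include the calling function in the
%% trace message; Onviso may expand this shortcut differently.
expand(caller) ->
    [{'_', [], [{message, {caller}}]}];
%% A full match specification is passed through untouched.
expand(MatchSpec) when is_list(MatchSpec) ->
    MatchSpec.
```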

>>> add info on flags

Tracing example

So, let's set up a trace on the loop function of the server module and a trace on the client functions that access the server node. Remember that we have to specify patterns, nodes and flags.

1> onviso:trace([{server, loop, '_', []}, 
                 {client, put, '_', return},
                 {client, get, '_', return}], 
                ['server@linux', 'client@linux'],
                {all, [call]}).

{ok, 1}

If Onviso can connect to the nodes, it returns ok together with a reference number; we will see how this number is used shortly. Make sure that you are using the same magic cookie on all the nodes you want to connect to.

erlang:set_cookie(node(), inviso).

Note that the trace is activated once the function is executed. This means it will now start collecting traces on the respective nodes, eventually to be written to file.
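For comparison, roughly the same trace could be set up with plain dbg. This is a sketch rather than an exact equivalent: it assumes both nodes are reachable with the same cookie, the module name dbg_compare is made up, and dbg prints the traces on the tracing node instead of collecting them in files for a later merge.

```erlang
-module(dbg_compare).
-export([start/0]).

%% Set up call tracing on both example nodes using dbg only.
start() ->
    {ok, _} = dbg:tracer(),                  % start the tracer on this node
    {ok, _} = dbg:n('server@linux'),         % add the remote nodes
    {ok, _} = dbg:n('client@linux'),
    {ok, _} = dbg:p(all, [call]),            % same flags as {all, [call]}
    Return = [{'_', [], [{return_trace}]}],  % what the return shortcut expands to
    {ok, _} = dbg:tpl({server, loop, '_'}, []),
    {ok, _} = dbg:tpl({client, put, '_'}, Return),
    {ok, _} = dbg:tpl({client, get, '_'}, Return),
    ok.
```

Use dbg:stop() (or dbg:stop_clear() on older releases) to switch the trace off again.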

Collecting the results

Taking care of what was traced, i.e. merge/X

Protecting live systems

The architecture of Inviso already has some protection for live systems built in, but there's more to it: you can also specify your own functions to monitor the load of a node. This section gives an introduction to the architecture used by Inviso, and we'll also set up a trace using the so-called overload protection.


Background to relaying (architecture), ways of relaying (collector/relayer, file)

Overload protection

How do we specify protection against system overload
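As a starting point, a load-monitoring function could look something like the sketch below. The module name, the threshold and the return convention (ok to continue, {suspend, Reason} to pause tracing) are assumptions made for illustration; check the Inviso documentation for what the overload option actually expects.

```erlang
-module(overload_sketch).
-export([check/0, check/2]).

%% Arbitrary illustrative threshold, not a recommended value.
-define(MAX_RUN_QUEUE, 50).

%% Check the current load of the local node, using the length of the
%% scheduler run queue as a simple load indicator.
check() ->
    check(erlang:statistics(run_queue), ?MAX_RUN_QUEUE).

%% Assumed convention: ok means "keep tracing", {suspend, Reason} means
%% "the node is overloaded, pause tracing".
check(RunQueue, Max) when RunQueue > Max ->
    {suspend, {run_queue, RunQueue}};
check(_RunQueue, _Max) ->
    ok.
```

Keeping the decision in check/2 separate from the measurement in check/0 makes it easy to try out thresholds without loading a real node.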

More merging

As mentioned in the introduction there are endless ways one can merge and analyze the traces collected. To use a well known cliché: only your imagination sets the limit.

In this section we'll look at two examples to illustrate some of the possibilities. Please feel free to add more examples to this section.


Function calls

Function time execution
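For the execution-time idea, a minimal sketch could pair each call with its matching return and compute the difference between the timestamps. It assumes the traces were collected with the timestamp flag and a return_trace match specification, so that the messages have the standard {trace_ts, ...} shapes; the module name timing_sketch is made up, and how the messages reach this function in Onviso is left open.

```erlang
-module(timing_sketch).
-export([durations/1]).

%% Pair each call with the next return_from of the same function in the
%% same process and report the elapsed time in microseconds.
durations(Traces) ->
    durations(Traces, [], []).

%% A call: remember its start time. The newest call is kept first, so
%% recursive calls are paired innermost first.
durations([{trace_ts, Pid, call, {M, F, Args}, Ts} | Rest], Open, Acc) ->
    durations(Rest, [{{Pid, M, F, length(Args)}, Ts} | Open], Acc);
%% A return: look up the matching open call, if there is one.
durations([{trace_ts, Pid, return_from, {M, F, A}, _Ret, Ts} | Rest], Open, Acc) ->
    Key = {Pid, M, F, A},
    case lists:keytake(Key, 1, Open) of
        {value, {Key, Start}, Open1} ->
            Elapsed = timer:now_diff(Ts, Start),
            durations(Rest, Open1, [{{M, F, A}, Elapsed} | Acc]);
        false ->
            durations(Rest, Open, Acc)
    end;
%% Anything else is ignored.
durations([_Other | Rest], Open, Acc) ->
    durations(Rest, Open, Acc);
durations([], _Open, Acc) ->
    lists:reverse(Acc).
```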


Checking states

Visualizing flow