Starting to play with mnesia

If you haven’t already tried it out, mnesia is an Erlang application that implements a database, albeit a rather peculiar and particularly powerful one.

I have developed mnesia-based systems in industrial settings, competing with solutions based on industry-standard databases installed in industry-standard cluster server environments. The mnesia-based systems outdid those solutions in usability, reliability and ease of administration, not to mention that mnesia was practically free, with no huge license fees to pay to the IT giants.

That said, don’t jump to conclusions. mnesia is not necessarily suitable for every application that needs a DBMS. Analyze your problem and evaluate the benefits and drawbacks before committing. Given the extreme ease of implementation, you can easily build a proof-of-concept solution and evaluate it against the more traditional solutions.

In this post we’ll explore what mnesia does and how to go about using it. A cautionary note: DBMSs need to be installed and set up before you can use them in your applications, and mnesia is no different, except that it is provided as a library in the Erlang distribution, so you don’t need to install it. You do, however, need to set things up.

When you start using mnesia for the first time, you typically follow some tutorial or quick-start guide. The setting up is mostly skipped over, with some ready-made code provided so that you don’t “waste any time”. In the long run, though, this leads to misunderstandings and lots of hours lost trying to figure out what you did wrong. So we will take some time going over the setting up in order to better understand what’s going on. It’s not difficult and, most of all, it is logical.

Records

mnesia stores information in tables which are made up of records, so let’s first take a look at those.

Assume that we want to write an application that keeps track of bike rides. For each ride, we want to record who took the ride, when it took place, how long it took, what distance was covered and which route was taken. We could capture this information in a record called ride

-record(ride, {rider, date, duration, distance, itinerary}).

We can guess what the different fields mean, but let’s make it clearer by specifying the type of each field.

-record(ride, {rider :: string(),
               date :: #{day := integer(),
                         month := 1..12,
                         year := 2020..2030},
               duration :: integer(),  %% minutes
               distance :: integer(),  %% kms
               itinerary :: {string(), %% starting place
                             string(), %% place en route
                             string()} %% ending place
              }).

Now this doesn’t change anything from the compiler’s point of view, but it certainly clarifies what we expect as the values of those fields. And it can help if you run dialyzer, of course.

If you are familiar with relational databases, note that we are not limited to so-called scalar types. Here we have date as a map and itinerary as a tuple. Erlang actually allows a field to hold any term. It can even be a function!

Let’s put the definition in a module called blog and write a function blog:record_play/0 to create a ride record and print some information about it.

record_play() ->
    R = #ride{rider = "A",
              date = #{day => 13, month => 6, year => 2022},
              duration = 83,
              distance = 29,
              itinerary = {"p1", "p2", "p3"}},
    io:format("record fields: ~p~n", [record_info(fields, ride)]),
    io:format("record size: ~p~n", [record_info(size, ride)]),
    io:format("R: ~p~n", [R]).

record_info/2 is a pseudo-function that the compiler resolves at compile time. It doesn’t work in the shell, and the record name has to be passed as an atom.

Let’s compile and run it

163> c(blog).            
{ok,blog}
164> blog:record_play().
record fields: [rider,date,duration,distance,itinerary]
record size: 6
R: {ride,"A",#{day => 13,month => 6,year => 2022},83,29,{"p1","p2","p1"}}
ok

As you probably know, a record is simply a tuple whose first element is the name of the record. This is confirmed by the last two outputs.

If we have several ride records and we put each record in a row of a spreadsheet, with each field in a separate column, we get a table like

      rider  date                                   duration  distance  itinerary
ride  "A"    #{day => 13,month => 6,year => 2022}   83        29        {"p1","p2","p1"}
ride  "B"    #{day => 13,month => 6,year => 2022}   41        15        {"p1","p3","p4"}
ride  "A"    #{day => 15,month => 6,year => 2022}   22        10        {"p2","p3","p1"}
ride  "A"    #{day => 18,month => 6,year => 2022}   36        18        {"p3","p2","p3"}
ride  "B"    #{day => 1,month => 7,year => 2022}    90        34        {"p5","p6","p5"}
ride  "B"    #{day => 21,month => 7,year => 2022}   52        16        {"p4","p3","p1"}

And that’s more or less the illusion mnesia provides when we store the records in an mnesia table.

Information Storage

mnesia can keep the tables in RAM, on disc or both, replicating as necessary. Further, mnesia can run in a distributed environment with multiple erlang nodes and in such configurations, it can be set up to replicate the tables on any number of those nodes as well.

We need to tell mnesia the directory in which to store the tables and other information for the Erlang node, by means of the environment variable dir of the application mnesia. We can set the variable in the system config file, set it explicitly before starting mnesia, or provide it on the erl command line. In the latter case, remember that although the directory is a string (enclosed in double quotes), it needs to be provided as an atom, so we enclose the double-quoted string in single quotes. It looks weird at first, but you get used to it.
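
For example, to point mnesia at a hypothetical directory /tmp/blog straight from the command line:

$ erl -mnesia dir '"/tmp/blog"'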

If the dir environment variable is not defined, mnesia assumes the directory Mnesia.Node under the current directory, where Node is the value returned by erlang:node/0 (nonode@nohost if distribution is not enabled).
Please note that the directory is specific for an Erlang node. We cannot use the same directory for different Erlang nodes.

Schema

Once we’ve setup the directory, we need to create a schema. The schema is a table that holds information about all the tables, including their definitions in terms of what kind of records are stored, the nodes the tables are replicated to, whether they are stored in RAM or disc or both, etc.

mnesia cannot run without a schema, so if you try to start mnesia without ever having created one, it will create a schema in RAM.

Since mnesia uses the schema for its operations, you cannot create a schema while mnesia is running. When mnesia is not running, you can create the schema with mnesia:create_schema/1, which takes a list of nodes on which the schema is to be created. The nodes must be up, but mnesia must not be running on any of them. We’ll stick to configurations with just one running node for now.

We populate the schema with table definitions, which we’ll cover shortly, but let’s now try our hands at the concepts so far, as you might be getting restless with all this theory.

Make a temporary directory, e.g. /tmp/blog where the tables will be stored and then start an Erlang shell.

Here follows a short session of commands whose function you likely can guess.

$ mkdir /tmp/blog  
$ erl  
Erlang/OTP 24 [erts-12.0.2] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1] [jit]
Eshell V12.0.2  (abort with ^G)
1> application:get_env(mnesia, dir).
undefined
2> mnesia:system_info(directory).
"/home/mint/projects/blog/mnesia/Mnesia.nonode@nohost"
3> application:set_env(mnesia, dir, "/tmp/blog").
ok
4> mnesia:system_info(directory).                 
"/tmp/blog"
5> mnesia:system_info(use_dir).   
false
6> mnesia:system_info(db_nodes).
[nonode@nohost]
7> mnesia:info().                                 
===> System info in version "4.19.1", debug level = none <===
opt_disc. Directory "/tmp/blog" is NOT used.
use fallback at restart = false
running db nodes   = []
stopped db nodes   = [nonode@nohost]
ok
8> mnesia:create_schema([node()]).
ok
9> mnesia:system_info(use_dir).    
true
10> ls("/tmp/blog").
LATEST.LOG     schema.DAT      
ok
11> mnesia:info().                  
===> System info in version "4.19.1", debug level = none <===
opt_disc. Directory "/tmp/blog" is used.
use fallback at restart = true
running db nodes   = []
stopped db nodes   = [nonode@nohost]
ok
12> mnesia:start().
ok
13> mnesia:info().  
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
schema         : with 1        records occupying 414      words of mem
===> System info in version "4.19.1", debug level = none <===
opt_disc. Directory "/tmp/blog" is used.
use fallback at restart = false
running db nodes   = [nonode@nohost]
stopped db nodes   = []  
master node tables = []
remote             = []
ram_copies         = []
disc_copies        = [schema]
disc_only_copies   = []
[{nonode@nohost,disc_copies}] = [schema]
2 transactions committed, 0 aborted, 0 restarted, 0 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
ok

mnesia:system_info/1 provides information on a number of aspects of the application, most of which can be queried even when it is not running.

directory is obvious. use_dir says whether the said directory is actually being used by mnesia. As we can see from shell lines 5 and 9, this is true only once a schema has been created. And from line 10 we see that a file called schema.DAT has been created. This is actually a dets file, and you can inspect it with dets:traverse/2 if you are really curious.
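
For instance, with mnesia stopped, something along these lines prints every object in the schema file (a sketch; the name schema under which we open the file is our own choice):

{ok, schema} = dets:open_file(schema, [{file, "/tmp/blog/schema.DAT"}]),
dets:traverse(schema, fun(Obj) -> io:format("~p~n", [Obj]), continue end),
dets:close(schema).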

Use of mnesia:info/0 in the shell can provide a lot of information on the current state of the application, most of which should already make sense.

Tables

Let’s move on and see how to create the table to record the rides with mnesia:create_table/2, which takes the name of the table and a list of options as arguments.

Even though we can call the table whatever we want, it is convenient to give it the same name as the records it will store.

mnesia:create_table(
    ride,
    [{attributes, record_info(fields, ride)},
     {disc_copies, [node()]},
     {record_name, ride},
     {type, bag}])

Don’t try it out in the shell yet as it won’t work. Let’s first see what the options are saying and then we’ll see how to get it to work.

attributes says what “columns” the table will have. The compiler-generated pseudo-function record_info/2 comes to the rescue.

disc_copies specifies the nodes where replicas of the table will reside, kept both on disc and in RAM. If you don’t want RAM copies, use the option disc_only_copies instead.

record_name specifies the name of the records that will be stored. The default is the name of the table itself.

Finally, there is type, which can be set, ordered_set or bag. To understand what that means, we need to mention that the first attribute of the record (or the second element of the corresponding tuple, if you prefer) is called the key of the record. A key is used to select a record from among all the records in the table. When the type is set, there is at most one record in the table with a given key. When it is bag, there can be any number of records with the same key, but they must differ in at least one other attribute. Here, we decide to use rider as the key (listing it as the first attribute wasn’t accidental) and bag as the type, as we can have many records for the same rider.

Since record_info is resolved by the compiler at compile time only, we’ll call mnesia:create_table/2 from a module and not from the shell (unless you decide to provide the list of fields explicitly, which is not recommended). If the operation succeeds, it returns {atomic, ok}. Although we will deal with transactions later, this says that the operation was atomic, that is, it succeeded on all nodes. If it had failed on one or more of the nodes, we could be sure the table did not get created on any of them.
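
For instance, a small wrapper in our blog module could look like this (a sketch; the function name create_tables/0 is ours):

create_tables() ->
    {atomic, ok} =
        mnesia:create_table(
            ride,
            [{attributes, record_info(fields, ride)},
             {disc_copies, [node()]},
             {record_name, ride},
             {type, bag}]),
    ok.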

mnesia:info/0 or mnesia:table_info/2 can give us information about the created table.

Reading and writing records

We finally get to the point where we can actually use the table we created.

We’ll later want to add records to the table with mnesia:write/1, the variant of mnesia:write/3 that uses some default values, and view the ones already in the table with mnesia:read/1, the variant of mnesia:read/3, but we start with mnesia:dirty_write/1 and mnesia:dirty_read/2.

We write a function in our blog module to add a ride

add_ride_v1(Rider, Date, Duration, Distance, Itinerary) ->
    mnesia:dirty_write(#ride{rider = Rider,
                             date = Date,
                             duration = Duration,
                             distance = Distance,
                             itinerary = Itinerary}).

and then add a record using it

12> blog:add_ride_v1("A", #{day=>13,month=>6,year=>2022},83,29,{"p1","p2","p3"}).
ok
13> mnesia:table_info(ride,size).
1

We also write a function in our blog module to view a record

view_rides_v1(Rider) ->
    mnesia:dirty_read(ride, Rider).

and then use it to view the record we added

14> blog:view_rides_v1("A").
[{ride,"A",
      #{day => 13,month => 6,year => 2022},
      83,29,
      {"p1","p2","p3"}}]

There is another extremely useful way of viewing the records in a table, which comes in very handy when you are beginning to use mnesia: the observer tool (started with observer:start()). Go to the Table Viewer tab and then, in the View menu, select Mnesia Tables. You will see a list of tables (minus the schema). If you double-click the row showing the ride table, another window opens displaying all the records in the table. If you double-click any row there, that record opens in an edit window where you can even modify the values, but be careful when doing that.

Note that the read operation returns a list, to accommodate the fact that there could be multiple matching records, or none at all. Try adding the records in the table we saw earlier (a helper for this follows below) and view the records for rider “A”. Try reading the records for riders “B” and “C”.
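
To save some typing, here is a possible helper (the name is ours) that loads the sample rides from the table above:

add_sample_rides() ->
    Rides = [{"A", #{day=>13,month=>6,year=>2022}, 83, 29, {"p1","p2","p1"}},
             {"B", #{day=>13,month=>6,year=>2022}, 41, 15, {"p1","p3","p4"}},
             {"A", #{day=>15,month=>6,year=>2022}, 22, 10, {"p2","p3","p1"}},
             {"A", #{day=>18,month=>6,year=>2022}, 36, 18, {"p3","p2","p3"}},
             {"B", #{day=>1,month=>7,year=>2022},  90, 34, {"p5","p6","p5"}},
             {"B", #{day=>21,month=>7,year=>2022}, 52, 16, {"p4","p3","p1"}}],
    [ok = add_ride_v1(R, Da, Du, Di, I) || {R, Da, Du, Di, I} <- Rides],
    ok.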

We are now ready for transactions.

Transactions

Let’s assume we want to count all adding and viewing operations for every rider. We keep the counts in another table called accounting with accounting records defined as

-record(accounting, {rider, read_count = 0, write_count = 0}).

and create the corresponding table with

mnesia:create_table(
    accounting,
    [{attributes, record_info(fields, accounting)},
     {disc_copies, [node()]},
     {type, set}])

Now every time we add a ride record, we also increment the corresponding write count. We modify our add_ride function to do that

add_ride_v2(Rider, Date, Duration, Distance, Itinerary) ->
    mnesia:dirty_write(#ride{rider = Rider,
                             date = Date,
                             duration = Duration,
                             distance = Distance,
                             itinerary = Itinerary}),
    %% mnesia:stop(),
    case mnesia:dirty_read(accounting, Rider) of
        [] -> %% no accounting for Rider yet
            mnesia:dirty_write(#accounting{rider = Rider,
                                           write_count = 1});
        [#accounting{write_count = Writes} = Rec] ->
            mnesia:dirty_write(Rec#accounting{write_count = Writes + 1})
    end.

Similarly, we modify our viewing function to increment the corresponding read count

view_rides_v2(Rider) ->
    case mnesia:dirty_read(accounting, Rider) of
       [] -> %% no accounting for Rider yet
            mnesia:dirty_write(#accounting{rider = Rider,
                                           read_count = 1});
       [#accounting{read_count = Reads} = Rec] ->
           %% timer:sleep(1000),
           mnesia:dirty_write(Rec#accounting{read_count = Reads + 1})
    end,
    %% mnesia:stop(),
    mnesia:dirty_read(ride, Rider).

Ignore the commented out mnesia:stop and timer:sleep calls for now.

If we compile the module again and try the new versions, we’ll see that things work as expected.

29> blog:add_ride_v2("D", #{day=>21,month=>7,year=>2022},52,16,{"p4","p3","p1"}).
ok
30> blog:view_rides_v2("D").
[{ride,"D",
      #{day => 21,month => 7,year => 2022},
      52,16,
      {"p4","p3","p1"}}]
31> mnesia:dirty_read(accounting, "D").
[{accounting,"D",1,1}]

However, there’s a problem with this.

The functions are not atomic. When adding a record, what if something goes wrong after the ride record has been written but before the count is updated? We will have added a ride record without accounting for it. In the same way, in the viewing function, if something goes wrong after the read count has been incremented but before the ride record is actually read, we will have accounted for a viewing that never happened.

To verify this, let’s simulate a failure in the viewing function by uncommenting mnesia:stop().

If we try viewing the record now, mnesia will stop. Starting it again and checking the counts in the accounting table will show that the count has indeed been incremented even though we never actually viewed the record.

40> blog:view_rides_v2("D").                  
=INFO REPORT====
   application: mnesia
   exited: stopped
   type: temporary

** exception exit: {aborted,{no_exists,[ride,"D"]}}  
  in function mnesia:abort/1 (mnesia.erl, line 361)
41> mnesia:start().  
ok  
42> mnesia:dirty_read(accounting, "D").  
[{accounting,"D",2,1}]

This is where transactions come in. A transaction is a set of operations that either succeeds as a whole or is aborted. When it succeeds, all operations are guaranteed to have succeeded. When it aborts, it is as if none of the operations ever ran. When a function is executed within an mnesia transaction, mnesia provides that guarantee for all mnesia operations in it.

The non-dirty variants of read and write can only be used inside a transaction. Here’s how we can rewrite our viewing function to run in a transaction context

view_rides_v3(Rider) ->
    Inspect =
        fun() ->
            case mnesia:read(accounting, Rider) of
                [] -> %% no accounting yet
                    mnesia:write(#accounting{rider = Rider,
                                             read_count = 1});
                [#accounting{read_count = Reads} = Rec] ->
                    %% timer:sleep(1000),
                    mnesia:write(Rec#accounting{read_count = Reads + 1})
            end,
            %% mnesia:stop(),
            mnesia:read(ride, Rider)
        end,
    mnesia:transaction(Inspect).

We can easily verify that this works just like the previous version. Now, as before, uncomment the mnesia:stop instruction, compile and run it again. mnesia will stop, but this time, when we start mnesia again, the read count will not have been incremented.

58> blog:view_rides_v3("D").            
{atomic,[{ride,"D",
              #{day => 21,month => 7,year => 2022},
              52,16,
              {"p4","p3","p1"}}]}
59> mnesia:dirty_read(accounting, "D").
[{accounting,"D",3,1}]
60> c(blog).                            
{ok,blog}
61> blog:view_rides_v3("D").            
** exception exit: shutdown
62> =INFO REPORT==== 12-Aug-2022::17:15:41.989913 ===
   application: mnesia
   exited: stopped
   type: temporary

62> mnesia:start().                       
ok  
63> mnesia:dirty_read(accounting, "D").  
[{accounting,"D",3,1}]

Locks

The question of multiple operations all succeeding together is just one aspect of a transaction. There is another serious problem that transactions solve, and it has to do with concurrency.

Imagine two processes trying to view a rider’s records at the same time. Both processes could read the same read_count and both increment it by one, with the result that the read count increases by only one even though there were two read operations.

We can simulate the simultaneous reading of the read count by waiting a little before incrementing it. So let’s uncomment timer:sleep(1000) instead of mnesia:stop() in the viewing functions, recompile and spawn a couple of processes that both try to view the record, e.g. as follows

86> [spawn(fun() -> io:format("~p~n", [blog:view_rides_v2("D")]) end) 
       || _ <- [1,2]].    
[<0.3998.0>,<0.3999.0>]
[{ride,"D",#{day => 21,month => 7,year => 2022},52,16,{"p4","p3","p1"}}]
[{ride,"D",#{day => 21,month => 7,year => 2022},52,16,{"p4","p3","p1"}}]
87> mnesia:dirty_read(accounting,"D").
[{accounting,"D",4,1}]

As expected, the read count has increased by only one.

Now try the version that uses a transaction context

88> [spawn(fun() -> io:format("~p~n", [blog:view_rides_v3("D")]) end) 
      || _ <- [1,2]]. 
[<0.4016.0>,<0.4017.0>]
{atomic,[{ride,"D",
              #{day => 21,month => 7,year => 2022},
              52,16,
              {"p4","p3","p1"}}]}
{atomic,[{ride,"D",
              #{day => 21,month => 7,year => 2022},
              52,16,
              {"p4","p3","p1"}}]}
89> mnesia:dirty_read(accounting,"D").
[{accounting,"D",6,1}]

Surprised? This version counts both read operations correctly. But how exactly did that happen?

The magic works using so-called locks. In a transaction context, a process can read a record only if it has acquired a read lock on the key of that record (or on the entire table), and a process can write a record in a table only if it has acquired a write lock on its key (or on the entire table).

Incidentally, this is the classic problem of resources shared among multiple processes. For process-local variables, Erlang solves it by making them immutable.

Several processes may hold read locks on the same key at the same time, but a write lock is exclusive: it is granted only when no other process holds any lock on that key. If a process tries to acquire a lock that conflicts with one already held by another process, it blocks until that lock is released. So once a process acquires a write lock on a key, it can be sure that no other process in a transaction context will read or write records with that key until the lock is released, and a process holding a read lock can be sure that nobody will write records with that key in the meantime.

Not only does mnesia acquire the requested locks before executing its read and write functions in a transaction context (releasing them when the transaction ends), it also takes care of the fact that tables may be replicated on other nodes. In fact, a read lock is acquired on one node, preferably the local node if a replica exists there, while a write lock is acquired on all active nodes where the table is replicated.

However, blocking a process on the acquisition of a lock can create problems of so-called deadlocks. If one process is holding a lock on key X and needs a lock on key Y, while another process is holding a lock on key Y and needs a lock on key X, they will both block indefinitely.

mnesia solves this problem by aborting and retrying the ongoing transaction whenever it cannot acquire a lock because some other process is holding it. When a transaction is aborted, mnesia rolls back any changes the transaction made to records. Although this makes mnesia deadlock free, it means that a transaction may be tried many times before it succeeds. And since mnesia can only roll back its own changes, all other actions, particularly ones with side effects, will not be rolled back. For example, if you send a message to some process during the transaction, you might end up sending that message many times. So, be careful and try not to put statements with side effects inside transactions.

Queries

Now let’s try some more complex queries, in particular queries where the key of the records is not known or is irrelevant. These generally require an exhaustive search of the entire table, unless some other means are available to reduce the search space.

We’ll examine three ways of searching through the entire table using mnesia:foldl, mnesia:select and query list comprehensions. In the next section we look at a way to reduce the search space by means of indexes.

We’ll try to get a list of the durations of all rides of 10 km or more that had p2 in their itinerary.

We start with mnesia’s fold operations, which have semantics similar to the fold operations on lists. Here’s how we might do it in blog:q1/2

q1(MinDistance, Place) ->
    Find =
        fun() ->
            mnesia:foldl( %% or foldr
                fun(#ride{distance = Distance,
                          itinerary = {P1, P2, P3},
                          duration = Duration}, Acc)
                      when Distance >= MinDistance andalso
                           (P1 =:= Place orelse
                            P2 =:= Place orelse
                            P3 =:= Place) ->
                        [Duration | Acc];
                   (_, Acc) ->
                        Acc
                end, [], ride)
        end,
    mnesia:transaction(Find).

If we compile again and try the function we get the desired answers

111> c(blog).
{ok,blog}
112> blog:q1(10, "p2").  
{atomic,"$S"}
113> blog:q1(20, "p2").  
{atomic,"S"}

Those strings are actually lists of integers. If we want to quickly see which integers they are, we can force the shell to print the list numerically by appending a zero, which makes the list an unprintable string

114> "$S" ++ [0].
[36,83,0]

Another way is to use mnesia’s select function which employs Erlang Match Specification. We do this in function blog:q2/2

q2(MinDistance, Place) ->
    Find =
        fun() ->
            MatchHead = #ride{distance = '$1',
                              itinerary = {'$2', '$3', '$4'},
                              duration = '$5',
                              _ = '_'},
            Guards = [{'>=', '$1', MinDistance},
                      {'orelse',
                        {'=:=', '$2', Place},
                        {'orelse',
                          {'=:=', '$3', Place}, {'=:=', '$4', Place}}}],
            Results = ['$5'],
            mnesia:select(ride, [{MatchHead, Guards, Results}])
        end,
    mnesia:transaction(Find).

Compiling and running q2 gives us the same answers (though possibly in a different order)

132> c(blog).                
{ok,blog}
133> blog:q2(10, "p2").
{atomic,"S$"}

Now this is arguably quite cryptic if you are not familiar with Erlang Match Specifications. Very briefly, a match specification is a list of match functions.

Each match function is a tuple of a match head, guards and results.

The match head is a sort of record template, where attributes of the record can be bound to variables, the variables being expressed as atoms of the form '$N'. Attributes that we are not interested in can be bound to the “don’t care” variable '_'. In our example, we have bound distance to the variable '$1' and duration to '$5'.

Guards is a list of conditions, each expressed as a tuple whose first element is the condition function. In our example, we have two conditions. The first is that distance is greater than or equal to MinDistance. The second says Place is one of the three elements in itinerary.

Lastly results says what we want to extract from the records that match. In our example we simply fetch the duration.

Querying using match specifications tends to be more efficient than going through the entire table with a fold.
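
Incidentally, if writing match specifications by hand feels error-prone, the standard library can generate them from a fun with ets:fun2ms/1, after including ms_transform.hrl. A sketch of an alternative implementation (the name q2b is ours):

-include_lib("stdlib/include/ms_transform.hrl").

q2b(MinDistance, Place) ->
    %% the fun below is transformed into a match specification at compile time
    MS = ets:fun2ms(
             fun(#ride{distance = Distance,
                       itinerary = {P1, P2, P3},
                       duration = Duration})
                   when Distance >= MinDistance andalso
                        (P1 =:= Place orelse
                         P2 =:= Place orelse
                         P3 =:= Place) ->
                 Duration
             end),
    mnesia:transaction(fun() -> mnesia:select(ride, MS) end).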

The last method we’ll examine uses query list comprehensions, implemented in the qlc module of the Erlang distribution. It is a generic set of functions for working with abstractions called QLC tables. The table/1 and table/2 functions in the mnesia, ets and dets modules provide a way to obtain so-called query handles on mnesia, ets and dets tables respectively.

Query list comprehensions are like list comprehensions, but instead of applying to and producing lists, they apply to and produce query handles. To obtain the result of a query, qlc:e/1 is applied to the query handle.

We can get all records of a table by evaluating the query handle corresponding to an mnesia table

142> QH = mnesia:table(ride).  
{qlc_handle,{qlc_table,#Fun<mnesia.23.126602418>,true,  
        #Fun<mnesia.24.126602418>,#Fun<mnesia.25.126602418>,  
        #Fun<mnesia.26.126602418>,#Fun<mnesia.29.126602418>,  
        #Fun<mnesia.28.126602418>,#Fun<mnesia.27.126602418>,'=:=',  
        undefined,no_match_spec}}
143> mnesia:transaction(fun() -> qlc:e(QH) end).
{atomic,[{ride,"A",
              #{day => 13,month => 6,year => 2022},
              83,29,
              {"p1","p2","p3"}},
        {ride,"A",
              #{day => 15,month => 6,year => 2022},
              22,10,
              {"p1","p3","p4"}},
        {ride,"A",
              #{day => 18,month => 6,year => 2022},
              36,18,
              {"p2","p3","p1"}}, 
        …

We can transform a query handle by applying qlc:q/1 to a query list comprehension. We’ll do that in blog:q3/2, the implementation of our previous query on rides longer than a certain distance and with a given place in the itinerary. In order to use qlc, we must include qlc.hrl

-include_lib("stdlib/include/qlc.hrl").
q3(MinDistance, Place) ->
    Find =
        fun() ->
            Table = mnesia:table(ride),
            Query = qlc:q([R#ride.duration
                            || R = #ride{distance = Distance,
                                         itinerary = Itinerary,
                                         duration = Duration}
                            <- Table , Distance >= MinDistance,
                            lists:member(Place, tuple_to_list(Itinerary))]),
            qlc:e(Query)
        end,
    mnesia:transaction(Find).

Once again, if we compile and run the function, we’ll get the same results

165> c(blog).           
{ok,blog}
166> blog:q3(10, "p2").
{atomic,"S$"}

Indexes

When a query would have been more efficient if only some other field of the record were the key, we can make use of indexes, or secondary keys. They are particularly useful when matching fields against specific values rather than general constraints.

We add an index, or secondary key, to a table with mnesia:add_table_index/2 and can then read records via the secondary key with mnesia:index_read/3

179> mnesia:add_table_index(ride, distance).
{atomic,ok}
180> mnesia:transaction(fun() -> mnesia:index_read(ride, 36, distance) end).
{atomic,[]}
181> mnesia:transaction(fun() -> mnesia:index_read(ride, 16, distance) end).
{atomic,[{ride,"B",
              #{day => 21,month => 7,year => 2022},
              52,16,
              {"p4","p3","p1"}}]}

We could have made the same query even without the index on distance using mnesia:match_object/1, which we haven’t talked about yet

183> rr("blog.erl").
[accounting,ride]
184> mnesia:transaction(fun() -> mnesia:match_object(#ride{distance = 16, _ = '_'}) end).
{atomic,[#ride{rider = "B",
               date = #{day => 21,month => 7,year => 2022},
               duration = 52,distance = 16,
               itinerary = {"p4","p3","p1"}}]}

An aside: to try this out directly from the shell, I imported the record definitions with the shell’s rr/1 function, with the added benefit that the output gets printed as records instead of tuples.

Even though it is not obvious from the output, mnesia tries to make use of any indexes that exist on the table, automatically using mnesia:index_match_object/2 when it can. In fact, mnesia does this with query list comprehensions as well.

Fault Tolerance

Let’s now have a look at how table replication to different nodes helps make a fault-tolerant database. This makes mnesia really stand apart, as I mentioned in the introduction.

Let’s begin by creating three directories for three different nodes, n1, n2 and n3, and starting the nodes (each in its own terminal), making sure that mnesia’s environment variable dir is bound to the path of that node’s directory.

$ mkdir /tmp/db1 /tmp/db2 /tmp/db3
$ erl -sname n3 -mnesia dir '"/tmp/db3"'  
Eshell V12.0.2  (abort with ^G)
(n3@arif-mint)1> nodes().
[]
(n3@arif-mint)2> [net_adm:ping('n1@arif-mint'), net_adm:ping('n2@arif-mint')].
[pong,pong]
(n3@arif-mint)3> Nodes = [node() | nodes()].
['n3@arif-mint','n1@arif-mint','n2@arif-mint']
(n3@arif-mint)4> [rpc:call(N, mnesia, system_info, [use_dir]) || N <- Nodes].
[false,false,false]

Now we create the schema

(n3@arif-mint)5> mnesia:create_schema(Nodes).
ok
(n3@arif-mint)6> [rpc:call(N, mnesia, system_info, [use_dir]) || N <- Nodes].
[true,true,true]

Next we start mnesia on all the nodes

(n3@arif-mint)7> [rpc:call(N, mnesia, start, []) || N <- [node() | nodes()]].
[ok,ok,ok]

Now we can create the tables. To allow us to work in the shell, we import the record definitions from our module blog and print an empty record to get the attributes.

(n3@arif-mint)8> rr(blog).
[accounting,ride]
(n3@arif-mint)9> #ride{}.
#ride{rider = undefined,date = undefined,
     duration = undefined,distance = undefined,
     itinerary = undefined}
(n3@arif-mint)10> mnesia:create_table(ride, [{attributes, [rider, date, duration, distance, itinerary]}, {disc_copies, Nodes}, {type, bag}]).
{atomic,ok}
(n3@arif-mint)11> #accounting{}.
#accounting{rider = undefined,read_count = 0,
           write_count = 0}
(n3@arif-mint)12> mnesia:create_table(accounting, [{attributes, [rider, read_count, write_count]}, {disc_copies, Nodes}, {type, set}]).
{atomic,ok}
(n3@arif-mint)13> [rpc:call(N, mnesia, system_info, [tables]) || N <- Nodes].
[[accounting,ride,schema], 
 [accounting,ride,schema], 
 [accounting,ride,schema]] 

Let’s now write a ride record and verify we can read it on all of the nodes

(n3@arif-mint)18> mnesia:transaction(fun() -> mnesia:write(#ride{rider = "R", distance = 10, duration = 32}) end).
{atomic,ok}
(n3@arif-mint)19> [rpc:call(N, mnesia, dirty_read, [ride, "R"]) || N <- Nodes].
[[#ride{rider = "R",date = undefined,duration = 32,
       distance = 10,itinerary = undefined}],
[#ride{rider = "R",date = undefined,duration = 32,
       distance = 10,itinerary = undefined}],
[#ride{rider = "R",date = undefined,duration = 32,
       distance = 10,itinerary = undefined}]]

Now let’s stop node n1, restart it, start mnesia on it and verify that we can read the record we wrote from node n3

$ erl -sname n1 -mnesia dir '"/tmp/db1"'
Eshell V12.0.2  (abort with ^G)
(n1@arif-mint)1> mnesia:start().
ok
(n1@arif-mint)2> nodes().
['n2@arif-mint']
(n1@arif-mint)3> mnesia:dirty_read(ride, "K").
[{ride,"K",undefined,53,20,undefined}]

Well, I suppose that should be sufficient to convince us of mnesia’s resilience when nodes fail.

However, don’t be deceived into believing mnesia is some kind of magical solve-all tool. Certain failure scenarios can leave the nodes in an inconsistent state. When that happens, mnesia can detect it and stop with a specific message. mnesia provides tools for dealing with such situations, the simplest being to declare one of the nodes the master node (see mnesia:set_master_nodes/1) and have its tables replicated to the remaining nodes. But that is advanced territory, beyond the realm of a simple getting started. What’s important to remember is that the problems are real, and there are tools to help solve them.

Letting wx widgets crash

Fault-tolerance

One of the features of Erlang that I love is the possibility of “letting it crash”, made possible by supervision trees in which processes can crash and will be brought back to life by the supervisor.

We cannot directly supervise wx_objects because they do not adhere to OTP design requirements [1].

In this post we’ll explore ways to make a fault-tolerant GUI that we can “let crash”.

Let’s start with an extremely simple application called click_counter, which displays a button with a label that says how many times it has been clicked. The application has one gen_server, called main, which is supervised by the application’s top supervisor. We use one_for_one supervision with a restart strategy of permanent for main.
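
The child specification for main might look something like this (a sketch; main:start_link/0 is assumed to exist):

%% in the top supervisor's init/1
#{id => main,
  start => {main, start_link, []},
  restart => permanent,   %% we'll revisit this choice later
  shutdown => 5000,
  type => worker}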

The button will later be implemented as a wx_object called slave, but for now let’s just implement it as a plain button widget.

The following is a possible implementation of main:init/1

init([]) ->  %% in main.erl
    wx:new(),
    Caption = io_lib:format("click_counter ~p", [self()]),
    Frame = wxFrame:new(wx:null(), ?wxID_ANY, Caption),

    Slave = wxButton:new(Frame, 1000, [{label, "Never been clicked"}]), 

    Sizer = wxBoxSizer:new(?wxVERTICAL),
    wxSizer:add(Sizer, Slave, [{flag, ?wxEXPAND bor ?wxALL}, {border, 5}]),

    wxFrame:setSizer(Frame, Sizer),

    wxFrame:connect(Frame, close_window), 
    wxButton:connect(Slave, command_button_clicked),

    wxFrame:show(Frame),

    {ok, #{frame => Frame,
           sizer => Sizer,
           slave => Slave,
           click_count => 0}}.

That is quite straightforward. We give the button an id of 1000. We keep the frame, the sizer and the button, as well as the click count, in main’s state. We also subscribe to the window-close and button-click events with the connect/2 calls.

We handle those events in the main:handle_info/2 callback [2]

handle_info(#wx{event = #wxClose{}}, State) ->
    {stop, normal, State};

handle_info(#wx{id=1000}, State = #{slave := Button,
                                    click_count := CurrCount}) ->
    Count = CurrCount + 1,
    Label = io_lib:format("Clicked ~p times", [Count]),
    wxButton:setLabel(Button, Label),
    {noreply, State#{click_count => Count}}.

The window closing event requests the stopping of the gen_server. We’ll see later that this handling is not quite what we’d desire.

The button click event increments the click count and updates the button’s label accordingly.

We also clean up wx resources in main:terminate/2

terminate(_Reason, _State) ->  %% in main.erl
    wx:destroy().

Let’s now simulate the crashing of main. We’ll send it a message, crash, and handle it in such a way as to make the process crash.

handle_info(crash, State) ->  %% in main.erl
    1 = 2, %% this crashes
    {noreply, State};

We could use erlang:exit/1, but I prefer something like 1=2 to simulate unintended crashing.

If we start the application, a frame with a caption like “click_counter <0.84.0>” is displayed.

If we click on the button, its label will get updated with the number of times we’ve clicked it.

If we send the message crash to main with

1> main ! crash.

main will terminate, cleaning up wx as programmed. However, the supervisor will restart the process and a new frame will be displayed. We can see it is a new frame because the pid in the frame’s caption will be different. The state is not preserved, so the button label will indicate it has never been clicked.

Essentially, we have a fault-tolerant GUI. If something in the code makes the GUI process crash, we’ll have it restarted. But all is not well, as we’ll see next.

Let’s try closing the frame by clicking on its close icon.

Even though we might have expected the frame to disappear, it has been restarted, just as it would have in case the process crashed. That’s because we are using the permanent restart strategy in the main child specification, so even though the process was terminated “normally”, the supervisor restarts it all the same.

We can fix that by changing the restart strategy to transient. Now the frame will be restarted if we crash main, but will disappear when we close the window. However, the click_counter application will still be running, so we have to figure out how to stop it as well.

An erroneous way would be to stop the application explicitly when the reason for the process termination is normal or shutdown

terminate(Reason, _State) ->  %% in main.erl
    wx:destroy(),
    case Reason of
        R when R =:= normal;
               R =:= shutdown ->
            application:stop(click_counter);
        _ ->
            ok
    end.

It’s erroneous because even though the application does get stopped after a while, that happens because of a timeout, not because the application stopped cleanly. We’re trying to stop the application synchronously from within a process that is part of that application.

We can improve on that by spawning a separate process to stop the application

spawn(fun() -> application:stop(click_counter) end);  
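
Putting it together, terminate/2 could then read (a sketch):

terminate(Reason, _State) ->  %% in main.erl
    wx:destroy(),
    case Reason of
        R when R =:= normal;
               R =:= shutdown ->
            %% stop the application from outside this process
            spawn(fun() -> application:stop(click_counter) end);
        _ ->
            ok
    end.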

Concurrency

Now let’s see how to extend this principle to widgets that are controlled by different processes.

We’ll modify main so that the button is created in a separate process, slave, with the behaviour wx_object. We’ll move the button creation and handling of click counts to that slave process.

init([Parent]) ->   %% in slave.erl 
    Button = wxButton:new(Parent, 1000, [{label, "Never been clicked"}]),
    wxButton:connect(Button, command_button_clicked),
    {Button, #{parent => Parent,
               button => Button,
               click_count => 0}}.

We modify main:init/1 to use this wx_object instead of explicitly creating the button widget.

Slave = wx_object:start_link({local, slave}, slave, [Frame], []),

Note that we’re using wx_object:start_link/4 in order to register the slave process name.

We’ll handle click counting in slave in the wx_object:handle_event/2 callback. In this case it’s just a name change from handle_info to handle_event.
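
In slave.erl it might look like this (a sketch mirroring the earlier handle_info clause):

handle_event(#wx{id = 1000}, State = #{button := Button,
                                       click_count := CurrCount}) ->
    Count = CurrCount + 1,
    Label = io_lib:format("Clicked ~p times", [Count]),
    wxButton:setLabel(Button, Label),
    {noreply, State#{click_count => Count}}.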

We’ll also clean up wx resources in the wx_object terminate/2 callback

terminate(_Reason, #{button := Button}) ->
    wxButton:destroy(Button). 

Note that slave is now linked to main. If slave crashes, it will make main crash too. Vice versa, if main crashes, it will kill slave as well. In both cases, main will be restarted, which will create the slave again.

This is a brutal solution of sorts, which is fine in this simple example, but if we have several slave widgets, we’d probably want a cleaner solution.

One way is to handle the crashing slave in main.

For this, we setup main to handle trap_exits

process_flag(trap_exit, true),  %% in main:init/1

When the slave does crash, an 'EXIT' signal will be sent to main, where we can handle it

handle_info({'EXIT', _, _},
            State = #{frame := Frame,
                      sizer := Sizer,
                      slave := Slave}) ->
    NewSlave = wx_object:start_link({local, slave}, slave, [Frame], []),
    true = wxSizer:replace(Sizer, Slave, NewSlave),
    wxSizer:layout(Sizer),
    {noreply, State#{slave => NewSlave}};

We can verify this works by clicking the slave button a few times so that the click count changes, and then making it crash. The button gets redrawn with the initial “Never been clicked” label, but the pid in the frame’s caption will not have changed.

To conclude, we can make supervisor-friendly GUIs by employing plain gen_servers instead of wx_objects. And we can add wx_objects to sizers, instead of plain wx widgets. Those wx_objects can be linked to the gen_servers in order to implement suitable strategies to recover from any crashes in them.

[1] For example, wx_object:start_link does not return {ok, Pid} upon success.

[2] If main had been a wx_object, we would have handled the events in its handle_event/2 callback.

Starting to play with xmerl

Suppose we have the following snippet of xml:

<book>
  <person>
    <first>Kiran</first>
    <last>Pai</last>
    <age>22</age>
  </person>
</book>

The element book may contain any number of person elements, but here we have just one to keep things simple.

Our aim is to transform this snippet into an Erlang term: {book, [{person, [{first, "Kiran"}, {last, "Pai"}, {age, "22"}]}]}, so that we can manipulate its values.

We’ll first bind the xml as a string to a variable, say X

1> X = "<book>
         <person>
	     <first>Kiran</first>
	     <last>Pai</last>
	     <age>22</age>
	  </person>
	</book>".

Before we parse it, we’ll load some record definitions from the xmerl library (xmerl.hrl, which lives under the xmerl application’s include directory)

2> rr("path_to_xmerl.hrl").
[xmerl_event,xmerl_fun_states,xmerl_scanner,xmlAttribute,
 xmlComment,xmlContext,xmlDecl,xmlDocument,xmlElement,
 xmlNamespace,xmlNode,xmlNsNode,xmlObj,xmlPI,xmlText]

Now we scan X with xmerl_scan:string/1

3> {Book, Rest} = xmerl_scan:string(X).
{#xmlElement{
     name = book,expanded_name = book,nsinfo = [],
     namespace = #xmlNamespace{default = [],nodes = []},
     parents = [],pos = 1,attributes = [],
     content = 
         [#xmlText{
              parents = [{book,1}],
              pos = 1,language = [],value = "\n  ",type = text},
          #xmlElement{
              name = person,expanded_name = person,nsinfo = [],
              namespace = #xmlNamespace{default = [],nodes = []},
              parents = [{book,1}],
              pos = 2,attributes = [],
              content = 
                  [#xmlText{
                       parents = [{person,2},{book,1}],
                       pos = 1,language = [],value = "\n    ",type = text},
                   #xmlElement{
                       name = first,expanded_name = first,nsinfo = [],
                       namespace = #xmlNamespace{...},
                       parents = [...],...},
                   #xmlText{
                       parents = [{person,2},{book,...}],
                       pos = 3,language = [],
                       value = [...],...},
                   #xmlElement{
                       name = last,expanded_name = last,nsinfo = [],...},
                   #xmlText{parents = [{...}|...],pos = 5,...},
                   #xmlElement{name = age,...},
                   #xmlText{...}],
              language = [],
              xmlbase = 
                  "C:/Users/Desktop/erlang/projects/play/xmerl",
              elementdef = undeclared},
          #xmlText{
              parents = [{book,1}],
              pos = 3,language = [],value = "\n",type = text}],
     language = [],
     xmlbase = "C:/temp/play",
     elementdef = undeclared},
 []}

And even though it looks scary, we can see that element book did get recognized.

Let’s see how we can extract the elements the way we wanted to.

The content field of the book xmlElement record is a list of person and text elements. The content field of the person xmlElement, in turn, is a list of first, last, age and text elements. Finally, the content field of the first, last and age elements is a list containing a single xmlText record whose value field is the actual value of those elements.

So, the person element can be extracted from Book.

5> [Person] = [E || E <- Book#xmlElement.content, is_record(E, xmlElement)].

Similarly, we can extract the elements first, last and age from Person.

6> Leaves = [First, Last, Age] = 
6> [E || E <- Person#xmlElement.content, is_record(E, xmlElement)].

Finally, we can extract the values of the leaf elements:

7> lists:map(fun(E) -> [T] = E#xmlElement.content, 
7>                     {E#xmlElement.name, T#xmlText.value} end, 
7>           Leaves).       
[{first,"Kiran"},{last,"Pai"},{age,"22"}]

This can become tedious when we have to deal with anything even slightly more complex than this. But don’t despair: xmerl can be customized. The details are in the customization tutorial linked from the xmerl documentation.

We will play with the hook and accumulator functions, but there are others you may want to consider: event, fetch, continuation, rules and close functions.

Starting with the hook function, the tutorial says it is called when the processor has parsed a complete entity. The function is called with the Entity and the global state of the processor and it expects you to return a tuple with the possibly transformed entity and global state.

For our case, let’s use the hook function to transform every xmlElement into a tuple made up of its name and content.

8> HookFun = fun(#xmlElement{name=Name, content=Cont}, GS) -> 
8>                 {{Name, Cont}, GS};
8>              (E, GS) -> {E, GS}
8>           end.

If we now use this hook function as an option to our string scanning function, we get:

9> xmerl_scan:string(X, [{hook_fun, HookFun}]).                    
{{book,[#xmlText{parents = [{book,1}],
                 pos = 1,language = [],value = "\n  ",type = text},
        {person,[#xmlText{parents = [{person,2},{book,1}],
                          pos = 1,language = [],value = "\n    ",type = text},
                 {first,[#xmlText{parents = [{first,2},{person,2},{book,1}],
                                  pos = 1,language = [],value = "Kiran",
                                  type = text}]},
                 #xmlText{parents = [{person,2},{book,1}],
                          pos = 3,language = [],value = "\n    ",type = text},
                 {last,[#xmlText{parents = [{last,4},{person,2},{book,1}],
                                 pos = 1,language = [],value = "Pai",
                                 type = text}]},
                 #xmlText{parents = [{person,2},{book,1}],
                          pos = 5,language = [],value = "\n    ",type = text},
                 {age,[#xmlText{parents = [{age,6},{person,2},{book,1}],
                                pos = 1,language = [],value = "22",
                                type = text}]},
                 #xmlText{parents = [{person,2},{book,1}],
                          pos = 7,language = [],value = "\n  ",type = text}]},
        #xmlText{parents = [{book,1}],
                 pos = 3,language = [],value = "\n",type = text}]},
 []}

Not bad.

Let’s see what the accumulator function does.

Again, the customization tutorial says it is called to accumulate the contents of an entity. The function takes the parsed entity, an accumulator and the global state as parameters and should return a tuple containing a possibly modified accumulator and the global state.

If you don’t specify any accumulator function, one will be used by default, and according to the tutorial, it is:

fun(ParsedEntity, Acc, GlobalState) ->
  {[ParsedEntity | Acc], GlobalState}
end.

The tutorial also shows a “non accumulating” function:

fun(ParsedEntity, Acc, GlobalState) ->
  {Acc, GlobalState}
end.

Well, this seems to totally ignore the ParsedEntity!

Perhaps we could use it to ignore text elements that are only whitespace?

 
10> AccFun = fun(#xmlText{value=V} = E, Acc, GS) -> 
10>               case re:run(V, "^\\s*$") of
10>                 {match, _} -> {Acc, GS}; 
10>                 nomatch -> {[E | Acc], GS} 
10>               end; 
10>             (E,Acc,GS) -> 
10>               {[E|Acc], GS} 
10>           end. 

If the parsed element is a text element and its value is all whitespace, we ignore the parsed element. Otherwise we accumulate it, just like the default accumulator function.

Let’s see what happens if we use both the hook function and the accumulator function:

11> xmerl_scan:string(X, [{acc_fun, AccFun}, {hook_fun, HookFun}]).
{{book,
     [{person,
          [{first,
               [#xmlText{
                    parents = [{first,2},{person,2},{book,1}],
                    pos = 1,language = [],value = "Kiran",type = text}]},
           {last,
               [#xmlText{
                    parents = [{last,4},{person,2},{book,1}],
                    pos = 1,language = [],value = "Pai",type = text}]},
           {age,
               [#xmlText{
                    parents = [{age,6},{person,2},{book,1}],
                    pos = 1,language = [],value = "22",type = text}]}]}]},
 []}

Oh, well!

The elements with text content weren’t touched. But those are the very elements whose content value we are interested in. So let’s intercept them in the accumulator function and, instead of accumulating the text element itself, accumulate just its value.

12> AccFun2 = fun(#xmlText{value=V} = E, Acc, GS) -> 
12>               case re:run(V, "^\\s*$") of
12>                 {match, _} -> {Acc, GS}; 
12>                 nomatch -> {[V | Acc], GS} 
12>               end; 
12>             (E,Acc,GS) -> 
12>               {[E|Acc], GS} 
12>           end. 

Not a very big change: we just use V instead of E in the nomatch clause. But the result is a big change!

13> xmerl_scan:string(X, [{acc_fun, AccFun2}, {hook_fun, HookFun}]).
{{book,[{person,[{first,["Kiran"]},                 
                 {last,["Pai"]},
                 {age,["22"]}]}]},
 []}

Now that is something. The only niggle is that we wanted the values of elements with text content to be plain strings, not lists of one string each. We can take care of that in the hook function. Instead of returning the content straight away, we check whether the content is simply a list of one string, and if it is, we return that string instead.

14> HookFun2 = fun(#xmlElement{name=Name, content=[Cont]}, GS) 
14>                  when is_list(Cont) -> 
14>                    {{Name, Cont}, GS}; 
14>               (#xmlElement{name=Name, content=Cont}, GS) -> 
14>                    {{Name, Cont}, GS}; 
14>               (E, GS) -> 
14>                    {E, GS} 
14>            end.

If we now employ this modified hook function together with the accumulator function, we get:

15> xmerl_scan:string(X, [{acc_fun, AccFun2}, {hook_fun, HookFun2}]).
{{book,[{person,[{first,"Kiran"},{last,"Pai"},{age,"22"}]}]},                
 []}  

What if we’d had more person elements:

16> Y = 
"<book>
  <person>
    <first>Kiran</first>
    <last>Pai</last>
    <age>22</age>
  </person>
  <person>
    <first>Bill</first>
    <last>Gates</last>
    <age>46</age>
  </person>
  <person>
    <first>Steve</first>
    <last>Jobs</last>
    <age>40</age>
  </person>
</book>".

Well, no problems at all.

17> xmerl_scan:string(Y, [{acc_fun, AccFun2}, {hook_fun, HookFun2}]).
{{book,[{person,[{first,"Kiran"},{last,"Pai"},{age,"22"}]},
        {person,[{first,"Bill"},{last,"Gates"},{age,"46"}]},
        {person,[{first,"Steve"},{last,"Jobs"},{age,"40"}]}]},
 []}

Here’s a snippet from a plant catalogue I picked up somewhere on the Internet.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Edited by XMLSpy -->
<CATALOG>
	<PLANT>
		<COMMON>Bloodroot</COMMON>
		<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
		<ZONE>4</ZONE>
		<LIGHT>Mostly Shady</LIGHT>
		<PRICE>$2.44</PRICE>
		<AVAILABILITY>031599</AVAILABILITY>
	</PLANT>
	<PLANT>
		<COMMON>Columbine</COMMON>
		<BOTANICAL>Aquilegia canadensis</BOTANICAL>
		<ZONE>3</ZONE>
		<LIGHT>Mostly Shady</LIGHT>
		<PRICE>$9.37</PRICE>
		<AVAILABILITY>030699</AVAILABILITY>
	</PLANT>
</CATALOG>

If we store this in a file, say plant_catalog.xml (the full catalogue contains a few more PLANT entries than the two shown here), we can use the xmerl_scan:file/2 function to parse it.

17> xmerl_scan:file("plant_catalog.xml", [{acc_fun, AccFun2}, {hook_fun, HookFun2}]). 
{{'CATALOG',[{'PLANT',[{'COMMON',"Bloodroot"},
                       {'BOTANICAL',"Sanguinaria canadensis"},
                       {'ZONE',"4"},
                       {'LIGHT',"Mostly Shady"},
                       {'PRICE',"$2.44"},
                       {'AVAILABILITY',"031599"}]},
             {'PLANT',[{'COMMON',"Columbine"},
                       {'BOTANICAL',"Aquilegia canadensis"},
                       {'ZONE',"3"},
                       {'LIGHT',"Mostly Shady"},
                       {'PRICE',"$9.37"},
                       {'AVAILABILITY',"030699"}]},
             {'PLANT',[{'COMMON',"Marsh Marigold"},
                       {'BOTANICAL',"Caltha palustris"},
                       {'ZONE',"4"},
                       {'LIGHT',"Mostly Sunny"},
                       {'PRICE',"$6.81"},
                       {'AVAILABILITY',"051799"}]}]},
 []}

Lastly, we may want to do the reverse, i.e. generate an XML snippet from our tagged tuples. This is provided by the xmerl:export functions. The resulting deep list may be a bit scary at first, but flattening it leads to what we desired.

One caveat though: the function requires the form we generated at 13> above, with the text values still wrapped in lists.

18> TaggedData = {book,[{person,[{first,["Kiran"]},{last,["Pai"]},{age,["22"]}]}]}.
19> lists:flatten(xmerl:export_simple_content(TaggedData, xmerl_xml)). 
"<book><person><first>Kiran</first><last>Pai</last><age>22</age></person></book>"

Starting to play with jinterface

Introduction

Here’s the problem: we want to get an Erlang process to communicate with some code running on a Java Virtual Machine and vice versa.

Running example

We’ll use a running example to demonstrate how this can be done. It uses a “counters server” running on an Erlang node. The server accepts the following asynchronous messages:

{FromPid, get, CounterName}
{FromPid, set, CounterName, Value}
{FromPid, incr, CounterName, Increment}

with the obvious behaviours.

If the counter named CounterName doesn’t exist, it is created with the default value of 0. The server replies with the tuple {ServerPid, CounterName, CounterValue}. The implementation is straightforward and hardly needs comment, except to note that the server stops upon receiving the message stop.

-module(counters).
-export([start/0, init/0]).
start() ->
  register(counters, spawn(?MODULE, init, [])).
init() ->
  loop(ets:new(counters, [])).
  %% note that we need to create the table here, in the spawned process
loop(Counters) ->
  receive
    {From, get, Counter} ->
      Value = case ets:lookup(Counters, Counter) of
        [] -> 0;
        [{Counter,V}] -> V
      end,
      From ! {self(), Counter, Value},
      loop(Counters);
    {From, set, Counter, Value} ->
      ets:insert(Counters, {Counter, Value}),
      From ! {self(), Counter, Value},
      loop(Counters);
    {From, incr, Counter, Increment} ->
      Value = case ets:lookup(Counters, Counter) of
        [] -> 0;
        [{Counter,V}] -> V
      end,
      ets:insert(Counters, {Counter, Value+Increment}),
      From ! {self(), Counter, Value+Increment},
      loop(Counters);
    stop ->
      io:format("gotta stop~n", []),
      ok
  end.
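Before bringing Java into the picture, we can give the server a quick smoke test in a plain Erlang shell (your pids will differ, of course):

1> c(counters).
{ok,counters}
2> counters:start().
true
3> counters ! {self(), incr, my_counter, 5}.
{<0.81.0>,incr,my_counter,5}
4> flush().
Shell got {<0.84.0>,my_counter,5}
ok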

The Java code will be the client that can get, set or increment named counters.

jinterface

Since we can’t have the ERTS and the Java VM running in the same process, the two pieces of code are necessarily running in two different processes. In other words, the Java code and the Erlang code constitute an Erlang distributed system and the Erlang code must be run in an Erlang node, which is an ERTS that has been given a name.

For our running example, let’s start the node eNode that uses the cookie erljava.

> erl -sname eNode -setcookie erljava

And then the server counters in it

(eNode@localhost)1> c(counters).
{ok,counters}
(eNode@localhost)2> counters:start().
true

The Erlang node will see the Java code as “Erlang” code running on another “Erlang” node. The same is not true of the Java code, as Java has no native abstractions for nodes communicating via asynchronous messages. The jinterface library in Erlang serves that purpose. It provides “Java” abstractions for things such as nodes, processes, messages and the name registry, as well as for the “Erlang” data types. It also provides helpers to convert between the Erlang and Java data types.

A couple of other things to remember for the communication to work are epmd and cookies.

Erlang distributed systems rely on epmd, the Erlang Port Mapper Daemon, running. It is started automatically on any host where an Erlang node is started, so we only need to worry about it if the Java VM will run on a host with no Erlang node. One simple way of ensuring it is up and running is to start an Erlang node on the host and then stop it; the node will stop, but epmd will keep running. (Alternatively, run epmd -daemon directly.)

Erlang uses cookies to authorize communication between nodes, so the Java VM and the Erlang node must use the same cookie for them to be able to communicate.
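Incidentally, you can always check the cookie a running node uses from its shell, which helps when the two sides refuse to talk:

(eNode@localhost)3> erlang:get_cookie().
erljava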

Let’s now take a look at the abstractions jinterface provides in Java.

Node

An “Erlang” node is represented by an object of class OtpNode. It is created by providing a name and optionally the cookie and port number. The following code creates a node called jNode that uses the cookie erljava. The port is automatically assigned.

OtpNode javaNode = new OtpNode("jNode", "erljava");

To see if this node can communicate with a remote node, eNode, for our running example, we can use the ping method:

if (javaNode.ping("eNode", 10000))
  System.out.println("eNode is up");
else
  System.out.println("eNode is down");

Process, Message box

Within an Erlang node, Erlang processes, identified by pids, have message boxes where they receive messages from other processes. jinterface provides the abstraction OtpMbox to represent those message boxes. Each OtpMbox object has an associated OtpErlangPid object which acts like the pid of a real Erlang process. An OtpMbox thus represents an “Erlang” process.

A mailbox is created using the createMbox method of a node object.

OtpMbox jProcess = javaNode.createMbox();

Its pid can be obtained using the method self:

OtpErlangPid jPid = jProcess.self();

Much like the ERTS registry, we can register a name for it within the node, either explicitly with the method registerName or implicitly by passing a name when the message box is created:

jProcess.registerName("jProcess");
OtpMbox anotherJProcess = javaNode.createMbox("anotherJProcess");

Data types

All Erlang types are represented as objects that derive from the abstract type OtpErlangObject.

You can see the complete list of types in the Erlang documentation. Let’s just have a look at the data types we need for our running example.

Atom

Erlang atoms are mapped to OtpErlangAtom objects.

Integer

Integers are mapped to OtpErlangLong objects. We can get the value back as a Java int using the method intValue.

OtpErlangLong anErlangInt = new OtpErlangLong(42);
int aJavaInt = anErlangInt.intValue();

List

Lists are mapped to OtpErlangList objects. The length of a list is obtained with the method arity, and its string representation with the method toString. We can get an array of OtpErlangObject elements using the method elements, or a single element by (zero-based) index using the method elementAt.

OtpErlangList anEmptyErlangList = new OtpErlangList();
OtpErlangList anErlangString = new OtpErlangList("Erlang is kool!");
String kool = anErlangString.stringValue();
OtpErlangObject[] objs =
  {new OtpErlangList("one"), new OtpErlangList("two")};
OtpErlangList anErlangListOfLists = new OtpErlangList(objs);

String

Even though Erlang treats strings as lists of integers, jinterface provides the OtpErlangString abstraction.

Tuple

Erlang tuples are mapped to OtpErlangTuple objects. The interface is very similar to that of OtpErlangList.

Pid

A pid is mapped to an OtpErlangPid object.

Sending messages

Enough of that! Let’s see how we can send messages to another node. The OtpMbox interface includes a method send that serves this purpose.

In our running example, we need to send messages to the process named counters on the node named eNode. The message must be a tuple and its first element must be the pid of the sending process.

Thus, to get the value of counter “my_counter”, we’ll use:

OtpErlangObject get_atom = new OtpErlangAtom("get");
OtpErlangObject counter_name = new OtpErlangList("my_counter");
OtpErlangObject[] get_msg = {jPid, get_atom, counter_name};
jProcess.send("counters", "eNode", new OtpErlangTuple(get_msg));

To set its value to 1234, we’ll use:

OtpErlangObject aValue = new OtpErlangLong(1234);
OtpErlangObject[] set_msg =
  {jPid, new OtpErlangAtom("set"), counter_name, aValue};
jProcess.send("counters", "eNode", new OtpErlangTuple(set_msg));

Finally, to increment its value by 10, for example, we’d use:

aValue = new OtpErlangLong(10);
OtpErlangObject[] incr_msg =
  {jPid, new OtpErlangAtom("incr"), counter_name, aValue};
jProcess.send("counters", "eNode", new OtpErlangTuple(incr_msg));

Please note that we did not have to explicitly connect the “Java” node to the counters server, as we typically would (with net_adm:ping/1, for instance) when setting up a regular Erlang distributed system. The jinterface library sets up the connection automatically before the first send operation.

Receiving messages

We need a way to receive the messages sent by the Erlang node. We can do it with the receive method of the message box. It will block until a message arrives, which is then delivered as an OtpErlangObject.

OtpErlangObject response = jProcess.receive();

In our example, after sending the “get” message, we expect to receive a tuple whose first element is the pid of the counters server process, the second the name of the counter, “my_counter” and the third the counter’s value, 0, since it has never been used.

if (response instanceof OtpErlangTuple) {
  String remote_counter_name =
    ((OtpErlangString)((OtpErlangTuple) response).elementAt(1))
    .stringValue();
  int remote_counter_value =
    ((OtpErlangLong)((OtpErlangTuple) response).elementAt(2))
    .intValue();
  System.out.println(
    remote_counter_name + " = " +
    remote_counter_value);
}

Running the example client

There are a couple of things we still need to do before we can actually run the Java client.

First, we must import the jinterface library:


import com.ericsson.otp.erlang.*;

Then we must set the Java class path. One simple way is to define an environment variable called CLASSPATH to be the classpath. Assuming the jinterface library is installed in /usr/local/otp/lib/jinterface-1.5.8/java_src, the classpath would be something like .:/usr/local/otp/lib/jinterface-1.5.8/java_src.
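On a Unix-like shell, that would be something along these lines (adjust the path to match your installed jinterface version):

> export CLASSPATH=.:/usr/local/otp/lib/jinterface-1.5.8/java_src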

Next we need to put the code in the main method of some Java class.

Assuming our Java class is called CountersClient, the code will go in a file named CountersClient.java and will look something like:

import com.ericsson.otp.erlang.*;
public class CountersClient
{
  public static void main(String[] args) throws Exception
  {
    System.setProperty("OtpConnection.trace", "0");
    OtpNode javaNode = new OtpNode("jNode", "erljava");
    if (javaNode.ping("eNode", 10000)) {
      System.out.println("eNode is up");
      OtpMbox jProcess = javaNode.createMbox();
      OtpErlangPid jPid = jProcess.self();
      jProcess.registerName("jProcess");

      // get value of my_counter
      OtpErlangObject get_atom = new OtpErlangAtom("get");
      OtpErlangObject counter_name = new OtpErlangList("my_counter");
      OtpErlangObject[] get_msg = {jPid, get_atom, counter_name};
      jProcess.send("counters", "eNode", new OtpErlangTuple(get_msg));
      OtpErlangObject response = jProcess.receive();
      if (response instanceof OtpErlangTuple) {
        String remote_counter_name =
          ((OtpErlangString)((OtpErlangTuple)response).elementAt(1))
          .stringValue();
        int remote_counter_value =
          ((OtpErlangLong)((OtpErlangTuple)response).elementAt(2))
          .intValue();
        System.out.println(
          remote_counter_name + " = " +
          remote_counter_value);
      }

      // set my_counter to 1234
      OtpErlangObject aValue = new OtpErlangLong(1234);
      OtpErlangObject[] set_msg =
        {jPid, new OtpErlangAtom("set"), counter_name, aValue};
      jProcess.send("counters", "eNode", new OtpErlangTuple(set_msg));
      response = jProcess.receive();
      if (response instanceof OtpErlangTuple) {
        String remote_counter_name =
          ((OtpErlangTuple) response).elementAt(1)
          .toString();
        String remote_counter_value =
          ((OtpErlangTuple) response).elementAt(2)
          .toString();
        System.out.println(
          remote_counter_name + " = " +
          remote_counter_value);
      }

      // increment my_counter by 10
      aValue = new OtpErlangLong(10);
      OtpErlangObject[] incr_msg =
        {jPid, new OtpErlangAtom("incr"), counter_name, aValue};
      jProcess.send("counters", "eNode", new OtpErlangTuple(incr_msg));
      response = jProcess.receive();
      if (response instanceof OtpErlangTuple) {
        String remote_counter_name =
          ((OtpErlangTuple) response).elementAt(1)
          .toString();
        String remote_counter_value =
          ((OtpErlangTuple) response).elementAt(2)
          .toString();
        System.out.println(
          remote_counter_name + " = " +
          remote_counter_value);
      }
    }
    else {
      System.out.println("eNode is down");
    }
  }
}
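Assuming the eNode shell from before is still up and running the counters server, compiling and launching the client should then look something like this (the first counter read should print 0, since my_counter has never been used; the set and incr replies are printed in a similar fashion):

> javac CountersClient.java
> java CountersClient
eNode is up
my_counter = 0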

Tracing

One little big feature you may want to know about is tracing. You can set the Java system property OtpConnection.trace, which defaults to 0 (no tracing), to a number between 1 and 4 for increasing levels of tracing of the messages exchanged between the nodes.

System.setProperty("OtpConnection.trace", "4");

Note how both the property and its value are specified as strings in Java.

Starting to play with leex and yecc

Introduction

I’ve always been curious about domain specific languages (DSLs), but at the same time I would tell myself they weren’t much use, since you can obtain what you want in your programming language directly. Besides, you have to translate the DSL into the programming language of choice anyway, so what’s the big deal?

Perhaps my reluctance was due to my not admitting I didn’t know how to interpret something written in a DSL. Perhaps things like leex and yecc would help, but they all seemed so scary. Looking around for tutorials, I found just one, and though it did help, I would have liked something more. So I plucked up the courage to face these beasts, and in the end it turned out to be worth the effort.

This is the problem I tackled: how to parse a string containing a date and a time in quite a free format into a datetime tuple of the kind Erlang uses in its libraries.

My goal was to produce {{2014,1,15},{16,43,0}} parsing any of the following strings:


- 15/1/2014 16:43

- 15-jan-2014 4:43 pm

- 16:43 P.M. Jan 15, 2014

- date: 15 - 01 - 2014, time: 16 : 43

and so on.

In Erlang terms, I wanted a function parse/1 that would take any of the strings above and return the datetime tuple.

I did that in the end, but to see what leex and yecc do and how, it is enough to concentrate on just the time bit. So I’ll stick to the problem of parsing a string containing a time into the tuple {HH,MM}, with HH the hour in 24-hour format and MM the minutes.

leex will analyze our string and break it into tokens. yecc will then use the tokens to decide what time is represented in the string.

leex

We start with leex. No theory here. There are plenty of resources on the web that explain what a “lexer” does.

It will break our string into tokens in a format yecc wants. Each token will say what type of input we are dealing with.

16:43 should give us three tokens of the categories integer, time_separator, integer. I could have called the time_separator category something else, like colon.

4:43 pm should give us four tokens of the categories integer, time_separator, integer, meridian_specifier.

By meridian_specifier I mean something that specifies that the time is ante or post meridian.

time: 16 : 43 P.M. should give us tokens of the categories time_separator, integer, time_separator, integer, meridian_specifier.

Here the string “time” should be ignored. And it will be as we will see shortly.

Incidentally, though I didn’t say it, whitespace is ignored as well. This is no magic. As we will see, we have to deal with it explicitly.

That said, how does leex go about it? Well, it has to be instructed. And the instructions go in a file with the extension .xrl.

This file has three sections: definitions, rules and code. It does allow Erlang comment lines. Here is the structure:


%% leex file structure

Definitions.

Rules.

Erlang code.

We’ll start in the middle. With the rules. They are the heart of it all.

A rule is made up of two parts: a regular expression (à la Erlang) and Erlang code. The code is invoked when the input string matches the regular expression, and it generally returns a token tuple of the form {token, Token}. We’ll talk about Token shortly. The regular expression and the code are separated by whitespace, a colon and more whitespace. The code can call functions implemented in the third section, Erlang code.

The code may also return {end_token, Token} when the returned token is known to be the last token, skip_token, when the token in question is to be ignored, or an error in the form {error, ErrorDescription}.

Back to the Token. This is a tuple with generally three fields: {Category, Location, Content}.

Category tells us something about the nature of the token. Whether it is a number, a string, etc.

Location specifies where in the input we found the token. This is probably used for debugging purposes, but is required by yecc. The Erlang code can assume there is a variable TokenLine which is bound to the line number.

Content is what we would like the parser to believe was input as a token. It may be what was actually present in the parsed string, but it could be something we decide. The Erlang code can assume the availability of variables TokenLen and TokenChars. The former is the length of the matched substring. The latter is the matched substring.

There are times when you will want the Category to be the same as the Content. In such a situation your code can also return just {Category, Location}.

With all that in mind, let’s start writing our rules.

We have an integer when we have a sequence of one or more digits. So a possible rule for this is:


[0-9]+ : {token, {integer, TokenLine, list_to_integer(TokenChars)}}.

The regular expression is simple. It just matches on a sequence of one or more digits. The category is integer, location is given by TokenLine and the content is the integer we get out of converting the substring bound to TokenChars into an integer.

The time_separator is even simpler.


: : {token, {time_separator, TokenLine}}.

We omit the content as it doesn’t serve any real purpose.

Next, let’s deal with the meridian_specifier. We match on all allowed possibilities separately for am and pm.


am|AM|a\.m\.|A\.M\. : {token, {meridian_specifier, TokenLine, am}}.

pm|PM|p\.m\.|P\.M\. : {token, {meridian_specifier, TokenLine, pm}}.

So far so good. Now we need rules to ignore whitespace and other alphabetical substrings.


[\s\t]+ : skip_token.

[a-zA-Z]+ : skip_token.

And there you have it. Just for “fun”, we may raise an error if anything else is present in the parsed string. Note that the catch-all here is a bare dot, which matches any single character; inside a character class, as in [.], a dot would only match a literal dot.


. : {error, syntax}.

While we are at it, let’s say a word about the definitions section. It allows us to define macros which we can use in specifying the regular expressions for the rules.

The definitions take the form Macro = Value. No dot at the end here!

We could define L to be a letter of the alphabet in the definitions section:


L = [a-zA-Z]

And then use that in specifying the regular expression for the token category of alphabetic characters to ignore. To use the defined macro, we need to enclose it in braces, otherwise leex would think it is a string.


{L}+ : skip_token.
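Putting the pieces together, the complete time.xrl might look like this (a sketch assembling the rules above; the meridian_specifier rules must come before the {L}+ skip rule, since leex breaks ties between equal-length matches in favour of the rule listed first):

%% time.xrl
Definitions.

L = [a-zA-Z]

Rules.

[0-9]+ : {token, {integer, TokenLine, list_to_integer(TokenChars)}}.
: : {token, {time_separator, TokenLine}}.
am|AM|a\.m\.|A\.M\. : {token, {meridian_specifier, TokenLine, am}}.
pm|PM|p\.m\.|P\.M\. : {token, {meridian_specifier, TokenLine, pm}}.
[\s\t]+ : skip_token.
{L}+ : skip_token.
. : {error, syntax}.

Erlang code.
%% nothing needed here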

Now that we have the instructions ready for leex, how do we proceed?

First thing, we let leex read these instructions with the function file/1 or file/2. Let’s assume our instructions are in the file time.xrl.


1> leex:file(time).
{ok,"./time.erl"}

This says leex has generated the file time.erl for us. It contains the code needed to parse our strings.

Let’s compile it and try out its string/1 function on a sample string containing timestamps.


2> c(time).
{ok,time}
3> time:string("time: 10:23 A.M.").
{ok,[{time_separator,1},
     {integer,1,10},
     {time_separator,1},
     {integer,1,23},
     {meridian_specifier,1,am}],
    1}

Voilà! Now let’s see what we can do with this.

yecc

yecc uses the tokenized description of the string to infer the actual time represented. To do this, it has to be instructed in the semantics of a valid time representation. We provide these instructions to yecc in a .yrl file. Let’s call the file time_parser.yrl.

One of the things we must tell yecc in these instructions is what token categories it will have to interpret. These are the token categories our leex-generated lexer produces. They all go under the name of Terminals.


Terminals integer time_separator meridian_specifier.

Then there are things yecc will infer from these terminals. These are called Nonterminals. In our simple case, we will infer time. If we had used the complete problem of a datetime, we would also have inferred a date and a datetime.


Nonterminals time.

Next, yecc needs to know what we are eventually trying to infer. In our case it is time. In the complete problem it would have been datetime. yecc is instructed about this with the keyword Rootsymbol.


Rootsymbol time.

However, these instructions don’t go into the instruction file in that order. The correct order is Nonterminals, then Terminals and then the Rootsymbol.

In a more complex scenario, we would also provide instructions on end symbols and operator precedence, but we have a simple example, so we won’t go into that.

Now we can actually write out the semantic rules with which to infer the non-terminals, in particular the rootsymbol, as that is the goal of the instructions. Incidentally, the semantic rules are called grammar rules.

A rule says how to infer a non-terminal from a sequence of terminals and non-terminals. Erlang code may optionally be associated with the rule, and this code may make use of functions in the last section of the instruction file, the Erlang code section.

The general form of the rule is then:


Nonterminal -> sequence of terminals : associated Erlang code.

For our example, I can think of two possible constructions. One with and one without the meridian specifier.


time -> integer time_separator integer.

time -> integer time_separator integer meridian_specifier.

The first integer is the hour and the second the minutes.

But there are also error cases. What if the integer that we suppose to be the hour is more than 23, or the minutes more than 59? What if the hour is more than 12 but the meridian specifier says it is ante-meridian? Let’s assume that in such cases we emit the atom notime.

To actually get the parser to emit something, we associate some Erlang code with each of the rules. We will put the code in the section Erlang code and call that code from the rule. There are several things here that seem alien, but I’ll explain it all shortly.


time -> integer time_separator integer :
    hm('$1', '$3').
time -> integer time_separator integer meridian_specifier :
    hm('$1', '$3', '$4').

Erlang code.

hm({integer, _, Hour}, {integer, _, Min})
  when Hour >= 0, Hour =< 23, Min >= 0, Min =< 59 ->
    {Hour, Min};
hm(_, _) ->
    notime.

hm(Hour, Min, {meridian_specifier, _, am}) ->
    case hm(Hour, Min) of
      {H, M} when H =< 11 -> {H, M};
      _ -> notime
    end;
hm(Hour, Min, {meridian_specifier, _, pm}) ->
    case hm(Hour, Min) of
      {H, M} when H >= 12 -> {H, M};
      {H, M} -> {H + 12, M};
      notime -> notime
    end.

We have defined two functions in the Erlang code section, hm/2 and hm/3. They take the presumed hours and minutes as the first two arguments. The third argument of hm/3 is the meridian specifier. The code is straightforward and I assume it needs no explanation; note only that the pm case passes a notime from hm/2 straight through rather than crashing.

Let’s look at how it is invoked though.

yecc provides the atoms ‘$1’, ‘$2’, etc., that can be used in the Erlang code associated with a semantic or grammar rule. These atoms represent the actual token associated with each terminal in the rule. Thus ‘$1’ is the token associated with the first terminal, ‘$2’ the one associated with the second terminal and so on. The token, as we remember from the leex part, looks like: {Category, Location, Content}. This is why the implementation of hm/2 and hm/3 assumes the arguments to be of that form. And this is why we had leex produce the tokens in that format in the first place!
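For reference, here is the whole time_parser.yrl, gathering the header declarations and the rules above into one file in the required order:

%% time_parser.yrl
Nonterminals time.
Terminals integer time_separator meridian_specifier.
Rootsymbol time.

time -> integer time_separator integer :
    hm('$1', '$3').
time -> integer time_separator integer meridian_specifier :
    hm('$1', '$3', '$4').

Erlang code.

hm({integer, _, Hour}, {integer, _, Min})
  when Hour >= 0, Hour =< 23, Min >= 0, Min =< 59 ->
    {Hour, Min};
hm(_, _) ->
    notime.

hm(Hour, Min, {meridian_specifier, _, am}) ->
    case hm(Hour, Min) of
      {H, M} when H =< 11 -> {H, M};
      _ -> notime
    end;
hm(Hour, Min, {meridian_specifier, _, pm}) ->
    case hm(Hour, Min) of
      {H, M} when H >= 12 -> {H, M};
      {H, M} -> {H + 12, M};
      notime -> notime
    end.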

On with generating the parser!

Now that we have the instructions ready, we can tell yecc to generate a parser.


1> yecc:file(time_parser).
{ok,"time_parser.erl"}
2> c(time_parser).
{ok,time_parser}

yecc generates the module time_parser. It contains the function parse/1, which takes a list of tokens as input and parses them. To get it all to work, then, we first generate the tokens using the leex-generated lexer and then parse those tokens with the yecc-generated parser.


3> {ok, Tokens, _} = time:string("the time is 10:37 am").
{ok,[{integer,1,10},
     {time_separator,1},
     {integer,1,37},
     {meridian_specifier,1,am}],
    1}
4> time_parser:parse(Tokens).
{ok,{10,37}}
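Finally, to wrap it all up into the parse/1 function we set out to write, here is a minimal sketch (the module name time_of_day is mine; it assumes the generated time and time_parser modules are compiled and on the code path):

-module(time_of_day).
-export([parse/1]).

%% Parse a free-format time string into {HH, MM} (24-hour clock).
%% Returns notime for invalid times and {error, Reason} for garbage input.
parse(String) ->
    case time:string(String) of
        {ok, Tokens, _EndLine} ->
            case time_parser:parse(Tokens) of
                {ok, Time} -> Time;
                {error, _} = Error -> Error
            end;
        {error, ErrorInfo, _EndLine} ->
            {error, ErrorInfo}
    end.

So that time_of_day:parse("4:43 pm") returns {16,43}.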

Starting to play with Regular Expressions

Introduction

I recently decided to tackle a problem that I thought simple to solve in Erlang. But it turned out to be harder than I’d initially thought due to the not-so-straightforward (for me, of course) regular expressions in Erlang.

The problem was to shift the subtitles of a film by a certain amount. 12 seconds in my case.

The subtitles file’s format is srt. It is a text file made up of several records. Each record is made up of a record number, the text to display and the time period in which the text will be displayed.


<Record-Number>\r\n
<Start-Time> --> <Stop-Time>\r\n
<Text>\r\n\r\n

<Record-Number> is an integer and the records are numbered sequentially, starting at 1.

<Start-Time> and <Stop-Time> are in HH:MM:SS,sss format where HH is hours, MM is minutes, SS is seconds and sss is milliseconds.

\r\n are the carriage return and newline characters.

The text may be multi-line, each line separated from the other by a \r\n.

For example, a two record file may look like:


1\r\n

00:01:20,120 --> 00:01:33,234\r\n

- Good day, Mr. John!\r\n

\r\n

2\r\n

00:01:57,121 --> 00:02:05,566\r\n

- Good day to you, my man\r\n

\r\n

Parsing srt files

Timeshifting the file requires replacing the start and stop times of each record with values that have been shifted by the required amount of time. For simplicity, I’ll consider time shifts of whole seconds, leaving the milliseconds intact.

I read in the whole file as a binary string using file:read_file/1:


1> {ok, T} = file:read_file(Subtitles_File_Path).

{ok, <<"1\r\n00:01:20,120 --> 00:01:33,234\r\n
- Good day, Mr. John!\r\n\r\n
2\r\n00:01:57,121 --> 00:02:05,566\r\n
- Good day to you, my man\r\n\r\n">>}

To parse the records, I try using regular expressions. Each record should match the following sequence of fields:

  • a numeric field of at least one digit => record number
  • the two characters ‘\r’, ‘\n’
  • two digits => start hours
  • the character ‘:’
  • two digits => start minutes
  • the character ‘:’
  • two digits => start seconds
  • the character ‘,’
  • three digits => start milliseconds
  • the characters ‘ ‘, ‘-‘, ‘-‘, ‘>’, ‘ ‘
  • two digits => end hours
  • the character ‘:’
  • two digits => end minutes
  • the character ‘:’
  • two digits => end seconds
  • the character ‘,’
  • three digits => end milliseconds
  • the characters ‘\r’, ‘\n’
  • any number of characters, including the characters ‘\r’ and ‘\n’, but ending with the characters ‘\r’, ‘\n’, ‘\r’, ‘\n’
  • zero or more characters representing zero or more other records

Regular expression primer

Now, a regular expression is a specification of a pattern that we’re trying to discover in a given string. If the pattern is there, we have a match. In Erlang, the module re provides the functionality for regular expression processing. In this primer, I’ll only use the re:run/3 function:


> re:run(String, RE, Options).

It will determine if the patterns specified in the regular expression RE match the string String. Options is a list of directives to the function that establish how some specific patterns are interpreted and how any match is communicated back to the caller of the function.

If there is no match, the function returns the atom nomatch. If there is a match, it returns a two-element tuple. The first element is the atom match and the second a list of match specifications.

Let’s first look at how to specify the patterns.

Ordinary characters match the corresponding characters. Here are some examples.


> re:run("Hello", "Hello").

will match, but


> re:run("Hello", "hello").

will not, because the character ‘H’ doesn’t match the character ‘h’.

Control characters are specified with their corresponding escape sequences: newline is \n, carriage return is \r, and so on.
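For example, the \n inside both Erlang strings below is a real newline character, and a literal newline in the pattern matches a newline in the subject:

> re:run("line1\nline2", "line1\nline2").
{match,[{0,11}]}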

What if we are interested in just a substring? What if we don’t care what a character is?


> re:run("Hello", ".ello").

will match. As will


> re:run("Cello", ".ello").

The dot in the regular expression matches any character. So if you really want to be sure you matched a dot and not just any character, use the escaped dot:


> re:run("He.llo", ".e\.llo").

The first dot matches the character ‘H’, the second, escaped version, matches the dot between the characters ‘e’ and ‘l’.

We can also specify quantifiers after each specifier. A + will match one or more of the specified patterns. A * will match zero or more of them.


> re:run("Hello", ".*").

will match.


> re:run("Hello", "Hel+o").

will match as well; the + after the l means one or more ls.

In fact, whereas


> re:run("Hello", "Hello!*").

will match (there are no !s in the string being parsed and we specified zero or more !s),


> re:run("Hello", "Hello!+").

won’t (there are no !s in the string being parsed and we specified one or more !s).

Again, just as with a dot, if you actually want to match the character ‘+’, use the escaped version \+.

Here’s some more. We use \d to specify a digit:


> re:run("4", "\d").

should match as 4 is a digit, but


> re:run("a", "\d").

shouldn’t, because ‘a’ is not a digit.

Another thing which is good to know, in order to follow the rest of the discussion, is that you can specify subpatterns within the regular expression by using parentheses.


> re:run("Her name is Jane", "Her name is (.+)").

will match and “Jane” will match the first subpattern which is also the only subpattern specified in this case.

One last thing. The re module will happily accept binary strings instead of ordinary strings/lists. So,


> re:run(<<"Her name is Jane">>, <<"Her name is (.+)">>).

will work just the same.

Parsing the srt file with regular expressions

Armed with this knowledge, let’s have a go at parsing the srt file.

My first attempt at specifying a regular expression to match the srt records is:


2> RE = <<"
\d+\r\n
\d\d:\d\d:\d\d,\d\d\d
 -->
 \d\d:\d\d:\d\d,\d\d\d\r\n
.*
\r\n\r\n
.*">>.

but as we’ll see, it doesn’t match:


3> re:run(T, RE).

nomatch

(I’m assuming T is bound to the contents of the entire file as a binary string)

Hmmm… what’s wrong? I can’t figure it out, so I decide to go step by step, trying to hunt down the problem.


4> re:run(T, <<".*">>).

{match,[{0,2}]}

We have a match, as expected. But what do those numbers mean?

I’ll explain it later, but for now just take it for granted that it says the pattern matched starting at index zero (the first character) and that the match is two characters long. Looking at the parsed string, that means it matched “1\r”.

I was expecting it to match the entire string. Instead, it just matches the first two characters. Why?

Going through the documentation, it turns out that the dot specifier will not match newline characters unless, of course, you advise the function otherwise. That’s what the dotall option is for.


5> re:run(T, <<".*">>, [dotall]).

{match,[{0,136}]}

Hooray! It worked. Let’s see if it matched the complete string.


6> byte_size(T).

136

Yes, it did.

Next, I’m going to match just the first record number.


7>  re:run(T, <<"\d+.*">>, [dotall]).

nomatch

Now wait a minute! What happened here? Why didn’t it see the first digit?

Going through the documentation, paying attention to the warning box up there (!), and remembering having had this problem before and that it had been answered on the Erlang mailing list (!), I reckon the backslash has to be escaped a second time. Don’t ask me why; it has something to do with one backslash being consumed when the string literal is parsed and the second being required by the re module.


8>  re:run(T, <<"\\d+.*">>, [dotall]).

{match,[{0,136}]}

Aha! Let’s see if we can actually capture the record number:


9>  re:run(T, <<"(\\d+).*">>, [dotall]).

{match,[{0,136},{0,1}]}

There’s a match, but what are all these tuples? Again, reading through the entire documentation of the re module (ugh!), at the place where it explains what the capture option does (why I have to read this to know what happens if I don’t use the option is anybody’s guess), there is finally an explanation of what happens if you don’t use it.

“By default, re:run/3 captures all of the matching part of the substring as well as all capturing subpatterns (all of the pattern is automatically captured). The default return type is (zero-based) indexes of the captured parts of the string, given as {Offset,Length} pairs”.

Alright, technically that is correct, but I bet two euros you won’t understand what on earth it all means the first time around. It means that the first tuple is the entire matching part, and the other tuples are the matched subpatterns. In our case, the first tuple says the entire string was matched; the second says that the first subpattern (the one enclosed in parentheses) starts at the beginning of the string (index 0) and is one byte long.

Let’s modify our first attempt with what we’ve learnt so far to see if we are now ok.


10> re:run(T,
<<"(\\d+)\\r\\n
(\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)
 -->
 (\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)\\r\\n
.*">>, [dotall]).

{match,[{0,136},{0,1},{3,12},{20,12}]}

It would seem we are. Here, I have tried subpatterns for the start and end times as well.

Let’s continue to see if we can interpret an entire record and can separate the rest for further (recursive) processing:


11> re:run(T,
<<"(\\d+)\\r\\n
(\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)
 -->
 (\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)\\r\\n
(.*)\\r\\n\\r\\n
(.*)">>, [dotall]).

{match,[{0,136},{0,1},{3,12},{20,12},{34,98},{136,0}]}

Hmm… Not quite. The first (.*) subpattern has gulped up everything up to the last \r\n\r\n sequence, leaving the last (.*) subpattern empty (starting at index 136 with length 0)!

Where’s the problem? Something about greedy matching from my perl days rings a bell. Obviously the pattern is greedily matching everything it can! There must be a way to tell the function not to do that.

So I look up the re module documentation again and go through the options to the function run/3. Nothing that resembles a directive to be greedy or not. So I search the page for the word greed and voilà! There is an ungreedy option for the compile function. Looking closer, compile options can be given to the run functions as well. Not very obvious, but that’s documentation in the Erlang world: correct, though not always obvious.

So, let’s give the ungreedy option a try:


12> re:run(T,
<<"(\\d+)\\r\\n
(\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)
 -->
(\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)\\r\\n
(.*)\\r\\n\\r\\n
(.*)">>, [dotall, ungreedy]).

{match,[{0,68},{0,1},{3,12},{20,12},{34,30},{68,0}]}

That’s better, but we still have a problem. The last subpattern didn’t match anything at all (see that the offset is 68, but the length is 0). Why? We have specified that the matching be ungreedy, but for us to match the rest of the string, the last subpattern needs to be greedy.

Now we know that by default, pattern matching is greedy, but if we use the option ungreedy, every pattern search is affected. There ought to be a way to specify greediness or ungreediness on a subpattern basis. Again, reading through the documentation, we see that we can use the character ‘?’ to specify the greediness of a subpattern when the ungreedy option is used.

Let’s try that then:


13> re:run(T,
<<"(\\d+)\\r\\n
(\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)
 -->
 (\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)\\r\\n
(.*)\\r\\n\\r\\n
(.*?)">>, [dotall, ungreedy]).

{match,[{0,136},{0,1},{3,12},{20,12},{34,30},{68,68}]}

Aha! That did it. We have matched the rest of the string. It starts at index 68 and is 68 bytes long.

So far so good. Now how do I read the time fields? I know the start time is the substring starting at (zero-based) index 3 and is 12 bytes long. But how do I extract it? It’s not a string, so I can’t use the string module in the stdlib to do it. Converting the entire binary to a list before running the regular expression would defeat the purpose of using binary strings in the first place. So what do we do?

Let’s go back to the documentation of the re module. Reading through it, we see that the option capture allows us to specify what the function should return.

First, we specify what we need returned: all of the patterns (the default), just the first pattern or all patterns but the first. Since we don’t care about the match on the entire string, let’s use the all_but_first option.

Next, we can specify whether we want the returned subpatterns as indexes (the default), as lists or as binaries. Let’s use binaries:


14> re:run(T,
<<"(\\d+)\\r\\n
(\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)
 -->
 (\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)\\r\\n
(.*)\\r\\n\\r\\n
(.*?)">>, [dotall, ungreedy, {capture, all_but_first, binary}]).

{match,[<<"1">>,<<"00:01:20,120">>,<<"00:01:33,234">>,
<<"\n\n- Good day, Mr. John!">>,
<<"\n\n2\r\n00:01:57,121 --> 00:02:05,566\r\n\n\n
- Good day to you, my man\r\n\r\n">>]}

Well, well, well! This is more than I had bargained for.

It’s a simple matter now to bind the parsed values to some variables with which we can work:


15> {match, [Rec_Number, Start_Time, End_Time, Text, Rest]} =
re:run(T,
<<"(\\d+)\\r\\n
(\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)
 -->
 (\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d)\\r\\n
(.*)\\r\\n\\r\\n
(.*?)">>, [dotall, ungreedy, {capture, all_but_first, binary}]).

We could have grouped the individual hour, minute, second, etc. fields as well, to get direct access to them.


15> {match, [Rec_Number,
Start_Hour, Start_Min, Start_Sec, Start_Millisec,
End_Hour, End_Min, End_Sec, End_Millisec,
Text, Rest]} =
re:run(T,
<<"(\\d+)\\r\\n
(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)
 -->
(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)\\r\\n
(.*)\\r\\n\\r\\n
(.*?)">>, [dotall, ungreedy, {capture, all_but_first, binary}]).

We could also have used ungreedy (.*) subpatterns to match the start and end times.


15> {match, [Rec_Number, Start_Time, End_Time, Text, Rest]} =
re:run(T,
<<"(\\d+)\\r\\n
(.*) --> (.*)\\r\\n
(.*)\\r\\n\\r\\n
(.*?)">>, [dotall, ungreedy, {capture, all_but_first, binary}]).

And so on.

Now we can extract the hours, minutes and seconds from the start time:


16> {match, [Hours, Minutes, Seconds]} =
re:run(Start_Time,
<<"(\\d\\d):(\\d\\d):(\\d\\d)">>,
[{capture, all_but_first, list}]).

{match,["00","01","20"]}

It’s quite easy to convert this into seconds and add the timeshift (12 seconds in my case; assume the variable Timeshift is bound to 12):


17> Shifted_Start_Time =
list_to_integer(Hours) * 3600 +
list_to_integer(Minutes) * 60 +
list_to_integer(Seconds) +
Timeshift.

92

We could have used the functions in the calendar module of the stdlib:


18> calendar:time_to_seconds(
{list_to_integer(Hours),
 list_to_integer(Minutes),
 list_to_integer(Seconds)}) + 12.

92

We certainly will be better off using the calendar module to convert the shifted seconds back to hours, minutes and seconds!


19> {SSH, SSM, SSS} = calendar:seconds_to_time(Shifted_Start_Time).

{0,1,32}

Once the times have been shifted, we can write the record to another file. We will have to remember to convert the various fields into binaries first.

We can recursively go through this procedure with the rest of the string (the one we bound to the variable Rest) until we get a nomatch, which hopefully will be when we are through with the text. The sketch below puts it all together.
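Here is a minimal sketch of how the whole procedure could look (the module and function names are mine; it assumes whole-second shifts and well-formed input, and writes the pattern on a single line instead of the multi-line form used in the shell above):

%% srt_shift.erl -- a sketch, not battle-tested
-module(srt_shift).
-export([shift_file/3]).

%% Read an srt file, shift all times by Timeshift (whole seconds),
%% and write the result to another file.
shift_file(InPath, OutPath, Timeshift) ->
    {ok, T} = file:read_file(InPath),
    file:write_file(OutPath, shift_records(T, Timeshift)).

shift_records(T, Timeshift) ->
    RE = <<"(\\d+)\\r\\n(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d) --> "
           "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)\\r\\n(.*)\\r\\n\\r\\n(.*?)">>,
    case re:run(T, RE, [dotall, ungreedy, {capture, all_but_first, binary}]) of
        {match, [N, SH, SM, SS, SMs, EH, EM, ES, EMs, Text, Rest]} ->
            [N, <<"\r\n">>,
             shift_time(SH, SM, SS, Timeshift), <<",">>, SMs, <<" --> ">>,
             shift_time(EH, EM, ES, Timeshift), <<",">>, EMs, <<"\r\n">>,
             Text, <<"\r\n\r\n">>
             | shift_records(Rest, Timeshift)];
        nomatch ->
            []   %% no more records
    end.

%% Shift one HH:MM:SS time, given as three binaries, and render it back.
shift_time(H, M, S, Timeshift) ->
    Seconds = calendar:time_to_seconds({binary_to_integer(H),
                                        binary_to_integer(M),
                                        binary_to_integer(S)}) + Timeshift,
    {SH, SM, SS} = calendar:seconds_to_time(Seconds),
    io_lib:format("~2..0w:~2..0w:~2..0w", [SH, SM, SS]).

A negative Timeshift would shift backwards, though the sketch makes no attempt to guard against times going negative.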

On effort estimates and productivity

From the erlang mailing list

Joe Armstrong

Once upon a very long time ago we did a project to compare the efficiency of Erlang to PLEX.

We implemented “the same things” (TM)  in Erlang and PLEX and counted total man hours

We did this for several different things.

Erlang was “better” by a factor of 3 to 25 (in total man hours) – the weighted average was a factor of 8.

They asked “what is the smart programmer effect”

We said “we don’t know”

We revised the figure 8 down to 3 to allow for “the smart programmer effect” – this was too high to be credible, so we revised it down to 1.6. (The factors 3 and 1.6 were just plucked out of the air with no justification.)

Experiments that show that Erlang is N times better than “something else” won’t be believed if N is too high.

The second point to remember is that you *never* implement exactly the same thing in two different languages (or very rarely) – the second time you do something you have presumably learnt from the mistakes made the first time you do something.

If you implement the same thing N times in the same language, each implementation should take less effort and code than the last time you did it. What can you learn from this?

The difference in programmer productivity can vary by a factor of 80 (really it’s infinity, because some programmers *never* get some code right, so the factor 80 discounts the totally failed efforts) – so given a productivity factor, you have to normalize it by a factor that depends upon the skill and experience of the programmer.

There are people who claim that they can make models estimating how long software projects take.

But even they say that such models have to be tuned, and are only applicable to projects which are broadly similar. After you’ve done almost the same thing half a dozen times it might be possible to estimate how long a similar project might take.

The problem is we don’t do similar things over and over again. Each new unsolved problem is precisely that,  a new unsolved problem.

Most time isn’t spent programming anyway –  programmer time is spent:

a) fixing broken stuff that should not be broken

b) trying to figure out what problem the customer actually wants solving

c) writing experimental code to test some idea

d) googling for some obscure fact that is needed to solve a) or b)

e) writing and testing production code

e) is actually pretty easy once a) – d) are fixed. But most measurements of productivity only measure lines of code in e) and man hours.

I’ve been in this game for many years now, and I have the impression that a) is taking a larger and larger percentage of my time. 30 years ago there was far less software, but what software there was usually worked without any problems – the code was a lot smaller and consequently easier to understand.

Again in the last 30 years programs have got hundreds to thousands of times larger (in terms of code lines) but programming languages haven’t got that much better and our brains have not gotten any smarter. So the gap between what we can build and what we can understand is growing rapidly.

Extrapolating a bit, I guess a) is going to increase – so in a few years we’ll have incredibly smart devices which almost work, and which nobody will be able to fix when they break, and programmers will spend 100% of their time fixing broken stuff that should not be broken.

And now I have to figure out why Firefox has suddenly stopped working – something is broken…

Cheers

/Joe

Mahesh Paolini-Subramanya

There are – at least – four orthogonal areas in which your software gets developed, each of which has different metrics for estimation, tracking, progress, etc. To throw some semantics at this:

1) Technical

When the specifics of the solution are clear, and it pretty much boils down to implementation. “I need to replace the valve on my Hot Tub” (With the same model, I’ve already done it once before, etc., etc.)

2) Engineering

When you need to solve the problem first, before implementing the solution. There _is_ a body of knowledge to draw upon. “I need to install my Hot Tub” (Hmmm. There is no water line going out there. How do I get one? What about the electrical circuit? Do I need architectural permission? etc.)

3) Science

You need to invent a new class of solutions for the problem at hand. “I need to install my Hot Tub in an underground bunker in the marshlands of Florida” (How do I build an underground bunker in the marsh? Maybe I can freeze the ground and pour concrete? How do I keep the concrete from sinking? Hmmm. Time to start running experiments?)

4) Art

You need to get the intangibles correct, viz., is it maintainable? Supportable? Documented? Elegant? “Will some future Significant Other like my paranoid underground-bunker hot-tub?”

Note that each of the above is _different_.

– It’s fairly easy to Manage / Maintain / Monitor “Technical” work.

– There is a reasonable body of knowledge that helps in doing the same for “Engineering” stuff.

– For Science, it’s all pretty clearly made up (the fusion reactor!).

– For “Art”, well, it really _is_ in the eye of the beholder.  (Good Documentation? For whom? What do you mean by “Good”? *I* understand it! And so does Jane!)

Which brings me back to the original point – much as we would like it to be that way, software pretty much never fits neatly into one of the buckets above – it is some combination of the four, with different parameters for {T, E, S, A}.

What’s worse, there is a time-variant aspect to this too – and the parameters are inter-related, e.g., different “Engineering” solutions have different “Technical” impacts. In short, your development process is actually f(T, E, S, A, Time).

All this being said, there is a time-honored Academic way of solving f(T, E, S, A, Time), which basically consists of wishing away the unknowns (or making unrealistic assumptions about them) and then spending an inordinate amount of time on the remaining parameters. With some appropriate tweaking, one can quite successfully make this match some “real-world” results, which can then be trumpeted widely. Any failures can be blamed on the actors (ho-ho. “Actors”. In an Erlang post. I crack myself up.) who were clearly not qualified…

This may very well be the best option?

Cheers

Mahesh Paolini-Subramanya

On Contracts and Crashes

Joe delivers this entertaining talk on some of the really basic things to keep in mind when developing our systems.

Contracts are a specification of the “protocol” between two “black boxes” which could be Erlang processes.

When one or the other is not an Erlang process, a MiddleMan (MM) takes care of “translating” to a common language – the Erlang protocol.

Whereas it may not be a very good idea to crash in the face of a contract violation when the whole system is just one big sequential blob, it is indeed more than reasonable to do so when we have a system made up of millions of processes.

Thanks to Erlang Solutions for uploading the talk.