Friday 6 April 2007

The Atomiser, Part III

Right, so we have a skeleton parse transformation module that emits a message and returns the unchanged Abstract Syntax Tree. What do we do now?


There are a few requirements listed in Part II; we might as well start from the top:

* Ability to include a list of valid atoms into a source file.

Implicit in this requirement is that the Atomiser will actually be able to read this list and do something with it. Let's work on that.


To begin with we want to embed our list of valid atoms in our source code. Unfortunately we cannot use just any old syntax to do this: the syntax we choose must be parseable by the existing Erlang compiler so that it can build us an Abstract Syntax Tree to play with.

Luckily we can use a module attribute to specify our list of valid atoms. For no particular reason I will name this attribute 'atoms':

-atoms([atom1, atom2, atom2, atom4]).
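To see what the compiler makes of such an attribute, here is a small sketch that scans and parses one from a string using the standard erl_scan and erl_parse modules. The module name atoms_attr_demo and the function parse_attr/0 are made up for illustration and are not part of the Atomiser.

```erlang
-module(atoms_attr_demo).
-export([parse_attr/0]).

%% Tokenise and parse a single 'atoms' attribute, returning its AST node.
%% The node is expected to look like {attribute, Line, atoms, ListOfAtoms}.
parse_attr() ->
    {ok, Tokens, _EndLocation} = erl_scan:string("-atoms([atom1, atom2]).\n"),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

Calling atoms_attr_demo:parse_attr() in the shell should hand back a four-element tuple tagged 'attribute', with the attribute name and the plain list of atoms as the last two elements. That is the shape the Atomiser needs to look for.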

The Atomiser will do a quick pass through the top level of the Abstract Syntax Tree to pull in all the 'atoms' module attributes, storing their contents in a dictionary of valid atoms. The atom names themselves will be the keys of this dictionary; the line number of the attribute that specified the atom will be the value stored against each atom key. (We will use these line numbers for reporting later, if a specified atom is unused.)

Here is a function to scan an AST and print all the atoms attributes it finds:

atoms_find([]) ->
    ok;
atoms_find([{attribute,Line,atoms,AtomList}|ASTRest]) ->
    io:format("Found atom list on line ~B: ~p~n", [Line, AtomList]),
    atoms_find(ASTRest);
atoms_find([_Node|ASTRest]) ->
    atoms_find(ASTRest).
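One caveat about the function above: it treats the second element of the attribute tuple as a plain integer line number, and ~B insists on an integer. As far as I know, later OTP releases make that element an annotation which may also carry a column, so a hedged variant would convert it with erl_anno:line/1 first. The module name atoms_anno_demo and the function attr_lines/1 below are hypothetical, just for this sketch.

```erlang
-module(atoms_anno_demo).
-export([attr_lines/1]).

%% Return [{Line, AtomList}] for every 'atoms' attribute in the AST.
%% erl_anno:line/1 accepts a bare line number as well as a
%% {Line, Column} annotation, so both styles of AST are handled.
attr_lines(AST) ->
    [{erl_anno:line(Anno), AtomList}
     || {attribute, Anno, atoms, AtomList} <- AST].
```

For example, atoms_anno_demo:attr_lines([{attribute,1,module,m}, {attribute,{3,2},atoms,[a]}]) picks out only the atoms attribute and reduces its {3,2} annotation to line 3.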



And with a slight modification, instead of printing them out we can store the atoms we find in a dictionary:

atoms_from_ast(AST) ->
    atoms_from_ast(AST, dict:new()).

atoms_from_ast([], Atoms) ->
    Atoms;
atoms_from_ast([{attribute,Line,atoms,AtomList}|ASTRest], Atoms) ->
    atoms_from_ast(ASTRest, atoms_from_attribute(Line, AtomList, Atoms));
atoms_from_ast([_|ASTRest], Atoms) ->
    atoms_from_ast(ASTRest, Atoms).



Neat, huh?

Well, okay, I haven't yet added the code to extract the atoms from an atoms attribute and store them in the dictionary. It sounds a bit complicated...

atoms_from_attribute(Line, AtomList, Atoms) ->
    %% Store every atom in the list against the line of its attribute.
    AddAtom = fun(Atom, Dict) ->
        dict:store(Atom, Line, Dict)
    end,
    lists:foldl(AddAtom, Atoms, AtomList).


...well, maybe that is not too complicated after all.
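For anyone who has not met dict before, here is a minimal sketch of the same fold in isolation, followed by a couple of lookups. The module name atom_dict_demo, the function lookup_demo/0 and the line number 19 are all invented for the example.

```erlang
-module(atom_dict_demo).
-export([lookup_demo/0]).

%% Build a dictionary of atom -> line number with the same fold used by
%% atoms_from_attribute/3, then look up one known and one unknown atom.
lookup_demo() ->
    Line = 19,  % pretend the attribute sat on line 19
    AddAtom = fun(Atom, Dict) -> dict:store(Atom, Line, Dict) end,
    Atoms = lists:foldl(AddAtom, dict:new(), [atom1, atom2, atom4]),
    {dict:find(atom2, Atoms),               % {ok, 19}
     dict:find(atom3, Atoms),               % error: never declared
     lists:sort(dict:fetch_keys(Atoms))}.   % sorted, as dict order is unspecified
```

Note that dict:find/2 returns the bare atom error for a missing key rather than raising an exception, which makes it convenient to use in a case expression.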


Oh, wait! If I make the Atomiser report on atoms that have already been specified as valid, then I can totally make this look impressive:

atoms_from_attribute(Line, AtomList, Atoms) ->
    AddAtom = fun(Atom, Dict) ->
        case dict:find(Atom, Dict) of
            {ok, LineAlreadyDefined} ->
                %% Duplicate: report it and keep the first definition.
                io:format(
                    "Line ~B: Atom ~w already defined on line ~B.~n",
                    [Line, Atom, LineAlreadyDefined]),
                Dict;
            error ->
                dict:store(Atom, Line, Dict)
        end
    end,
    lists:foldl(AddAtom, Atoms, AtomList).


There we go. Now the Atomiser will let us know if we have accidentally specified a valid atom more than once.

We can try this out now. Here is the full listing of the Atomiser so far:


-module(atomiser).
-export([parse_transform/2]).
-compile({parse_transform, atomiser}). % Comment out for initial compile.

-atoms([atom1, atom2, atom2, atom4]).

parse_transform(AST, _Options) ->
    Atoms = atoms_from_ast(AST),
    io:format("Retrieved these valid atoms: ~p~n", [dict:fetch_keys(Atoms)]),
    AST.

atoms_from_ast(AST) ->
    atoms_from_ast(AST, dict:new()).

atoms_from_ast([], Atoms) ->
    Atoms;
atoms_from_ast([{attribute,Line,atoms,AtomList}|ASTRest], Atoms) ->
    atoms_from_ast(ASTRest, atoms_from_attribute(Line, AtomList, Atoms));
atoms_from_ast([_|ASTRest], Atoms) ->
    atoms_from_ast(ASTRest, Atoms).

atoms_from_attribute(Line, AtomList, Atoms) ->
    AddAtom = fun(Atom, Dict) ->
        case dict:find(Atom, Dict) of
            {ok, LineAlreadyDefined} ->
                %% Duplicate: report it and keep the first definition.
                io:format(
                    "Line ~B: Atom ~w already defined on line ~B.~n",
                    [Line, Atom, LineAlreadyDefined]),
                Dict;
            error ->
                dict:store(Atom, Line, Dict)
        end
    end,
    lists:foldl(AddAtom, Atoms, AtomList).



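The module has to be compiled once with the -compile line commented out, so that the transform exists before anything tries to use it. After that, restoring the line and compiling atomiser.erl again should give us the list of the three unique atoms specified, and a complaint about atom2 occurring twice: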

1> c(atomiser), l(atomiser).
Line 5: Atom atom2 already defined on line 5.
Retrieved these valid atoms: [atom1,atom2,atom4]
{module,atomiser}
2>
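If the compile-twice dance gets tiresome, one alternative (my assumption, not something the Atomiser itself does) is to get the AST straight from a source file with epp:parse_file/3 and poke at it from the shell. The module name atoms_epp_demo and the function atoms_forms/1 are hypothetical.

```erlang
-module(atoms_epp_demo).
-export([atoms_forms/1]).

%% Parse FileName into a list of forms (the same kind of AST a parse
%% transform receives) and keep only the 'atoms' attribute nodes.
%% The two empty lists are the include path and the predefined macros.
atoms_forms(FileName) ->
    {ok, Forms} = epp:parse_file(FileName, [], []),
    [Form || {attribute, _Anno, atoms, _AtomList} = Form <- Forms].
```

Running atoms_epp_demo:atoms_forms("atomiser.erl") against the listing above should return a single attribute node for the -atoms line, without involving the parse transform machinery at all.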



This thing had better do some real work soon... it is already past thirty lines of code!

Obligatory legal stuff

Unless otherwise noted, all code appearing on this blog is released into the public domain and provided "as-is", without any warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the author(s) be liable for any claim, damages, or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.