Sunday, 8 April 2007

The Atomiser, Part VII

As promised, here is the full listing of my current atomiser.erl file:

-module(atomiser).
-author("Philip Robinson").
-export([parse_transform/2]).
%-compile({parse_transform, atomiser}). % Uncomment after initial compile.

-atoms([atom, attribute, bin, bin_element, call, 'case', char]).
-atoms([clause, clauses, cons, eof, 'fun', function, generate]).
-atoms(['if', integer, lc, match, nil, op, 'receive', record]).
-atoms([record_field, remote, string, tuple, var]).
-atoms([atoms, error, found, ok]).

parse_transform(AST, _Options) ->
    atoms_unused_print(walk_ast(AST, dict:new())),
    AST.

atoms_from_attribute(Line, AtomList, Atoms) ->
    AddAtom = fun(Atom, Dict) ->
        case dict:find(Atom, Dict) of
            {ok, LineAlreadyDefined} ->
                io:format("Line ~B: Atom ~w already defined on line ~B.~n",
                    [Line, Atom, LineAlreadyDefined]),
                Dict;
            error -> dict:store(Atom, Line, Dict)
            end
        end,
    lists:foldl(AddAtom, Atoms, AtomList).

atom_check(Atom, Line, Atoms) ->
    case dict:find(Atom, Atoms) of
        {ok, found} -> Atoms;
        {ok, _LineDefinedOn} -> dict:store(Atom, found, Atoms);
        error ->
            io:format("Line ~B: Atom ~w unexpected.~n", [Line, Atom]),
            Atoms
        end.

atoms_unused_print(Atoms) ->
    Filter = fun({_Atom, FoundOrDefinedLine}) ->
        FoundOrDefinedLine =/= found
        end,
    PrintUnusedAtom = fun({Atom, Line}) ->
        io:format("Line ~B: Atom ~w unused.~n", [Line, Atom])
        end,
    lists:foreach(PrintUnusedAtom,
        lists:keysort(2, lists:filter(Filter, dict:to_list(Atoms)))).

-define(WALK_AST(Pattern, Expressions),
    walk_ast([Pattern|ASTRest], Atoms) ->
        Fun = fun(AST, AtomsMarked) ->
            walk_ast(AST, AtomsMarked)
            end,
        walk_ast(ASTRest, lists:foldl(Fun, Atoms, Expressions))).

walk_ast([], Atoms) -> Atoms;
walk_ast([{atom,Line,Atom}|RestAST], Atoms) -> % Check whether atom is valid.
    walk_ast(RestAST, atom_check(Atom, Line, Atoms));
walk_ast([{attribute,Line,atoms,AtomList}|RestAST], Atoms) -> % Valid atoms.
    walk_ast(RestAST, atoms_from_attribute(Line, AtomList, Atoms));
?WALK_AST({attribute,_Line,_Name,_Value}, []);
?WALK_AST({bin,_Line,Elements}, [Elements]);
?WALK_AST({bin_element,_Line,_Name,_Size,_Type}, []);
?WALK_AST({call,_Line,_Fun,Args}, [Args]);
?WALK_AST({'case',_Line,Test,Clauses}, [[Test], Clauses]);
?WALK_AST({char,_Line,_Char}, []);
?WALK_AST({clause,_Line,Args,Guards,Exprs}, [Args] ++ Guards ++ [Exprs]);
?WALK_AST({cons,_Line,Head,Tail}, [[Head], [Tail]]);
?WALK_AST({eof,_Line}, []);
?WALK_AST({error,_Details}, []); % Ignore compiler errors.
?WALK_AST({'fun',_Line,{clauses,Clauses}}, [Clauses]);
?WALK_AST({function,_Line,_Fun,_Arity,Clauses}, [Clauses]);
?WALK_AST({generate,_Line,A,B}, [[A, B]]);
?WALK_AST({'if',_Line,Clauses}, [Clauses]);
?WALK_AST({integer,_Line,_Integer}, []);
?WALK_AST({lc,_Line,Head,Tail}, [[Head|Tail]]);
?WALK_AST({match,_Line,Left,Right}, [[Left], [Right]]);
?WALK_AST({nil,_Line}, []);
?WALK_AST({op,_Line,_BinaryOperator,Left,Right}, [[Left], [Right]]);
?WALK_AST({op,_Line,_UnaryOperator,_Operand}, []);
?WALK_AST({'receive',_Line,Clauses}, [Clauses]);
?WALK_AST({'receive',_Line,Clauses1,_TimeAfter,Clauses2}, [Clauses1, Clauses2]);
?WALK_AST({record,_Line,_Record,Fields}, [Fields]);
?WALK_AST({record_field,_Line,Field,Contents}, [[Field,Contents]]);
?WALK_AST({record_field,_Line,_Variable,_Record,Field}, [[Field]]);
?WALK_AST({remote,_Line,_Module,_Function}, []);
?WALK_AST({string,_Line,_String}, []);
?WALK_AST({tuple,_Line,Elements}, [Elements]);
?WALK_AST({var,_Line,_Name}, []);
walk_ast([Node|ASTRest], Atoms) ->
    io:format("Unknown node: ~p~n", [Node]),
    walk_ast(ASTRest, Atoms).



Some final notes, in no particular order:

I am quite pleased with the functionality of the Atomiser, especially considering that it currently weighs in at just under 100 lines of code. I can honestly attribute the relatively small size of this module to the use of the single WALK_AST substitution macro. If this macro had not been used then we would be looking at an increase of 50% in lines of code, at least.

The fact that the Atomiser does not alter the parse tree of the program it is examining made it an ideal project to get used to working with Erlang parse_transform programs. Without support for parse_transform modules I would have had to hack at the source code of the compiler to achieve a similar result... which is not really a viable option if the addition is not accepted into the project.

Obviously in this implementation I have only added walk_ast function clauses for those AST nodes my own programs require; your mileage may vary. If you do run this module over your own code and an unknown node appears then please let me know so I can add an appropriate clause here. Likewise, please feel free to drop me a line with questions, comments, and/or (especially!) suggestions.


Update 10/4/2007:

Yariv Sadan suggested that I take a look at Recless, one of his many ongoing projects (see comments in Part 1). As Yariv did in Recless, I have removed the atoms_from_ast function by rolling the gathering of atoms into the walk_ast function. It saves five lines of code, but more importantly it saves one pass through the top level of the AST. (The down side is that you can no longer expect the Atomiser to validate atoms before the appropriate module attribute is encountered, but I have no problem with that.)

"ayrnieu" posted a link to this series on Reddit, and also mentioned the Dialyzer tool. I have just played with the Dialyzer and have only one thing to say: Use the Dialyzer on your code.

1 comment:

  1. erl_syntax:subtrees may help with walking the AST. See below:

    tree_foldr(Fun, Acc0, Tree) ->
    F = fun (Group, AccIn) ->
    F = fun (Subtree, AccIn) ->
    tree_foldr(Fun, AccIn, Subtree)
    end,
    lists:foldr(F, AccIn, Group)
    end,
    case erl_syntax:subtrees(Tree) of
    [] ->
    Fun(Tree, Acc0);
    List ->
    Fun(Tree, lists:foldr(F, Acc0, List))
    end.

    I am a little new to Erlang, so some things may be kind of off.

    ReplyDelete

Obligatory legal stuff

Unless otherwise noted, all code appearing on this blog is released into the public domain and provided "as-is", without any warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the author(s) be liable for any claim, damages, or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.