The Atomiser, Part VII
As promised, here is the full listing of my current atomiser.erl file:
-module(atomiser).
-author("Philip Robinson").
-export([parse_transform/2]).
%-compile({parse_transform, atomiser}). % Uncomment after initial compile.

-atoms([atom, attribute, bin, bin_element, call, 'case', char]).
-atoms([clause, clauses, cons, eof, 'fun', function, generate]).
-atoms(['if', integer, lc, match, nil, op, 'receive', record]).
-atoms([record_field, remote, string, tuple, var]).
-atoms([atoms, error, found, ok]).

parse_transform(AST, _Options) ->
    atoms_unused_print(walk_ast(AST, dict:new())),
    AST.

atoms_from_attribute(Line, AtomList, Atoms) ->
    AddAtom = fun(Atom, Dict) ->
        case dict:find(Atom, Dict) of
            {ok, LineAlreadyDefined} ->
                io:format("Line ~B: Atom ~w already defined on line ~B.~n",
                          [Line, Atom, LineAlreadyDefined]),
                Dict;
            error -> dict:store(Atom, Line, Dict)
        end
    end,
    lists:foldl(AddAtom, Atoms, AtomList).

atom_check(Atom, Line, Atoms) ->
    case dict:find(Atom, Atoms) of
        {ok, found} -> Atoms;
        {ok, _LineDefinedOn} -> dict:store(Atom, found, Atoms);
        error ->
            io:format("Line ~B: Atom ~w unexpected.~n", [Line, Atom]),
            Atoms
    end.

atoms_unused_print(Atoms) ->
    Filter = fun({_Atom, FoundOrDefinedLine}) ->
        FoundOrDefinedLine =/= found
    end,
    PrintUnusedAtom = fun({Atom, Line}) ->
        io:format("Line ~B: Atom ~w unused.~n", [Line, Atom])
    end,
    lists:foreach(PrintUnusedAtom,
                  lists:keysort(2, lists:filter(Filter, dict:to_list(Atoms)))).

-define(WALK_AST(Pattern, Expressions),
    walk_ast([Pattern|ASTRest], Atoms) ->
        Fun = fun(AST, AtomsMarked) ->
            walk_ast(AST, AtomsMarked)
        end,
        walk_ast(ASTRest, lists:foldl(Fun, Atoms, Expressions))).

walk_ast([], Atoms) -> Atoms;
walk_ast([{atom,Line,Atom}|RestAST], Atoms) -> % Check whether atom is valid.
    walk_ast(RestAST, atom_check(Atom, Line, Atoms));
walk_ast([{attribute,Line,atoms,AtomList}|RestAST], Atoms) -> % Valid atoms.
    walk_ast(RestAST, atoms_from_attribute(Line, AtomList, Atoms));
?WALK_AST({attribute,_Line,_Name,_Value}, []);
?WALK_AST({bin,_Line,Elements}, [Elements]);
?WALK_AST({bin_element,_Line,_Name,_Size,_Type}, []);
?WALK_AST({call,_Line,_Fun,Args}, [Args]);
?WALK_AST({'case',_Line,Test,Clauses}, [[Test], Clauses]);
?WALK_AST({char,_Line,_Char}, []);
?WALK_AST({clause,_Line,Args,Guards,Exprs}, [Args] ++ Guards ++ [Exprs]);
?WALK_AST({cons,_Line,Head,Tail}, [[Head], [Tail]]);
?WALK_AST({eof,_Line}, []);
?WALK_AST({error,_Details}, []); % Ignore compiler errors.
?WALK_AST({'fun',_Line,{clauses,Clauses}}, [Clauses]);
?WALK_AST({function,_Line,_Fun,_Arity,Clauses}, [Clauses]);
?WALK_AST({generate,_Line,A,B}, [[A, B]]);
?WALK_AST({'if',_Line,Clauses}, [Clauses]);
?WALK_AST({integer,_Line,_Integer}, []);
?WALK_AST({lc,_Line,Head,Tail}, [[Head|Tail]]);
?WALK_AST({match,_Line,Left,Right}, [[Left], [Right]]);
?WALK_AST({nil,_Line}, []);
?WALK_AST({op,_Line,_BinaryOperator,Left,Right}, [[Left], [Right]]);
?WALK_AST({op,_Line,_UnaryOperator,Operand}, [[Operand]]);
?WALK_AST({'receive',_Line,Clauses}, [Clauses]);
?WALK_AST({'receive',_Line,Clauses1,_TimeAfter,Clauses2}, [Clauses1, Clauses2]);
?WALK_AST({record,_Line,_Record,Fields}, [Fields]);
?WALK_AST({record_field,_Line,Field,Contents}, [[Field,Contents]]);
?WALK_AST({record_field,_Line,_Variable,_Record,Field}, [[Field]]);
?WALK_AST({remote,_Line,_Module,_Function}, []);
?WALK_AST({string,_Line,_String}, []);
?WALK_AST({tuple,_Line,Elements}, [Elements]);
?WALK_AST({var,_Line,_Name}, []);
walk_ast([Node|ASTRest], Atoms) ->
    io:format("Unknown node: ~p~n", [Node]),
    walk_ast(ASTRest, Atoms).
Some final notes, in no particular order:
I am quite pleased with the functionality of the Atomiser, especially considering that it currently weighs in at just under 100 lines of code. I attribute the relatively small size of this module to the single WALK_AST substitution macro: every node type that needs no special handling costs one line instead of a full function clause. Without the macro we would be looking at an increase of at least 50% in lines of code.
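To make the saving concrete, here is a minimal sketch of the same macro technique in a stand-alone module. The module name walk_macro_demo, the WALK macro, and the atoms/1 function are hypothetical names chosen for illustration; unlike the Atomiser, this sketch only collects the atoms it meets and handles just a handful of node types.

```erlang
-module(walk_macro_demo).
-export([atoms/1]).

%% Each use of ?WALK expands to one complete walk/2 clause: match the
%% node, walk every list of child nodes in Expressions, then carry on
%% with the rest of the list.
-define(WALK(Pattern, Expressions),
    walk([Pattern|Rest], Acc) ->
        Fun = fun(Nodes, AccIn) -> walk(Nodes, AccIn) end,
        walk(Rest, lists:foldl(Fun, Acc, Expressions))).

%% Returns the atoms found in a list of abstract-format nodes,
%% in the order they were encountered.
atoms(Nodes) ->
    lists:reverse(walk(Nodes, [])).

walk([], Acc) -> Acc;
walk([{atom,_Line,Atom}|Rest], Acc) ->
    walk(Rest, [Atom|Acc]);
?WALK({tuple,_Line,Elements}, [Elements]);
?WALK({cons,_Line,Head,Tail}, [[Head], [Tail]]);
?WALK({nil,_Line}, []);
?WALK({integer,_Line,_Value}, []);
walk([_Other|Rest], Acc) -> % Skip anything this sketch does not know about.
    walk(Rest, Acc).
```

From the shell, after compiling the module:

```erlang
{ok, Tokens, _} = erl_scan:string("{foo, [bar, 1], baz}."),
{ok, [Expr]} = erl_parse:parse_exprs(Tokens),
walk_macro_demo:atoms([Expr]).
%% returns [foo,bar,baz]
```

Written out by hand, each of those four ?WALK lines would be a five-line clause.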
The fact that the Atomiser does not alter the parse tree of the program it is examining made it an ideal project to get used to working with Erlang parse_transform programs. Without support for parse_transform modules I would have had to hack at the source code of the compiler to achieve a similar result... which is not really a viable option if the addition is not accepted into the project.
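For anyone who wants to try the same kind of read-only transform, the skeleton is very small. The sketch below is not part of the Atomiser; function_lister is a hypothetical module name, and all it does is print the functions it sees before handing the forms back untouched.

```erlang
-module(function_lister).
-export([parse_transform/2]).

%% The compiler calls parse_transform/2 with the list of abstract forms
%% for the module being compiled, and compiles whatever list is returned.
%% Returning the forms unchanged makes this a pure inspection pass.
parse_transform(Forms, _Options) ->
    Functions = [{Name, Arity}
                 || {function, _Line, Name, Arity, _Clauses} <- Forms],
    io:format("Functions defined: ~p~n", [Functions]),
    Forms.
```

A module opts in with -compile({parse_transform, function_lister}). once function_lister has been compiled and is on the code path.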
Obviously in this implementation I have only added walk_ast function clauses for those AST nodes my own programs require; your mileage may vary. If you do run this module over your own code and an unknown node appears then please let me know so I can add an appropriate clause here. Likewise, please feel free to drop me a line with questions, comments, and/or (especially!) suggestions.
Update 10/4/2007:
Yariv Sadan suggested that I take a look at Recless, one of his many ongoing projects (see comments in Part 1). As Yariv did in Recless, I have removed the atoms_from_ast function by rolling the gathering of atoms into the walk_ast function. It saves five lines of code, but more importantly it saves one pass through the top level of the AST. (The downside is that atoms are now only valid after their declaration: any atom used before the relevant -atoms module attribute is encountered will be reported as unexpected. I have no problem with that.)
"ayrnieu" posted a link to this series on Reddit, and also mentioned the Dialyzer tool. I have just played with the Dialyzer and have only one thing to say: Use the Dialyzer on your code.
erl_syntax:subtrees may help with walking the AST. See below:
tree_foldr(Fun, Acc0, Tree) ->
    FoldGroup = fun (Group, GroupAcc) ->
        FoldSubtree = fun (Subtree, SubtreeAcc) ->
            tree_foldr(Fun, SubtreeAcc, Subtree)
        end,
        lists:foldr(FoldSubtree, GroupAcc, Group)
    end,
    case erl_syntax:subtrees(Tree) of
        [] ->
            Fun(Tree, Acc0);
        Groups ->
            Fun(Tree, lists:foldr(FoldGroup, Acc0, Groups))
    end.
I am a little new to Erlang, so some things may be kind of off.
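As a rough illustration of where this idea leads, here is a sketch that collects the atoms in a syntax tree without any per-node clauses. Note that it swaps in erl_syntax_lib:fold/3 from the syntax_tools application rather than the hand-written tree_foldr above, and that syntax_atoms is a hypothetical module name. It also visits every atom node, so it cannot replace the Atomiser as it stands, which deliberately skips function names in calls and attribute contents.

```erlang
-module(syntax_atoms).
-export([atoms/1]).

%% Returns the sorted, de-duplicated atoms found anywhere in Tree.
%% Tree may be an erl_parse abstract form or an erl_syntax tree.
atoms(Tree) ->
    Collect = fun(Node, Acc) ->
        case erl_syntax:type(Node) of
            atom -> [erl_syntax:atom_value(Node) | Acc];
            _ -> Acc
        end
    end,
    lists:usort(erl_syntax_lib:fold(Collect, [], Tree)).
```

From the shell:

```erlang
{ok, Tokens, _} = erl_scan:string("{foo, [bar, 1], baz, foo}."),
{ok, [Expr]} = erl_parse:parse_exprs(Tokens),
syntax_atoms:atoms(Expr).
%% returns [bar,baz,foo]
```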