You could skip the best part of this article and visit sourcefactor.org – SourceFactor is an interface for processing arbitrary sources to arbitrary targets (no, really), and it comes with a nifty utility class that helps format the output. You can use it with build processes or you can integrate it with Java.
I wrote somewhere that languages are inextensible; inextensible they are. For proof:
- C and C++ macros
- Java annotations
Macros are simple and potentially messy. Java isn’t messy. Instead, Java provides a pretty scary API for preprocessing. Because I annotate all my source with XML, I wasn’t impressed when annotations came along, and after a while I just sat there, shaking my head – after all, the idea of annotations is to save time on the so-called ‘boilerplate’ required by some APIs and frameworks.
That language designers provide patchy facilities for writing shortcuts shows the limits of how far you can go with regular inheritance and reusability constructs. The idea of writing domain-specific languages isn’t new (check Persistence of Vision for some entertainment), and although it’s not for everybody, there are many, many advantages (Wikipedia has a decent article on ‘Language Oriented Programming’).
Yes, XML’s my cup of tea. One of the disadvantages of writing a domain-specific language is that you need a parser. Writing a parser is an expensive black art. If you leave elegance behind and hit the ground running, XML parses anytime, anywhere.
Ergo one of my favorite pastimes is embedding regular Java, ECMAScript or PHP within XML declarations and exporting regular source code.
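To make that concrete, here is a minimal sketch of the idea in plain Java. It is not SourceFactor’s API; the `<class>`, `<field>` and `<method>` element names, the `XmlToJavaSketch` class and its `generate` method are all made up for illustration. It only assumes the JDK’s built-in DOM parser: fields are expanded into a private member plus a getter, and whatever Java sits inside a `<method>` element is copied through verbatim.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: not SourceFactor's API, only the general idea.
class XmlToJavaSketch {

    // Turns a (made-up) <class> declaration into Java source text.
    static String generate(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }

        Element root = doc.getDocumentElement();
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(root.getAttribute("name")).append(" {\n");

        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and other non-element nodes
            }
            Element el = (Element) node;
            if (el.getTagName().equals("field")) {
                // The boilerplate part: one declaration expands to a member and a getter.
                String type = el.getAttribute("type");
                String name = el.getAttribute("name");
                String cap = Character.toUpperCase(name.charAt(0)) + name.substring(1);
                out.append("    private ").append(type).append(' ').append(name).append(";\n");
                out.append("    public ").append(type).append(" get").append(cap)
                   .append("() { return ").append(name).append("; }\n");
            } else if (el.getTagName().equals("method")) {
                // The embedded part: regular Java passes through untouched.
                out.append("    ").append(el.getTextContent().trim()).append('\n');
            }
        }
        return out.append("}\n").toString();
    }

    public static void main(String[] args) {
        String xml = "<class name=\"Point\">"
                + "<field type=\"int\" name=\"x\"/>"
                + "<method><![CDATA[public int doubled() { return x * 2; }]]></method>"
                + "</class>";
        System.out.print(generate(xml));
    }
}
```

The CDATA section is what keeps the embedded Java readable: braces, angle brackets and ampersands need no escaping, and the XML parser does all the parsing there is to do.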
SourceFactor doesn’t bind you to XML sources. Nor does it tell you how to parse your input. But say you took on the challenge to write a simple, readable formal specification using a plain text editor, spreadsheet software or whatever you please. Well then, SourceFactor gives you a simple API that you can use to invoke your preprocessor from the command line. It is free, small, open source and convenient.
If a language were a violin, meta-programming would be playing without brushing the strings. Another day I’ll write about naive parsers and how hubris unleashed millions of write-only lines of code upon the world.