Archive for the 'Tools' Category

Naive parsing techniques – part I

In this post, I’m considering a design for a simple tool that could help with the following:

  • Repackaging classes; this will involve replacing package and import declarations
  • Validating class dependencies
  • Generating code and TODO annotations from an incomplete specification
  • Tracking high level aspects of system architecture by extracting relevant data from program code

Looking at the above, we’d surely agree that some kind of API would be required to take care of source analysis. This is where naive parsing comes into play.

If we needed an API to help resolve the above tasks unambiguously, we would also need to parse the source code reliably. However, writing parsers is overkill. In fact, just setting up and integrating with a parser might be quite a stretch.

Instead of going the full stretch, I’m proposing to write a naive parsing library. This will consist of several utility methods for scanning source files, detecting relevant code fragments with a high (not absolute) degree of confidence, and adding or modifying source files.
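To make the idea a little more tangible, here is a minimal sketch of what one such utility method might look like for the repackaging task. The class name NaiveRepackager and the method repackage are hypothetical names invented for this illustration, and the regular expression is only one possible heuristic: it assumes each package or import declaration sits on a single line, and it would happily rewrite a matching line inside a comment or a string literal. That is exactly the "high, not absolute" confidence mentioned above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: rewrites package and import declarations line by line, without a real parser. */
final class NaiveRepackager {

    // Heuristic: a line that starts with "package" or "import" (optionally "static"),
    // followed by a dotted name and a semicolon. DOTALL lets ".*" swallow a trailing '\r'.
    private static final Pattern DECLARATION = Pattern.compile(
            "^(\\s*(?:package|import(?:\\s+static)?)\\s+)([\\w.*]+)(\\s*;.*)$",
            Pattern.DOTALL);

    /** Returns the source with oldPackage replaced by newPackage in package/import lines only. */
    static String repackage(String source, String oldPackage, String newPackage) {
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\n", -1)) {
            Matcher m = DECLARATION.matcher(line);
            if (m.matches()) {
                String name = m.group(2);
                // Only rewrite exact matches or sub-packages, so "com.older" is not mistaken for "com.old".
                if (name.equals(oldPackage) || name.startsWith(oldPackage + ".")) {
                    name = newPackage + name.substring(oldPackage.length());
                }
                line = m.group(1) + name + m.group(3);
            }
            out.append(line).append('\n');
        }
        out.setLength(out.length() - 1); // drop the newline appended after the last line
        return out.toString();
    }
}
```

Calling NaiveRepackager.repackage(source, "com.old", "org.fresh") on a file's text would leave everything except the matching declarations untouched. Nothing here depends on Java grammar beyond two keywords, which is why the same approach could carry over to similar languages.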

Why naive parsing?

I’ve already given half of the answer: parsing is expensive. Using naive parsing functions will let me reuse the same techniques across similar languages. We’ll not only be saving time preparing the parsing facilities – we’ll actually save time every time we need to write another utility. That’s where ‘naive’ comes into play.

Most parsers somehow relate to the way we model language. However, there are strong smells indicating that parsers do not quite interpret text as we do:

  • Classic parsers do not usually integrate error correction. When reading text, we are performing a mix of interpretation and error correction.
  • Given correct input, however complex, a classic parser will resolve that input into a parse tree of arbitrary depth and breadth. When reading text, we can easily overload when a sentence becomes too long and too complex.

Parsers, then, provide fail-fast, scalable solutions. Scalability is definitely a quality, but comes at a cost – naive parsing consists of simple rules that are easy to come up with, whereas writing scalable parsing rules is an art. Fail-fast may or may not be a quality depending on the context. For example, we might want our utilities to correct simple mistakes, even if that means introducing another mistake from time to time. After all, our code will eventually get either interpreted or compiled, so we only want to make sure that our automations generate an overall reduction in development cost.

Isn’t 100% reliability a paramount requirement?

Yes, but not in the parser used by our utilities. We need 100% reliability in the final product, which is what our utilities are helping us build. In contrast, we only need our utilities to save more time than they cost. In this particular scenario (considering the applications suggested at the start), three strategies will be used to achieve correct results using only a ‘mostly accurate’ parser.

  1. Simple, readable code. Simple, readable code is advocated everywhere. This suggests that parsers that allow very complex code – code that humans find hard or impossible to read – are too powerful. In short, we’ll agree to rewrite some of the code that the parser cannot read wherever this arguably increases the readability, simplicity and maintainability of the source.
  2. Semantic separation. As we’ll see later, a naive parser often fails to correctly analyze code that uses semantic overlaps. Semantic overlaps don’t really promote simple code, and need rarely occur in program code, although classic parsers often rely on very limited semantic separation – for example, a very limited number of language keywords.
  3. Patches. We will allow defining patches that determine project-specific exceptions to the parsing process. This will be provided as a last resort, mostly used to deal with legacy code that cannot be modified.

That’s all folks

Here’s a plan. Yes, I do have something concrete in mind, but if you read this post, I’ll be delighted to get feedback and hear that this has inspired ideas that have nothing to do with what I envision.

Mix Master, Code Faster

Things I routinely do with my IDE:

  • One click getter/setter definitions. Plus I’m never overloaded by the sight of bunched accessors, because they appear as indents on the left of my field names.
  • Looking up class specs, not the code. As discussed with a much respected colleague, this implies a cultural shift – Wow. From code culture to design culture!
  • DnD Refactoring. Typically achieved by pressing, holding the mouse and dragging a class, package or method to (respectively) another class, package or package.
  • 100 classes a click away.
  • Not using a UML editor. Hey, we already agreed seeing the specs before seeing the code implies a culture shift.
  • Not using search. What you see is what you get. I can see it, so I get it.

Maybe I’m having more fun than you do after all.

I won’t give you the name. After all you just need to poke around a couple of posts. Besides, my IDE doesn’t have syntax correction, code hyperlinks or version control.

It doesn’t even have an undo.

Next time, I’ll tell you how it all started.

The Best Java IDE – An Epic on Programming Folklore

Now, what do you expect here?

A review… ? Well, don’t think so…

10 years ago, I was on the brink of a bold, redemptory decision – you name it, I was ripe and ready to give up programming.

Two months ago, I exchanged a couple of mails with a heavyweight director at a huge IT corporation. With respect to the potential interest of new programming tools and paradigms, he wrote:

“We have staked everything on (Acme IDE), I really can’t imagine doing anything else”.

So much for open source, modularity, leadership and innovation.

Back to September 2000.

At the time, I was using a couple of known and less known IDEs. JBuilder and CodeGuide are still around, and if you’re actually writing Java code, you might as well dust off a copy from your cupboard and get on with writing some good code (not that I recommend either. Do I need to give the list of IDEs I do not recommend?).

In the past 10 years, I have seen no major improvements to programming technology, and readily expect the blues to last another 15. Why?

Programming culture?

  1. Text, white text on a black screen. The last time I made an impression while presenting an IDE, it was still with white text on a black screen (well – graphs, but still…), and I know at least one programmer who gave up on Eclipse because the display is hard to configure.
  2. (Again), no services. How can you code when you don’t have a bash command line handy, right at your fingertips?
  3. Creeping libertarianism. At my job, dishing out $100 for an IDE just seems a little bit too much – sure, digging up free utilities is forever hip. Now don’t get me wrong, it’s got nothing to do with logistics – as a profession, we just tend to confuse commercial software with assistive technology.
  4. Virtuosity – who needs an IDE when languages already have Reusability, Concision, Modularity and all that prescient demi-gods envisioned in the Golden Age (beautifully inspired by Logic, Reason & Noam Chomsky)?

Once we’ve hit the baseline, here’s how we choose the tools we commit our life and career to:

  • Industry endorsement. This starts at school. Officially, you can’t teach using an IDE until it’s been stamped by the government.
  • Standardisation. Well, yes – we’d very much appreciate gaining in productivity by using something new, but you see, it needs to be the same as what we already know, otherwise we won’t trust it, won’t try it, won’t use it and won’t keep it.
  • Features. There is a law linking featurism with atavistic feelings of safety and freedom. In fact, typical answers that I get when suggesting I can’t do what I want with OTS products are “you can probably do it, it does everything” and “I’m not sure how it’s done, but I know it’s possible”.
    Yes, I know we know we don’t need the features. We never use them… but… but… what if…

Sure. I’m still writing software. I even said once that you’d have to pay me to use an OTS IDE again. Well surely I wouldn’t be writing AS3 code using an AS2 plugin for the fun or the weather. It doesn’t validate. It doesn’t hyperlink. It doesn’t even delete files anymore, and versions aren’t under control. So what?

Overall, I still get * notepad class performance *.

And I don’t mind, because what I really care about is what I see, and what I do. And that is something we could already do 10 years ago using CodeGuide or JBuilder, granted function folding has gotten worse since GFA BASIC.

Now go and get yourself whatever maximises usability by minimising features and clutter. Get the bare minimum that you can’t imagine coding without. As the market goes, less is more, and you might actually have to pay if you really want no more features than you can afford.

As a quick reminder, don’t forget that relying on object variables versus parameters makes your code harder to refactor. You’ve been warned.
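A small sketch of what I mean, with made-up names (Invoice, taxRate and both methods are hypothetical, purely for illustration). The two methods compute the same thing, but only one of them can be dragged to another class without dragging state along with it:

```java
/** Hypothetical example class contrasting a hidden field dependency with an explicit parameter. */
final class Invoice {

    private double taxRate = 0.2; // object variable: a hidden input to totalUsingField

    // Reads object state, so moving this method elsewhere means moving (or exposing) taxRate too.
    double totalUsingField(double net) {
        return net * (1 + taxRate);
    }

    // Everything it needs arrives as a parameter, so it can be moved or made static freely.
    static double totalUsingParameter(double net, double taxRate) {
        return net * (1 + taxRate);
    }
}
```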

Keep posted :)

SourceFactor – no compromise

You could skip the best part of this article and visit sourcefactor.org – SourceFactor is an interface for processing arbitrary sources to arbitrary targets (no, really), and it comes with a nifty utility class that helps format the output. You can use it with build processes or you can integrate it with Java.

I wrote somewhere that languages are inextensible; inextensible they are. For proof:

  • C and C++ macros.
  • Java annotations

Macros are simple and potentially messy. Java isn’t messy. Instead Java provides a pretty scary API for preprocessing. Because I annotate all my source with XML, I wasn’t impressed when annotations came by, and after a while I just sat there, shaking my head – after all the idea of annotations is to save time on the so-called ‘boiler plate’ required by some APIs and frameworks.

That language designers provide patchy facilities to allow writing shortcuts shows just how far you can go with regular inheritance and re-usability constructs. The idea of writing domain specific languages isn’t new (check Persistence of Vision for some entertainment), and although it’s not for everybody, there are many, many advantages (Wikipedia has a decent article on ‘Language Oriented Programming’).

Yes, XML’s my cup of tea. One of the disadvantages of writing a domain specific language is that you need a parser. Writing a parser is an expensive black art. Leaving elegance behind and hitting the ground running, XML parses anytime, anywhere.
Ergo one of my favorite pastimes is embedding regular Java, ECMAScript or PHP within XML declarations and exporting regular source code.
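For the curious, here is a rough sketch of that pastime using nothing but the DOM parser that ships with the JDK. To be clear, this is not SourceFactor's API: the class XmlToJava, its export method and the tiny XML vocabulary (a class element holding field and code children) are all assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical exporter: turns a made-up XML class description into Java source text. */
final class XmlToJava {

    static String export(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XML source", e);
        }
        Element root = doc.getDocumentElement(); // assumed shape: <class name="...">
        StringBuilder out = new StringBuilder();
        out.append("class ").append(root.getAttribute("name")).append(" {\n");
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and comments between elements
            }
            Element element = (Element) child;
            if (element.getTagName().equals("field")) {
                // <field type="int" name="x"/> becomes a private field declaration
                out.append("    private ").append(element.getAttribute("type")).append(' ')
                   .append(element.getAttribute("name")).append(";\n");
            } else if (element.getTagName().equals("code")) {
                // <code> carries regular Java, copied through verbatim (CDATA keeps it readable)
                out.append("    ").append(element.getTextContent().trim()).append('\n');
            }
        }
        return out.append("}\n").toString();
    }
}
```

The XML side does the structural work (what is a field, what is a method), so the embedded Java never needs parsing at all: it is just text passed through to the target.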

SourceFactor doesn’t bind you to XML sources. Nor does it tell you how to parse your input. But say you took on the challenge to write a simple, readable formal specification using a plain text editor, spreadsheet software or whatever you please. Well then, SourceFactor gives you a simple API that you can use to invoke your preprocessor from the command line. It is free, small, open source and convenient.

If a language were a violin, meta-programming would be playing without brushing the strings. Another day I’ll write about naive parsers and how hubris unleashed upon the world millions of write-only lines of code.

Visual XML editors?

I’ve been hunting online for an XML editor with the following features:

  • non overlapping tree views
  • tabulated attribute lists
  • custom actions.

If you had a look at my IDE, Antegram, you will know where I’m coming from: I use non overlapping tree views every day in my code. To cut a long story short, I’m out of luck as I didn’t find what I was looking for. OK, I admit I’m quite particular about this – I’ll explain another time.

For a quick, unfair overview of what’s around and what it could do for you, read on…

Roughly, the offer is divided between WYSIWYG editors and rational editors. WYSIWYG editors are typically based on XSLT, and that seems powerful and exciting (although I am not quite addicted to XSLT). Skimming through my list I feel that at least 2 or 3 of the editors I found may support custom actions. Rational editors typically offer a wide range of visual tools including a tree view; unfortunately it doesn’t seem that any editor was built around several non overlapping tree views.

Here’s the list anyway:

Open XML Editor
XMLmind
EditiX
My Eclipse
Altova
oXygen
Foxe
Exchanger
XML Notepad
Serna
XMLSpy
Stylus Studio

I also found a few resources that may help in choosing an XML editor:


