Apr 03, 2009

In Praise of “.h” Files

Just back from CIDM 2009 in Nashville, and for the moment we’re ahead of the pack in fuzzy data mining. The reason is the quality of our software engineering: our (well, Na’el’s) program is so much faster than the competition that we can actually tune parameters and get close to optimal performance out of what should be a good learning algorithm. (It looks as though adding fuzziness takes the decision tree approach from being a “weak learner” to being a “strong learner”.) But that’s not what this post is about.

I ran into a couple of ideas at the meeting and was able to rapidly prototype C++ code for them. One was “Differential Evolution”, a neat twist on the genetic algorithm in which random differences are used to recombine the genes. The other was “radically random trees”, where the decision tree is chosen at random and selected by trial and error. (I haven’t quite finished prototyping that one, but it is well underway.) I also built a near-state-of-the-art particle swarm tool while (not really) listening to someone nattering on about their variation of yet another SVM. Differential evolution is really neat and I’ll probably post a few results soon.
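For readers who haven’t met differential evolution, here is a minimal sketch of the textbook “rand/1/bin” variant. It is not the prototype from the conference. The function name `differential_evolution`, the `sphere` test function, the initial range of -5 to 5 and the default values of `F`, `CR` and `seed` are all assumptions picked for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical helper: minimise `cost` with DE/rand/1/bin.
// F scales the random difference; CR is the per-gene crossover probability.
Vec differential_evolution(const std::function<double(const Vec&)>& cost,
                           std::size_t dim, std::size_t pop_size,
                           std::size_t generations,
                           double F = 0.5, double CR = 0.9,
                           unsigned seed = 42) {
    assert(pop_size >= 4 && dim >= 1);  // need three partners besides the target
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> init(-5.0, 5.0);  // assumed search box
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pick(0, pop_size - 1);
    std::uniform_int_distribution<std::size_t> pick_dim(0, dim - 1);

    // Random initial population and its costs.
    std::vector<Vec> pop(pop_size, Vec(dim));
    std::vector<double> score(pop_size);
    for (std::size_t i = 0; i < pop_size; ++i) {
        for (double& x : pop[i]) x = init(rng);
        score[i] = cost(pop[i]);
    }

    for (std::size_t g = 0; g < generations; ++g) {
        for (std::size_t i = 0; i < pop_size; ++i) {
            // Three distinct members, all different from the target i.
            std::size_t a, b, c;
            do { a = pick(rng); } while (a == i);
            do { b = pick(rng); } while (b == i || b == a);
            do { c = pick(rng); } while (c == i || c == a || c == b);

            // Recombine: a gene comes either from the target or from
            // a + F * (b - c), the "random difference" step.
            Vec trial = pop[i];
            const std::size_t forced = pick_dim(rng);  // at least one gene changes
            for (std::size_t j = 0; j < dim; ++j) {
                if (j == forced || unit(rng) < CR)
                    trial[j] = pop[a][j] + F * (pop[b][j] - pop[c][j]);
            }

            // Greedy selection: keep the trial only if it is no worse.
            // The population is updated in place within a generation.
            const double s = cost(trial);
            if (s <= score[i]) { pop[i] = trial; score[i] = s; }
        }
    }

    std::size_t best = 0;
    for (std::size_t i = 1; i < pop_size; ++i)
        if (score[i] < score[best]) best = i;
    return pop[best];
}

// Simple test problem: sum of squares, minimum 0 at the origin.
double sphere(const Vec& x) {
    double s = 0.0;
    for (double v : x) s += v * v;
    return s;
}
```

Calling `differential_evolution(sphere, 3, 20, 300)` should return a point close to the origin. Real problems usually need `F`, `CR` and the population size tuned, which is exactly where a fast implementation pays off.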

How could I do this so fast and, more importantly, have working code at the end?

I used C++, and I used “.h” header files to design the classes before I built a single bit of code. That let me check my class design for consistency and completeness before writing any implementation. After that the code practically writes itself, because with well-designed class methods there is only a small amount of self-contained code to write in each one. This is different from Java or Python (or many other OOLs), but really useful. In Java you don’t lay out a description of the objects before you write them. (Well, you can, but it’s not the standard approach.) It is hard to resist the temptation to start on the implementation before finishing the design. Sometimes that’s OK, especially if the object is so complex or poorly understood that you need to write something to find out what you don’t know, but it is truly inefficient. Anyway, even with a small, slightly slow Asus EEE 1000, I could run test programs on problems that were comparable to what was being presented. There is no easier way to sort out BS than to test it as it’s being presented ;-).
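To make the header-first workflow described above concrete, here is a small sketch. The names `Candidate` and `Population` and the file names in the comments are hypothetical, not taken from the actual prototypes. The two files are shown in one block for brevity. The top half is the design, which can be read and checked on its own; the bottom half is the implementation, where each method comes out at a line or two.

```cpp
// ---- population.h (hypothetical): design first, declarations only ----
#include <cstddef>
#include <vector>

// One candidate solution and its evaluated cost.
struct Candidate {
    std::vector<double> genes;
    double cost;
};

class Population {
public:
    void add(const Candidate& c);    // append one candidate
    std::size_t size() const;        // number of candidates held
    const Candidate& best() const;   // lowest cost; requires size() > 0
private:
    std::vector<Candidate> members_;
};

// ---- population.cpp (hypothetical): written after the design is settled ----
#include <algorithm>
#include <cassert>

void Population::add(const Candidate& c) { members_.push_back(c); }

std::size_t Population::size() const { return members_.size(); }

const Candidate& Population::best() const {
    assert(!members_.empty());  // precondition stated in the header
    return *std::min_element(
        members_.begin(), members_.end(),
        [](const Candidate& a, const Candidate& b) { return a.cost < b.cost; });
}
```

Reading only the header, you can already ask the design questions (what does `best()` do on an empty population? who owns the candidates?) before a single method body exists.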

Written by Rob in: engineering
