Hierarchical Bayes Compiler

I was very excited to see Hal Daume announce the Hierarchical Bayes Compiler (HBC). It is only at version 0.1, but from an initial look at the source code, Hal seems to be using the latest from the computer science candy store, such as Haskell. What is more, his applications are in natural language processing, where scaling to large quantities of data is vital.

This is very relevant for us, as we're building an application on top of a hierarchical model and currently rely on WinBUGS to do the estimation. But WinBUGS and OpenBUGS are both based on the Oberon environment, which is very closed in addition to being centered on the user interface. A more plug-and-play alternative is JAGS, but I have not tried it out. Another alternative is Jouni's Umacs (universal MC sampler), an R package with quite a few adaptive sampling tricks. However, R is very slow for such things, and I've been reluctant to adopt it for our purpose. HBC works as a compiler, generating the sampler in C. I do not know yet how well this works, but we can expect an improvement of at least an order of magnitude or two over interpreted samplers.
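To make concrete what kind of sampler is at stake, here is a minimal hand-written Gibbs sampler for a toy hierarchical normal model, sketched in Python only for brevity. It is not HBC output and not HBC's input language. The model, the function name gibbs_hier_normal, and the choice to treat both variances as known are my assumptions for illustration. The inner loop is the sort of code that runs slowly in an interpreter and that a compiler could emit as tight C.

```python
import random


def gibbs_hier_normal(groups, sigma2=1.0, tau2=1.0,
                      n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for a toy hierarchical model (illustrative assumption):

        y_ij    ~ Normal(theta_j, sigma2)   observations in group j
        theta_j ~ Normal(mu, tau2)          group means
        mu      ~ flat prior                overall mean

    sigma2 and tau2 are treated as known to keep the sketch short.
    Returns the post-burn-in draws of mu and of the theta vector.
    """
    rng = random.Random(seed)  # private RNG so runs are reproducible
    n_groups = len(groups)
    sums = [sum(g) for g in groups]
    counts = [len(g) for g in groups]

    # Start from simple moment estimates.
    mu = sum(sums) / sum(counts)
    theta = [s / n for s, n in zip(sums, counts)]

    mu_draws, theta_draws = [], []
    for it in range(n_iter):
        # theta_j | mu, y: precision-weighted mix of group data and mu.
        for j in range(n_groups):
            prec = counts[j] / sigma2 + 1.0 / tau2
            mean = (sums[j] / sigma2 + mu / tau2) / prec
            theta[j] = rng.gauss(mean, (1.0 / prec) ** 0.5)

        # mu | theta: centered on the average of the group means.
        mu = rng.gauss(sum(theta) / n_groups, (tau2 / n_groups) ** 0.5)

        if it >= burn_in:
            mu_draws.append(mu)
            theta_draws.append(list(theta))
    return mu_draws, theta_draws
```

Calling gibbs_hier_normal([[1.2, 0.8, 1.1], [2.9, 3.2], [2.0, 1.7, 2.2, 2.1]]) on some made-up data returns 1500 retained draws of mu and of the three group means. With real data sizes the two nested loops dominate the running time, which is why compiling them matters.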

Will report more when I test it out.

2 Comments

I'm not sure what plug-and-play means in this context. But JAGS is certainly portable, and can be linked to other libraries written in C, C++, and FORTRAN. I confess that the release rate has been slow, but JAGS 1.0 should be out by the end of the year. JAGS doesn't scale very well: supporting the flexibility of the BUGS language creates a large memory overhead. So HBC - which has scalability as a specific design goal - should be more useful for large problems.

Martyn: By plug-and-play I mean whether a toolkit is embeddable. In particular, imagine that you have a graphical user interface that lets one do data analysis. The data analysis in turn invokes a sampler. Now imagine that someone runs two data analysis modules, so there will be two samplers running at the same time. If the communication between the module and the sampler goes through files, this will not end nicely. Second, the sampler should be a part of the installation package. Having a sampler that depends on other packages can easily lead to a huge list of dependencies that have to be lugged around with the data analysis module. In all, being able to compile a model into a small, efficient standalone executable is the best case I can think of for our needs.
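A minimal sketch of what I mean by embeddable, written in Python purely for illustration: each sampler is an object that keeps its state and its random number generator in memory, so two analyses can run side by side without sharing any files. The class EmbeddedSampler, the helper normal_logpdf, and the random-walk Metropolis target are hypothetical and not part of JAGS, HBC, or BUGS.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


class EmbeddedSampler:
    """Random-walk Metropolis sampler whose whole state lives in the object."""

    def __init__(self, log_density, start=0.0, step=1.0, seed=0):
        self.log_density = log_density
        self.x = start
        self.step = step
        self.rng = random.Random(seed)  # private RNG, not the global one

    def run(self, n_iter):
        draws = []
        lp = self.log_density(self.x)
        for _ in range(n_iter):
            proposal = self.x + self.rng.gauss(0.0, self.step)
            lp_new = self.log_density(proposal)
            # Accept with probability min(1, exp(lp_new - lp)).
            if math.log(1.0 - self.rng.random()) < lp_new - lp:
                self.x, lp = proposal, lp_new
            draws.append(self.x)
        return draws


def normal_logpdf(mean):
    """Unnormalized log density of Normal(mean, 1)."""
    return lambda x: -0.5 * (x - mean) ** 2


# Two "analysis modules", each with its own sampler, running concurrently.
samplers = [
    EmbeddedSampler(normal_logpdf(-3.0), start=-3.0, seed=1),
    EmbeddedSampler(normal_logpdf(3.0), start=3.0, seed=2),
]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda s: s.run(5000), samplers))
```

Because nothing is exchanged through the file system or through global state, the two runs cannot overwrite each other's input or output, and each one is reproducible from its own seed. Threads are used here only to show the two samplers coexisting in one process; they are not meant as a speedup.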
