When Dynamic Typing Goes Wrong

Yesterday I found out first-hand how using a dynamically typed language can get you into trouble in unexpected ways whilst writing unit tests for CSF (my PHP framework).


Why .NET Won’t Beat Java (Yet)

It’s been no real secret that the .NET CLR (Common Language Runtime) has been Microsoft’s answer to Java. Garbage collection, bytecode compilation, large set of core libraries, it’s all there. But there is a problem that I’ve encountered recently: distribution size and install base.

A fairly clean Windows XP machine is unlikely to have anything higher than .NET 1.x installed, yet anything really compelling in the .NET framework requires 3.5, so you need to get version 3.5 onto the machine somehow. This isn’t a problem in Windows Vista and later, which include the framework. Microsoft’s answer to simple deployment is its “ClickOnce” system, where an application automatically installs .NET from the Internet (if necessary) before installing itself. Sure, it’s a 60MB download, but it only needs to be done once.

The real issue is when you can’t guarantee an Internet connection or a working .NET 3.5 installation.  At this point you must resort to the offline installation, and this is where .NET and Java are very different.  For Java 1.6 SE, the offline installer is 16MB for Windows 32-bit.  For .NET 3.5, the offline installer is a 200MB universal package, with no way to cut out the parts you don’t need.  Java in fact is so small relatively speaking that many applications actually include the JRE in their package (for example OpenOffice - 148MB with JRE vs. 134MB without), whereas .NET can turn a 10MB application into a 210MB monstrosity.
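
To put those numbers side by side, here is a small Python sketch of the arithmetic. The sizes are the ones quoted above; the function names are just mine for illustration.

```python
# Sizes in MB, taken from the figures quoted above.
JRE_OFFLINE_MB = 16
DOTNET_OFFLINE_MB = 200


def bundled_size(app_mb, runtime_mb):
    """Total size of an application shipped together with its runtime installer."""
    return app_mb + runtime_mb


def overhead_ratio(app_mb, runtime_mb):
    """How many times larger the bundle is than the application alone."""
    return bundled_size(app_mb, runtime_mb) / app_mb


# OpenOffice: 148MB with the JRE vs. 134MB without.
openoffice_jre_overhead = 148 - 134
print(openoffice_jre_overhead)                        # 14
print(bundled_size(10, DOTNET_OFFLINE_MB))            # 210
print(overhead_ratio(10, DOTNET_OFFLINE_MB))          # 21.0
```

In other words, bundling the JRE adds roughly a tenth to OpenOffice, while bundling .NET 3.5 makes a 10MB application 21 times bigger.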

Now in my particular situation, I’m embarrassed to have chosen C#/.NET/WPF for a simple tool at work. For programming the tool itself, it was certainly the fastest option - other kinds of Windows programming, e.g. MFC or Forms, just look painful, and I thought the barrier to entry would be lower than Java. However, this 1MB tool requires the 200MB .NET offline installer to be carted around with it because the network it’s used on is completely separate from the Internet.

.NET will only be really appealing once it’s ubiquitous, but then “critical mass” is one of the big problems for lots of software.  For now, I think I’m going to try Java next time…

Mouse Button Remapping with HAL

I’ve had a Logitech MX1000 mouse for a few years now, and the two most important features for me have been the ergonomic build and the few extra buttons. Something I’ve always found with many-buttoned mice is that the side button closest to the thumb is a much more ergonomic way to “middle click” than the actual middle mouse button—it’s a much more natural motion. Middle clicks are quite useful these days, especially with them being a standard way of closing tabs (and opening them in browsers), and having such a popular button perched on a rocking and rolling peak is far from ideal.

Since I’m primarily a Linux user, I don’t have Logitech’s own SetPoint software at my disposal, so I’ve always had to find a way to get this functionality in some other way. When I first got the mouse, this method involved deliberately using a “basic” mouse driver (referred to in xorg.conf as “IMPS/2”), which didn’t support many mouse buttons. The effect was that the button mappings wrapped around, leaving button 8, my preferred “middle click”, mapped to button 2 (8 mod 3), the real middle button.

Unfortunately, newer Xorg versions became smarter and better capable of handling more buttons, and this workaround ceased to function. For the next while, I used something even more hackish: xbindkeys combined with xmacroplay to simulate a middle click with the following part of my .xbindkeysrc:

"echo ButtonRelease 8 ButtonPress 2 ButtonRelease 2 | xmacroplay -d 0 :0.0 &"
    b:8

The downside to this solution is that there are some cases where the button events don’t work correctly, one of them being open-in-tab from a bookmark menu in Firefox. It seemed the best solution would be to get Xorg to remap the buttons in such a way that button 8 really was just an extra button 2. The “xinput” utility lets you set button maps in this way—this wiki entry shows how to remap mouse buttons (even if for a different purpose).

This method worked fine, and I put it in my startup programs for GNOME, but it didn’t persist after suspend/resume. It appears that when resuming, USB devices get “reattached”, and therefore don’t keep the settings applied to them the last time they were attached. The workaround for this is to set a policy using a HAL (Hardware Abstraction Layer) .fdi file. These files live in /etc/hal/fdi/policy (at least they do on Ubuntu) and allow you to set various properties on input devices. This page on the Ubuntu wiki gave me the recipe I needed to remap buttons based on the device name. I ended up with the following .fdi file (which I saved at /etc/hal/fdi/policy/logitech-mx1000.fdi):

<?xml version="1.0" encoding="UTF-8"?>
 
<deviceinfo version="0.2">
 
<!-- Remap Logitech MX1000 buttons so that the most accessible side button
     acts as a middle button -->
 
  <device>
    <match key="info.product" string="Logitech USB RECEIVER">
      <merge key="input.x11_options.ButtonMapping" type="string">1 2 3 4 5 6 7 2</merge>
    </match>
  </device>
 
</deviceinfo>

Now, whenever my Logitech mouse is connected, it gets the buttons remapped—this includes when resuming from suspend. Problem solved… until things are changed again of course!
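
For anyone wondering how to read the ButtonMapping string: the n-th number is the logical button that physical button n reports as. The following Python sketch just illustrates that lookup; the function name is made up and it has nothing to do with HAL or Xorg themselves.

```python
def parse_button_mapping(mapping):
    """Turn a ButtonMapping string into {physical_button: logical_button}.

    The n-th number in the string is the logical button reported when
    physical button n is pressed (buttons are numbered from 1).
    """
    return {physical: int(logical)
            for physical, logical in enumerate(mapping.split(), start=1)}


mapping = parse_button_mapping("1 2 3 4 5 6 7 2")
print(mapping[8])  # physical button 8 (the thumb button) reports as button 2
print(mapping[2])  # the real middle button still reports as button 2
```

Buttons 1 to 7 map to themselves, so only the thumb button changes behaviour.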

The Rise and Fail of Facebook

A recent Lifehacker story about a Greasemonkey script to remove quiz stuff from Facebook made me realise something: you know your site is too “web 2.0” when users are going out of their way to remove features. A quick search on Userscripts.org reveals that a substantial proportion of the Facebook-related scripts are either for removing features or auto-playing game applications.

I feel that Facebook started off pretty well.  The initial target audience seemed to be students, and that’s what most of the users were.  It was friendly and uncluttered, compared to other social networking sites like MySpace.  This spread to a wider audience, first sucking in kids and then adults.  This was less than ideal for some, the idea of their parents on a social networking site being horrifying, but still not a disaster.

But in the midst of all this, the apocalypse happened: a powerful API which allowed other people to extend the functionality of Facebook by creating applications that people could add to their account. Gradually users became inundated with an unstoppable torrent of application requests, caused by application developers making the “tell all your friends” step seem (or be) necessary to using their application. The average user’s profile became a mess of applications competing for screen space. The target audience became those who have the time to play endless Flash games, answer endless quizzes, and generally clutter the “requests” page of their friends.

And so Facebook became MySpace 2.0—gaudy and cluttered, with infinite ability to add more clutter.  Maybe the lesson learned is that the moment you allow anything “shiny” in a social networking site, your demographic will degenerate to schoolkids who want to fill their profiles with as much of it as possible.

Personally, I’ve reduced my Facebook interaction for the past 2 years to having the site email me when anybody says something directly to me either via message or on my “wall”.  There is just too much clutter now for me to wade through the endless application requests, friend requests from people I don’t know, and news feed items generated by the latest viral quiz.  And maybe pictures of friends getting drunk have lost their novelty value…

Spectacular IE7 CSS box model failure

When re-theming DokuWiki to fit in with the general look-and-feel of this blog, I thought it would look good to have buttons with images relevant to what they did. Having created the necessary CSS, I subjected it to my multi-browser test bed (actually a Windows XP virtual machine with the latest versions of the top 5 browsers running in it). The CSS I used was along the lines of this:

div.dokuwiki form.btn_show .button {
  padding: 1px 0 1px 16px;
  background: transparent url(images/page.png) 0px 1px no-repeat;
  border: none;
}

One thing that became apparent is that IE7 has absolutely no idea how to render this. The correct rendering (as produced by Firefox 3 and all the other browsers) is this:

[Image: button background image as rendered by Firefox 3]

IE7’s take on rendering this looked more like this:

[Image: button background image as rendered by IE7]

After a bit more experimentation, I came up with the following illustration which roughly shows what is happening:

The nice thing is that this is fixed in the upcoming IE8, so I’ve taken the stress-free approach; since it’s a relatively “minor” visual bug, there’s no sense in tearing my hair out trying to make it render correctly everywhere. On this occasion I’ve decided that people can either deal with the fact their browser is behind the times, or upgrade, or wait for the upgrade to happen for them.

Quick and dirty screen scraping of UK MP expenses data

A dominating story in the news recently has been that of UK Members of Parliament “abusing” the expenses system. As part of this, expense claims data has been released to the public, but unsurprisingly not in a simple CSV format that just anybody can play with. All I could find were PDF files and news sites representing the data in their own way.

This led me to wonder how easy it would be to “scrape” the data from one of these sites. Having heard about BeautifulSoup, a Python HTML “tag soup” parser, I opened up an interactive Python session and started to play with the data from this BBC News page. Luckily for me, the BBC page’s HTML isn’t too ugly, so figuring out how to get the data rows wasn’t that hard.

The end result is the following Python script which scrapes the data from the BBC News page and saves it in both CSV and JSON formats.

  mp-expenses-extract.py (2.3 KiB)

  mp-expenses.csv (58.8 KiB)

  mp-expenses.json (226.8 KiB)
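
To give a flavour of the approach without reproducing the script, here is a minimal sketch of the same idea: pull rows out of an HTML table, then emit CSV and JSON. Note that it swaps BeautifulSoup for the standard library’s html.parser so that it runs with no extra packages. The sample markup, column names and values are invented, and the real BBC page will be laid out differently.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Made-up sample markup; the real page's structure and columns will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Party</th><th>Total</th></tr>
  <tr><td>A. Member</td><td>Example Party</td><td>1234</td></tr>
  <tr><td>B. Member</td><td>Other Party</td><td>567</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(SAMPLE_HTML)
header, *data = parser.rows
records = [dict(zip(header, row)) for row in data]

# Write CSV to an in-memory buffer; use open("out.csv", "w", newline="") to save.
csv_buffer = io.StringIO()
writer = csv.writer(csv_buffer)
writer.writerow(header)
writer.writerows(data)

json_text = json.dumps(records, indent=2)
print(csv_buffer.getvalue())
print(json_text)
```

A tag soup parser earns its keep on messier markup than this; the hand-rolled parser above assumes cells and rows are closed properly.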

Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)

Browsing the visitor stats today, I found this little gem:

Ancient browser user agent

“Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)”?  Who uses such a browser?  My bet is on it being a zombie machine that somebody has completely forgotten about, which would explain the several other hits I have from versions of IE before 5.0…  These visitors turn up, hit the same page repeatedly, and then leave.  Unfortunately I deleted my spam before seeing this, otherwise I’d be able to verify this theory.  Maybe I’ll look more closely next time.
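
If I do look more closely next time, something like this Python sketch could pick the ancient IE hits out of a list of user agents. Only the first user agent below is from my stats; the other two are illustrative, and the cutoff of 5.0 simply mirrors the “before 5.0” remark above.

```python
import re

MSIE_VERSION = re.compile(r"MSIE (\d+)\.(\d+)")


def msie_version(user_agent):
    """Return the (major, minor) IE version claimed by a user agent, or None."""
    match = MSIE_VERSION.search(user_agent)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_ancient_ie(user_agent, cutoff=(5, 0)):
    """True if the user agent claims to be an IE version older than `cutoff`."""
    version = msie_version(user_agent)
    return version is not None and version < cutoff


agents = [
    "Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",  # illustrative
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/3.0",         # illustrative
]
for ua in agents:
    print(is_ancient_ie(ua), ua)
```

Of course a user agent string is only a claim, so this flags suspicious hits rather than proving anything about the machine behind them.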

Tpl - A tiny PHP template engine

While developing CodeScape Framework I came up with a simple template engine that, instead of implementing its own language, just provided some structure-related hooks. I’ve now got around to separating it out and releasing it as Tpl, a simple, self-contained, hierarchical template engine for PHP.

Tpl’s role is centred around template structure, not content—it’s not a huge library of helper functions for creating links, displaying images, etc.  Like any template engine, the aim is to separate presentation logic from application logic, but it doesn’t attempt this by creating a template language.  PHP is designed to be interleaved with “static” output, which means it’s already a template language.  Trying to replace it for the purpose of templating rarely adds anything useful, incurs hefty performance overheads (which you then have to mitigate with caching) and usually is more restrictive than PHP.  To me, that doesn’t seem like a sensible application of effort.  Instead Tpl provides its functionality in the form of hooks—functions which are called from within the template to specify the structure. Instead of PHP being replaced, it is extended.

The main motivating factor for creating Tpl is that all PHP template engines seem to be centred around inclusion rather than inheritance, something I only noticed after spending time working with Django.  For example, a typical PHP template might look like this:

<!-- header.php -->
<html>
    <head><title><?=$title?></title></head>
    <body>
        <div id="header">Website Name</div>
<!-- footer.php -->
        <div id="footer">Some footer text</div>
    </body>
</html>
<!-- content.php -->
<? include('header.php'); ?>
 
<div id="content">
    Some content
</div>
 
<? include('footer.php'); ?>

The main problem with this is that it bears no real relation to the logical structure of the page, just the order in which stuff should appear in the HTML. If you modify something in the header, you need to make sure you’ve updated the footer too. The equivalent Django template would look like this:

{# base.html #}
<html>
    <head><title>{{ title }}</title></head>
    <body>
        <div id="header">Website Name</div>
 
        {% block content %}{% endblock %}
 
        <div id="footer">Some footer text</div>
    </body>
</html>
{# content.html #}
{% extends "base.html" %}
 
{% block content %}
<div id="content">
    Some content
</div>
{% endblock %}

The point to note is that the structure is coherent and in one place, and the parts that change are defined as logical blocks which can be replaced. This is the method I’ve adopted in Tpl—the equivalent template would look like this:

<!-- base.php -->
<html>
    <head><title><?=$C['title']?></title></head>
    <body>
        <div id="header">Website Name</div>
 
        <? Tpl::block('content'); ?><? Tpl::endblock(); ?>
 
        <div id="footer">Some footer text</div>
    </body>
</html>
<!-- content.php -->
<? Tpl::inherit('base.php'); ?>
 
<? Tpl::block('content'); ?>
<div id="content">
    Some content
</div>
<? Tpl::endblock(); ?>

More information about Tpl and how to download it can be found at the Tpl project page.
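
To make the inheritance idea concrete, here is a toy Python sketch of block overriding. It is not how Tpl (or Django) is implemented: it only shows the principle that a child template supplies named blocks and the base template decides where they go. It borrows Django’s block markers for readability, handles no nested blocks, and the function names are my own.

```python
import re

# Matches {% block name %}...{% endblock %}; nested blocks are not supported.
BLOCK = re.compile(r"\{% block (\w+) %\}(.*?)\{% endblock %\}", re.DOTALL)


def parse_blocks(template):
    """Return {block_name: block_body} for every block in a template."""
    return {name: body for name, body in BLOCK.findall(template)}


def render(base, child):
    """Render `base`, replacing each block with the child's version if it has one."""
    overrides = parse_blocks(child)
    return BLOCK.sub(lambda m: overrides.get(m.group(1), m.group(2)), base)


BASE = (
    "<body>"
    "{% block content %}{% endblock %}"
    "<div id=\"footer\">{% block footer %}Some footer text{% endblock %}</div>"
    "</body>"
)
CHILD = "{% block content %}<div id=\"content\">Some content</div>{% endblock %}"

print(render(BASE, CHILD))
```

The child overrides the content block and says nothing about the footer, so the footer falls back to the default text from the base template.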

Enable Bitmap Fonts on Ubuntu Jaunty

I like to use tiny bitmap fonts like MonteCarlo for programming, but by default Ubuntu has bitmap font support turned off.  From (at least) Gutsy through to Intrepid, this method worked for enabling bitmap font support, but after installing the Jaunty beta I found this no longer works.

Luckily, after a brief look in /etc/fonts, I found that font configuration now follows the nice pattern of a conf.avail directory containing all the available configuration parts, and conf.d containing symlinks to these parts.  This makes enabling bitmap fonts even simpler now:

# "Un-disable" bitmap fonts
$ sudo rm /etc/fonts/conf.d/70-no-bitmaps.conf
# Clear the font cache
$ sudo fc-cache -f -v

Now you should be able to drop bitmap (i.e. PCF) fonts into ~/.fonts as you would with TTF fonts and be able to use them with no extra hassle.

VDB 0.1.0 (alpha) Released

Recently I’ve been working on a program that will let me do a full-text search of my 1500-file digital video library.  The result is something I’ve unimaginatively called “VDB”, short for “Video Database”.

This release doesn’t do much: you can add your library of video files, and it will let you do a full-text search on the filename.  You can double-click an item in the list to launch the video with ‘gnome-open’ (which should honour user settings for preferred applications).

Feel free to download it, mess around with it, and send me any feedback you have.  I have plenty of future plans for it, including “watch folders”, tagging, metadata, fetching information from IMDB, and being able to search on all of this information.  (Think queries like “actor:’johnny depp’ tag:crime”.)
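
None of that query syntax exists yet, but as a rough idea of what parsing it might involve, here is a hypothetical Python sketch. The field names, the sample record and the matching rule (case-insensitive substring) are all assumptions for illustration, not VDB’s actual design.

```python
import re

# field:'quoted value' or field:bareword
TERM = re.compile(r"(\w+):(?:'([^']*)'|(\S+))")


def parse_query(query):
    """Split a query such as "actor:'johnny depp' tag:crime" into (field, value) pairs."""
    return [(field, quoted or bare) for field, quoted, bare in TERM.findall(query)]


def matches(record, terms):
    """True if every term's value appears, case-insensitively, in that field."""
    return all(value.lower() in record.get(field, "").lower()
               for field, value in terms)


terms = parse_query("actor:'johnny depp' tag:crime")
print(terms)

# Hypothetical record; VDB 0.1.0 only stores filenames.
record = {"filename": "example.avi", "actor": "Johnny Depp", "tag": "crime drama"}
print(matches(record, terms))
```

Requiring every term to match (an implicit AND) seems the least surprising default, but that is a design choice still to be made.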

Visit the VDB project page.