May 17th, 2009 • css
When re-theming DokuWiki to fit in with the general look-and-feel of this blog, I thought it would look good to have buttons with images relevant to what they did. Having created the necessary CSS, I subjected it to my multi-browser test bed (actually a Windows XP virtual machine running the latest versions of the top five browsers). The CSS I used was along the lines of this:
div.dokuwiki form.btn_show .button {
    padding: 1px 0 1px 16px;
    background: transparent url(images/page.png) 0px 1px no-repeat;
    border: none;
}
One thing that became apparent is that IE7 has absolutely no idea how to render this. The correct rendering (as produced by Firefox 3 and all the other browsers) is this:

IE7's take on rendering this looked more like this:

After a bit more experimentation, I came up with the following illustration which roughly shows what is happening:

The nice thing is that this is fixed in the upcoming IE8, so I’ve taken the stress-free approach: since it’s a relatively “minor” visual bug, there’s no sense in tearing my hair out trying to make it render correctly everywhere. On this occasion I’ve decided that people can either deal with the fact that their browser is behind the times, or upgrade, or wait for the upgrade to happen for them.
May 13th, 2009 • politics, python
A dominating story in the news recently has been that of UK Members of Parliament “abusing” the expenses system. As part of this, expense claims data has been released to the public, but unsurprisingly not in a simple CSV format that just anybody can play with. All I could find were PDF files and news sites representing the data in their own way.
This led me to wonder how easy it would be to “scrape” the data from one of these sites. Having heard about BeautifulSoup, a Python HTML “tag soup” parser, I opened up an interactive Python session and started to play with the data from this BBC News page. Luckily for me, the BBC page’s HTML isn’t too ugly, so figuring out how to get the data rows wasn’t that hard.
The end result is the following Python script which scrapes the data from the BBC News page and saves it in both CSV and JSON formats.
mp-expenses-extract.py (2.3 KiB, 236 hits)
mp-expenses.csv (58.8 KiB, 257 hits)
mp-expenses.json (226.8 KiB, 218 hits)
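The attached script isn't reproduced here, but the general shape of the approach (pull the rows out of an HTML table, then write them as CSV and JSON) can be sketched roughly as follows. To keep the sketch dependency-free it uses the standard library's html.parser rather than BeautifulSoup, so it is a stand-in for the technique and not the script itself. The sample markup, the column names and the helper names (TableRowParser, scrape_table, to_csv) are all made up for illustration, and the real BBC page's structure is an unknown here.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def to_csv(records):
    """Serialise the records as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Made-up sample markup; names and figures are placeholders, not real data.
SAMPLE_HTML = """
<table>
  <tr><th>MP</th><th>Party</th><th>Total</th></tr>
  <tr><td>Example Member A</td><td>Party X</td><td>1000</td></tr>
  <tr><td>Example Member B</td><td>Party Y</td><td>2500</td></tr>
</table>
"""

records = scrape_table(SAMPLE_HTML)
csv_text = to_csv(records)
json_text = json.dumps(records, indent=2)
```

Building a list of dicts first means the CSV and JSON outputs come from the same intermediate data, so the two files cannot drift apart.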
May 11th, 2009
Browsing the visitor stats today, I found this little gem:

“Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)”? Who uses such a browser? My bet is on it being a zombie machine that somebody has completely forgotten about, which would explain the several other hits I have from versions of IE before 5.0… These visitors turn up, hit the same page repeatedly, and then leave. Unfortunately I deleted my spam before seeing this, otherwise I’d be able to verify this theory. Maybe I’ll look more closely next time.
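For that closer look next time, something like the following rough sketch could pick out hits from pre-5.0 versions of IE and count how often each one requested the same page. It assumes the hits are already available as (user agent, path) pairs; the function names, the sample paths and the second and third user-agent strings are invented for illustration, and only the first user agent is the one quoted above.

```python
import re
from collections import Counter

# Matches the "MSIE <major>.<minor>" token found in Internet Explorer
# user-agent strings.
MSIE_RE = re.compile(r"MSIE (\d+)\.(\d+)")


def msie_version(user_agent):
    """Return (major, minor) for an IE user agent, or None otherwise."""
    match = MSIE_RE.search(user_agent)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def suspicious_hits(hits, cutoff=(5, 0)):
    """Count hits per (user agent, path) for IE versions older than cutoff."""
    counts = Counter()
    for user_agent, path in hits:
        version = msie_version(user_agent)
        if version is not None and version < cutoff:
            counts[(user_agent, path)] += 1
    return counts


# Hypothetical log entries; the paths are placeholders.
OLD_IE = "Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)"
SAMPLE_HITS = [
    (OLD_IE, "/some-page/"),
    (OLD_IE, "/some-page/"),
    (OLD_IE, "/some-page/"),
    ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)", "/"),
    ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.10) "
     "Gecko/2009042316 Firefox/3.0.10", "/"),
]

counts = suspicious_hits(SAMPLE_HITS)
```

A user agent that shows up with a high count against a single path fits the "turn up, hit the same page repeatedly, and leave" pattern, though that alone wouldn't prove the zombie-machine theory.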