<?xml version="1.0"?>
<rss version="2.0">
<channel>
 <title>Benno's Blog</title>
 <link>http://benno.id.au/blog/</link>
 <description>Systems design and other random tech stuff</description>
 <language>en-au</language>
 <copyright>Ben Leslie 2006-09</copyright>
 <pubDate></pubDate>
 <lastBuildDate></lastBuildDate>
 <docs>http://blogs.law.harvard.edu/tech/rss</docs>
 <generator>Benno's Magical Content Generator</generator>
 <ttl>120</ttl>
 <webMaster>benno.blog@benno.id.au</webMaster>
  <item>
    <title>Building an AVR toolchain on OS X</title>
    <link>http://benno.id.au/blog/2010/07/09/building_avr_toolchain</link>
    <guid>http://benno.id.au/blog/2010/07/09/building_avr_toolchain</guid>
    <pubDate>Fri, 09 Jul 2010 19:49:17 +0000</pubDate>
    <description>&lt;p&gt;I’ve recently got myself a couple of &lt;a
href="http://www.arduino.cc/"&gt;Arduino&lt;/a&gt; Duemilanove boards from &lt;a
href="http://www.littlebirdelectronics.com/"&gt;Little Bird
Electronics&lt;/a&gt;.  Now the Arduino is a pretty cool bit of kit, and it
comes with a great integrated development environment, which makes
writing simple little programs a snap. Of course, I’m too much of an
old &lt;strong&gt;curmudgeon&lt;/strong&gt; to want to use an IDE and some
fancy-schmancy new language; I want &lt;code&gt;make&lt;/code&gt;, I want 
&lt;code&gt;emacs&lt;/code&gt;, I want &lt;code&gt;gcc&lt;/code&gt;. So I embarked on the
task of &lt;strong&gt;building a cross-compiler for the AVR board, that
will work on OS X&lt;/strong&gt;.&lt;/p&gt;

&lt;img src="http://benno.id.au/images/arduino.jpg" alt="Photo of my Arduino board" /&gt;

&lt;p&gt;Now, my ideal solution for this would have been to create a package
for &lt;a href="http://mxcl.github.com/homebrew/"&gt;Homebrew&lt;/a&gt;, my
current favourite package manager for OS X. Unfortunately, that just
isn’t going to happen right now. &lt;strong&gt;GCC toolchains pretty much
insist on putting things in &lt;code&gt;$prefix/$target&lt;/code&gt;&lt;/strong&gt;. So, in a
standard Homebrew install, that would mean I need a directory
&lt;code&gt;/usr/local/avr&lt;/code&gt;. Unfortunately, Homebrew insists
your package only dumps things in &lt;code&gt;etc&lt;/code&gt;, &lt;code&gt;bin&lt;/code&gt;,
&lt;code&gt;sbin&lt;/code&gt;, &lt;code&gt;include&lt;/code&gt;, &lt;code&gt;share&lt;/code&gt; or
&lt;code&gt;lib&lt;/code&gt;. Now, after wasting a bunch of time first fighting
GCC’s autogoat build system to try and convince it to conform
to Homebrew’s idea of a filesystem hierarchy, and then a whole
other bunch of time trying to learn Ruby and
convince Homebrew that other directories should be allowed, I took
the coward’s way out and decided &lt;strong&gt;&lt;code&gt;/opt/toolchains&lt;/code&gt; is
a perfectly respectable place for my cross-compilers to live.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And so, the cross compiler dance begins. &lt;strong&gt;We start with
&lt;a href="http://www.gnu.org/software/binutils/"&gt;binutils&lt;/a&gt;&lt;/strong&gt;,
and as the code dictates, we start by trying with the latest
version, which at time of writing is &lt;strong&gt;2.20.1&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Of course, building a toolchain for a non-x86 based architecture
wouldn’t be the same if you didn’t need to &lt;strong&gt;patch the
source&lt;/strong&gt;.  The patches for the avr are mostly minimal; adding
some extra device definitions, for devices you probably don’t have
anyway. If you miss this step, expect some pain when you try to compile
avr-libc. I got my patches from the &lt;a href="http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/ports/devel/avr-binutils/files/"&gt;FreeBSD AVR binutils&lt;/a&gt; patchset. I try to avoid
patches I don’t need so have only applied the &lt;a href="http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/ports/devel/avr-binutils/files/patch-newdevices"&gt;&lt;code&gt;patch-newdevices&lt;/code&gt;&lt;/a&gt; patch. If things break for you, you might want to &lt;strong&gt;try the other patches as well.&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;$ mkdir avr-toolchain-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd avr-toolchain-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ wget http://ftp.gnu.org/gnu/binutils/binutils-2.20.1.tar.bz2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ wget "http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/ports/devel/avr-binutils/files/patch-newdevices?rev=1.16;content-type=text%2Fplain" -O patch-newdevices&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ tar jxf binutils-2.20.1.tar.bz2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd binutils-2.20.1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ patch -p0 &amp;lt; ../patch-newdevices&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd ..&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ mkdir binutils-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd binutils-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ ../binutils-2.20.1/configure --target=avr --prefix=/opt/toolchains/ --disable-werror&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ make -j2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ make install&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd ../&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Fairly simple, the only gotcha is the
&lt;code&gt;--disable-werror&lt;/code&gt;, seems that binutils added the
&lt;code&gt;-Werror&lt;/code&gt; flag to the build, &lt;em&gt;good move!&lt;/em&gt;,
unfortunately, seems that the code isn’t warning free, so things barf
(&lt;em&gt;bad move!&lt;/em&gt;), but it comes with a free flag to disable the
error... that’s good.&lt;/p&gt;

&lt;p&gt;So, now we move on to &lt;a href="http://gcc.gnu.org/"&gt;GCC&lt;/a&gt;. Now,
since I’m really only interested in compiling C code, (remember the
bit about being a curmudgeon, none of this fancy C++ stuff for me,
no siree!), we can just grab GCC’s core package.&lt;/p&gt;

&lt;p&gt;Now &lt;strong&gt;GCC has some dependencies these days&lt;/strong&gt;, so you need to get &lt;a
href="http://gmplib.org/"&gt;gmp&lt;/a&gt;, &lt;a
href="http://www.multiprecision.org/"&gt;mpc&lt;/a&gt; and &lt;a
href="http://www.mpfr.org/"&gt;mpfr&lt;/a&gt; installed somehow. I suggest
using Homebrew. In fact it was this step with my old &lt;a
href="http://www.macports.org/"&gt;Mac Ports&lt;/a&gt; setup that forced me to
switch to Homebrew; too many weird library conflicts with iconv
between the OS X version and the ports version; but hey, your mileage
may vary! (&lt;strong&gt;Note:&lt;/strong&gt; make sure you have the latest
version of Homebrew that includes my fix for building mpfr).&lt;/p&gt;

&lt;p&gt;And of course, just like binutils, you need some patches as well.
Unfortunately things here aren’t as easy. There doesn’t appear to be
any semi-official patch for new devices for gcc 4.5.0! So, based on &lt;a
href="http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/avr-gcc/files/patch-newdevices?rev=1.21;content-type=text%2Fplain"&gt;this
patch&lt;/a&gt;, I managed to hack &lt;a
href="http://benno.id.au/drop/patch-gcc-4.5.0-avr-new-devices"&gt;something
together&lt;/a&gt;. The formats have changed a little, so I’m not 100%
confident that it works. If you are actually trying to use one of the
new devices in this patch, &lt;strong&gt;be a little skeptical, and
double-check my patch&lt;/strong&gt;. If you are using an existing supported
AVR like the &lt;code&gt;atmega328p&lt;/code&gt;, then the patch should work
fine (well, &lt;strong&gt;it works for me&lt;/strong&gt;).&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;$ brew install gmp libmpc mpfr&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ wget http://ftp.gnu.org/gnu/gcc/gcc-4.5.0/gcc-core-4.5.0.tar.bz2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ wget http://benno.id.au/drop/patch-gcc-4.5.0-avr-new-devices&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ tar jxf gcc-core-4.5.0.tar.bz2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd gcc-4.5.0&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ patch -p1 &amp;lt; ../patch-gcc-4.5.0-avr-new-devices&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd ..&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ mkdir gcc-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd gcc-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ ../gcc-4.5.0/configure --target=avr --prefix=/opt/toolchains&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ make -j2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ make install&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd ..&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So, the final piece to complete the toolchain is getting a C
library.  You almost certainly want &lt;a
href="http://www.nongnu.org/avr-libc/"&gt;AVR Libc&lt;/a&gt;, or &lt;strong&gt;you
can be really hard-core and go without a C library at all&lt;/strong&gt;, your
call.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;$ wget http://mirror.veriportal.com/savannah/avr-libc/avr-libc-1.7.0.tar.bz2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ tar jxf avr-libc-1.7.0.tar.bz2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ mkdir avr-libc-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ cd avr-libc-build&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ ../avr-libc-1.7.0/configure --prefix=/opt/toolchains --host=avr --build=&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ make&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$ make install&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This should work without any problems; however, if you messed up the new devices
patches when building GCC and binutils, you might get errors at this point.&lt;/p&gt;

&lt;p&gt;So, now we have a working GCC toolchain for the AVR we can start
programming. Of course, you really want to have a good reason to do
things this way, otherwise I really recommend &lt;strong&gt;just using
the Arduino development environment&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Next time I’ll be looking at getting some example C programs
running on the Arduino.&lt;/p&gt;

&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.nongnu.org/avr-libc/user-manual/install_tools.html"&gt;AVR Libc User Manual&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;    </description>  </item>

  <item>
    <title>Mass market smartphones</title>
    <link>http://benno.id.au/blog/2010/06/01/mass-market-smartphones</link>
    <guid>http://benno.id.au/blog/2010/06/01/mass-market-smartphones</guid>
    <pubDate>Tue, 01 Jun 2010 21:10:23 +0000</pubDate>
    <description>&lt;p&gt;The gap between so called &lt;a
href="http://en.wikipedia.org/wiki/Feature_phone"&gt;feature phones&lt;/a&gt;
and smart phones continues to shrink. Feature phones, with their
typically closed and proprietary operating systems, have generally had
an edge in terms of price by requiring fewer hardware resources and
forgoing a strong separation between the baseband processing software
and the primary user-interface software. However, with recent software
innovations there is no reason you can’t effectively run an open
smartphone OS such as Android on the hardware that is typically used
in feature phones today. In the near future, rather than having feature
phones and smart phones, we will have &lt;a
href="http://www.ok-labs.com/landing/mmsp/"&gt;mass-market
smartphones&lt;/a&gt; and high-end smartphones all running open operating 
systems such as Symbian S3 and Android.&lt;/p&gt;

&lt;p&gt;The first enabler is the &lt;a
href="http://www.ok-labs.com/blog/entry/context-switching-in-context/"&gt;fast
context-switching approach&lt;/a&gt;, which allows better performance of operating
systems that use protected processes, such as Symbian and Android, on ARM9
based CPUs (which are commonly used in feature phones).&lt;/p&gt;

&lt;p&gt;Secondly, continual improvement in smart-phone OS performance
enables those OSes to run on lower-end hardware. An example of this is
the recently announced &lt;a
href="http://android-developers.blogspot.com/2010/05/dalvik-jit.html"&gt;Dalvik
JIT&lt;/a&gt; in the latest Froyo release of Android.&lt;/p&gt;

&lt;p&gt;Finally, &lt;a
href="http://www.ok-labs.com/solutions/what-is-mobile-phone-virtualization"&gt;mobile
virtualization&lt;/a&gt; provides the separation guarantees required to
allow a baseband stack and an open OS to run on the same piece of
hardware.&lt;/p&gt;

&lt;p&gt;As much as I love getting and playing with fancy new hardware, I’m
looking forward to seeing a wider adoption of open OSes like Android
across a wider range of the mobile phone market.&lt;/p&gt;

    </description>  </item>

  <item>
    <title>Android Emulator Internals — Bus Scanning</title>
    <link>http://benno.id.au/blog/2010/02/09/android-emulator-internals</link>
    <guid>http://benno.id.au/blog/2010/02/09/android-emulator-internals</guid>
    <pubDate>Tue, 09 Feb 2010 15:14:27 +0000</pubDate>
    <description>&lt;p&gt;Wow, I can’t believe it has been over two years since I &lt;a
href="http://benno.id.au/blog/2007/11/29/android-qemu"&gt;last wrote&lt;/a&gt;
about Android’s fork of the QEMU emulator. Turns out there have
been some changes since I last looked at it.&lt;/p&gt;

&lt;p&gt;The most important is that the Android emulator no longer has a
fixed layout of devices in the physical memory address space. So,
while it may have previously been the case that the event device was
always at &lt;code&gt;0xff007000&lt;/code&gt;, now it might be at
&lt;code&gt;0xff008000&lt;/code&gt;, or &lt;code&gt;0xff009000&lt;/code&gt;, depending on what
other devices have been configured for a particular device
configuration.
&lt;/p&gt;

&lt;p&gt;Now, if a device may exist at some random physical address, how
does the OS know how to set up the device drivers? Well, as I’m
sure you’ve guessed, the addresses aren’t really &lt;em&gt;random&lt;/em&gt;; they
are located at page-offset addresses within a restricted range of
memory. OK, so how does the OS know what the range is? Well, there is
the &lt;code&gt;goldfish_device_bus&lt;/code&gt; device.&lt;/p&gt;

&lt;p&gt;Basically, this device provides a mechanism to enumerate the
devices on the bus. The driver writes &lt;code&gt;PDEV_BUS_OP_INIT&lt;/code&gt; to
the &lt;code&gt;PDEV_BUS_OP&lt;/code&gt; register, the
&lt;code&gt;goldfish_device_bus&lt;/code&gt; then raises an interrupt. The driver
then reads the &lt;code&gt;PDEV_BUS_OP&lt;/code&gt; register. Each time the value
is &lt;code&gt;PDEV_BUS_OP_ADD_DEV&lt;/code&gt;, the driver can read the other
registers such as &lt;code&gt;PDEV_BUS_IO_BASE&lt;/code&gt;,
&lt;code&gt;PDEV_BUS_IO_SIZE&lt;/code&gt;, &lt;code&gt;PDEV_BUS_IRQ&lt;/code&gt;, to determine
the properties of the new device. It continues doing this until it
reads a &lt;code&gt;PDEV_BUS_OP_DONE&lt;/code&gt;, which indicates the bus scan
has finished.&lt;/p&gt;
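
&lt;p&gt;The scan loop described above can be sketched in JavaScript against a
mock bus. This is only an illustration of the protocol, not the real Linux
driver: &lt;code&gt;MockBus&lt;/code&gt;, &lt;code&gt;scanBus&lt;/code&gt; and the sample devices
are made up, registers are addressed by name rather than by their real
offsets, and the interrupt is skipped entirely.&lt;/p&gt;

&lt;pre&gt;
// Hypothetical stand-in for the goldfish_device_bus (not the real device).
function MockBus(devices) {
  this.devices = devices;   // devices that the bus will report
  this.pending = [];        // devices not yet reported in this scan
  this.current = null;      // device whose properties can be read
}

MockBus.prototype.write = function (reg, value) {
  if (reg === "PDEV_BUS_OP") {
    if (value === "PDEV_BUS_OP_INIT") {
      // The real device would raise an interrupt here.
      this.pending = this.devices.slice();
    }
  }
};

MockBus.prototype.read = function (reg) {
  if (reg === "PDEV_BUS_OP") {
    if (this.pending.length === 0) {
      return "PDEV_BUS_OP_DONE";
    }
    this.current = this.pending.shift();
    return "PDEV_BUS_OP_ADD_DEV";
  }
  return this.current[reg];
};

// Driver side: start the scan, then read operations until DONE.
function scanBus(bus) {
  var found = [];
  bus.write("PDEV_BUS_OP", "PDEV_BUS_OP_INIT");
  var op = bus.read("PDEV_BUS_OP");
  while (op !== "PDEV_BUS_OP_DONE") {
    if (op === "PDEV_BUS_OP_ADD_DEV") {
      found.push({
        base: bus.read("PDEV_BUS_IO_BASE"),
        size: bus.read("PDEV_BUS_IO_SIZE"),
        irq: bus.read("PDEV_BUS_IRQ")
      });
    }
    op = bus.read("PDEV_BUS_OP");
  }
  return found;
}

// Sample devices; the sizes and IRQ numbers are invented for the sketch.
var bus = new MockBus([
  { PDEV_BUS_IO_BASE: 0xff007000, PDEV_BUS_IO_SIZE: 0x1000, PDEV_BUS_IRQ: 3 },
  { PDEV_BUS_IO_BASE: 0xff008000, PDEV_BUS_IO_SIZE: 0x1000, PDEV_BUS_IRQ: 4 }
]);
var devices = scanBus(bus);
console.log(devices.length, devices[0].base.toString(16));
&lt;/pre&gt;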

&lt;p&gt;The driver can determine what type of device it has found by
writing a pointer to the &lt;code&gt;PDEV_BUS_GET_NAME&lt;/code&gt; register. When
this happens the bus device writes the found device’s name (as an ASCII
string) to the memory the pointer refers to.&lt;/p&gt;

&lt;p&gt;Linux uses these strings to perform device to driver matching as
described in the &lt;a
href="http://www.mjmwired.net/kernel/Documentation/driver-model/platform.txt"&gt;Platform
Devices and Drivers&lt;/a&gt; document.&lt;/p&gt;
    </description>  </item>

  <item>
    <title>Weird things about JavaScript</title>
    <link>http://benno.id.au/blog/2010/01/04/weird-things-about-javascript</link>
    <guid>http://benno.id.au/blog/2010/01/04/weird-things-about-javascript</guid>
    <pubDate>Mon, 04 Jan 2010 21:43:58 +0000</pubDate>
    <description>&lt;p&gt;JavaScript keeps throwing up interesting new tidbits for me. One
that kind of freaked me out the other day is when functions and variables
are created in a given block of code.&lt;/p&gt;

&lt;p&gt;My expectation was that in a block of code, a sequence of statements,
the statements would be executed in order, one after another, just like
in most imperative languages. So for example in Python I can do something
like:&lt;/p&gt;

&lt;pre&gt;
 def foo(): pass
 print(foo)
&lt;/pre&gt;

&lt;p&gt;and expect output such as:&lt;/p&gt;

&lt;pre&gt;
 &amp;lt;function foo at 0x100407b18&amp;gt;
&lt;/pre&gt;

&lt;p&gt;However, I would not expect the same output if I wrote the code something like this:&lt;/p&gt;

&lt;pre&gt;
 print(foo)
 def foo(): pass
&lt;/pre&gt;

&lt;p&gt;and in fact, I don’t. I get something close to:&lt;/p&gt;

&lt;pre&gt;
  NameError: name 'foo' is not defined
&lt;/pre&gt;

&lt;p&gt;Now in JavaScript, by contrast, I can do something like:&lt;/p&gt;

&lt;pre&gt;
 function foo() { }
 console.log(foo);
&lt;/pre&gt;

&lt;p&gt;and the console will dutifully have &lt;code&gt;foo ()&lt;/code&gt; printed in the log.
However, and here comes the fun part, I can also do this:&lt;/p&gt;

&lt;pre&gt;
 console.log(foo);
 function foo() { }
&lt;/pre&gt;

&lt;p&gt;which blows my mind. I guess this really comes in handy when... nope, can’t think
of any good reason why this is a useful feature. (I’m sure there must be a good reason,
but damned if I can work it out.) But this is only where the fun begins, because the
same thing works for variables!&lt;/p&gt;
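
&lt;p&gt;As a quick check of the function case, here is a minimal sketch (the
names &lt;code&gt;greet&lt;/code&gt;, &lt;code&gt;seenType&lt;/code&gt; and
&lt;code&gt;seenValue&lt;/code&gt; are made up for illustration) showing that the
function can even be called before the line that declares it:&lt;/p&gt;

&lt;pre&gt;
// Both lines run before the declaration below is reached in the source.
var seenType = typeof greet;   // the function already exists here
var seenValue = greet();       // and can already be called

function greet() {
  return "hello";
}

console.log(seenType, seenValue);
&lt;/pre&gt;

&lt;p&gt;One practical consequence is that helper functions can sit below the
code that calls them.&lt;/p&gt;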

&lt;p&gt;Now usually in JavaScript, if you have a function and write code like:&lt;/p&gt;

&lt;pre&gt;
 function foo() { 
     x = 37;
 }
 foo();
 console.log("x:", x);
&lt;/pre&gt;

&lt;p&gt;you find out you’ve stuffed up, and accidentally written to the global object
because, for some brain-dead reason, when you assign to a variable that doesn’t
exist, JavaScript will merrily walk up the scope chain to the global object
and plonk it in there. If you do something like:&lt;/p&gt;

&lt;pre&gt;
 function foo() { 
     var x;
     x = 37;
 }
 foo();
 console.log("x:", x);
&lt;/pre&gt;

&lt;p&gt;the global &lt;code&gt;x&lt;/code&gt; will not be defined (assuming nothing else
has created one, the &lt;code&gt;console.log&lt;/code&gt; line throws a
&lt;code&gt;ReferenceError&lt;/code&gt;), because the &lt;code&gt;x =
37&lt;/code&gt; line updates the function’s locally scoped variable and does
not mess with the global object. But now comes the part that screws with
your head. You can just as easily write this as:&lt;/p&gt;

&lt;pre&gt;
 function foo() { 
     x = 37;
     var x;
 }
 foo();
 console.log("x:", x);
&lt;/pre&gt;

&lt;p&gt;and it will have &lt;em&gt;exactly the same effect&lt;/em&gt;. Now it is fairly clear
what is happening here; as a function is parsed any variables and functions 
are created at that time. It turns out though that although variables are
created, they are not initialised, so code such as:&lt;/p&gt;

&lt;pre&gt;
 var x = 12;
 function foo() { 
     console.log("x:", x);
     var x = 37;
 }
 foo();
&lt;/pre&gt;

&lt;p&gt;will output x as &lt;code&gt;undefined&lt;/code&gt;, (not 37, or 12). Now this
behaviour isn’t wrong, or necessarily bad, but it was certainly counter
to my expectation and experience in other languages.&lt;/p&gt;    </description>  </item>

  <item>
    <title>Monkey Patching Javascript</title>
    <link>http://benno.id.au/blog/2010/01/01/monkey-patching-javascript</link>
    <guid>http://benno.id.au/blog/2010/01/01/monkey-patching-javascript</guid>
    <pubDate>Fri, 01 Jan 2010 22:54:22 +0000</pubDate>
    <description>&lt;p&gt;Javascript is a very permissive language; you can go messing around
with the innards of other classes to your heart’s content. Of course the
question is &lt;em&gt;should you?&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Currently I’m playing around with the &lt;a
href="http://www.whatwg.org/specs/web-apps/current-work/multipage/comms.html"&gt;channel
messaging&lt;/a&gt; feature in HTML5. In a nutshell, this API lets you
create a &lt;code&gt;MessageChannel&lt;/code&gt;, which has two
&lt;code&gt;MessagePort&lt;/code&gt; objects associated with it. When you send a
message on port1 an event is triggered on port2 (and vice-versa). You send a message
by calling the &lt;code&gt;postMessage&lt;/code&gt; method on a port object. E.g:&lt;/p&gt;

&lt;pre&gt;
  port.postMessage("Hello world");
&lt;/pre&gt;

&lt;p&gt;This opens up a range of interesting possibilities for web
developers, but this blog post is about software design, not
cool HTML5 features.&lt;/p&gt;

&lt;p&gt;Unfortunately &lt;code&gt;postMessage&lt;/code&gt; is very new, and the implementation
has not yet caught up with the specification. Although you are meant to be
able to transfer objects using &lt;code&gt;postMessage&lt;/code&gt; currently only strings
are supported, and any other objects are coerced into strings. This has an
interesting side-effect. If we have code such as:&lt;/p&gt;

&lt;pre&gt;
  port.postMessage({"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;the receiver of the message ends up with the string
&lt;strong&gt;&lt;code&gt;[object Object]&lt;/code&gt;&lt;/strong&gt;, which is mostly
useless; actually it is completely useless. So, we want to transfer
structured data over a pipe that just supports standard strings, sounds
like a job for &lt;strong&gt;&lt;a href="http://www.json.org/"&gt;JSON&lt;/a&gt;&lt;/strong&gt;.
So, now my code ends up looking like:&lt;/p&gt;

&lt;pre&gt;
  port.postMessage(JSON.stringify({"op" : "test"}));
&lt;/pre&gt;

&lt;p&gt;Now this is all good, but it gets a bit tedious having to type that out
every time I want to post a message, so as a naïve approach we can simply create
a new function, &lt;code&gt;postObject&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;
 function postObject(port, object) {
   return port.postMessage(JSON.stringify(object));
 }

 postObject(port, {"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;OK, so this works, and it is pretty simple to use, but there is an
aesthetic issue here that grates just a bit: why do I have to do
&lt;code&gt;postObject(port, object)&lt;/code&gt;? Why can’t I do
&lt;code&gt;port.postObject(object)&lt;/code&gt;, which is more “object-oriented”?
Thankfully, JavaScript lets us &lt;a
href="http://en.wikipedia.org/wiki/Monkey_patch"&gt;monkey patch&lt;/a&gt; objects
at run time. So, if we do this:&lt;/p&gt;

&lt;pre&gt;
 port.postObject = function (object) {
   return this.postMessage(JSON.stringify(object));
 }
 
 port.postObject({"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;OK, so far so good. What problems does this have? Well, firstly we are 
creating a new function for each port object, which isn’t great for either 
execution time, or memory usage if we have a large number of ports. So instead
we could do:&lt;/p&gt;

&lt;pre&gt;
  function postObject(object) {
    return this.postMessage(JSON.stringify(object));
  }
  port.postObject = postObject;

  port.postObject({"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;This works well, except we have two problems. The
&lt;code&gt;postObject&lt;/code&gt; function ends up as a global function, and if
you called it as a global function, the &lt;code&gt;this&lt;/code&gt; parameter
would be the global, window object, rather than a port object, which
would be an easy mistake to make, and difficult to
debug. Additionally, we end up with additional per-object data for
storing the pointer. Thankfully JavaScript has a powerful mechanism
for solving both the problems: the &lt;a
href="http://helephant.com/2009/01/javascript-object-prototype/"&gt;prototype&lt;/a&gt;
object (not to be confused with the &lt;a
href="http://www.prototypejs.org/"&gt;javascript framework&lt;/a&gt; of the
same name).&lt;/p&gt;

&lt;p&gt;So, if we update the prototype object, instead of the object directly, we
don’t need to add the function to the global namespace, we avoid per-object
memory usage, and we avoid extra code having to remember to set it for every port
object:&lt;/p&gt;

&lt;pre&gt;
 MessagePort.prototype.postObject = function postObject(object) {
   return this.postMessage(JSON.stringify(object));
 } 

 port.postObject({"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;Now, this ends up working pretty well. The real question is
&lt;em&gt;should we do this&lt;/em&gt;? Or rather, which of these options should we
choose? There is an argument to be made that monkey-patching anything
is just plain wrong, and it should be avoided. For example rather than
extending the base &lt;code&gt;MessagePort&lt;/code&gt; class, we could create a
sub-class (Exactly how you create sub-classes in JavaScript is another
matter!).&lt;/p&gt;

&lt;p&gt;Unfortunately sub-classing doesn’t get us too far, as we are not the
ones directly creating the &lt;code&gt;MessagePort&lt;/code&gt; instance; the
&lt;code&gt;MessageChannel&lt;/code&gt; constructor does this for us. (I guess we 
could monkey-patch &lt;code&gt;MessageChannel&lt;/code&gt;, but that defeats the
purpose of avoiding monkey-patching!).&lt;/p&gt;

&lt;p&gt;Of course, another option would be to create a class that encapsulates
a port, taking one as a constructor argument. E.g:&lt;/p&gt;

&lt;pre&gt;
 function MyPort(port) {
   this.port = port;
 }

 MyPort.prototype.postObject = function postObject(object) {
   this.port.postMessage(JSON.stringify(object));
 }

 port = new MyPort(port);
 port.postObject({"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;Of course, this means we have to remember to wrap all our port
objects in this &lt;code&gt;MyPort&lt;/code&gt; class. This is kind of messy (in
my opinion). Also we can no longer call the standard port methods. Of
course, we could create wrappers for all these methods too, but then
things are getting quite verbose, and we are stuffed if it comes to
inspecting properties.&lt;/p&gt;

&lt;p&gt;Unlike some other object-oriented languages, Javascript provides
another alternative, we could change the class (i.e: prototype) of the object at runtime.
E.g:&lt;/p&gt;

&lt;pre&gt;
 function MyPort(port) { }
 MyPort.deriveFrom(MessagePort); /* Assume this is how we create sub-classes */

 MyPort.prototype.postObject = function (object) {
    this.postMessage(JSON.stringify(object));
 }

 port.__proto__ = MyPort.prototype; /* Change class at runtime */
 port.postObject({"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;This solves most of the problems of the encapsulated approach, but
we still have to remember to adapt the object, and additionally, &lt;code&gt;__proto__&lt;/code&gt;
is a non-standard extension.&lt;/p&gt;

&lt;p&gt;OK, so after quickly looking at the sub-classing approaches I think it is
fair to discount them. We are still left with trying to determine if any of
the monkey-patching approaches is better than a simple function call.&lt;/p&gt;

&lt;p&gt;So, there is mostly a consensus out there that monkey patching the base object
&lt;a href="http://erik.eae.net/archives/2005/06/06/22.13.54/"&gt;is verboten&lt;/a&gt;, but what
about other objects?&lt;/p&gt;

&lt;p&gt;Well, if you are in your own code, I think it is a case of anything
goes, but what if we are providing reusable code modules for other
people? (Of course, even in your own code, there might be libraries
that you include that are affected by your overloading).  When base
objects start working in weird and wonderful ways just because you
import a module, debugging becomes quite painful. So I think
&lt;em&gt;changing&lt;/em&gt; the underlying implementation (like the example does
when monkey-patching the &lt;code&gt;postMessage&lt;/code&gt; method) should
probably be avoided.&lt;/p&gt;

&lt;p&gt;OK, so now the choices are down to function vs. add a method to the
built-in class’s prototype. So if we just add a new global function we
could be conflicting with any libraries that also name a global
function in the same way. If we add a method to the prototype, at
least we are limiting our messing with the namespace to just the
&lt;code&gt;MessagePort&lt;/code&gt; object; but really, neither option is
ideal.&lt;/p&gt;

&lt;p&gt;The accepted way to get around this problem is to create
a module specific namespace. This reduces the number of potential 
conflicts. E.g:&lt;/p&gt;

&lt;pre&gt;
 var Benno = {};

 Benno.postObject = function postObject(port, object) {
   return port.postMessage(JSON.stringify(object));
 }

 Benno.postObject(port, {"op" : "test"});
&lt;/pre&gt;

&lt;p&gt;Now, this avoids polluting the global namespace (except for the
single &lt;code&gt;Benno&lt;/code&gt; object). So, it would have to come out above
the prototype extension approach. Now, we should consider if it is possible
to play any namespace tricks with the prototype approach. It might be nice
to think we could do something like:&lt;/p&gt;

&lt;pre&gt;
  MessagePort.prototype.Benno = {}
  MessagePort.prototype.Benno.postObject = function postObject(object) {
    return this.postMessage(JSON.stringify(object));
  }

  port.Benno.postObject(object);
&lt;/pre&gt;

&lt;p&gt;
but this doesn’t work because of the way in which methods and the
&lt;code&gt;this&lt;/code&gt; object work. &lt;code&gt;this&lt;/code&gt; in the function ends
up referring to the &lt;code&gt;Benno&lt;/code&gt; module object, rather than the
&lt;code&gt;MessagePort&lt;/code&gt; object.  &lt;/p&gt;
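
&lt;p&gt;The &lt;code&gt;this&lt;/code&gt; problem can be seen in a small sketch that
runs outside a browser. &lt;code&gt;FakePort&lt;/code&gt; is a made-up stand-in for
&lt;code&gt;MessagePort&lt;/code&gt; that just records what it is asked to send; it is
not part of any real API.&lt;/p&gt;

&lt;pre&gt;
// Hypothetical stand-in for MessagePort: records posted messages.
function FakePort() {
  this.sent = [];
}
FakePort.prototype.postMessage = function (message) {
  this.sent.push(message);
};

// The nested-namespace idea from above, applied to the stand-in.
FakePort.prototype.Benno = {};
FakePort.prototype.Benno.postObject = function postObject(object) {
  return this.postMessage(JSON.stringify(object));
};

var port = new FakePort();
var failed = false;
try {
  // Here "this" is the Benno object, which has no postMessage method.
  port.Benno.postObject({"op" : "test"});
} catch (e) {
  failed = e instanceof TypeError;
}

// Supplying the receiver explicitly works, but defeats the purpose.
port.Benno.postObject.call(port, {"op" : "test"});

console.log(failed, port.sent);
&lt;/pre&gt;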

&lt;p&gt;Even assuming this did work, the function approach has some additional
benefits. If the user wants to reduce the typing they can do something like:&lt;/p&gt;

&lt;pre&gt;
   var $B = Benno;
   $B.postObject(port, object);
&lt;/pre&gt;

&lt;p&gt;or even, &lt;/p&gt;

&lt;pre&gt;
  var $P = Benno.postObject;
  $P(port, object);
&lt;/pre&gt;

&lt;p&gt;The other advantage of this scheme is that for someone debugging
the code it should be much more obvious where to look for the code and
to understand what is happening. If you were reading code and saw
&lt;code&gt;Benno.postObject(port, object)&lt;/code&gt;, it would be much more
obvious where the code came from, and where to start looking to debug
things.&lt;/p&gt;

&lt;p&gt;So, in conclusion, the best approach is also the simplest: just 
write a function. (But put it in a decent namespace first). Sure
&lt;code&gt;instance.operation(args)&lt;/code&gt; looks nicer than 
&lt;code&gt;operation(instance, args)&lt;/code&gt;, but in the end the ability
to namespace the function, along with the advantage of making a clear
distinction between built-in and added functionality means that
the latter solution wins the day in my eyes.&lt;/p&gt;

&lt;p&gt;If you have some other ideas on this I’d love to hear them, so please
drop me an e-mail. Thanks to &lt;a href="http://code.lardcave.net/"&gt;Nicholas&lt;/a&gt;
for his insights here.&lt;/p&gt;
    </description>  </item>

  <item>
    <title>HTML5 FileApi and Jpeg Meta-data</title>
    <link>http://benno.id.au/blog/2009/12/30/html5-fileapi-jpegmeta</link>
    <guid>http://benno.id.au/blog/2009/12/30/html5-fileapi-jpegmeta</guid>
    <pubDate>Wed, 30 Dec 2009 10:03:17 +0000</pubDate>
    <description>&lt;body&gt;

&lt;p&gt;I’m really impressed with the way the HTML5 spec is going, and
the fact that it is quickly going to become the default choice for
portable application development.&lt;/p&gt;


&lt;p&gt;One of the latest additions to help support application
development is the &lt;a href="http://www.w3.org/TR/FileAPI/"&gt;File
API&lt;/a&gt;. This API enables a developer to gain access to the contents
of local files. The main new data structure that a developer is
provided with is the &lt;code&gt;FileList&lt;/code&gt; object, which represents an
array of &lt;code&gt;File&lt;/code&gt; objects. &lt;code&gt;FileList&lt;/code&gt; objects
can be obtained from two places: &lt;code&gt;input&lt;/code&gt; form elements
and drag &amp;amp; drop &lt;code&gt;DataTransfer&lt;/code&gt; objects.&lt;/p&gt;

&lt;p&gt;Based on this latest API, I’ve created a simple library, &lt;a
href="http://code.google.com/p/jsjpegmeta/"&gt;JsJpegMeta&lt;/a&gt; for parsing
Jpeg meta data.&lt;/p&gt;

&lt;p&gt;I’ve hacked together an &lt;a
href="http://benno.id.au/jpegmetaexample/"&gt;example&lt;/a&gt; that
demonstrates the library. Just select a JPEG file from the form, or
drag a JPEG file onto the window. For large JPEG files you might need
to be a little bit patient, as it can be a little slow. This slowness, surprisingly,
doesn’t appear to be the JavaScript part, but rather Firefox’s handling of large
&lt;code&gt;data:&lt;/code&gt; URLs and JPEG display in general.&lt;/p&gt;

&lt;p&gt;The rest of this post goes into some of the details. Unfortunately only &lt;a
href="https://developer.mozilla.org/En/Firefox_3.6_for_developers"&gt;Firefox
3.6&lt;/a&gt; supports these new APIs right now.&lt;/p&gt;

&lt;h2&gt;Using the File API&lt;/h2&gt;

&lt;p&gt;Here is an example of how to get access to a &lt;code&gt;FileList&lt;/code&gt;.
When the user chooses a file, the &lt;code&gt;onchange&lt;/code&gt; handler calls the JavaScript function
&lt;code&gt;loadFiles&lt;/code&gt; (assuming you have already defined that
function).&lt;/p&gt;

&lt;pre&gt;
  &amp;lt;form id="form" action="javascript:void(0)"&amp;gt;
    &amp;lt;p&amp;gt;Choose file: &amp;lt;input type="file" onchange="loadFiles(this.files)" /&amp;gt;&amp;lt;/p&amp;gt;
  &amp;lt;/form&amp;gt;
&lt;/pre&gt;

&lt;p&gt;A &lt;code&gt;File&lt;/code&gt; object just provides a reference to a file; to
actually get some data out of the file you need to use a
&lt;code&gt;FileReader&lt;/code&gt; object. The &lt;code&gt;FileReader&lt;/code&gt; object
provides an asynchronous API for reading the file data into
memory. It offers three different methods:
&lt;code&gt;readAsBinaryString&lt;/code&gt;,
&lt;code&gt;readAsText&lt;/code&gt; and &lt;code&gt;readAsDataURL&lt;/code&gt;. A callback,
&lt;code&gt;onloadend&lt;/code&gt;, is executed when the file has been read into
memory; the data is then available via the &lt;code&gt;result&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;Here is an example of what the &lt;code&gt;loadFiles&lt;/code&gt; function might look like:&lt;/p&gt;

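&lt;pre&gt;
function loadFiles(files) {
    var binary_reader = new FileReader();

    /* Remember which file is being read, so the callback can report its name */
    binary_reader.file = files[0];

    /* Called once the read has finished; the data is in this.result */
    binary_reader.onloadend = function() {
        alert("Loaded file: " + this.file.name + " length: " + this.result.length);
    };

    /* Start the asynchronous read; this call returns immediately */
    binary_reader.readAsBinaryString(files[0]);

    $("form").reset();
}
&lt;/pre&gt;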

&lt;p&gt;Note that &lt;code&gt;$("form").reset();&lt;/code&gt; clears the input form. This relies on a &lt;code&gt;$&lt;/code&gt; helper that looks up an element by its id.&lt;/p&gt;

&lt;h2&gt;Drag &amp;amp; Drop&lt;/h2&gt;

&lt;p&gt;Forms are not the only way to get a &lt;code&gt;FileList&lt;/code&gt;; you can
also get files from &lt;a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html"&gt;drag and drop&lt;/a&gt;
events. You need to handle three events: &lt;code&gt;dragenter&lt;/code&gt;, &lt;code&gt;dragover&lt;/code&gt; and &lt;code&gt;drop&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;
&amp;lt;body ondragenter="dragEnterHandler(event)" ondragover="dragOverHandler(event)" ondrop="dropHandler(event)"&amp;gt;
&lt;/pre&gt;

&lt;p&gt;The default handling of these is fairly straightforward:&lt;/p&gt;

&lt;pre&gt;
function dragEnterHandler(e) { e.preventDefault(); }
function dragOverHandler(e) { e.preventDefault(); }
function dropHandler(e) {
    e.preventDefault();
    loadFiles(e.dataTransfer.files);
}
&lt;/pre&gt;

&lt;h2&gt;Parsing files&lt;/h2&gt;

&lt;p&gt;The interesting thing here is &lt;code&gt;readAsBinaryString&lt;/code&gt;:
when this method is used, &lt;code&gt;result&lt;/code&gt; ends up being a
&lt;strong&gt;binary string&lt;/strong&gt;. This is pretty new because, as far
as I know, there hasn’t really been a good way to access binary data
in JavaScript before. Each character in the binary string represents a
byte, and has a character code in the range [0..255].&lt;/p&gt;
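
&lt;p&gt;To make the binary string idea concrete, here is a minimal sketch that
turns such a string into an array of byte values. The helper name
&lt;code&gt;binaryStringToBytes&lt;/code&gt; is made up for this example, and the sample
string is built by hand rather than read from a real file:&lt;/p&gt;

&lt;pre&gt;
// Hypothetical helper (not part of the File API): turn a binary string,
// such as a FileReader result, into an array of byte values.
function binaryStringToBytes(data) {
    var bytes = [];
    for (var i = 0; i !== data.length; i++) {
        // Each character stands for one byte, so its code is in 0..255.
        bytes.push(data.charCodeAt(i));
    }
    return bytes;
}

// Hand-built sample: the two bytes of the JPEG SOI marker.
var sample = String.fromCharCode(0xff, 0xd8);
var bytes = binaryStringToBytes(sample);
// bytes is [255, 216]
&lt;/pre&gt;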

&lt;p&gt;This is great, because it means that we can parse binary strings
locally, without having to upload files to a server for
processing. Unfortunately there isn’t a great deal of support for
handling binary data in JavaScript; there isn’t anything like Python’s
&lt;a href="http://docs.python.org/library/struct.html"&gt;struct&lt;/a&gt;
module.&lt;/p&gt;

&lt;p&gt;Luckily it isn’t too hard to write something close to this. Mostly
we want to parse unsigned and signed integers of arbitrary
length. To be useful, we need to handle both little and big &lt;a
href="http://en.wikipedia.org/wiki/Endianness"&gt;endianness&lt;/a&gt;. A very simple
implementation of parsing an unsigned integer is:&lt;/p&gt;

&lt;pre&gt;
    function parseNum(endian, data, offset, size) {
        var i;
        var ret = 0;
        var big_endian = (endian === "&gt;");
        if (offset === undefined) offset = 0;
        if (size === undefined) size = data.length - offset;
        /* Always visit the most significant byte first */
        for (i = big_endian ? offset : offset + size - 1;
             big_endian ? i &lt; offset + size : i &gt;= offset;
             big_endian ? i++ : i--) {
            /* Multiply rather than shift, so that 32-bit values with the
               top bit set do not wrap around to negative numbers */
            ret = ret * 256 + data.charCodeAt(i);
        }
        return ret;
    }
&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;endian&lt;/code&gt; specifies the endianness: the string literal "&gt;"
for big-endian and "&lt;" for little-endian (copying the Python struct
module). &lt;code&gt;data&lt;/code&gt; is the binary data to parse. An
&lt;code&gt;offset&lt;/code&gt; can be specified to enable parsing from the middle
of a binary structure; this defaults to zero. The &lt;code&gt;size&lt;/code&gt; of
the integer to parse can also be specified; it defaults to the
remainder of the string.&lt;/p&gt;
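
&lt;p&gt;As a worked example of how an offset and a size pick individual fields
out of a larger structure, here is a sketch that reads the first few fields of a
hand-built JPEG header. To keep it runnable on its own it uses a hypothetical,
big-endian-only helper called &lt;code&gt;readUintBE&lt;/code&gt; rather than
&lt;code&gt;parseNum&lt;/code&gt; itself:&lt;/p&gt;

&lt;pre&gt;
// Hypothetical big-endian-only reader, so this example runs on its own.
function readUintBE(data, offset, size) {
    var value = 0;
    for (var i = offset; i !== offset + size; i++) {
        // The most significant byte comes first in big-endian data.
        value = value * 256 + data.charCodeAt(i);
    }
    return value;
}

// Hand-built start of a JPEG file: SOI marker, APP0 marker, segment length.
var header = String.fromCharCode(0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10);
var soi = readUintBE(header, 0, 2);           // 0xffd8
var app0 = readUintBE(header, 2, 2);          // 0xffe0
var segmentLength = readUintBE(header, 4, 2); // 16
&lt;/pre&gt;

&lt;p&gt;A little-endian reader would walk the same bytes in the opposite
direction.&lt;/p&gt;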

&lt;p&gt;Signed integers require a little bit more work. Although there are
multiple ways of &lt;a
href="http://en.wikipedia.org/wiki/Signed_number_representations"&gt;representing
signed numbers&lt;/a&gt;, by far the most common is the &lt;a
href="http://en.wikipedia.org/wiki/Two%27s_complement"&gt;two’s
complement&lt;/a&gt; method. A function that has the same inputs as &lt;code&gt;parseNum&lt;/code&gt;
is:&lt;/p&gt;

&lt;pre&gt;
    function parseSnum(endian, data, offset, size) {
        var i;
        var ret = 0;
        var neg;
        var big_endian = (endian === "&gt;");
        if (offset === undefined) offset = 0;
        if (size === undefined) size = data.length - offset;
        /* Always visit the most significant byte first */
        for (i = big_endian ? offset : offset + size - 1;
             big_endian ? i &lt; offset + size : i &gt;= offset;
             big_endian ? i++ : i--) {
            if (neg === undefined) {
                /* Negative if top bit is set */
                neg = (data.charCodeAt(i) &amp; 0x80) === 0x80;
            }
            /* If it is negative we invert the bits */
            ret = ret * 256 + (neg ? ~data.charCodeAt(i) &amp; 0xff : data.charCodeAt(i));
        }
        if (neg) {
            /* If it is negative we do two's complement */
            ret += 1;
            ret *= -1;
        }
        return ret;
    }
&lt;/pre&gt;
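
&lt;p&gt;For checking results by hand, an equivalent shortcut (not the method used
above) is to read the value as unsigned and then subtract 2 to the power of the
bit width whenever the top bit is set. A minimal sketch, with a hypothetical
helper name &lt;code&gt;toSigned&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;
// Hypothetical helper: reinterpret an unsigned value that is `size` bytes
// wide as a two's complement signed value.
function toSigned(value, size) {
    var limit = Math.pow(2, 8 * size); // number of distinct values
    // The top bit is set exactly when the value is in the upper half.
    return (value >= limit / 2) ? value - limit : value;
}

var a = toSigned(0xff, 1);   // -1
var b = toSigned(0x80, 1);   // -128
var c = toSigned(0x7f, 1);   // 127
var d = toSigned(0xfffe, 2); // -2
&lt;/pre&gt;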

&lt;h2&gt;JpegMeta API&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://code.google.com/p/jsjpegmeta/"&gt;JpegMeta&lt;/a&gt; is a
simple, pure JavaScript library for parsing JPEG meta-data. To use it,
include the &lt;code&gt;jpegmeta.js&lt;/code&gt; file. This creates a single,
global, module object &lt;code&gt;JpegMeta&lt;/code&gt;. The &lt;code&gt;JpegMeta&lt;/code&gt;
module object has one public interface, the &lt;code&gt;JpegFile&lt;/code&gt;
class, which you can use to construct new &lt;code&gt;JpegFile&lt;/code&gt; instances. The input
is a binary string (for example, as returned from a &lt;code&gt;FileReader&lt;/code&gt; object).
An example is:&lt;/p&gt;

&lt;pre&gt;
	var jpeg = new JpegMeta.JpegFile(this.result, this.file.name);
&lt;/pre&gt;

&lt;p&gt;After creation you can then access various meta-data properties,
categorised by meta-data groups. The main groups of meta-data are:&lt;/p&gt;

&lt;dl&gt;
&lt;dt&gt;general&lt;/dt&gt;&lt;dd&gt;Information extracted from the JPEG SOF segment. In particular
the height, width and colour depth.&lt;/dd&gt;
&lt;dt&gt;jfif&lt;/dt&gt;&lt;dd&gt;Meta-data from the &lt;a href="http://en.wikipedia.org/wiki/JPEG_File_Interchange_Format"&gt;JFIF&lt;/a&gt;
APP0 segment. This usually includes resolution, aspect ratio and colour space meta-data.&lt;/dd&gt;
&lt;dt&gt;tiff&lt;/dt&gt;&lt;dd&gt;Generic TIFF meta-data extracted from the Exif meta-data APP1 segment. Includes things such as camera
make and model, orientation and date-time.&lt;/dd&gt;
&lt;dt&gt;exif&lt;/dt&gt;&lt;dd&gt;Exif specific meta-data extracted from the Exif meta-data APP1 segment. Includes camera specific things
such as white balance, flash, metering mode, etc.&lt;/dd&gt;
&lt;dt&gt;gps&lt;/dt&gt;&lt;dd&gt;GPS related information extracted from the Exif meta-data APP1 segment. Includes latitude, longitude etc.&lt;/dd&gt;
&lt;/dl&gt;

&lt;p&gt;Meta-data groups can be accessed directly, for example:&lt;/p&gt;

&lt;pre&gt;
 var group = jpeg.gps;
&lt;/pre&gt;

&lt;p&gt;A lookup table is also provided: &lt;code&gt;jpeg.metaGroups&lt;/code&gt;. This
associative array can be used to determine which meta-groups a
particular jpeg file instance actually has.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;MetaGroup&lt;/code&gt; object has a name field, a description field and
an associative array of properties.&lt;/p&gt;

&lt;p&gt;Properties in a given group can be accessed directly. For example:&lt;/p&gt;

&lt;pre&gt;
 var lat = jpeg.gps.latitude;
&lt;/pre&gt;

&lt;p&gt;Alternatively, the &lt;code&gt;metaProps&lt;/code&gt; associative array
can be used to determine which properties are available.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;metaProp&lt;/code&gt; object has a &lt;code&gt;name&lt;/code&gt; field,
&lt;code&gt;description&lt;/code&gt; field, and also a &lt;code&gt;value&lt;/code&gt;
field.&lt;/p&gt;
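
&lt;p&gt;Putting the two lookup tables together, here is a rough sketch of listing
every property a file has. It assumes that each entry in
&lt;code&gt;metaGroups&lt;/code&gt; is a group object carrying its own
&lt;code&gt;metaProps&lt;/code&gt; table, and it runs against a hand-made stand-in
object (with invented values) rather than a real &lt;code&gt;JpegFile&lt;/code&gt;
instance. The function name &lt;code&gt;describeMetaData&lt;/code&gt; is hypothetical:&lt;/p&gt;

&lt;pre&gt;
// Sketch only: assumes each entry of jpeg.metaGroups is a group object
// whose metaProps table holds objects with name, description and value.
function describeMetaData(jpeg) {
    var lines = [];
    for (var groupName in jpeg.metaGroups) {
        var group = jpeg.metaGroups[groupName];
        for (var propName in group.metaProps) {
            var prop = group.metaProps[propName];
            lines.push(group.description + " / " + prop.description + ": " + prop.value);
        }
    }
    return lines;
}

// Hand-made stand-in for a JpegFile instance, so the sketch runs without
// the library; the values are invented for illustration.
var fakeJpeg = {
    metaGroups: {
        general: {
            name: "general",
            description: "General",
            metaProps: {
                pixelWidth: { name: "pixelWidth", description: "Pixel Width", value: 640 }
            }
        }
    }
};
var lines = describeMetaData(fakeJpeg);
// lines is ["General / Pixel Width: 640"]
&lt;/pre&gt;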

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The File API adds a powerful new capability to native HTML5 applications.&lt;/p&gt;
    </description>  </item>

  <item>
    <title>Un-shortening twitter URLs</title>
    <link>http://benno.id.au/blog/2009/11/08/urlunshortener</link>
    <guid>http://benno.id.au/blog/2009/11/08/urlunshortener</guid>
    <pubDate>Sun, 08 Nov 2009 17:29:55 +0000</pubDate>
    <description>&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;I got a &lt;a
href="http://twitter.com/benno37/status/1592017366"&gt;little annoyed at
twitter&lt;/a&gt; the other day for gratuitously tinyurl-ing a link, even
when my message was under the magic 140 characters. Anyway, as far as I can
tell, there isn’t really any way to avoid this.&lt;/p&gt;

&lt;p&gt;As it happens there has been quite a kerfuffle about this in the
blog-o-sphere of late, with &lt;a
href="http://joshua.schachter.org/"&gt;Josh Schachter&lt;/a&gt; (founder of &lt;a
href="http://del.icio.us"&gt;del.icio.us&lt;/a&gt;) discussing some of the
problems with &lt;a
href="http://joshua.schachter.org/2009/04/on-url-shorteners.html"&gt;URL shorteners&lt;/a&gt;.
While I agree with most of what is written there, I think some of his
points are a little over the top. Personally, I don’t think that URL
shorteners are &lt;a
href="http://www.techcrunch.com/2009/04/06/are-url-shorteners-a-necessary-evil-or-just-evil/"&gt;evil&lt;/a&gt;,
but they can certainly be annoying. I regularly &lt;tt&gt;mouseover&lt;/tt&gt; links
to see where they are going, and like the visual feedback of
seeing visited links, both of which are broken by URL shorteners.&lt;/p&gt;

&lt;p&gt;So, for my problem, there is a pretty easy solution: somehow expand
the URL, and show that instead of the shortened URL. There are clearly already
tools that do this, but I was interested in seeing if this could be
done purely within the browser, using JavaScript and standard (or proposed standard) DOM
APIs.&lt;/p&gt;

&lt;h2&gt;Overall approach&lt;/h2&gt;

&lt;p&gt;Now, in general, these URL shortening services do the &lt;em&gt;“right
thing”&lt;/em&gt;, and provide a 301 permanent redirect response. So my
basic thinking is something like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;get the tweets from twitter&lt;/li&gt;
&lt;li&gt;find all the URLs in the tweets&lt;/li&gt;
&lt;li&gt;for each URL, do an HTTP HEAD request&lt;/li&gt;
&lt;li&gt;if the response code is 301, replace the link text (and href) with the 
response location.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;... and the plan is to try and do this on the client. Now, if you’ve done
something like this before, you can probably already guess all the pitfalls!&lt;/p&gt;

&lt;h2&gt;Render the tweet data&lt;/h2&gt;

&lt;p&gt;Twitter quite conveniently provides its data in a variety of useful
machine readable formats. The most useful for my purposes here is the
&lt;a href="http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-statuses-user_timeline"&gt;twitter XML format&lt;/a&gt;.
Now, I’m leaving it as an exercise for the reader how to get an (authenticated)
bit of twitter XML into the client at run-time. For now, I’ve downloaded
my latest timeline in XML format and made it available at &lt;a href="http://benno.id.au/twit/twit.xml"&gt;http://benno.id.au/twit/twit.xml&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, in these days of JSON and JavaScript, no-one much seems to
like the &lt;a href="http://www.w3.org/TR/xslt"&gt;XSLT&lt;/a&gt; language, but I’m quite
partial to it, and as far as I’m concerned it is the best way to take
one XML document (the tweet timeline) and convert it into another
XML document (the HTML output). (Technically, I’ll be converting it into
an XML document fragment.)&lt;/p&gt;

&lt;p&gt;I’ll leave out how to create an XSLT processor in Javascript as an exercise
for the reader, but the XSLT script itself is of some interest. At least I think so!&lt;/p&gt;

&lt;p&gt;Now, for the most part converting the Twitter status element into appropriate
&amp;lt;div&amp;gt; and &amp;lt;p&amp;gt; tags is straightforward:&lt;/p&gt;

&lt;pre&gt;
&amp;lt;?xml version="1.0" encoding="ISO-8859-1"?&amp;gt;
&amp;lt;xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;
&amp;lt;xsl:output method="html" /&amp;gt;
&amp;lt;xsl:template match="statuses"&amp;gt;
&amp;lt;h1&amp;gt;Tweet!&amp;lt;/h1&amp;gt;
    &amp;lt;xsl:apply-templates/&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:template match="status"&amp;gt;
&amp;lt;div class="tweet"&amp;gt;
  &amp;lt;div class="text"&amp;gt;
     &amp;lt;p&amp;gt;&amp;lt;xsl:value-of select="text" /&amp;gt;&amp;lt;/p&amp;gt;
  &amp;lt;/div&amp;gt;
  &amp;lt;div class="meta"&amp;gt;
    &amp;lt;span&amp;gt;&amp;lt;xsl:value-of select="created_at" /&amp;gt; from &amp;lt;xsl:value-of select="source"/&amp;gt;&amp;lt;/span&amp;gt;
  &amp;lt;/div&amp;gt;
&amp;lt;/div&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/pre&gt;

&lt;p&gt;What this script basically does is find a &amp;lt;statuses&amp;gt; tag and output the heading tag.
Then, for each &amp;lt;status&amp;gt; tag, it generates a div with the actual tweet. Now this works
pretty well, &lt;strong&gt;but&lt;/strong&gt; the tweets are stored in plain text, without the URLs
marked up in any way! For example:&lt;/p&gt;

&lt;pre&gt;
&amp;lt;status&amp;gt;
  &amp;lt;created_at&amp;gt;Thu Apr 23 08:06:49 +0000 2009&amp;lt;/created_at&amp;gt;
  &amp;lt;id&amp;gt;1592566033&amp;lt;/id&amp;gt;
  &amp;lt;text&amp;gt;I can't believe the successor of the Radius protocol is the diameter protocol: http://tinyurl.com/c3zb2v&amp;lt;/text&amp;gt;
  &amp;lt;source&amp;gt;web&amp;lt;/source&amp;gt;
  ....
&amp;lt;/status&amp;gt;
&lt;/pre&gt;

&lt;p&gt;This means the script currently only prints out URLs with no link
markup. Unfortunately the XSLT language doesn’t have very powerful
inbuilt string handling functions. Fortunately the XSLT language is
pretty general purpose (and &lt;a
href="http://www.unidex.com/turing/utm.htm"&gt;Turing complete&lt;/a&gt;).
Confusingly, functions are called &lt;em&gt;templates&lt;/em&gt;, and the syntax
to invoke a function is far from convenient, but it is relatively
straightforward to parse a string to extract and mark up links.&lt;/p&gt;

&lt;p&gt;So, we write a function, err, template, called
&lt;em&gt;parseurls&lt;/em&gt;. This template takes a single parameter called
text. It then outputs this text with anything that looks like a URL
replaced with &lt;code&gt;&amp;lt;a href="URL"&amp;gt;URL&amp;lt;/a&amp;gt;&lt;/code&gt;. For
this proof-of-concept, anything that starts with &lt;tt&gt;http://&lt;/tt&gt; is
going to count as a URL, which works relatively well in practice.&lt;/p&gt;

&lt;p&gt;The basic algorithm is to split the string into three parts:
&lt;tt&gt;before-url&lt;/tt&gt;, &lt;tt&gt;url&lt;/tt&gt;,
&lt;tt&gt;after-url&lt;/tt&gt;. &lt;tt&gt;before-url&lt;/tt&gt; is simply output as-is, 
&lt;tt&gt;url&lt;/tt&gt; is output as described previously. If there is
an &lt;tt&gt;after-url&lt;/tt&gt; part, then we recursively call the template
with this data.&lt;/p&gt;

&lt;pre&gt;
&amp;lt;xsl:template name="parseurls"&amp;gt;
  &amp;lt;xsl:param name="text"/&amp;gt;
  &amp;lt;xsl:choose&amp;gt;
    &amp;lt;xsl:when test="contains($text, 'http://')"&amp;gt;
      &amp;lt;xsl:variable name="after_scheme" select="substring-after($text, 'http://')" /&amp;gt;
      &amp;lt;xsl:value-of select="substring-before($text, 'http://')"/&amp;gt;
      &amp;lt;xsl:choose&amp;gt;
	&amp;lt;xsl:when test="contains($after_scheme, ' ')"&amp;gt;
	  &amp;lt;xsl:variable name="url" select="concat('http://', substring-before($after_scheme, ' '))" /&amp;gt;
	  &amp;lt;xsl:call-template name="linkify"&amp;gt;&amp;lt;xsl:with-param name="url" select="$url"/&amp;gt;&amp;lt;/xsl:call-template&amp;gt; 
	  &amp;lt;xsl:text&amp;gt; &amp;lt;/xsl:text&amp;gt;
	  &amp;lt;xsl:call-template name="parseurls"&amp;gt;
	    &amp;lt;xsl:with-param name="text" select="substring-after($after_scheme, ' ')"/&amp;gt;
	  &amp;lt;/xsl:call-template&amp;gt;
	&amp;lt;/xsl:when&amp;gt;
	&amp;lt;xsl:otherwise&amp;gt;
	  &amp;lt;xsl:variable name="url" select="concat('http://', $after_scheme)"/&amp;gt; 
	  &amp;lt;xsl:call-template name="linkify"&amp;gt;&amp;lt;xsl:with-param name="url" select="$url"/&amp;gt;&amp;lt;/xsl:call-template&amp;gt;
	&amp;lt;/xsl:otherwise&amp;gt;
      &amp;lt;/xsl:choose&amp;gt;
    &amp;lt;/xsl:when&amp;gt;
    &amp;lt;xsl:otherwise&amp;gt;
      &amp;lt;xsl:value-of select="$text"/&amp;gt;
    &amp;lt;/xsl:otherwise&amp;gt;
  &amp;lt;/xsl:choose&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/pre&gt;
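
&lt;p&gt;For readers less at home in XSLT, here is roughly the same recursion
sketched in JavaScript. This is an illustration rather than code from the
example page: the function name &lt;code&gt;parseUrls&lt;/code&gt; and the shape of the
returned parts are my own choices, and instead of emitting markup it returns a
list of text and URL parts for the caller to render:&lt;/p&gt;

&lt;pre&gt;
// The same before-url / url / after-url split, sketched in JavaScript.
// Returns a list of parts instead of emitting markup directly.
function parseUrls(text) {
    var start = text.indexOf("http://");
    if (start === -1) {
        // No URL left: whatever remains is plain text.
        return text === "" ? [] : [{ type: "text", value: text }];
    }
    var parts = [];
    if (start > 0) {
        parts.push({ type: "text", value: text.substring(0, start) });
    }
    var rest = text.substring(start);
    var space = rest.indexOf(" ");
    if (space === -1) {
        // The URL runs to the end of the string.
        parts.push({ type: "url", value: rest });
        return parts;
    }
    parts.push({ type: "url", value: rest.substring(0, space) });
    // Recurse on everything after the URL, keeping the separating space.
    return parts.concat(parseUrls(rest.substring(space)));
}

var parts = parseUrls("see http://tinyurl.com/c3zb2v for details");
// parts holds three entries: text, url, text
&lt;/pre&gt;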

&lt;p&gt;So, through the power of functional programming we can transform the
raw tweet XML into a useful HTML document. Unfortunately making HTTP
requests is beyond the scope of XSLT, so for the next step, we need to
use JavaScript.&lt;/p&gt;

&lt;h2&gt;Replacing URLs&lt;/h2&gt;

&lt;p&gt;OK, the next step is to go and change the links in the document. The goal
is to find any links that have been shortened and replace them with the 
true destination.&lt;/p&gt;

&lt;p&gt;Finding the links in the document is relatively straightforward; something
like &lt;code&gt;var atags = document.getElementsByTagName("body")[0].getElementsByTagName("a");&lt;/code&gt;
does the trick. Now the aim is to go and see if any of these URLs return a
redirect, and then update the element. What I want to do is something like:&lt;/p&gt;

&lt;pre&gt;
var atags = document.getElementsByTagName("body")[0].getElementsByTagName("a");
for (var i = 0; i &amp;lt; atags.length; i++) {
    var x = atags[i];
    var xmlhttp = new XMLHttpRequest();
    xmlhttp.update_x = x;
    xmlhttp.open("HEAD", x.attributes.getNamedItem("href").value);
    xmlhttp.onreadystatechange = function () {
       if (this.readyState == 4) {
           if (this.status == 301) {
               this.update_x.innerHTML = this.getResponseHeader('Location');
           }
       }
    }
    xmlhttp.send(null);
}
&lt;/pre&gt;

&lt;p&gt;Basically, what this code does (or what I wish it did), is grab the &lt;tt&gt;href&lt;/tt&gt;
attribute out of each link, and do a &lt;tt&gt;HEAD&lt;/tt&gt; request on it. Recall, a &lt;tt&gt;HEAD&lt;/tt&gt;
request will get the resource headers, but not the contents. Since we only care
about the response code, and the location header, we do the network a favour, and
don’t download all the unnecessary data.&lt;/p&gt;

&lt;p&gt;Unfortunately, at this stage, we end up pretty hard against some
fundamental limitations of XMLHttpRequest, specifically the
&lt;a href="http://en.wikipedia.org/wiki/Same_origin_policy"&gt;same-origin policy&lt;/a&gt;. In short (simplified) terms, the same-origin
policy means you can’t make HTTP requests except to the domain where
the page resides. This is done for a very good security reason, but is
slightly frustrating at this point.&lt;/p&gt;
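
&lt;p&gt;In code terms, the (simplified) rule looks something like the sketch
below. The &lt;code&gt;sameOrigin&lt;/code&gt; helper is hypothetical, and it assumes an
environment that provides the standard &lt;code&gt;URL&lt;/code&gt; class:&lt;/p&gt;

&lt;pre&gt;
// Simplified same-origin test: two URLs share an origin when their
// scheme, host and port all match. Assumes the standard URL class.
function sameOrigin(a, b) {
    return new URL(a).origin === new URL(b).origin;
}

var page = "http://benno.id.au/twit/";
var local = sameOrigin(page, "http://benno.id.au/twit/twit.xml"); // true
var remote = sameOrigin(page, "http://tinyurl.com/c3zb2v");       // false
&lt;/pre&gt;

&lt;p&gt;The request for the timeline XML passes this test; the request to the
URL shortener does not.&lt;/p&gt;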

&lt;p&gt;Now, I wasn’t going to let a simple thing like this stop me, so I
implemented a pretty simple server side proxy which let me avoid this
problem. (This is a pretty unsatisfactory solution, and I’m definitely
looking forward to some of the new cross-origin extensions coming out.)
So, basically, we change the &lt;code&gt;open&lt;/code&gt; call to something like:&lt;/p&gt;

&lt;pre&gt;
xmlhttp.open("HEAD", "proxy?" + x.attributes.getNamedItem("href").value);
&lt;/pre&gt;

&lt;p&gt;
OK, so the server-side proxy gets around our cross-origin restriction,
but unfortunately we now hit a new problem! It turns out that XMLHttpRequest
is specified so that any redirects are &lt;em&gt;automatically followed&lt;/em&gt;
by the implementation. This means we are stuck again, because we don’t
even get a chance to find out about redirections! To get around this
I hacked up my proxy so that it converted 301 response codes into
a 531 response code. (No particular reason for choosing that number
over any other available response code.) The client-side check then
looks for 531 rather than 301.
&lt;/p&gt;
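
&lt;p&gt;The translation step itself is tiny. Here is a sketch of the idea, not
the actual proxy code: both function names are hypothetical, and it assumes the
proxy passes the &lt;tt&gt;Location&lt;/tt&gt; header through untouched:&lt;/p&gt;

&lt;pre&gt;
// Hypothetical sketch of the proxy's translation step: a 301 from the
// shortener is passed back as 531 so the browser does not follow it.
function translateStatus(upstreamStatus) {
    return upstreamStatus === 301 ? 531 : upstreamStatus;
}

// Matching client-side check: only a 531 carries an expanded URL.
// Assumes the proxy forwards the Location header unchanged.
function expandedUrl(status, location, original) {
    return status === 531 ? location : original;
}

var proxied = translateStatus(301); // 531
var shown = expandedUrl(proxied, "http://example.com/long", "http://tinyurl.com/c3zb2v");
// shown is "http://example.com/long"
&lt;/pre&gt;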


&lt;p&gt;Putting all this together gives us a solution for doing URL elongating
(mostly) on the client-side. I’ve put an example up at &lt;a href="http://benno.id.au/twit/"&gt;http://benno.id.au/twit/&lt;/a&gt;.
&lt;/p&gt;

&lt;h2&gt;Conclusions and further work&lt;/h2&gt;

&lt;p&gt;As you can see from the example, modifying the links after
the page has already rendered can be somewhat distracting. An
alternative user-interface would be to hook the mouse-over event,
and in those cases display the real URL in the status-bar.&lt;/p&gt;

&lt;p&gt;Obviously this has the potential to create a large number of
network requests, but this would be no different to a desktop
application, or browser plugin, so there is not really much that can be
done there. It might be possible to batch a number of requests
to the proxy, and also have the proxy cache requests, but I’d
prefer to find a solution that avoids needing the proxy in the first
place!&lt;/p&gt;
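
&lt;p&gt;The caching half of that idea could be as simple as the sketch below,
which wraps any lookup function so that each distinct URL is only resolved
once. The name &lt;code&gt;makeCachedExpander&lt;/code&gt; is made up, and the
resolver here is a stand-in that makes no network requests:&lt;/p&gt;

&lt;pre&gt;
// Sketch of a lookup cache: wrap any function that expands a short URL so
// that each distinct URL is only resolved once.
function makeCachedExpander(expand) {
    var cache = {};
    return function (url) {
        if (!Object.prototype.hasOwnProperty.call(cache, url)) {
            cache[url] = expand(url);
        }
        return cache[url];
    };
}

// Stand-in resolver that just counts how often it is called.
var calls = 0;
var cached = makeCachedExpander(function (url) {
    calls++;
    return url + "#expanded";
});
cached("http://tinyurl.com/c3zb2v");
cached("http://tinyurl.com/c3zb2v");
// calls is 1: the second lookup came from the cache
&lt;/pre&gt;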

&lt;p&gt;The W3C &lt;a href="http://www.w3.org/TR/2009/WD-cors-20090317/"&gt;Cross-Origin
Resource Sharing Working Draft&lt;/a&gt; provides a mechanism which allows holes
to be punched in the same-origin restriction. If URL shortening services
allowed their resources to be shared cross-origin, by implementing
the &lt;tt&gt;Access-Control-Allow-Origin&lt;/tt&gt; HTTP header, the need for the
proxying mechanism would go away.&lt;/p&gt;

&lt;p&gt;Finally, I would propose that the XMLHttpRequest API be updated to
enable a mechanism to avoid following redirects. The W3C &lt;a
href="http://www.w3.org/TR/XMLHttpRequest/"&gt;working draft&lt;/a&gt; notes
that a &lt;em&gt;Property to disable following redirects;&lt;/em&gt; is being
considered for a future version of the specification. I would be
in favour of this.&lt;/p&gt;

&lt;p&gt;Unfortunately the conclusion needs to be that, with the current
API limitations and security models, it is not possible to write
this kind of application in a pure client-side manner with
current web technologies.&lt;/p&gt;

&lt;/body&gt;
&lt;/html&gt;
    </description>  </item>

  <item>
    <title>Updated blog</title>
    <link>http://benno.id.au/blog/2009/04/28/update-blog</link>
    <guid>http://benno.id.au/blog/2009/04/28/update-blog</guid>
    <pubDate>Tue, 28 Apr 2009 14:13:43 +0000</pubDate>
    <description>&lt;p&gt;I’ve updated my blog software so that you can now browse older blog
entries, and at the same time I’ve added &lt;em&gt;tags&lt;/em&gt;, so you can now
browse all my entries by tag. If you are only interested in Android
(for example), you can browse &lt;a
href="http://benno.id.au/blog/?tag=android"&gt;all my Android
articles&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Currently the RSS feed still has everything. If anyone wants my
RSS feed to support the same thing, please let me know, and I’ll add
that too.&lt;/p&gt;    </description>  </item>

  <item>
    <title>pexif 0.13 release</title>
    <link>http://benno.id.au/blog/2009/04/23/pexif-0.13</link>
    <guid>http://benno.id.au/blog/2009/04/23/pexif-0.13</guid>
    <pubDate>Thu, 23 Apr 2009 11:19:06 +0000</pubDate>
    <description>&lt;p&gt;&lt;a href="http://benno.id.au/code/pexif/"&gt;pexif&lt;/a&gt; is a Python
library for editing an image’s EXIF data. Somewhat embarrassingly, the
last release I made (0.12) had a really stupid bug in it. This has now
been rectified, and a new version (0.13) is now available.&lt;/p&gt;
    </description>  </item>

  <item>
    <title>Super OKL4</title>
    <link>http://benno.id.au/blog/2009/02/21/super-okl4</link>
    <guid>http://benno.id.au/blog/2009/02/21/super-okl4</guid>
    <pubDate>Sat, 21 Feb 2009 11:39:48 +0000</pubDate>
    <description>&lt;p&gt;Cool! Someone in Japan seems to be &lt;a href="http://hirish.wordpress.com/2009/02/18/okl4ポーティング/"&gt;porting&lt;/a&gt; &lt;a href="http://www.ok-labs.com/products/okl4"&gt;OKL4&lt;/a&gt;
to the &lt;a href="http://en.wikipedia.org/wiki/SuperH"&gt;SuperH&lt;/a&gt; architecture.&lt;/p&gt;
    </description>  </item>

</channel>
</rss>
