Android package installation

Sat, 17 Nov 2007 14:23:34 +0000
tech android article

So, I'm intrigued as to what exactly happens when you install a package. The way to work this out is to get a look at the /data filesystem before package installation, then again afterwards, and do a big recursive diff.

So let's get a relatively clean image:

  1. emulator -wipe-data
  2. adb push busybox ./
  3. adb shell ./busybox tar c -f /tmp/data.tar /data
  4. adb pull /tmp/data.tar .
  5. mkdir original
  6. cd original
  7. tar xf ../data.tar

Now that we have the clean image, let's compile the simple Hello World sample.

  1. activityCreator.py --out HelloAndroid com.google.android.hello.HelloAndroid
  2. cd HelloAndroid
  3. ant

Now we have HelloAndroid.apk. So we should install this, and then we can find the diff of what happened.

  1. adb install HelloAndroid/bin/HelloAndroid.apk
  2. adb shell ./busybox tar c -f /tmp/data.tar /data
  3. adb pull /tmp/data.tar .
  4. mkdir after_install
  5. cd after_install
  6. tar xf ../data.tar

When we diff this we find not much has happened. There is the new HelloAndroid.apk file installed in the data/app directory. There is also a new com.google.android.hello directory in data/data, but it is empty, so not that interesting.

Only in after_install/data/app: HelloAndroid.apk
Only in after_install/data/data: com.google.android.hello
diff -ur original/data/system/packages.xml after_install/data/system/packages.xml
--- original/data/system/packages.xml   2007-11-17 14:27:17.000000000 +1100
+++ after_install/data/system/packages.xml      2007-11-17 14:34:20.000000000 +1100
@@ -5,6 +5,7 @@
   <package name="com.google.android.home" userId="10004" />
   <package name="com.google.android.fallback" userId="10003" />
   <package name="com.google.android.contacts" userId="10001" />
+  <package name="com.google.android.hello" userId="10006" />
   <shared-user name="android.uid.system" dummy-signature="android.signature.system" userId="1000" />
   <shared-user name="android.uid.phone" dummy-signature="android.signature.system" userId="1001" />
   <shared-user name="com.google.android.core" dummy-signature="com.google" userId="10002" />
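The same before/after comparison can be sketched in Python with the standard filecmp module, which is handy if you want the result as data rather than as diff output. This is only a rough sketch: tree_diff is a name made up for illustration, and the directory names passed to it are assumed to be the two extracted trees from the steps above.

```python
import filecmp
import os


def tree_diff(left, right):
    """Return (only_left, only_right, changed) as sorted lists of relative paths."""
    only_left, only_right, changed = [], [], []

    def walk(cmp, prefix):
        # Entries that exist on one side only (files or whole directories).
        only_left.extend(os.path.join(prefix, n) for n in cmp.left_only)
        only_right.extend(os.path.join(prefix, n) for n in cmp.right_only)
        # Files present on both sides whose contents differ.
        changed.extend(os.path.join(prefix, n) for n in cmp.diff_files)
        # Recurse into directories that exist on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(left, right), "")
    return sorted(only_left), sorted(only_right), sorted(changed)


if __name__ == "__main__":
    # Assumed layout: ./original and ./after_install, as extracted earlier.
    if os.path.isdir("original") and os.path.isdir("after_install"):
        removed, added, changed = tree_diff("original", "after_install")
        print("removed:", removed)
        print("added:", added)
        print("changed:", changed)
```

Unlike diff -ur, this only reports which files changed, not how, so you would still run diff on the files it flags.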

This is pretty interesting to me. I'd really like to know how it finds the thing in the menu. As an experiment I'm going to put the original packages.xml file back to see if this removes it from the menu. Running adb push original/data/system/packages.xml /data/system/packages.xml does this easily. Changing the file doesn't update the menu immediately, and on reboot it seems that the packages file is regenerated. Not much luck here, so time to get a better idea of what is in the .apk file.
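When poking at packages.xml like this, it helps to list what a given copy of the file actually registers. Here is a small sketch of that. The <packages> root element in the sample is an assumption (the diff above only shows the inner lines), and list_packages is a made-up helper name; the code only looks for <package> elements, so the real root name should not matter.

```python
import xml.etree.ElementTree as ET

# Sample built from lines visible in the diff; the root element is assumed.
SAMPLE = """<packages>
  <package name="com.google.android.contacts" userId="10001" />
  <package name="com.google.android.hello" userId="10006" />
  <shared-user name="android.uid.system" dummy-signature="android.signature.system" userId="1000" />
</packages>"""


def list_packages(xml_text):
    """Map each <package> name to its numeric userId."""
    root = ET.fromstring(xml_text)
    return {p.get("name"): int(p.get("userId")) for p in root.iter("package")}


if __name__ == "__main__":
    for name, uid in sorted(list_packages(SAMPLE).items()):
        print(uid, name)
```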

So, it turns out the .apk file isn't that difficult. Running file HelloAndroid.apk tells us it is just a zip file. After extracting the zip file, we see just 4 files.
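If you would rather not extract anything, the same check can be sketched with Python's zipfile module. list_apk is a hypothetical helper name, and the path used at the bottom is just the build output from the steps above, assuming you run it from the same place.

```python
import os
import zipfile


def list_apk(path):
    """Return (name, uncompressed size) for every entry in a zip-format package."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a zip archive")
    with zipfile.ZipFile(path) as apk:
        return [(info.filename, info.file_size) for info in apk.infolist()]


if __name__ == "__main__":
    apk_path = "HelloAndroid/bin/HelloAndroid.apk"  # assumed location
    if os.path.exists(apk_path):
        for name, size in list_apk(apk_path):
            print(f"{size:8d}  {name}")
```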

The XML files would presumably be the most obvious place to start, but they don't seem to be textual. file suggests it is a DBase 3 file, but that doesn't seem so likely. My guess is some kind of Unicode or binary XML format, or actually both. I can't actually work it out, to be perfectly honest! So, some reverse engineering is required. The strings definitely look like UTF-16. hexdump helps to debug what is going on. The first part of the file looks like:

$ cat AndroidManifest.xml | hexdump -C
00000000  03 00 08 00 dc 05 00 00  01 00 1c 00 88 02 00 00  |................|
00000010  13 00 00 00 00 00 00 00  01 00 00 00 68 00 00 00  |............h...|
00000020  00 00 00 00 6a 00 00 00  c8 00 00 00 0a 01 00 00  |....j...........|
00000030  36 01 00 00 70 01 00 00  e8 00 00 00 00 00 00 00  |6...p...........|
00000040  8e 01 00 00 da 01 00 00  ce 00 00 00 c6 01 00 00  |................|
00000050  fc 00 00 00 94 00 00 00  12 00 00 00 52 01 00 00  |............R...|
00000060  28 01 00 00 6e 00 00 00  82 00 00 00 80 01 00 00  |(...n...........|
00000070  07 00 61 00 6e 00 64 00  72 00 6f 00 69 00 64 00  |..a.n.d.r.o.i.d.|
00000080  00 00 2a 00 68 00 74 00  74 00 70 00 3a 00 2f 00  |..*.h.t.t.p.:./.|
00000090  2f 00 73 00 63 00 68 00  65 00 6d 00 61 00 73 00  |/.s.c.h.e.m.a.s.|
000000a0  2e 00 61 00 6e 00 64 00  72 00 6f 00 69 00 64 00  |..a.n.d.r.o.i.d.|
000000b0  2e 00 63 00 6f 00 6d 00  2f 00 61 00 70 00 6b 00  |..c.o.m./.a.p.k.|
000000c0  2f 00 72 00 65 00 73 00  2f 00 61 00 6e 00 64 00  |/.r.e.s./.a.n.d.|
000000d0  72 00 6f 00 69 00 64 00  00 00 00 00 00 00 08 00  |r.o.i.d.........|
000000e0  6d 00 61 00 6e 00 69 00  66 00 65 00 73 00 74 00  |m.a.n.i.f.e.s.t.|
000000f0  00 00 07 00 70 00 61 00  63 00 6b 00 61 00 67 00  |....p.a.c.k.a.g.|
00000100  65 00 00 00 18 00 63 00  6f 00 6d 00 2e 00 67 00  |e.....c.o.m...g.|
00000110  6f 00 6f 00 67 00 6c 00  65 00 2e 00 61 00 6e 00  |o.o.g.l.e...a.n.|
00000120  64 00 72 00 6f 00 69 00  64 00 2e 00 68 00 65 00  |d.r.o.i.d...h.e.|
00000130  6c 00 6c 00 6f 00 00 00  01 00 20 00 00 00 0b 00  |l.l.o..... .....|
00000140  61 00 70 00 70 00 6c 00  69 00 63 00 61 00 74 00  |a.p.p.l.i.c.a.t.|
00000150  69 00 6f 00 6e 00 00 00  08 00 61 00 63 00 74 00  |i.o.n.....a.c.t.|
00000160  69 00 76 00 69 00 74 00  79 00 00 00 05 00 63 00  |i.v.i.t.y.....c.|
00000170  6c 00 61 00 73 00 73 00  00 00 0d 00 2e 00 48 00  |l.a.s.s.......H.|
00000180  65 00 6c 00 6c 00 6f 00  41 00 6e 00 64 00 72 00  |e.l.l.o.A.n.d.r.|
00000190  6f 00 69 00 64 00 00 00  05 00 6c 00 61 00 62 00  |o.i.d.....l.a.b.|
000001a0  65 00 6c 00 00 00 0c 00  48 00 65 00 6c 00 6c 00  |e.l.....H.e.l.l.|
000001b0  6f 00 41 00 6e 00 64 00  72 00 6f 00 69 00 64 00  |o.A.n.d.r.o.i.d.|
000001c0  00 00 0d 00 69 00 6e 00  74 00 65 00 6e 00 74 00  |....i.n.t.e.n.t.|
000001d0  2d 00 66 00 69 00 6c 00  74 00 65 00 72 00 00 00  |-.f.i.l.t.e.r...|
000001e0  06 00 61 00 63 00 74 00  69 00 6f 00 6e 00 00 00  |..a.c.t.i.o.n...|
000001f0  05 00 76 00 61 00 6c 00  75 00 65 00 00 00 1a 00  |..v.a.l.u.e.....|
00000200  61 00 6e 00 64 00 72 00  6f 00 69 00 64 00 2e 00  |a.n.d.r.o.i.d...|
00000210  69 00 6e 00 74 00 65 00  6e 00 74 00 2e 00 61 00  |i.n.t.e.n.t...a.|
00000220  63 00 74 00 69 00 6f 00  6e 00 2e 00 4d 00 41 00  |c.t.i.o.n...M.A.|
00000230  49 00 4e 00 00 00 08 00  63 00 61 00 74 00 65 00  |I.N.....c.a.t.e.|
00000240  67 00 6f 00 72 00 79 00  00 00 20 00 61 00 6e 00  |g.o.r.y... .a.n.|
00000250  64 00 72 00 6f 00 69 00  64 00 2e 00 69 00 6e 00  |d.r.o.i.d...i.n.|
00000260  74 00 65 00 6e 00 74 00  2e 00 63 00 61 00 74 00  |t.e.n.t...c.a.t.|
00000270  65 00 67 00 6f 00 72 00  79 00 2e 00 4c 00 41 00  |e.g.o.r.y...L.A.|
00000280  55 00 4e 00 43 00 48 00  45 00 52 00 00 00 00 00  |U.N.C.H.E.R.....|
00000290  80 01 08 00 54 00 00 00  00 00 00 00 00 00 00 00  |....T...........|

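Well, this hasn't exactly been very informative. We've learned that basically all that happens on install is that the .apk file is copied to /data/app. We also learned that this directory is scanned on startup to find packages.

The strings in there, reading down the right-hand column of the dump, are:

  1. android
  2. http://schemas.android.com/apk/res/android
  3. manifest
  4. package
  5. com.google.android.hello
  6. application
  7. activity
  8. class
  9. .HelloAndroid
  10. label
  11. HelloAndroid
  12. intent-filter
  13. action
  14. value
  15. android.intent.action.MAIN
  16. category
  17. android.intent.category.LAUNCHER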

We can note that there aren't any duplicated strings, and updating the AndroidManifest.xml source file and regenerating it confirms this.
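Going by the dump, each string seems to be stored as a 16-bit little-endian character count ('07 00' before android, '08 00' before manifest), then the UTF-16 characters, then a '00 00' terminator. Here is a rough sketch that walks a run of strings laid out that way. The layout is my reading of the hexdump rather than anything documented, and read_strings is a made-up name.

```python
import struct


def read_strings(blob, start=0, end=None):
    """Decode consecutive strings stored as: uint16 length, UTF-16LE chars, uint16 zero."""
    end = len(blob) if end is None else end
    pos, out = start, []
    while pos + 2 <= end:
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        nbytes = 2 * length  # two bytes per UTF-16 code unit
        if pos + nbytes + 2 > end:
            break  # not enough room left for the characters plus terminator
        out.append(blob[pos:pos + nbytes].decode("utf-16-le"))
        pos += nbytes + 2  # skip the characters and the 00 00 terminator
    return out


# Two strings copied byte-for-byte from the dump (offsets 0x70 and 0xde).
SAMPLE = bytes.fromhex(
    "0700 6100 6e00 6400 7200 6f00 6900 6400 0000"
    "0800 6d00 6100 6e00 6900 6600 6500 7300 7400 0000"
)

if __name__ == "__main__":
    print(read_strings(SAMPLE))
```

On the real file you would pass the whole blob with start=0x70 and end=0x290, since that is where the string data appears to begin and end in the dump.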

So WBXML looks like a likely candidate. The first byte is '3', which indicates WBXML version 1.3. The next byte is '0', which indicates that the public identifier is described by a string, with string index number '8'. The charset is indicated by '0', which means unknown. Next is the string table length, and it looks like bytes 'dc' and '05', which, if I've decoded it right, indicates 11781 bytes. Which makes no sense at all, since the whole file is far smaller than that.

So we try and guess something else. As a 32-bit little-endian integer, 'dc 05 00 00' is 0x5dc, which is exactly the length of the file. So that seems like a good guess for what that field is. In this case, I'm going to guess that '03 00 08 00' is a magic number to identify the file. The next 4 bytes are '01 00 1c 00'; I'm not sure what this is. It is followed by '88 02 00 00', which happens to be 0x288 little-endian, and that seems to point at the last character in what looks like a string table. Looking at other binary XML files seems to confirm this hypothesis. The next field appears to be 0x13, which is pretty close to the number of strings we found earlier. And this is about as far as I can be bothered working out right now.
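To make the two readings of those bytes concrete, here is a small sketch. The first function decodes a WBXML-style multi-byte integer (7 bits per byte, high bit set on every byte except the last), and the second unpacks the first 20 bytes of the dump as little-endian fields. The field names are just labels for the guesses above, not anything official.

```python
import struct

# First 20 bytes of AndroidManifest.xml, copied from the hexdump.
HEADER = bytes.fromhex("03000800 dc050000 01001c00 88020000 13000000")


def decode_mb_u_int32(data):
    """Decode a WBXML multi-byte integer; return (value, bytes consumed)."""
    value = 0
    for used, byte in enumerate(data, 1):
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # high bit clear marks the final byte
            return value, used
    raise ValueError("unterminated multi-byte integer")


def guess_header(blob):
    """Unpack the leading fields as little-endian values, using guessed names."""
    magic, file_size, unknown, table_end, string_count = struct.unpack_from(
        "<4sI4sII", blob
    )
    return {
        "magic": magic,                # guessed: file identifier
        "file_size": file_size,        # guessed: total length of the file
        "unknown": unknown,            # not worked out yet
        "table_end": table_end,        # guessed: related to where the strings end
        "string_count": string_count,  # guessed: number of strings
    }


if __name__ == "__main__":
    print(decode_mb_u_int32(HEADER[4:6]))  # the WBXML reading of 'dc 05'
    print(guess_header(HEADER))
```

The WBXML reading of 'dc 05' comes out as 11781, while the little-endian reading of the same four bytes gives 0x5dc, which is why the second interpretation looks more plausible.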

The use of the binary encoding seems a little strange. It certainly doesn't reduce the size, but maybe that isn't really the point. It is probably the case that this makes it a lot easier to parse, but you would expect there to be existing XML parsers in the libraries. So beats me what is going on!

The classes.dex file included in the .apk file is already documented by Retrodev.

The final file is the resources.arsc file. This doesn't seem to have too much information in it. There are a few strings that we would expect, including res/layout/main.xml, HelloAndroid and com.google.android.hello.

And that is about it for now. Not so useful at the end of the day, but it may give you some information as a starting point for further reverse engineering.
