Compiling the Android source code for ARMv4T

Thu, 23 Oct 2008 23:02:13 +0000
tech article android arm

After a lot of stuffing around installing new hard drives so I had enough space to actually play with the source code, and getting screwed by Time Machine when trying to convert my filesystem from case-insensitive to case-sensitive (I gave up and am now using a case-sensitive disk image on top of my case-insensitive file system... sigh), I finally have the Android source code compiling, yay!

Compiling is fairly trivial: just make and away it goes. The fun thing is trying to work out exactly what the hell the build system is actually doing. I’ve got to admit though, it is a pretty clean build system, although it isn’t going to win any speed records. I’m going to go into more detail on the build system when I have more time, and when I’ve actually worked out what the hell is happening.

Anyway, after a few false starts I now have the build system compiling for ARMv4T processors (such as the one inside the Neo1973), and hopefully at the same time I haven’t broken compilation for ARMv5TE.

For those interested I have a patch available. Simply apply this to the checked out code, and then build using make TARGET_ARCH_VERSION=armv4t. Now, of course I haven’t actually tried to run this code yet, so it might not work, but it seems to compile fine, so that is a good start! Once I work out how to make git play nice I'll actually put this into a branch and make it available, but the diff will have to suffice for now. Of course I’m not the only one looking at this; check out Christopher’s page for more information. (Where he actually starts solving some problems instead of just working around them ;)

The rest of this post documents the patch. For those interested it should give you some idea of the build system and layout, and hopefully it is something that can be applied to mainline.

The first changes made are to the linux-arm.mk file. A new make variable TARGET_ARCH_VERSION is added. For now this is defaulted to armv5te, but it can be overridden on the command line as shown above.

project build/
diff --git a/core/combo/linux-arm.mk b/core/combo/linux-arm.mk
index adb82d3..a43368f 100644
--- a/core/combo/linux-arm.mk
+++ b/core/combo/linux-arm.mk
@@ -7,6 +7,8 @@ $(combo_target)TOOLS_PREFIX := \
 	prebuilt/$(HOST_PREBUILT_TAG)/toolchain/arm-eabi-4.2.1/bin/arm-eabi-
 endif
 
+TARGET_ARCH_VERSION ?= armv5te
+
 $(combo_target)CC := $($(combo_target)TOOLS_PREFIX)gcc$(HOST_EXECUTABLE_SUFFIX)
 $(combo_target)CXX := $($(combo_target)TOOLS_PREFIX)g++$(HOST_EXECUTABLE_SUFFIX)
 $(combo_target)AR := $($(combo_target)TOOLS_PREFIX)ar$(HOST_EXECUTABLE_SUFFIX)

The next thing is to make the GLOBAL_CFLAGS variable dependent on the architecture version. The armv5te defines stay in place, but an armv4t architecture version is added. Most of the cflags are pretty similar, except we change the -march flag, and change the pre-processor defines. These will become important later in the patch as they provide the mechanism for distinguishing between versions in the code.

@@ -46,6 +48,7 @@ ifneq ($(wildcard $($(combo_target)CC)),)
 $(combo_target)LIBGCC := $(shell $($(combo_target)CC) -mthumb-interwork -print-libgcc-file-name)
 endif
 
+ifeq ($(TARGET_ARCH_VERSION), armv5te)
 $(combo_target)GLOBAL_CFLAGS += \
 			-march=armv5te -mtune=xscale \
 			-msoft-float -fpic \
@@ -56,6 +59,21 @@ $(combo_target)GLOBAL_CFLAGS += \
 			-D__ARM_ARCH_5__ -D__ARM_ARCH_5T__ \
 			-D__ARM_ARCH_5E__ -D__ARM_ARCH_5TE__ \
 			-include $(call select-android-config-h,linux-arm)
+else
+ifeq ($(TARGET_ARCH_VERSION), armv4t)
+$(combo_target)GLOBAL_CFLAGS += \
+			-march=armv4t \
+			-msoft-float -fpic \
+			-mthumb-interwork \
+			-ffunction-sections \
+			-funwind-tables \
+			-fstack-protector \
+			-D__ARM_ARCH_4__ -D__ARM_ARCH_4T__ \
+			-include $(call select-android-config-h,linux-arm)
+else
+$(error Unknown TARGET_ARCH_VERSION=$(TARGET_ARCH_VERSION))
+endif
+endif
 
 $(combo_target)GLOBAL_CPPFLAGS += -fvisibility-inlines-hidden
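
To make the role of those defines concrete, here is a minimal sketch of how C code can branch on them. The `__ARM_ARCH_5TE__` and `__ARM_ARCH_4T__` names are the ones passed with -D above; the two helper functions are made up purely for illustration and are not part of the patch.

```c
/* Report which architecture version the build flags selected.
 * arch_version_name() is a hypothetical helper, not part of the patch. */
const char *arch_version_name(void)
{
#if defined(__ARM_ARCH_5TE__)
    return "armv5te";
#elif defined(__ARM_ARCH_4T__)
    return "armv4t";
#else
    return "unknown";   /* e.g. when compiled for a desktop host */
#endif
}

/* Hypothetical helper: nonzero when ARMv5TE-only code paths may be used. */
int have_armv5te(void)
{
#if defined(__ARM_ARCH_5TE__)
    return 1;
#else
    return 0;
#endif
}
```

The catch with this style is that a misspelt macro name never produces an error; the guarded code is just silently left out, so the spelling has to match the -D flags exactly.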

The next bit we update is the prelink-linux-arm.map file. The dynamic libraries in Android are laid out explicitly in virtual memory according to this map file. If I’m not mistaken those addresses look suspiciously 1MB aligned, which means they should fit nicely in the pagetable, and provides some opportunity to use fast-address-space-switching techniques. In the port to ARMv4 I have so far been lazy and instead of fixing up any assembler code I’ve just gone with existing C code. One outcome of this is that I need libffi.so for my foreign function interface, so I’ve added this to the map for now. I’m not 100% sure that when compiling for ARMv5 this won’t cause a problem. Will need to see. Fixing up the code to avoid needing libffi is probably high on the list of things to do.

diff --git a/core/prelink-linux-arm.map b/core/prelink-linux-arm.map
index d4ebf43..6e0bc43 100644
--- a/core/prelink-linux-arm.map
+++ b/core/prelink-linux-arm.map
@@ -113,3 +113,4 @@ libctest.so             0x9A700000
 libUAPI_jni.so          0x9A500000
 librpc.so               0x9A400000 
 libtrace_test.so        0x9A300000 
+libffi.so               0x9A200000
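
The 1MB observation is easy to check. The sketch below (helper names are made up) tests an address from the map for 1MB alignment and computes which 1MB slot it falls in; on ARM a 1MB section can be described by a single first-level page table entry, which is why this alignment is convenient.

```c
#include <stdint.h>

#define ONE_MB 0x100000u

/* Nonzero if addr sits exactly on a 1MB boundary. */
int is_1mb_aligned(uint32_t addr)
{
    return (addr & (ONE_MB - 1u)) == 0;
}

/* Index of the 1MB slot containing addr (the top 12 bits of the address). */
uint32_t section_index(uint32_t addr)
{
    return addr >> 20;
}
```

With the entries above, libffi.so at 0x9A200000 lands in the slot directly below libtrace_test.so at 0x9A300000.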


The next module is bionic, the light-weight C library that is part of Android. This has some nice optimised routines for memory copy and compare, but unfortunately they rely on ARMv5 instructions. I’ve changed the build system to only use the optimised assembler when compiling with ARMv5TE, and to fall back to C routines in the other cases. (The strlen implementation isn’t pure assembly, but the optimised C implementation has inline asm, so again it needs to drop back to plain old dumb strlen.)

project bionic/
diff --git a/libc/Android.mk b/libc/Android.mk
index faca333..3fb3455 100644
--- a/libc/Android.mk
+++ b/libc/Android.mk
@@ -206,13 +206,9 @@ libc_common_src_files := \
 	arch-arm/bionic/_setjmp.S \
 	arch-arm/bionic/atomics_arm.S \
 	arch-arm/bionic/clone.S \
-	arch-arm/bionic/memcmp.S \
-	arch-arm/bionic/memcmp16.S \
-	arch-arm/bionic/memcpy.S \
 	arch-arm/bionic/memset.S \
 	arch-arm/bionic/setjmp.S \
 	arch-arm/bionic/sigsetjmp.S \
-	arch-arm/bionic/strlen.c.arm \
 	arch-arm/bionic/syscall.S \
 	arch-arm/bionic/kill.S \
 	arch-arm/bionic/tkill.S \
@@ -274,6 +270,18 @@ libc_common_src_files := \
 	netbsd/nameser/ns_print.c \
 	netbsd/nameser/ns_samedomain.c
 
+
+ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
+libc_common_src_files += arch-arm/bionic/memcmp.S \
+		arch-arm/bionic/memcmp16.S \
+		arch-arm/bionic/memcpy.S \
+		arch-arm/bionic/strlen.c.arm
+else
+libc_common_src_files += string/memcmp.c string/memcpy.c string/strlen.c string/ffs.c
+endif
+endif
+
 # These files need to be arm so that gdbserver
 # can set breakpoints in them without messing
 # up any thumb code.
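
For a feel of what "plain old dumb" means here, this is roughly the kind of byte-at-a-time C the fallback amounts to. It is an illustrative sketch with made-up names, not the actual contents of bionic's string/ directory.

```c
#include <stddef.h>

/* Byte-at-a-time strlen: no word tricks, no inline asm. */
size_t plain_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Byte-at-a-time memcmp: returns <0, 0 or >0 like the standard function. */
int plain_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    for (; n != 0; n--, pa++, pb++) {
        if (*pa != *pb)
            return (int)*pa - (int)*pb;
    }
    return 0;
}
```

Slower than the hand-tuned assembler, but it compiles for any architecture version.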

Unfortunately, it is clear that this C only code hasn’t been used in a while as there was a trivial bug as fixed by the patch below. This makes me worry about what other bugs that aren’t caught by the compiler may be lurking.

diff --git a/libc/string/memcpy.c b/libc/string/memcpy.c
index 4cd4a80..dea78b2 100644
--- a/libc/string/memcpy.c
+++ b/libc/string/memcpy.c
@@ -25,5 +25,5 @@
  * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
-#define MEM_COPY
+#define MEMCOPY
 #include "bcopy.c"
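
Why does one underscore matter? Assuming bcopy.c follows the usual BSD pattern of picking its entry point based on which macro is defined, a misspelt name quietly selects the wrong one. The sketch below reduces that pattern to a string; COPY_ENTRY_POINT and copy_entry_point() are made up for illustration.

```c
/* The spelling the included file is assumed to test for. */
#define MEMCOPY

#if defined(MEMCOPY)
#define COPY_ENTRY_POINT "memcpy"
#elif defined(MEMMOVE)
#define COPY_ENTRY_POINT "memmove"
#else
/* This is the branch a misspelt MEM_COPY would end up in. */
#define COPY_ENTRY_POINT "bcopy"
#endif

/* Hypothetical helper reporting which function would be generated. */
const char *copy_entry_point(void)
{
    return COPY_ENTRY_POINT;
}
```

With the wrong spelling nothing fails inside the file itself; the problem only shows up when something goes looking for memcpy.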

Finally, frustratingly, the compiler’s ffs() implementation appears to fall back to calling the C library’s ffs() implementation if it can’t do something optimised. This happens when compiling for ARMv4, so I’ve added an ffs() implementation (stolen from FreeBSD).

#include <sys/cdefs.h>
#include <strings.h>

/*
 * Find First Set bit
 */
int
ffs(int mask)
{
        int bit;

        if (mask == 0)
                return (0);
        for (bit = 1; !(mask & 1); bit++)
                mask = (unsigned int)mask >> 1;
        return (bit);
}
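
A quick way to convince yourself the loop is right is to compare it against the compiler builtin on a host machine. The sketch below copies the loop under a different name (ffs_loop, chosen so it doesn't clash with the libc symbol) and checks it against GCC's `__builtin_ffs`. The likely reason the compiler can't inline anything clever on ARMv4 is that the CLZ instruction only arrived with ARMv5, but that is my guess rather than something I've confirmed.

```c
/* Same algorithm as the ffs() above, renamed to avoid clashing with libc. */
int ffs_loop(int mask)
{
    int bit;

    if (mask == 0)
        return 0;
    for (bit = 1; !(mask & 1); bit++)
        mask = (unsigned int)mask >> 1;
    return bit;
}

/* Nonzero when the loop agrees with the GCC/Clang builtin for this input. */
int ffs_matches_builtin(int mask)
{
    return ffs_loop(mask) == __builtin_ffs(mask);
}
```

Bits are numbered from 1, so ffs_loop(8) is 4 and ffs_loop(0) is 0.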

The next module for attention is the dalvik virtual machine. Again this has some code that relies on ARMv5, but there is a C version that we fall back on. In this case it also means pulling in libffi. This is probably the module that needs the most attention in actually updating the code to be ARMv4 assembler in the near future.

project dalvik/
diff --git a/vm/Android.mk b/vm/Android.mk
index dfed78d..c66a861 100644
--- a/vm/Android.mk
+++ b/vm/Android.mk
@@ -189,6 +189,7 @@ ifeq ($(TARGET_SIMULATOR),true)
 endif
 
 ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
 	# use custom version rather than FFI
 	#LOCAL_SRC_FILES += arch/arm/CallC.c
 	LOCAL_SRC_FILES += arch/arm/CallOldABI.S arch/arm/CallEABI.S
@@ -204,6 +205,16 @@ else
 		mterp/out/InterpC-desktop.c \
 		mterp/out/InterpAsm-desktop.S
 	LOCAL_SHARED_LIBRARIES += libffi
+	LOCAL_SHARED_LIBRARIES += libdl
+endif
+else
+	# use FFI
+	LOCAL_C_INCLUDES += external/libffi/$(TARGET_OS)-$(TARGET_ARCH)
+	LOCAL_SRC_FILES += arch/generic/Call.c
+	LOCAL_SRC_FILES += \
+		mterp/out/InterpC-desktop.c \
+		mterp/out/InterpAsm-desktop.S
+	LOCAL_SHARED_LIBRARIES += libffi
 endif
 
 LOCAL_MODULE := libdvm

Next is libjpeg, which again has assembler optimisations that we can’t easily use without real porting work, so we fall back to the C code.

project external/jpeg/
diff --git a/Android.mk b/Android.mk
index 9cfe4f6..3c052cd 100644
--- a/Android.mk
+++ b/Android.mk
@@ -19,6 +19,12 @@ ifneq ($(TARGET_ARCH),arm)
 ANDROID_JPEG_NO_ASSEMBLER := true
 endif
 
+# the assembler doesn't work for armv4t
+ifeq ($(TARGET_ARCH_VERSION),armv4t)
+ANDROID_JPEG_NO_ASSEMBLER := true
+endif
+
+
 # temp fix until we understand why this broke cnn.com
 #ANDROID_JPEG_NO_ASSEMBLER := true
 

For some reason compiling with ARMv4 doesn’t allow the prefetch loop array compiler optimisation, so we turn it off for ARMv4.

@@ -29,7 +35,10 @@ LOCAL_SRC_FILES += jidctint.c jidctfst.S
 endif
 
 LOCAL_CFLAGS += -DAVOID_TABLES 
-LOCAL_CFLAGS += -O3 -fstrict-aliasing -fprefetch-loop-arrays
+LOCAL_CFLAGS += -O3 -fstrict-aliasing
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
+LOCAL_CFLAGS += -fprefetch-loop-arrays
+endif
 #LOCAL_CFLAGS += -march=armv6j
 
 LOCAL_MODULE:= libjpeg
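
My best guess at the reason: -fprefetch-loop-arrays only makes sense when the target has a prefetch instruction, and ARM's PLD arrived with ARMv5TE. The sketch below shows by hand roughly what the option asks the compiler to do automatically, using GCC's `__builtin_prefetch`; the function name and the look-ahead distance of 16 elements are arbitrary choices for illustration.

```c
#include <stddef.h>

/* Sum an array while hinting that an element further ahead is needed soon. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        /* rw = 0 (read), locality = 3 (keep in cache). Only prefetch
         * addresses that are still inside the array. */
        if (i + 16 < n)
            __builtin_prefetch(&data[i + 16], 0, 3);
        total += data[i];
    }
    return total;
}
```

On a target without a prefetch instruction the builtin simply generates no code, so the result is the same either way.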

Next up is libffi, which is just a case of turning it on since we now need it for ARMv4.

project external/libffi/
diff --git a/Android.mk b/Android.mk
index f4452c9..07b5c2f 100644
--- a/Android.mk
+++ b/Android.mk
@@ -6,7 +6,7 @@
 # We need to generate the appropriate defines and select the right set of
 # source files for the OS and architecture.
 
-ifneq ($(TARGET_ARCH),arm)
+ifneq ($(TARGET_ARCH_VERSION),armv5te)
 
 LOCAL_PATH:= $(call my-dir)
 include $(CLEAR_VARS)

The external module opencore contains a lot of software implemented codecs. (I wonder about the licensing restrictions on these things...). Not surprisingly these too are tuned for ARMv5, but again we fall back to plain old C.

project external/opencore/
diff --git a/codecs_v2/audio/aac/dec/Android.mk b/codecs_v2/audio/aac/dec/Android.mk
index ffe0089..6abdc2d 100644
--- a/codecs_v2/audio/aac/dec/Android.mk
+++ b/codecs_v2/audio/aac/dec/Android.mk
@@ -150,7 +150,7 @@ LOCAL_SRC_FILES := \
 LOCAL_MODULE := libpv_aac_dec
 
 LOCAL_CFLAGS := -DAAC_PLUS -DHQ_SBR -DPARAMETRICSTEREO  $(PV_CFLAGS)
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
  LOCAL_CFLAGS += -D_ARM_GCC
  else
  LOCAL_CFLAGS += -DC_EQUIVALENT
diff --git a/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk b/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk
index e184178..3223841 100644
--- a/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk
+++ b/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk
@@ -48,7 +48,7 @@ LOCAL_SRC_FILES := \
 LOCAL_MODULE := libpvamrwbdecoder
 
 LOCAL_CFLAGS :=   $(PV_CFLAGS)
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
  LOCAL_CFLAGS += -D_ARM_GCC
  else
  LOCAL_CFLAGS += -DC_EQUIVALENT
diff --git a/codecs_v2/audio/mp3/dec/Android.mk b/codecs_v2/audio/mp3/dec/Android.mk
index 254cb6b..c2430fe 100644
--- a/codecs_v2/audio/mp3/dec/Android.mk
+++ b/codecs_v2/audio/mp3/dec/Android.mk
@@ -28,8 +28,8 @@ LOCAL_SRC_FILES := \
 	src/pvmp3_seek_synch.cpp \
 	src/pvmp3_stereo_proc.cpp \
 	src/pvmp3_reorder.cpp
-	
-ifeq ($(TARGET_ARCH),arm)
+
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
 LOCAL_SRC_FILES += \
 	src/asm/pvmp3_polyphase_filter_window_gcc.s \
 	src/asm/pvmp3_mdct_18_gcc.s \
@@ -46,7 +46,7 @@ endif
 LOCAL_MODULE := libpvmp3
 
 LOCAL_CFLAGS :=   $(PV_CFLAGS)
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
  LOCAL_CFLAGS += -DPV_ARM_GCC
  else
  LOCAL_CFLAGS += -DC_EQUIVALENT

Unfortunately it is not just the build file that needs updating in this module. I need to manually go and update the headers so that some optimised inline assembler is only used in the ARMv5 case. To be honest this messes these files up a little bit, so a nicer solution would be preferred.

diff --git a/codecs_v2/video/m4v_h263/enc/src/dct_inline.h b/codecs_v2/video/m4v_h263/enc/src/dct_inline.h
index 86474b2..41a3297 100644
--- a/codecs_v2/video/m4v_h263/enc/src/dct_inline.h
+++ b/codecs_v2/video/m4v_h263/enc/src/dct_inline.h
@@ -22,7 +22,7 @@
 #ifndef _DCT_INLINE_H_
 #define _DCT_INLINE_H_
 
-#if !defined(PV_ARM_GCC)&& defined(__arm__)
+#if !(defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARM_ARCH_5TE__))
 
 #include "oscl_base_macros.h"
 
@@ -109,7 +109,7 @@ __inline int32 sum_abs(int32 k0, int32 k1, int32 k2, int32 k3,
 #elif defined(__CC_ARM)  /* only work with arm v5 */
 
 #if defined(__TARGET_ARCH_5TE)
-
+#error
 __inline int32 mla724(int32 op1, int32 op2, int32 op3)
 {
     int32 out;
@@ -266,7 +266,7 @@ __inline int32 sum_abs(int32 k0, int32 k1, int32 k2, int32 k3,
     return abs_sum;
 }
 
-#elif defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#elif defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARM_ARCH_5TE__) /* ARM GNU COMPILER  */
 
 __inline int32 mla724(int32 op1, int32 op2, int32 op3)
 {
diff --git a/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h b/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h
index 6a35d43..fbfeddf 100644
--- a/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h
+++ b/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h
@@ -25,7 +25,7 @@
 #include "mp4def.h"
 #include "oscl_base_macros.h"
 
-#if !defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#if !(defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARM_ARCH_5TE__)) /* ARM GNU COMPILER  */
 
 __inline int32 aan_scale(int32 q_value, int32 coeff, int32 round, int32 QPdiv2)
 {
@@ -423,7 +423,7 @@ __inline int32 coeff_dequant_mpeg_intra(int32 q_value, int32 tmp)
     return q_value;
 }
 
-#elif defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#elif defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARM_ARCH_5TE__) /* ARM GNU COMPILER  */
 
 __inline int32 aan_scale(int32 q_value, int32 coeff,
                          int32 round, int32 QPdiv2)
diff --git a/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h b/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h
index 69857f3..b0bf46d 100644
--- a/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h
+++ b/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h
@@ -18,7 +18,7 @@
 #ifndef _VLC_ENCODE_INLINE_H_
 #define _VLC_ENCODE_INLINE_H_
 
-#if !defined(PV_ARM_GCC)&& defined(__arm__)
+#if !(defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARM_ARCH_5TE__))
 
 __inline  Int zero_run_search(UInt *bitmapzz, Short *dataBlock, RunLevelBlock *RLB, Int nc)
 {
@@ -208,7 +208,7 @@ __inline  Int zero_run_search(UInt *bitmapzz, Short *dataBlock, RunLevelBlock *R
     return idx;
 }
 
-#elif defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#elif defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARM_ARCH_5TE__) /* ARM GNU COMPILER  */
 
 __inline Int m4v_enc_clz(UInt temp)
 {
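
As an example of what the C side of one of these looks like: judging by its name, m4v_enc_clz counts leading zeros, which is a single CLZ instruction on ARMv5 and later but needs a loop on ARMv4. The sketch below is my own portable version under that assumption (clz_c is a made-up name), not the code from the opencore headers.

```c
#include <stdint.h>

/* Count leading zero bits in a 32-bit value; returns 32 for zero. */
int clz_c(uint32_t x)
{
    int n = 0;

    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}
```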

A similar approach is needed in the skia graphics library.

project external/skia/
diff --git a/include/corecg/SkMath.h b/include/corecg/SkMath.h
index 76cf279..5f0264f 100644
--- a/include/corecg/SkMath.h
+++ b/include/corecg/SkMath.h
@@ -162,7 +162,7 @@ static inline int SkNextLog2(uint32_t value) {
     With this requirement, we can generate faster instructions on some
     architectures.
 */
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     static inline int32_t SkMulS16(S16CPU x, S16CPU y) {
         SkASSERT((int16_t)x == x);
         SkASSERT((int16_t)y == y);

The sonivox module (no idea what that is!) has the same requirement of updating the build to avoid building ARMv5 specific code.

project external/sonivox/
diff --git a/arm-wt-22k/Android.mk b/arm-wt-22k/Android.mk
index 565c233..a59f917 100644
--- a/arm-wt-22k/Android.mk
+++ b/arm-wt-22k/Android.mk
@@ -73,6 +73,7 @@ LOCAL_COPY_HEADERS := \
 	host_src/eas_reverb.h
 
 ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
 LOCAL_SRC_FILES+= \
 	lib_src/ARM-E_filter_gnu.s \
 	lib_src/ARM-E_interpolate_loop_gnu.s \

The low-level audio code in audioflinger suffers from the same optimisations, and we need to dive into the code on this occasion to fix things up.

project frameworks/base/
diff --git a/libs/audioflinger/AudioMixer.cpp b/libs/audioflinger/AudioMixer.cpp
index 9f1b17f..4c0890c 100644
--- a/libs/audioflinger/AudioMixer.cpp
+++ b/libs/audioflinger/AudioMixer.cpp
@@ -400,7 +400,7 @@ void AudioMixer::process__validate(state_t* state, void* output)
 static inline 
 int32_t mulAdd(int16_t in, int16_t v, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     int32_t out;
     asm( "smlabb %[out], %[in], %[v], %[a] \n"
          : [out]"=r"(out)
@@ -415,7 +415,7 @@ int32_t mulAdd(int16_t in, int16_t v, int32_t a)
 static inline 
 int32_t mul(int16_t in, int16_t v)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     int32_t out;
     asm( "smulbb %[out], %[in], %[v] \n"
          : [out]"=r"(out)
@@ -430,7 +430,7 @@ int32_t mul(int16_t in, int16_t v)
 static inline 
 int32_t mulAddRL(int left, uint32_t inRL, uint32_t vRL, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smlabb %[out], %[inRL], %[vRL], %[a] \n"
@@ -456,7 +456,7 @@ int32_t mulAddRL(int left, uint32_t inRL, uint32_t vRL, int32_t a)
 static inline 
 int32_t mulRL(int left, uint32_t inRL, uint32_t vRL)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smulbb %[out], %[inRL], %[vRL] \n"
diff --git a/libs/audioflinger/AudioResamplerSinc.cpp b/libs/audioflinger/AudioResamplerSinc.cpp
index e710d16..88b8c22 100644
--- a/libs/audioflinger/AudioResamplerSinc.cpp
+++ b/libs/audioflinger/AudioResamplerSinc.cpp
@@ -62,7 +62,7 @@ const int32_t AudioResamplerSinc::mFirCoefsDown[] = {
 static inline 
 int32_t mulRL(int left, int32_t in, uint32_t vRL)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smultb %[out], %[in], %[vRL] \n"
@@ -88,7 +88,7 @@ int32_t mulRL(int left, int32_t in, uint32_t vRL)
 static inline 
 int32_t mulAdd(int16_t in, int32_t v, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     int32_t out;
     asm( "smlawb %[out], %[v], %[in], %[a] \n"
          : [out]"=r"(out)
@@ -103,7 +103,7 @@ int32_t mulAdd(int16_t in, int32_t v, int32_t a)
 static inline 
 int32_t mulAddRL(int left, uint32_t inRL, int32_t v, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smlawb %[out], %[v], %[inRL], %[a] \n"
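
The instructions being guarded here are the ARMv5TE signed 16-bit multiplies: smulbb multiplies the bottom halfwords of two registers, and smlabb does the same and adds an accumulator. A rough C equivalent is sketched below with made-up names. The `left` case using the bottom halves matches the smlabb shown in the diff; that the other case uses the top halves is my assumption, since that branch isn't shown.

```c
#include <stdint.h>

/* What smlabb computes: a + in * v on signed 16-bit inputs. */
int32_t mul_add_c(int16_t in, int16_t v, int32_t a)
{
    return a + (int32_t)in * (int32_t)v;
}

/* What smulbb computes: in * v on signed 16-bit inputs. */
int32_t mul_c(int16_t in, int16_t v)
{
    return (int32_t)in * (int32_t)v;
}

/* Packed variant: pick the bottom halves when left is set, else the top
 * halves (the latter is assumed), then multiply and accumulate. */
int32_t mul_add_rl_c(int left, uint32_t inRL, uint32_t vRL, int32_t a)
{
    int16_t in = (int16_t)(left ? (inRL & 0xFFFFu) : (inRL >> 16));
    int16_t v  = (int16_t)(left ? (vRL & 0xFFFFu) : (vRL >> 16));

    return a + (int32_t)in * (int32_t)v;
}
```

On ARMv4T the compiler has to build this from a general multiply plus some shifting, which works but is slower in the inner mixing loop.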

The AndroidConfig.h header file is included on every compile. We mess with it to convince it that we don’t have an optimised memcmp16 function.

project system/core/
diff --git a/include/arch/linux-arm/AndroidConfig.h b/include/arch/linux-arm/AndroidConfig.h
index d7e182a..76f424e 100644
--- a/include/arch/linux-arm/AndroidConfig.h
+++ b/include/arch/linux-arm/AndroidConfig.h
@@ -249,8 +249,9 @@
 /*
  * Do we have __memcmp16()?
  */
+#if defined(__ARM_ARCH_5TE__)
 #define HAVE__MEMCMP16  1
-
+#endif
 /*
  * type for the third argument to mincore().
  */
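
Code that wants a 16-bit compare can then test HAVE__MEMCMP16 and use something plain when it is not defined. The sketch below shows what such a fallback might look like, assuming (as the name suggests) that the routine compares arrays of 16-bit units; memcmp16_c is a made-up name and not taken from the Android tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare count 16-bit units; returns <0, 0 or >0 at the first difference. */
int memcmp16_c(const uint16_t *a, const uint16_t *b, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (a[i] != b[i])
            return (int)a[i] - (int)b[i];
    }
    return 0;
}
```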

Next up is the pixelflinger, where things get interesting, because all of a sudden we have ARMv6 code. I’ve taken the rash decision of wrapping this in conditionals that are only enabled if you actually have an ARMv6 version, not a pesky ARMv5E, but I really need to better understand the intent here. It seems a little strange.

diff --git a/libpixelflinger/Android.mk b/libpixelflinger/Android.mk
index a8e5ee4..077cf47 100644
--- a/libpixelflinger/Android.mk
+++ b/libpixelflinger/Android.mk
@@ -5,7 +5,7 @@ include $(CLEAR_VARS)
 # ARMv6 specific objects
 #
 
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv6)
 LOCAL_ASFLAGS := -march=armv6
 LOCAL_SRC_FILES := rotate90CW_4x4_16v6.S
 LOCAL_MODULE := libpixelflinger_armv6
@@ -39,7 +39,7 @@ PIXELFLINGER_SRC_FILES:= \
 	raster.cpp \
 	buffer.cpp
 
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
 PIXELFLINGER_SRC_FILES += t32cb16blend.S
 endif
 
@@ -67,7 +67,7 @@ ifneq ($(BUILD_TINY_ANDROID),true)
 LOCAL_MODULE:= libpixelflinger
 LOCAL_SRC_FILES := $(PIXELFLINGER_SRC_FILES)
 LOCAL_CFLAGS := $(PIXELFLINGER_CFLAGS) -DWITH_LIB_HARDWARE
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv6)
 LOCAL_WHOLE_STATIC_LIBRARIES := libpixelflinger_armv6
 endif
 include $(BUILD_SHARED_LIBRARY)

Finally scanline has an optimised asm version it calls in preference to doing the same thing inline with C code. Again, I take the easy way out, and use the C code.

diff --git a/libpixelflinger/scanline.cpp b/libpixelflinger/scanline.cpp
index d24c988..685a3b7 100644
--- a/libpixelflinger/scanline.cpp
+++ b/libpixelflinger/scanline.cpp
@@ -1312,7 +1312,7 @@ void scanline_t32cb16blend(context_t* c)
     const int32_t v = (c->state.texture[0].shade.it0>>16) + y;
     uint32_t *src = reinterpret_cast<uint32_t*>(tex->data)+(u+(tex->stride*v));
 
-#if ((ANDROID_CODEGEN >= ANDROID_CODEGEN_ASM) && defined(__arm__))
+#if ((ANDROID_CODEGEN >= ANDROID_CODEGEN_ASM) && defined(__arm__) && defined(__ARM_ARCH_5TE__))
     scanline_t32cb16blend_arm(dst, src, ct);
 #else
     while (ct--) {

And that, my friends, is that! Now to see if I can actually run this code!
