Sunday, January 29, 2012

HC-04 Bluetooth module and BlueCore4 Deep Sleep

The good news is that BlueCore indeed has Deep Sleep mode, it's not marketing hype. It can be verified by configuring PSKEY_CLOCK_REQUEST_ENABLE to drive PIO when chip uses/doesn't use external clock. The key here is not to keep PSTool run (which well, you need to tweak sleep settings).

The bad news is that an HC-04 module in Deep Sleep still consumes 2.5mA, which is far too high for anything called Deep Sleep. So, something on the module besides BlueCore still consumes too much power. However, when BlueCore is held in reset state, entire module consumes 0.

Without Deep Sleep, simple LED flash task consumes 3.5mA, so there's not that much gain from the Deep Sleep...

Enabling SPI logging in BlueSuite tools CSR BlueCore chips

BlueFlashCmd.exe and other tools, part of CSR BlueSuite toolset to program BlueCore Bluetooth chips, has support for logging SPI communication with a chip. To enable it:

BlueFlashCmd.exe -trans "SPIDEBUG_FILE=spi.log SPIDEBUG=ON" -chipver

The key parameter here is SPIDEBUG_FILE - logging is activated when it's specified, and thus logging is always done to a file. One way to have logging go to a console is to use SPIDEBUG_FILE=con: , but then it won't be possible to redirect it.

Besides SPIDEBUG=ON, there're other logging levels/features:
The keys may be also specified in the environment instead of -trans switch arguments.

Looking at the output produced, it turns out it spends a lot in calibrating SPI speed, a way to that down is to use SPISLOW=ON setting right away.

Tuesday, January 17, 2012

Notes on Android EGL/native windowing system interface

EGL is defined here and appears to be in OpenGL ES something what GLUT was (?) in OpenGL world. At least, EGL is what takes care of actually displaying on screen what was rendered by OpenGL ES.


EGLNativeWindowType is defined in opengl/include/EGL/eglplatform.h as struct egl_native_window_t* . egl_native_window_t is defined nearby in opengl/include/EGL/eglnatives.h . This is nice, clean, standalone structure (unlike horror in Eclair+). It starts with a "magic" (signature) 0x600913, followed by "version" which is sizeof of the structure (now I know why Microsoft gets a buck on each Android phone sold - they for sure patented this (stupid? lazy?) trick widely used in Win32 API). There're some function pointers (virtual method table) used by EGL to call into native windowing system (deep, close to vendor drivers) to support its workings. Out of these, there's essentially one related to rendering finalization: swapBuffers(). So yes, it seems that eglSwapBuffers() just calls out to this. The structure also directly contains "Base memory virtual address of the surface in the CPU side" and other easily graspable memory/graphics notions.

It's clear that in its young years, Android was fair and naive, not suitable at all for typical vendor OpenGL mumbo-jumbo. The "fix" came with the later versions.

Eclair and following

EGLNativeWindowType is defined (in include/EGL/eglplatform.h) to be struct ANativeWindow*. ANativeWindow was android_native_window_t in Eclair, but was renamed to ANativeWindow in Froyo (android_native_window_t is still available for compatibility). ANativeWindow is defined in platform/frameworks/base/include/ui/egl/android_natives.h:
ANativeWindow not just emulates VMT, but seems to want to do entire C++ in C-like manner in its metes and bounds. It uses subclassing from android_native_base_t, and that contains fields for magic ("_wnd" for ANativeWindow) and version-sizeof. Methods defined are also more involved. Among the most interesting are dequeueBuffer(), lockBuffer(), queueBuffer(). Essentially, it appears that EGL takes a free buffer out of native system's rotation, locks it for changing its contents, then puts back into queue for display.

Actual implementation of ANativeWindow for read device screen is in libs/ui/FramebufferNativeWindow.cpp . There's factory function to get its instance: android_createDisplaySurface().

All implementation details on buffers used are hidden in android_native_buffer_t type. It is also subclassed from android_native_base_t, its magic is "_bfr". Among simple things like width/height it contains buffer_handle_t, which leads in utero of vendor graphics driver. buffer_handle_t is native_handle*, and native_handle is a generic container for arbitrary number of fd's and pointers.

native_handle's are produced by gralloc vendor module, and that's exactly what FramebufferNativeWindow constructor does - it loads gralloc module, then uses it to open framebuffer and gralloc devices. It then allocates back and front buffers from gralloc devices with the size of the framebuffer. queueBuffer() method is implemented by calling framebuffer's post() method which well, shows the buffer on the screen.

Thoughts/notes on Android version upgrades and reusing Android drivers

So, what it takes to upgrade Android on an arbitrary Android device (from user perspective)? What takes to run Linux in full-fledged mode? This boils down to few things:

Kernel porting

Upgrade kernel. This means forward-porting hardware support modules to a new kernel version. That's, when source code for such modules is available. Sometimes, it's not violating GPL (or tainting kernel, but who wants that? ;-) ). In an ideal world, that would all that's needed, but vendors don't like GPL, and try to move stuff outside of kernel. So, the list continues.

Bitblt acceleration porting

Next step is making sure that basic hardware acceleration works - accelerated bitblt/compositing (defining compositing as bitblt with alpha). This is actually pretty important step - without accelerated bitblt, Android with more or less big screen will run pretty sluggishly. Well, X Windows won't run too zippy either. Bitblt code in Android lives in /system/lib/hw/copybit.*.so  and depends on gralloc.*.so from the same dir (* there is vendor/implementation identifier - Android support multiple, pluggable implementations). Needless to say, for a random device, source for these modules are not available - vendors don't have to provide it, it's Apache2-licensed, and very few choose to uphold OpenSource spirit nonetheless. So, if new Android version have changes the ABI for those modules, then oops - upgrade is "not possible" is layman terms. Of course, real men will immediately think about writing adapters, etc...

What about other X and other "foreign" windowing systems? They would need drivers/adapters too, and base all their low-level FB access based on Android's gralloc/bitblt/etc. model.

OpenGL ES porting

Last, least, but nonetheless. Curse of the modern world - OpenGL. You didn't have it on your Apple 2 (I mean Apple 2, yes, not what you thought about!) and everything was great, wasn't it? Apart from games you don't have time to play, what it's useful for? Yes, as soon as we'll stream videosignal directly into the brain, it will be useful for augmented reality, but I mean, *useful now*. They keep talking they added some hardware acceleration using OpenGL to the normal UI, then immediately say it didn't work that great, and depends on many things, because... Because OpenGL simply doesn't work that great, yeah. For example, even "accelerated", it's pretty slow, buggy, inconsistent across devices, etc, etc.

Anyway, thanks for listening to the rant. Let's de-emphasize usefulness of OpenGL, let's just take it as a challenge - vendors hacked it, so why can't hackers hack it? The basic idea is the same as with bitblt - there's an interface between closed vendor OpenGL ES implementation and Android. If interface changed, you're hosed. I mean, you write adapters. You also write adapters to make it talk with your windowing system of choice, and not Android. The core interfacing part of OpenGL ES/Android junction is EGL. How to deal about it is worth a separate post.

New Hack Toy - Zenithink ZT-180

Recently I got new hack toy: cheap (well, depends) Chinese Zenithink ZT-180 tablet. This one seem to be (have been) pretty popular and hacker-friendly. Here's data about it:

It seems that I've got vendor model ZT180_G2 (maybe, ZT180_G0).

Button combinations to hold during power on:
  • Home+Power - Upgrade firmware from /sdcard/zt-update/
  • Left arrow (of swing double-button)+Power - Boot with debug kernel from /sdcard/zt-debug/ . Debug kernel is booted and continues with the normal boot process.
  • Right arrow (of swing double-button)+Power - Multiboot support, select one of 3 modes:
    • Android adb - kernel with ADB over USB support
    • Android mass storage - kernel with mass storage USB gadget support (tablet is presented as a mass storage when connected to host)
    • Other OS - WinCE

Random links:

Monday, January 16, 2012

Stopping Android GUI services

Well, this really worth a cross-post. The way to stop Android GUI services is (under root of course):

setprop ctl.stop zygote

UPDATE: Thanks to hint from StackOverflow: there're even simpler commands: "start" & "stop". Surprise.

Saturday, January 7, 2012

Bluetooth Park Mode Exposed

So well, as I settled on Bluetooth as a wireless sensor/automation protocol and grow my device base, I started to wonder what will happen when I'll have more than 7 slave devices. Bluetooth stack of master device (Bluez in our case) must be smart to put devices into and out of PARK mode to allow to communicate with more than 7 devices, right? Not in this world.

Not only Bluez doesn't support park management, according to Marcel Holtmann, the problem is that it's really not known if there're devices (masters) which can support more than very limited number of slaves, or even have decent park mode implementation at all:
Anyway, I started to test parking with devices I have.

HC-04 Bluetooth module with CSR BlueCore4-Ext

# hcitool cc 00:11:10:xx:xx:xx; hcitool con
    < ACL 00:11:10:xx:xx:xx handle 12 state 8 lm MASTER

What we need here is a connection handle 12 (0x0c). hcitool doesn't have dedicated command for part mode (and for lot of other things), so we'll use raw HCI command mode using "cmd" command:

# hcitool cmd --help
    cmd <ogf> <ocf> [parameters]
    cmd 0x03 0x0013 0x41 0x42 0x43 0x44

What's important here is that <parameters> should be byte values. If words should be given as parameters, they should be give by bytes in little-endian format.

# hcitool cc 00:11:10:xx:xx:xx; hcitool cmd 0x02 0x05 0x0c 0 0x50 0 0x40 0

Here we run HCI_Park_State(Connection_Handle=0x000c, Beacon_Max_Interval=0x0050, Beacon_Min_Interval=0x0040) command (see Bluetooth spec). And here's communication as captured by hcidump (parts irrelevant to park command omitted):

2012-01-07 15:36:46.084074 < HCI Command: Park State (0x02|0x0005) plen 6
    handle 12 max 80 min 64
2012-01-07 15:36:46.086040 > HCI Event: Command Status (0x0f) plen 4
    Park State (0x02|0x0005) status 0x00 ncmd 1
2012-01-07 15:36:46.271047 > HCI Event: Mode Change (0x14) plen 6
    status 0x24 handle 12 mode 0x00 interval 0
    Error: LMP PDU Not Allowed

This "LMP PDU Not Allowed" was quite confusing and took me lot of googling to figure out. Fortunately, I found insightful post from a CSR guy right on this matter:

What we get here is that local Bluetooth master just forwards status it got from the remote device. As the message suggests, park might be disabled in remote device line policy. Let's see:

# hcitool cc 00:11:10:xx:xx:xx; hcitool lp 00:11:10:xx:xx:xx
Link policy settings: RSWITCH HOLD SNIFF PARK

Here's slight issue of what hcitool lp actually does. According to man, it "displays link policy settings for the connection to the device with  Bluetooth  address". There's also "hcitool lp" command which "Sets default link policy". So, reasonable assumption would be that every device may have default link policy, then during connection, they negotiate link policy which is suitable for the connection. Ok, let's set PARK policy explicitly:

# hcitool cc 00:11:10:xx:xx:xx; hcitool lp 00:11:10:xx:xx:xx PARK; hcitool lp 00:11:10:xx:xx:xx; hcitool cmd 0x02 0x05 0x0c 0 0x50 0 0x40 0
Link policy settings: PARK
< HCI Command: ogf 0x02, ocf 0x0005, plen 6
  0C 00 50 00 40 00
> HCI Event: 0x0f plen 4
  00 01 05 08

But hcidump shows the the same "LMP PDU Not Allowed" error. So, what we have is that HC-04 module advertizes park support, it apparently negotiates its availability for connection, but when asked to actually perform it, it rejects it. This is so much correlates with this message:

PS3 Bluetooth remote

# hcitool info 00:06:F5:xx:xx:xx
Requesting information ...
    BD Address:  00:06:F5:xx:xx:xx
    Device Name: BD Remote Control
    LMP Version: 2.0 (0x3) LMP Subversion: 0x229
    Manufacturer: Broadcom Corporation (15)
    Features: 0xbc 0x02 0x04 0x38 0x08 0x00 0x00 0x00
        <encryption> <slot offset> <timing accuracy> <role switch>
        <sniff mode> <RSSI> <power control> <enhanced iscan>
        <interlaced iscan> <interlaced pscan> <AFH cap. slave>

So, this fair Broadcom device doesn't even conceal the fact that it's barely Bluetooth interoperatable - it doesn't support park state at all.

No surprises when asking it to go there nontheless:

2012-01-07 16:03:08.863586 < HCI Command: Park State (0x02|0x0005) plen 6
    handle 13 max 80 min 64
2012-01-07 16:03:08.865636 > HCI Event: Command Status (0x0f) plen 4
    Park State (0x02|0x0005) status 0x1a ncmd 1
    Error: Unsupported Remote Feature / Unsupported LMP Feature

LG KS20 WindowsMobile phone

# hcitool info 00:1E:75:xx:xx:xx | grep park
        <park state> <RSSI> <channel quality> <SCO link> <HV2 packets>

So, this one advertizes park. But:

2012-01-07 16:29:11.288514 < HCI Command: Park State (0x02|0x0005) plen 6
    handle 12 max 80 min 64
2012-01-07 16:29:11.290485 > HCI Event: Command Status (0x0f) plen 4
    Park State (0x02|0x0005) status 0x00 ncmd 1
2012-01-07 16:29:11.466508 > HCI Event: Mode Change (0x14) plen 6
    status 0x0c handle 12 mode 0x00 interval 0
    Error: Command Disallowed

Broadcom is such Broadcom...

Huawei U8160 aka Vodafone 858 Smart Android phone
# hcitool info 04:C0:6F:xx:xx:xx | grep park

2012-01-07 16:19:01.322536 < HCI Command: Park State (0x02|0x0005) plen 6
    handle 12 max 80 min 64
2012-01-07 16:19:01.324521 > HCI Event: Command Status (0x0f) plen 4
    Park State (0x02|0x0005) status 0x1a ncmd 1
    Error: Unsupported Remote Feature / Unsupported LMP Feature

Broadcom is such Broadcom...

HTC Mogul WindowsMobile phone

2012-01-07 16:11:14.184067 < HCI Command: Park State (0x02|0x0005) plen 6
    handle 12 max 80 min 64
2012-01-07 16:11:14.186430 > HCI Event: Command Status (0x0f) plen 4
    Park State (0x02|0x0005) status 0x00 ncmd 1
2012-01-07 16:11:14.528429 > HCI Event: Mode Change (0x14) plen 6
    status 0x00 handle 12 mode 0x03 interval 80
    Mode: Park

Bwahaha! TI rules!

Samsung i740 WindowsMobile phone

Manufacturer: Cambridge Silicon Radio (10)

2012-01-07 17:54:17.141466 < HCI Command: Park State (0x02|0x0005) plen 6
    handle 12 max 80 min 64
2012-01-07 17:54:17.142911 > HCI Event: Command Status (0x0f) plen 4
    Park State (0x02|0x0005) status 0x00 ncmd 1
2012-01-07 17:54:17.472919 > HCI Event: Mode Change (0x14) plen 6
    status 0x00 handle 12 mode 0x03 interval 80
    Mode: Park

Linux computer with Broadcom chip

Park mode work here, supervised by Bluez.

So, out of 7 devices (that's whole piconet, and I have more!)  only 3 supported park mode.

More insightful posts:

Wednesday, January 4, 2012

Linux kernel Bluetooth ACL connection auto-disconnect

I finally found time to figure out why "hcitool cc" created connections die very soon (see previous posts). It turned out to be handled on the Linux kernel level. What's worse is that it's hardcoded at 2s - no provision for adjustment via sysfs for example. :

118 /* HCI timeouts */
119 #define HCI_CONNECT_TIMEOUT     (40000) /* 40 seconds */
120 #define HCI_DISCONN_TIMEOUT     (2000)  /* 2 seconds */
121 #define HCI_PAIRING_TIMEOUT     (60000) /* 60 seconds */
122 #define HCI_IDLE_TIMEOUT        (6000)  /* 6 seconds */
123 #define HCI_INIT_TIMEOUT        (10000) /* 10 seconds */
124 #define HCI_CMD_TIMEOUT         (1000)  /* 1 seconds */

Actual disconnection happens in hci_conn_timeout(). Other issue is that such disconnect reported as "Remote User Terminated Connection", which is, well, not true, as I don't terminate it, the system does. There's another status code, "Remote Device Terminated Connection due to Low Resources" which IMHO more suitable (if there's no "low resources", why do you disconnect so quickly, dear Linux?).

I quickly made a patch to be able to adjust disconnect timeout via sysfs, to experiment with low level connection more comfortably. Unfortunately, it turns out that I can extend delay to max 10s, even if set value much higher. So, something appears to call disconnect routine, even though I don't see any other references or timer manipulation %).