FM Radio Upgrade Reverse Engineering (Part 2)
You may notice that the entry number on this article is out of sequence with the dates. That is because it has taken me over a month to get the write-up to a point where I felt any sort of conclusion had been drawn…
In part 1 we left off trying to work out what device was at address 0x16 on the I2C bus. We had already worked out that 0x63 was the FM radio chip, so this time around we’re going to delve deep to try and understand this other address.
It doesn’t take much deducing to know that this address is likely coming from the STM32, given there are no other I2C capable ICs on the board. We also know that the exposed I2C pads were used for testing the devices, so we can perhaps assume that some level of control of the STM32 is possible over this address. How much is really possible is what I’m hoping to discover.
The Wikipedia entry for reverse engineering of binary software outlines 3 main groups, so we’re going to go through each of the groups and try each approach. Amusingly, I only came across these 3 groups while writing this article post hoc, but they fairly closely align with the steps I naturally took in this process, so they serve as a nice way to divide this article up.
Observation
Analysis through observation of information exchange, most prevalent in protocol reverse engineering, which involves using bus analyzers and packet sniffers, such as for accessing a computer bus or computer network connection and revealing the traffic data thereon. Bus or network behavior can then be analyzed to produce a standalone implementation that mimics that behavior.
A lot of this type of observation was done in the previous article, which was helpful in working out the operation of the FM radio chip on the I2C bus. That was purely passive observation though, so this time we’re going to try passive observation with some prodding.
Brute force
While I already had the board plugged into the I2C bus of my Raspberry Pi, I thought I’d start out by just sending data to the address and seeing what happens. It’s not a very elegant approach, but it has the potential to quickly turn out some results, or break something, hopefully while learning something along the way.
The first thing I notice is that when the radio is off, 0x16 still appears on the bus. This must be the STM32, so I should be able to issue instructions to it to turn it on? I’ll start with the ultimate brute force of doing an i2cdump, which will effectively do a request for every single register in a block, i.e. everything from 0x00 to 0xff.
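As a rough sketch of what that sweep amounts to, the loop below reads every register address once and formats the results in rows of sixteen, marking failed reads as XX. The read_byte callable is a placeholder of my own; on the Pi it could wrap something like smbus2's SMBus(1).read_byte_data(0x16, reg), but that wiring is an assumption and isn't shown here, so a fake device stands in for the real one.

```python
def dump_registers(read_byte, first=0x00, last=0xFF):
    """Read each register once and return i2cdump-style rows of text."""
    rows = []
    for base in range(first, last + 1, 16):
        cells = []
        for reg in range(base, min(base + 16, last + 1)):
            try:
                cells.append(f"{read_byte(reg):02x}")
            except OSError:
                # The device did not acknowledge this read.
                cells.append("XX")
        rows.append(f"{base:02x}: " + " ".join(cells))
    return rows


def fake_read(reg):
    """Stand-in device for demonstration: 0x00 fails, the rest read 0x0e."""
    if reg == 0x00:
        raise OSError("no ACK")
    return 0x0E


if __name__ == "__main__":
    print("\n".join(dump_registers(fake_read, 0x00, 0x1F)))
```

Passing the reader in as a function keeps the loop testable without hardware attached, which is handy given how easily the real device locks up.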
Register state with the radio off
0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
00: XX XX XX XX XX 00 XX XX XX XX XX 8e 0e 0e 0e 0e XXXXX.XXXXX?????
10: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
20: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
30: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
40: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
50: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
60: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
70: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
80: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
90: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
a0: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
b0: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
c0: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
d0: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
e0: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
f0: 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e 0e ????????????????
Register state with the radio on
0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
00: 00 8e ce XX XX XX XX XX XX XX 00 00 00 00 00 00 .??XXXXXXX......
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
40: 00 00 00 00 00 ff 00 00 00 00 00 00 00 00 00 00 ................
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
80: 00 00 00 00 00 00 00 00 00 00 ff XX ff XX ff XX ...........X.X.X
90: ff XX ff XX ff XX XX XX 00 00 00 00 00 00 00 00 .X.X.XXX........
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
d0: 00 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00 ................
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
This didn’t really turn up much of use. Sure, the output differs depending on the radio state, which might hint towards something, or it might not. One thing of note while I was doing this: the device itself completely locked up, which sort of makes sense if each request is hitting the interrupts and blocking the thread. Also of note was that hitting all the registers while the radio was on caused the lights to flicker on and off a bunch. I don’t yet know if this is because I was hitting a register that controls them, or if the interrupts simply blocked the thread that drives the usual light pulses.
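To make the difference between the two states easier to see, something like the sketch below could parse two saved dumps and list only the registers whose values changed. It assumes the dumps are saved as text in the same layout as above, with XX meaning the read failed. The function names are my own, and the two sample rows are just the first row of each dump.

```python
def parse_i2cdump(text):
    """Map register address -> value (None where the dump shows XX)."""
    regs = {}
    for line in text.strip().splitlines():
        head, _, rest = line.partition(":")
        try:
            base = int(head, 16)
        except ValueError:
            continue  # column header row, not a data row
        # Only the first 16 tokens are hex cells; the ASCII column follows.
        for i, cell in enumerate(rest.split()[:16]):
            regs[base + i] = None if cell == "XX" else int(cell, 16)
    return regs


def diff_dumps(before, after):
    """Return {register: (before, after)} for registers that differ."""
    a, b = parse_i2cdump(before), parse_i2cdump(after)
    return {reg: (a[reg], b.get(reg)) for reg in sorted(a) if a[reg] != b.get(reg)}


RADIO_OFF = "00: XX XX XX XX XX 00 XX XX XX XX XX 8e 0e 0e 0e 0e XXXXX.XXXXX?????"
RADIO_ON = "00: 00 8e ce XX XX XX XX XX XX XX 00 00 00 00 00 00 .??XXXXXXX......"

if __name__ == "__main__":
    for reg, (off, on) in diff_dumps(RADIO_OFF, RADIO_ON).items():
        print(f"0x{reg:02x}: {off} -> {on}")
```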
After hitting everything at once and not really turning up much of use, I then tried hitting one register at a time and watching to see if anything happened, either on my scope, which was still hooked up to the I2C bus, or in the radio’s actual behaviour (light status, sounds etc.). I did notice that hitting 0x02 while the radio was turned on appeared to force a reset: the lights went out and I could see the FM radio chip boot sequence that I discussed in part 1 happen again. I also noticed that if I sent data to 0x01 while the radio was off, it did appear to power it on, and sending 0xff to 0x07 appeared to fairly reliably turn the radio off. I also got some interesting readings from 0x80, where the state sits at 0x0a, but when a command is sent on other registers it flips to 0x00 and then shortly after back to 0x0a. So I wonder if this is some kind of CTS (Clear To Send) flag, or an indication that an instruction is in progress?
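If 0x80 really is a busy/ready indicator (and that is only a guess at this point), a command helper could wait for it to return to 0x0a before sending anything else. Here is a minimal sketch of that idea. The write_reg and read_reg callables are placeholders for whatever actually talks to the bus, and FakeRadio is a made-up stand-in that mimics the behaviour I saw, not the real device.

```python
import time

STATUS_REG = 0x80  # register that appeared to change while a command ran
READY = 0x0A       # value seen at rest


def send_and_wait(write_reg, read_reg, reg, value, timeout=1.0, interval=0.01):
    """Write one register, then poll STATUS_REG until it reads READY again.

    Returns True if READY was seen before the timeout, False otherwise.
    """
    write_reg(reg, value)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if read_reg(STATUS_REG) == READY:
            return True
        time.sleep(interval)
    return False


class FakeRadio:
    """Stand-in device: reports 0x00 for two polls after any write."""

    def __init__(self):
        self.busy_polls = 0
        self.writes = []

    def write_reg(self, reg, value):
        self.writes.append((reg, value))
        self.busy_polls = 2

    def read_reg(self, reg):
        if reg == STATUS_REG and self.busy_polls > 0:
            self.busy_polls -= 1
            return 0x00
        return READY
```

One caveat with this approach: if the real device takes a moment to drop to 0x00 after a write, the first poll could see 0x0a and return too early, so a short delay before polling might be needed.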
At this stage I’m getting some hints of functionality but nothing concrete, nothing to where I could definitively say that this register performs this action, because it did also seem a bit hit or miss; sometimes it wouldn’t do what I thought it should. There was some glimmer of repeatability with register 0x03, as I had noticed that sending 0xff to it did appear to invoke what is perhaps some sort of self test. When it’s sent, the device freezes for a moment, then you hear a burst of radio static (the pre-recorded kind), each of the LEDs lights up in some sort of sequence, and then it goes back to normal. Given the purpose of these test pads, could that be all there is, a register to trigger the self test? We’ll have to dig a bit deeper to fully understand that.
Serial Wire Debug
So brute forcing didn’t really turn up a huge deal, we can see that control is indeed possible via this address on I2C, but no concrete rules on what to send to which registers, and what to expect as a result. So we need to change our approach to something with a little more tact.
On the back side of the main PCB as briefly mentioned in the previous entry there are some pads for Serial Wire Debug, so with the correct debugger maybe we’ll be able to find out some more.
Initially I tried to make use of the SWD pads with the hardware that I had available to me, so I first tested with my scope and was able to confirm that yes CLK was indeed outputting a clock pulse, but I wasn’t able to gather much of use from the DIO data pad. I had an FT232 UART board on my desk which I tried to use in combination with OpenOCD and a few other tools to see if I could parse any data or form any kind of a connection, all without a great deal of success.
So, I bought an ST-LINK v2 in-circuit debugger/programmer, which is the right tool for the job when it comes to STM32s. I soldered wires onto each of the SWD pads on the back of the board and connected them up to the ST-LINK, then used STM32CubeProgrammer to attempt a connection to the STM32 microcontroller on the board. We got a connection! And with the connection made, I read all of the flash data and saved it to disk.
Disassembly
So, now we have the full contents of the flash from the MCU, how do we get answers from it? Software Reverse Engineering (SRE). The particular variant of reversing we’ll be using here is disassembly: we have the raw memory state from the device and we want to work out what it’s doing and how it does it. This is technique number 2 from our earlier Wikipedia article.
Disassembly using a disassembler, meaning the raw machine language of the program is read and understood in its own terms, only with the aid of machine-language mnemonics. It works on any computer program but can take quite some time.
Setting up Ghidra
The tool we’re going to be using for disassembly is Ghidra, not least because it’s the one I have the most experience with, but also because it’s about the only one that I know of that can disassemble ARM for free.
Flash memory on the STM32 starts from address 0x0800 0000, so when we load the binary into Ghidra we’ll set the start address for the binary data to that address. With that loaded and the initial auto analysis performed, already we can see interesting stuff.
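One way to sanity-check the base address outside Ghidra: on a Cortex-M part the first two words of the image are the initial stack pointer and the reset handler address, so with the right base the reset handler should land inside the flash range. The sketch below reads those two words from a dump. The helper names are my own, and the sample bytes are made up for demonstration rather than taken from the radio.

```python
import struct

FLASH_BASE = 0x08000000  # where the STM32 maps its main flash


def read_u32(image, addr, base=FLASH_BASE):
    """Read a little-endian 32-bit word from the dump at an absolute address."""
    offset = addr - base
    if offset < 0 or offset + 4 > len(image):
        raise ValueError(f"0x{addr:08x} is outside the dump")
    return struct.unpack_from("<I", image, offset)[0]


def vector_table_head(image, base=FLASH_BASE):
    """Return (initial stack pointer, reset handler address)."""
    initial_sp = read_u32(image, base, base)
    reset = read_u32(image, base + 4, base)
    return initial_sp, reset & ~1  # bit 0 only marks Thumb mode


# Made-up 8-byte image purely for demonstration, not data from the radio.
SAMPLE_IMAGE = struct.pack("<II", 0x20001000, 0x080000C1)

if __name__ == "__main__":
    sp, reset = vector_table_head(SAMPLE_IMAGE)
    print(f"initial SP 0x{sp:08x}, reset handler 0x{reset:08x}")
```

The same read_u32 helper is also a quick way to look up what a literal such as a DAT_ label holds without leaving the terminal.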
Mapping out STM32 memory
So to make some sense of where to look I started out by studying the datasheets for the STM32F0 series. Section 2.2.2 discusses the memory map and register boundary addresses; this is where I’d previously worked out where to read the main flash memory from and, as a result, the starting address we used when loading the binary dump of the flash into Ghidra. Other memory addresses discussed in this section lay out where the peripherals and various other internal registers live. In the context of what we’re trying to discover about this board, the most important ones are the I2C buses.
A lot of the references to memory locations in the disassembled code will point into these address spaces, so it would be useful for us to map those out in Ghidra; then, when there is an external call to a peripheral, we know which things are being sent to which peripherals. In Ghidra these locations can be defined in the Tools > Memory Map window. To start with I was going through the datasheet and manually entering the ranges one at a time, but this quickly became tedious and the coder in me wanted to find a way to automate it.
After a lot more research I came across SVD-Loader for Ghidra. An SVD file is essentially an XML file that has a standardised description of ARM based systems, you can read more about that here. So with the matching SVD file for the STM32 present in the radio, we could use this plugin to parse the SVD and load all the memory regions. I ended up finding those on the Arm Keil website here.
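To give a feel for what the plugin is reading, here is a small sketch that pulls peripheral names and address ranges out of SVD-style XML using only the standard library. The embedded fragment is a heavily cut-down illustration I wrote by hand, not the real vendor file, and peripheral_regions is my own name rather than anything from SVD-Loader.

```python
import xml.etree.ElementTree as ET

# Hand-written, cut-down fragment in the SVD layout (illustrative only).
SAMPLE_SVD = """<device>
  <name>EXAMPLE</name>
  <peripherals>
    <peripheral>
      <name>I2C1</name>
      <baseAddress>0x40005400</baseAddress>
      <addressBlock>
        <offset>0x0</offset>
        <size>0x400</size>
        <usage>registers</usage>
      </addressBlock>
    </peripheral>
  </peripherals>
</device>"""


def peripheral_regions(svd_text):
    """Return a list of (name, first address, last address) tuples."""
    root = ET.fromstring(svd_text)
    regions = []
    for periph in root.iter("peripheral"):
        block = periph.find("addressBlock")
        if block is None:
            continue  # peripherals derived from another may omit the block
        base = int(periph.findtext("baseAddress"), 0)
        start = base + int(block.findtext("offset"), 0)
        size = int(block.findtext("size"), 0)
        regions.append((periph.findtext("name"), start, start + size - 1))
    return regions


if __name__ == "__main__":
    for name, start, end in peripheral_regions(SAMPLE_SVD):
        print(f"{name}: 0x{start:08x}-0x{end:08x}")
```

Each tuple corresponds to one entry you would otherwise type into the Memory Map window by hand.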
Understanding data structures
With the memory locations for the hardware now defined, pointers from flash to peripherals will resolve. The SVD-Loader has also helpfully created some default data structures to break down the regions of memory for the peripherals, so in the example below, I2C1 now has chunks of 4 bytes allocated to uints (32-bit unsigned integers) for each register of the peripheral, e.g. control registers, interrupts etc.
These autogenerated structures are a great starter to help us know where to look, but to understand things even deeper it would be helpful for us to match these data structures up to the data structures that were likely used to write the initial code. As I don’t know exactly how the initial code was written, I’m going to take some educated guesses and see where that leads us.
Adding more data types
So, making some educated guesses and assumptions, we’re going to assume that the software we’re analysing was generated with the standard STM32CubeMX platform, which would mean for the hardware etc. that’s the STM32CubeF0 HAL Driver. Ghidra has a helpful built in Parse C Source option that can parse C source and headers and it will create the data types to allow us to map memory regions to data structures that should more closely resemble where the binary came from.
Initially I struggled to add all of the HAL drivers via this method as Ghidra was throwing all kinds of errors, but researching some of these errors it seemed like I was not alone in this experience, and helpfully I came across some Python scripts on GitHub that will reformat header files into a format that Ghidra should be able to parse.
With all the files processed by gdt_helper.py I loaded them via Parse C Source. This took a few iterations, and I discovered I needed to add -DSTM32F030xC to the parse options, based on some of the conditional includes in the header files, to make sure I ended up with all the appropriate headers for this MCU.
Decompilation
So far this has just been a lot of precursory work, but it’s work that in theory makes it easier to get to where we’re aiming for. We could at this stage try to make sense of the machine code and dig around, but now that all the above data types are in place we can begin to decompile the machine code and try to make sense of what the device is programmed to do. Again, back to our Wikipedia definition.
Decompilation using a decompiler, a process that tries, with varying results, to recreate the source code in some high-level language for a program only available in machine code or bytecode.
The way I’ve analogised this process in my head though and the way I would best describe it to people is a lot like Michelangelo said about sculpture:
“The sculpture is already complete within the marble block, before I start my work. It is already there, I just have to chisel away the superfluous material.” - Michelangelo
That is to say, what we’re looking for is here in the machine code. All the behaviours and functionality of this radio are right in front of us, we just need to parse through it and make sense of it somehow.
Now I don’t want to give the impression here that any of this process is straightforward. I have spent a month on and off decompiling and navigating around this binary so far; the notes here are just the conclusions or end results, documenting key milestones and where I ended up.
Where to begin
Usually the first place to start with SRE is the entry point of the code, following the trail down from there, which I suppose makes some sense if you want to understand the entire code base. Code structure is like a tree: there’s one entry point and the code can branch in multiple ways depending on certain conditions etc. Here though we’re most interested in the I2C behaviour, and now that we know the memory locations for the I2C registers, we can start there and navigate our way back up, working out every branch that has anything to do with the I2C bus.
Labelling and moving up
This process took me hours and I don’t think it’s ever really completed, and the more I did the more I doubted myself that I was straying too far, so I’m going to highlight the rough process I took and just trust that it was more of the same, over and over again.
So, starting from the I2C1 address range there are a bunch of registers. The first register is CR1, the control register, which is used to control the I2C bus, starting and stopping the device etc. Looking at the parts of the code that reference this register, we can see that there are two distinct functions that touch it. Initially they’ll have default names generated by Ghidra, so we’ll take a look at one of them.
Below is the initial result of the decompilation, and it’s kind of hard to parse. We can see that something called DAT_08000f68 is dereferenced, a bunch of values are set using bitmasks etc., but it’s not super obvious. There are also some undefined types, e.g. undefined4, which shows that there is a reference to a data type that is 4 bytes long but whose type is not defined.
undefined4 FUN_08000e82(void)
{
    uint *puVar1;
    undefined4 unaff_r7;

    puVar1 = DAT_08000f68;
    *DAT_08000f68 = *DAT_08000f68 | 0x80;
    *puVar1 = *puVar1 | 0x40;
    *puVar1 = *puVar1 | 0x20;
    *puVar1 = *puVar1 | 0x10;
    *puVar1 = *puVar1 | 8;
    *puVar1 = *puVar1 | 4;
    *puVar1 = *puVar1 | 2;
    puVar1[7] = puVar1[7] | 8;
    puVar1[7] = DAT_08000f6c;
    FUN_0800116a(0x17,0,1);
    FUN_08001172(0x17);
    return unaff_r7;
}
To make some more sense of this we need to methodically go through and define types where we have confidence in what they are, rename variables to help legibility, and leave inline notes to describe what we might be seeing.
Starting here with the variable named DAT_08000f68, back in the machine code view we can locate that variable at memory location 08000f68 and notice that it’s a pointer to 40005400h. This is part of one of the memory address ranges we loaded earlier: it’s the register in I2C1 for CR1.
So now we know that the data type here should be a pointer to the I2C device, where the first 4 bytes specifically are the control register. Looking at the source for the I2C device we can see that this would be an I2C_TypeDef, but the data type here will be a pointer to that type, so we can hit T and set that type. Now Ghidra will change the name from DAT_08000f68 to PTR_I2C1_08000f68. If we want, we can also rename that variable manually, so we’ll name it I2C1 for legibility. The locally dereferenced value that was named puVar1 can be renamed too.
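Applying the struct type is what lets the decompiler turn a raw index like puVar1[7] into a named field: index 7 of a 4-byte array is byte offset 0x1C. The sketch below mirrors the register order as I understand it from the STM32F0 HAL’s I2C_TypeDef (worth checking against the header you parsed) using ctypes, then looks up which field a given word index lands on. The class and function names are mine.

```python
import ctypes


class I2CRegisters(ctypes.Structure):
    """Register order assumed from the STM32F0 I2C_TypeDef; verify locally."""

    _fields_ = [
        (name, ctypes.c_uint32)
        for name in (
            "CR1", "CR2", "OAR1", "OAR2", "TIMINGR", "TIMEOUTR",
            "ISR", "ICR", "PECR", "RXDR", "TXDR",
        )
    ]


def field_for_word_index(index):
    """Translate a uint-array index (as in puVar1[index]) to a field name."""
    offset = index * ctypes.sizeof(ctypes.c_uint32)
    for name, _ctype in I2CRegisters._fields_:
        if getattr(I2CRegisters, name).offset == offset:
            return name
    raise IndexError(f"no register at word index {index}")


if __name__ == "__main__":
    print(field_for_word_index(0), field_for_word_index(7))
```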
The next thing we could clean up here, for some consistency, is the bitmasks: half of them are shown in hex and the other half in decimal, so let’s change all of those to hex and see what we have now.
undefined4 FUN_08000e82(void)
{
    undefined4 unaff_r7;
    I2C_TypeDef *i2c1;

    i2c1 = I2C1;
    I2C1->CR1 = I2C1->CR1 | 0x80;
    i2c1->CR1 = i2c1->CR1 | 0x40;
    i2c1->CR1 = i2c1->CR1 | 0x20;
    i2c1->CR1 = i2c1->CR1 | 0x10;
    i2c1->CR1 = i2c1->CR1 | 0x8;
    i2c1->CR1 = i2c1->CR1 | 0x4;
    i2c1->CR1 = i2c1->CR1 | 0x2;
    i2c1->ICR = i2c1->ICR | 0x8;
    i2c1->ICR = DAT_08000f6c;
    FUN_0800116a(0x17,0,1);
    FUN_08001172(0x17);
    return unaff_r7;
}
It’s not super clear still, but now we can at least see that a bunch of bits are being flipped on CR1, we can dig through the header files for the HAL driver to find the CR1 options and they are mostly defined by enums.
So now back to our decompiled code. Looking at the first line we have I2C1->CR1 = I2C1->CR1 | 0x80; which takes the value of the control register and sets it to a logical OR of the current value and 0x80, effectively setting the 8th bit (bit 7) to the on position. Looking at the bit definitions in the HAL header, 0x80 is I2C_CR1_ERRIE, or “Errors interrupt enable”. This line of code is enabling interrupts on errors, so we can add a comment in the code to say that’s what this is and move on to the next line, and so on, and so on. The gist of this function is now clearer: it’s enabling transmit and receive etc., effectively setting up the I2C device, so we can now rename this function from FUN_08000e82 to something like EnableI2CDevice.
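Rather than looking each mask up by hand, a small lookup table can name the bits for us. The table below is my own transcription of the low CR1 bits as I read them from the STM32F0 header, so treat the names as something to double-check rather than gospel; only I2C_CR1_ERRIE is confirmed in the text above.

```python
# Low byte of I2C CR1; names transcribed by hand, so verify against the header.
CR1_BITS = {
    0x01: "I2C_CR1_PE",      # peripheral enable
    0x02: "I2C_CR1_TXIE",    # TX interrupt enable
    0x04: "I2C_CR1_RXIE",    # RX interrupt enable
    0x08: "I2C_CR1_ADDRIE",  # address match interrupt enable
    0x10: "I2C_CR1_NACKIE",  # NACK received interrupt enable
    0x20: "I2C_CR1_STOPIE",  # STOP detection interrupt enable
    0x40: "I2C_CR1_TCIE",    # transfer complete interrupt enable
    0x80: "I2C_CR1_ERRIE",   # errors interrupt enable
}


def decode_cr1(mask):
    """Return the names of every known CR1 bit set in mask, lowest bit first."""
    return [name for bit, name in sorted(CR1_BITS.items()) if mask & bit]


# The seven masks OR-ed into CR1 by the decompiled function, in order.
MASKS_SEEN = [0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02]

if __name__ == "__main__":
    for mask in MASKS_SEEN:
        print(f"0x{mask:02x}: {', '.join(decode_cr1(mask))}")
```

Feeding it the seven masks from the function gives one name per line, which is exactly the set of comments we want to add in Ghidra.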
Going back to the other function that also touched this CR1 register, it looks much the same but flips the bits off, so this one appears to disable the device. At this stage, back to our tree analogy, we would move up the tree and see what calls EnableI2CDevice. There are a few places that call it, so we repeat the above process over and over again: try to determine what we can, label it, assign data types, comment it, and move up again, ad infinitum.
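The "move up the tree" step is really just a breadth-first walk over callers. As a sketch of the bookkeeping, assuming you have noted down who calls what as you go, the function below lists everything that sits above a starting function, nearest callers first. Apart from EnableI2CDevice, the function names in the sample map are made up for illustration and are not from the binary.

```python
from collections import deque


def callers_upward(callers_of, start):
    """Breadth-first walk from start up through everything that calls it."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        fn = queue.popleft()
        for caller in callers_of.get(fn, ()):
            if caller not in seen:
                seen.add(caller)
                order.append(caller)
                queue.append(caller)
    return order


# Hypothetical notes: callee -> list of functions that call it.
SAMPLE_CALLERS = {
    "EnableI2CDevice": ["FUN_example_a", "FUN_example_b"],
    "FUN_example_a": ["FUN_example_main"],
    "FUN_example_b": ["FUN_example_main"],
}

if __name__ == "__main__":
    print(callers_upward(SAMPLE_CALLERS, "EnableI2CDevice"))
```

Tracking the seen set matters here, because the same caller often turns up along several branches and you only want to label it once.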
Findings
So in the previous article one of the things we identified on the I2C bus was the startup sequence of the FM radio, where a series of properties are configured. Once we follow the branches in the decompiled code for a while, that sequence surfaces in the same order we observed on the wire, which is reassuring: the process works and we’re on the right track. With that confirmation we can build outwards: anywhere else that calls FMRadio_SetProperty, we can be confident that is what that code is doing.
We also know that anything that calls this function is something that’s being called to power the radio up, so either a button press handler or an I2C event etc. Only two other functions call this one: one is going to be the “normal” startup, e.g. pressing the physical button, and the other, perhaps, via I2C?