Sunday, August 28, 2016

Hacking the bq8030 with SANYO firmware

As mentioned in the previous article the bq8030 is the blank version of the bq20z90. If you bought some from Aliexpress they'd come up with the TI Boot ROM and you could use the flashing tool included in SMBusb to upload firmware and eeprom(data flash) to it.
Theoretically you could turn it into a bq20z90 by downloading the firmware from one and uploading that. (The procedure for accessing the Boot ROM on those chips is documented in datasheets and application notes.)



So how would you even start with a BQ8030 running proprietary firmware?

Google. Lots of Google.
Apparently they sell this tool for them::




Now with a SPECIAL! price of ONLY 3 THOUSAND US DOLLARS!! WHAT AN AMAZING DEAL!!!

I gathered everything I could find about this device and while it wasn't much it did provide clues that came in handy later on in the process. Especially this screenshot of the software that comes with it:




There was no way I could figure everything out based on just that but I did take notice of the function bar on the bottom.

Those could very well be SMBus commands right there.. would they have done that? Surely not.
Not really expecting much I tried a word write of 0x0214 to command 0x71 aand.. nothing obvious happened. So I moved on to poking at other things but eventually came back for a second look and that's when I realized:

Command scan starting at 0x70 before sending command

$ /smbusb_scan -w 0x16 -b 0x70
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.0
Scanning for command writability..
Scan range: 70 - ff
Skipping: None
------------------------------------
[71] ACK, Byte writable, Word writable
[72] ACK



And after


$ smbusb_comm -a 16 -c 71 -w 0x0214 
$ smbusb_scan -w 0x16 -b 0x70
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.0
Scanning for command writability..
Scan range: 70 - ff
Skipping: None
------------------------------------
[71] ACK, Byte writable, Word writable
[72] ACK
[73] ACK


So this actually unlocks an extra command which disappears again when an SBS command is issued (or when doing a full command scan starting from 0.)
The command however is not writable. Reading it returns:

$ smbusb_comm -a 16 -c 73 -r 2
023d

Interesting but insufficient.

Brick wall meet impatience

I couldn't really get any further with just that information so I started looking at the hardware instead. Having found slides from a TI presentation revealing the connection between the BQ8030 and bq20z90 I opened up the datasheet for the latter (since there's no public datasheet for the former).


Ok, nothing straightforward. No obvious BOOT pin as one would expect with a device that's not meant to be tampered with. But maybe pulling some pin high or low during reset will get me somewhere.

After the first pass no, not really. So maybe we have to set multiple pins into multiple states for it to work. Or maybe there's no such combination at all.
How about I try to abuse N/C pins instead. I have no logical explanation as to why I came to this decision. Maybe I saw a presentation somewhere about blackbox chips and N/C pins years and years and years ago but I could just be imagining things. Either way, about 5 minutes of poking at PIN #28 with a resistor connected to 3.3v in hand and triggering RESET at random intervals while running a continuous command scan:

$ smbusb_scan -w 0x16
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.1
Scanning for command writability..
Scan range: 00 - ff
Skipping: None
------------------------------------
[0] ACK, Byte writable, Word writable, Block writable
[1] ACK
[2] ACK
[3] ACK
[4] ACK, Byte writable, Word writable, Block writable
[5] ACK, Byte writable, Word writable, Block writable
[6] ACK, Byte writable, Word writable
[7] ACK, Byte writable, Word writable
[8] ACK
[9] ACK, Byte writable, Word writable
[a] ACK, Byte writable, Word writable


Wow, that worked?
Umm.. ok.. let's just reset for now..

$ smbusb_sbsreport
SMBusb Firmware Version: 1.0.1
-------------------------------------------------
Manufacturer Name:          ERROR
Device Name:                ERROR
Device Chemistry:           ERROR
Serial Number:              4294967287
Manufacture Date:           1980.00.00


Uh-oh.. Well that's not good!
It seems we're stuck in the Boot ROM. Is the chip fried? It's at this point that I coded up the flash tool to try and read the flash contents. (I wasn't really bothered by the chip dying as this was one of 2 sacrificial controller boards I kept just for messing around with.)
And the results? Apparently we can corrupt (ideally just) the first couple of blocks of flash if we bully PIN #28 while the chip is trying to start up. The good news though? (If we're lucky) We get 99% of the firmware, and thanks to Charlie Miller we have a disassembler(zip) for it.

Did messing with Pin #28 even have an effect? Could it just have been the erratic resetting of the chip that triggered the malfunction? Did I short VCELL+ to Pin28 while messing about? Was there high voltage on VCELL+? Was it just ESD?
No idea. But I did manage to reproduce the result on another chip using the same procedure. So when in doubt and you have nothing to lose, act like a caveman, I guess?
The only good thing about this method is that even if you have 0 knowledge about whether there even IS a method for entering the Boot ROM in the firmware let alone what it is there's still a high chance that you'll get in. How much of the firmware survives is another question.

Disassembly

A couple of hours of staring at unfamiliar assembly code later, here are the relevant parts for entering the Boot ROM with annotations:

cmd_handle_71
    ..      
    calls       smb_ACK
    ..
    calls       smbSlaveRecvWord
    move        a, (i3,0x1A)
    or          a, (i3,0x1B)
    jeq         check_71_pass
    move        r2, (i3,0x1B)
    add         r2, (i3,0x19) ; smb_word_LSB
    move        r3, (i3,0x1A)
    addc        r3, (i3,0x18) ; smb_word_MSB
    or          a, r3, r2
    jeq         accesslevel_oreq_40
    move        a, #0
    move        (i3,0x1A), a
    move        (i3,0x1B), a
   
check_71_pass:
    ..
    move        i1l, (i3,0x19) ; smb_word_LSB
    move        i1h, (i3,0x18) ; smb_word_MSB
    cmp         i1h, #2
    jne         wrong_pass
    cmp         i1l, #0x14  ; is 71 0214?
    jne         wrong_pass
    ..
    jeq         accesslevel_oreq_80


This is the first password check, seem familiar? It's the one that we saw in the screenshot above 0x0214 to 0x71. It sets an access flag that gets checked later on. Basically if (smbSlaveRecvWord(0x71) == 0x0214) { access_level |= 0x80 }; But wait.. It can set two access flags based on whatever (i3,0x1A) and (i3,0x1B) are. Hrmm.. Well I don't know what those are and can't find where they're set so let's assume the first jeq will not jump once we've given the correct first password because it would make sense. We can also see that it checks the word we send against those mystery bytes somehow and if it likes what it sees it sets access flag 0x40 and the mystery bytes to 0.

A little bit further up we find the entry point for the Boot ROM:

cmd_handle_70:
    *snip*
    move        r3, access_level
    and         r3, #0x40
    cmp         r3, #0        ; don't even bother if access
    jeq         cmd_handle_71 ; flag 0x40 is missing          
    *snip*   
    calls       smbSlaveRecvWord
    move        r2, (i3,0x19) ; smb_word_LSB
    move        r3, (i3,0x18) ; smb_word_MSB
    cmp         r3, #5
    jne         wrong_pass
    cmp         r2, #0x17      ; is 70 0517?
    jne         wrong_pass
    *snip* (prepare leaving the firmware safely)
    calls       bootrom_execute

So now we know pretty much what we need to do.

1. Send 0x0214 to 0x71
2. ???
3. Send 0x0517 to 0x70
4. Profit

And we've made the educated guess that Step 2 is really "Send 0x???? to 0x71" so we're pretty much done with the disassembly as 16 bits is way within the realm of bruteforceability and since I had another sacrificial board as well as a battery pack running SANYO firmware I had everything I needed to attempt it.
As it turns out there's another mandatory step between 1 and 2 and it was sheer luck that I left it in my brute force loop. 0x73, the command unlocked by sending the first password needs to be read before entering the second password. Which is...*drumroll*

0xFDC3

After realizing that the first unlocked command is important (why else would they have made it mandatory otherwise) it's not that surprising that when adding the number returned by it (0x023d) to the bruteforced value we get a nice round result: 0x10000 which is probably what the adding in the assembly and the mystery numbers are all about.

So to sum it all up:

1. Send 0x0214 to 0x71
2. Read Word X from 0x73
3. Send (0x10000 - X) to 0x71
4. Send 0x0517 to 0x70

Actually, sending the correct word in Step 3 will unlock several extra commands not just 0x70 for the BootROM entry but they all disappear as soon as you send an unrelated command much the same way as 0x73 does with the first password.

We don't really care about those though because we already have what we wanted:

$ smbusb_comm -a 16 -c 71 -w 0214
$ smbusb_comm -a 16 -c 73 -r 2

023d
$ smbusb_comm -a 16 -c 71 -w fdc3
$ smbusb_comm -a 16 -c 70 -w 0517

$ smbusb_bq8030flasher -p prg.bin -e eep.bin
------------------------------------
        smbusb_bq8030flasher
------------------------------------
SMBusb Firmware Version: 1.0.1
PEC is ENABLED
TI Boot ROM version 3.1
------------------------------------
Reading program flash
.............................................................
.............................................................
*snip*
.................................................
Done! 
Reading eeprom(data) flash
...................................................
Done!

$ xxd eep.bin
0000000: ffff 0031 076c 00c8 ffff 11f8 19e0 0355  ...1.l.........U
0000010: 0853 414e 594f 0030 3820 20ff ffff 0407  .SANYO.08  .....
0000020: 0b49 424d 2d34 3254 3532 3531 2020 2020  .IBM-42T5251   
0000030: 044c 494f 4e20 ffff ffff ffff ffff ffff  .LION 

*snip*


Huzzah!


Reset


To actually remove the permanent failure flag we need to look at the eeprom area.
The file is 2048 bytes and it has two sections.

The first 1024 bytes contains the static data (the beginning of which you can see in the hex dump above). It contains all the data set by the manufacturer that never changes during the lifetime of the battery. Design capacity/voltage, serial and model numbers, default settings, etc.
This part is protected by a checksum somewhere which you'll need to find and fix if you want to modify anything in there.

The second part contains the dynamic data. Basically the "log" of the battery with current remaining capacity and similar things that get updated as the battery is cycled.  Also, the failure flag.

You pretty much just need to start mapping out the values and then zeroing or FF-ing out the ones that you can't map to anything to see if that fixes it or breaks something else. There's no checksum on the dynamic area so you are free to modify this section all you want. Repeat until desired outcome is reached. That's what I did.
Some helpful tips:
  • On my specific battery the log starts at 0x500 and has several entries that all need to be modified (mostly duplicate data)
  • Battery capacity is stored as the remaining capacity reported through SBS divided by 2.
  • Cycle count is stored as CycleCount-1 (eg.: SBS value: 223, Eeprom byte: 222)
  • Remaining Capacity Alarm is stored as-is. A good place to start mapping.
  • It's a good idea to reset the cycle counter. I don't want to start conspiracy theories but... at least with this specific model there's been a lot that died inexplicably around the 200 cycle mark. Coincidence? Probably, but it can't hurt.
  • Please don't ask me to fix eeprom dumps :-) 
  • Good luck!
And the result:


It took the estimate a charge cycle to normalize. This particular battery lasts 2 hours on constant high CPU load after external recharging and clearing of the fail flag. Not bad for a 10 year old battery in a 12 year old machine and since the other choice was THROW IT AWAY AND BUY A NEW ONE  I consider this a win :)

14 comments:

  1. It was such a pleasure to read this! Thanks for posting it.

    Secret hacking recipe: "How about I try to abuse N/C pins instead."
    :o)

    ReplyDelete
  2. I've always wondered how boot ROM is loaded onto these chips in the first place. If one can overwrite the gauge chip with a custom boot ROM, wouldn't that allow loading arbitrary firmware?

    Also, do these chips have a JTAG interface that will allow for easy uploading/downloading of the firmware?

    ReplyDelete
    Replies
    1. There wouldn't be a need to mess with the boot rom if this low-level programming access was available. Could just use it to manipulate the other parts directly :-) But for all we know the boot rom could be stored in actual mask ROM on the chip, never to be changed.

      I wouldn't be surprised if there was some sort of hardware-level programming interface that they use to upload the boot rom in a manufacturing step but there's no way of knowing what/where it is or whether it was permanently disabled afterwards or not (that I know of).

      Delete
  3. Hi. Good job. I would like to ask You, maybe someone knows how to unlock charging in dell batteries with external power supply ? I know dell batteries must have 100 ohm resistance from GND to SYSpress pin, discharge works, but charge not. Maybe someone knows smbus command to unlock charge ?

    ReplyDelete
    Replies
    1. No clue, sorry. I haven't worked on them and Dells seem to be one of those packs that have "special" firmwares based on screenshots from payware battery hacking software. Honestly if you're not hell bent on not paying a cent to anyone like I was you're probably better off finding a local re-celling business. Good luck!

      Delete
  4. Hi,

    I found this blog using Google after scratching my head over a few days.
    My ThinkPad T430 battery has gone kaput. I disassembled the battery. The cells is good, each holds a charge of 3.9V but there is no output on the battery header (7 pins).
    My guess is the controller chips seem to think that it's time to declare death to my battery.
    How to reset this controller chips? Is it even possible to write a new firmware to these chips so they will accept to work again?
    I see a bd8030A and bq29330 chip.
    Below is pictures of my actual battery, disassembled.
    imgur.com/a/8IJi2

    ReplyDelete
    Replies
    1. The method described in the article above should work with that pack, yes. Some experience with electronics and reverse engineering binary data files is required. The article about fuses (sidebar, 2016, September) is also relevant. I can't offer any assistance beyond that. Good luck!

      Delete
  5. Very good job!
    I have here battery from Thinkpad Edge E520 which is working, but have low capacity. I want to try change cell for new Sanyo NCR18650GA 3500mAh and change FW to the right capaity.
    I hope that it will be success, I never tried play with laptop batteries chips :)

    ReplyDelete
    Replies
    1. I hope so too! Let me know how it turns out.

      Delete
  6. Hi, nice guide; I'm trying this out on some old X61 batteries
    I read out the eeprom and flash areas successfully, but when I do sbsreport subsequently I'm stuck with the error messages (with the date at 1980 etc, as you had above). How do you reset that?

    ReplyDelete
    Replies
    1. Hi,

      Did you try "smbusb_bq8030flasher --execute" ?

      Delete
  7. Hi Victor.
    I try to recover my dell battery. It has bq8050 on a board. And I have a question, how do you able to get a software from Bq8030 before disassembling a code? I don't understand, how do you know about 0x0214 password and I fully not understand 0xfdc3... Thank you!

    ReplyDelete
    Replies
    1. I have no clue about the bq8050 so keep in mind that it may not be compatible with the tool I've released. Also keep in mind that this bootrom entry method is SPECIFIC to the Sanyo firmware. Dell likely has their own custom firmware even if the cells used in the pack are Sanyo based on what I've seen in payware battery hacking software so this method is unlikely to work.

      On to the questions:
      It was possible to glitch the bq8030 into clearing the first block of program flash hence ending up in the boot rom "permanently". The method is written in the article. You will not be able to recover the chip after this without an intact firmware image so I wouldn't try it if you only have one battery and also YMMV. I had several controller boards to sacrifice so I didn't care. Once you're in the boot rom you can read the (corrupted, first block missing) firmware and since you're only missing initialization code at the start you can still disassemble this to extract the password(s) that you can then use on other batteries. Password #3 was brute-forced as I've also written. This was done with a one-off tool that isn't included on github. From there it was guesswork to arrive at the significance of the value read from 0x73 and the final password but the correctness of this has since been confirmed by others.

      Delete