Sunday, August 28, 2016

Hacking the bq8030 with SANYO firmware

As mentioned in the previous article, the bq8030 is the blank version of the bq20z90. If you bought some from Aliexpress they'd come with the TI Boot ROM, and you could use the flashing tool included in SMBusb to upload firmware and eeprom (data flash) to them.
Theoretically you could turn it into a bq20z90 by downloading the firmware from one and uploading that. (The procedure for accessing the Boot ROM on those chips is documented in datasheets and application notes.)



So how would you even start with a BQ8030 running proprietary firmware?

Google. Lots of Google.
Apparently they sell this tool for them:




Now with a SPECIAL! price of ONLY 3 THOUSAND US DOLLARS!! WHAT AN AMAZING DEAL!!!

I gathered everything I could find about this device and while it wasn't much it did provide clues that came in handy later on in the process. Especially this screenshot of the software that comes with it:




There was no way I could figure everything out based on just that but I did take notice of the function bar on the bottom.

Those could very well be SMBus commands right there.. would they have done that? Surely not.
Not really expecting much I tried a word write of 0x0214 to command 0x71 aand.. nothing obvious happened. So I moved on to poking at other things but eventually came back for a second look and that's when I realized:

Command scan starting at 0x70 before sending command

$ smbusb_scan -w 0x16 -b 0x70
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.0
Scanning for command writability..
Scan range: 70 - ff
Skipping: None
------------------------------------
[71] ACK, Byte writable, Word writable
[72] ACK



And after


$ smbusb_comm -a 16 -c 71 -w 0x0214 
$ smbusb_scan -w 0x16 -b 0x70
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.0
Scanning for command writability..
Scan range: 70 - ff
Skipping: None
------------------------------------
[71] ACK, Byte writable, Word writable
[72] ACK
[73] ACK


So this actually unlocks an extra command which disappears again when an SBS command is issued (or when doing a full command scan starting from 0.)
The command however is not writable. Reading it returns:

$ smbusb_comm -a 16 -c 73 -r 2
023d

Interesting but insufficient.
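
Comparing the two scans by eye works for three lines, but it gets tedious on a full 00 - ff range. Here's a minimal sketch of how the before/after comparison could be automated. It only parses the "[xx] ACK" lines exactly as they appear in the output above; the function name is my own invention and not part of SMBusb.

```python
import re


def acked_commands(scan_output: str) -> set[int]:
    """Collect command codes from lines such as '[71] ACK, Byte writable'."""
    pattern = re.compile(r"^\[([0-9a-fA-F]+)\]\s+ACK", re.MULTILINE)
    return {int(match.group(1), 16) for match in pattern.finditer(scan_output)}


# The relevant lines of the two scans shown above.
before = """\
[71] ACK, Byte writable, Word writable
[72] ACK
"""
after = """\
[71] ACK, Byte writable, Word writable
[72] ACK
[73] ACK
"""

# Commands that answer after the 0x0214 write but did not answer before it.
newly_unlocked = sorted(acked_commands(after) - acked_commands(before))
print([hex(cmd) for cmd in newly_unlocked])
```

Any command that only shows up in the second set is a candidate for something the password unlocked.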

Brick wall meet impatience

I couldn't really get any further with just that information so I started looking at the hardware instead. Having found slides from a TI presentation revealing the connection between the BQ8030 and bq20z90 I opened up the datasheet for the latter (since there's no public datasheet for the former).


Ok, nothing straightforward. No obvious BOOT pin as one would expect with a device that's not meant to be tampered with. But maybe pulling some pin high or low during reset will get me somewhere.

After the first pass no, not really. So maybe we have to set multiple pins into multiple states for it to work. Or maybe there's no such combination at all.
How about I try to abuse N/C pins instead. I have no logical explanation as to why I came to this decision. Maybe I saw a presentation somewhere about blackbox chips and N/C pins years and years and years ago but I could just be imagining things. Either way, about 5 minutes of poking at PIN #28 with a resistor connected to 3.3v in hand and triggering RESET at random intervals while running a continuous command scan:

$ smbusb_scan -w 0x16
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.1
Scanning for command writability..
Scan range: 00 - ff
Skipping: None
------------------------------------
[0] ACK, Byte writable, Word writable, Block writable
[1] ACK
[2] ACK
[3] ACK
[4] ACK, Byte writable, Word writable, Block writable
[5] ACK, Byte writable, Word writable, Block writable
[6] ACK, Byte writable, Word writable
[7] ACK, Byte writable, Word writable
[8] ACK
[9] ACK, Byte writable, Word writable
[a] ACK, Byte writable, Word writable


Wow, that worked?
Umm.. ok.. let's just reset for now..

$ smbusb_sbsreport
SMBusb Firmware Version: 1.0.1
-------------------------------------------------
Manufacturer Name:          ERROR
Device Name:                ERROR
Device Chemistry:           ERROR
Serial Number:              4294967287
Manufacture Date:           1980.00.00


Uh-oh.. Well that's not good!
It seems we're stuck in the Boot ROM. Is the chip fried? It's at this point that I coded up the flash tool to try and read the flash contents. (I wasn't really bothered by the chip dying as this was one of 2 sacrificial controller boards I kept just for messing around with.)
And the results? Apparently we can corrupt (ideally just) the first couple of blocks of flash if we bully PIN #28 while the chip is trying to start up. The good news though? (If we're lucky) We get 99% of the firmware, and thanks to Charlie Miller we have a disassembler(zip) for it.

Did messing with Pin #28 even have an effect? Could it just have been the erratic resetting of the chip that triggered the malfunction? Did I short VCELL+ to Pin28 while messing about? Was there high voltage on VCELL+? Was it just ESD?
No idea. But I did manage to reproduce the result on another chip using the same procedure. So when in doubt and you have nothing to lose, act like a caveman, I guess?
The only good thing about this method is that even if you have zero knowledge of whether the firmware even HAS a method for entering the Boot ROM, let alone what it is, there's still a high chance that you'll get in. How much of the firmware survives is another question.

Disassembly

A couple of hours of staring at unfamiliar assembly code later, here are the relevant parts for entering the Boot ROM with annotations:

cmd_handle_71:
    ..      
    calls       smb_ACK
    ..
    calls       smbSlaveRecvWord
    move        a, (i3,0x1A)
    or          a, (i3,0x1B)
    jeq         check_71_pass
    move        r2, (i3,0x1B)
    add         r2, (i3,0x19) ; smb_word_LSB
    move        r3, (i3,0x1A)
    addc        r3, (i3,0x18) ; smb_word_MSB
    or          a, r3, r2
    jeq         accesslevel_oreq_40
    move        a, #0
    move        (i3,0x1A), a
    move        (i3,0x1B), a
   
check_71_pass:
    ..
    move        i1l, (i3,0x19) ; smb_word_LSB
    move        i1h, (i3,0x18) ; smb_word_MSB
    cmp         i1h, #2
    jne         wrong_pass
    cmp         i1l, #0x14  ; is 71 0214?
    jne         wrong_pass
    ..
    jeq         accesslevel_oreq_80


This is the first password check. Seem familiar? It's the one that we saw in the screenshot above: 0x0214 to 0x71. It sets an access flag that gets checked later on. Basically if (smbSlaveRecvWord(0x71) == 0x0214) { access_level |= 0x80 };
But wait.. It can set two access flags depending on whatever (i3,0x1A) and (i3,0x1B) are. Hrmm.. Well, I don't know what those are and can't find where they're set, so let's assume the first jeq will not jump once we've given the correct first password, because that would make sense. We can also see that it checks the word we send against those mystery bytes somehow (there's an add with carry in there) and if it likes what it sees it sets access flag 0x40 and the mystery bytes to 0.
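
To make the control flow easier to follow, here is a toy Python model of the handler as I read the listing. This is a sketch under assumptions, not the firmware: the function and constant names are made up, the 16-bit "mystery" value stands for the bytes at (i3,0x1A) and (i3,0x1B), and since the listing doesn't show where that value gets set, the model simply takes it as an input. What happens on the wrong_pass path is also not shown, so the model does nothing there.

```python
ACCESS_FIRST = 0x80   # flag set by the 0x0214 password
ACCESS_SECOND = 0x40  # flag set when the word cancels out the mystery value


def handle_cmd_71(word: int, mystery: int, access_level: int) -> tuple[int, int]:
    """Return (new_mystery, new_access_level) after a word write to 0x71."""
    if mystery != 0:
        # add/addc followed by "or a, r3, r2": the result is zero only if
        # the 16-bit sum of the mystery value and the word wraps around to 0.
        if (mystery + word) & 0xFFFF == 0:
            # The listing shows the clear on the mismatch path; the reading
            # in the text is that the bytes are zeroed on success as well.
            return 0, access_level | ACCESS_SECOND
        mystery = 0  # wrong answer: clear the mystery bytes and fall through
    if word == 0x0214:
        access_level |= ACCESS_FIRST
    return mystery, access_level


# First password with no mystery value pending: only flag 0x80 gets set.
print(handle_cmd_71(0x0214, 0x0000, 0x00))
# A word that sums with the pending mystery value to 0x10000: flag 0x40 is added.
print(handle_cmd_71(0xFFFF, 0x0001, 0x80))
```

The second call uses a made-up mystery value of 0x0001 purely to show the wrap-around.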

A little bit further up we find the entry point for the Boot ROM:

cmd_handle_70:
    *snip*
    move        r3, access_level
    and         r3, #0x40
    cmp         r3, #0        ; don't even bother if access
    jeq         cmd_handle_71 ; flag 0x40 is missing          
    *snip*   
    calls       smbSlaveRecvWord
    move        r2, (i3,0x19) ; smb_word_LSB
    move        r3, (i3,0x18) ; smb_word_MSB
    cmp         r3, #5
    jne         wrong_pass
    cmp         r2, #0x17      ; is 70 0517?
    jne         wrong_pass
    *snip* (prepare leaving the firmware safely)
    calls       bootrom_execute

So now we know pretty much what we need to do.

1. Send 0x0214 to 0x71
2. ???
3. Send 0x0517 to 0x70
4. Profit

And we've made the educated guess that Step 2 is really "Send 0x???? to 0x71", so we're pretty much done with the disassembly: 16 bits is well within the realm of bruteforceability. Since I had another sacrificial board as well as a battery pack running SANYO firmware, I had everything I needed to attempt it.
As it turns out there's another mandatory step between 1 and 2, and it was sheer luck that I left it in my brute force loop: 0x73, the command unlocked by sending the first password, needs to be read before entering the second password. Which is...*drumroll*

0xFDC3

After realizing that the first unlocked command is important (why else would they have made it mandatory?), it's not that surprising that adding the number returned by it (0x023d) to the bruteforced value gives a nice round result: 0x10000. That is probably what the adding in the assembly and the mystery numbers are all about.
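
For the curious, the brute force loop looks roughly like the sketch below. This is NOT the one-off tool I used; it runs against a stand-in class (FakePack) whose behaviour is assumed from what's described above, and the "unlocked" attribute is a placeholder for however you detect success on real hardware. All names are hypothetical.

```python
class FakePack:
    """Stand-in for the controller; behaviour is assumed, not dumped."""

    def __init__(self, challenge: int = 0x023D):
        self._challenge = challenge
        self._armed = False  # first password accepted
        self._seen = False   # 0x73 has been read since then
        self.unlocked = False

    def write_word(self, cmd: int, word: int) -> None:
        if cmd != 0x71:
            self._armed = self._seen = False
            return
        if self._armed and self._seen and (self._challenge + word) & 0xFFFF == 0:
            self.unlocked = True
        # Any write of the first password (re)starts the sequence.
        self._armed = word == 0x0214
        self._seen = False

    def read_word(self, cmd: int) -> int:
        if cmd == 0x73 and self._armed:
            self._seen = True
            return self._challenge
        self._armed = self._seen = False
        return 0


def brute_force(pack):
    """Try every 16-bit word as the second password, redoing steps 1 and 2 each time."""
    for candidate in range(0x10000):
        pack.write_word(0x71, 0x0214)     # first password
        pack.read_word(0x73)              # the step that turned out to be mandatory
        pack.write_word(0x71, candidate)  # guess at the second password
        if pack.unlocked:
            return candidate
    return None


found = brute_force(FakePack())
print(hex(found), hex(found + 0x023D))
```

Drop the read of 0x73 from the loop and the stand-in never unlocks, which is the same trap I nearly fell into.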

So to sum it all up:

1. Send 0x0214 to 0x71
2. Read Word X from 0x73
3. Send (0x10000 - X) to 0x71
4. Send 0x0517 to 0x70
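
The four steps above can be sketched in Python like this. The bus interface (write_word/read_word) is hypothetical and only there so the sequence has something to talk to; it is not the SMBusb API. RecordingBus is a dummy that logs the traffic and answers the 0x73 read with the value my pack returned. The address 0x16 is the one used in the scans above.

```python
from typing import Protocol


class SMBusLike(Protocol):
    """Hypothetical minimal bus interface, an assumption for this sketch."""

    def write_word(self, address: int, command: int, value: int) -> None: ...
    def read_word(self, address: int, command: int) -> int: ...


def second_password(challenge: int) -> int:
    """Step 3: the word that sums with the 0x73 value to 0x10000."""
    return (0x10000 - challenge) & 0xFFFF


def enter_boot_rom(bus: SMBusLike, address: int = 0x16) -> int:
    bus.write_word(address, 0x71, 0x0214)                      # step 1
    challenge = bus.read_word(address, 0x73)                   # step 2
    bus.write_word(address, 0x71, second_password(challenge))  # step 3
    bus.write_word(address, 0x70, 0x0517)                      # step 4
    return challenge


class RecordingBus:
    """Dummy bus that logs traffic and answers 0x73 with 0x023d."""

    def __init__(self) -> None:
        self.log = []

    def write_word(self, address: int, command: int, value: int) -> None:
        self.log.append(("w", command, value))

    def read_word(self, address: int, command: int) -> int:
        self.log.append(("r", command))
        return 0x023D


bus = RecordingBus()
enter_boot_rom(bus)
for entry in bus.log:
    print(entry[0], " ".join(f"{field:04x}" for field in entry[1:]))
```

Computing the word from whatever 0x73 returns, instead of hardcoding fdc3, should cover packs that return a different value.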

Actually, sending the correct word in Step 3 will unlock several extra commands, not just 0x70 for the Boot ROM entry, but they all disappear as soon as you send an unrelated command, much the same way 0x73 does after the first password.

We don't really care about those though because we already have what we wanted:

$ smbusb_comm -a 16 -c 71 -w 0214
$ smbusb_comm -a 16 -c 73 -r 2

023d
$ smbusb_comm -a 16 -c 71 -w fdc3
$ smbusb_comm -a 16 -c 70 -w 0517

$ smbusb_bq8030flasher -p prg.bin -e eep.bin
------------------------------------
        smbusb_bq8030flasher
------------------------------------
SMBusb Firmware Version: 1.0.1
PEC is ENABLED
TI Boot ROM version 3.1
------------------------------------
Reading program flash
.............................................................
.............................................................
*snip*
.................................................
Done! 
Reading eeprom(data) flash
...................................................
Done!

$ xxd eep.bin
0000000: ffff 0031 076c 00c8 ffff 11f8 19e0 0355  ...1.l.........U
0000010: 0853 414e 594f 0030 3820 20ff ffff 0407  .SANYO.08  .....
0000020: 0b49 424d 2d34 3254 3532 3531 2020 2020  .IBM-42T5251   
0000030: 044c 494f 4e20 ffff ffff ffff ffff ffff  .LION 

*snip*


Huzzah!


Reset


To actually remove the permanent failure flag we need to look at the eeprom area.
The file is 2048 bytes and it has two sections.

The first 1024 bytes contain the static data (the beginning of which you can see in the hex dump above). This is all the data set by the manufacturer that never changes during the lifetime of the battery: design capacity/voltage, serial and model numbers, default settings, etc.
This part is protected by a checksum somewhere which you'll need to find and fix if you want to modify anything in there.

The second part contains the dynamic data. Basically the "log" of the battery with current remaining capacity and similar things that get updated as the battery is cycled.  Also, the failure flag.

You pretty much just need to start mapping out the values and then zeroing or FF-ing out the ones that you can't map to anything to see if that fixes it or breaks something else. There's no checksum on the dynamic area so you are free to modify this section all you want. Repeat until desired outcome is reached. That's what I did.
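
One thing that can speed up the mapping is diffing two dumps taken at different points in the battery's life, since only the dynamic half should change. A minimal sketch follows, assuming the 2048 byte layout described above (1024 static + 1024 dynamic); the function names are made up and the example data is synthetic, not a real dump.

```python
STATIC_SIZE = 1024  # first half: manufacturer data, checksum protected
DUMP_SIZE = 2048    # total size of eep.bin as described above


def split_dump(dump: bytes) -> tuple[bytes, bytes]:
    """Split an eeprom dump into its (static, dynamic) halves."""
    if len(dump) != DUMP_SIZE:
        raise ValueError(f"expected {DUMP_SIZE} bytes, got {len(dump)}")
    return dump[:STATIC_SIZE], dump[STATIC_SIZE:]


def changed_offsets(old: bytes, new: bytes) -> list[int]:
    """File offsets in the dynamic area whose bytes differ between two dumps."""
    _, old_dyn = split_dump(old)
    _, new_dyn = split_dump(new)
    return [
        STATIC_SIZE + i
        for i, (a, b) in enumerate(zip(old_dyn, new_dyn))
        if a != b
    ]


# Synthetic example: two all-zero dumps that differ in one byte at 0x500.
old = bytes(DUMP_SIZE)
new = bytearray(old)
new[0x500] = 0xDE
print([hex(offset) for offset in changed_offsets(old, bytes(new))])
```

With real dumps you'd read the two files with open(path, "rb") and feed the contents in the same way.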
Some helpful tips:
  • On my specific battery the log starts at 0x500 and has several entries that all need to be modified (mostly duplicate data)
  • Battery capacity is stored as the remaining capacity reported through SBS divided by 2.
  • Cycle count is stored as CycleCount-1 (e.g. SBS value: 223, Eeprom byte: 222)
  • Remaining Capacity Alarm is stored as-is. A good place to start mapping.
  • It's a good idea to reset the cycle counter. I don't want to start conspiracy theories but... at least with this specific model there have been a lot that died inexplicably around the 200 cycle mark. Coincidence? Probably, but it can't hurt.
  • Please don't ask me to fix eeprom dumps :-) 
  • Good luck!
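
The two encodings from the tips above, written out as a small Python sketch. Field widths, byte order and offsets are deliberately left out because they are specific to the pack (and I only know mine), and what a cycle count of 0 maps to is an open question, so the helper refuses it. The function names are hypothetical.

```python
def capacity_to_stored(sbs_capacity: int) -> int:
    """Remaining capacity as reported through SBS, divided by 2."""
    return sbs_capacity // 2


def stored_to_capacity(stored: int) -> int:
    """Inverse of capacity_to_stored (an odd SBS value loses its last unit)."""
    return stored * 2


def cycle_count_to_stored(sbs_cycle_count: int) -> int:
    """Cycle count is stored as CycleCount-1."""
    if sbs_cycle_count < 1:
        raise ValueError("mapping for a cycle count below 1 is not known")
    return sbs_cycle_count - 1


def stored_to_cycle_count(stored: int) -> int:
    return stored + 1


# The example from the list above: SBS reports 223, the eeprom holds 222.
print(cycle_count_to_stored(223))
# An SBS capacity of 4000 (made-up number) would be stored as 2000.
print(capacity_to_stored(4000))
```

Searching the dynamic area for these converted values is a quick way to find the first few fields.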
And the result:


It took the estimate a charge cycle to normalize. This particular battery lasts 2 hours on constant high CPU load after external recharging and clearing of the fail flag. Not bad for a 10 year old battery in a 12 year old machine, and since the other choice was THROW IT AWAY AND BUY A NEW ONE, I consider this a win :)

29 comments:

  1. It was such a pleasure to read this! Thanks for posting it.

    Secret hacking recipe: "How about I try to abuse N/C pins instead."
    :o)

  2. I've always wondered how boot ROM is loaded onto these chips in the first place. If one can overwrite the gauge chip with a custom boot ROM, wouldn't that allow loading arbitrary firmware?

    Also, do these chips have a JTAG interface that will allow for easy uploading/downloading of the firmware?

    Replies
    1. There wouldn't be a need to mess with the boot rom if this low-level programming access was available. Could just use it to manipulate the other parts directly :-) But for all we know the boot rom could be stored in actual mask ROM on the chip, never to be changed.

      I wouldn't be surprised if there was some sort of hardware-level programming interface that they use to upload the boot rom in a manufacturing step but there's no way of knowing what/where it is or whether it was permanently disabled afterwards or not (that I know of).

  3. Hi. Good job. I would like to ask: does anyone know how to unlock charging in Dell batteries with an external power supply? I know Dell batteries must have 100 ohm resistance from GND to the SYSpress pin; discharge works, but charge does not. Maybe someone knows an SMBus command to unlock charging?

    Replies
    1. No clue, sorry. I haven't worked on them and Dells seem to be one of those packs that have "special" firmwares based on screenshots from payware battery hacking software. Honestly if you're not hell bent on not paying a cent to anyone like I was you're probably better off finding a local re-celling business. Good luck!

    2. Hi. It is possible. Just send the command 0x108 (in hex) to manufacturer access.

  4. Hi,

    I found this blog using Google after scratching my head over a few days.
    My ThinkPad T430 battery has gone kaput. I disassembled the battery. The cells are good, each holds a charge of 3.9V, but there is no output on the battery header (7 pins).
    My guess is the controller chips seem to think that it's time to declare death to my battery.
    How do I reset these controller chips? Is it even possible to write new firmware to them so they will agree to work again?
    I see a bd8030A and bq29330 chip.
    Below are pictures of my actual battery, disassembled.
    imgur.com/a/8IJi2

    Replies
    1. The method described in the article above should work with that pack, yes. Some experience with electronics and reverse engineering binary data files is required. The article about fuses (sidebar, 2016, September) is also relevant. I can't offer any assistance beyond that. Good luck!

  5. Very good job!
    I have a battery here from a Thinkpad Edge E520 which is working, but has low capacity. I want to try swapping the cells for new Sanyo NCR18650GA 3500mAh ones and changing the FW to the right capacity.
    I hope that it will be a success, I've never tried playing with laptop battery chips :)

    Replies
    1. I hope so too! Let me know how it turns out.

  6. Hi, nice guide; I'm trying this out on some old X61 batteries
    I read out the eeprom and flash areas successfully, but when I do sbsreport subsequently I'm stuck with the error messages (with the date at 1980 etc, as you had above). How do you reset that?

    Replies
    1. Hi,

      Did you try "smbusb_bq8030flasher --execute" ?

  7. Hi Victor.
    I'm trying to recover my Dell battery. It has a bq8050 on the board. And I have a question: how were you able to get the firmware from the bq8030 before disassembling the code? I don't understand how you knew about the 0x0214 password, and I don't understand 0xfdc3 at all... Thank you!

    Replies
    1. I have no clue about the bq8050 so keep in mind that it may not be compatible with the tool I've released. Also keep in mind that this bootrom entry method is SPECIFIC to the Sanyo firmware. Dell likely has their own custom firmware even if the cells used in the pack are Sanyo based on what I've seen in payware battery hacking software so this method is unlikely to work.

      On to the questions:
      It was possible to glitch the bq8030 into clearing the first block of program flash hence ending up in the boot rom "permanently". The method is written in the article. You will not be able to recover the chip after this without an intact firmware image so I wouldn't try it if you only have one battery and also YMMV. I had several controller boards to sacrifice so I didn't care. Once you're in the boot rom you can read the (corrupted, first block missing) firmware and since you're only missing initialization code at the start you can still disassemble this to extract the password(s) that you can then use on other batteries. Password #3 was brute-forced as I've also written. This was done with a one-off tool that isn't included on github. From there it was guesswork to arrive at the significance of the value read from 0x73 and the final password but the correctness of this has since been confirmed by others.

  8. Hi Viktor,
    You have done amazing work! Also very well presented!
    I'm waiting for the adapter board to try to mess around with my battery :)
    Thank you very much for sharing it!

    Replies
    1. Hey, glad you found it interesting!

    2. Hi Viktor,
      I received the adapter yesterday and managed to patch capacity and cycle counts in my battery.
      The problem with my battery is that I installed 4.2V cells instead of 4.3V. There are values in the first 1024 bytes of the eeprom that look like 4300 and 12900 and I would like to try to adjust them, but I need to figure out where the checksum is and how to recalculate it. It would be easier to do with several eeprom dumps. Can you share your eeprom dump?

    3. Shoot me an email :-) (search for "Contact" on the page)

  9. I was just wondering, is it possible to communicate with the battery using the laptop's internal i2c (SMBus)?
    Linux has userspace utilities i2c-tools. Is it possible to access installed battery's rom/eeprom from a running system?
    Thanks!

    Replies
    1. AFAIK you can't access the EC's SMBus through the machine. The EC manages the bus by itself and only provides an abstracted interface that's accessible to drivers. It might be technically possible but ECs are similar to battery controllers in that you'd have to find and reverse engineer this functionality on each and every one of them and I don't see EC i2c bus drivers in the linux kernel tree. I could be wrong though but that's my understanding :-)

  10. Hello how are you. Congratulations on the post.
    I have a Lenovo battery with two ICs (1: BQ8030BT, 2: BQ29330). When replacing the cells, the calibration data and the serial number were lost. The computer recognizes it, but even though it has a load it says "0% and loading".

    I am a programmer and I have done some electronics work, I have eeprom programmers, and I would be grateful if you could tell me how to recover the firmware data.

    thank you very much, Diego

    Replies
    1. Hi,

      The serial/model number and basic factory calibration are usually in the static area. There really isn't a way for them to go away unless your data flash area is corrupted but then the whole pack would just stop working.

      I can't help with individual cases I'm afraid. I could give you a couple of hints if you posted an smbusb_sbsreport of the pack but that's about it.

      Good luck!

  11. Hi, I'm working on a DELL battery with the TI chip BQ30423. Do you think it can be hacked with this tool? I also have an I2C analyzer.

    Thanks in advance.

    Replies
    1. No clue, sorry. I haven't worked on them and Dells seem to be one of those packs that have "special" firmwares based on screenshots from payware battery hacking software. Honestly if you're not hell bent on not paying a cent to anyone like I was you're probably better off finding a local re-celling business. Good luck!

  12. Hi Viktor, interesting article and good job carried out!

    You mentioned a series of cases where this kind of battery failed after reaching 200 charge cycles. It was just an assumption.

    Once you were able to modify the dynamic area of the eeprom, did you play around with that counter value? You mentioned that you just reset it, but if you set it to 195 for example and ran 5 to 10 discharge cycles, it would be possible to confirm or refute the manufacturer conspiracy theory ;)...

    Can you try this trick and share the output with us...

    Replies
    1. Hello,

      Glad you found it interesting!

      I didn't do anything like that, no. That particular pack was fairly weird in that it was basically two batteries in one and the firmware "bank-switched" between a 4cell 18650 and a 4cell flat-cell (which had an obscure model number which I forgot) pack. I was pretty much joking about the conspiracy stuff ;-) If anything it's way more likely that 200 cycles is about where this whole bank-switching business catches up with the amount of cleverness they had to put into the firmware to make this pack work.

      I no longer own the machine or the battery so I can't do any more experiments on it unfortunately.

  13. Did you replace the battery cells as well, or just clear the PF and CC flags?

    When replacing cells, it's required to calibrate for the new cells by changing some data flash fields like battery chemistry, design capacity and so on. It's quite easy with the TI bq Evaluation Software using their EV2300 usb board. But how do you do this using SMBusb? How do you find (map) the eeprom fields relevant for that calibration?

    Thanks for replies

    Replies
    1. I recharged the original cells externally and cleared the fail condition, no re-celling.

      It sounds like you're talking about the calibration values in the static area. I don't think they're as much for the cells as they are for the board/pack configuration. If you replace the cells with new, identical capacity ones then you can leave the static calibration alone and just raise the Full Charge Capacity values in the dynamic area back up to the Design Capacity(+~15% since with new cells you'll probably be over the design capacity and the controller likes adjusting downwards more than up). The controller should then re-learn the actual capacity in a charge cycle or three.

      If you want to re-cell with different capacity cells than what the pack was designed for THEN you'd need to poke at the static calibration. That area is checksum protected and I haven't really looked into it.
