Monday, September 5, 2016

The weird fuses in laptop batteries

I called them "externally triggerable fuses" in my first post. The datasheets call them "Combined Thermal Fuse/Resistor" or "Fuse-Resistance protector".
Chances are the first time you see one of these will be in a smart battery. It sure was for me.

As an over-current protection device (i.e. a regular fuse) they're probably only going to trigger during catastrophic short-circuit scenarios. Their main purpose is to give the microcontroller a way to physically break the circuit if it detects a potentially dangerous condition, like the cells overheating, and can't stop it by shutting the FETs off. The problem is that some firmware counts the sudden disappearance/reappearance of cell voltages among the conditions that warrant doing this, so if you want to re-cell a pack you might end up with a blown fuse.
Replacing or jumping them is a pain, so it's a good idea to connect the Reset pin of the controller to ground before doing anything of the sort to keep it from overreacting.


The rectangular one




The Cyntec 12AH3 / 12AG3 (datasheet) or similar devices look like large capacitors at first sight, though the fact that they have 4 terminals might make you think "current sense resistor". Nope!

The black plastic cap attaches to the ceramic body with glue and you can tease it off with a pair of side-cutters, pliers or tweezers. Here's what you see under it:





Left: intact, Right: blown

What you have is a rectangle of "solder" (probably some special alloy) with a resistor/heating element under it in this configuration:


As you can see in the images above, when the device is triggered the thin layer of solder heats up, flows onto the center pad and breaks the connection. The heater is about 10 ohms, so with a 12.6 V battery voltage that's roughly 16 watts (P = V²/R) dissipated on that tiny surface area (or more if the charger voltage is used).
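The heater dissipation follows from P = V²/R. A minimal sketch of that arithmetic, assuming the heater is a plain resistor of exactly 10 ohms (the post only says "about", so a slightly higher real resistance would give a somewhat lower figure). The function name and the 16.8 V charger voltage are illustrative assumptions, not values from the datasheet:

```python
def heater_power_watts(voltage_v: float, resistance_ohm: float) -> float:
    """Power dissipated by a resistive heater: P = V^2 / R."""
    if resistance_ohm <= 0:
        raise ValueError("resistance must be positive")
    return voltage_v ** 2 / resistance_ohm


# 3-cell pack at full charge (12.6 V) across an assumed 10 ohm heater
print(f"{heater_power_watts(12.6, 10.0):.1f} W")

# Hypothetical higher supply voltage, to show how quickly the power grows
print(f"{heater_power_watts(16.8, 10.0):.1f} W")
```

Because the power grows with the square of the voltage, even a modest increase in supply voltage makes the fuse trip considerably faster.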

To check this fuse, look for continuity across the longer sides and for the heater resistance between either contact and Pin 4.
To jump it, carefully dab solder onto the broken connections. Use low temperature and thin solder. You can probably shave away at it afterwards and the fuse MAY be able to blow again.. unreliably.. and completely out of spec.
The best course of action is of course to replace it with a new one once the repair is confirmed.

If the controller is blowing the fuse repeatedly you have an issue in your data or a hardware fault in the controller board or your cell connections.

The 3-legged one






The SEFUSE D6X / D6T (datasheet) is a weird looking thing that doesn't really resemble any other component.

It's the same deal with a different physical construction.





To jump it, break off the cap (if there is one) to find a resin-coated square. Then carefully scrape away at the coating roughly in the middle until you find a tiny via. This is the center point you see in the diagram above. Now you just need to reconnect it to the two leads where the fusible material broke the connection, so tin the via and connect it to both leads (ignore the lead for the heater).

Here's a pretty terrible example:




As with the previous one, it's best to get a new one once the repair is confirmed.

Tuesday, August 30, 2016

Adding the M37512 with Panasonic/IBM firmware

Just "adding" because this battery controller is already public. You have the datasheet(pdf) which tells you the pin combination to enter the Boot ROM and most of the command set (how was the actual read command missed? weird). Then there are open-source flasher tools like this one. You can also use Google to find the passwords, because you WILL need passwords (at least with this firmware), and that is after you set the correct pins to the correct states to enter the Boot ROM. Overkill? Yeah, overkill.

But since it's all out there it's just a matter of coding up a tool for SMBusb.




Quote:
 "Normal microcomputer mode is entered when the microcomputer is reset with pulling CNVSS pin low. In this case, the CPU starts operating using the control program in the User ROM area. When the microcomputer is reset by pulling the P24/SDA2/RXD pin high, the CNVss pin high, the CPU starts operating using the control program in the Boot ROM area"

After setting the pins to desired state and resetting the chip you get:

$ smbusb_scan -w 0x16
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.1
Scanning for command writability..
Scan range: 00 - ff
Skipping: None
------------------------------------
[0] ACK, Byte writable, Word writable, Block writable, >Block writable
[1] ACK, Byte writable, Word writable, Block writable, >Block writable
[2] ACK, Byte writable, Word writable, Block writable, >Block writable
[3] ACK, Byte writable, Word writable, Block writable, >Block writable
[4] ACK, Byte writable, Word writable, Block writable, >Block writable
[5] ACK, Byte writable, Word writable, Block writable, >Block writable
[6] ACK, Byte writable, Word writable, Block writable, >Block writable
*repeat for all commands*


Going at this blind would've been pretty terrible. This chip is waiting for the correct passwords and ACKing literally everything until it gets them.
Entering the correct passwords scoured from the internet:


$ smbusb_comm -a 0x16 -c 0xFF -w CDAB -b
$ smbusb_comm -a 0x16 -c 0xCF -w 3412 -b
$ smbusb_scan -w 0x16 -e 10
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.1
Scanning for command writability..
Scan range: 00 - 10
Skipping: None
------------------------------------
[0] ACK, Byte writable, Word writable, Block writable, >Block writable
[1] ACK
[2] ACK
[3] ACK
[4] ACK
[5] ACK
*snip*


It still ACKs every command but it's exposing the documented Boot ROM interface now. Just don't scan it too much, because writing the wrong thing to the wrong command will hang the controller and/or the entire bus, which the SMBusb won't like too much either. (The Boot ROM in this chip has zero error handling.)

Some coding later:


$ smbusb_m37512flasher -w b0 -p b0
------------------------------------
        smbusb_m37512flasher
------------------------------------
SMBusb Firmware Version: 1.0.1
------------------------------------
Erasing flash block starting at 0xe000 ...
Done!
Writing memory 0xe000-0xffff ...
Done!
Verifying 0xe000-0xffff ...
Verified OK!

The tool is now a part of SMBusb.

I haven't done research into modification or resetting for this controller yet. Maybe in the future!

Monday, August 29, 2016

Hacking the R2J240 with LGC firmware


The second battery controller I looked at was the Renesas R2J240-10F020. It's a complete black box with very little information available except for some outtakes from the datasheet on Chinese developer forums. There is very little resemblance to the M37512, an older Mitsubishi/Renesas microcontroller used in earlier battery packs that's fairly well documented.


The first thing you notice is that this chip has the analog frontend integrated (unlike the M37512 or the bq8030) because there's no separate chip for measuring voltages and such. Cells are connected directly to this chip so it's a one-chip solution for building smart batteries.

SBS Report

$ smbusb_sbsreport
SMBusb Firmware Version: 1.0.1
-------------------------------------------------
Manufacturer Name:          LGC
Device Name:                LNV-42T4911
Device Chemistry:           LION
Serial Number:              41291
Manufacture Date:           2010.01.25

Manufacturer Access:        6001
Battery Mode:               e000
*snip*


Probing around



I started out by measuring voltages on all the pins. Just going by logic I was expecting some sort of differentiation on the various sides of the chip.

To summarize my findings after the first pass:
  • 1-12 is the "main microcontroller side": it has the SMBus pins, VCC (and probably RESET and others)
  • 25-36  is connected to current sensing and exposes various built-in voltage regulators
  • 37-48  appears to be mainly unused with a couple of pins at 3.3v, GPIO side?
  • 13-24  has many pins connected directly to "high voltage" from the cells.
I took a 1k resistor connected to ground and started poking the pins with it to find reset. It should be possible to pull Reset low through a 1k resistor, whereas the same resistor is unlikely to drag VCC down, and on an unrelated pin it shouldn't lead to a complete reset. It's also possible to rule out most pins through visual inspection and measurement. So long story short: Pin #12 is Reset.

Next I wanted to see if there's something like a Boot pin that's going to get me a different mode when pulled either low or high during reset so I started up a continuous command scan and started poking at the pins again.

Pulling Pin #4 (also connected to Test Point 1 on the other side of the PCB) low during reset gave me this:

$ smbusb_scan -w 0x16
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.1
Scanning for command writability..
Scan range: 00 - ff
Skipping: None
------------------------------------
*snip*
[f0] ACK, Byte writable
[f1] ACK
[f2] ACK
[f3] ACK
[f4] ACK
[f5] ACK
[f6] ACK
[f7] ACK
[f8] ACK
[f9] ACK
[fa] ACK, Byte writable, Word writable, Block writable
[fb] ACK, Byte writable, Word writable, Block writable
[fc] ACK, Byte writable, Word writable, Block writable, >Block writable
[fd] ACK, Byte writable, Word writable, Block writable, >Block writable
[fe] ACK
[ff] ACK

The chip was ACKing on every command. A deliberate attempt at confusing any would-be attacker perhaps? The write scan however reveals that the chip is actually exposing some real functionality on some of the commands and that a couple of them violate SMBus protocol.

Pin #4 appears to be BOOT (active-low).



Mapping

Mapping out the protocol took a while, especially because it doesn't follow the standard SMBus protocol, but I was eventually able to figure out how to read and write to RAM and erase blocks of memory-mapped flash.
Conveniently, just writing to the appropriate address in RAM (after the flash blocks have been erased) writes the flash memory.
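Why the erase has to come first can be illustrated with a toy model. This is only a sketch under an assumption: that this chip's flash behaves like typical NOR flash, where erasing sets every bit to 1 and programming can only clear bits. The class and method names are made up for illustration and have nothing to do with the actual Boot ROM commands:

```python
ERASED_BYTE = 0xFF  # assumed erased state of a flash cell


class FlashBlockModel:
    """Toy model of one memory-mapped flash block (not the real chip)."""

    def __init__(self, size: int, fill: int = 0x00):
        # 'fill' stands in for whatever old content the block holds
        self.cells = bytearray([fill]) * size

    def erase(self) -> None:
        """Erase the whole block: every cell goes back to 0xFF."""
        for i in range(len(self.cells)):
            self.cells[i] = ERASED_BYTE

    def program(self, offset: int, data: bytes) -> None:
        """Write bytes; programming can only turn 1 bits into 0 bits."""
        for i, value in enumerate(data):
            self.cells[offset + i] &= value


block = FlashBlockModel(4, fill=0x00)
block.program(0, b"\xAB")
print(hex(block.cells[0]))  # old content wins, the write did not stick

block.erase()
block.program(0, b"\xAB")
print(hex(block.cells[0]))  # after an erase the write lands as intended
```

In this model a write to a block that hasn't been erased silently produces the AND of old and new data, which is why a verify pass after writing is worth having.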

There are several partitions of flash mapped into RAM and I'm sure I haven't found all of them. The ones I did find are included as address and length presets in the flasher tool.



$ smbusb_r2j240flasher -d eep2.bin -p df2
------------------------------------
        smbusb_r2j240flasher
------------------------------------
SMBusb Firmware Version: 1.0.1
------------------------------------
Dumping memory 0x3400-0x37ff ...
Done!
$ xxd eep2.bin
0000000: 0000 0000 0000 0000 0000 ffff ffff ffff  ................
0000010: 4c4e 562d 3432 5434 3739 3700 0000 0000  LNV-42T4797.....
*snip*

$ smbusb_r2j240flasher -d eep3.bin -p df3
------------------------------------
        smbusb_r2j240flasher
------------------------------------
SMBusb Firmware Version: 1.0.1
------------------------------------
Dumping memory 0xc000-0xdfff ...
Done!
$ xxd eep3.bin
0000000: 0100 0700 b801 b801 1100 0203 0201 01e3  ................
0000010: e6fe e3ae 7000 e0e4 0cc8 0038 3150 14f0  ....p......81P..
0000020: 1530 2a4c 4743 0031 3100 0000 0000 0000  .0*LGC.11.......
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 0000 004c 4e56 2d34 3254 3439 3131 0000  ...LNV-42T4911..
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0000 004c 494f 4e01 2d01 2d30 07fa 1031  ...LION.-.-0...1
*snip*


In this particular battery pack the static information was stored in df3 and the dynamic in df2, df1 was empty.
Another battery stored dynamic info in df1 so this is going to differ between firmwares/packs.

Just like the bq8030 the static area is protected by a checksum on this controller/firmware as well. I took a shot at it just for kicks and it was pretty simple so I included it in the flasher tool.


$ smbusb_r2j240flasher -w eep3_f.bin -p df3 --fix-lgc-static-checksum --execute
------------------------------------
        smbusb_r2j240flasher
------------------------------------
SMBusb Firmware Version: 1.0.1
------------------------------------
Erasing flash block starting at 0xc000 ...
Done!
Fixing LGC static checksum..
Done!
Writing memory 0xc000-0xdfff ...
Done!
Verifying 0xc000-0xdfff ...
Verified OK!
Exiting Boot ROM and starting firmware.

$ smbusb_sbsreport
SMBusb Firmware Version: 1.0.1
-------------------------------------------------
Manufacturer Name:          LGC
Device Name:                Karosium000
Device Chemistry:           LION
Serial Number:              41291
Manufacture Date:           2010.01.25
*snip*


Reset

Pretty much the same procedure as with the bq8030. Map and modify the dynamic area. Eventually you'll find the error flag. As with the bq8030 the dynamic area isn't checksummed in this controller/firmware either.

Helpful tips:
  • Again, multiple log entries. The number of 0x00 bytes at the beginning of the section determines the number of entries. Patch the duplicated data in all of them.
  • You can decrease the number of log entries to 1 for the time of mapping which will make the job a lot easier.
  • The real cycle count is stored encoded. No idea how. With this particular firmware it was at 0x78-79. Zeroing out the bytes still decreases the cycle count to 5 but the precise algorithm/obfuscation? No clue.
  • Please don't ask me to fix flash dumps :-)
  • Good luck!
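The first tip can be checked quickly on a dump. A minimal sketch, assuming the count of leading 0x00 bytes maps one-to-one to the number of log entries (the tip implies this but doesn't spell out the exact relationship). The helper name is hypothetical; the sample bytes are the first row of the eep2.bin dump shown earlier:

```python
def count_leading_zero_bytes(section: bytes) -> int:
    """Count how many 0x00 bytes the section starts with."""
    count = 0
    for value in section:
        if value != 0x00:
            break
        count += 1
    return count


# First 16 bytes of the dynamic area from the eep2.bin hex dump
first_row = bytes.fromhex("0000 0000 0000 0000 0000 ffff ffff ffff")
print(count_leading_zero_bytes(first_row))
```

To run it against a real dump, read the file with open("eep2.bin", "rb").read() and pass the result in.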
Notes

There's no disassembly or disassembler for this chip. I don't really know what architecture it's based on. If I had to guess I'd say an extended version of the MELPS 7700, an old Mitsubishi architecture that Renesas inherited, because loading it up in IDA with that core seems to produce something that starts to make sense but then fails on invalid instructions. I could be completely wrong though.

If anyone wants to tackle this they could probably find a nice, easy way of getting into the Boot ROM using just SMBus commands.

Sunday, August 28, 2016

Hacking the bq8030 with SANYO firmware

As mentioned in the previous article the bq8030 is the blank version of the bq20z90. If you bought some from Aliexpress they'd come up with the TI Boot ROM and you could use the flashing tool included in SMBusb to upload firmware and eeprom(data flash) to it.
Theoretically you could turn it into a bq20z90 by downloading the firmware from one and uploading that. (The procedure for accessing the Boot ROM on those chips is documented in datasheets and application notes.)



So how would you even start with a BQ8030 running proprietary firmware?

Google. Lots of Google.
Apparently they sell this tool for them:




Now with a SPECIAL! price of ONLY 3 THOUSAND US DOLLARS!! WHAT AN AMAZING DEAL!!!

I gathered everything I could find about this device and while it wasn't much it did provide clues that came in handy later on in the process. Especially this screenshot of the software that comes with it:




There was no way I could figure everything out based on just that but I did take notice of the function bar on the bottom.

Those could very well be SMBus commands right there.. would they have done that? Surely not.
Not really expecting much I tried a word write of 0x0214 to command 0x71 aand.. nothing obvious happened. So I moved on to poking at other things but eventually came back for a second look and that's when I realized:

Command scan starting at 0x70 before sending command

$ smbusb_scan -w 0x16 -b 0x70
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.0
Scanning for command writability..
Scan range: 70 - ff
Skipping: None
------------------------------------
[71] ACK, Byte writable, Word writable
[72] ACK



And after


$ smbusb_comm -a 16 -c 71 -w 0x0214 
$ smbusb_scan -w 0x16 -b 0x70
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.0
Scanning for command writability..
Scan range: 70 - ff
Skipping: None
------------------------------------
[71] ACK, Byte writable, Word writable
[72] ACK
[73] ACK


So this actually unlocks an extra command which disappears again when an SBS command is issued (or when doing a full command scan starting from 0.)
The command however is not writable. Reading it returns:

$ smbusb_comm -a 16 -c 73 -r 2
023d

Interesting but insufficient.

Brick wall meet impatience

I couldn't really get any further with just that information so I started looking at the hardware instead. Having found slides from a TI presentation revealing the connection between the BQ8030 and bq20z90 I opened up the datasheet for the latter (since there's no public datasheet for the former).


Ok, nothing straightforward. No obvious BOOT pin as one would expect with a device that's not meant to be tampered with. But maybe pulling some pin high or low during reset will get me somewhere.

After the first pass no, not really. So maybe we have to set multiple pins into multiple states for it to work. Or maybe there's no such combination at all.
How about I try to abuse N/C pins instead. I have no logical explanation as to why I came to this decision. Maybe I saw a presentation somewhere about blackbox chips and N/C pins years and years and years ago but I could just be imagining things. Either way, about 5 minutes of poking at PIN #28 with a resistor connected to 3.3v in hand and triggering RESET at random intervals while running a continuous command scan:

$ smbusb_scan -w 0x16
------------------------------------
             smbusb_scan
------------------------------------
SMBusb Firmware Version: 1.0.1
Scanning for command writability..
Scan range: 00 - ff
Skipping: None
------------------------------------
[0] ACK, Byte writable, Word writable, Block writable
[1] ACK
[2] ACK
[3] ACK
[4] ACK, Byte writable, Word writable, Block writable
[5] ACK, Byte writable, Word writable, Block writable
[6] ACK, Byte writable, Word writable
[7] ACK, Byte writable, Word writable
[8] ACK
[9] ACK, Byte writable, Word writable
[a] ACK, Byte writable, Word writable


Wow, that worked?
Umm.. ok.. let's just reset for now..

$ smbusb_sbsreport
SMBusb Firmware Version: 1.0.1
-------------------------------------------------
Manufacturer Name:          ERROR
Device Name:                ERROR
Device Chemistry:           ERROR
Serial Number:              4294967287
Manufacture Date:           1980.00.00


Uh-oh.. Well that's not good!
It seems we're stuck in the Boot ROM. Is the chip fried? It's at this point that I coded up the flash tool to try and read the flash contents. (I wasn't really bothered by the chip dying as this was one of 2 sacrificial controller boards I kept just for messing around with.)
And the results? Apparently we can corrupt (ideally just) the first couple of blocks of flash if we bully PIN #28 while the chip is trying to start up. The good news though? (If we're lucky) We get 99% of the firmware, and thanks to Charlie Miller we have a disassembler(zip) for it.

Did messing with Pin #28 even have an effect? Could it just have been the erratic resetting of the chip that triggered the malfunction? Did I short VCELL+ to Pin28 while messing about? Was there high voltage on VCELL+? Was it just ESD?
No idea. But I did manage to reproduce the result on another chip using the same procedure. So when in doubt and you have nothing to lose, act like a caveman, I guess?
The only good thing about this method is that even if you have 0 knowledge about whether there even IS a method for entering the Boot ROM in the firmware let alone what it is there's still a high chance that you'll get in. How much of the firmware survives is another question.

Disassembly

A couple of hours of staring at unfamiliar assembly code later, here are the relevant parts for entering the Boot ROM with annotations:

cmd_handle_71:
    ..      
    calls       smb_ACK
    ..
    calls       smbSlaveRecvWord
    move        a, (i3,0x1A)
    or          a, (i3,0x1B)
    jeq         check_71_pass
    move        r2, (i3,0x1B)
    add         r2, (i3,0x19) ; smb_word_LSB
    move        r3, (i3,0x1A)
    addc        r3, (i3,0x18) ; smb_word_MSB
    or          a, r3, r2
    jeq         accesslevel_oreq_40
    move        a, #0
    move        (i3,0x1A), a
    move        (i3,0x1B), a
   
check_71_pass:
    ..
    move        i1l, (i3,0x19) ; smb_word_LSB
    move        i1h, (i3,0x18) ; smb_word_MSB
    cmp         i1h, #2
    jne         wrong_pass
    cmp         i1l, #0x14  ; is 71 0214?
    jne         wrong_pass
    ..
    jeq         accesslevel_oreq_80


This is the first password check. Seem familiar? It's the one we saw in the screenshot above: 0x0214 written to 0x71. It sets an access flag that gets checked later on. Basically if (smbSlaveRecvWord(0x71) == 0x0214) { access_level |= 0x80 };
But wait.. it can set two different access flags depending on whatever (i3,0x1A) and (i3,0x1B) are. Hrmm.. Well, I don't know what those are and can't find where they're set, so let's assume the first jeq will not jump once we've given the correct first password, because that would make sense. We can also see that it checks the word we send against those mystery bytes somehow, and if it likes what it sees it sets access flag 0x40 and the mystery bytes to 0.
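The listing for command 0x71 can be restated as a small model. This is a sketch of my reading of the disassembly, not the firmware's actual logic: it assumes 8-bit registers (so add followed by addc is a 16-bit add with wraparound), it treats (i3,0x1A):(i3,0x1B) as one 16-bit "mystery" value, and it ignores whatever the snipped parts and wrong_pass do. All names are made up for illustration:

```python
ACCESS_SECOND_OK = 0x40  # flag the 0x70 handler checks before continuing
ACCESS_FIRST_OK = 0x80   # flag set by the first password (0x0214)


def handle_cmd_71(word: int, mystery: int, access_level: int):
    """Model one word write to command 0x71.

    word         -- 16-bit value sent by the host
    mystery      -- 16-bit value held in (i3,0x1A) (high) and (i3,0x1B) (low)
    access_level -- current access flags
    Returns the new (mystery, access_level) pair.
    """
    if mystery != 0:
        # add + addc over the two halves, then "or" + jeq: the branch is
        # taken when the 16-bit sum wraps around to exactly zero.
        if (mystery + word) & 0xFFFF == 0:
            # Per the text: flag 0x40 is set and the mystery bytes cleared.
            return 0, access_level | ACCESS_SECOND_OK
        mystery = 0  # mismatch: clear, then fall through to the first check
    if word == 0x0214:
        access_level |= ACCESS_FIRST_OK
    return mystery, access_level


# First password with no mystery value armed yet
print(handle_cmd_71(0x0214, 0x0000, 0x00))
# A word that sums with the mystery value to 0x10000
print(handle_cmd_71(0xFDC3, 0x023D, 0x80))
```

The model leaves open how the mystery bytes get their value in the first place, since that part of the firmware isn't identified in the listing.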

A little bit further up we find the entry point for the Boot ROM:

cmd_handle_70:
    *snip*
    move        r3, access_level
    and         r3, #0x40
    cmp         r3, #0        ; don't even bother if access
    jeq         cmd_handle_71 ; flag 0x40 is missing          
    *snip*   
    calls       smbSlaveRecvWord
    move        r2, (i3,0x19) ; smb_word_LSB
    move        r3, (i3,0x18) ; smb_word_MSB
    cmp         r3, #5
    jne         wrong_pass
    cmp         r2, #0x17      ; is 70 0517?
    jne         wrong_pass
    *snip* (prepare leaving the firmware safely)
    calls       bootrom_execute

So now we know pretty much what we need to do.

1. Send 0x0214 to 0x71
2. ???
3. Send 0x0517 to 0x70
4. Profit

We've also made the educated guess that Step 2 is really "Send 0x???? to 0x71", so we're pretty much done with the disassembly: 16 bits is well within the realm of bruteforceability, and since I had another sacrificial board as well as a battery pack running SANYO firmware I had everything I needed to attempt it.
As it turns out there's another mandatory step between 1 and 2, and it was sheer luck that I left it in my brute force loop: 0x73, the command unlocked by sending the first password, needs to be read before entering the second password. Which is...*drumroll*

0xFDC3

After realizing that the first unlocked command is important (why else would they have made it mandatory?), it's not that surprising that adding the number it returns (0x023d) to the bruteforced value gives a nice round result: 0x10000. That's probably what the adding in the assembly and the mystery numbers are all about.

So to sum it all up:

1. Send 0x0214 to 0x71
2. Read Word X from 0x73
3. Send (0x10000 - X) to 0x71
4. Send 0x0517 to 0x70

Actually, sending the correct word in Step 3 unlocks several extra commands, not just 0x70 for the Boot ROM entry, but they all disappear as soon as you send an unrelated command, much the same way 0x73 does after the first password.

We don't really care about those though because we already have what we wanted:

$ smbusb_comm -a 16 -c 71 -w 0214
$ smbusb_comm -a 16 -c 73 -r 2

023d
$ smbusb_comm -a 16 -c 71 -w fdc3
$ smbusb_comm -a 16 -c 70 -w 0517

$ smbusb_bq8030flasher -p prg.bin -e eep.bin
------------------------------------
        smbusb_bq8030flasher
------------------------------------
SMBusb Firmware Version: 1.0.1
PEC is ENABLED
TI Boot ROM version 3.1
------------------------------------
Reading program flash
.............................................................
.............................................................
*snip*
.................................................
Done! 
Reading eeprom(data) flash
...................................................
Done!

$ xxd eep.bin
0000000: ffff 0031 076c 00c8 ffff 11f8 19e0 0355  ...1.l.........U
0000010: 0853 414e 594f 0030 3820 20ff ffff 0407  .SANYO.08  .....
0000020: 0b49 424d 2d34 3254 3532 3531 2020 2020  .IBM-42T5251    
0000030: 044c 494f 4e20 ffff ffff ffff ffff ffff  .LION ..........

*snip*


Huzzah!


Reset


To actually remove the permanent failure flag we need to look at the eeprom area.
The file is 2048 bytes and it has two sections.

The first 1024 bytes contain the static data (the beginning of which you can see in the hex dump above). This is all the data set by the manufacturer that never changes during the lifetime of the battery: design capacity/voltage, serial and model numbers, default settings, etc.
This part is protected by a checksum somewhere which you'll need to find and fix if you want to modify anything in there.

The second part contains the dynamic data. Basically the "log" of the battery with current remaining capacity and similar things that get updated as the battery is cycled.  Also, the failure flag.
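Splitting the dump into its two halves makes the mapping work easier to keep straight. A minimal sketch, assuming the layout described above (2048 bytes total, static data in the first 1024, dynamic data in the rest). The constant and function names are made up for illustration:

```python
EEPROM_SIZE = 2048   # total size of the dumped data flash file
STATIC_SIZE = 1024   # first half: checksummed manufacturer data


def split_eeprom(image: bytes):
    """Return (static, dynamic) halves of a 2048-byte eeprom dump."""
    if len(image) != EEPROM_SIZE:
        raise ValueError(f"expected {EEPROM_SIZE} bytes, got {len(image)}")
    return image[:STATIC_SIZE], image[STATIC_SIZE:]


# Stand-in image; for a real dump use open("eep.bin", "rb").read()
image = bytes(STATIC_SIZE) + b"\xff" * (EEPROM_SIZE - STATIC_SIZE)
static, dynamic = split_eeprom(image)
print(len(static), len(dynamic))
```

Only the dynamic half should be edited freely. Any change to the static half also needs the checksum fixed.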

You pretty much just need to start mapping out the values and then zeroing or FF-ing out the ones that you can't map to anything to see if that fixes it or breaks something else. There's no checksum on the dynamic area so you are free to modify this section all you want. Repeat until desired outcome is reached. That's what I did.
Some helpful tips:
  • On my specific battery the log starts at 0x500 and has several entries that all need to be modified (mostly duplicate data)
  • Battery capacity is stored as the remaining capacity reported through SBS divided by 2.
  • Cycle count is stored as CycleCount-1 (e.g. SBS value: 223, Eeprom byte: 222)
  • Remaining Capacity Alarm is stored as-is. A good place to start mapping.
  • It's a good idea to reset the cycle counter. I don't want to start conspiracy theories but... at least with this specific model there's been a lot that died inexplicably around the 200 cycle mark. Coincidence? Probably, but it can't hurt.
  • Please don't ask me to fix eeprom dumps :-) 
  • Good luck!
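The two encodings from the tips above can be written down as small helpers for use while mapping. This is a sketch for this specific SANYO firmware only. The function names are made up, and the integer rounding of the capacity (for odd values) is an assumption, since the tips only say "divided by 2":

```python
def sbs_to_stored_cycle_count(sbs_value: int) -> int:
    """Cycle count is stored as the SBS CycleCount minus one."""
    return sbs_value - 1


def stored_to_sbs_cycle_count(stored: int) -> int:
    return stored + 1


def sbs_to_stored_capacity(sbs_remaining_capacity: int) -> int:
    """Capacity is stored as the SBS remaining capacity divided by 2.

    Rounding down for odd values is an assumption.
    """
    return sbs_remaining_capacity // 2


def stored_to_sbs_capacity(stored: int) -> int:
    return stored * 2


print(sbs_to_stored_cycle_count(223))  # value to look for in the dump
print(sbs_to_stored_capacity(4000))    # hypothetical SBS reading
```

Converting the current SBS readings this way gives concrete values to search for in the dynamic area.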
And the result:


It took the estimate a charge cycle to normalize. This particular battery lasts 2 hours on constant high CPU load after external recharging and clearing of the fail flag. Not bad for a 10 year old battery in a 12 year old machine and since the other choice was THROW IT AWAY AND BUY A NEW ONE  I consider this a win :)