So, I just botched a firmware update on one of my PowerLine couplers. The model I had trouble with is a TP-Link TL-PA4020P. However, this quick guide should work with pretty much any Atheros-based PLC device. It was confirmed working on at least a TL-WPA4220 as well.
Steps to Reproduce
- Download manufacturer’s setup tool
- Download firmware files (“nvm” and “pib” files)
- Refrain from directly connecting the PLC to the computer, instead leave it hooked up to a switch and other hardware.
- Start firmware update using the aforementioned setup tool
- Have a bit of bad luck, or a power outage during the update.
Result
- “Firmware Upgrade failed” error message.
- Next up, “Local device not connected” error message.
- After unplugging and re-plugging the PLC, no lights light up.
- Faint hissing from the device, in normal operation it is silent.
Diagnosis
OK, keep calm. This is a modern piece of hardware, surely it wouldn’t need to be disassembled to flash a firmware. Right? Or so my hopes went as I started panicking. Looks like I’m not getting the manufacturer tool to retry the update on the (hopefully just) soft-bricked device. That piece of software only tells me that it can’t find the local device. A quick web search (“TP-Link PowerLine failed firmware fix” and similar) didn’t come up with anything good right away.
Well then, I thought, let’s see if it gives off any signs of life. I directly connected the PLC to my trusty MacBook and fired up WireShark. When the first packets started appearing I breathed a sigh of relief. The PLC still manages to get an ethernet link up. Amidst the stuff the Mac fires off when detecting a link (DHCP, MDNS, etc.) I finally found what I was looking for: Broadcast packets, “HomePlug AV” protocol, “Atheros_something” MAC, “Action Required Notification (Bootloader)”. Awesome! This thing is even politely asking me to remote-boot it. Let’s figure out how.
Armed with the right keywords to feed to my preferred search engine, I finally found “open-plc-utils“. BSD-Licensed tools to set up Atheros-based PLC equipment. And, not really surprisingly, that includes a “plcboot” tool, which does just that – feed the PLC a firmware such that it can proceed to boot.
” aka “The Fix
- Connect the PLC’s network port directly to your computer’s.
- Rename the .nvm and .pib files from the manufacturer firmware package to nvm and pib (The atheros utilities are picky when it comes to file names, something I only found out after head-scratchingly reading the code.)
- Download and compile the tools, and boot the PLC:
git clone https://github.com/qca/open-plc-utils.git cd open-plc-utils make cd .. open-plc-utils/plc/plcboot -i eth0 -N nvm -P pib # replace eth0 if you use another ethernet
- Since the pib file shipped with the firmware only contains a default mac address, the adapter will have been set to that, too. I presume the firmware utility reads it beforehand and updates the pib before uploading it. Anyway, to get the correct mac address back onto your PLC, have a look at the label on your plc, and then:
open-plc-utils/pib/modpib -M <your mac address> pib open-plc-utils/plc/plctool -P pib -R local
- Finally, flash the firmware again, using the manufacturer tool. plcboot only performs a one-time boot when given the options above. To make the firmware permanent again, the flash needs to be rewritten. (Allegedly, plcboot can do that too, but it needs a “softloader” file, which I couldn’t be bothered to extract from the TP-Link software.)
Hints and issues
- The plcboot and plctool utility may need root permissions, since they try to write raw packets to the ethernet interface. You may need to put a sudo in front.
- It may not work running the tools inside a VM, depending on your setup. This may be due to the raw packets not being properly forwarded to your physical interface. At any rate, you would need to put the VMs network interface into bridge mode.
- That said, I recommend doing all of these steps on bare hardware (it should be possible using a Linux live CD/stick).
- Depending on how broken the firmware on your device is, it may still try to boot. Therefore you may only have a short time window in which to run plcboot. Try plugging in the device to the power socket, and then as soon as possible press enter on the command line.
- If in doubt, have a look at the tool documentation, which you can view with
man open-plc-utils/plc/plcboot.1
- There is a “PLC Recovery.exe” tool for Windows floating around the internet, which seems to do the same things as my tutorial. However, since its origins are not verifiable to me, and it doesn’t seem to come from TP-Link or Qualcomm, I won’t link to it. You’re free to search the web for it yourself, although ultimately it seems the method presented here is more successful than the exe.
Conclusions
I love the fact that chip manufacturers are building in sensible bootloaders, and that there is open source software available to access these. This is for example also the case with the Atmel ARM processor families of Arduino Due fame. I, for one, welcome this trend, making it increasingly hard to turn your hardware into a paperweight. On the downside, OEMs like TP-Link try to hide these as best as they can: the manual just says to return the device to the distributor for service when experiencing the symptoms I’ve encountered.
Bottom line: When a firmware update goes bad, don’t panic. It’s just a matter of finding the right tools. Also, it helps a lot having a general grasp of how things work on the inside to actually know what may or may not be possible.
Revisions
Nov. 2017: Added a “cd ..” to the command example and changed the path to plcboot to make the example more consistent. The open-plc-utils folder also has an “nvm” an “plc” subfolder, which may have been confusing.
Dec. 2017: Added bullet point on resetting the MAC address, thanks to reader Manfred Lischka for notifying me of the fact the MAC gets changed. Since I only had one broken adapter, I didn’t care too much. But he had two, which after recovery didn’t work since they had the same MAC.
Jan. 2018: I actually got the MAC reset commands wrong, since I didn’t do it myself. Should be better now.
June 2018: Added -i flag to plcboot command line, since the default of eth1 probably won’t work for most people. Added “Hints and issues” section. Thanks to Paolo Fiore for sharing his story, as well as Michael Stornes. Their experiences have inspired some of these changes.