Learning to Decapsulate Integrated Circuits using Acid Deposition


I’ve been looking to try my hand at IC decapsulation for years, and finally got the time to do it. The process took plenty of trial and error, so this post will document most of my failures and successes, and detail the methodologies used for each attempt. These are most of the ICs I worked on throughout the process:

Most ICs I experimented on

A typical chip is built as a silicon die, connected to its leads/contacts through bonding wires, and encapsulated in resin for protection.

Of course, there are other ICs that use different designs and encapsulation materials: mostly metal and plastics. But the epoxy-based design is extremely common, so we’ll be focusing on it.

This picture of a DIP package -courtesy of Wikipedia- explains it very well:

Generic IC diagram

The decapsulation/decapping of Integrated Circuits, also known as “delidding”, is nothing new.

It’s used in the industry to debug hardware issues, reverse engineer chips, verify the authenticity of parts, and other tasks that require access to the underlying circuitry.

That’s why there are plenty of commercial services that will decapsulate your ICs using expensive, dedicated equipment. I’ve linked a couple of them in the bibliography. But, without having any idea how much they cost or how long they take, I can’t imagine them being an option for the average hacker.

Hackers and smaller companies generally decap Integrated Circuits to identify counterfeits, gain a very rudimentary understanding of the parts comprising them, or just to share the pretty pictures of the silicon die. For those cases, a DIY process is generally good enough.

Decapping/delidding has also been used by hackers for more fun purposes, such as unsetting efuses from production hardware to extract and/or override their firmware, private keys, etc. So there are cases where a commercial-level result can be worth it.

I’ve been wanting to try my hand at decapping ICs for years, for no other purpose than to satiate my curiosity. I’ve finally had the time to get to it, so this post will describe the methods I tried and the hurdles I encountered.

Existing methodologies

The biggest factor to decide which method is best for your project is whether or not you need the chip to still work after it’s been decapsulated. That means not destroying or disconnecting the die, bonding wires, external contact points, etc. so you can still use the chip after the process is complete.

Here’s a list of the most common options.

Destructive methods:

Non-destructive methods:

In this post we’re gonna focus on the manual acid deposition method, to achieve non-destructive decapsulation at a reasonable cost.

Personal Safety

First of all, let me preface this safety talk with an important disclaimer: I HAVE NO IDEA WHAT I’M DOING. My thing is firmware and electronics, not chemistry.

PLEASE, do not assume the safety measures discussed here are valid or enough to protect yourself. Do your own research, follow any and all measures you deem appropriate, and remain paranoid all along the process.

We’re dealing with very dangerous chemicals. If you decide to replicate the experiments it’s at your own peril.

Here are more authoritative sources of safety information for a project like this. Review as much of this info as you can, and take it with the seriousness it requires:

After doing enough research to feel comfortable with the risks involved, I settled for following these measures:

  • Run all experiments outside, with all nearby windows closed, and never accessing the area without using PPE. I might invest in a fume hood in the future, either commercial or DIY
  • Wear chemical splash goggles. They should protect your eyes from droplets coming from any direction. If they become uncomfortable or fog up, do not remove them or pull them off your face in the working area
  • Wear a respirator mask with filters that are appropriate for chemical fumes. Preferably a full face mask, to avoid acid splashes
  • Wear gloves that are appropriate for the acids you’re dealing with. Nitrile gloves should NOT be used to work with nitric acid; especially fuming (98%+) nitric acid. Long, thick neoprene-based gloves are best for Nitric, but make delicate tasks difficult. I settled for wearing a thick neoprene glove on my non-dominant hand, and a vinyl glove on my dominant hand for the more delicate work. When touching any surface that’s hot or has been in touch with acid, I use the neoprene glove
  • Expose as little of your skin as possible: Wear shoes, trousers (not shorts), long sleeves… Preferably use a lab coat, so you can remove the acid-splashed clothing without dragging it over your face
  • Never mix chemicals without fully understanding the outcome to expect. Keep different chemicals as far apart from each other as possible. Keep the smallest possible amount of dangerous chemicals in the working area
  • Be prepared for the worst:
    • Keep enough sodium bicarbonate at hand to neutralize acid spills and leftover acid. Keep in mind that neutralizing acid with bicarb will give off heat, and the bubbling could be dangerously vigorous for significant amounts. Expect the possibility of a spill during the neutralization process
    • Keep enough water at hand to dilute chemicals in case of spills, splashes, etc.
    • Understand the recommended procedures in case any given chemical contacts your skin, eyes, etc. Eyes are generally the most sensitive to chemical splashes, as they can be permanently damaged in seconds; they’re also the most difficult to clean up, so be particularly careful with them and have a plan of action in case the worst happens

Picture of my PPE

My own experiments

These describe failures and successes, and what I learned along the way. Keep in mind that most resources I’ve found recommend using fuming nitric acid (86%+).

I was not able to source fuming nitric acid, so I used concentrated nitric (69%) instead. That could account for some of the problems I’m about to describe, but it worked fine once I found the most fitting methodology.

Tools and materials

Chemicals I used during my experiments:

  1. Concentrated (69%) Nitric Acid
  2. 100% Sodium bicarbonate - Bought on Amazon
  3. 98% methanol - From a hardware store
  4. Acetone - From a hardware store
  5. Water - Either tap water or regular distilled water from a grocery store

I also ran a couple of tests using Sulfuric Acid, both standalone and mixed with the Nitric, but the results were not very promising. Probably because of the encapsulation material used in my ICs. These are the acids I used:

Both bottles of acid used during my experiments

Necessary equipment:

  1. Dremel
  2. Hot plate
  3. High-temperature and acid-safe container
    • Tried using ceramic containers and they worked well enough, but it was hard to maintain a stable temperature outside using my hot plate
    • Graphite ingot molds ended up being better at conducting heat (hence maintaining a reasonably stable temperature outside) and providing easy access to the IC inside
  4. Plastic tweezers
  5. A syringe, or preferably an assortment of them

Other very useful equipment:

  1. An Erlenmeyer flask to keep a small amount of acid in a stable container
  2. Pipettes:
    • 10ml pipette to transfer acid from its primary container to the flask
    • 1ml pipette to drop acid on the ICs
  3. Beakers:
    • A small one for acetone
    • A large one for water, to rinse tools
  4. Thermocouple to monitor the temperature of the hot plate and IC container
  5. Tongs to move the hot ceramic/graphite container
  6. Ultrasonic cleaner. Explained later

This picture shows most of the equipment used for the most successful method:

All the equipment used for the most successful method

Attempt 1: Sand down the epoxy packaging

This is how the project started. I just got a new microscope, met with a couple of good friends, and we started looking at some random samples. Blood, dust, etc.

Then we decided to take a look at some random IC. We were not looking to see anything useful or complete; just an overall image of the silicon die in an IC, so I sanded down a microcontroller and we took a look. The result, as expected, was absolute garbage:

Macro: Sanded down IC

Micro: Sanded down silicon die

Well, that went exactly as terribly as expected… Time to go down the rabbit hole.

Attempt 2: Dremel + 69% Nitric Acid + Gentle acetone bath

Steps:

  1. Drill a pocket on the top of the epoxy package so the acid does not spill over to the leads.
  2. Place the IC in a ceramic or graphite container, on top of the hot plate. Attach a thermocouple to the container to monitor its temperature
  3. When the temperature is appropriate (around 100 degrees C), apply one or two drops of acid to the epoxy pocket we just drilled. Wait until the acid is gone, then apply more
  4. Every once in a while, grab the IC with plastic tweezers, dip it in acetone and move it around to remove the reacted epoxy

Results: Terrible

A simple acetone bath and some stirring are not enough to remove any significant amount of reacted epoxy packaging. After a while of carefully following this process, I ended up losing my patience. That eventually resulted in applying too much acid, spilling it over the IC’s leads, and melting them off. Then all bets were off, so I just kept applying acid until the IC was embarrassingly destroyed:

Macro: Utterly fucked up IC

Micro: Utterly fucked up IC

Reacted epoxy had a similar consistency to wet charcoal, so I could easily remove it with the tip of my tweezers. In a second test, that worked fine for a bit; until I encountered the die and ripped all the bonding wires right off it.

For a third attempt, I exercised patience and spent hours and hours applying acid to decap a simpler IC. It worked well enough:

Micro: 74LS48 BCD to 7segment decoder - Minimal damage

That being said, spending an entire afternoon in PPE and constant tedious work for only a chance to get an undamaged IC is unacceptable to me.

I tried magnetically stirring the acetone bath in an attempt to accelerate the process while remaining reasonably gentle, but it made no significant difference.

I need a better method to remove the reacted epoxy.

Attempt 3: Dremel + 69% Nitric Acid + Acetone Syringe

After the previous failure, it seemed obvious I needed a method to remove more of the reacted epoxy without using a hard tool to manually extract it. I decided to push a stream of acetone aimed directly at the target area.

To avoid splashing acetone all over myself, I first submerged the IC in the acetone bath, and pushed the stream underneath the surface:

Acetone syringe aimed at IC in the acetone bath

The process was significantly more effective, removing epoxy more precisely and at a much quicker rate. It still took a long time, but the syringe was a bit too effective in a sense…

The high-pressure stream pulled too much reacted epoxy off the IC, creating a wider, deeper hole much more rapidly. Soon enough, I was able to reach the top of the bond wires, hence locating the die. But the bond wires are akin to steel rods in reinforced concrete: they improve the robustness of the epoxy area atop the silicon die.

When using the syringe method to extract the last area of epoxy over the die, it’s easy to expose the internal parts of the leads surrounding it first. Once that happens and we apply more nitric acid, it will dissolve the leads faster than it weakens the epoxy atop the die. The result is an unusable IC, because the bond wires are no longer connected to the leads:

IC missing inner segments of the leads

You’ll notice this happening before you actually dissolve the leads: the reaction looks visibly different, gives off more fumes, and the acid quickly turns greenish.

Green acid due to lead dissolution

The workaround for this problem is rather simple: We need to minimize the size of the reacted area. We can easily achieve that by drilling a smaller and deeper pocket with the Dremel, directly over the silicon die.

  1. Locate the position and depth of the silicon die
    • We could achieve this non-destructively with an X-Ray machine, but having a second identical IC to destroy with a Dremel and/or acid will work well enough
  2. Mark the position of the silicon die on the top of the target IC and dig as deep as you can with the Dremel without damaging the bond wires or the die
  3. Follow the same acid+acetone syringe procedure explained before

Drilling as deep as possible without damaging the IC is tricky. It’s easy to overshoot and end up with a damaged die:

Damaged die from drilling too deep

But once you get that part right, the results are quite decent:

Damaged bonds in IC through precise drilling+nitric+acetone syringe

Still, as you can see, most of the wire bonds are detached from the die. That’s almost definitely caused by the excessive acetone pressure exerted through the syringe when removing the last layer of epoxy off the die.

As you can see in the previous picture, there’s a lot of residue left over the silicon die. Trying to clear as much of the die as possible is what led to the excessive acetone pressure that ripped the bonds off the die.

For complex ICs, where using increased magnification is necessary to discern more details, that residue will obstruct the view of the silicon die way too much. See:

Dry residue over the silicon die under higher magnification

I’ve seen people get rid of such residue with their fingernails when using destructive decapsulation methods. Fingernails are supposedly hard enough to remove residue and soft enough not to damage the die. But that would obviously destroy the bond wires.

We need to figure out a better, gentler way to remove the last layer of epoxy and clean up residue off the silicon die without damaging the bonds.

Attempt 4: Dremel + 69% Nitric Acid + Acetone Syringe epoxy removal + Ultrasonic Methanol Cleanup

I’ve seen sources suggesting a pure methanol bath in an ultrasonic cleaning device to clean up the silicon die after the acid etching procedure. Let’s give it a try…

I bought a cheap ultrasound cleaning device off Amazon and tried using it to clean up the die in a methanol bath. It was able to remove a small part of the residue, and microscopic imaging seemed pretty successful immediately after bathing it for multiple minutes in separate attempts:

post methanol cleanup - wet residue

However, once the residue dries up again, most of the ingrained residue is still there:

post methanol cleanup - dry residue

This could be due to the cheap equipment I used, or the specific IC packaging and acids I used. Replacing acetone with methanol for the syringe procedure was also useless, so methanol is not gonna work for me.

Let’s try an acetone ultrasonic bath instead…

Attempt 5: Dremel + 69% Nitric Acid + Acetone Syringe + Ultrasonic Acetone Cleanup

First, once again, use the Dremel to drill a rather precise pocket over the silicon die.

Dremel pocket over the silicon die

Use the Nitric deposition + Acetone syringe extraction method to dig through epoxy until the tops of the bond wires are exposed.

Exposed top of wire bonds

Once the bond wires are found and we’re about to reach the die, it’s time to stop using the acetone syringe method to extract reacted epoxy. Instead, submerge the IC in the ultrasonic acetone bath.

IC in ultrasonic acetone bath

Repeat the acid deposition + acetone ultrasonic bath until the entire silicon die is exposed. If the ultrasonic bath is unable to get any particular chunk of epoxy, use a thin syringe to apply light pressure over that area.

Micro: 555, Fully exposed silicon die, intact wire bonds

SUCCESS!

After all the previous attempts, this methodology was finally enough to non-destructively expose the entire silicon die. It worked on the first attempt, and the surface of the die was pristine right away. No messy residue obscuring the view under a microscope.

Let’s try with a more complex IC: a PIC16f84A.

Macro: PIC16f84 new

Drill a pocket for the acid:

Macro: PIC16f84 pocketed

Apply acid on the hot plate:

Macro: PIC16f84 during acid deposition

Extract reacted epoxy with the syringes and ultrasonic acetone bath, as explained earlier.

And, et voilà!

Micro: PIC16f84 after ultrasonic acetone method

That’s the best result yet!

Here are some more details from the same IC:

Micro: PIC16f84 detail pic 1

Micro: PIC16f84 detail pic 2

Micro: PIC16f84 detail pic 3

Taking great pictures under the microscope is not easy without expensive, specialized cameras. Getting the focus and lighting right throughout the sample is tricky, which makes it hard to get decent results from image stitching software. More info on imaging later.

Still, for the sake of gaining some more detail in the overall picture, here’s a composite image created from higher magnification pictures:

Micro: Composite image of PIC16f84

Attempt 6: Dremel + 69% Nitric Acid + Room temp Nitric Acid + Acetone Syringe + Ultrasonic Acetone Cleanup

I’ve seen different sources recommend a room temp nitric acid bath after the etching process. IIRC, the goal is to improve the uniformity of the etched area.

I tested it, and it didn’t make much of a difference. Perhaps it’s not that useful for manual deposition? I’m not sure, but given the lack of discernible differences, there’s no point in documenting it any further.

Attempt 7: Dremel + 98% Sulfuric Acid + 69% Nitric Acid + Ultrasonic Acetone Bath

I’ve seen a mix of sulfuric and nitric acid recommended in some literature. I did try it a couple of times, and it made no significant difference. It might help with different encapsulation or bond wire materials, but it was pretty much pointless in my tests.

I’d rather not deal with mixing dangerous acids, or the extra fuming it entails, so I gave up trying and would not recommend it unless pure Nitric is not doing the job.

Still, here’s one of the tests I ran. The methodology is identical to the previous attempt, only I mixed about 5ml of 98% Sulfuric Acid and 6.6ml of 69% Nitric Acid. That results in an approximate 50/50 mix of the active chemicals, accounting for the difference in purity.
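The arithmetic behind that "approximately 50/50" claim can be sketched as below. This is only the naive estimate the numbers above seem to imply: volume multiplied by nominal concentration. It ignores that concentrations are usually quoted by mass and that the two acids have different densities, so treat it as a rough sanity check rather than a recipe. The function name is made up for illustration.

```python
def active_volume_ml(volume_ml: float, concentration: float) -> float:
    """Naive estimate: volume of solution times its nominal concentration (0..1)."""
    return volume_ml * concentration


# Volumes and concentrations quoted in the post
sulfuric = active_volume_ml(5.0, 0.98)  # 5 ml of 98% sulfuric acid
nitric = active_volume_ml(6.6, 0.69)    # 6.6 ml of 69% nitric acid

# Share of the "active" total contributed by the sulfuric acid
sulfuric_share = sulfuric / (sulfuric + nitric)

print(f"sulfuric: {sulfuric:.2f}, nitric: {nitric:.2f}, sulfuric share: {sulfuric_share:.1%}")
```

By this estimate the split lands a couple of points away from an even 50/50, which matches the "approximate" wording.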

One of the tests I ran was on ADXL345 accelerometers, since I wanted to take a good look at a MicroElectroMechanical System (MEMS) IC. The IC itself is so small, I did not remove it from its development PCB so I could move it around more easily.

Macro: ADXL345 accelerometer

MEMS devices use micro-scale moving pieces, so they need to be built differently. Let’s take a look…

Micro: ADXL345 accelerometer, lid on

Here we can see that the circuit’s internals are covered by a metal lid, so the epoxy does not glue the moving pieces together.

I first tried to remove the lid by pushing thin metallic tweezers from its side. In the process, I destroyed the upper layer of the MEMS:

Micro: ADXL345 accelerometer, destroyed during delidding

I decapped another ADXL345, loaded a brand new blade on my X-Acto knife, and carefully cut through the lid’s edges using a USB “microscope” to see what I was doing. I cut through the wire bonds in the process, because I only cared about seeing the MEMS, not about keeping the chip functional. Here’s the undamaged top layer of the accelerometer:

Micro: ADXL345 accelerometer, intact MEMS, damaged bonds

Looks great! I did not remove the residue for these pictures to avoid damaging the very delicate MEMS parts.

Taking Pictures Under the Microscope

The ideal way to take pictures under the microscope would be to use a camera designed to fit into the microscope. I tried using a cheapish one included with the microscope I bought, but the results were pretty terrible; probably because lighting completely opaque samples like these is hard and imperfect in a microscope like mine.

Better cameras would definitely yield MUCH better results, but they are not cheap.

A good smartphone’s camera is better at handling poor/irregular lighting conditions, but aligning it to the eyepiece is terribly annoying and imprecise.

My solution was to design and 3D print an adapter to hold my old iPhone 6 directly aligned to the microscope’s lens. It’s not perfect, but it’s immensely better than everything else I tried.

iPhone-microscope adapter in action

Link to the 3d model

Using this method, taking one decent picture of the IC is easy enough. But if we want a complete picture of the IC under higher magnification, we need to take multiple pictures and stitch them together. For that, I used Image Composite Editor, created by the Microsoft Research Computational Photography Group. Getting it to create good results was not without headaches and tediousness, but once it works, it does create pretty amazing results. I’d definitely recommend it.
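When planning a composite, it helps to estimate how many shots you will need before you start. The sketch below is a rough planning aid, not part of my original workflow: it assumes neighbouring frames must overlap by some fraction so the stitching software can match them, and every number in it (die size, field of view, 30% overlap) is hypothetical.

```python
import math


def tiles_needed(sample_mm: float, fov_mm: float, overlap: float) -> int:
    """Shots needed along one axis so neighbouring frames overlap by `overlap` (0..1)."""
    if sample_mm <= fov_mm:
        return 1
    step = fov_mm * (1.0 - overlap)  # distance the stage moves between shots
    return 1 + math.ceil((sample_mm - fov_mm) / step)


# Hypothetical numbers: a 3.0 x 2.5 mm die, a 1.0 x 0.75 mm field of view, 30% overlap
cols = tiles_needed(3.0, 1.0, 0.30)
rows = tiles_needed(2.5, 0.75, 0.30)

print(f"{cols} columns x {rows} rows = {cols * rows} pictures")
```

Higher magnification shrinks the field of view, so the picture count grows quickly; that is a big part of why stitching gets tedious.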

Successful Imaging

74LS48 BCD to 7 segment display decoder:

Micro: 74LS48 BCD to 7segment decoder - Minimal damage

555 Timer:

Micro: 555, Fully exposed silicon die, intact wire bonds

Low magnification PIC16f84:

Micro: PIC16f84 after ultrasonic acetone method

Composite image of PIC16f84:

Micro: Composite image of PIC16f84

ADXL345 accelerometer:

Micro: ADXL345 accelerometer, intact MEMS, damaged bonds

Resources

Papers:

Commercial documentation:

Insightful research:

Quick development of bluetooth-based costume props using Arduino and ESP32

We hosted a Halloween party for some friends last week, and I wanted to integrate my costume (whatever it was) with the house decorations. I only had a handful of evenings available to get everything up and running, so I had to build something just complex enough to entice guests to play with it. Preferably using parts I already had in the lab.

Here’s the final product:

This post describes the final solution, pitfalls I ran into, and the reasoning behind some decisions.

The Hardware

Hardware overview

The BLE server

The circuit hidden in the costume prop is called, in Bluetooth GATT terms, the server. It uses an accelerometer to detect abrupt movements. The sensor is calibrated so the prop can be moved freely, and only hitting it, tapping it on the ground, or moving it aggressively will trigger a notification to the other device.
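One simple way to express "only abrupt movements count" is a threshold on how far the measured acceleration strays from gravity. The sketch below illustrates that idea in plain Python; it is not necessarily how the actual firmware does it, and the threshold value and sample readings are hypothetical.

```python
import math

THRESHOLD_G = 1.5  # hypothetical: deviation from 1 g that counts as a hit


def is_abrupt(x_g: float, y_g: float, z_g: float, threshold_g: float = THRESHOLD_G) -> bool:
    """True when the acceleration magnitude deviates from gravity by more than the threshold."""
    magnitude = math.sqrt(x_g * x_g + y_g * y_g + z_g * z_g)
    return abs(magnitude - 1.0) > threshold_g


# Made-up readings, in g, for three situations
samples = [
    (0.0, 0.0, 1.0),   # prop at rest
    (0.3, -0.2, 1.1),  # prop being carried around
    (2.5, 0.4, 3.0),   # prop smacked against the ground
]
events = [is_abrupt(*sample) for sample in samples]

print(events)
```

Raising the threshold makes the prop harder to trigger; lowering it makes normal handling set it off.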

It uses this hardware:

BT server circuit diagram

Since this part is gonna be smacked around, it’s important to solder the connections together. This is my final circuit soldered on a perfboard:

BT server picture

In order to avoid soldering the development boards themselves (in case we wanna use them for another project in the future), it’s a good idea to trim female pin headers like these and connect them to the dev modules. Then, solder those headers to the perfboard.

The BLE client

On the other end, there’s the GATT client. It boots, scans for servers, connects to ours, and starts polling its attributes. Whenever it receives a notification from the server, it simulates lightning using the LED strip, and plays thunder sound files over the speakers. If the ambiance jumper (GPIO19) is set, it behaves as if a notification was received every 45-65 seconds.
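The trigger logic described above can be sketched in a few lines. This is a plain-Python simulation, not the Arduino firmware: the function names are made up, and picking the delay uniformly at random between 45 and 65 seconds is an assumption about how that range is used.

```python
import random

AMBIANCE_MIN_S = 45.0
AMBIANCE_MAX_S = 65.0


def next_ambiance_delay(rng: random.Random) -> float:
    """Seconds to wait before the next ambiance event (assumed uniform in the range)."""
    return rng.uniform(AMBIANCE_MIN_S, AMBIANCE_MAX_S)


def should_trigger(notified: bool, jumper_set: bool, now_s: float, deadline_s: float) -> bool:
    """Fire on a real notification, or on the ambiance timer when the jumper is set."""
    return notified or (jumper_set and now_s >= deadline_s)


# Simulate ten minutes, one tick per second, jumper set, no notifications from the prop
rng = random.Random(0)
now_s = 0.0
deadline_s = next_ambiance_delay(rng)
trigger_times = []
while now_s < 600.0:
    if should_trigger(False, True, now_s, deadline_s):
        trigger_times.append(now_s)
        deadline_s = now_s + next_ambiance_delay(rng)
    now_s += 1.0

print(trigger_times)
```

Randomizing the delay keeps the thunder from feeling like it runs on a metronome.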

For a detailed explanation on how to wire the MP3 shield, check out the Arduino sketches. You may also find this post on oneguyblog.com useful. I used his audio files and some of his code for this project.

This is the hardware I used:

  • 1x ESP32 Core Board V2. Any other ESP32 dev board should work
  • 1x TIP31A NPN power transistor. Used to drive the LED strip using PWM
  • 1x MP3-TF-16P MP3 player module. It’s a co-processor controlled over serial to read audio files from an SD card and play them over headphones or speakers
  • Analog LED strip
    • Some LED strips are controlled using digital signals, usually to make LED clusters individually addressable. We want one of the simpler strips, controlled directly through the voltage you apply to it, so we can run it using PWM
    • I used an RGB, common-anode strip (4 pins: +12V, R, G, B). But the R, G and B pins are shorted together, so we’re controlling it like a white LED strip (2 pins: +12V, GND).
  • Computer speakers
  • Generics:
    • Perf board
    • Pin array headers
    • 2x 1Kohm resistors
    • Mini-jack connector
    • Wire spool
    • female-to-female wires (if not soldering everything)
    • 1x Jumper
    • Solder
    • [optional] female single line headers - to avoid soldering dev modules directly
  • Off the shelf:
    • 12V power supply
    • USB power supply - Or a 12V to 5V regulator (I recommend Traco Power)

BT client circuit diagram

This is how it looks soldered together; the LED strip would be connected at either one of the 2 headers labeled LED.

BT client picture

The Firmware

As I said, I wrote this firmware in a handful of nights. It’s heavily based on the very rudimentary BLE GATT example included with the ESP32 libraries, and includes some pieces of code gathered online. It’s not pretty, it’s imperfect, and you may have to manually reboot the BLE client at some point.

Since I may improve this firmware in the future to fix issues or add functionality, I recommend getting the code from the BLE-Halloween-Costume GitHub repository.

For the sake of completeness, and in case that GH repo ever becomes incompatible with the exact hardware described in this post, here’s a snapshot of the code in its current status:

BLE Client firmware

BLE Server firmware

Decision Making

Picking a wireless solution

There are plenty of wireless solutions available in the market that could be used to solve this problem, from raw 433MHz radios to WiFi, LoRa, Bluetooth, ZigBee…

In order to decide which one is best, first you need to lay down your requirements. These were mine; I think they apply to most costumes:

  • Good for close and medium range: Approx. 15m radius, with walls and EM noise disturbing the signal
  • No need for an Internet connection
  • Relatively low-power consumption on the transmitter
  • Modern, consumer-grade protocol
  • Privacy and security are irrelevant, since this device is not critical, not dangerous, and is only gonna be used once (Watch out!)

Bluetooth 5 can certainly handle those requirements.

Bluetooth 5 can be used in different ways to optimize certain requirements: energy consumption, data throughput, bidirectional communication, etc. Using the Generic Attribute Profile (GATT), we can cover most requirements that might come up in a costume. We can send a notification whenever an event happens in the prop, and we can use attributes to report other status information (e.g. switches in the prop could set the other device’s behaviour).
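To make the split between notifications and readable attributes concrete, here is a toy model in plain Python. It does not use any Bluetooth library and skips everything a real stack does (advertising, connections, UUIDs); the class and the characteristic names `hit` and `mode` are invented for illustration.

```python
from typing import Callable, List


class Characteristic:
    """Toy stand-in for a GATT characteristic: a value that can be read or pushed."""

    def __init__(self, name: str, value: bytes = b"\x00") -> None:
        self.name = name
        self.value = value
        self._subscribers: List[Callable[[bytes], None]] = []

    def read(self) -> bytes:
        """Client-initiated: fetch the current value."""
        return self.value

    def subscribe(self, callback: Callable[[bytes], None]) -> None:
        """Client asks to be told whenever the value changes."""
        self._subscribers.append(callback)

    def notify(self, value: bytes) -> None:
        """Server-initiated: update the value and push it to every subscriber."""
        self.value = value
        for callback in self._subscribers:
            callback(value)


# Server side (the prop): one event-style characteristic, one status-style characteristic
hit = Characteristic("hit")
mode = Characteristic("mode", b"\x01")

# Client side: subscribe to events, read status on demand
received: List[bytes] = []
hit.subscribe(received.append)

hit.notify(b"\x01")         # the prop was smacked
current_mode = mode.read()  # the client checks a switch position

print(received, current_mode)
```

Events that need an immediate reaction fit the notify path, while slow-changing settings fit the read path.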

If you’re using this post to build your own project, and your requirements differ from mine, make sure Bluetooth 5 is the right fit for you. And even if Bluetooth 5 is an appropriate choice, GATT might not be. Figure out what you need to optimize, and find out what’s the best fit.

Resources: Bluetooth Core Specification

Picking the hardware

Picking the right microcontroller and development board for a serious project is often a huge and rather complicated task. Fortunately, this project is designed for a single use, so my only concerns were part lead time and development time. I was able to use only parts I already had, except for the MP3-TF-16P module.

If you’re interested in the thought process behind my hardware choices, here’s some of it:

I’m very familiar with the ESP32 from work, I have like a dozen of them at hand, and I knew they work well with Arduino. It’s one of the most popular microcontrollers in the IoT market, especially for Proofs of Concept and DIY projects. And it also supports WiFi, which could be useful in future versions. Perfect fit for this project.

I picked the ADXL345 accelerometer because I had a few development modules at hand.

Picked the MP3-TF-16P because it was rather popular in the Arduino communities, and was available for next-day delivery.

The LED strip had to be analog and not individually addressable so we could better simulate lightning using PWM. A white LED strip would work fine, but I only had RGB ones at hand.
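As an illustration of what "simulating lightning using PWM" can look like, the sketch below generates a sequence of duty-cycle steps: a few bright flashes, each followed by a quick fade and a dark gap. It is a plain-Python model, not the firmware from this project; the 8-bit duty range, the timings and the halving fade are all assumptions.

```python
import random
from typing import List, Tuple


def lightning_pattern(rng: random.Random, flashes: int = 3) -> List[Tuple[int, int]]:
    """Return (duty, hold_ms) steps: each flash jumps to a bright level, then fades out."""
    steps: List[Tuple[int, int]] = []
    for _ in range(flashes):
        peak = rng.randint(128, 255)                 # 8-bit duty cycle, bright flash
        steps.append((peak, rng.randint(20, 80)))    # hold the flash briefly
        duty = peak
        while duty > 0:
            duty = duty // 2                         # fade: halve the brightness until off
            steps.append((duty, 10))
        steps.append((0, rng.randint(50, 300)))      # dark gap before the next flash
    return steps


pattern = lightning_pattern(random.Random(1))

for duty, hold_ms in pattern[:5]:
    print(f"duty={duty:3d} for {hold_ms} ms")
```

On the real hardware, each step would translate into setting the PWM duty cycle on the transistor's pin and waiting for the hold time.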

The power transistor had to be beefy enough to drive all 3 channels (RGB) for the entire LED strip, and -less importantly- fast enough to handle PWM. I already had some TIP31As in the lab, and they can easily drive this strip without breaking a sweat. A MOSFET would be better for PWM control, but the ones I had at hand were not beefy enough to handle the current required for my LED strip.

Everything else (power brick, perf board, resistors, pin headers…) is generic stuff I had in the lab.

Building on this

If you’d like to build your own project based on this one, here are a few suggestions on where to start:

  • Create a good costume for it (Zeus, electric chair, boxer…)
  • Fix connectivity issues. Under some circumstances, the client needs to be restarted to re-connect to the server
  • The custom board for the client should include a 12V to 5V regulator. Feeding power from 2 different outlets is completely unnecessary
  • Use GATT characteristics to set the frequency of ambiance lightning/thunder dynamically from the costume prop
  • Set up more clients in your costume/props/drink/home so more things react when an event occurs
  • Create 2 costumes that interact with one another
  • Find a good way to run the 12V LED strip off a portable battery, and integrate another LED strip into the costume
  • Use more capabilities of Bluetooth GATT to create more diverse and complex interactions

If this post was useful to you, and you decide to use it, don’t forget to send me a video of your results! :)

Also, if you’re gonna build more functionality into your system, you should really try to understand Bluetooth GATT a little better. See the resources below.

Resources

Sources:

Understanding Bluetooth GATT:

Technical documents:

Practical Reverse Engineering Part 5 - Digging Through the Firmware

  • Part 1: Hunting for Debug Ports
  • Part 2: Scouting the Firmware
  • Part 3: Following the Data
  • Part 4: Dumping the Flash
  • Part 5: Digging Through the Firmware

In part 4 we extracted the entire firmware from the router and decompressed it. As I explained then, you can often get most of the firmware directly from the manufacturer’s website: Firmware upgrade binaries often contain partial or entire filesystems, or even entire firmwares.

In this post we’re gonna dig through the firmware to find potentially interesting code, common vulnerabilities, etc.

I’m gonna explain some basic theory on the Linux architecture, disassembling binaries, and other related concepts. Feel free to skip some of the parts marked as [Theory]; the real hunt starts at ‘Looking for the Default WiFi Password Generation Algorithm’. At the end of the day, we’re just: obtaining source code in case we can use it, using grep and common sense to find potentially interesting binaries, and disassembling them to find out how they work.

One step at a time.
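The "grep and common sense" step boils down to searching every file in the extracted filesystem for interesting byte strings. Plain `grep -r` does the job; the sketch below shows the same idea in Python in case you want to script it further. The directory layout, file names and search terms in the demo are made up; it runs against a throwaway temporary directory rather than a real firmware dump.

```python
import os
import tempfile
from typing import Iterator, List, Tuple


def grep_tree(root: str, needles: List[bytes]) -> Iterator[Tuple[str, bytes]]:
    """Yield (path, needle) for every regular file under root containing a needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # Extracted firmware is full of symlinks and device nodes; skip them
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    data = handle.read().lower()
            except OSError:
                continue
            for needle in needles:
                if needle.lower() in data:  # case-insensitive byte search
                    yield path, needle


# Demo on a throwaway directory standing in for an extracted filesystem
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "bin"))
    with open(os.path.join(root, "bin", "example_daemon"), "wb") as handle:
        handle.write(b"\x7fELF...default_wifi_PASSWORD...")
    with open(os.path.join(root, "bin", "other_tool"), "wb") as handle:
        handle.write(b"\x7fELF...nothing to see here...")
    hits = [
        (os.path.relpath(path, root), needle)
        for path, needle in grep_tree(root, [b"password", b"passphrase"])
    ]

print(hits)
```

Reading files as bytes matters here: most of the interesting hits live inside binaries, not text files.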

Gathering and Analysing Open Source Components

GPL Licenses - What They Are and What to Expect [Theory]

Linux, U-Boot and other tools used in this router are licensed under the General Public License. This license mandates that the source code for any binaries built with GPL’d projects must be made available to anyone who wants it.

Having access to all that source code can be a massive advantage during the reversing process. The kernel and the bootloader are particularly interesting, and not just to find security issues.

When hunting for GPL’d sources you can usually expect one of these scenarios:

  1. The code is freely available on the manufacturer’s website, nicely ordered and completely open to be tinkered with. For instance: Apple products or the Amazon Echo
  2. The source code is available by request
    • They send you an email with the sources you requested
    • They ask you for “a reasonable amount” of money to ship you a CD with the sources
  3. They decide to (illegally) ignore your requests. If this happens to you, consider being nice over trying to get nasty.

In the case of this router, the source code was available on their website, even though it was a huge pain in the ass to find; it took me a long time of manual and automated searching but I ended up finding it in the mobile version of the site:

ls -lh gpl_source

But what if they’re hiding something!? How could we possibly tell whether the sources they gave us are the same they used to compile the production binaries?

Challenges of Binary Verification [Theory]

Theoretically, we could try to compile the source code ourselves and compare the resulting binary with the one we extracted from the device. In practice, that is far more complicated than it sounds.

The exact contents of the binary are strongly tied to the toolchain and overall environment they were compiled in. We could try to replicate the environment of the original developers, finding the exact same versions of everything they used, so we can obtain the same results. Unfortunately, most compilers are not built with output replicability in mind; even if we managed to find the exact same version of everything, details like timestamps, processor-specific optimizations or file paths would stop us from getting a byte-for-byte identical match.
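To make the problem a bit more concrete, here's a minimal Python sketch of what a naive comparison looks like. The two "builds" are made-up byte strings (not real binaries from this router) that only differ in an embedded timestamp, and `compare_binaries` is just a name I picked for illustration:

```python
import hashlib


def compare_binaries(ours: bytes, theirs: bytes) -> dict:
    """Summarise how two builds of the same program differ."""
    # Offsets where both files have a byte but the bytes differ
    differing = [i for i, (a, b) in enumerate(zip(ours, theirs)) if a != b]
    return {
        "identical": ours == theirs,
        "sha256_ours": hashlib.sha256(ours).hexdigest(),
        "sha256_theirs": hashlib.sha256(theirs).hexdigest(),
        "size_delta": len(theirs) - len(ours),
        "differing_offsets": differing,
    }


# Two fake builds: same "code", different embedded build date (assumption)
build_a = b"\x7fELF...code...built 2016-01-01"
build_b = b"\x7fELF...code...built 2016-01-02"

report = compare_binaries(build_a, build_b)
print(report["identical"], report["differing_offsets"])
```

A single changed byte is enough to produce completely different hashes, so a hash mismatch on its own tells us nothing about whether the difference is a timestamp or a backdoor.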

If you’d like to read more about it, I can recommend this paper. The authors go through the challenges they had to overcome in order to verify that the official binary releases of the application ‘TrueCrypt’ were not backdoored.

Introduction to the Architecture of Linux [Theory]

In multiple parts of the series, we’ve discussed the different components found in the firmware: bootloader, kernel, filesystem and some protected memory to store configuration data. In order to know where to look for what, it’s important to understand the overall architecture of the system. Let’s quickly review this device’s:

Linux Architecture

The bootloader is the first piece of code to be executed on boot. Its job is to prepare the kernel for execution, jump into it and stop running. From that point on, the kernel controls the hardware and uses it to run user space logic. A few more details on each of the components:

  1. Hardware: The CPU, Flash, RAM and other components are all physically connected
  2. Linux Kernel: It knows how to control the hardware. The developers take the Open Source Linux kernel, write drivers for their specific device and compile everything into an executable Kernel. It manages memory, reads and writes hardware registers, etc. In more complex systems, “kernel modules” provide the possibility of keeping device drivers as separate entities in the file system, and dynamically load them when required; most embedded systems don’t need that level of versatility, so developers save precious resources by compiling everything into the kernel
  3. libc (“The C Library”): It serves as a general purpose wrapper for the System Call API, including extremely common functions like printf, malloc or system. Developers are free to call the system call API directly, but in most cases, it’s MUCH more convenient to use libc. Instead of the extremely common glibc (GNU C library) we usually find in more powerful systems, this device uses a version optimised for embedded devices: uClibc.
  4. User Applications: Executable binaries in /bin/ and shared objects in /lib/ (libraries that contain functions used by multiple binaries) comprise most of the high-level logic. Shared objects are used to save space by storing commonly used functions in a single location

Bootloader Source Code

As I’ve mentioned multiple times over this series, this router’s bootloader is U-Boot. U-Boot is GPL licensed, but Huawei failed to include the source code in their website’s release.

Having the source code for the bootloader can be very useful for some projects, where it can help you figure out how to run a custom firmware on the device or modify something; some bootloaders are much more feature-rich than others. In this case, I’m not interested in anything U-Boot has to offer, so I didn’t bother following up on the source code.

Kernel Source Code

Let’s just check out the source code and look for anything that might help. Remember the factory reset button? The button is part of the hardware layer, which means the GPIO pin that detects the button press must be controlled by the drivers. These are the logs we saw coming out of the UART port in a previous post:

UART system restore logs

With some simple grep commands we can see how the different components of the system (kernel, binaries and shared objects) can work together and produce the serial output we saw:

System reset button propagates to user space

Having the kernel can help us find poorly implemented security-related algorithms and other weaknesses that are sometimes considered ‘accepted risks’ by manufacturers. Most importantly, we can use the drivers to compile and run our own OS on the device.

User Space Source Code

As we can see in the GPL release, some components of the user space are also open source, such as busybox and iptables. Given the right (wrong) versions, public vulnerability databases could be enough to find exploits for any of these.

That being said, if you’re looking for 0-days, backdoors or sensitive data, your best bet is not the open source projects. Device-specific and closed-source code developed by the manufacturer or their contractors has not been so heavily tested, and may very well be riddled with bugs. Most of this code is stored as binaries in the user space; we’ve got the entire filesystem, so we’re good.

Without the source code for user space binaries, we need to find a way to read the machine code inside them. That’s where disassembly comes in.

Binary Disassembly [Theory]

The code inside every executable binary is just a compilation of instructions encoded as Machine Code so they can be processed by the CPU. Our processor’s datasheet will explain the direct equivalence between assembly instructions and their machine code representations. A disassembler has been given that equivalence so it can go through the binary, find data and machine code and translate it into assembly. Assembly is not pretty, but at least it’s human-readable.
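As a rough illustration of that equivalence, this Python sketch decodes a handful of 32-bit MIPS I-type instructions by hand. It only knows 3 opcodes and assumes little-endian words, so treat it as a toy and not as how IDA or radare2 actually work:

```python
import struct

# Conventional MIPS register names, indexed by register number
REGISTERS = [
    "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
    "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7",
    "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
    "t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra",
]

# Tiny subset of I-type opcodes; a real disassembler knows all of them
I_TYPE = {0x09: "addiu", 0x23: "lw", 0x2B: "sw"}


def disassemble_word(raw: bytes) -> str:
    """Translate one little-endian 32-bit MIPS instruction into assembly."""
    (word,) = struct.unpack("<I", raw)
    opcode = word >> 26           # top 6 bits
    rs = (word >> 21) & 0x1F      # source/base register
    rt = (word >> 16) & 0x1F      # target register
    imm = word & 0xFFFF           # 16-bit immediate
    if imm & 0x8000:              # sign-extend the immediate
        imm -= 0x10000
    name = I_TYPE.get(opcode)
    if name is None:
        return f".word 0x{word:08x}"  # unknown to this toy decoder
    if name == "addiu":
        return f"addiu ${REGISTERS[rt]}, ${REGISTERS[rs]}, {imm}"
    return f"{name} ${REGISTERS[rt]}, {imm}(${REGISTERS[rs]})"


# A typical function prologue: make room on the stack, save $ra
prologue = [struct.pack("<I", 0x27BDFFE0), struct.pack("<I", 0xAFBF001C)]
listing = [disassemble_word(word) for word in prologue]
print("\n".join(listing))
```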

Due to the very low-level nature of the kernel, and how heavily it interacts with the hardware, it is incredibly difficult to make any sense of its binary. User space binaries, on the other hand, are abstracted away from the hardware and follow unix standards for calling conventions, binary format, etc. They’re an ideal target for disassembly.

There are lots of disassemblers for popular architectures like MIPS; some better than others both in terms of functionality and usability. I’d say these 3 are the most popular and powerful disassemblers in the market right now:

  • IDA Pro: By far the most popular disassembler/debugger in the market. It is extremely powerful, multi-platform, and there are loads of users, tutorials, plugins, etc. around it. Unfortunately, it’s also VERY expensive; a single person license of the Pro version (required to disassemble MIPS binaries) costs over $1000
  • Radare2: Completely Open Source, uses an impressively advanced command line interface, and there’s a great community of hackers around it. On the other hand, the complex command line interface -necessary for the sheer amount of features- makes for a rather steep learning curve
  • Binary Ninja: Not open source, but reasonably priced at $100 for a personal license, it’s a middle ground between IDA and radare. It’s still a very new tool; it was just released this year, but it’s improving and gaining popularity day by day. It already works very well for some architectures, but unfortunately it’s still missing MIPS support (coming soon) and some other features I needed for these binaries. I look forward to giving it another try when it’s more mature

In order to display the assembly code in a more readable way, all these disassemblers use a “Graph View”. It provides an intuitive way to follow the different possible execution flows in the binary:

IDA Small Function Graph View

Such a clear representation of branches, and their conditionals, loops, etc. is extremely useful. Without it, we’d have to manually jump from one branch to another in the raw assembly code. Not so fun.

If you read the code in that function you can see the disassembler does a great job of displaying references to functions and hardcoded strings. That might be enough to help us find something juicy, but in most cases you’ll need to understand the assembly code to a certain extent.

Gathering Intel on the CPU and Its Assembly Code [Theory]

Let’s take a look at the format of our binaries:

$ file bin/busybox
bin/busybox: ELF 32-bit LSB executable, MIPS, MIPS-II version 1 (SYSV), dynamically linked (uses shared libs), corrupted section header size

Because ELF headers are designed to be platform-agnostic, we can easily find out some info about our binaries. As you can see, we know the architecture (32-bit MIPS), endianness (LSB), and whether it uses shared libraries.
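For the curious, this is roughly where `file` gets that information from. The sketch below reads the class, endianness and machine fields from the first bytes of an ELF header; the sample header is hand-built for the example (it is not copied from bin/busybox), and the machine table only lists a few well-known values:

```python
import struct

EI_CLASS = {1: "32-bit", 2: "64-bit"}   # e_ident[4]
EI_DATA = {1: "LSB", 2: "MSB"}          # e_ident[5]
E_MACHINE = {3: "Intel 80386", 8: "MIPS", 40: "ARM", 62: "x86-64"}


def describe_elf(header: bytes) -> str:
    """Describe an ELF file from (at least) its first 20 bytes."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = EI_CLASS.get(header[4], "unknown class")
    order = EI_DATA.get(header[5], "unknown endianness")
    # Multi-byte header fields use the endianness declared in e_ident
    fmt = "<" if header[5] == 1 else ">"
    _e_type, e_machine = struct.unpack_from(fmt + "HH", header, 16)
    machine = E_MACHINE.get(e_machine, f"machine {e_machine}")
    return f"ELF {bits} {order}, {machine}"


# Hand-built header: magic, 32-bit, little-endian, version 1, padding,
# then e_type=2 (executable) and e_machine=8 (MIPS)
sample_header = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 8)
print(describe_elf(sample_header))
```

To try it on a real file, read the first 20 bytes of the binary in `"rb"` mode and pass them to `describe_elf`.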

We can verify that information thanks to the Ralink’s product brief, which specifies the processor core it uses: MIPS24KEc

Product Brief Highlighted Processor Core

With the exact version of the CPU core, we can easily find its datasheet as released by the company that designed it: Imagination Technologies.

Once we know the basics we can just drop the binary into the disassembler. It will help validate some of our findings, and provide us with the assembly code. In order to understand that code we’re gonna need to know the architecture’s instruction sets and register names:

  • MIPS Instruction Set
  • MIPS Pseudo-Instructions: Very simple combinations of basic instructions, used for developer/reverser convenience
  • MIPS Alternate Register Names: In MIPS, there’s no real difference between registers; the CPU doesn’t care about what they’re called. Alternate register names exist to make the code more readable for the developer/reverser: $a0 to $a3 for function arguments, $t0 to $t9 for temporary registers, etc.

Beyond instructions and registers, some architectures may have some quirks. One example of this would be the presence of delay slots in MIPS: Instructions that appear immediately after branch instructions (e.g. beqz, jalr) but are actually executed before the jump. That sort of non-linearity would be unthinkable in most other architectures.
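If delay slots sound confusing, this toy Python model may help. It walks a made-up 4-instruction program and returns the order in which a MIPS-like CPU would run it; the instructions and the `fail` label are invented for the example, and the model only handles forward branches:

```python
def execution_order(program, taken=True):
    """Return the order in which a MIPS-like CPU executes `program`.

    `program` is a list of (label, text, branch_target) tuples, where
    branch_target is a label name, or None for ordinary instructions.
    """
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    order, pc = [], 0
    while pc < len(program):
        _, text, target = program[pc]
        order.append(text)
        if target is not None and taken:
            # The delay slot instruction runs before execution lands
            # on the branch target
            if pc + 1 < len(program):
                order.append(program[pc + 1][1])
            pc = labels[target]
        else:
            pc += 1
    return order


program = [
    (None, "beqz $v0, fail", "fail"),
    (None, "addiu $a0, $a0, 8", None),   # delay slot
    (None, "li $v1, 1", None),           # skipped when the branch is taken
    ("fail", "move $a1, $zero", None),
]
print(execution_order(program, taken=True))
```

Note how the addiu runs whether or not the branch is taken; that's exactly why compilers like to put argument setup in there.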

Some interesting links if you’re trying to learn MIPS: Intro to MIPS Reversing using Radare2, MIPS Assembler and Runtime Simulator, Toolchains to cross-compile for MIPS targets.

Example of User Space Binary Disassembly

Following up on the reset key example we were using for the Kernel, we’ve got the code that generated some of the UART log messages, but not all of them. Since we couldn’t find the ‘button has been pressed’ string in the kernel’s source code, we can deduce it must have come from user space. Let’s find out which binary printed it:

~/Tech/Reversing/Huawei-HG533_TalkTalk/router_filesystem
$ grep -i -r "restore default success" .
Binary file ./bin/cli matches
Binary file ./bin/equipcmd matches
Binary file ./lib/libcfmapi.so matches

3 files contain the next string found in the logs: 2 executables in /bin/ and 1 shared object in /lib/. Let’s take a look at /bin/equipcmd with IDA:

restore success string in /bin/equipcmd - IDA GUI

If we look closely, we can almost read the C code that was compiled into these instructions. We can see a “clear configuration file”, which would match the ERASE commands we saw in the SPI traffic capture to the flash IC. Then, depending on the result, one of two strings is printed: restore default success or restore default fail . On success, it then prints something else, flushes some buffers and reboots; this also matches the behaviour we observed when we pressed the reset button.

That function is a perfect example of delay slots: the addiu instructions that set both strings as arguments -$a0- for the 2 puts are in the delay slots of the branch if equals zero and jump and link register instructions. They will actually be executed before branching/jumping.

As you can see, IDA has the name of all the functions in the binary. That won’t necessarily be the case in other binaries, and now’s a good time to discuss why.

Function Names in a Binary - Intro to Symbol Tables [Theory]

The ELF format specifies the usage of symbol tables: chunks of data inside a binary that provide useful debugging information. Part of that information is the human-readable name of every function in the binary. This is extremely convenient for a developer debugging their binary, but in most cases it should be removed before releasing the production binary. The developers were nice enough to leave most of them in there :)

In order to remove them, the developers can use tools like strip, which know what must be kept and what can be spared. These tools serve a double purpose: They save memory by removing data that won’t be necessary at runtime, and they make the reversing process much more complicated for potential attackers. Function names give context to the code we’re looking at, which is massively helpful.

In some cases -mostly when disassembling shared objects- you may see some function names or none at all. The ones you WILL see are the Dynamic Symbols in the .dynsym table: We discussed earlier the massive amount of memory that can be saved by using shared objects to keep the pieces of code you need to re-use all over the system (e.g. printf()). In order to locate pieces of data inside the shared object, the caller uses their human-readable name. That means the names for functions and variables that need to be publicly accessible must be left in the binary. The rest of them can be removed, which is why ELF uses 2 symbol tables: .dynsym for publicly accessible symbols and .symtab, which also holds the internal ones and can be stripped.

For more details on symbol tables and other intricacies of the ELF format, check out: The ELF Format - How programs look from the inside, Inside ELF Symbol Tables and the ELF spec (PDF).

Looking for the Default WiFi Password Generation Algorithm

What do We Know?

Remember the wifi password generation algorithm we discussed in part 3? (The Pot of Gold at the End of the Firmware) I explained then why I didn’t expect this router to have one, but let’s take a look anyway.

If you recall, these are the default WiFi credentials in my router:

Router Sticker - Annotated

So what do we know?

  1. Each device is pre-configured with a different set of WiFi credentials
  2. The credentials could be hardcoded at the factory or generated on the device. Either way, we know from previous posts that both SSID and password are stored in the reserved area of Flash memory, and they’re right next to each other
    • If they were hardcoded at the factory, the router only needs to read them from a known memory location
    • If they are generated in the device and then stored to flash, there must be an algorithm in the router that -given the same inputs- always generates the same outputs. If the inputs are public (e.g. the MAC address) and we can find, reverse and replicate the algorithm, we could calculate default WiFi passwords for any other router that uses the same algorithm
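To show why the second scenario would be such a big deal, here's a completely HYPOTHETICAL generator in Python. It is not taken from this router or any other; the hash, the truncation and the function name are all made up. The only point is that anything derived purely from public inputs can be recomputed by anyone:

```python
import hashlib


def toy_default_password(mac: str, length: int = 8) -> str:
    """HYPOTHETICAL generator, not taken from this or any real router."""
    # Normalise the MAC so "AA:BB:..." and "aabb..." give the same result
    normalised = mac.replace(":", "").lower().encode("ascii")
    # Same input always produces the same output: no secrets involved
    return hashlib.sha256(normalised).hexdigest()[:length]


# An attacker who can see the MAC over the air gets the same answer
# as the router did at the factory
on_router = toy_default_password("00:11:22:AA:BB:CC")
by_attacker = toy_default_password("001122aabbcc")
print(on_router == by_attacker)
```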

Let’s see what we can do with that…

Finding Hardcoded Strings

Let’s assume there IS such an algorithm in the router. Between SSID and password, there’s only one string that remains constant across devices: TALKTALK-. This string is prepended to the last 6 characters of the MAC address. If the generation algorithm is in the router, surely this string must be hardcoded in there. Let’s look it up:

$ grep -r 'TALKTALK-' .
Binary file ./bin/cms matches
Binary file ./bin/nmbd matches
Binary file ./bin/smbd matches

2 of those 3 binaries (nmbd and smbd) are part of samba, the program used to share the USB flash drive as a network storage device. They probably use the string to identify the router over the network. Let’s take a look at the other one: /bin/cms.

Reversing the Functions that Use Them

IDA TALKTALK-XXXXXX String Being Built

That looks exactly the way we’d expect the SSID generation algorithm to look. The code is located inside a rather large function called ATP_WLAN_Init, and somewhere in there it performs the following actions:

  1. Find out the MAC address of the device we’re running on:
    • mac = BSP_NET_GetBaseMacAddress()
  2. Create the SSID string:
    • snprintf(SSID, "TALKTALK-%02x%02x%02x", mac[3], mac[4], mac[5])
  3. Save the string somewhere:
    • ATP_DBSetPara(SavedRegister3, 0xE8801E09, SSID)
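Replicating step 2 outside the router is trivial. This Python sketch uses the same format string we read in the disassembly; the function name and the example MAC address are mine:

```python
def default_ssid(mac: bytes) -> str:
    """Build the SSID from the last 3 bytes of a 6-byte MAC address."""
    if len(mac) != 6:
        raise ValueError("a MAC address is exactly 6 bytes long")
    # Same format string seen in ATP_WLAN_Init
    return "TALKTALK-%02x%02x%02x" % (mac[3], mac[4], mac[5])


# Made-up MAC address, just for the example
example_ssid = default_ssid(bytes.fromhex("001122aabbcc"))
print(example_ssid)
```

If the password were derived in a similar way we'd be done by now, which is why the next few paragraphs are a bit of a letdown.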

Unfortunately, right after this branch the function simply does an ATP_DBSave and moves on to start running commands and whatnot. e.g.:

ATP_WLAN_Init moves on before you

Further inspection of this function and other references to ATP_DBSave did not reveal anything interesting.

Giving Up

After some time using this process to find potentially relevant pieces of code, reverse them, and analyse them, I didn’t find anything that looked like the password generation algorithm. That would confirm the suspicions I’ve had since we found the default credentials in the protected flash area: The manufacturer used proper security techniques and flashed the credentials at the factory, which is why there is no algorithm. Since the designers manufacture their own hardware, the decision makes perfect sense for this device. They can do whatever they want with their manufacturing lines, so they decided to do it right.

I might take another look at it in the future, or try to find it in some other router (I’d like to document the process of reversing it), but you should know this method DOES work for a lot of products. There’s a long history of freely available default WiFi password generators.

Since we already know how to find relevant code in the filesystem binaries, let’s see what else we can do with that knowledge.

Looking for Command Injection Vulnerabilities

One of the most common, easy to find and dangerous vulnerabilities is command injection. The idea is simple; we find an input string that is gonna be used as an argument for a shell command. We try to append our own commands and get them to execute, bypassing any filters that the developers may have implemented. In embedded devices, such vulnerabilities often result in full root control of the device.

These vulnerabilities are particularly common in embedded devices due to their memory constraints. Say you’re developing the web interface used by the users to configure the device; you want to add the possibility to ping a user-defined server from the router, because it’s very valuable information to debug network problems. You need to give the user the option to define the ping target, and you need to serve them the results:

Router WEB Interface Ping in action

Once you receive the data of which server to target, you have two options: You find a library with the ICMP protocol implemented and call it directly from the web backend, or you could use a single, standard function call and use the router’s already existing ping shell command. The latter is easier to implement, saves memory, etc. and it’s the obvious choice. Taking user input (target server address) and using it as part of a shell command is where the danger comes in. Let’s see how this router’s web application, /bin/web, handles it:

/bin/web's ping function

A call to libc’s system() (not to be confused with a system call/syscall) is the easiest way to execute a shell command from an application. Sometimes developers wrap system() in custom functions in order to systematically filter all inputs, but there’s always something the wrapper can’t do or some developer who doesn’t get the memo.

Looking for references to system in a binary is an excellent way to find vectors for command injections. Just investigate the ones that look like they may be using unfiltered user input. These are all the references to system() in the /bin/web binary:

xrefs to system in /bin/web

Even the names of the functions can give you clues on whether or not a reference to system() will receive user input. We can also see some references to PIN and PUK codes, SIMs, etc. Seems like this application is also used in some mobile product…

I spent some time trying to find ways around the filtering provided by atp_gethostbyname (anything that isn’t a domain name causes an error), but I couldn’t find anything in this field or any others. Further analysis may prove me wrong. The idea would be to inject something to the effect of this:

Attempt reboot injection on ping field

Which would result in this final string being executed as a shell command: ping google.com -c 1; reboot; ping 192.168.1.1 > /dev/null. If the router reboots, we found a way in.
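Here's a small Python sketch of the vulnerable pattern and one way to avoid it. The command template is my guess at what an unfiltered implementation would look like, not the actual code in /bin/web, and the hostname check is a deliberately simple stand-in for whatever atp_gethostbyname really does. Nothing is executed; we only build the strings:

```python
import re

# Letters, digits, dots and hyphens only; must start and end alphanumeric
HOSTNAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def naive_ping_command(target: str) -> str:
    """Vulnerable: user input is pasted straight into a shell command."""
    return "ping " + target + " > /dev/null"


def is_plausible_hostname(target: str) -> bool:
    return HOSTNAME.fullmatch(target) is not None


def ping_argv(target: str) -> list:
    """Safer: validate first, then pass arguments as a list (no shell)."""
    if not is_plausible_hostname(target):
        raise ValueError("target does not look like a hostname")
    return ["ping", "-c", "1", target]


payload = "google.com -c 1; reboot; ping 192.168.1.1"
print(naive_ping_command(payload))
print(is_plausible_hostname(payload))
```

An argument list like the one returned by `ping_argv` could be handed to something like `subprocess.run` without involving a shell at all, so `;` would have no special meaning even if the filter missed it.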

As I said, I couldn’t find anything. Ideally we’d like to verify that for all input fields, whether they’re in the web interface or some other network interface. Another example of a network interface potentially vulnerable to remote command injections is the “LAN-Side DSL CPE Configuration” protocol, or TR-064. Even though this protocol was designed to be used over the internal network only, it’s been used to configure routers over the internet in the past. Command injection vulnerabilities in some implementations of this protocol have been used to remotely extract data like WiFi credentials from routers with just a few packets.

This router has a binary conveniently named /bin/tr064; if we take a look, we find this right in the main() function:

/bin/tr064 using /etc/serverkey.pem

That’s the private RSA key we found in Part 2 being used for SSL authentication. Now we might be able to impersonate a router in the system and look for vulnerabilities in their servers, or we might use it to find other attack vectors. Most importantly, it closes the mystery of the private key we found while scouting the firmware.

Looking for More Complex Vulnerabilities [Theory]

Even if we couldn’t find any command injection vulnerabilities, there are always other vectors to gain control of the router. The most common ones are good old buffer overflows. Any input string into the router, whether it is for a shell command or any other purpose, is handled, modified and passed around the code. An error by the developer calculating expected buffer lengths, not validating them, etc. in those string operations can result in an exploitable buffer overflow, which an attacker can use to gain control of the system.

The idea behind a buffer overflow is rather simple: We manage to pass a string into the system that contains executable code. We overwrite some address in the program so the execution flow jumps into the code we just injected. Now we can do anything that binary could do -in embedded systems like this one, where everything runs as root, it means immediate root pwnage.

Introducing an unexpectedly long input

Developing an exploit for this sort of vulnerability is not as simple as appending commands to find your way around a filter. There are multiple possible scenarios, and different techniques to handle them. Exploits using more involved techniques like ROP can become necessary in some cases. That being said, most household embedded systems nowadays are decades behind personal computers in terms of anti-exploitation techniques. Methods like Address Space Layout Randomization (ASLR), which are designed to make exploit development much more complicated, are usually disabled or not implemented at all.

If you’d like to find a potential vulnerability so you can learn exploit development on your own, you can use the same techniques we’ve been using so far. Find potentially interesting inputs, locate the code that manages them using function names, hardcoded strings, etc. and try to trigger a malfunction sending an unexpected input. If we find an improperly handled string, we might have an exploitable bug.

Once we’ve located the piece of disassembled code we’re going to attack, we’re mostly interested in string manipulation functions like strcpy, strcat, sprintf, etc. Their more secure counterparts strncpy, strncat, etc. are also potentially vulnerable to some techniques, but usually much more complicated to work with.
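To see why the unbounded versions are so dangerous, here's a very simplified Python model of a stack frame: a 16-byte buffer immediately followed by a 4-byte saved return address. Real stack layouts are more complicated, and both the sizes and the addresses here are made up for the example:

```python
BUF_SIZE = 16
ORIGINAL_RETURN = 0x00401234  # made-up saved return address


def make_frame() -> bytearray:
    """16-byte buffer immediately followed by the saved return address."""
    return bytearray(BUF_SIZE) + ORIGINAL_RETURN.to_bytes(4, "little")


def strcpy_like(frame: bytearray, data: bytes) -> None:
    """Copies everything it is given: no check against BUF_SIZE."""
    for i, byte in enumerate(data):
        frame[i] = byte


def strncpy_like(frame: bytearray, data: bytes, n: int) -> None:
    """Copies at most n bytes, so it can be kept inside the buffer."""
    for i, byte in enumerate(data[:n]):
        frame[i] = byte


def saved_return(frame: bytearray) -> int:
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + 4], "little")


# 16 bytes of padding, then the address we want execution to jump to
attack = b"A" * BUF_SIZE + (0xDEADBEEF).to_bytes(4, "little")

smashed = make_frame()
strcpy_like(smashed, attack)

bounded = make_frame()
strncpy_like(bounded, attack, BUF_SIZE)

print(hex(saved_return(smashed)), hex(saved_return(bounded)))
```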

Pic of strcpy handling an input

Even though I’m not sure that function -extracted from /bin/tr064- is passed any user inputs, it’s still a good example of the sort of code you should be looking for. Once you find potentially insecure string operations that may handle user input, you need to figure out whether there’s an exploitable bug.

Try to cause a crash by sending unexpectedly long inputs and work from there. Why did it crash? How many characters can I send without causing a crash? Which payload can I fit in there? Where does it land in memory? etc. etc. I may write about this process in more detail at some point, but there’s plenty of literature available online if you’re interested.
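A common trick to answer the "where does it land in memory?" question is to send a non-repeating pattern instead of a wall of As. Whatever 4 characters end up in the crashed program counter tell you the exact offset. This is a minimal sketch of the idea in Python, similar in spirit to the pattern tools shipped with exploitation frameworks; the function names are mine:

```python
import itertools
import string


def cyclic_pattern(length: int) -> str:
    """Build a pattern like Aa0Aa1Aa2... where any 4 characters are unique."""
    chunks = []
    triplets = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits)
    for upper, lower, digit in triplets:
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length]


def pattern_offset(pattern: str, fragment: str) -> int:
    """Offset of the fragment seen in the crash, or -1 if not found."""
    return pattern.find(fragment)


pattern = cyclic_pattern(200)
print(pattern[:12])
# Pretend the crash dump showed "Ab0A" in the program counter
print(pattern_offset(pattern, "Ab0A"))
```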

Don’t spend all your efforts on the most obvious inputs only -which are also more likely to be properly filtered/handled-; using tools like the Burp web proxy (or even the browser itself), we can modify fields like cookies to check for buffer overflows.

Web vulnerabilities like CSRF are also extremely common in embedded devices with web interfaces. Exploiting them to write to files or bypass authentication can lead to absolute control of the router, especially when combined with command injections. An authentication bypass for a router with the web interface available from the Internet could very well expose the network to being remotely man in the middle’d. They’re definitely an important attack vector, even though I’m not gonna go into how to find them.

Decompiling Binaries [Theory]

When you decompile a binary, instead of simply translating Machine Code to Assembly Code, the decompiler uses algorithms to identify functions, loops, branches, etc. and replicate them in a higher level language like C or Python.

That sounds like a brilliant idea for anybody who has been banging their head against some assembly code for a few hours, but an additional layer of abstraction means more potential errors, which can result in massive wastes of time.

In my (admittedly short) personal experience, the output just doesn’t look reliable enough. It might be fine when using expensive decompilers (IDA itself supports a couple of architectures), but I haven’t found one I can trust with MIPS binaries. That being said, if you’d like to give one a try, the RetDec online decompiler supports multiple architectures, including MIPS.

Binary Decompiled to C by RetDec

Even as a ‘high level’ language, the code is not exactly pretty to look at.

Next Steps

Whether we want to learn something about an algorithm we’re reversing, to debug an exploit we’re developing or to find any other sort of vulnerability, being able to execute (and, if possible, debug) the binary on an environment we fully control would be a massive advantage. In some/most cases -like this router-, being able to debug on the original hardware is not possible. In the next post, we’ll work on CPU emulation to debug the binaries in our own computers.

Thanks for reading! I’m sorry this post took so long to come out. Between work, hardwear.io and seeing family/friends, this post was written about 1 paragraph at a time from 4 different countries. Things should slow down for a while, so hopefully I’ll be able to publish Part 6 soon. I’ve also got some other reversing projects coming down the pipeline, starting with hacking the Amazon Echo and a router with JTAG. I’ll try to get to those soon, work permitting… Happy Hacking :)


Tips and Tricks

Mistaken xrefs and how to remove them

Sometimes an address is loaded into a register for 16bit/32bit adjustments. The contents of that address have no effect on the rest of the code; it’s just a routine adjustment. If the address that is assigned to the register happens to be pointing to some valid data, IDA will rename the address in the assembly and display the contents in a comment.

It is up to you to figure out whether an x-ref makes sense or not. If it doesn’t, select the variable and press o in IDA to ignore the contents and give you only the address. This makes the code much less confusing.

Setting function prototypes so IDA comments the args around calls for us

Set the cursor on a function and press y. Set the prototype for the function: e.g. void *memcpy(void *restrict dst, const void *restrict src, int n);. Note: IDA only understands built-in types, so we can’t use types like size_t.

Once again we can use the extern declarations found in the GPL source code. When available, find the declaration for a specific function, and use the same types and names for the arguments in IDA.

Taking Advantage of the GPL Source Code

If we wanna figure out what are the 1st and 2nd parameters of a function like ATP_DBSetPara, we can sometimes rely on the GPL source code. Lots of functions are not implemented in the kernel or any other open source component, but they’re still used from one of them. That means we can’t see the code we’re interested in, but we can see the extern declarations for it. Sometimes the source will include documentation comments or descriptive variable names; very useful info that the disassembly doesn’t provide:

ATP_DBSetPara extern declaration in gpl_source/inc/cfmapi.h

Unfortunately, the function documentation comment is not very useful in this case -seems like there were encoding issues with the file at some point, and everything written in Chinese was lost. At least now we know that the first argument is a list of keys, and the second is something they call ParamCMO. ParamCMO is a constant in our disassembly, so it’s probably just a reference to the key we’re trying to set.

Disassembly Methods - Linear Sweep vs Recursive Descent

The structure of a binary can vary greatly depending on compiler, developers, etc. How functions call each other is not always straightforward for a disassembler to figure out. That means you may run into lots of ‘orphaned’ functions, which exist in the binary but do not have a known caller.

Which disassembler you use will dictate whether you see those functions or not, some of which can be extremely important to us (e.g. the ping function in the web binary we reversed earlier). This is due to how they scan binaries for content:

  1. Linear Sweep: Read the binary one byte at a time, anything that looks like a function is presented to the user. This requires significant logic to keep false positives to a minimum
  2. Recursive Descent: We know the binary’s entry point. We find all functions called from main(), then we find the functions called from those, and keep recursively displaying functions until we’ve got “all” of them. This method is very robust, but any functions not referenced in a standard/direct way will be left out
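The difference is easier to see with a toy example. In this Python sketch the "binary" is just a dictionary of functions and the functions they call directly; all the names are made up. `ping_handler` is only ever reached through a function pointer table, so no direct call to it exists:

```python
def recursive_descent(call_graph: dict, entry: str) -> set:
    """Follow direct calls from the entry point, function by function."""
    seen, pending = set(), [entry]
    while pending:
        function = pending.pop()
        if function in seen:
            continue
        seen.add(function)
        pending.extend(call_graph.get(function, []))
    return seen


def linear_sweep(call_graph: dict) -> set:
    """Walk the whole binary and report everything that looks like a function."""
    return set(call_graph)


# Hypothetical binary: function name -> functions it calls directly
toy_binary = {
    "main": ["init", "serve"],
    "init": [],
    "serve": ["handle_request"],
    "handle_request": [],
    "ping_handler": ["run_ping"],   # only referenced via a pointer table
    "run_ping": [],
}

reachable = recursive_descent(toy_binary, "main")
orphaned = linear_sweep(toy_binary) - reachable
print(sorted(orphaned))
```

A real linear sweep has to guess where functions start from raw bytes, which is where the false positives come from; the dictionary here skips that problem entirely.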

Make sure your disassembler supports linear sweep if you feel like you’re missing any data. Make sure the code you’re looking at makes sense if you’re using linear sweep.

Practical Reverse Engineering Part 4 - Dumping the Flash

  • Part 1: Hunting for Debug Ports
  • Part 2: Scouting the Firmware
  • Part 3: Following the Data
  • Part 4: Dumping the Flash
  • Part 5: Digging Through the Firmware

In Parts 1 to 3 we’ve been gathering data within its context. We could sniff the specific pieces of data we were interested in, or observe the resources used by each process. On the other hand, they had some serious limitations; we didn’t have access to ALL the data, and we had to deal with very minimal tools… And what if we had not been able to find a serial port on the PCB? What if we had but it didn’t use default credentials?

In this post we’re gonna get the data straight from the source, sacrificing context in favour of absolute access. We’re gonna dump the data from the Flash IC and decompress it so it’s usable. This method doesn’t require expensive equipment and is independent of everything we’ve done until now. An external Flash IC with a public datasheet is a reverser’s great ally.

Dumping the Memory Contents

As discussed in Part 3, we’ve got access to the datasheet for the Flash IC, so there’s no need to reverse its pinout:

Flash Pic Annotated Pinout

We also have its instruction set, so we can communicate with the IC using almost any device capable of ‘speaking’ SPI.

We also know that powering up the router will cause the Ralink to start communicating with the Flash IC, which would interfere with our own attempts to read the data. We need to stop the communication between the Ralink and the Flash IC, but the best way to do that depends on the design of the circuit we’re working with.

Do We Need to Desolder The Flash IC? [Theory]

The perfect way to avoid interference would be to simply desolder the Flash IC so it’s completely isolated from the rest of the circuit. It gives us absolute control and removes all possible sources of interference. Unfortunately, it also requires additional equipment, experience and time, so let’s see if we can avoid it.

The second option would be to find a way of keeping the Ralink inactive while everything else around it stays in standby. Microcontrollers often have a Reset pin that will force them to shut down when pulled to 0; they’re commonly used to force IC reboots without interrupting power to the board. In this case we don’t have access to the Ralink’s full datasheet (it’s probably distributed only to customers and under NDA); the IC’s form factor and the complexity of the circuit around it make for a very hard pinout to reverse, so let’s keep thinking…

What about powering one IC up but not the other? We can try applying voltage directly to the power pins of the Flash IC instead of powering up the whole circuit. Injecting power into the PCB in a way it wasn’t designed for could blow something up; we could reverse engineer the power circuit, but that’s tedious work. This router is cheap and widely available, so I took the ‘fuck it’ approach. The voltage required, according to the datasheet, is 3V; I’m just gonna apply power directly to the Flash IC and see what happens. It may power up the Ralink too, but it’s worth a try.

Flash Powered UART Connected

We start supplying power while observing the board and waiting for data from the Ralink’s UART port. We can see some LEDs light up at the back of the PCB, but there’s no data coming out of the UART port; the Ralink must not be running. Even though the Ralink is off, its connection to the Flash IC may still interfere with our traffic because of multiple design factors in both the power circuit and the silicon. It’s important to keep that possibility in mind in case we see anything dodgy later on; if that were to happen we’d have to desolder the Flash IC (or just its data pins) to physically disconnect it from everything else.

The LEDs and other static components can’t communicate with the Flash IC, so they won’t be an issue as long as we can supply enough current for all of them. I’m just gonna use a bench power supply, with plenty of current available for everything. If you don’t have one you can try using the Master’s power lines, or some USB power adapter if you need some more current. They’ll probably do just fine.

Time to connect our SPI Master.

Connecting to the Flash IC

Now that we’ve confirmed there’s no need to desolder the Flash IC we can connect any device that speaks SPI and start reading memory contents block by block. Any microcontroller will do, but a purpose-specific SPI-USB bridge will often be much faster. In this case I’m gonna be using a board based on the FT232H, which supports SPI among some other low level protocols.

We’ve got the pinout for both the Flash and my USB-SPI bridge, so let’s get everything connected.

Shikra and Power Connected to Flash

Now that the hardware is ready it’s time to start pumping data out.

Dumping the Data

We need some software in our computer that can understand the USB-SPI bridge’s traffic and replicate the memory contents as a binary file. Writing our own wouldn’t be difficult, but there are programs out there that already support lots of common Masters and Flash ICs. Let’s try the widely known and open source flashrom.

flashrom is old and buggy, but it already supports both the FT232H as Master and the FL064PIF as Slave. It gave me lots of trouble in both OSX and an Ubuntu VM, but ended up working just fine on a Raspberry Pi (Raspbian):

flashrom stdout

Success! We’ve got our memory dump, so we can ditch the hardware and start preparing the data for analysis.

Splitting the Binary

The file command has been able to identify some data about the binary, but that’s just because it starts with a header in a supported format. In a 0-knowledge scenario we’d use binwalk to take a first look at the binary file and find the data we’d like to extract.

Binwalk is a very useful tool for binary analysis created by the awesome hackers at /dev/ttyS0; you’ll certainly get to know them if you’re into hardware hacking.

binwalk spidump.bin

In this case we’re not in a 0-knowledge scenario; we’ve been gathering data since day 1, and we obtained a complete memory map of the Flash IC in Part 2. The addresses mentioned in the debug message are confirmed by binwalk, and it makes for much cleaner splitting of the binary, so let’s use it:

Flash Memory Map From Part 2

With the binary and the relevant addresses, it’s time to split the binary into its 4 basic segments. dd takes its parameters in terms of block size (bs, bytes), offset (skip, blocks) and size (count, blocks); all of them in decimal. We can use a calculator or let the shell do the hex to decimal conversions with $(()):

$ dd if=spidump.bin of=bootloader.bin bs=1 count=$((0x020000))
    131072+0 records in
    131072+0 records out
    131072 bytes transferred in 0.215768 secs (607467 bytes/sec)
$ dd if=spidump.bin of=mainkernel.bin bs=1 count=$((0x13D000-0x020000)) skip=$((0x020000))
    1167360+0 records in
    1167360+0 records out
    1167360 bytes transferred in 1.900925 secs (614101 bytes/sec)
$ dd if=spidump.bin of=mainrootfs.bin bs=1 count=$((0x660000-0x13D000)) skip=$((0x13D000))
    5386240+0 records in
    5386240+0 records out
    5386240 bytes transferred in 9.163635 secs (587784 bytes/sec)
$ dd if=spidump.bin of=protect.bin bs=1 count=$((0x800000-0x660000)) skip=$((0x660000))
    1703936+0 records in
    1703936+0 records out
    1703936 bytes transferred in 2.743594 secs (621060 bytes/sec)

We have created 4 different binary files:

  1. bootloader.bin: U-boot. The bootloader. It’s not compressed because the Ralink wouldn’t know how to decompress it.
  2. mainkernel.bin: Linux Kernel. The basic firmware in charge of controlling the bare metal. Compressed using lzma
  3. mainrootfs.bin: Filesystem. Contains all sorts of important binaries and configuration files. Compressed as squashfs using the lzma algorithm
  4. protect.bin: Miscellaneous data as explained in Part 3. Not compressed
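If you'd rather not juggle dd parameters, the same split can be sketched in a few lines of Python. The segment boundaries are the ones used in the dd commands above; the function names are just mine for illustration.

```python
# Segment boundaries (start, end) taken from the memory map used in the dd commands
SEGMENTS = [
    ("bootloader.bin", 0x000000, 0x020000),
    ("mainkernel.bin", 0x020000, 0x13D000),
    ("mainrootfs.bin", 0x13D000, 0x660000),
    ("protect.bin",    0x660000, 0x800000),
]

def split_dump(dump: bytes) -> dict:
    """Return {filename: contents} for each segment of the flash dump."""
    return {name: dump[start:end] for name, start, end in SEGMENTS}

def write_segments(dump_path: str) -> None:
    """Read the full dump from disk and write one file per segment."""
    with open(dump_path, "rb") as f:
        dump = f.read()
    for name, data in split_dump(dump).items():
        with open(name, "wb") as out:
            out.write(data)
```

Calling `write_segments("spidump.bin")` should produce the same 4 files, with the same sizes dd reported.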

Extracting the Data

Now that we’ve split the binary into its 4 basic segments, let’s take a closer look at each of them.

Bootloader

binwalk bootloader.bin

Binwalk found the uImage header and decoded it for us. U-Boot uses these headers to identify relevant memory areas. It’s the same info that the file command displayed when we fed it the whole memory dump because it’s the first header in the file.
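For reference, here's a rough sketch of how that 64-byte header can be decoded by hand. It assumes the legacy U-Boot uImage layout (big-endian fields, magic number 0x27051956); the function and dictionary key names are my own.

```python
import struct

UIMAGE_MAGIC = 0x27051956
# Legacy uImage header: 7 x uint32, 4 x uint8, 32-byte name (big-endian, 64 bytes)
HEADER = struct.Struct(">7I4B32s")

def parse_uimage_header(blob: bytes) -> dict:
    """Decode the legacy uImage header found at the start of blob."""
    (magic, _hcrc, timestamp, size, load, entry, _dcrc,
     os_id, arch, img_type, comp, name) = HEADER.unpack_from(blob, 0)
    if magic != UIMAGE_MAGIC:
        raise ValueError("no uImage magic at offset 0")
    return {
        "timestamp": timestamp,   # creation time (Unix epoch)
        "size": size,             # length of the data following the header
        "load": load,             # load address
        "entry": entry,           # entry point
        "os": os_id, "arch": arch, "type": img_type,
        "compression": comp,      # numeric compression ID
        "name": name.rstrip(b"\x00").decode("ascii", "replace"),
    }
```

The `size` field is worth remembering: it tells us how many bytes of real data follow the header, which is handy when a memory segment is padded at the end.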

We don’t care much for the bootloader’s contents in this case, so let’s ignore it.

Kernel

binwalk mainkernel.bin

Compression is something we have to deal with before we can make any use of the data. binwalk has confirmed what we discovered in Part 2, the kernel is compressed using lzma, a very popular compression algorithm in embedded systems. A quick check with strings mainkernel.bin | less confirms there’s no human readable data in the binary, as expected.

There are multiple tools that can decompress lzma, such as 7z or xz. None of those liked mainkernel.bin:

$ xz --decompress mainkernel.bin
xz: mainkernel.bin: File format not recognized

The uImage header is probably messing with tools, so we’re gonna have to strip it out. We know the lzma data starts at byte 0x40, so let’s copy everything but the first 64 bytes.

$ dd if=mainkernel.bin of=mainkernel_noheader.lzma bs=1 skip=$((0x40))

And when we try to decompress…

$ xz --decompress mainkernel_noheader.lzma
xz: mainkernel_noheader.lzma: Compressed data is corrupt

xz has been able to recognize the file as lzma, but now it doesn’t like the data itself. We’re trying to decompress the whole mainkernel Flash area, but the stored data is extremely unlikely to be occupying 100% of the memory segment. Let’s remove any unused memory from the tail of the binary and try again:

Cut off the tail; decompression success
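Both fixes (skipping the 64-byte header and ignoring whatever sits after the compressed stream) can also be sketched with Python's lzma module. This is a sketch, not the method used above: it assumes the data is in the legacy .lzma ("alone") format, and it relies on the decompressor stopping at the end of the stream and leaving any trailing bytes in `unused_data`.

```python
import lzma

UIMAGE_HEADER_LEN = 0x40  # the lzma data starts right after the uImage header

def decompress_kernel(segment: bytes) -> bytes:
    """Strip the uImage header and decompress the LZMA stream that follows."""
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
    data = decompressor.decompress(segment[UIMAGE_HEADER_LEN:])
    if not decompressor.eof:
        raise ValueError("LZMA stream is truncated or corrupt")
    # Anything after the end of the stream (unused flash) is left
    # in decompressor.unused_data and simply ignored here.
    return data
```

Something like `decompress_kernel(open("mainkernel.bin", "rb").read())` should then return the decompressed kernel, without having to find the end of the stream by hand.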

xz seems to have decompressed the data successfully. We can easily verify that using the strings command, which finds ASCII strings in binary files. Since we’re at it, we may as well look for something useful…

strings kernel | grep key
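If you're working somewhere without strings and grep, a rough Python approximation is easy to put together. It's a simplified sketch: it only looks for runs of printable ASCII (4 characters minimum by default) and does a case-insensitive match, similar to grep -i.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Return every run of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def grep(strings, needle: str):
    """Case-insensitive substring filter."""
    return [s for s in strings if needle.lower() in s.lower()]
```

For example, `grep(ascii_strings(open("kernel", "rb").read()), "key")` lists every string that mentions a key.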

The Wi-Fi Easy and Secure Key Derivation string looks promising, but as it turns out it’s just a hardcoded string defined by the Wi-Fi Protected Setup spec. Nothing to do with the password generation algorithm we’re interested in.

We’ve proven the data has been properly decompressed, so let’s keep moving.

Filesystem

binwalk mainrootfs.bin

The mainrootfs memory segment does not have a uImage header because it’s relevant to the kernel but not to U-Boot.

SquashFS is a very common filesystem in embedded systems. There are multiple versions and variations, and manufacturers sometimes use custom signatures to make the data harder to locate inside the binary. We may have to fiddle with multiple versions of unsquashfs and/or modify the signatures, so let me show you what the signature looks like in this case:

sqsh signature in hexdump
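Locating the signature doesn't require a hex editor. This small sketch searches a binary for the two standard SquashFS magics ('sqsh' and its byte-swapped form 'hsqs'); vendor-specific signatures are not covered, so you'd have to add them to the table yourself.

```python
# Standard SquashFS magic values; custom vendor signatures are NOT included
SQUASHFS_MAGICS = {b"sqsh": "big-endian", b"hsqs": "little-endian"}

def find_squashfs(data: bytes):
    """Return sorted (offset, magic, endianness) tuples for every signature hit."""
    hits = []
    for magic, endian in SQUASHFS_MAGICS.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, magic.decode("ascii"), endian))
            pos = data.find(magic, pos + 1)
    return sorted(hits)
```

Keep in mind a 4-byte match can be a coincidence; a hit is a hint, not a guarantee that a filesystem starts there.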

Since the filesystem is very common and finding the right configuration is tedious work, somebody may have already written a script to automate the task. I came across this OSX-specific fork of the Firmware Modification Kit, which compiles multiple versions of unsquashfs and includes a neat script called unsquashfs_all.sh to run all of them. It’s worth a try.

unsquashfs_all.sh mainrootfs.bin

Wasn’t that easy? We got lucky with the SquashFS version and supported signature, and unsquashfs_all.sh managed to decompress the filesystem. Now we’ve got every binary in the filesystem, every symlink and configuration file, and everything is nice and tidy:

tree unsquashed_filesystem

In the complete file tree we can see we’ve got every file in the system (other than runtime files like those in /var/, of course).

Using the intel we have been gathering on the firmware since day 1 we can start looking for potentially interesting binaries:

grep -i -r '$INTEL' squashfs-root

If we were looking for network/application vulnerabilities in the router, having every binary and config file in the system would be massively useful.

Protected

binwalk protect.bin

As we discussed in Part 3, this memory area is not compressed and contains all pieces of data that need to survive across reboots but be different across devices. strings seems like an appropriate tool for a quick overview of the data:

strings protect.bin

Everything in there seems to be just the curcfg.xml contents, some logs and those few isolated strings in the picture. We already sniffed and analysed all of that data in Part 3, so there’s nothing else to discuss here.

Next Steps

At this point all hardware reversing for the Ralink is complete and we’ve collected everything there was to collect in ROM. Just think of what you may be interested in and there has to be a way to find it. Imagine we wanted to control the router through the UART debug port we found in Part 1, but when we try to access the ATP CLI we can’t figure out the credentials. After dumping the external Flash we’d be able to find the XML file in the protect area, and discover the credentials just like we did in Part 2 (The Rambo Approach to Intel Gathering, admin:admin).

If you couldn’t dump the memory IC for any reason, the firmware upgrade files provided by the manufacturers will sometimes be complete memory segments; the device simply overwrites the relevant flash areas using code previously loaded to RAM. Downloading the file from the manufacturer would be the equivalent of dumping those segments from flash, so we just need to decompress them. They won’t have all the data, but it may be enough for your purposes.

Now that we’ve got the firmware we just need to think of anything we may be interested in and start looking for it through the data. In the next post we’ll dig a bit into different binaries and try to find more potentially useful data.

Practical Reverse Engineering Part 3 - Following the Data

The best thing about hardware hacking is having full access to very bare metal, and all the electrical signals that make the system work. With ingenuity and access to the right equipment we should be able to obtain any data we want. From simply sniffing traffic with a cheap logic analyser to using thousands of dollars worth of equipment to obtain private keys by measuring the power consumed by the device with enough precision (power analysis side channel attack); if the physics make sense, it’s likely to work given the right circumstances.

In this post I’d like to discuss traffic sniffing and how we can use it to gather intel.

Traffic sniffing at a practical level is used all the time for all sorts of purposes, from regular debugging during the development process to reversing the interface of gaming controllers, etc. It’s definitely worth a post of its own, even though this device can be reversed without it.

Please check out the legal disclaimer in case I come across anything sensitive.

Full disclosure: I’m in contact with Huawei’s security team. I tried to contact TalkTalk, but their security staff is nowhere to be seen.

Data Flows In the PCB

Data is useless within its static memory cells; it needs to be read, written and passed around in order to be useful. A quick look at the board is enough to deduce where the data is flowing through, based on IC placement and PCB traces:

PCB With Data Flows and Some IC Names

We’re not looking for hardware backdoors or anything buried too deep, so we’re only gonna look into the SPI data flowing between the Ralink and its external Flash.

Pretty much every IC in the market has a datasheet documenting all its technical characteristics, from pinouts to power usage and communication protocols. There are tons of public datasheets on google, so find the ones relevant to the traffic you want to sniff:

Now we’ve got pinouts, electrical characteristics, protocol details… Let’s take a first look and extract the most relevant pieces of data.

Understanding the Flash IC

We know which data flow we’re interested in: the SPI traffic between the Ralink IC and Flash. Let’s get started; the first thing we need is to figure out how to connect the logic analyser. In this case we’ve got the datasheet for the Flash IC, so there’s no need to reverse engineer any pinouts:

Flash Pic Annotated Pinout

Standard SPI communication uses 4 pins:

  1. MISO (Master In Slave Out): Data line Ralink<-Flash
  2. MOSI (Master Out Slave In): Data line Ralink->Flash
  3. SCK (Clock Signal): Coordinates when to read the data lines
  4. CS# (Chip Select): Enables the Flash IC when set to 0 so multiple of them can share MISO/MOSI/SCK lines.

We know the pinout, so let’s just connect a logic analyser to those 4 pins and capture some random transmission:

Connected Logic Analyser

In order to set up our logic analyser we need to find out some SPI configuration options, specifically:

  • Transmission endianness [Standard: MSB First]
  • Number of bits per transfer [Standard: 8]. Will be obvious in the capture
  • CPOL: Default state of the clock line while inactive [0 or 1]. Will be obvious in the capture
  • CPHA: Clock edge that triggers the data read in the data lines [0=leading, 1=trailing]. We’ll have to deduce this

The datasheet explains that the flash IC understands only 2 combinations of CPOL and CPHA: (CPOL=0, CPHA=0) or (CPOL=1, CPHA=1)

Datasheet SPI Settings
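It's worth seeing why a flash IC can happily accept both of those combinations. Here's a small sketch of an SPI decoder working on raw (clock, data) samples; the function names and the sample format are made up for illustration. With CPOL=0/CPHA=0 the leading edge is a rising edge, and with CPOL=1/CPHA=1 the trailing edge is also a rising edge, so in both modes the data line is read on the clock's rising edge.

```python
def decode_spi(samples, cpol, cpha):
    """Decode bytes (MSB first) from a list of (sck, data) samples."""
    bits = []
    prev = cpol                        # the clock idles at the CPOL level
    for sck, data in samples:
        if sck != prev:                # clock edge detected
            leading = (prev == cpol)   # leading edge = clock leaving its idle level
            if leading == (cpha == 0):
                bits.append(data)      # CPHA=0 reads on leading, CPHA=1 on trailing
        prev = sck
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

def clock_out(byte):
    """Build samples for one byte: each bit is held stable across a low->high pulse."""
    samples = []
    for shift in range(7, -1, -1):
        bit = (byte >> shift) & 1
        samples += [(0, bit), (1, bit)]
    return samples
```

Feeding the same samples to `decode_spi(..., 0, 0)` and `decode_spi(..., 1, 1)` yields the same byte; the only thing that changes between the two modes is the level of the clock line while idle.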

Let’s take a first look at some sniffed data:

Logic Screencap With CPOL/CPHA Annotated

In order to understand exactly what’s happening you’ll need the FL064PIF’s instruction set, available in its datasheet:

FL064PIF Instruction Set

Now we can finally analyse the captured data:

Logic Sample SPI Packet

In the datasheet we can see that the FL064PIF has high-performance features for read and write operations: Dual and Quad options that multiplex the data over more lines to increase the transmission speed. From taking a few samples, it doesn’t seem like the router uses these features much -if at all-, but it’s important to keep the possibility in mind in case we see something odd in a capture.

Transmission modes that require additional pins can be a problem if your logic analyser is not powerful enough.

The Importance of Your Sampling Rate [Theory]

A logic analyser is a conceptually simple device: It reads signal lines as digital inputs every x microseconds for y seconds, and when it’s done it sends the data to your computer to be analysed.

For the protocol analyser to generate accurate data it’s vital that we record digital inputs faster than the device writes them. Otherwise the data will be mangled by missing bits or deformed waveforms.

Unfortunately, your logic analyser’s maximum sampling rate depends on how powerful/expensive it is and how many lines you need to sniff at a time. High-speed interfaces with multiple data lines can be a problem if you don’t have access to expensive equipment.

I recorded this data from the Ralink-Flash SPI bus using a low-end Saleae analyser at its maximum sampling rate for this number of lines, 24 MS/s:

Picture of Deformed Clock Signal

As you can see, even though the clock signal has the 8 low to high transitions required for each byte, the waveform is deformed.

Since the clock signal is used to coordinate when to read the data lines, this kind of waveform deformation may cause data corruption even if we don’t drop any bits (depending partly on the design of your logic analyser). There’s always some wiggle room for read inaccuracies, and we don’t need 100% correct data at this point, but it’s important to keep all error vectors in mind.

Let’s sniff the same bus using a higher performance logic analyser at 100 MS/s:

High Sampling Rate SPI Sample Reading

As you can see, this clock signal is perfectly regular when our Sampling Rate is high enough.
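The effect is easy to reproduce with a bit of arithmetic. The sketch below samples an ideal square wave and measures how many consecutive samples each high/low half-period lasts. The 10 MHz clock is a made-up figure picked for illustration; it is not the measured speed of this bus.

```python
def sampled_run_lengths(clock_hz, sample_hz, n_samples):
    """Sample an ideal 50% duty square wave; return the length of each high/low run."""
    # Integer maths: which half-period does sample n fall in? Even = low, odd = high
    levels = [(2 * clock_hz * n // sample_hz) % 2 for n in range(n_samples)]
    runs, count = [], 1
    for prev, cur in zip(levels, levels[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs

# Hypothetical 10 MHz clock seen by each analyser
print(sampled_run_lengths(10_000_000, 24_000_000, 48))    # uneven mix of 1s and 2s
print(sampled_run_lengths(10_000_000, 100_000_000, 100))  # every run is 5 samples
```

At 2.4 samples per clock period the recorded highs and lows come out uneven even though the real signal is perfectly regular; at 10 samples per period every half-period looks the same.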

If you see anything dodgy in your traffic capture, consider how much data you’re willing to lose and whether you’re being limited by your equipment. If that’s the case, either skip this Reversing vector or consider investing in a better logic analyser.

Seeing the Data Flow

We’re already familiar with the system thanks to the overview of the firmware we did in Part 2, so we can think of some specific SPI transmissions that we may be interested in sniffing. Simply connecting an oscilloscope to the MISO and MOSI pins will help us figure out how to trigger those transmissions and yield some other useful data.

Scope and UART Connected

Here’s a video (no audio) showing both the serial interface and the MISO/MOSI signals while we manipulate the router:

This is a great way of easily identifying processes or actions that trigger flash read/write actions, and will help us find out when to start recording with the logic analyser and for how long.

Analysing SPI Traffic - ATP’s Save Command

In Part 2 I mentioned ATP CLI has a save command that stores something to flash; unfortunately, the help menu (save ?) won’t tell you what it’s doing and the only output when you run it is a few dots that act as a progress bar. Why don’t we find out by ourselves? Let’s make a plan:

  1. Wait until boot sequence is complete and the router is idle so there’s no unexpected SPI traffic
  2. Start the ATP CLI as explained in Part 1
  3. Connect the oscilloscope to MISO/MOSI and run save to get a rough estimate of how much time we need to capture data for
  4. Set a trigger in the enable line sniffed by the logic analyser so it starts recording as soon as the flash IC is selected
  5. Run save
  6. Analyse the captured data

Steps 3 and 4 can be combined so you see the data flow in real time in the scope while you see the charge bar for the logic analyser; that way you can make sure you don’t miss any data. In order to comfortably connect both scope and logic sniffer to the same pins, these test clips come in very handy:

SOIC16 Test Clip Connected to Flash IC

Once we’ve got the traffic we can take a first look at it:

Analysing Save Capture on Logic

Let’s consider what sort of data could be extracted from this traffic dump that might be useful to us. We’re working with a memory storage IC, so we can see the data that is being read/written and the addresses where it belongs. I think we can represent that data in a useful way by 2 means:

  1. Traffic map depicting which Flash areas are being written, read or erased in chronological order
  2. Create binary files that replicate the memory blocks that were read/written, preferably removing all the protocol rubbish that we sniffed along with them.

Saleae’s SPI analyser will export the data as a CSV file. Ideally we’d improve their protocol analyser to add the functionality we want, but that would be too much work for this project. One of the great things about low level protocols like SPI is that they’re usually very straightforward; I decided to write some python spaghetti code to analyse the CSV file and extract the data we’re looking for: binmaker.py and traffic_mapper.py

The workflow to analyse a capture is the following:

  1. Export sniffed traffic as CSV
  2. Run the script:
    • Iterate through the CSV file
    • Identify different commands by their index
    • Recognise the command expressed by the first byte
    • Process its arguments (addresses, etc.)
    • Identify the read/write payload
    • Convert ASCII representation of each payload byte to binary
    • Write binary blocks to different files for MISO (read) and MOSI (write)
  3. Read the traffic map (regular text) and the binaries (hexdump -C output.bin | less)
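To give an idea of what that workflow looks like in code, here's a condensed sketch. It is not binmaker.py or traffic_mapper.py. It assumes the CSV has "Packet ID", "MOSI" and "MISO" columns with hex values, and that the flash uses the common SPI NOR opcodes (0x03 read, 0x02 page program, 0xD8 sector erase) with 3-byte addresses; check both assumptions against your own export and datasheet.

```python
import csv
import io

# Common SPI NOR opcodes (assumption: verify against the flash datasheet)
READ, PAGE_PROGRAM, SECTOR_ERASE = 0x03, 0x02, 0xD8

def parse_capture(csv_text):
    """Return (traffic_map, miso_binary, mosi_binary) from an exported SPI capture."""
    packets = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        frame = (int(row["MOSI"], 16), int(row["MISO"], 16))
        packets.setdefault(row["Packet ID"], []).append(frame)

    traffic_map, read_data, written_data = [], bytearray(), bytearray()
    for frames in packets.values():
        opcode = frames[0][0]               # the first MOSI byte is the command
        if opcode not in (READ, PAGE_PROGRAM, SECTOR_ERASE) or len(frames) < 4:
            continue                        # status reads, write enables, etc.
        addr = int.from_bytes(bytes(m for m, _ in frames[1:4]), "big")
        payload = frames[4:]
        if opcode == READ:
            read_data += bytes(s for _, s in payload)      # payload comes in on MISO
            traffic_map.append(("READ", addr, len(payload)))
        elif opcode == PAGE_PROGRAM:
            written_data += bytes(m for m, _ in payload)   # payload goes out on MOSI
            traffic_map.append(("WRITE", addr, len(payload)))
        else:
            traffic_map.append(("ERASE", addr, 0))
    return traffic_map, bytes(read_data), bytes(written_data)
```

The traffic map comes back as a list of (operation, address, size) tuples in chronological order, and the two byte strings can be written straight to disk and inspected with hexdump.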

The scripts generate these results:

The traffic map is much more useful when combined with the Flash memory map we found in Part 2:

Flash Memory Map From Part 2

From the traffic map we can see the bulk of the save command’s traffic is simple:

  1. Read about 64kB of data from the protect area
  2. Overwrite the data we just read

In the MISO binary we can see most of the read data was just tons of 1s:

Picture MISO Hexdump 0xff

Most of the data in the MOSI binary is plaintext XML, and it looks exactly like the /var/curcfg.xml file we discovered in Part 2. As we discussed then, this “current configuration” file contains tons of useful data, including the current WiFi credentials.

It’s standard to keep reserved areas in flash; they’re mostly for miscellaneous data that needs to survive across reboots and be configurable by user, firmware or factory. It makes sense for a command called save to write data to such an area; it explains why the data is perfectly readable as opposed to being compressed like the filesystem, and why we found the XML file in the /var/ folder of the filesystem (it’s a folder for runtime files; data in the protect area has to be loaded to memory separately from the filesystem).

The Pot of Gold at the End of the Firmware [Theory]

During this whole process it’s useful to have some sort of target to keep you digging in the same general direction.

Our target is an old one: the algorithm that generates the router’s default WiFi password. If we get our hands on such an algorithm and it happens to derive the password from public information, any HG533 in the world with default WiFi credentials would probably be vulnerable.

That exact security issue has been found countless times in the past, usually deriving the password from public data like the Access Point’s MAC address or its SSID.

That being said, not all routers are vulnerable, and I personally don’t expect this one to be. The main reason behind targeting this specific vector is that it’s caused by a recurrent problem in embedded engineering: The need for a piece of data that is known by the firmware, unique to each device and known by an external entity. From default WiFi passwords to device credentials for IoT devices, this problem manifests in different ways all over the Industry.

Future posts will probably reference the different possibilities I’m about to explain, so let me get all that theory out of the way now.

The Sticker Problem

In this day and age, connecting to your router via ethernet so there’s no need for default WiFi credentials is not an option, using a display to show a randomly generated password would be too expensive, etc. etc. etc. The most widely adopted solution for routers is to create a WiFi network using default credentials, print those credentials on a sticker at the factory and stick it to the back of the device.

Router Sticker - Annotated

The WiFi password is the ‘unique piece of data’, and the computer printing the stickers in the factory is the ‘external entity’. Both the firmware and the computer need to know the default WiFi credentials, so the engineer needs to decide how to coordinate them. Usually there are 2 options available:

  1. The same algorithm is implemented in both the device and the computer, and its input parameters are known to both of them
  2. A computer generates the credentials for each device and they’re stored into each device separately

Developer incompetence aside, the first approach is usually taken as a last resort: when you can’t get your hardware manufacturer to flash unique data to each device, or can’t afford the increase in manufacturing cost.

The second approach is much better by design: We’re not trusting the hardware with data sensitive enough to compromise every other device in the field. That being said, the company may still decide to use an algorithm with predictable outputs instead of completely random data; that would make the system as secure as the weakest link between the algorithm -mathematically speaking-, the confidentiality of their source code and the security of the computers/network running it.
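A tiny sketch makes the difference between the two options obvious. Everything here is hypothetical: the alphabet, the password length and the use of SHA-256 over the MAC address are my own picks for illustration, and have nothing to do with how this router (or any specific vendor) does it.

```python
import hashlib
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits

def derived_password(mac: str, length: int = 8) -> str:
    """Option 1 (hypothetical): derive the password from public data, the MAC address."""
    digest = hashlib.sha256(mac.lower().encode("ascii")).digest()
    return "".join(ALPHABET[b % len(ALPHABET)] for b in digest[:length])

def random_password(length: int = 8) -> str:
    """Option 2: random password generated at the factory, stored on each device."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Anybody who recovers `derived_password` from the firmware can compute the default password of every device from its broadcast MAC address. With `random_password` there's nothing to recover from the firmware: the secret lives only on the device and on its sticker.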

Sniffing Factory Reset

So now that we’ve discussed our target, let’s gather some data about it. The first thing we wanna figure out is which actions will kickstart the flow of relevant data on the PCB. In this case there’s 1 particular action: Pressing the Factory Reset button for 10s. This should replace the existing WiFi credentials with the default ones, so the default creds will have to be generated/read. If the key or the generation algorithm need to be retrieved from Flash, we’ll see them in a traffic capture.

That’s exactly what we’re gonna do, and we’re gonna observe the UART interface, the oscilloscope and the logic analyser during/after pressing the reset button. The same process we followed for ATP’s save gives us these results:

UART output:

UART Factory Reset Debug Messages

Traffic overview:

Logic Screencap Traffic Overview

Output from our python scripts:

The traffic map tells us the device first reads and overwrites 2 large chunks of data from the protect area and then reads a smaller chunk of data from the filesystem (possibly part of the next process to execute):

___________________
|Transmission  Map|
|  MOSI  |  MISO  |
|        |0x7e0000| Size: 12    //Part of the Protected area
|        |0x7e0000| Size: 1782
|        |0x7e073d| Size: 63683
| ERASE 0x7e073d  | Size: 64kB
|0x7e073d|        | Size: 195
|0x7e0800|        | Size: 256
|0x7e0900|        | Size: 256
---------//--------
       [...]
---------//--------
|0x7e0600|        | Size: 256
|0x7e0700|        | Size: 61
|        |0x7d0008| Size: 65529 //Part of the Protected area
| ERASE 0x7d0008  | Size: 64kB
|0x7d0008|        | Size: 248
|0x7d0100|        | Size: 256
---------//--------
       [...]
---------//--------
|0x7dff00|        | Size: 256
|0x7d0000|        | Size: 8
|        |0x1c3800| Size: 512   //Part of the Filesystem
|        |0x1c3a00| Size: 512
---------//--------
       [...]
---------//--------
|        |0x1c5a00| Size: 512
|        |0x1c5c00| Size: 512
-------------------
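Matching addresses against the memory map by eye gets old quickly, so here's a small helper sketch that does the lookup. The region boundaries are the ones from the Flash memory map (the same ones used to split the dump in Part 4); the function name is mine.

```python
import bisect

# (start address, region name), sorted by start address
REGIONS = [
    (0x000000, "bootloader"),
    (0x020000, "mainkernel"),
    (0x13D000, "mainrootfs"),
    (0x660000, "protect"),
]
FLASH_SIZE = 0x800000

def region_for(addr: int) -> str:
    """Return the name of the flash region that contains addr."""
    if not 0 <= addr < FLASH_SIZE:
        raise ValueError(f"address {addr:#x} is outside the flash")
    starts = [start for start, _ in REGIONS]
    return REGIONS[bisect.bisect_right(starts, addr) - 1][1]
```

Running it over the addresses in the map above confirms the annotations: 0x7e0000 and 0x7d0008 fall in the protect area, while 0x1c3800 belongs to the filesystem.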

Once again, we combine transmission map and binary files to gain some insight into the system. In this case, the ‘factory reset’ code seems to:

  1. Read ATP_LOG from Flash; it contains info such as remote router accesses or factory resets. It ends with a large chunk of 1s (0xff)
  2. Overwrite that memory segment with 1s
  3. Write a ‘new’ ATP_LOG followed by the “current configuration” curcfg.xml file
  4. Read compressed (unintelligible to us) memory chunk from the filesystem

The chunk from the filesystem is read AFTER writing the new password to Flash, which doesn’t make sense for a password generation algorithm. That being said, the algorithm may be already loaded into memory, so its absence in the SPI traffic is not conclusive on whether or not it exists.

As part of the MOSI data we can see the new WiFi password be saved to Flash inside the XML string:

Found Current Password MOSI

What about the default password being read? If we look in the MISO binary, it’s nowhere to be seen. Either the Ralink is reading it using a different mode (secure/dual/quad/?) or the credentials/algorithm are already loaded in RAM (no need to read them from Flash again, since they can’t change). The latter seems more likely, so I’m not gonna bother updating my scripts to support different read modes. We write down what we’ve found and we’ll get back to the default credentials in the next part.

Since we’re at it, let’s take a look at the SPI traffic generated when setting new WiFi credentials via HTTP: Map, MISO, MOSI. We can actually see the default credentials being read from the protect area of Flash this time (not sure why the Ralink would load it to set a new password; it’s probably incidental):

Default WiFi Creds In MISO Capture

As you can see, they’re in plain text and separated from almost anything else in Flash. This may very well mean there’s no password generation algorithm in this device, but it is NOT conclusive. The developers could have decided to generate the credentials only once (first boot?) and store them to flash in order to limit the number of times the algorithm is accessed/executed, which helps hide the binary that contains it. Otherwise we could just observe the running processes in the router while we press the Factory Reset button and see which ones spawn or start consuming more resources.

Next Steps

Now that we’ve got the code we need to create binary recreations of the traffic and transmission maps, getting from a capture to binary files takes seconds. I captured other transmissions such as the first few seconds of boot (map, miso), but there wasn’t much worth discussing. The ability to easily obtain such useful data will probably come in handy moving forward, though.

In the next post we get the data straight from the source, communicating with the Flash IC directly to dump its memory. We’ll deal with compression algorithms for the extracted data, and we’ll keep piecing everything together.

Happy Hacking! :)