Extracting Firmware from Embedded Devices (SPI NOR Flash) ⚡

Captions

One of the first things you must do when hacking an embedded device is to obtain and analyse the firmware. If you're lucky, you can download it from a website, or if you have a root shell, you can just get all the files from there. But what if none of these options are available? In this video, we will show you different firmware storage systems used by embedded devices, and how you can connect directly to a memory chip to dump a firmware image and find your vulns. Let's get started!
Hey, this is Pedro from the Flashback Team. Let's start by looking at a router board. We like to demo using routers because these devices are cheap, readily available, can be purchased by anyone, and they contain a lot of the technologies found in big and small embedded devices. In the picture we have three things highlighted: the MCU, the SPI and the Flash Memory. The MCU is the microcontroller unit. This is effectively a CPU in a package with some RAM and some input and output peripherals. This MCU is a Realtek chip, as you can see by the logo and by the RTL designation. Why don't you go online and check the datasheet for this one? This package does not contain a lot of memory, so the firmware is not stored there. It will be stored in our Flash Memory chip. Upon boot, the MCU will communicate with the Flash Memory in order to get the firmware that will be executed. How does this happen? Well, that's the magic of the SPI bus: Serial Peripheral Interface. This is a high-speed, full-duplex bus that allows the MCU to communicate with this NOR Flash Memory. Don't worry, we'll go into more detail later.
Let's briefly talk about flash memory types. On the left-hand side, we have the one that we just saw, which is the NOR flash. It is shown in a SOIC8 package, but can come in different packages. Package is just a fancy name for how many legs the chip has. In the middle we have the NAND flash, which is higher-density flash.
Here it is shown in a TSOP48 package, but again it can have more or fewer legs in some cases. And lastly, on the right-hand side we have eMMC flash. This is a flash type that has very high density, meaning very large capacities, and here it comes in a Ball Grid Array (BGA) package, where the pins are under the chip. There's a fourth type of memory used in high-end devices that we do not mention here and that is slowly replacing eMMC. It's called UFS.
Your computer BIOS is most likely stored in SPI flash. Go ahead, open it up and have a look. But don't brick it! NAND would typically be used in larger devices, such as high-end routers, smart TVs, anything that needs bigger firmware. Whereas eMMC would be used in high-end devices. This would be more expensive stuff such as mobile phones, digital cameras, tablets, etc. We will not go into NAND versus eMMC discussions. Just bear in mind that NAND is accessed as raw flash by the operating system, so it requires a few more tricks, let's say, whereas eMMC is NAND with a built-in controller. Yeah, I know it sounds a bit complicated. If you want to know more, we might release a video in the future, or better yet just come to our training, where we explain all of this and play with all memory types.
Let's talk a bit about NOR flash. It's a storage medium for non-volatile data. This means that the data which is written to it will remain in the chip until it is rewritten. Volatile data, like the contents of RAM, is erased when you reboot your computer, reset the device or turn the power off. Data can be read or written on a byte-by-byte basis. This is actually a key feature of NOR flash. For example, with NAND flash you cannot just read 1, 10 or 100 bytes. You have to read the size of the page of that NAND chip, which varies per chip, but is usually 4096 bytes. This means you have a lot more flexibility reading and writing with NOR flash.
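To make the byte-versus-page point concrete, here is a minimal sketch in Python. It is only an illustration: the 4096-byte page size is the "usual" figure quoted above and real chips vary, and the function names are hypothetical helpers, not part of any tool mentioned in the video.

```python
# Assumption: 4096-byte pages, the "usual" figure quoted in the talk.
PAGE_SIZE = 4096


def nand_pages_for_range(offset, length, page_size=PAGE_SIZE):
    """Return the page numbers that must be read in full to cover a byte range."""
    if length <= 0:
        return []
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return list(range(first, last + 1))


def bytes_transferred(offset, length, page_size=PAGE_SIZE):
    """Return (NOR bytes, NAND bytes) moved to satisfy the same request."""
    nor = max(length, 0)  # NOR can return exactly the bytes asked for
    nand = len(nand_pages_for_range(offset, length, page_size)) * page_size
    return nor, nand


if __name__ == "__main__":
    # A 10-byte read that straddles a page boundary.
    print(nand_pages_for_range(4090, 10))
    print(bytes_transferred(4090, 10))
```

A 10-byte request that crosses a page boundary costs two whole pages on the NAND side, while the NOR side moves only the 10 bytes requested.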
Another key property of NOR flash, when compared to NAND flash and other types, is that it's mostly error free. What this means is that it does not require any special error-correcting features in the chip to work properly, which is not the case for NAND. Additionally, it has very low latency, which means you can execute directly from the flash memory, which again is not possible with NAND flash. As we said previously, this means that NOR flash is mostly used for embedded devices that do not require a lot of storage but need fast execution and fast memory.
So how do we look at a NOR flash chip and identify it as such? Well, they usually look like what you're seeing on the left-hand side. They usually come in a SOIC package, although they can come in bigger or smaller packages. The flash chip usually has a model number written on it. This is true not only for NOR but also for NAND. What you do is read that model number, either with your eyes or a magnifying loupe, or, one of the things we like to do, take a high-resolution photo with your phone and then just zoom in. And then you go on the Internet and look for that model number, in this case MX25L6406E. If you're lucky, the datasheet will pop up in your Google search, and then you can read it, understand how the NOR flash model works and understand how you can interface with it. But sometimes the datasheet is not available. In this case you don't need to worry much because, as we will see, the SPI protocol is quite standardised, and as long as you guess where the power is and where the ground is so you don't fry your chip, you're usually OK, and Radek will show us how.
Let's talk about the SPI protocol then. The SPI protocol allows for synchronous serial communication in full-duplex mode using a master-slave architecture.
This mumbo jumbo means that usually there is a master, which controls the communication, and a slave, which follows its orders and is where the data is written to or read from, but data flows both ways. The master provides a clock signal which will determine the speed of the communication. Multiple slaves can be connected to one master, but a slave cannot connect to a slave. Finally, and no less important, SPI requires a minimum of four wires for communication. SCLK: this is the clock signal we just mentioned. CS: Chip Select, think of it like an enable button for the SPI chip, which turns it on. MOSI: Master Out Slave In. As the name implies, this is where the chip receives data from the master, the output from the master. MISO: Master In Slave Out. This is where the slave sends the data back to the master.
OK, I know this is all very abstract, you're bored, tired of my voice. The good news is that Mr. Radek is going to take over from me right now and show exactly how all of this works in practice. This is Raaaaddddeeeekkk...
All the content we put on our YouTube channel, all the advisories and exploits we release, that's all free and sponsored by... Our own training! Do you enjoy watching our videos, learning new hacking techniques and finding and exploiting vulnerabilities? Then why don't you come to our embedded device hacking course, which we regularly host all over the world, live, in person. There are many hardware and embedded device hacking and IoT exploitation courses out there, but ours is truly unique. Why is that, Rado? We focus only on REAL vulnerabilities that we or other hackers have found. And as you can see from our videos, we have a lot of experience hacking real devices in Pwn2Own and for our day jobs. The goal of the course is to teach you how to take apart embedded devices by analysing the hardware, obtaining the firmware, reverse engineering it, finding a vulnerability and exploiting it. Our mottos are: NO FAKE VULNS and PoC || GTFO!
For more details check our website: training.flashback.sh Get your ticket now!
That is a lot of theory... Let's turn that into practice. This is our target. We can see the main CPU and some data lines which go to the flash. The flash contains all the information that is needed for the router to work: firmware, configuration and so on. It has to read it at boot time. Can you sniff it? Let's try. For this we would need some hardware: hooks or a SOIC8 clip, cables and the logic analyser sal dei saleh salias... Jesus! With hooks we can connect directly to the legs of the chip. But in this case I prefer the clip, it's just faster and a little bit more stable. After that I connected the logic analyser according to the datasheet pinout: channel 0 to Chip Select, channel 1 to the SO data line, channel 2 to clock and channel 3 to the SI data line. The logic analyser will allow us to sniff the traffic. Let's hit that power button and see what is happening on the wire.
This is the software that comes with the Saleae. I set the device to capture data at the highest possible speed settings. Make sure you use the original high-quality cables delivered with the Saleae or you might not always get the most reliable data. OK, let's start capturing. Nothing yet... Wow, we see some data! This is literally the router pulling some data from the chip. You're sniffing on the bus, how awesome is that! Let's have a closer look and see what is happening on the wire. OK, this looks very interesting and promising. We can see some waves changing, but let's rename the channels to where they are connected physically: Chip Select, Master In Slave Out, Clock, Master Out Slave In. This software has an amazing feature: we can add analysers, and they support SPI out of the box. That means it will try to interpret the data. We assign the channels according to the protocol definition; everything else stays as default. And there we go! It interpreted the waves, but what does it really mean?
For that we have to dig into the datasheet and try to understand a little better how SPI works. We are sniffing on four pins of the flash. Pedro explained earlier in the video what each of the pins is responsible for, so we know that the flash will receive data on the MOSI line. The SPI flash implements a set of commands. When it receives data, it matches it against the available commands and interprets the following data accordingly. In our case we check the command 0x3, which is the READ command. It has an action of: "n" bytes read until Chip Select goes high.
OK, let's try to understand it better. This is the sequence diagram for the READ command. At the top, the communication is started by carefully pulling the CS line down. By default it idles high. Next, the clock dictates the speed of the communication. Each clock cycle carries 1 bit, so 8 bits are sent on the SI line to send the 0x3 command, which is READ. It is followed by 3 bytes of data. Those 3 bytes say which memory address to read data from. And after this command, data should be shifted out on the SO line until the CS line goes back to high. OK, let's see if that is what we saw in the logic analyser software. Yes! Notice it's exactly as presented in the datasheet. We can see the data being returned.
So does it mean we can sniff firmware like this? Well, I would advise not to do so, and to be careful with this. The data read by the router might not be in sequence, might not be complete, or some other things might happen. I would always advise dumping the firmware offline. But how do we do that? We need a microcontroller that is able to speak the SPI protocol. Ladies and gentlemen, let me introduce to you one of my best hardware hacking tools: the hydrabus! It's an open source device that implements tons of protocols. It's like a Swiss army knife tool. You just check the breakout diagram and wire the cables according to what you want to do. Notice it has two channels for SPI.
There are other tools out there that you could use for SPI dumping, like the FTDI 232 chip or the Bus Pirate, but this one here is much faster, implements more protocols, and I can extend its functionality as it is open source. OK, let's see how we could use the hydrabus to dump the firmware from our target. What do we need? A clip, the hydrabus of course, and good quality cables. Now we have to match the pins on the chip to the breakout on the hydrabus. I will be using SPI2 now as it supports serprog, which I will use to dump the firmware shortly. So I simply have to match each of the lines on the chip to the pins on the hydrabus, but this time I'll be providing 3.3 volt power to the target, so I have to wire the power and ground lines too.
OK, everything is wired, we are ready to connect the hydrabus to our computer, and we can use a magic tool called flashrom in serprog mode. flashrom is a standard tool used to dump firmware via SPI. It supports tons of hardware, so we might be lucky with our target. Let's try. Oh, so it says it recognised several possible chip models and is unsure which one to use. Let's help it, as we were able to read the model off the chip package. Wow, it dumps it! OK, is binwalk going to recognise it? Yes, that's the file system of the target! Awesome! Sometimes flashrom might have problems reading data from the chip and you might have to lower the clock speed. You can easily do that with the spispeed option. OK, so it's a true Hackerman job.
But what happens if flashrom doesn't support your target chip? You can cry in a corner or speak with the SPI chip on your own terms, and I will show you now how to send any SPI command to the chip. For this we need the hydrabus. We can connect to the hydrabus via serial console. It welcomes us with a nice menu. We select SPI in here, and we have tons of things that we can set and tune. Not everything is in scope of this video but I suggest you explore on your own.
I use the SPI1 device now as this is where my cables are connected on the hydrabus. OK, let's have a look at the datasheet. One very nice command to send is READ ID. Each chip has a hardcoded ID and if we ask nicely it will return it to us. We already know how to interpret this diagram. We have to pull the CS line down, send the 0x9F command and read data from the MISO line. In the hydrabus, we can do it very easily. The bracket [ is used to instruct the hydrabus to pull the CS line down, then we send the command 0x9F and we tell the hydrabus to read three bytes. OK, let's try it. Yes, it returned data! And yes, this is valid READ ID data, as it matches what we expect based on the datasheet! And this is how it looks on the wire in the logic analyser.
So what else can we do? Let's send the READ command and read from the beginning of the chip. It is 0x3, and as we want to read from the beginning, we send 00 00 00 and read data off the line. Yeah, data is returned! So is this the way to dump the firmware? Well, I think that would be quite inefficient... but we can script it!
Yes Radek, we can script it. But guess what, I have already done it back in 2018! You just need to go to the hydrabus git repository, to the contrib directory, and you will see there's an SPI dumping script which works very quickly. Give it a try and you'll see it works quite well. You can use it to dump any standard SPI chip. And then, once that's done, it's time to analyse the firmware. We win!
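The READ ID and READ transactions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the hydrabus contrib script and not a driver for any real hardware. The opcodes 0x9F and 0x03 and the 3-byte address come from the talk. The `FakeNorFlash` class, the `dump` helper, the chunk size and the placeholder ID bytes are all hypothetical, and the address is assumed to go out most significant byte first, as is usual for SPI NOR chips.

```python
READ = 0x03     # READ opcode, as quoted in the talk
READ_ID = 0x9F  # READ ID opcode, as quoted in the talk


def read_command(address):
    """Build the 4 bytes sent on MOSI: opcode plus a 3-byte address (MSB first)."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address does not fit in 3 bytes")
    return bytes([READ]) + address.to_bytes(3, "big")


class FakeNorFlash:
    """Hypothetical in-memory stand-in for the chip, for illustration only."""

    def __init__(self, contents, chip_id):
        self.contents = bytes(contents)
        self.chip_id = bytes(chip_id)

    def transaction(self, mosi, read_len):
        """Model one CS-low to CS-high exchange: send mosi, clock out read_len bytes."""
        opcode = mosi[0]
        if opcode == READ_ID:
            return self.chip_id[:read_len]
        if opcode == READ:
            address = int.from_bytes(mosi[1:4], "big")
            return self.contents[address:address + read_len]
        raise ValueError("opcode not modelled in this sketch")


def dump(flash, size, chunk=256):
    """Read the whole chip as a series of READ transactions of at most chunk bytes."""
    image = bytearray()
    for address in range(0, size, chunk):
        length = min(chunk, size - address)
        image += flash.transaction(read_command(address), length)
    return bytes(image)


if __name__ == "__main__":
    # Placeholder contents and ID bytes, not values from any datasheet.
    flash = FakeNorFlash(bytes(range(256)) * 4, b"\xAA\xBB\xCC")
    print(flash.transaction(bytes([READ_ID]), 3).hex())
    print(read_command(0).hex())
    print(len(dump(flash, 1024)))
```

Reading in chunks is a choice made for this sketch. As the datasheet description above says, the chip keeps shifting data out until CS goes high, so the chunk size is bounded by the tool driving the bus, not by the READ command itself.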

Info

Channel: Flashback Team

Views: 411,510

Keywords: security, vulnerability research, SPI, NOR Flash, Flash, Firmware extraction, Firmware dumping, flashrom, hydrabus, Pwn2Own, embedded devices

Id: nruUuDalNR0

Length: 18min 40sec (1120 seconds)

Published: Fri Sep 09 2022