Extracting Firmware from Embedded Devices (SPI NOR Flash) ⚡

Captions
One of the first things you must do when hacking an embedded device is to obtain and analyse the firmware. If you're lucky, you can download it from a website, or if you have a root shell, you can just get all the files from there. But what if none of these options are available? In this video, we will show you different firmware storage systems used by embedded devices, and how you can connect directly to a memory chip to dump a firmware image and find your vulns. Let's get started!
Hey, this is Pedro from the Flashback Team. Let's start by looking at a router board. We like to demo using routers because these devices are cheap, readily available, can be purchased by anyone, and they contain a lot of the technologies found in big and small embedded devices.
In the picture we have three things highlighted. We have the MCU, the SPI and the Flash Memory. The MCU is the microcontroller unit. This is effectively a CPU in a package with some RAM and some input and output peripherals. This MCU is a Realtek chip, as you can see by the logo and by the RTL designation. Why don't you go online and check the datasheet for this one? This package does not contain a lot of memory, so the firmware is not stored there. It will be stored in our Flash Memory chip. Upon boot, the MCU will communicate with the Flash Memory in order to get the firmware that will be executed. How does this happen? Well, that's the magic of the SPI bus: Serial Peripheral Interface. This is a high speed, full duplex bus that allows the MCU to communicate with this NOR Flash Memory. Don't worry, we'll go into more details later.
Let's briefly talk about flash memory types. On the left hand side, we have the one that we just saw, which is the NOR flash. It is shown in a SOIC8 package, but can come in different packages. Package is just a fancy name for the chip's casing and how many legs it has. In the middle we have the NAND flash, which is higher density flash.
Here it is shown in a TSOP48 package, but again it can have more or fewer legs in some cases. And lastly, on the right hand side we have eMMC flash. This is a flash type that has very high density, meaning very large capacities, and here it comes in a Ball Grid Array (BGA) package, where the pins are under the chip. There's a fourth type of memory used in high-end devices that we do not mention here and that is slowly replacing eMMC. It's called UFS.
Your computer BIOS is most likely stored in SPI flash. Go ahead, open it up and have a look. But don't brick it! NAND would typically be used in larger devices, such as high-end routers, smart TVs, anything that needs bigger firmware. Whereas eMMC would be used in high-end devices. This would be more expensive stuff such as mobile phones, digital cameras, tablets, etc. We will not go into NAND versus eMMC discussions. Just bear in mind that NAND is accessed as a raw flash by the operating system, so it requires a few more tricks, let's say. Whereas the eMMC is a NAND with a built-in controller. Yeah, I know it sounds a bit complicated. If you want to know more, we might release a video in the future, or better yet just come to our training, where we explain all of this and play with all memory types.
Let's talk a bit about NOR flash. It's a storage medium for non-volatile data. This means that the data which is written to it will remain in the chip until it is rewritten. Volatile data, like the contents of RAM, is erased when you reboot your computer, reset the device or turn the power off. Data can be read or written on a byte by byte basis. This is actually a key feature of NOR flash. For example, with NAND flash you cannot just read 1, 10 or 100 bytes. You have to read the size of the page of that NAND chip, which varies per chip, but is usually 4096 bytes. This means you have a lot more flexibility reading and writing with NOR flash.
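To make the byte-versus-page point above concrete, here is a minimal Python sketch of how many bytes have to be transferred to satisfy a small read. The function names are made up for illustration, and the 4096-byte default is just the typical page size mentioned above, not a property of any specific chip.

```python
def nor_bytes_transferred(length: int) -> int:
    """NOR flash is byte addressable, so a read moves only the bytes requested."""
    return max(length, 0)


def nand_bytes_transferred(offset: int, length: int, page_size: int = 4096) -> int:
    """NAND is read in whole pages, so every page the range touches is read in full."""
    if length <= 0:
        return 0
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return (last_page - first_page + 1) * page_size


if __name__ == "__main__":
    # Reading 10 bytes that straddle a page boundary costs two full NAND pages.
    print(nor_bytes_transferred(10))
    print(nand_bytes_transferred(4090, 10))
```

A 10-byte read at offset 4090 touches pages 0 and 1, which is why the NAND figure jumps to two full pages while the NOR figure stays at 10.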
Another key property of NOR flash when compared to NAND flash and other types is that it's mostly error free. What this means is that it does not require any special error correcting features in the chip to work properly, and this is not the case for NAND. Additionally it has a very low latency, which means you can execute directly from the flash memory, which again is not possible with NAND flash. As we said previously, this means that NOR flash is mostly used for embedded devices that do not require a lot of storage but need fast execution and fast memory.
So how do we look at a NOR flash chip and identify it as such? Well, they usually look like this, as you're seeing on the left hand side. They usually come in a SOIC package, although they can come in bigger or smaller packages. The flash chip usually has a model number written on it. This is true not only for NOR but also for NAND. What you do is you read that model number, either with your eyes or a magnifying loupe, or, one of the things we like to do, take a high resolution photo with your phone and then just zoom in. And then you go on the Internet and you look for that model number. In this case MX25L6406E. If you're lucky, the datasheet will pop up in your Google search and then you can read it, understand how the NOR flash model works and understand how you can interface with it. But sometimes the datasheet is not available. In this case you don't need to worry much: as we will see, the SPI protocol is quite standardised, and as long as you guess where the power is and where the ground is so you don't fry your chip, you're usually OK, and Radek will show us how.
Let's talk about the SPI protocol then. The SPI protocol allows for synchronous serial communication in full duplex mode using a master slave architecture.
This mumbo jumbo means that usually there is a master, which controls the communication, and a slave, which follows its orders and is where the data is written to or read from, but data flows both ways. The master provides a clock signal which will determine the speed of the communication. Multiple slaves can be connected to one master but a slave cannot connect to a slave. Finally, and no less important, SPI requires a minimum of four wires for communication.
SCLK: this is the clock signal we just mentioned.
CS: Chip Select, think of it like an enable button for the SPI chip, which turns it on.
MOSI: Master Out Slave In. As the name implies, this is where the chip receives data from the master, the output from the master.
MISO: Master In Slave Out. This is where the slave sends the data back to the master.
OK, I know this is all very abstract, you're bored, tired of my voice. The good news is that Mr. Radek is going to take over from me right now and show exactly how all of this works in practice. This is Raaaaddddeeeekkk...
All the content we put on our YouTube channel, all the advisories and exploits we release, that's all free and sponsored by... Our own training! Do you enjoy watching our videos, learning new hacking techniques and finding and exploiting vulnerabilities? Then why don't you come to our embedded device hacking course, which we regularly host all over the world, live, in person. There are many hardware and embedded device hacking and IoT exploitation courses out there, but ours is truly unique. Why is that, Rado? We focus only on REAL vulnerabilities that we or other hackers have found. And as you can see from our videos, we have a lot of experience hacking real devices in Pwn2Own and for our day jobs. The goal of the course is to teach you how to take apart embedded devices by analysing the hardware, obtaining the firmware, reverse engineering it, finding a vulnerability and exploiting it.
Our mottos are: NO FAKE VULNS and PoC || GTFO! For more details check our website: training.flashback.sh. Get your ticket now!
That is a lot of theory... Let's turn that into practice. This is our target. We can see the main CPU and some data lines which go to the flash. The flash contains all the information that is needed for the router to work: firmware, configuration and so on. The router has to read it at boot time. Can we sniff it? Let's try.
For this we need some hardware: hooks or a SOIC8 clip, cables and the Saleae logic analyser. With hooks we can connect directly to the legs of the chip. But in this case I prefer the clip, it's just faster and a little bit more stable. After that I connected the logic analyser according to the datasheet pinout: channel 0 to Chip Select, channel 1 to the SO data line, channel 2 to clock and channel 3 to the SI data line. The logic analyser will allow us to sniff the traffic. Let's hit that power button and see what is happening on the wire.
This is the software that comes with the Saleae. I set the device to capture data at the highest possible speed settings. Make sure you use the original high quality cables delivered with the Saleae or you might not always get the most reliable data. OK, let's start capturing. Nothing yet... Wow, we see some data! This is literally the router pulling some data from the chip. You're sniffing on the bus, how awesome is that!
Let's have a closer look and see what is happening on the wire. OK, this looks very interesting and promising. We can see some waves changing, but let's rename the channels to match where they are physically connected: Chip Select, Master In Slave Out, Clock, Master Out Slave In. This software has an amazing feature: we can add analysers and they support SPI out of the box. That means it will try to interpret the data. We assign the channels according to the protocol definition, everything else stays as default.
And there we go! It interpreted the waves, but what does it really mean? For that we have to dig into the datasheet and try to understand a little better how SPI works. We are sniffing on four pins of the flash. Pedro explained earlier in the video what each of the pins is responsible for. So we know that the flash will receive data on the MOSI line. The SPI flash implements a set of commands. When it receives data, it matches it against the available commands and interprets the following data accordingly. In our case we check the command 0x3, which is the READ command. It has an action of: "n" bytes read until Chip Select goes high.
OK, let's try to understand it better. This is the sequence diagram for the READ command. At the top, the communication is started by pulling the CS line down. By default it idles high. Next, the clock dictates the speed of the communication. Each clock cycle carries 1 bit, so 8 bits are sent on the SI line to form the 0x3 command, which is READ. It is followed by 3 bytes of data. Those 3 bytes say which memory address to read from. And after this command, data should be shifted out on the SO line until the CS line goes back to high.
OK, let's see if that is what we saw in the logic analyser software. Yes! Notice it's exactly as presented in the datasheet. We can see the data being returned. So does it mean we can sniff firmware like this? Well, I would advise not to do so and to be careful with this. The data read by the router might not be in sequence, might not be complete, or some other things might happen. I would always advise to dump the firmware offline.
But how do we do that? We need a microcontroller that is able to speak the SPI protocol. Ladies and gentlemen, let me introduce to you one of my best hardware hacking tools: the hydrabus! It's an open source device that implements tons of protocols. It's like a Swiss army knife tool.
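The READ transaction described above (CS low, opcode 0x03, three address bytes, then data on SO until CS goes high) can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, the address is taken to be sent most significant byte first, and the "captured" bytes are what an SPI analyser would report for one CS-low window, not output from any specific tool.

```python
READ_OPCODE = 0x03  # READ command from the transcript


def build_read_command(address: int) -> bytes:
    """Return the 4 bytes the master clocks out on SI: opcode + 24-bit address."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 3 bytes")
    return bytes([READ_OPCODE]) + address.to_bytes(3, "big")


def parse_read_transaction(mosi: bytes, miso: bytes) -> tuple[int, bytes]:
    """Decode one sniffed CS-low window into (address, data returned by the flash).

    SPI is full duplex, so both lines carry one byte per 8 clocks. The first
    4 bytes on MISO are don't-care values clocked while the command is sent.
    """
    if len(mosi) < 4 or mosi[0] != READ_OPCODE:
        raise ValueError("not a READ transaction")
    address = int.from_bytes(mosi[1:4], "big")
    return address, miso[4:]


if __name__ == "__main__":
    print(build_read_command(0x010000).hex())
    sniffed_mosi = bytes([0x03, 0x00, 0x00, 0x10, 0x00, 0x00])
    sniffed_miso = bytes([0xFF, 0xFF, 0xFF, 0xFF, 0xDE, 0xAD])
    print(parse_read_transaction(sniffed_mosi, sniffed_miso))
```

Decoding sniffed traffic this way also shows why sniffing is a poor way to get a full image: each transaction only tells you what the router chose to read, at the address it chose.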
You just check the breakout diagram and wire the cables according to what you want to do. Notice it has two channels for SPI. There are other tools out there that you could use for SPI dumping, like the FTDI 232 chip or the Bus Pirate, but this one here is much faster, implements more protocols and I can extend its functionality as it is open source.
OK, let's see how we could use the hydrabus to dump the firmware from our target. What do we need: a clip, the hydrabus of course, and good quality cables. Now we have to match the pins on the chip to the breakout on the hydrabus. I will be using SPI2 now as it supports serprog, which I will use to dump the firmware shortly. So I simply have to match each of the lines on the chip to the pins on the hydrabus, but this time I'll be providing 3.3 volt power to the target, so I have to wire the power and ground lines too.
OK, everything is wired, we are ready to connect the hydrabus to our computer, and we can use a magic tool called flashrom in serprog mode. flashrom is a standard tool used to dump firmware via SPI. It supports tons of hardware, so we might be lucky with our target. Let's try. Oh, so it says it recognised several possible chip models and is unsure which one to use. Let's help it, as we were able to read the model from the chip package. Wow, it dumps it! OK, is binwalk going to recognise it? Yes, that's the file system of the target! Awesome!
Sometimes flashrom might have problems reading data from the chip and you might have to lower the clock speed. You can easily do that with the spispeed option. OK, so it's a true Hackerman job.
But what happens if flashrom doesn't support your target chip? You can cry in a corner or speak with the SPI chip on your own terms, and I will show you now how to send any SPI command to the chip. For this we need the hydrabus. We can connect to the hydrabus via serial console. It welcomes us with a nice menu.
We select SPI. In here, we have tons of things that we can set and tune. Not everything is in scope of this video but I suggest you explore on your own. I use the SPI1 device now as this is where my cables are connected on the hydrabus.
OK, let's have a look at the datasheet. One very nice command to send is READ ID. Each chip has a hardcoded ID and if we ask nicely it will return it to us. We already know how to interpret this diagram. We have to pull the CS line down, send the 0x9F command and read data from the MISO line. In the hydrabus, we can do it very easily. The bracket [ is used to instruct the hydrabus to pull the CS line down, then we send the command 0x9F and we tell the hydrabus to read three bytes. OK, let's try it. Yes, it returned data! And yes, this is valid READ ID data, as it matches what we expect based on the datasheet! And this is how it looks on the wire in the logic analyser.
So what else can we do? Let's send the READ command and read from the beginning of the chip. It is 0x3, and as we want to read from the beginning, we send 00 00 00 and read data off the line. Yeah, data is returned! So is this the way to dump the firmware? Well, I think that would be quite inefficient... but we can script it!
Yes Radek, we can script it. But guess what, I have already done it back in 2018! You just need to go to the hydrabus git repository, to the contrib directory, and you will see there's an SPI dumping script which works very quickly. Give it a try and you'll see it works quite well. You can use it to dump any standard SPI chip. And then, once that's done, it's time to analyse the firmware. We win!
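The idea of scripting READ ID and a full dump, as described above, can be sketched in Python like this. It is not the script from the hydrabus repository and it does not talk to real hardware: `transfer` is a hypothetical callable standing in for "pull CS low, write these bytes, read N bytes, release CS", `FakeFlash` is a made-up in-memory stand-in for the chip, and the ID bytes and chunk size are illustrative placeholders rather than values taken from the video.

```python
from typing import Callable

READ_ID_OPCODE = 0x9F  # READ ID command from the transcript
READ_OPCODE = 0x03     # READ command from the transcript

# Hypothetical transport: transfer(tx, rx_len) performs one CS-low transaction.
Transfer = Callable[[bytes, int], bytes]


def read_id(transfer: Transfer) -> tuple[int, int, int]:
    """Send 0x9F and return the three ID bytes the chip shifts out."""
    raw = transfer(bytes([READ_ID_OPCODE]), 3)
    return raw[0], raw[1], raw[2]


def dump_flash(transfer: Transfer, size: int, chunk: int = 4096) -> bytes:
    """Read the whole chip as a series of READ transactions of up to `chunk` bytes."""
    image = bytearray()
    for address in range(0, size, chunk):
        length = min(chunk, size - address)
        command = bytes([READ_OPCODE]) + address.to_bytes(3, "big")
        image += transfer(command, length)
    return bytes(image)


class FakeFlash:
    """In-memory stand-in for an SPI NOR chip, only for exercising the sketch."""

    def __init__(self, contents: bytes, chip_id: bytes) -> None:
        self.contents = contents
        self.chip_id = chip_id

    def transfer(self, tx: bytes, rx_len: int) -> bytes:
        if tx[0] == READ_ID_OPCODE:
            return self.chip_id[:rx_len]
        if tx[0] == READ_OPCODE:
            address = int.from_bytes(tx[1:4], "big")
            return self.contents[address:address + rx_len]
        raise ValueError("command not modelled in this sketch")


if __name__ == "__main__":
    chip = FakeFlash(bytes(range(256)) * 40, b"\xc2\x20\x17")  # placeholder ID bytes
    print([hex(b) for b in read_id(chip.transfer)])
    image = dump_flash(chip.transfer, size=len(chip.contents), chunk=1000)
    print(len(image), image == chip.contents)
```

Keeping the transport behind a single callable means the same dump loop could be pointed at a real adapter later, and reading the dump twice and comparing the two images is a cheap way to catch an unreliable connection.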
Info
Channel: Flashback Team
Views: 411,510
Keywords: security, vulnerability research, SPI, NOR Flash, Flash, Firmware extraction, Firmware dumping, flashrom, hydrabus, Pwn2Own, embedded devices
Id: nruUuDalNR0
Length: 18min 40sec (1120 seconds)
Published: Fri Sep 09 2022