Creating Custom AXI Slave Interfaces Part 1 (Lesson 6)

Video Statistics and Information

Captions
Hi, how are you doing? I'm Mohammad Sadri, a member of the Microelectronic Systems Design Research Group of TU Kaiserslautern, and this is video number six of our Zynq training series. It's titled "Creating Custom AXI Slave Peripherals the Easy Way." There are a lot of methods that you can use to create AXI slave blocks for the custom modules that you create, and what we want to emphasize is making it as easy as possible, so that everybody can do it.

Here are the topics that I want to talk about today. First, a small introduction and motivation: why do we need to be able to realize our own AXI slave interfaces, and what different methods and types exist for these AXI slave blocks? Then a brief overview of the overall idea that we use for realizing modules with these AXI slave blocks. Then I take one example design and go through it in a practical experiment, to design the real module with its AXI slave interfaces and make it run on the ZC706 board.

So suppose that you have a task in which you have the Zynq device, or generally speaking any other FPGA, and there is some data residing outside of the system. You want to read this data and transfer it to the CPUs that you have in your system. If you are using the Zynq, the CPU is obviously the ARM cores; if you are using any other FPGA, the CPU is probably MicroBlaze. The ARM cores and the ARM subsystem that I have here talk entirely through AXI interfaces, the AXI protocol that we have talked about previously. So when I receive this data, I need to transfer it to the CPU based on that AXI protocol, which means I need logic that is capable of talking this protocol. As a very quick solution, if it's just a
simple piece of data that I want to read, then I put in one AXI GPIO. It's a module which is already available. So I put in my AXI GPIO and configure it to generate an interrupt whenever new data arrives. The new data arrives, the interrupt gets generated, the CPU sees the interrupt, and then, for example through the GP0 port of the Zynq device, this AXI master performs a read transaction to this AXI slave and reads the data.

But this is not always possible. It's not always true that you can use a simple AXI GPIO to perform any kind of input/output transfer task that you want. Just imagine that receiving the data from outside involves a complicated communication protocol: for example, there is a set of handshake signals which must be set correctly by your logic before you can receive the data. Or suppose that I am not interested in receiving all of the data, just some specific pieces of it, so every time data arrives I want to first examine it and transfer it to the CPU only if it is of interest. It happens many times that you need a custom design here for interfacing to the outside world, and an AXI GPIO, which is a pre-built, ready-to-use module and not really customizable, cannot be used. In many situations you need to realize your own logic to create your custom interfaces to the rest of the logic, to the rest of the world that you have. So we need to be able to create our own custom peripheral, and that custom peripheral should be able to talk to the CPUs of the system through the AXI protocol. The custom peripheral that I create should, for example, feature an AXI slave block so that the CPU can perform AXI transactions to it and do the data transfers.

As another example, suppose that I have my Zynq device and there is a set of computational tasks that I want to do,
and I know that I can implement these computational tasks directly in hardware. If I do that, I can do them much faster and with much lower power consumption, because I realize the computational tasks directly in hardware instead of letting a CPU do them cycle by cycle, which usually takes longer and uses more energy. So I develop a kind of hardware accelerator, which is basically my computational task. This hardware accelerator should be able to talk to the CPU subsystem, and the CPU subsystem talks AXI. So my hardware accelerator should understand the AXI protocol, and at the very least it should be able to respond correctly to the AXI requests coming from outside. This is another example showing that the RTL, the logic that I'm realizing, needs to be able to understand and talk to the other AXI devices in the system.

Up to now we have come to the conclusion that we need to be able to talk AXI. So what possible solutions do I have to add this feature to my own logic? They are the following. First, you can read the AXI specification, which is a complete document coming from ARM, and based on that specification you can develop a set of RTL code which enables your module to send and receive data through AXI. So the first option, which is for real professionals and is time-consuming and difficult, is that you develop the whole thing yourself.

However, there are other options. For example, there is a set of IP blocks called IP interfaces: a set of Xilinx IPIF units. Each of these IPIF units is specialized to talk to a specific type of AXI interface, or to perform a specific type of transfer task. For example, there's an IPIF that allows you to
instantiate this block inside your module, and when you do, you can talk to the other devices based on the AXI-Lite protocol. Or there's an IPIF for talking to the outside world, to the AXI world, through AXI bursts. What the IPIF basically does is simplify talking to the AXI world. On one side it gets connected to the AXI world and generates all of the required signals with the required timing. On the other side, the IPIF provides us with a very simple, easy-to-use, memory-like interface: basically an address, a read enable, a write enable, and the size of the data which should be transferred. The interface here is a set of simple signals, so the developer who is going to develop his own logic can very easily develop the logic required to talk to this interface. The IPIF is then responsible for taking these signals, converting them, and making them compatible with the AXI timing and the complicated AXI signaling and protocol. So IPIFs are the second way.

There's another way. When you use the Vivado environment, there's a wizard that allows you to create custom peripherals, and through the wizard you are able to create units with AXI slave or master blocks, with some sample pre-generated code. So when you use this wizard, Vivado will also produce some RTL for you which is already working and does the basic functionality required for that specific type of AXI block. It's usually a very good idea to take this pre-generated code, work on it a little, change it a little, and adapt it to the type of application that we have.

Finally, another method to create modules with AXI blocks is to use the HLS flow. Using Vivado HLS, you code directly in C; you don't code in Verilog or VHDL. Then, through a few
lines of code, you can realize different types of AXI master or slave blocks very, very easily. You write a set of C program lines that just initiate the required transactions or accept some transactions, and then the Vivado HLS environment converts all of your C code automatically into suitable RTL, which produces all of the required signals for the AXI protocol with correct timing.

For this video, what I want to show you is the third option. I use the Vivado environment, ask it to generate some almost-ready-to-use code, and then customize it for my specific needs.

As we go ahead with AXI slave interfaces, you will notice that there exist two types. The first one is AXI slave lite, and here is the list of signals for an AXI slave block in lite mode. In lite mode your AXI slave does not support burst transactions, so it transfers only one word of data each time. It also does not support the ID signals. The other possibility for an AXI slave block is to have an AXI slave full interface. There you have the complete list of signals for your AXI interface, your AXI slave block should support burst transactions, and it may also support responding to several transactions concurrently, which is why you have the ID signals.

For both modes of our AXI slave block, AXI slave lite and AXI slave full, we have basically these five channels: the read address channel, the read data (response) channel, the write address channel, the write response channel, and the write data channel. In terms of basic structure they are the same, but AXI slave lite is much simpler than an AXI slave full interface. And for many of the tasks that we want to do, an AXI slave lite block is completely sufficient. We really don't need an AXI slave full interface, because for
many of the tasks that we do, we don't really need to support bursts and we don't really need to handle several transactions concurrently.

So what is this ID signal that I was talking about? Why is it important, and why does it exist? If you look at the list of signals in AXI full and compare it with the list of signals in AXI-Lite, you can see that in AXI-Lite the ID signals are missing for all of the channels. Imagine this simple case: I have four AXI masters here, one AXI interconnect, and one AXI slave. For now, suppose that my AXI slave is a simple AXI-Lite slave, capable of performing only single-beat, single-word transactions and of serving only one transaction at a time. In this case the AXI slave receives only one transaction at a time: it processes the transaction, responds to it, and then goes on to the next transaction. So at any time only one of these AXI masters is allowed to perform a read or a write operation to this AXI slave. The AXI interconnect that I have here can assign an ID to each of these AXI masters, in fact to each of its AXI slave ports. Whenever a transaction comes in on one of these ports, it routes the transaction to the slave, and when the response comes back, the AXI interconnect already knows which master had initiated that transaction, so it routes the response back to that specific master. This works because only one master at a time is practically active and performing its transaction.

But now imagine a case where you have your AXI masters and an AXI slave block which supports bursts and is capable of answering the requests of several masters at the same time. Suppose, for example, that it's a very high-performance memory controller that can handle several requests at the same time. If three of these masters are sending transactions at the same time, then how should the AXI
interconnect recognize which of these AXI masters the response coming back from the AXI slave should be routed to? At the time of routing the response back from the AXI slave to the AXI master we will have a problem, because these three transactions are running in parallel, and it will be very difficult, almost impossible, for the AXI interconnect to work out by itself where each response should be routed. This is why the ID signals exist. In this case every transaction is identified by an ID, and based on the ID the response is routed back to the right AXI master. If I look at my AXI slave full interface, for example, I can see that for the write address channel I have this AWID signal. Whenever an AXI master issues a transaction, it puts a value on AWID, and when the AXI slave responds to this specific write transaction on the write response channel, on the BID signal, it puts exactly the same ID that it had received previously through the write address channel. This way the transactions are completely distinguished from each other.

Okay, so for now the general idea that we want to follow is this. We want to be able to create our own hardware accelerator, and I don't want to create everything myself; I want to use what is already there as much as possible. How is this system going to work? It will have only AXI slave blocks, plus an interrupt port. Whenever we want to use this hardware accelerator, we transfer the data to it and enable it to perform the operation, and whenever the hardware accelerator is finished with the computational task, it generates the interrupt. We receive the interrupt, and we go and read back the response. So this block, for our current designs, is
completely a passive block. It's not going to generate or initiate transactions on its own, and it has only AXI slave blocks. But it does have its interrupt port, and through the interrupt port it informs the CPU that it should come and read the data, and that it should come and write new data. So for now we don't need any AXI master block for our hardware accelerators, but the interrupt mechanism really allows us to use our hardware accelerator efficiently.

As a practical example, I want to create a simple hardware accelerator for you. It happens in many different applications that you have a bunch of data and you want to see how many times a specific pattern is repeated in it. It's a kind of statistical analysis that you do on the received data, and it's used in many cases; for example, in most database applications this is a kind of search that you usually need to do. So what's the story? I have the CPU, the CPU has a bunch of data, and I want to develop a hardware accelerator which receives this data and reports back to the CPU how many times a specific pattern is repeated in it. It has a RAM, and the data will be written to the RAM. It has a pattern matching module which is also active while you're writing the data: it analyzes the data and sees whether any part of it matches the specific pattern that we want. And we have a register in our system that, when the data transfer and analysis task is finished, will contain a number indicating how many times the specific pattern was repeated.

So here is the general operation. The CPU copies the data to the RAM of the hardware accelerator. The hardware accelerator begins processing the data, and as soon as it finishes it generates an interrupt to tell the CPU, "I have finished my processing." As soon as the CPU receives the interrupt, it goes
and reads this register, so it knows how many times that specific pattern has occurred in the text or data array that it has.

Now I want my hardware accelerator to also talk to the outside world. So I have a signal on my hardware accelerator that gets activated when this number is above a specific threshold. For example, if the number of patterns that we are looking for is above 10, I want this signal to get activated too. My hardware accelerator also has an enable input, through which you can completely enable and disable the hardware accelerator from the outside world. In the practical experiment that I want to do, I want to connect this signal, for example, to an LED, and that one to a simple DIP switch. Then I have the ARM core of the Zynq as my CPU, and here is the hardware accelerator that I am going to develop.

So here is the basic structure of our hardware accelerator. This is just an example of how you can do it; it's not necessarily the most efficient way, but maybe it's the simplest. For my hardware accelerator I will realize two AXI slave blocks. One AXI slave block is responsible for receiving the data. This one should be an AXI slave full block, because I want to receive the data as fast as possible, and in order to do that I should allow the CPU to send the data in bursts. So I need an AXI slave full. The AXI slave full block contains a RAM, a memory which is responsible for storing the data, and it also contains the pattern matching engine, which looks at the incoming data and identifies whether the specified pattern exists in it. The output of the pattern matching engine is a counter which indicates how many times this specific pattern has been repeated. My hardware accelerator also has an AXI slave lite block, and this AXI slave lite block
will contain a register, and I allow the CPU to read this register through the AXI slave lite interface of this block. The interrupt to the CPU will also be generated by this AXI slave lite block, and the enable input and the output signal are both connected to this AXI slave lite block as well. So my hardware accelerator contains two AXI slave blocks: one AXI slave lite and one AXI slave full. Let's see how we can create this block in Vivado, and what procedure we follow to create the hardware that we want.
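
The accelerator described in the transcript can be sketched as a small behavioral model in C. This is only an illustration of the data flow (RAM, pattern matcher, count register, interrupt, threshold signal, enable input), not the RTL built in the video. Every name here (`accel_model`, `accel_reset`, `accel_write_data`, `accel_read_count`, `ACCEL_RAM_WORDS`) is hypothetical. The model also assumes that the pattern is a single 32-bit word, that the interrupt is raised at the end of each data transfer, and that reading the count register clears the interrupt; the video does not state any of these.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ACCEL_RAM_WORDS 256u /* hypothetical RAM depth, not from the video */

/* Behavioral model of the accelerator; all field names are illustrative. */
typedef struct {
    uint32_t ram[ACCEL_RAM_WORDS]; /* RAM behind the AXI slave full block */
    uint32_t pattern;              /* word to search for (assumed 32-bit) */
    uint32_t threshold;            /* limit for the external output signal */
    uint32_t match_count;          /* register read via AXI slave lite */
    bool enable;                   /* external enable input (DIP switch) */
    bool irq;                      /* interrupt line to the CPU */
    bool above_threshold;          /* external output signal (LED) */
} accel_model;

/* Clear all state and configure the pattern and threshold. */
void accel_reset(accel_model *a, uint32_t pattern, uint32_t threshold)
{
    assert(a != NULL);
    *a = (accel_model){0};
    a->pattern = pattern;
    a->threshold = threshold;
}

/* Model the CPU writing n words into the RAM. The pattern matcher
 * watches each word as it is written. Returns the number of words
 * stored, or 0 when the accelerator is disabled. */
size_t accel_write_data(accel_model *a, const uint32_t *data, size_t n)
{
    assert(a != NULL && data != NULL);
    if (!a->enable) {
        return 0;
    }
    if (n > ACCEL_RAM_WORDS) {
        n = ACCEL_RAM_WORDS; /* the model drops words beyond the RAM size */
    }
    a->match_count = 0;
    a->above_threshold = false;
    for (size_t i = 0; i < n; i++) {
        a->ram[i] = data[i];
        if (data[i] == a->pattern) {
            a->match_count++;
        }
        /* The output signal is active while the count is above the threshold. */
        a->above_threshold = a->match_count > a->threshold;
    }
    a->irq = true; /* assumption: interrupt at the end of the transfer */
    return n;
}

/* Model the CPU reading the result register after the interrupt.
 * Clearing the interrupt on read is an assumption of this sketch. */
uint32_t accel_read_count(accel_model *a)
{
    assert(a != NULL);
    a->irq = false;
    return a->match_count;
}
```

In use, the CPU side of the model would call `accel_reset`, set `enable`, pass a buffer to `accel_write_data`, wait until `irq` is set, and then call `accel_read_count`, which mirrors the copy, interrupt, and read-back sequence described above.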
Info
Channel: Microelectronic Systems Design Research Group
Views: 30,428
Rating: 4.826087 out of 5
Keywords: AXi, Zynq, Xilinx, Mohammad Sadr, Matthias Jung, Norbert Wehn, University Of Kaiserslautern (College/University), Technische Universität Kaiserslautern, AXI, Lectures, Microelectronics, Microelectronic System Design Research Group, Lessons, Teacher, Advanced Microcontroller Bus Architecture, Student, Learn
Id: meQcwzC4Vtk
Length: 30min 50sec (1850 seconds)
Published: Wed Jan 21 2015