AMD ROCm on WINDOWS for STABLE DIFFUSION released SOON? 7x faster STABLE DIFFUSION on AMD/WINDOWS.

Video Statistics and Information

Captions
Stable Diffusion with AMD ROCm on Windows: will we have a release soon? And is there a way to speed up Stable Diffusion with AMD graphics cards on Windows in case ROCm still takes a while? You have surely heard about AMD's plans to support ROCm on Windows. ROCm is AMD's software stack for GPU computing; among other things, it is used to run AI tools like Stable Diffusion. In this video I'll tell you the status of ROCm on Windows for Stable Diffusion and how to get detailed information yourself. In the second part of the video I'll show you how to speed up Stable Diffusion on Windows with your AMD GPU. Image generation on Windows with my Radeon 6800 is now 7 times faster.

But let's start with ROCm. If you have ever started Stable Diffusion, you know it depends on PyTorch. On this webpage you can see what's required for PyTorch; you'll find all links in the description below. We choose the nightly build, as it includes the most current versions, select Windows, and Pip as the package manager for Python. If you now select ROCm, you can see that ROCm is not available on Windows. But what does that mean? Will ROCm be available on Windows someday? To answer this question, we'll have a look at the related GitHub page. There we find an issue called "ROCm support for Windows". That's quite promising, but as you can see, the issue is still open. Nevertheless, it looks like they really want to support Windows. If we scroll down a bit, we can see that Windows support for PyTorch is currently not available and that AMD is continuing to invest in Windows support. For details, there is a link.

Behind this link you find the list of all ROCm components, with the status for Linux and for Windows. As you can see, HIP is supported, and exactly that was AMD's announcement regarding ROCm. HIP is a runtime API that can be used to write portable code which runs on both AMD and NVIDIA GPUs. But again, it is not the component required for PyTorch, and so not the one required for Stable Diffusion. The component required by PyTorch is MIOpen, and as you can see, it is currently not available. On the PyTorch GitHub page there is another interesting comment: we already know that MIOpen is missing, and there is even a link to the related pull request for MIOpen. On the ROCm MIOpen GitHub page we can indeed find a pull request to enable MIOpen on Windows, and the good news is that it's already closed. But if we scroll down, we find that the work continues in a separate series of pull requests. Now we check the pull requests and look for items related to Windows. As you can see, there are currently still four pull requests to go. I had a look at the latest closed pull requests and there is indeed some movement; nevertheless, most of these pull requests took one to two months.

To summarize, it really seems that both the ROCm team and the PyTorch team want a running version of PyTorch based on ROCm for Windows, which is the requirement for Stable Diffusion. But as you have seen, on the ROCm side as well as on the PyTorch side there is still development to be done. Don't count on the release of PyTorch with ROCm on Windows in weeks; count it, well, in months. Nevertheless, you can now check yourself when the MIOpen development has been finished on the ROCm side: just check the open pull requests, and I'm sure it will also be announced in the MIOpen readme. There is another source which can be checked, the documentation of MIOpen, but I have the impression it's not updated very frequently.
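If you want to check programmatically whether an installed PyTorch build ships ROCm/HIP support, here is a minimal sketch. It only assumes that Python and some PyTorch build are installed; the attributes used are standard PyTorch, not something shown in the video.

    # Minimal check whether this PyTorch build was compiled against ROCm/HIP.
    # On Windows today this is expected to print None for the HIP version,
    # since no ROCm-enabled PyTorch build for Windows has been released yet.
    import torch

    print("PyTorch version:", torch.__version__)
    print("HIP/ROCm version:", torch.version.hip)        # None on non-ROCm builds
    print("GPU backend available:", torch.cuda.is_available())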
At least version 2.19 has already been released and it's not even listed there. Regarding PyTorch's ROCm support for Windows, check this webpage; and the official announcement for PyTorch with ROCm on Windows, well, I'm sure it will appear here. Now you know how to get the latest information about the Stable Diffusion requirement, PyTorch on ROCm for Windows. Or you can simply subscribe to this channel; I will tell you for sure.

In the second part of this video we want to speed up Stable Diffusion with AMD on Windows by using Microsoft Olive and ONNX. As preparation we have to install Git and Miniconda. The installation of both Git and Miniconda is straightforward, so you shouldn't expect any trouble here. For Git choose the 64-bit Windows setup, for Miniconda the Windows 64-bit version. After installing Miniconda you will find an Anaconda Prompt and an Anaconda PowerShell prompt on your PC; we will use the Anaconda Prompt. Change to a suitable directory. We create the conda environment for the Automatic1111 Olive WebUI, including Python 3.10, and activate the newly created environment. Now we clone a popular Olive-enabled version of the Stable Diffusion WebUI for Windows, enter the newly created directory, update the submodules and start the backend of the WebUI. The first run takes several minutes as it prepares the required packages. You open the WebUI by using this URL.

As you can see, the Stable Diffusion model failed to load. The reason is that we first have to create the ONNX model, so we change the tab to Olive. Microsoft Olive is a tool which optimizes a given source model, resulting in an ONNX-based model and a much faster toolchain built on this ONNX model. Under "Optimize ONNX model" the required information is already entered, so we can just choose "Optimize model using Olive". This will take some time. Don't be confused by the message about no GPU for the ONNX runtime: the conversion is done on the CPU, but once it is finished, the model will be accelerated by the GPU. After everything is finished we can simply refresh and choose the optimized checkpoint. Then we switch to text-to-image, enter a prompt and select DPM as the sampling method. We start generating, and with my graphics card, a Radeon 6800, we get nearly 4 iterations per second. Although the picture looks a bit strange, you can see it's working and it's quite fast. Without ONNX my Radeon 6800 needs around 1.7 seconds per iteration, which is roughly 0.6 iterations per second, so the described ONNX solution is about 7 times faster.

What's left is Stable Diffusion XL. Frankly speaking, this is currently not working with this package, but I'll show you what to do in case it gets finished and how to use Stable Diffusion XL with this UI. Go to Hugging Face, search for Stability AI, select the XL model and download the safetensors file from "Files and versions". Put the file into stable-diffusion-webui, models, Stable-diffusion. Go back to the UI and select Olive, then the optimized checkpoint. As we put the model into the right directory, the name of the subdirectory is not required here. This should be the XL version, and here we simply add the image size, which of course should be set to 1024. As the sampling method, I think DPM is best here. As of today you also have to disable the safety checker. Now press the optimize button. Similar to the Stable Diffusion model before, an optimized SDXL version is created. Well, on my machine it crashed, and as I've seen in the related Git repository, there is still some work to be done. The link to the repository is in the description below.
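As an alternative to downloading the SDXL weights in the browser, here is a minimal sketch using the huggingface_hub package. The repository ID and filename refer to the official SDXL base model on Hugging Face; the target path is an assumption and presumes the WebUI was cloned into the current directory, so adjust it to your setup.

    # Download the SDXL base checkpoint into the WebUI's model folder.
    # Assumptions: "huggingface_hub" is installed (pip install huggingface_hub)
    # and the WebUI lives in ./stable-diffusion-webui relative to this script.
    from huggingface_hub import hf_hub_download

    hf_hub_download(
        repo_id="stabilityai/stable-diffusion-xl-base-1.0",
        filename="sd_xl_base_1.0.safetensors",
        local_dir="stable-diffusion-webui/models/Stable-diffusion",
    )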
You might check yourself whether there has been some progress, so that you can even use the SDXL version. Side note: don't expect it to work with less than 16 GB of VRAM. So far we have only considered the Automatic1111 WebUI, and surely you are keen to know how to handle ComfyUI. In fact, I still haven't found a solution for running it on Windows; as of today, if you want to run ComfyUI with AMD, the only solution is Linux. If you want to try, check my video on how to run ComfyUI and the Automatic1111 WebUI with AMD on Linux. And if you don't have a Linux installation, check my video about creating an Ubuntu Linux USB flash drive. If this video has helped you speed up Stable Diffusion with your AMD graphics card, or you found the information about ROCm useful, please consider leaving a like or a comment.
Info
Channel: Next Tech and AI
Views: 6,903
Keywords: stable diffusion, stable diffusion tutorial, stable diffusion ai, stable diffusion xl, amd rocm, nexttechandai, automatic1111 stable diffusion, comfyui, automatic1111, stable diffusion on radeon, AMD GPU AI Tutorial, stable diffusion for amd, rocm windows, amd rocm windows, amd windows, amd stable diffusion windows, stable diffusion amd, rocm
Id: IBQH19BTh9w
Length: 11min 49sec (709 seconds)
Published: Sun Nov 05 2023