A step-by-step guide on how to install the OpenVINO fork of AUTOMATIC1111's Stable Diffusion WebUI. For Intel integrated GPUs.

Generating Images Faster on Potato Laptops

Also known as installing openvinotoolkit/stable-diffusion-webui and running it on Intel HD Graphics.

This Gist duplicates information from other sources, mainly openvinotoolkit's Installation on Intel Silicon instructions and AUTOMATIC1111's Automatic Installation on Windows instructions (both referenced again below).

If there's a discrepancy between this Gist and those websites, follow their instructions.

1: Contents

  • 2: Is This For Me?
  • 3: PowerShell, Python, and Git
  • 4: Installing openvinotoolkit/stable-diffusion-webui
  • 5: Usage
  • 6: Notes and Limitations

2: Is This For Me?

This guide is about installing

  • openvinotoolkit/stable-diffusion-webui
  • on a Windows machine
  • with an Intel integrated GPU

OpenVINO is a software toolkit from Intel that makes AI run faster on Intel hardware like CPUs, integrated GPUs, and Arc graphics cards. This guide is geared towards users of potato laptops since I have a potato laptop. If you have a gaming rig with a dedicated GPU, there's probably a more relevant guide available somewhere.

You'll need at least 30 GB of storage as well as 16 GB of RAM. The OpenVINO script does some caching and optimizations to make it run faster on your computer, but in doing so it takes a lot of storage and RAM.

If you were able to get AUTOMATIC1111/stable-diffusion-webui running, you probably don't need section 3. However, it has been added just in case.

3: PowerShell, Python, and Git

You'll need three tools before installing Stable Diffusion WebUI (either the AUTOMATIC1111 or openvinotoolkit versions):

  • PowerShell
  • Python
  • Git

3.1: PowerShell

PowerShell is a command-line interface, which basically means you can type stuff into the computer to make it do things. It's already included with every Windows computer.

The easiest way to open it is to use File Explorer to go to where you want to set up Stable Diffusion WebUI, then click File > Open Windows PowerShell (see images below). Make sure you're in the place you want to install the WebUI and that the drive has enough space (at least 30 GB) for the later steps. Any place works; you don't have to use D:\test2 as I have done.
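
If you're not sure the PowerShell window opened where you expected, a quick optional sanity check (not part of openvinotoolkit's instructions) is to print the current folder and the free space on its drive. The D here matches the D:\test2 example; substitute whatever drive you're using:

pwd
Get-PSDrive D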

πŸ–ΌοΈ Navigating to the installation place and opening PowerShell (Click to expand)


ℹ️ What's the difference between PowerShell and Command Prompt (cmd.exe)? (Click to expand)

For the purpose of this installation guide, they do pretty much the same things. Command Prompt just doesn't have an easy way to be opened from File Explorer in the folder where you want to install.

3.2: Python

Python is the programming language that Stable Diffusion WebUI uses. Specifically, it uses Gradio for the user interface and PyTorch for the number crunching and image generation. You don't need to know about any of this, and the installation files will take care of the dependencies, but you will need to install Python yourself first.

AUTOMATIC1111's Automatic Installation on Windows instructions also tell you to install Python and Git. Steps 1 and 2 there are equivalent to sections 3.2 and 3.3 here. However, don't do step 3 or later; that would install AUTOMATIC1111's version of Stable Diffusion WebUI, which isn't what you're here for.

Go to the Python 3.10.6 download site and scroll to the bottom. You'll want the Windows installer (64-bit). When you start it up, enable Add Python 3.10 to PATH (see image below) before clicking Install Now. Follow the rest of the instructions.
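
Once the installer finishes, you can confirm that Python is installed and on PATH by opening a new PowerShell window and running:

python --version

It should print Python 3.10.6.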

πŸ–ΌοΈ First page of Python 3.10.6 setup (Click to expand)


ℹ️ Why is Python 3.10.6 used? (Click to expand)

Everything in Stable Diffusion WebUI has been checked to make sure it works with Python 3.10.6. Other versions may work, but are not guaranteed to.

ℹ️ What does Add Python 3.10 to PATH do? (Click to expand)

PATH is something that tells Windows and PowerShell where to look for programs to run. Enabling that checkbox allows code to use:

python ...

instead of:

C:\wherever_you_installed_it\Python310\python.exe ...

Stable Diffusion WebUI expects Python 3.10 to be added to PATH.
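
If you're curious whether the installer actually added Python to PATH, here's an optional PowerShell one-liner that lists the PATH entries containing "Python" (just a check, not something the guide requires):

$env:Path -split ';' | Select-String Python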

3.3: Git

Git is a version control system (a way for software developers to keep track of how files change over time), but it only matters to you because it's the most convenient way to download openvinotoolkit/stable-diffusion-webui.

Just like in the Python section, AUTOMATIC1111's instructions also tell you to install Git. Go to the Git download site and click Click here to download to get the Git installer. You don't need to change any settings; just click through the installer with the defaults and Git will add itself to PATH.
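
As with Python, you can check that Git installed correctly and ended up on PATH by opening a new PowerShell window and running:

git --version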

4: Installing openvinotoolkit/stable-diffusion-webui

Here's where installation instructions for openvinotoolkit/stable-diffusion-webui differ from those for AUTOMATIC1111/stable-diffusion-webui.

4.1: How does it differ from AUTOMATIC1111/stable-diffusion-webui?

The folks behind openvinotoolkit have created a fork of AUTOMATIC1111's stable-diffusion-webui repository. This means they have their own version with files they added or changed (like making OpenVINO work), but the original version by AUTOMATIC1111 can still be downloaded by everyone else who doesn't have a potato laptop.

They've also included a few batch files (discussed in section 4.3 below) to make running their repository easier.

4.2: Cloning the Git repository

Now that you have PowerShell, Python, and Git, you can clone the openvinotoolkit/stable-diffusion-webui repository. That just means using Git to copy that stable-diffusion-webui folder to your computer.

These are basically openvinotoolkit's Installation on Intel Silicon instructions but reworded.

Open PowerShell again (remember to open it from the folder you want to install stable-diffusion-webui in) and paste:

git clone https://github.com/openvinotoolkit/stable-diffusion-webui.git

When it finishes making the stable-diffusion-webui folder (it shouldn't take more than a minute or two), paste:

cd stable-diffusion-webui

cd stands for "change directory"; this command moves you into the stable-diffusion-webui folder you just made. It's needed because the next two commands expect you to be in the same folder as the batch files they use.
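
Before running anything, you can optionally list the batch files in the folder to confirm you're in the right place (the exact set of .bat files may differ between versions of the repository):

Get-ChildItem *.bat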

4.3: Running batch files

A batch file is a text file containing a series of commands for the computer to run. Using a batch file means you don't have to type all the commands inside it manually.

Paste:

.\first-time-runner.bat

Note that .\ has been added to the front compared to openvinotoolkit's instructions. If you don't do that, PowerShell will complain (see image below).

πŸ–ΌοΈ PowerShell complaining that the command is sus (Click to expand)


ℹ️ What does first-time-runner.bat do? (Click to expand)

first-time-runner.bat changes webui-user.bat by removing the line:

set COMMANDLINE_ARGS=

and adding the lines

set COMMANDLINE_ARGS=--skip-torch-cuda-test --precision full --no-half
set PYTORCH_TRACING_MODE=TORCHFX

--skip-torch-cuda-test: CUDA is Nvidia's GPU computing platform, which you probably don't have if you're following this guide. Without this argument, the WebUI will complain that you don't have CUDA and refuse to start, even though openvinotoolkit's Stable Diffusion WebUI is modified to not need CUDA anyway.

--precision full and --no-half: The WebUI tries to use half-precision floating point numbers instead of full precision. They take up half the memory, are sometimes faster to calculate, and the loss in precision is usually tolerable. Unfortunately, CPUs and integrated GPUs tend not to have good half-precision support, so these arguments force full precision.
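
For reference, after first-time-runner.bat has run, webui-user.bat should look roughly like this. This is a sketch based on the standard webui-user.bat layout; the other lines in your copy may differ:

@echo off
rem usual webui-user.bat skeleton (may differ in your copy)
set PYTHON=
set GIT=
set VENV_DIR=
rem lines added by first-time-runner.bat
set COMMANDLINE_ARGS=--skip-torch-cuda-test --precision full --no-half
set PYTORCH_TRACING_MODE=TORCHFX
call webui.bat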

This took 15 minutes on my machine. You should see something like the following after it downloads the Stable Diffusion v1.5 model and finishes:

Downloading: "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors" to D:\test2\stable-diffusion-webui\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors

100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3.97G/3.97G [03:22<00:00, 21.1MB/s]
Calculating sha256 for D:\test2\stable-diffusion-webui\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors: Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 846.6s (launcher: 620.4s, import torch: 10.5s, import gradio: 2.4s, setup paths: 3.1s, other imports: 3.5s, setup codeformer: 0.2s, list SD models: 202.6s, load scripts: 2.9s, create ui: 0.8s, gradio launch: 0.1s).
6ce0161689b3853acaa03779ec93eafe75a02f4ced659bee03f50797806fa2fa
Loading weights [6ce0161689] from D:\test2\stable-diffusion-webui\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors
Creating model from config: D:\test2\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying attention optimization: InvokeAI... done.
Model loaded in 15.5s (calculate hash: 12.1s, load weights from disk: 0.3s, create model: 0.8s, apply weights to model: 1.9s, calculate empty prompt: 0.4s).

When you see that, close PowerShell by clicking the X in the top right. Don't do the normal WebUI procedure of visiting http://127.0.0.1:7860 in your web browser yet.

ℹ️ What happens if I use the WebUI without running torch-install.bat? (Click to expand)

first-time-runner.bat installs PyTorch version 2.0.1+cu118 (built for Nvidia CUDA), which is the WebUI's default version. With that version the OpenVINO script won't work, and if you try to generate without the script, the WebUI will fall back to your CPU and be very slow.

Navigate to stable-diffusion-webui (you can do this in File Explorer so you don't have to run a cd command), open a new instance of PowerShell there, and paste:

.\torch-install.bat

This took 3 minutes on my machine. You should see Press any key to continue . . . when it finishes.

ℹ️ What does torch-install.bat do? (Click to expand)

torch-install.bat uninstalls PyTorch version 2.0.1+cu118 and installs version 2.1.0.dev20230726+cpu. This is the version the OpenVINO script needs.

It also adds the --skip-prepare-environment argument to webui-user.bat which turns off various things related to setup.
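
If you're curious, the effect is roughly equivalent to activating the WebUI's venv and swapping the PyTorch build by hand. This is only a sketch of the idea, not the batch file's actual contents, and the exact package list and index URL are assumptions (that dated nightly wheel may no longer be hosted, which is why the batch file is the reliable route):

# rough equivalent of torch-install.bat (assumption; the real batch file may differ)
.\venv\Scripts\Activate.ps1
pip uninstall -y torch
pip install torch==2.1.0.dev20230726+cpu --index-url https://download.pytorch.org/whl/nightly/cpu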

Congratulations, you're finished setting up your WebUI. You can now close PowerShell, or keep it open to actually start the WebUI (see section 5 below).

5: Usage

Every time you want to start the WebUI, open a PowerShell window in your stable-diffusion-webui folder and paste:

.\webui-user.bat

Then go to http://127.0.0.1:7860 in your web browser to start generating.
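
If opening File Explorer and then PowerShell in the right folder every time gets tedious, you can also do both steps from any PowerShell window in one line. This uses the D:\test2 location from this guide's screenshots; substitute your own install path:

cd D:\test2\stable-diffusion-webui; .\webui-user.bat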

The thing to keep in mind with openvinotoolkit/stable-diffusion-webui is that it only speeds up image generation when you use the included script. To use the Accelerate with OpenVINO script, go to openvinotoolkit's documentation or follow the instructions here (see images below).

πŸ–ΌοΈ Instructions for using Accelerate with OpenVINO (Click to expand)

Click this down arrow to open the script selection.


Choose the Accelerate with OpenVINO script.


Choose the v1-inference.yaml and either the CPU or GPU setting. You'll want to use GPU since it's faster.


You can now type a prompt, change other settings (although some combinations of settings and the script will not work or will run slowly), and generate images.

6: Notes and Limitations

Since OpenVINO support through the script requires diverting some parts of the Stable Diffusion workflow, there are a lot of things to keep in mind when using it.

6.1: Miscellaneous Notes

  • openvinotoolkit/stable-diffusion-webui uses a significant amount of storage.
    • It caches models to your stable-diffusion-webui\cache folder to speed up optimization in future WebUI startups (see the cleanup sketch after this list if you need that space back).
    • Even though openvinotoolkit's WebUI is only about 7 GB right after installation, trying both the CPU and GPU options with just the default Stable Diffusion v1.5 model grew my stable-diffusion-webui folder to 24 GB.
    • If you use other models and loras (see limitations below), you may need even more storage.
  • It also uses a lot of RAM during the optimization phase.
    • I recommend 16 GB: 4 for Windows, 8 for the running part of the OpenVINO script, 1 for your web browser, and some extra just in case.
    • Even with 16 GB, I still ran out of memory during the optimization part and had to offload to the page file (I saw about 22 GB of committed memory).
    • I think I saw 40 GB of committed memory once while messing with batch sizes (also see limitations below) and loras. The thrashing from having to load and offload parts of memory made it even slower than not using the OpenVINO script at all.
  • Changing batch size (and sampler, also see limitations below) requires doing another round of optimization and caching.
    • Furthermore, I've never been able to get batch size > 1 to work on GPU. It seems to fall back to CPU, use huge amounts of RAM, and slow down to the point of taking 40 minutes to generate 1 image.
    • If this happens to you, I recommend keeping batch size at 1 and increasing batch count to however many images you need. This is the opposite of the general rule in A1111, which is to maximize batch size for a shorter overall running time.
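
On the cache folder mentioned above: if you need that storage back, you can delete the folder from PowerShell and let the WebUI rebuild it later, at the cost of re-doing the optimization the next time you use the script. Assuming your PowerShell window is in the stable-diffusion-webui folder:

Remove-Item -Recurse -Force .\cache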

6.2: Speed Results During Testing

I have an i7-8550U laptop with 16 GB of RAM. It runs at 15 watts and has Intel UHD 620 integrated graphics.

An "image" here is a 512 x 512 image generated with 20 steps of the Euler a sampler. Optimization time is the difference between the 1st image and 2nd image and is how long it takes the script to get ready to generate. This time was determined on a fresh install, so no caching is involved. The time may be shorter in subsequent runs.

  • PyTorch version 2.0.1+cu118 : 10:45, 32.29s/it
  • PyTorch version 2.1.0.dev20230726+cpu : 09:06, 27.33s/it
    • Accelerate with OpenVINO:
      • CPU: 02:06 optimization time + 04:41 generation time, 14.09s/it
      • GPU: 03:20 optimization time + 02:00 generation time, 6.02s/it

Excluding the optimization before the first image, this WebUI is 5.36x as fast as A1111 in generating 1 image.

Including the optimization, this WebUI is 2.02x as fast. This ratio increases as batch count increases.
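
These ratios are consistent with the per-iteration times above: 32.29 s/it divided by 6.02 s/it gives about 5.36 for generation alone. Including optimization, 20 steps at 32.29 s/it is roughly 646 s, while 200 s of optimization plus 20 steps at 6.02 s/it is roughly 320 s, and 646 / 320 is about 2.02.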

Oddly, the PyTorch version that openvinotoolkit's WebUI uses slightly speeds up image generation even without using the OpenVINO script.

6.3: Limitations

  • The script does not work with the progress bar or live previews. You get a progress bar update every time an image finishes, but no visual output.
  • Only the samplers inside the script's settings box are guaranteed to work. Your choice is limited compared to what A1111 provides.
  • The script cannot be used at the same time as any other script (Prompt matrix, X/Y/Z plot, etc.)
  • Only one lora can be used at once. If you try to use more than one lora, only the first will be used.
  • Many other tools that have been developed by the Stable Diffusion community don't work with the script. I'm not an expert in the latest Stable Diffusion workflows so I can't say much from personal experience.

On the first limitation above, I was able to get live preview working after doing some funny stuff to the OpenVINO script (diverting the data flow between sampling steps). An image is shown below. If people ask for it, I'll post the necessary modifications somewhere.

πŸ–ΌοΈ Example of Accelerate with OpenVINO and live preview (Click to expand)


@SauceChord

The installer (first-time-runner.bat) downloads torch in the first place; it is torch-2.0.1%2Bcu118-cp310-win_amd64.whl. After a while it finishes downloading and installing a bunch of stuff, and at the end it starts a WebUI session. The guide says not to try inference until torch-install.bat installs the proper torch instance for OpenVINO. And that is it: no torch-install.bat file is present in stable-diffusion-webui.

Did you get it working with OpenVINO acceleration? I'm stuck without OpenVINO acceleration.

