Running Stable Diffusion XL on Cloud GPUs: Complete Setup

Why SDXL on Cloud GPUs?

Stable Diffusion XL (SDXL) is one of the most capable open-weight image generation models available. It produces detailed 1024x1024 images that compete with output from Midjourney and DALL-E 3.

Running SDXL locally requires a good GPU (12GB+ VRAM), but cloud GPUs let you:

Generate faster: An L40S generates a 1024x1024 image in ~3 seconds
Batch process: Create thousands of images without heating your room
Access anywhere: Generate from your phone, tablet, or laptop
Scale up: Use multiple GPUs for parallel generation

GPU Requirements

GPU	VRAM	Time per Image	Cost/Hour	Cost per 1000 Images
RTX 4090	24 GB	~4 sec	$0.55	$0.61
L40S	48 GB	~3 sec	$0.90	$0.75
A100 40GB	40 GB	~2.5 sec	$1.20	$0.83
H100	80 GB	~1.5 sec	$2.80	$1.17

For most users, the RTX 4090 or L40S offers the best price-performance for image generation: they have the lowest cost per 1000 images in the table above.
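
The last column of the table follows directly from the two before it. A minimal sketch of that arithmetic, using only the times and hourly prices listed above (the function name `cost_per_1000_images` is illustrative, not part of any library):

```python
def cost_per_1000_images(seconds_per_image: float, dollars_per_hour: float) -> float:
    """Cost in dollars of generating 1000 images back to back on one GPU."""
    gpu_hours = seconds_per_image * 1000 / 3600
    return gpu_hours * dollars_per_hour


# (seconds per image, dollars per hour), copied from the table above
gpus = {
    "RTX 4090": (4.0, 0.55),
    "L40S": (3.0, 0.90),
    "A100 40GB": (2.5, 1.20),
    "H100": (1.5, 2.80),
}

for name, (seconds, rate) in gpus.items():
    print(f"{name}: ${cost_per_1000_images(seconds, rate):.2f} per 1000 images")
```

The calculation assumes the GPU is busy for the whole billed period; model download, warm-up, and idle time are not included.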

Option 1: ComfyUI (Recommended)

ComfyUI is a node-based interface that gives you complete control over the generation pipeline.

Installation

# SSH into your GPU instance
ssh root@YOUR_INSTANCE_IP

# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# Download SDXL model
mkdir -p models/checkpoints
cd models/checkpoints
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

# Download SDXL Refiner (optional, for better quality)
wget https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/resolve/main/sd_xl_refiner_1.0.safetensors

Run ComfyUI

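# Start ComfyUI with remote access
# (--listen 0.0.0.0 accepts connections from any address, so restrict port 8188 in your firewall)
cd ~/ComfyUI
source venv/bin/activate  # needed again if you opened a new shell
python main.py --listen 0.0.0.0 --port 8188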

Open http://YOUR_INSTANCE_IP:8188 in your browser.
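
Because the server listens on the network, it can also be driven without the browser. The sketch below assumes ComfyUI's `POST /prompt` endpoint, as used by the API examples shipped in the ComfyUI repository, and a workflow exported from the UI in API format. The helper names `build_queue_request` and `queue_workflow` and the file name `workflow_api.json` are hypothetical.

```python
import json
from urllib import request


def build_queue_request(host: str, workflow: dict, port: int = 8188) -> request.Request:
    """Build the HTTP request that queues `workflow` on a ComfyUI server."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return request.Request(
        f"http://{host}:{port}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(host: str, workflow_path: str) -> dict:
    """Load an API-format workflow from disk, queue it, and return the server's JSON reply."""
    with open(workflow_path, encoding="utf-8") as f:
        workflow = json.load(f)
    with request.urlopen(build_queue_request(host, workflow), timeout=30) as response:
        return json.load(response)
```

From a laptop this would be called as `queue_workflow("YOUR_INSTANCE_IP", "workflow_api.json")`; the generated images are written to ComfyUI's output folder on the instance.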

💡 Pro Tip: Use Custom Nodes

Install ComfyUI Manager for easy access to hundreds of custom nodes: ControlNet, IP-Adapter, face restoration, and more.

Option 2: Automatic1111 WebUI

The classic SD interface, feature-rich and beginner-friendly.

# Clone A1111
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui

# Download SDXL
mkdir -p models/Stable-diffusion
cd models/Stable-diffusion
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

# Run with remote access
# (-f lets webui.sh run as root, which it refuses to do by default)
cd ~/stable-diffusion-webui
./webui.sh -f --listen --xformers --enable-insecure-extension-access

Option 3: API-Only with Diffusers

For programmatic access, use the Diffusers library:

from diffusers import StableDiffusionXLPipeline
import torch

# Load SDXL
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16"
)
pipe = pipe.to("cuda")

# Enable optimizations (requires the xformers package: pip install xformers)
pipe.enable_xformers_memory_efficient_attention()

# Generate image
prompt = "A majestic lion in a cyberpunk city, neon lights, rain, 8k, detailed"
image = pipe(
    prompt=prompt,
    num_inference_steps=30,
    guidance_scale=7.5,
    width=1024,
    height=1024
).images[0]

image.save("lion_cyberpunk.png")
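
For batch processing, the same pipeline can take a list of prompts per call. The following is a sketch under stated assumptions, not a tuned implementation: `chunked`, `generate_all`, the `outputs` directory, and the batch size of 4 are illustrative choices, and `pipe` is the pipeline object loaded above.

```python
from pathlib import Path
from typing import Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` containing at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def generate_all(pipe, prompts: List[str], out_dir: str = "outputs",
                 batch_size: int = 4, seed: int = 0) -> int:
    """Run `pipe` over `prompts` in batches, save numbered PNGs, return the image count."""
    import torch  # imported here so chunked() can be used on machines without PyTorch

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    index = 0
    for batch in chunked(prompts, batch_size):
        # One seeded generator per prompt makes every image reproducible on its own.
        generators = [
            torch.Generator("cuda").manual_seed(seed + index + offset)
            for offset in range(len(batch))
        ]
        images = pipe(prompt=batch, generator=generators, num_inference_steps=30).images
        for image in images:
            image.save(out / f"image_{index:05d}.png")
            index += 1
    return index
```

Larger batches raise throughput until VRAM runs out, so the batch size is worth adjusting to the GPU in use.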

Performance Optimization

1. Use FP16 Precision

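Loading the weights in FP16 (torch_dtype=torch.float16, as in the Diffusers example above) roughly halves VRAM usage compared with FP32, with minimal quality loss.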
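
A back-of-the-envelope estimate shows where the saving comes from. The parameter count below is an assumption (about 3.5 billion for the SDXL base pipeline, the approximate figure usually quoted), and the estimate covers weights only, not activations or other runtime buffers.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the model weights, in GiB."""
    return num_params * bytes_per_param / 2**30


SDXL_BASE_PARAMS = 3.5e9  # assumption: approximate parameter count of the base pipeline

fp32_gib = weight_memory_gib(SDXL_BASE_PARAMS, 4)  # 4 bytes per FP32 value
fp16_gib = weight_memory_gib(SDXL_BASE_PARAMS, 2)  # 2 bytes per FP16 value

print(f"FP32 weights: {fp32_gib:.1f} GiB")
print(f"FP16 weights: {fp16_gib:.1f} GiB")
```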

2. Enable xFormers

pipe.enable_xformers_memory_efficient_attention()

3. Compile the Model (PyTorch 2.0+)

pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead")
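
A compiled model is slow on its first calls because compilation happens lazily, so any speed comparison should discard warm-up runs. A minimal timing sketch (the `benchmark` helper and its defaults are illustrative):

```python
import time
from statistics import mean
from typing import Callable, List


def benchmark(fn: Callable[[], object], warmup: int = 2, runs: int = 5) -> float:
    """Return the mean wall-clock seconds per call of `fn`, ignoring warm-up calls."""
    for _ in range(warmup):
        fn()  # early calls may trigger compilation and are not representative
    timings: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return mean(timings)
```

With the pipeline from Option 3 this could be used as `benchmark(lambda: pipe(prompt=prompt, num_inference_steps=30))`, once before and once after enabling an optimization.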

4. Use SDXL Turbo for Speed

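SDXL Turbo generates in just 1-4 steps (vs 30+ for base SDXL). Note that it is trained for 512x512 output rather than 1024x1024:

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16,
    variant="fp16"
).to("cuda")
# Turbo runs without classifier-free guidance, hence guidance_scale=0.0
image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]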

Cost Comparison

Generating 10,000 images at 1024x1024:

Service	Cost per Image	10,000 Images
Midjourney	$0.04	$400
DALL-E 3	$0.04	$400
Stability API	$0.02	$200
GPUBrazil (RTX 4090)	$0.0006	$6

Running SDXL yourself is roughly 66x cheaper than using Midjourney ($400 vs. $6), counting GPU time only.
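
The multiplier is simply the ratio of the totals in the table. A short sketch of that arithmetic, using the table's figures as given (the name `savings_factor` is illustrative):

```python
def savings_factor(api_total: float, self_hosted_total: float) -> float:
    """How many times cheaper self-hosting is than an API for the same image count."""
    return api_total / self_hosted_total


# Totals for 10,000 images, copied from the table above
api_totals = {"Midjourney": 400.0, "DALL-E 3": 400.0, "Stability API": 200.0}
self_hosted_total = 6.0  # RTX 4090 row

for service, total in api_totals.items():
    print(f"{service}: {savings_factor(total, self_hosted_total):.1f}x the self-hosted cost")
```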

Popular Extensions & Models

ControlNet: Guide generation with poses, edges, depth maps
IP-Adapter: Use reference images for style/face consistency
AnimateDiff: Generate animations from prompts
SDXL Lightning: 4-step generation with great quality
Juggernaut XL: Popular fine-tuned SDXL for photorealism

Conclusion

Running SDXL on cloud GPUs gives you professional-grade AI image generation at a fraction of the cost of commercial APIs. With ComfyUI or A1111, you also keep full control over the generation pipeline.