## Why Use Cloud GPUs for Stable Diffusion?
Stable Diffusion XL needs at least 8 GB of VRAM for basic generation and 12 GB or more for comfortable batch workflows. Instead of buying an expensive GPU outright, you can rent one by the hour.
## Choosing the Right GPU Tier
| Tier | GPU | VRAM | SDXL Performance | Price |
|---|---|---|---|---|
| Starter | RTX 3060 | 12 GB | ~15 sec/image | $0.40/hr |
| Standard | RTX 3090 | 24 GB | ~8 sec/image | $0.60/hr |
| Pro | RTX 4090 | 24 GB | ~4 sec/image | $0.90/hr |
| Power | A6000 | 48 GB | ~6 sec/image | $1.20/hr |
For most AI art workflows, the Standard tier (RTX 3090) is the sweet spot: fast enough for real-time iteration at a reasonable hourly price.
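A quick way to compare tiers is to fold hourly price and generation speed into a single cost-per-image figure. This sketch uses only the numbers from the table above:

```python
# Rough cost-per-image for each tier, using the figures from the table above.
tiers = {
    "Starter (RTX 3060)":  (15, 0.40),   # (sec/image, $/hr)
    "Standard (RTX 3090)": (8,  0.60),
    "Pro (RTX 4090)":      (4,  0.90),
    "Power (A6000)":       (6,  1.20),
}

cost_per_image = {}
for name, (sec_per_image, price_per_hr) in tiers.items():
    images_per_hr = 3600 / sec_per_image
    cost_per_image[name] = price_per_hr / images_per_hr
    print(f"{name}: ~{images_per_hr:.0f} images/hr, ~${cost_per_image[name]:.4f}/image")
```

Interestingly, on these numbers the Pro tier is actually cheapest per image; Standard wins on hourly rate while still iterating quickly.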
## Setting Up ComfyUI
After deploying your machine and connecting via RDP:
```bash
# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Install dependencies (CUDA 12.1 build of PyTorch)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# Download the SDXL base model into the checkpoints folder
cd models/checkpoints
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

# Start ComfyUI
cd ../..
python main.py
```
Open a browser on the remote machine to http://localhost:8188 and you're ready to generate.
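If you'd rather script generations than click through the UI, ComfyUI also accepts workflows over HTTP on the same port. A minimal sketch, assuming you've exported a workflow with the UI's "Save (API Format)" option; the filename `workflow_api.json` is just an example:

```python
# Minimal sketch: queue a workflow through ComfyUI's HTTP /prompt endpoint.
import json
import urllib.request

SERVER = "http://localhost:8188"

def build_request(workflow, server=SERVER):
    """Build the POST request ComfyUI's /prompt endpoint expects."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def queue_prompt(workflow, server=SERVER):
    """Queue the workflow and return the server's JSON response."""
    with urllib.request.urlopen(build_request(workflow, server)) as resp:
        return json.load(resp)
```

Calling `queue_prompt(json.load(open("workflow_api.json")))` queues a render; the response includes a prompt ID you can use to track the job in the queue.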
## LoRA Training on TurboGPU
The Power tier (A6000, 48 GB VRAM) is perfect for training custom LoRA models:
```bash
# Install the kohya_ss training scripts
git clone https://github.com/kohya-ss/sd-scripts.git
cd sd-scripts
pip install -r requirements.txt

# Train an SDXL LoRA from 20-30 images in ~20 minutes
# (SDXL uses the dedicated sdxl_train_network.py script)
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path="sd_xl_base_1.0.safetensors" \
  --train_data_dir="./training_images" \
  --output_dir="./output" \
  --network_module=networks.lora \
  --max_train_epochs=10
```
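Training time is driven by the total optimizer step count: images × repeats × epochs, divided by batch size. A back-of-the-envelope sketch; the 25 images, 10 repeats (set via kohya's folder-naming convention, e.g. `10_mystyle`), and ~0.5 s/step figure are illustrative assumptions, not measured values:

```python
# Back-of-the-envelope estimate of optimizer steps for a LoRA run.
def total_steps(num_images, repeats, epochs, batch_size=1):
    """Steps = (images * repeats per image) / batch size, per epoch."""
    steps_per_epoch = (num_images * repeats) // batch_size
    return steps_per_epoch * epochs

steps = total_steps(num_images=25, repeats=10, epochs=10)
print(steps)  # 2500 steps; at ~0.5 s/step that lines up with the ~20 minute figure
```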
## Cost for a Typical Session
A 2-hour AI art session on the Standard tier costs $1.20 (2 × $0.60/hr). At ~8 seconds per image, that's enough time to generate several hundred images. Compare that to buying an RTX 3090 outright for $800+.
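The rent-vs-buy arithmetic comes down to a break-even point, which you can work out directly from the numbers above:

```python
# Break-even: hours of Standard-tier rental that equal a one-time GPU purchase.
GPU_PRICE = 800.00    # RTX 3090, approximate street price from above
RENTAL_RATE = 0.60    # Standard tier, $/hr

breakeven_hours = GPU_PRICE / RENTAL_RATE
print(f"~{breakeven_hours:.0f} hours of rendering before buying wins")
```

That's over 1,300 hours of actual GPU time, before counting the owned card's power, cooling, and depreciation.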
