DPO (Direct Preference Optimization) LoRA for XL and 1.5 - OpenRail++ - SDXL - V1.0

Example images:
A photorealistic portrait of a female warrior with red hair wearing detailed armor, holding a glowing sword and a blue shield with a red emblem, standing in a dense forest.
Close-up macro image of a bioluminescent extraterrestrial creature with iridescent blue and green feathers, large expressive eyes, and red markings, perched on a vivid red alien plant.
Young woman with a blonde razor-cut pixie haircut wearing a school uniform with a red tie, sitting elegantly in a red armchair in a living room.
Detailed Neo-Byzantine style circular ornament featuring ruby, sapphire, gold, and intricate floral mosaic patterns accented by silver leaves.
A majestic pointed mountain reflected in a crystal-clear lake with a dramatic fiery orange sunset sky in the background and rocky terrain in the foreground.
Colorful cute robot character with multiple arms.
A mountain temple surrounded by misty peaks and calm waters.

Recommended Prompts

RAW photo, a close-up picture of a cat, a close-up picture of a dog, orange eyes, blue eyes, reflection in its eyes

Recommended Parameters

Sampler: DPM2
Steps: 25
CFG: 5
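
As a rough illustration, here is how these recommended settings map onto a text-to-image call, assuming the diffusers library. DPM2 corresponds to diffusers' KDPM2DiscreteScheduler; the LoRA repository is the one mentioned later on this page, but the weight file name is a placeholder you should verify against that repository.

```python
import torch
from diffusers import StableDiffusionXLPipeline, KDPM2DiscreteScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# DPM2 maps to diffusers' KDPM2DiscreteScheduler.
pipe.scheduler = KDPM2DiscreteScheduler.from_config(pipe.scheduler.config)

# Repo from this page; weight_name is a placeholder, check the repo files.
pipe.load_lora_weights(
    "benjamin-paine/sd-dpo-offsets",
    weight_name="sd_xl_dpo_lora.safetensors",  # placeholder file name
)

image = pipe(
    "RAW photo, a close-up picture of a cat, orange eyes, reflection in its eyes",
    num_inference_steps=25,  # recommended steps
    guidance_scale=5.0,      # recommended CFG
).images[0]
image.save("dpo_sample.png")
```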


Check out the Pick-a-Pic v2 Dataset used for training: https://huggingface.co/datasets/yuvalkirstain/pickapic_v2

Read the original paper on DPO: https://huggingface.co/papers/2311.12908

Explore and download models on HuggingFace:
DPO Stable Diffusion XL: https://huggingface.co/mhdang/dpo-sdxl-text2image-v1
DPO Stable Diffusion 1.5: https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1

See source checkpoints and extracted LoRA at:
DPO SD1.5 on CivitAI: https://civitai.com/models/240850/sd15-direct-preference-optimization-dpo
DPO SDXL on CivitAI: https://civitai.com/models/238319/sd-xl-dpo-finetune-direct-preference-optimization

What is DPO?

DPO is Direct Preference Optimization, the name given to the process whereby a diffusion model is fine-tuned on pairs of human-chosen images. Meihua Dang et al. trained Stable Diffusion 1.5 and Stable Diffusion XL with this method on the Pick-a-Pic v2 Dataset, which can be found at https://huggingface.co/datasets/yuvalkirstain/pickapic_v2, and describe the results in the paper at https://huggingface.co/papers/2311.12908.
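
For intuition, here is a minimal PyTorch sketch of the Diffusion-DPO training objective from that paper, with the timestep-dependent weighting folded into a single beta constant for brevity; the variable names and that simplification are ours, not the authors'.

```python
import torch
import torch.nn.functional as F

def diffusion_dpo_loss(model_pred_w, model_pred_l,
                       ref_pred_w, ref_pred_l,
                       noise_w, noise_l, beta=5000.0):
    """model_pred_* / ref_pred_* are noise predictions of the trainable and
    frozen reference UNets on the preferred (w) and rejected (l) noised latents."""
    # Per-sample denoising error (mean squared error over all but the batch dim).
    def mse(pred, target):
        return ((pred - target) ** 2).mean(dim=list(range(1, pred.ndim)))

    # How much better the trainable model fits the preferred sample than the
    # rejected one, and the same quantity for the frozen reference model.
    model_diff = mse(model_pred_w, noise_w) - mse(model_pred_l, noise_l)
    ref_diff = mse(ref_pred_w, noise_w) - mse(ref_pred_l, noise_l)

    # Encourage the model to favor preferred samples more than the reference does.
    return -F.logsigmoid(-beta * (model_diff - ref_diff)).mean()
```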

What does it Do?

The DPO-trained models have been observed to produce higher-quality images than their untuned counterparts, with notably better adherence to the prompt. These LoRA can bring that prompt adherence to other fine-tuned Stable Diffusion models.
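
As a minimal sketch of that portability, the LoRA can be loaded on top of a different fine-tuned checkpoint, again assuming diffusers; the fine-tune name and the LoRA weight file name below are placeholders, not taken from this page.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "some-user/some-sd15-finetune",  # hypothetical SD 1.5 fine-tune
    torch_dtype=torch.float16,
).to("cuda")

pipe.load_lora_weights(
    "benjamin-paine/sd-dpo-offsets",            # repo mentioned below
    weight_name="sd_1.5_dpo_lora.safetensors",  # placeholder file name
)

# Optionally scale the LoRA's influence when generating.
image = pipe(
    "RAW photo, a close-up picture of a dog, blue eyes",
    cross_attention_kwargs={"scale": 0.8},
).images[0]
```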

Who Trained This?

These LoRA are based on the work of Meihua Dang (https://huggingface.co/mhdang), published at https://huggingface.co/mhdang/dpo-sdxl-text2image-v1 and https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1 and licensed under OpenRail++.

How were these LoRA Made?

They were created using Kohya SS by extracting them from other OpenRail++-licensed checkpoints on CivitAI and HuggingFace; a sketch of the extraction idea follows the links below.

1.5: https://civitai.com/models/240850/sd15-direct-preference-optimization-dpo extracted from https://huggingface.co/fp16-guy/Stable-Diffusion-v1-5_fp16_cleaned/blob/main/sd_1.5.safetensors.

XL: https://civitai.com/models/238319/sd-xl-dpo-finetune-direct-preference-optimization extracted from https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors.
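
For intuition, here is a rough sketch of the SVD-based extraction idea (not Kohya SS's actual code): subtract the base checkpoint's weights from the DPO-tuned weights and keep a low-rank approximation of the difference. The tensor shapes and rank are illustrative, and the real tool iterates over every UNet layer.

```python
import torch

def extract_lora_pair(base_weight: torch.Tensor,
                      tuned_weight: torch.Tensor,
                      rank: int = 32):
    """Return LoRA (down, up) matrices approximating tuned - base."""
    delta = (tuned_weight - base_weight).float()  # [out_dim, in_dim]
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    # Keep the top-`rank` singular directions, splitting sqrt(s) across factors.
    sqrt_s = torch.sqrt(s[:rank])
    lora_up = u[:, :rank] * sqrt_s           # [out_dim, rank]
    lora_down = sqrt_s[:, None] * vh[:rank]  # [rank, in_dim]
    return lora_down, lora_up

# Usage on one hypothetical linear layer's weights:
base = torch.randn(320, 768)
tuned = base + 0.01 * torch.randn(320, 768)
down, up = extract_lora_pair(base, tuned, rank=32)
approx = up @ down  # low-rank approximation of the weight delta
```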

These are also hosted on HuggingFace at https://huggingface.co/benjamin-paine/sd-dpo-offsets/.


Model Details

Model type: LORA
Base model: SDXL 1.0
Model version: SDXL - V1.0
Model hash: c100ec5708

