feat(img2vid,txt2vid): Initial support for img2vid,txt2vid (#1442)

* feat(img2vid): Initial support for img2vid

* doc(SD): fix SDXL Example

* Minor fixups for img2vid

* docs(img2img): fix example curl call

* feat(txt2vid): initial support

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* diffusers: remain backward-compatible with CUDA settings

* docs(img2vid, txt2vid): examples

* Add notice on docs

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Ettore Di Giacinto 2023-12-15 18:06:20 -05:00 committed by GitHub
parent fb6a5bc620
commit dd982acf2c
7 changed files with 150 additions and 27 deletions

@@ -147,7 +147,6 @@ backend: diffusers
# Force CPU usage - set to true for GPU
f16: false
diffusers:
pipeline_type: StableDiffusionXLPipeline
cuda: false # Enable for GPU usage (CUDA)
scheduler_type: euler_a
```
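For reference, this SDXL configuration can be exercised with a plain image-generation request. The sketch below is illustrative only: the model name `sdxl` is an assumption and should match whatever `name:` the config declares.

```bash
# Hypothetical request against the SDXL configuration above.
# "sdxl" is an assumed model name; use the `name:` your config declares.
curl -H "Content-Type: application/json" \
  -d '{"prompt": "a cute baby sea otter", "model": "sdxl", "size": "1024x1024"}' \
  http://localhost:8080/v1/images/generations
```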

@@ -15,7 +15,6 @@ backend: diffusers
# Force CPU usage - set to true for GPU
f16: false
diffusers:
pipeline_type: StableDiffusionXLPipeline
cuda: false # Enable for GPU usage (CUDA)
scheduler_type: dpm_2_a
```

@@ -27,12 +27,9 @@ name: animagine-xl
parameters:
model: Linaqruf/animagine-xl
backend: diffusers
# Force CPU usage - set to true for GPU
f16: false
cuda: true
f16: true
diffusers:
pipeline_type: StableDiffusionXLPipeline
cuda: false # Enable for GPU usage (CUDA)
scheduler_type: euler_a
```
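A request against the `animagine-xl` configuration above follows the same pattern as the other examples on this page; the prompt and size below are illustrative only.

```bash
# Generate an image with the animagine-xl configuration above.
# Prompt and size are illustrative only.
curl -H "Content-Type: application/json" \
  -d '{"prompt": "1girl, anime style, cherry blossoms", "model": "animagine-xl", "size": "1024x1024"}' \
  http://localhost:8080/v1/images/generations
```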
@@ -47,9 +44,9 @@ parameters:
backend: diffusers
step: 30
f16: true
cuda: true
diffusers:
pipeline_type: StableDiffusionPipeline
cuda: true
enable_parameters: "negative_prompt,num_inference_steps,clip_skip"
scheduler_type: "k_dpmpp_sde"
cfg_scale: 8
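Since this configuration enables `negative_prompt` through `enable_parameters`, a request can also carry a negative prompt. The sketch below is a hypothetical example: `my-txt2img-model` stands in for the config's `name:`, and it assumes the `prompt|negative_prompt` convention used for LocalAI's images endpoint.

```bash
# Hypothetical request: "my-txt2img-model" stands in for the `name:` of the
# configuration above. The text after "|" is assumed to be treated as the
# negative prompt (prompt|negative_prompt convention).
curl -H "Content-Type: application/json" \
  -d '{"prompt": "portrait of a cyborg, intricate detail|blurry, low quality", "model": "my-txt2img-model", "size": "512x512"}' \
  http://localhost:8080/v1/images/generations
```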
@@ -69,7 +66,7 @@ The following parameters are available in the configuration file:
| `scheduler_type` | Scheduler type | `k_dpp_sde` |
| `cfg_scale` | Configuration scale | `8` |
| `clip_skip` | Clip skip | None |
| `pipeline_type` | Pipeline type | `StableDiffusionPipeline` |
| `pipeline_type` | Pipeline type | `AutoPipelineForText2Image` |
Several scheduler types are available:
@@ -131,17 +128,16 @@ parameters:
model: nitrosocke/Ghibli-Diffusion
backend: diffusers
step: 25
cuda: true
f16: true
diffusers:
pipeline_type: StableDiffusionImg2ImgPipeline
cuda: true
enable_parameters: "negative_prompt,num_inference_steps,image"
```
```bash
IMAGE_PATH=/path/to/your/image
(echo -n '{"image": "'; base64 $IMAGE_PATH; echo '", "prompt": "a sky background","size": "512x512","model":"stablediffusion-edit"}') |
(echo -n '{"file": "'; base64 $IMAGE_PATH; echo '", "prompt": "a sky background","size": "512x512","model":"stablediffusion-edit"}') |
curl -H "Content-Type: application/json" -d @- http://localhost:8080/v1/images/generations
```
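If useful, the location of the generated image can be pulled out of the JSON response. This is a sketch assuming an OpenAI-compatible `{"data": [{"url": ...}]}` payload and that `jq` is installed.

```bash
# Hypothetical follow-up: extract the generated image URL from the response,
# assuming an OpenAI-compatible {"data": [{"url": ...}]} payload.
IMAGE_PATH=/path/to/your/image
(echo -n '{"file": "'; base64 $IMAGE_PATH; echo '", "prompt": "a sky background","size": "512x512","model":"stablediffusion-edit"}') |
curl -s -H "Content-Type: application/json" -d @- http://localhost:8080/v1/images/generations |
jq -r '.data[0].url'
```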
@@ -157,14 +153,67 @@ backend: diffusers
step: 50
# Half precision and CUDA enabled for GPU usage
f16: true
cuda: true
diffusers:
pipeline_type: StableDiffusionDepth2ImgPipeline
cuda: true
enable_parameters: "negative_prompt,num_inference_steps,image"
cfg_scale: 6
```
```bash
(echo -n '{"image": "'; base64 ~/path/to/image.jpeg; echo '", "prompt": "a sky background","size": "512x512","model":"stablediffusion-depth"}') |
(echo -n '{"file": "'; base64 ~/path/to/image.jpeg; echo '", "prompt": "a sky background","size": "512x512","model":"stablediffusion-depth"}') |
curl -H "Content-Type: application/json" -d @- http://localhost:8080/v1/images/generations
```
## img2vid
{{% notice note %}}
Experimental and available only on master builds. See: https://github.com/mudler/LocalAI/pull/1442
{{% /notice %}}
```yaml
name: img2vid
parameters:
model: stabilityai/stable-video-diffusion-img2vid
backend: diffusers
step: 25
# Half precision and CUDA enabled for GPU usage
f16: true
cuda: true
diffusers:
pipeline_type: StableVideoDiffusionPipeline
```
```bash
(echo -n '{"file": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png?download=true","size": "512x512","model":"img2vid"}') |
curl -H "Content-Type: application/json" -X POST -d @- http://localhost:8080/v1/images/generations
```
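A local input image should also be usable by sending it base64-encoded in the `file` field, mirroring the img2img example earlier on this page; this is an assumption about the new pipeline rather than documented behaviour.

```bash
# Hypothetical variant: send a local image base64-encoded in "file" instead of
# a URL, mirroring the img2img example above (untested assumption).
IMAGE_PATH=/path/to/your/image.png
(echo -n '{"file": "'; base64 $IMAGE_PATH; echo '", "size": "512x512","model":"img2vid"}') |
curl -H "Content-Type: application/json" -X POST -d @- http://localhost:8080/v1/images/generations
```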
## txt2vid
{{% notice note %}}
Experimental and available only on master builds. See: https://github.com/mudler/LocalAI/pull/1442
{{% /notice %}}
```yaml
name: txt2vid
parameters:
model: damo-vilab/text-to-video-ms-1.7b
backend: diffusers
step: 25
# Half precision and CUDA enabled for GPU usage
f16: true
cuda: true
diffusers:
pipeline_type: VideoDiffusionPipeline
cuda: true
```
```bash
(echo -n '{"prompt": "spiderman surfing","size": "512x512","model":"txt2vid"}') |
curl -H "Content-Type: application/json" -X POST -d @- http://localhost:8080/v1/images/generations
```