LocalAI/backend/index.yaml
Ettore Di Giacinto 2d64269763
feat: Add backend gallery (#5607)
* feat: Add backend gallery

This PR adds support for managing backends in the same way as models. A
backend gallery is now available and can be used to install and remove
extra backends.
The backend gallery is configured like a model gallery, and API calls can
install and remove backends at runtime as well as during the startup
phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dockerfile upper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is only available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accelerators deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-15 14:56:52 +02:00

- name: "cuda11-rerankers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-rerankers"
  alias: "cuda11-rerankers"
- name: "cuda11-vllm"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-vllm"
  alias: "cuda11-vllm"
- name: "cuda11-transformers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-transformers"
  alias: "cuda11-transformers"
- name: "cuda11-diffusers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-diffusers"
  alias: "cuda11-diffusers"
- name: "cuda11-exllama2"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-exllama2"
  alias: "cuda11-exllama2"
- name: "cuda12-rerankers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-rerankers"
  alias: "cuda12-rerankers"
- name: "cuda12-vllm"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-vllm"
  alias: "cuda12-vllm"
- name: "cuda12-transformers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-transformers"
  alias: "cuda12-transformers"
- name: "cuda12-diffusers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-diffusers"
  alias: "cuda12-diffusers"
- name: "cuda12-exllama2"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-exllama2"
  alias: "cuda12-exllama2"
- name: "rocm-rerankers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-rerankers"
  alias: "rocm-rerankers"
- name: "rocm-vllm"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-vllm"
  alias: "rocm-vllm"
- name: "rocm-transformers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-transformers"
  alias: "rocm-transformers"
- name: "rocm-diffusers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-diffusers"
  alias: "rocm-diffusers"
- name: "intel-sycl-f32-rerankers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-rerankers"
  alias: "intel-sycl-f32-rerankers"
- name: "intel-sycl-f16-rerankers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-rerankers"
  alias: "intel-sycl-f16-rerankers"
- name: "intel-sycl-f32-vllm"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-vllm"
  alias: "intel-sycl-f32-vllm"
- name: "intel-sycl-f16-vllm"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-vllm"
  alias: "intel-sycl-f16-vllm"
- name: "intel-sycl-f32-transformers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-transformers"
  alias: "intel-sycl-f32-transformers"
- name: "intel-sycl-f16-transformers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-transformers"
  alias: "intel-sycl-f16-transformers"
- name: "intel-sycl-f32-diffusers"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-diffusers"
  alias: "intel-sycl-f32-diffusers"
- name: "cuda11-rerankers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-rerankers"
  alias: "rerankers"
- name: "cuda11-vllm-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-vllm"
  alias: "vllm"
- name: "cuda11-transformers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-transformers"
  alias: "transformers"
- name: "cuda11-diffusers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-diffusers"
  alias: "diffusers"
- name: "cuda11-exllama2-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-exllama2"
  alias: "exllama2"
- name: "cuda12-rerankers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-rerankers"
  alias: "rerankers"
- name: "cuda12-vllm-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-vllm"
  alias: "vllm"
- name: "cuda12-transformers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-transformers"
  alias: "transformers"
- name: "cuda12-diffusers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-diffusers"
  alias: "diffusers"
- name: "cuda12-exllama2-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-exllama2"
  alias: "exllama2"
- name: "rocm-rerankers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-rerankers"
  alias: "rerankers"
- name: "rocm-vllm-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-vllm"
  alias: "vllm"
- name: "rocm-transformers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-transformers"
  alias: "transformers"
- name: "rocm-diffusers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-diffusers"
  alias: "diffusers"
- name: "intel-sycl-f32-rerankers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-rerankers"
  alias: "rerankers"
- name: "intel-sycl-f16-rerankers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-rerankers"
  alias: "rerankers"
- name: "intel-sycl-f32-vllm-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-vllm"
  alias: "vllm"
- name: "intel-sycl-f16-vllm-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-vllm"
  alias: "vllm"
- name: "intel-sycl-f32-transformers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-transformers"
  alias: "transformers"
- name: "intel-sycl-f16-transformers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-transformers"
  alias: "transformers"
- name: "intel-sycl-f32-diffusers-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-diffusers"
  alias: "diffusers"
- name: "cuda11-kokoro-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-kokoro"
  alias: "kokoro"
- name: "cuda12-kokoro-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-kokoro"
  alias: "kokoro"
- name: "rocm-kokoro-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-kokoro"
  alias: "kokoro"
- name: "sycl-f32-kokoro"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-kokoro"
  alias: "kokoro"
- name: "sycl-f16-kokoro"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-kokoro"
  alias: "kokoro"
- name: "sycl-f16-kokoro-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-kokoro"
  alias: "kokoro"
- name: "sycl-f32-kokoro-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-kokoro"
  alias: "kokoro"
- name: "cuda11-faster-whisper-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-faster-whisper"
  alias: "faster-whisper"
- name: "cuda12-faster-whisper-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-faster-whisper"
  alias: "faster-whisper"
- name: "rocm-faster-whisper-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-faster-whisper"
  alias: "faster-whisper"
- name: "sycl-f32-faster-whisper"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-faster-whisper"
  alias: "faster-whisper"
- name: "sycl-f16-faster-whisper"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-faster-whisper"
  alias: "faster-whisper"
- name: "sycl-f32-faster-whisper-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-faster-whisper"
  alias: "faster-whisper"
- name: "sycl-f16-faster-whisper-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-faster-whisper"
  alias: "faster-whisper"
- name: "cuda11-coqui-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-coqui"
  alias: "coqui"
- name: "cuda12-coqui-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-coqui"
  alias: "coqui"
- name: "rocm-coqui-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-coqui"
  alias: "coqui"
- name: "sycl-f32-coqui"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-coqui"
  alias: "coqui"
- name: "sycl-f16-coqui"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-coqui"
  alias: "coqui"
- name: "sycl-f32-coqui-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-coqui"
  alias: "coqui"
- name: "sycl-f16-coqui-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-coqui"
  alias: "coqui"
- name: "cuda11-bark-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-bark"
  alias: "bark"
- name: "cuda12-bark-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-bark"
  alias: "bark"
- name: "rocm-bark-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-bark"
  alias: "bark"
- name: "sycl-f32-bark"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-bark"
  alias: "bark"
- name: "sycl-f16-bark"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-bark"
  alias: "bark"
- name: "sycl-f32-bark-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-bark"
  alias: "bark"
- name: "sycl-f16-bark-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-bark"
  alias: "bark"
- name: "cuda11-chatterbox-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-chatterbox"
  alias: "chatterbox"
- name: "cuda12-chatterbox-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-chatterbox"
  alias: "chatterbox"
- name: "cuda11-chatterbox"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-chatterbox"
  alias: "chatterbox"
- name: "cuda12-chatterbox"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-chatterbox"
  alias: "chatterbox"
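
In the index above, several concrete entries share one `alias` (for example, every `*-vllm-master` entry uses `vllm`). The sketch below illustrates that relationship by grouping a few entries by alias and filtering them by a name prefix. It is only an illustration: the source does not say how LocalAI resolves an alias to a concrete backend, and the helper names `group_by_alias` and `candidates` are hypothetical, not part of LocalAI.

```python
from collections import defaultdict

# A few entries copied from the index, written as plain dicts so the
# sketch needs no YAML library.
ENTRIES = [
    {"name": "cuda12-vllm-master",
     "uri": "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-vllm",
     "alias": "vllm"},
    {"name": "rocm-vllm-master",
     "uri": "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-vllm",
     "alias": "vllm"},
    {"name": "cuda12-chatterbox",
     "uri": "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-chatterbox",
     "alias": "chatterbox"},
]


def group_by_alias(entries):
    """Map each alias to the list of concrete entry names that declare it."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["alias"]].append(entry["name"])
    return dict(groups)


def candidates(entries, alias, prefix):
    """Hypothetical helper: entries for `alias` whose name starts with `prefix`."""
    return [e for e in entries
            if e["alias"] == alias and e["name"].startswith(prefix)]


groups = group_by_alias(ENTRIES)
print(groups["vllm"])
print([e["uri"] for e in candidates(ENTRIES, "vllm", "rocm")])
```

An empty result from `candidates`, as for `chatterbox` with the `rocm` prefix, matches the commit note that chatterbox is only available for NVIDIA.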