LocalAI/pkg
Ettore Di Giacinto 2d64269763
feat: Add backend gallery (#5607)
* feat: Add backend gallery

This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dockerfile upper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is ponly available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accellerators deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-15 14:56:52 +02:00
..
assets fix: use rice when embedding large binaries (#5309) 2025-05-04 16:42:42 +02:00
audio feat: Realtime API support reboot (#5392) 2025-05-25 22:25:05 +02:00
concurrency chore: update jobresult_test.go (#4124) 2024-11-12 08:52:18 +01:00
downloader fix: typos (#5376) 2025-05-16 12:45:48 +02:00
functions Improve Comments and Documentation for MixedMode and ParseJSON Functions (#5626) 2025-06-11 09:46:53 +02:00
grpc Fix Typos in Comments and Error Messages (#5637) 2025-06-12 18:34:32 +02:00
langchain feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants (#2232) 2024-05-04 17:56:12 +02:00
library fix: use rice when embedding large binaries (#5309) 2025-05-04 16:42:42 +02:00
model feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00
oci chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
sound feat: Realtime API support reboot (#5392) 2025-05-25 22:25:05 +02:00
startup feat: Add backend gallery (#5607) 2025-06-15 14:56:52 +02:00
store chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
templates feat(llama.cpp): add support for audio input (#5466) 2025-05-26 16:06:03 +02:00
utils fix: adapt test to error changes 2025-05-30 17:43:59 +02:00
xsync chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
xsysinfo feat: improve RAM estimation by using values from summary (#5525) 2025-06-05 19:16:26 +02:00