feat: Add backend gallery (#5607)

* feat: Add backend gallery

This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dockerfile upper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is ponly available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accellerators deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
This commit is contained in:
Ettore Di Giacinto 2025-06-15 14:56:52 +02:00 committed by GitHub
parent a7a6020328
commit 2d64269763
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
114 changed files with 3996 additions and 1382 deletions

17
go.mod
View file

@ -9,8 +9,10 @@ require (
github.com/GeertJohan/go.rice v1.0.3
github.com/Masterminds/sprig/v3 v3.3.0
github.com/alecthomas/kong v0.9.0
github.com/census-instrumentation/opencensus-proto v0.4.1
github.com/charmbracelet/glamour v0.7.0
github.com/chasefleming/elem-go v0.26.0
github.com/cncf/xds/go v0.0.0-20240723142845-024c85f92f20
github.com/containerd/containerd v1.7.19
github.com/dave-gray101/v2keyauth v0.0.0-20240624150259-c45d584d25e2
github.com/elliotchance/orderedmap/v2 v2.2.0
@ -23,9 +25,11 @@ require (
github.com/gofiber/template/html/v2 v2.1.2
github.com/gofiber/websocket/v2 v2.2.1
github.com/gofrs/flock v0.12.1
github.com/golang/protobuf v1.5.4
github.com/google/go-containerregistry v0.19.2
github.com/google/uuid v1.6.0
github.com/gpustack/gguf-parser-go v0.17.0
github.com/grpc-ecosystem/grpc-gateway v1.5.0
github.com/hpcloud/tail v1.0.0
github.com/ipfs/go-log v1.0.5
github.com/jaypipes/ghw v0.12.0
@ -39,6 +43,7 @@ require (
github.com/nikolalohinski/gonja/v2 v2.3.2
github.com/onsi/ginkgo/v2 v2.22.2
github.com/onsi/gomega v1.36.2
github.com/orcaman/writerseeker v0.0.0-20200621085525-1d3f536ff85e
github.com/otiai10/openaigo v1.7.0
github.com/phayes/freeport v0.0.0-20220201140144-74d24b5ae9f5
github.com/prometheus/client_golang v1.20.5
@ -57,6 +62,7 @@ require (
go.opentelemetry.io/otel/exporters/prometheus v0.50.0
go.opentelemetry.io/otel/metric v1.34.0
go.opentelemetry.io/otel/sdk/metric v1.28.0
google.golang.org/api v0.180.0
google.golang.org/grpc v1.67.1
google.golang.org/protobuf v1.36.5
gopkg.in/yaml.v2 v2.4.0
@ -65,14 +71,22 @@ require (
)
require (
cel.dev/expr v0.16.0 // indirect
cloud.google.com/go/auth v0.4.1 // indirect
cloud.google.com/go/auth/oauth2adapt v0.2.2 // indirect
cloud.google.com/go/compute/metadata v0.5.0 // indirect
github.com/containerd/platforms v0.2.1 // indirect
github.com/cpuguy83/dockercfg v0.3.2 // indirect
github.com/daaku/go.zipexe v1.0.2 // indirect
github.com/distribution/reference v0.6.0 // indirect
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/envoyproxy/protoc-gen-validate v1.1.0 // indirect
github.com/fasthttp/websocket v1.5.8 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/google/s2a-go v0.1.7 // indirect
github.com/googleapis/enterprise-certificate-proxy v0.3.2 // indirect
github.com/googleapis/gax-go/v2 v2.12.4 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/magiconair/properties v1.8.7 // indirect
github.com/moby/docker-image-spec v1.3.1 // indirect
@ -82,7 +96,6 @@ require (
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/morikuni/aec v1.0.0 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/orcaman/writerseeker v0.0.0-20200621085525-1d3f536ff85e // indirect
github.com/pion/datachannel v1.5.10 // indirect
github.com/pion/dtls/v2 v2.2.12 // indirect
github.com/pion/dtls/v3 v3.0.4 // indirect
@ -112,7 +125,9 @@ require (
go.opentelemetry.io/auto/sdk v1.1.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.56.0 // indirect
go.uber.org/mock v0.5.0 // indirect
golang.org/x/oauth2 v0.24.0 // indirect
golang.org/x/time v0.8.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20241007155032-5fefd90f89a9 // indirect
)
require (