mirror of https://github.com/mudler/LocalAI.git synced 2025-06-17 08:15:00 +00:00

Ettore Di Giacinto 2d64269763

* feat: Add backend gallery

This PR adds support for managing backends in the same way as models.
A backend gallery is now available that can be used to install and
remove extra backends.
The backend gallery is configured like a model gallery, and API calls
allow installing and removing backends at runtime as well as during
the startup phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs
* wip: Backend Dockerfile for python backends
* feat: drop extras images, build python backends separately
* fixup on all backends
* test CI
* Tweaks
* Drop old backends leftovers
* Fixup CI
* Move dockerfile upper
* Fix proto
* Feature dropped for consistency - we prefer model galleries
* Add missing packages in the build image
* exllama is only available on cublas
* pin torch on chatterbox
* Fixups to index
* CI
* Debug CI
* Install accelerators deps
* Add target arch
* Add cuda minor version
* Use self-hosted runners
* ci: use quay for test images
* fixups for vllm and chatterbox
* Small fixups on CI
* chatterbox is only available for nvidia
* Simplify CI builds
* Adapt test, use qwen3
* chore(model gallery): add jina-reranker-v1-tiny-en-gguf
* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser
* Use reranker from llama.cpp in AIO images
* Limit concurrent jobs

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2025-06-15 14:56:52 +02:00

3.5 KiB


title	description	weight
Backends	Learn how to use, manage, and develop backends in LocalAI	4

Backends

LocalAI supports a variety of backends for running different types of AI models. Core backends are included with LocalAI, while additional backends are containerized applications that provide the runtime environment for specific model types, such as LLMs, diffusion models, or text-to-speech models.

Managing Backends in the UI

The LocalAI web interface provides an intuitive way to manage your backends:

- Navigate to the "Backends" section in the navigation menu
- Browse available backends from configured galleries
- Use the search bar to find specific backends by name, description, or type
- Filter backends by type using the quick filter buttons (LLM, Diffusion, TTS, Whisper)
- Install or delete backends with a single click
- Monitor installation progress in real time

Each backend card displays:

- Backend name and description
- Type of models it supports
- Installation status
- Action buttons (Install/Delete)
- Additional information via the info button

Backend Galleries

Backend galleries are repositories that contain backend definitions. They work similarly to model galleries but are specifically for backends.

Adding a Backend Gallery

You can add backend galleries by setting the LOCALAI_BACKEND_GALLERIES environment variable to a JSON array of gallery entries, each with a name and a url:

export LOCALAI_BACKEND_GALLERIES='[{"name":"my-gallery","url":"https://raw.githubusercontent.com/username/repo/main/backends"}]'
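When more than one gallery is configured, hand-writing the JSON array becomes error-prone. A minimal sketch of building the value in the shell follows. The `gallery_json` helper and the second gallery (`team-gallery` on `example.com`) are hypothetical names used for illustration and are not part of LocalAI. The sketch assumes that names and URLs contain no double quotes or backslashes, because it performs no JSON escaping.

```shell
#!/bin/sh
# gallery_json NAME=URL...: print a JSON array of {"name","url"} objects.
# Hypothetical helper; it does NOT escape quotes or backslashes.
gallery_json() {
  out=""
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first '='
    url=${pair#*=}     # text after the first '='
    entry=$(printf '{"name":"%s","url":"%s"}' "$name" "$url")
    out="${out:+$out,}$entry"   # comma-separate all entries after the first
  done
  printf '[%s]' "$out"
}

# The second gallery below is an illustrative placeholder.
LOCALAI_BACKEND_GALLERIES=$(gallery_json \
  "my-gallery=https://raw.githubusercontent.com/username/repo/main/backends" \
  "team-gallery=https://example.com/backends/index.yaml")
export LOCALAI_BACKEND_GALLERIES
```

Each argument is a `name=url` pair, and the resulting value has the same shape as the single-gallery example above.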

The URL needs to point to a valid YAML file, for example:

- name: "test-backend"
  uri: "quay.io/image/tests:localai-backend-test"
  alias: "foo-backend"

Here, uri is the reference to an OCI container image that contains the backend.

Backend Gallery Structure

A backend gallery is a collection of YAML files, each defining a backend. Here's an example structure:

# backends/llm-backend.yaml
name: "llm-backend"
description: "A backend for running LLM models"
uri: "quay.io/username/llm-backend:latest"
alias: "llm"
tags:
  - "llm"
  - "text-generation"

Pre-installing Backends

You can pre-install backends when starting LocalAI using the LOCALAI_EXTERNAL_BACKENDS environment variable:

export LOCALAI_EXTERNAL_BACKENDS="llm-backend,diffusion-backend"
local-ai run

Creating a Backend

To create a new backend, you need to:

- Create a container image that implements the LocalAI backend interface
- Define a backend YAML file
- Publish your backend to a container registry

Backend Container Requirements

Your backend container should:

- Implement the LocalAI backend interface (gRPC or HTTP)
- Handle model loading and inference
- Support the required model types
- Include necessary dependencies
- Have a top-level run.sh file that will be used to run the backend
- Be pushed to a registry so that it can be used in a gallery
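Of these requirements, the top-level run.sh is the easiest to verify mechanically before building the image. A minimal sketch of such a check follows, assuming the backend's files live in a local directory whose contents end up at the top level of the image. `check_backend_dir` is a hypothetical helper, not a LocalAI command.

```shell
#!/bin/sh
# check_backend_dir DIR: verify that DIR contains a top-level, executable run.sh.
# Hypothetical pre-build sanity check; it does not inspect the image itself.
check_backend_dir() {
  dir=$1
  if [ ! -f "$dir/run.sh" ]; then
    echo "missing $dir/run.sh" >&2
    return 1
  fi
  if [ ! -x "$dir/run.sh" ]; then
    echo "$dir/run.sh is not executable (try: chmod +x run.sh)" >&2
    return 1
  fi
  echo "ok: $dir/run.sh"
}
```

Running, for example, `check_backend_dir ./my-backend` before `docker build` catches a missing or non-executable entrypoint early; the directory name here is only an illustration.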

Publishing Your Backend

Build your container image:

docker build -t quay.io/username/my-backend:latest .

Push to a container registry:

docker push quay.io/username/my-backend:latest

Add your backend to a gallery:
- Create a YAML entry in your gallery repository
- Include the backend definition
- Make the gallery accessible via HTTP/HTTPS
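As a sketch of this last step, the gallery entry for the pushed image can be generated rather than typed by hand. The `gallery_entry` helper is hypothetical. It emits only the name and uri fields in the list format used by the gallery example earlier on this page, so optional fields such as alias, description, or tags would still need to be added manually.

```shell
#!/bin/sh
# gallery_entry NAME URI: print a YAML list item for a backend gallery file.
# Hypothetical helper; values are quoted as-is, with no YAML escaping.
gallery_entry() {
  name=$1
  uri=$2
  printf '%s\n' "- name: \"$name\"" "  uri: \"$uri\""
}

gallery_entry my-backend quay.io/username/my-backend:latest
# prints:
# - name: "my-backend"
#   uri: "quay.io/username/my-backend:latest"
```

Redirecting the output with `>>` appends the entry to an existing gallery file in your gallery repository.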

Backend Types

LocalAI supports various types of backends:

- LLM Backends: For running language models
- Diffusion Backends: For image generation
- TTS Backends: For text-to-speech conversion
- Whisper Backends: For speech-to-text conversion
