feat: Add backend gallery (#5607)

* feat: Add backend gallery

This PR adds support for managing backends much like models. A backend
gallery is now available, which can be used to install and remove extra
backends.
The backend gallery can be configured in the same way as a model gallery,
and API calls allow installing and removing backends at runtime, as well
as during the startup phase of LocalAI.
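
As a rough sketch of the intended usage (the flag name, gallery URL, and
endpoint paths below are assumptions modeled on the existing
model-gallery API, not confirmed routes):

    # point LocalAI at a backend gallery at startup (hypothetical flag)
    local-ai run --backend-galleries \
      '[{"name":"localai","url":"github:mudler/LocalAI/backend/index.yaml@master"}]'

    # install a backend from the gallery at runtime
    curl http://localhost:8080/backends/apply \
      -H "Content-Type: application/json" \
      -d '{"id": "localai@cuda12-vllm"}'

    # remove it again
    curl -X DELETE http://localhost:8080/backends/cuda12-vllm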

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move Dockerfile up

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is only available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accelerator deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
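
A minimal Go sketch of the recovery pattern described here (the wrapper
name and the gguf-parser import path are assumptions, not the actual
LocalAI code):

    package ggufutil

    import (
        "fmt"

        gguf "github.com/gpustack/gguf-parser-go" // assumed parser module
    )

    // ParseGGUF converts a panic raised inside the parser into an error,
    // so a malformed gguf file cannot crash the whole process.
    func ParseGGUF(path string) (f *gguf.GGUFFile, err error) {
        defer func() {
            if r := recover(); r != nil {
                err = fmt.Errorf("recovered while parsing gguf %q: %v", path, r)
            }
        }()
        return gguf.ParseGGUFFile(path)
    }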

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Ettore Di Giacinto, 2025-06-15 14:56:52 +02:00, committed by GitHub
commit 2d64269763 (parent a7a6020328)
GPG key ID: B5690EEEBB952194
114 changed files with 3996 additions and 1382 deletions


@@ -61,10 +61,6 @@ updates:
directory: "/backend/python/openvoice"
schedule:
interval: "weekly"
- package-ecosystem: "pip"
directory: "/backend/python/parler-tts"
schedule:
interval: "weekly"
- package-ecosystem: "pip"
directory: "/backend/python/rerankers"
schedule:


@@ -9,13 +9,12 @@ concurrency:
cancel-in-progress: true
jobs:
extras-image-build:
image-build:
uses: ./.github/workflows/image_build.yml
with:
tag-latest: ${{ matrix.tag-latest }}
tag-suffix: ${{ matrix.tag-suffix }}
ffmpeg: ${{ matrix.ffmpeg }}
image-type: ${{ matrix.image-type }}
build-type: ${{ matrix.build-type }}
cuda-major-version: ${{ matrix.cuda-major-version }}
cuda-minor-version: ${{ matrix.cuda-minor-version }}
@@ -36,16 +35,6 @@ jobs:
fail-fast: false
matrix:
include:
# This is basically covered by the AIO test
# - build-type: ''
# platforms: 'linux/amd64'
# tag-latest: 'false'
# tag-suffix: '-ffmpeg'
# ffmpeg: 'true'
# image-type: 'extras'
# runs-on: 'arc-runner-set'
# base-image: "ubuntu:22.04"
# makeflags: "--jobs=3 --output-sync=target"
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
@@ -53,8 +42,7 @@ jobs:
tag-latest: 'false'
tag-suffix: '-cublas-cuda12-ffmpeg'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:22.04"
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'hipblas'
@@ -62,10 +50,9 @@ jobs:
tag-latest: 'false'
tag-suffix: '-hipblas'
ffmpeg: 'false'
image-type: 'extras'
base-image: "rocm/dev-ubuntu-22.04:6.1"
grpc-base-image: "ubuntu:22.04"
runs-on: 'arc-runner-set'
runs-on: 'ubuntu-latest'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'sycl_f16'
platforms: 'linux/amd64'
@@ -74,77 +61,13 @@ jobs:
grpc-base-image: "ubuntu:22.04"
tag-suffix: 'sycl-f16-ffmpeg'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
runs-on: 'ubuntu-latest'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'vulkan'
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-vulkan-ffmpeg-core'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:22.04"
makeflags: "--jobs=4 --output-sync=target"
# core-image-build:
# uses: ./.github/workflows/image_build.yml
# with:
# tag-latest: ${{ matrix.tag-latest }}
# tag-suffix: ${{ matrix.tag-suffix }}
# ffmpeg: ${{ matrix.ffmpeg }}
# image-type: ${{ matrix.image-type }}
# build-type: ${{ matrix.build-type }}
# cuda-major-version: ${{ matrix.cuda-major-version }}
# cuda-minor-version: ${{ matrix.cuda-minor-version }}
# platforms: ${{ matrix.platforms }}
# runs-on: ${{ matrix.runs-on }}
# base-image: ${{ matrix.base-image }}
# grpc-base-image: ${{ matrix.grpc-base-image }}
# makeflags: ${{ matrix.makeflags }}
# secrets:
# dockerUsername: ${{ secrets.DOCKERHUB_USERNAME }}
# dockerPassword: ${{ secrets.DOCKERHUB_PASSWORD }}
# quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
# quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
# strategy:
# matrix:
# include:
# - build-type: ''
# platforms: 'linux/amd64'
# tag-latest: 'false'
# tag-suffix: '-ffmpeg-core'
# ffmpeg: 'true'
# image-type: 'core'
# runs-on: 'ubuntu-latest'
# base-image: "ubuntu:22.04"
# makeflags: "--jobs=4 --output-sync=target"
# - build-type: 'sycl_f16'
# platforms: 'linux/amd64'
# tag-latest: 'false'
# base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
# grpc-base-image: "ubuntu:22.04"
# tag-suffix: 'sycl-f16-ffmpeg-core'
# ffmpeg: 'true'
# image-type: 'core'
# runs-on: 'arc-runner-set'
# makeflags: "--jobs=3 --output-sync=target"
# - build-type: 'cublas'
# cuda-major-version: "12"
# cuda-minor-version: "0"
# platforms: 'linux/amd64'
# tag-latest: 'false'
# tag-suffix: '-cublas-cuda12-ffmpeg-core'
# ffmpeg: 'true'
# image-type: 'core'
# runs-on: 'ubuntu-latest'
# base-image: "ubuntu:22.04"
# makeflags: "--jobs=4 --output-sync=target"
# - build-type: 'vulkan'
# platforms: 'linux/amd64'
# tag-latest: 'false'
# tag-suffix: '-vulkan-ffmpeg-core'
# ffmpeg: 'true'
# image-type: 'core'
# runs-on: 'ubuntu-latest'
# base-image: "ubuntu:22.04"
# makeflags: "--jobs=4 --output-sync=target"


@@ -37,24 +37,9 @@ jobs:
quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
strategy:
# Pushing with all jobs in parallel
# eats the bandwidth of all the nodes
max-parallel: 2
matrix:
include:
- build-type: 'hipblas'
platforms: 'linux/amd64'
tag-latest: 'auto'
tag-suffix: '-hipblas-extras'
ffmpeg: 'true'
image-type: 'extras'
aio: "-aio-gpu-hipblas"
base-image: "rocm/dev-ubuntu-22.04:6.1"
grpc-base-image: "ubuntu:22.04"
latest-image: 'latest-gpu-hipblas-extras'
latest-image-aio: 'latest-aio-gpu-hipblas'
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'hipblas'
platforms: 'linux/amd64'
tag-latest: 'false'
@@ -66,112 +51,8 @@ jobs:
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
latest-image: 'latest-gpu-hipblas'
self-hosted-jobs:
uses: ./.github/workflows/image_build.yml
with:
tag-latest: ${{ matrix.tag-latest }}
tag-suffix: ${{ matrix.tag-suffix }}
ffmpeg: ${{ matrix.ffmpeg }}
image-type: ${{ matrix.image-type }}
build-type: ${{ matrix.build-type }}
cuda-major-version: ${{ matrix.cuda-major-version }}
cuda-minor-version: ${{ matrix.cuda-minor-version }}
platforms: ${{ matrix.platforms }}
runs-on: ${{ matrix.runs-on }}
base-image: ${{ matrix.base-image }}
grpc-base-image: ${{ matrix.grpc-base-image }}
aio: ${{ matrix.aio }}
makeflags: ${{ matrix.makeflags }}
latest-image: ${{ matrix.latest-image }}
latest-image-aio: ${{ matrix.latest-image-aio }}
secrets:
dockerUsername: ${{ secrets.DOCKERHUB_USERNAME }}
dockerPassword: ${{ secrets.DOCKERHUB_PASSWORD }}
quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
strategy:
# Pushing with all jobs in parallel
# eats the bandwidth of all the nodes
max-parallel: ${{ github.event_name != 'pull_request' && 5 || 8 }}
matrix:
include:
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda11-extras'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
aio: "-aio-gpu-nvidia-cuda-11"
latest-image: 'latest-gpu-nvidia-cuda-11-extras'
latest-image-aio: 'latest-aio-gpu-nvidia-cuda-11'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda12-extras'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
aio: "-aio-gpu-nvidia-cuda-12"
latest-image: 'latest-gpu-nvidia-cuda-12-extras'
latest-image-aio: 'latest-aio-gpu-nvidia-cuda-12'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'sycl_f16'
platforms: 'linux/amd64'
tag-latest: 'false'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-sycl-f16-extras'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
aio: "-aio-gpu-intel-f16"
latest-image: 'latest-gpu-intel-f16-extras'
latest-image-aio: 'latest-aio-gpu-intel-f16'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'sycl_f32'
platforms: 'linux/amd64'
tag-latest: 'false'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-sycl-f32-extras'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
aio: "-aio-gpu-intel-f32"
latest-image: 'latest-gpu-intel-f32-extras'
latest-image-aio: 'latest-aio-gpu-intel-f32'
makeflags: "--jobs=3 --output-sync=target"
# Core images
- build-type: 'sycl_f16'
platforms: 'linux/amd64'
tag-latest: 'false'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-sycl-f16'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
latest-image: 'latest-gpu-intel-f16'
- build-type: 'sycl_f32'
platforms: 'linux/amd64'
tag-latest: 'false'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-sycl-f32'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
latest-image: 'latest-gpu-intel-f32'
aio: "-aio-gpu-hipblas"
latest-image-aio: 'latest-aio-gpu-hipblas'
core-image-build:
uses: ./.github/workflows/image_build.yml
@@ -226,7 +107,9 @@ jobs:
base-image: "ubuntu:22.04"
makeflags: "--jobs=4 --output-sync=target"
skip-drivers: 'false'
latest-image: 'latest-gpu-nvidia-cuda-12'
latest-image: 'latest-gpu-nvidia-cuda-11'
aio: "-aio-gpu-nvidia-cuda-11"
latest-image-aio: 'latest-aio-gpu-nvidia-cuda-11'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
@@ -240,6 +123,8 @@ jobs:
skip-drivers: 'false'
makeflags: "--jobs=4 --output-sync=target"
latest-image: 'latest-gpu-nvidia-cuda-12'
aio: "-aio-gpu-nvidia-cuda-12"
latest-image-aio: 'latest-aio-gpu-nvidia-cuda-12'
- build-type: 'vulkan'
platforms: 'linux/amd64'
tag-latest: 'false'
@@ -251,6 +136,35 @@ jobs:
skip-drivers: 'false'
makeflags: "--jobs=4 --output-sync=target"
latest-image: 'latest-gpu-vulkan'
aio: "-aio-gpu-vulkan"
latest-image-aio: 'latest-aio-gpu-vulkan'
- build-type: 'sycl_f16'
platforms: 'linux/amd64'
tag-latest: 'false'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-sycl-f16'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
latest-image: 'latest-gpu-intel-f16'
aio: "-aio-gpu-intel-f16"
latest-image-aio: 'latest-aio-gpu-intel-f16'
- build-type: 'sycl_f32'
platforms: 'linux/amd64'
tag-latest: 'false'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-sycl-f32'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
latest-image: 'latest-gpu-intel-f32'
aio: "-aio-gpu-intel-f32"
latest-image-aio: 'latest-aio-gpu-intel-f32'
gh-runner:
uses: ./.github/workflows/image_build.yml
with:


@@ -53,10 +53,6 @@ on:
description: 'Skip drivers by default'
default: 'false'
type: string
image-type:
description: 'Image type'
default: ''
type: string
runs-on:
description: 'Runs on'
required: true
@@ -159,11 +155,11 @@ jobs:
uses: docker/metadata-action@v5
with:
images: |
ttl.sh/localai-ci-pr-${{ github.event.number }}
quay.io/go-skynet/ci-tests
tags: |
type=ref,event=branch
type=semver,pattern={{raw}}
type=sha
type=ref,event=branch,suffix=localai${{ github.event.number }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
type=semver,pattern={{raw}},suffix=localai${{ github.event.number }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
type=sha,suffix=localai${{ github.event.number }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
flavor: |
latest=${{ inputs.tag-latest }}
suffix=${{ inputs.tag-suffix }}
@@ -211,7 +207,7 @@ jobs:
password: ${{ secrets.dockerPassword }}
- name: Login to DockerHub
if: github.event_name != 'pull_request'
# if: github.event_name != 'pull_request'
uses: docker/login-action@v3
with:
registry: quay.io
@@ -232,7 +228,6 @@ jobs:
CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }}
CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }}
FFMPEG=${{ inputs.ffmpeg }}
IMAGE_TYPE=${{ inputs.image-type }}
BASE_IMAGE=${{ inputs.base-image }}
GRPC_BASE_IMAGE=${{ inputs.grpc-base-image || inputs.base-image }}
GRPC_MAKEFLAGS=--jobs=4 --output-sync=target
@@ -261,7 +256,6 @@ jobs:
CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }}
CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }}
FFMPEG=${{ inputs.ffmpeg }}
IMAGE_TYPE=${{ inputs.image-type }}
BASE_IMAGE=${{ inputs.base-image }}
GRPC_BASE_IMAGE=${{ inputs.grpc-base-image || inputs.base-image }}
GRPC_MAKEFLAGS=--jobs=4 --output-sync=target
@@ -275,10 +269,6 @@ jobs:
push: true
tags: ${{ steps.meta_pull_request.outputs.tags }}
labels: ${{ steps.meta_pull_request.outputs.labels }}
- name: Testing image
if: github.event_name == 'pull_request'
run: |
echo "Image is available at ttl.sh/localai-ci-pr-${{ github.event.number }}:${{ steps.meta_pull_request.outputs.version }}" >> $GITHUB_STEP_SUMMARY
## End testing image
- name: Build and push AIO image
if: inputs.aio != ''

.github/workflows/python_backend.yml (new file, +457 lines)

@@ -0,0 +1,457 @@
---
name: 'build python backend container images'
on:
push:
branches:
- master
tags:
- '*'
pull_request:
concurrency:
group: ci-backends-${{ github.head_ref || github.ref }}-${{ github.repository }}
cancel-in-progress: true
jobs:
backend-jobs:
uses: ./.github/workflows/python_backend_build.yml
with:
tag-latest: ${{ matrix.tag-latest }}
tag-suffix: ${{ matrix.tag-suffix }}
build-type: ${{ matrix.build-type }}
cuda-major-version: ${{ matrix.cuda-major-version }}
cuda-minor-version: ${{ matrix.cuda-minor-version }}
platforms: ${{ matrix.platforms }}
runs-on: ${{ matrix.runs-on }}
base-image: ${{ matrix.base-image }}
backend: ${{ matrix.backend }}
latest-image: ${{ matrix.latest-image }}
secrets:
dockerUsername: ${{ secrets.DOCKERHUB_USERNAME }}
dockerPassword: ${{ secrets.DOCKERHUB_PASSWORD }}
quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
strategy:
fail-fast: false
max-parallel: ${{ github.event_name != 'pull_request' && 6 || 4 }}
matrix:
include:
# CUDA 11 builds
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-rerankers'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "rerankers"
latest-image: 'latest-gpu-nvidia-cuda-11-rerankers'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-vllm'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "vllm"
latest-image: 'latest-gpu-nvidia-cuda-11-vllm'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-transformers'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "transformers"
latest-image: 'latest-gpu-nvidia-cuda-11-transformers'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-diffusers'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "diffusers"
latest-image: 'latest-gpu-nvidia-cuda-11-diffusers'
# CUDA 11 additional backends
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-kokoro'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "kokoro"
latest-image: 'latest-gpu-nvidia-cuda-11-kokoro'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-faster-whisper'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "faster-whisper"
latest-image: 'latest-gpu-nvidia-cuda-11-faster-whisper'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-coqui'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "coqui"
latest-image: 'latest-gpu-nvidia-cuda-11-coqui'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-bark'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "bark"
latest-image: 'latest-gpu-nvidia-cuda-11-bark'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-11-chatterbox'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "chatterbox"
latest-image: 'latest-gpu-nvidia-cuda-11-chatterbox'
# CUDA 12 builds
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-rerankers'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "rerankers"
latest-image: 'latest-gpu-nvidia-cuda-12-rerankers'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-vllm'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "vllm"
latest-image: 'latest-gpu-nvidia-cuda-12-vllm'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-transformers'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "transformers"
latest-image: 'latest-gpu-nvidia-cuda-12-transformers'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-diffusers'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "diffusers"
latest-image: 'latest-gpu-nvidia-cuda-12-diffusers'
# CUDA 12 additional backends
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-kokoro'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "kokoro"
latest-image: 'latest-gpu-nvidia-cuda-12-kokoro'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-faster-whisper'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "faster-whisper"
latest-image: 'latest-gpu-nvidia-cuda-12-faster-whisper'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-coqui'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "coqui"
latest-image: 'latest-gpu-nvidia-cuda-12-coqui'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-bark'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "bark"
latest-image: 'latest-gpu-nvidia-cuda-12-bark'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-nvidia-cuda-12-chatterbox'
runs-on: 'arc-runner-set'
base-image: "ubuntu:22.04"
backend: "chatterbox"
latest-image: 'latest-gpu-nvidia-cuda-12-chatterbox'
# hipblas builds
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-rerankers'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "rerankers"
latest-image: 'latest-gpu-rocm-hipblas-rerankers'
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-vllm'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "vllm"
latest-image: 'latest-gpu-rocm-hipblas-vllm'
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-transformers'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "transformers"
latest-image: 'latest-gpu-rocm-hipblas-transformers'
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-diffusers'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "diffusers"
latest-image: 'latest-gpu-rocm-hipblas-diffusers'
# ROCm additional backends
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-kokoro'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "kokoro"
latest-image: 'latest-gpu-rocm-hipblas-kokoro'
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-faster-whisper'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "faster-whisper"
latest-image: 'latest-gpu-rocm-hipblas-faster-whisper'
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-coqui'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "coqui"
latest-image: 'latest-gpu-rocm-hipblas-coqui'
- build-type: 'hipblas'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-rocm-hipblas-bark'
runs-on: 'arc-runner-set'
base-image: "rocm/dev-ubuntu-22.04:6.1"
backend: "bark"
latest-image: 'latest-gpu-rocm-hipblas-bark'
# sycl builds
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-rerankers'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "rerankers"
latest-image: 'latest-gpu-intel-sycl-f32-rerankers'
- build-type: 'sycl_f16'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f16-rerankers'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "rerankers"
latest-image: 'latest-gpu-intel-sycl-f16-rerankers'
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-vllm'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "vllm"
latest-image: 'latest-gpu-intel-sycl-f32-vllm'
- build-type: 'sycl_f16'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f16-vllm'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "vllm"
latest-image: 'latest-gpu-intel-sycl-f16-vllm'
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-transformers'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "transformers"
latest-image: 'latest-gpu-intel-sycl-f32-transformers'
- build-type: 'sycl_f16'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f16-transformers'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "transformers"
latest-image: 'latest-gpu-intel-sycl-f16-transformers'
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-diffusers'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "diffusers"
latest-image: 'latest-gpu-intel-sycl-f32-diffusers'
# SYCL additional backends
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-kokoro'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "kokoro"
latest-image: 'latest-gpu-intel-sycl-f32-kokoro'
- build-type: 'sycl_f16'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f16-kokoro'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "kokoro"
latest-image: 'latest-gpu-intel-sycl-f16-kokoro'
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-faster-whisper'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "faster-whisper"
latest-image: 'latest-gpu-intel-sycl-f32-faster-whisper'
- build-type: 'sycl_f16'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f16-faster-whisper'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "faster-whisper"
latest-image: 'latest-gpu-intel-sycl-f16-faster-whisper'
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-coqui'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "coqui"
latest-image: 'latest-gpu-intel-sycl-f32-coqui'
- build-type: 'sycl_f16'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f16-coqui'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "coqui"
latest-image: 'latest-gpu-intel-sycl-f16-coqui'
- build-type: 'sycl_f32'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f32-bark'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "bark"
latest-image: 'latest-gpu-intel-sycl-f32-bark'
- build-type: 'sycl_f16'
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64'
tag-latest: 'true'
tag-suffix: '-gpu-intel-sycl-f16-bark'
runs-on: 'arc-runner-set'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
backend: "bark"
latest-image: 'latest-gpu-intel-sycl-f16-bark'


@@ -0,0 +1,198 @@
---
name: 'build python backend container images (reusable)'
on:
workflow_call:
inputs:
base-image:
description: 'Base image'
required: true
type: string
build-type:
description: 'Build type'
default: ''
type: string
cuda-major-version:
description: 'CUDA major version'
default: "12"
type: string
cuda-minor-version:
description: 'CUDA minor version'
default: "1"
type: string
platforms:
description: 'Platforms'
default: ''
type: string
tag-latest:
description: 'Tag latest'
default: ''
type: string
latest-image:
description: 'Tag latest'
default: ''
type: string
tag-suffix:
description: 'Tag suffix'
default: ''
type: string
runs-on:
description: 'Runs on'
required: true
default: ''
type: string
backend:
description: 'Backend to build'
required: true
type: string
secrets:
dockerUsername:
required: true
dockerPassword:
required: true
quayUsername:
required: true
quayPassword:
required: true
jobs:
reusable_python_backend-build:
runs-on: ${{ inputs.runs-on }}
steps:
- name: Force Install GIT latest
run: |
sudo apt-get update \
&& sudo apt-get install -y software-properties-common \
&& sudo apt-get update \
&& sudo add-apt-repository -y ppa:git-core/ppa \
&& sudo apt-get update \
&& sudo apt-get install -y git
- name: Checkout
uses: actions/checkout@v4
- name: Release space from worker
if: inputs.runs-on == 'ubuntu-latest'
run: |
echo "Listing top largest packages"
pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
head -n 30 <<< "${pkgs}"
echo
df -h
echo
sudo apt-get autoremove -y
sudo apt-get clean
echo
df -h
- name: Docker meta
id: meta
if: github.event_name != 'pull_request'
uses: docker/metadata-action@v5
with:
images: |
quay.io/go-skynet/local-ai-backends
localai/localai-backends
tags: |
type=ref,event=branch
type=semver,pattern={{raw}}
type=sha
flavor: |
latest=${{ inputs.tag-latest }}
suffix=${{ inputs.tag-suffix }}
- name: Docker meta for PR
id: meta_pull_request
if: github.event_name == 'pull_request'
uses: docker/metadata-action@v5
with:
images: |
quay.io/go-skynet/ci-tests
tags: |
type=ref,event=branch,suffix=${{ github.event.number }}-${{ inputs.backend }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
type=semver,pattern={{raw}},suffix=${{ github.event.number }}-${{ inputs.backend }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
type=sha,suffix=${{ github.event.number }}-${{ inputs.backend }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
flavor: |
latest=${{ inputs.tag-latest }}
suffix=${{ inputs.tag-suffix }}
## End testing image
- name: Set up QEMU
uses: docker/setup-qemu-action@master
with:
platforms: all
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@master
- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v3
with:
username: ${{ secrets.dockerUsername }}
password: ${{ secrets.dockerPassword }}
- name: Login to Quay.io
# if: github.event_name != 'pull_request'
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.quayUsername }}
password: ${{ secrets.quayPassword }}
- name: Build and push
uses: docker/build-push-action@v6
if: github.event_name != 'pull_request'
with:
builder: ${{ steps.buildx.outputs.name }}
build-args: |
BUILD_TYPE=${{ inputs.build-type }}
CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }}
CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }}
BASE_IMAGE=${{ inputs.base-image }}
BACKEND=${{ inputs.backend }}
context: ./backend
file: ./backend/Dockerfile.python
cache-from: type=gha
platforms: ${{ inputs.platforms }}
push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
- name: Build and push (PR)
uses: docker/build-push-action@v6
if: github.event_name == 'pull_request'
with:
builder: ${{ steps.buildx.outputs.name }}
build-args: |
BUILD_TYPE=${{ inputs.build-type }}
CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }}
CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }}
BASE_IMAGE=${{ inputs.base-image }}
BACKEND=${{ inputs.backend }}
context: ./backend
file: ./backend/Dockerfile.python
cache-from: type=gha
platforms: ${{ inputs.platforms }}
push: true
tags: ${{ steps.meta_pull_request.outputs.tags }}
labels: ${{ steps.meta_pull_request.outputs.labels }}
- name: Cleanup
run: |
docker builder prune -f
docker system prune --force --volumes --all
- name: Latest tag
if: github.event_name != 'pull_request' && inputs.latest-image != '' && github.ref_type == 'tag'
run: |
docker pull localai/localai-backends:${{ steps.meta.outputs.version }}
docker tag localai/localai-backends:${{ steps.meta.outputs.version }} localai/localai-backends:${{ inputs.latest-image }}
docker push localai/localai-backends:${{ inputs.latest-image }}
docker pull quay.io/go-skynet/local-ai-backends:${{ steps.meta.outputs.version }}
docker tag quay.io/go-skynet/local-ai-backends:${{ steps.meta.outputs.version }} quay.io/go-skynet/local-ai-backends:${{ inputs.latest-image }}
docker push quay.io/go-skynet/local-ai-backends:${{ inputs.latest-image }}
- name: job summary
run: |
echo "Built image: ${{ steps.meta.outputs.labels }}" >> $GITHUB_STEP_SUMMARY


@@ -188,7 +188,7 @@ jobs:
PATH="$PATH:$HOME/go/bin" make protogen-go
- name: Build images
run: |
docker build --build-arg FFMPEG=true --build-arg IMAGE_TYPE=extras --build-arg EXTRA_BACKENDS=rerankers --build-arg MAKEFLAGS="--jobs=5 --output-sync=target" -t local-ai:tests -f Dockerfile .
docker build --build-arg FFMPEG=true --build-arg MAKEFLAGS="--jobs=5 --output-sync=target" -t local-ai:tests -f Dockerfile .
BASE_IMAGE=local-ai:tests DOCKER_AIO_IMAGE=local-ai-aio:test make docker-aio
- name: Test
run: |


@@ -1,10 +1,9 @@
ARG IMAGE_TYPE=extras
ARG BASE_IMAGE=ubuntu:22.04
ARG GRPC_BASE_IMAGE=${BASE_IMAGE}
ARG INTEL_BASE_IMAGE=${BASE_IMAGE}
# The requirements-core target is common to all images. It should not be placed in requirements-core unless every single build will use it.
FROM ${BASE_IMAGE} AS requirements-core
FROM ${BASE_IMAGE} AS requirements
USER root
@@ -15,13 +14,12 @@ ARG TARGETARCH
ARG TARGETVARIANT
ENV DEBIAN_FRONTEND=noninteractive
ENV EXTERNAL_GRPC_BACKENDS="coqui:/build/backend/python/coqui/run.sh,transformers:/build/backend/python/transformers/run.sh,rerankers:/build/backend/python/rerankers/run.sh,bark:/build/backend/python/bark/run.sh,diffusers:/build/backend/python/diffusers/run.sh,faster-whisper:/build/backend/python/faster-whisper/run.sh,kokoro:/build/backend/python/kokoro/run.sh,vllm:/build/backend/python/vllm/run.sh,exllama2:/build/backend/python/exllama2/run.sh,chatterbox:/build/backend/python/chatterbox/run.sh"
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
ccache \
ca-certificates \
ca-certificates espeak-ng \
curl libssl-dev \
git \
git-lfs \
@@ -76,38 +74,12 @@ RUN apt-get update && \
WORKDIR /build
###################################
###################################
# The requirements-extras target is for any builds with IMAGE_TYPE=extras. It should not be placed in this target unless every IMAGE_TYPE=extras build will use it
FROM requirements-core AS requirements-extras
# Install uv as a system package
RUN curl -LsSf https://astral.sh/uv/install.sh | UV_INSTALL_DIR=/usr/bin sh
ENV PATH="/root/.cargo/bin:${PATH}"
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
RUN apt-get update && \
apt-get install -y --no-install-recommends \
espeak-ng \
espeak \
python3-pip \
python-is-python3 \
python3-dev llvm \
python3-venv && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
pip install --upgrade pip
# Install grpcio-tools (the version in 22.04 is too old)
RUN pip install --user grpcio-tools==1.71.0 grpcio==1.71.0
###################################
###################################
# The requirements-drivers target is for BUILD_TYPE specific items. If you need to install something specific to CUDA, or specific to ROCM, it goes here.
# This target will be built on top of requirements-core or requirements-extras as retermined by the IMAGE_TYPE build-arg
FROM requirements-${IMAGE_TYPE} AS requirements-drivers
FROM requirements AS requirements-drivers
ARG BUILD_TYPE
ARG CUDA_MAJOR_VERSION=12
@@ -376,8 +348,6 @@ FROM requirements-drivers
ARG FFMPEG
ARG BUILD_TYPE
ARG TARGETARCH
ARG IMAGE_TYPE=extras
ARG EXTRA_BACKENDS
ARG MAKEFLAGS
ENV BUILD_TYPE=${BUILD_TYPE}
@@ -416,56 +386,13 @@ COPY --from=builder /build/local-ai ./
# Copy shared libraries for piper
COPY --from=builder /build/sources/go-piper/piper-phonemize/pi/lib/* /usr/lib/
# Change the shell to bash so we can use [[ tests below
SHELL ["/bin/bash", "-c"]
# We try to strike a balance between individual layer size (as that affects total push time) and total image size
# Splitting the backends into more groups with fewer items results in a larger image, but a smaller size for the largest layer
# Splitting the backends into fewer groups with more items results in a smaller image, but a larger size for the largest layer
RUN if [[ ( "${IMAGE_TYPE}" == "extras ")]]; then \
apt-get -qq -y install espeak-ng \
; fi
RUN if [[ ( "${EXTRA_BACKENDS}" =~ "coqui" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/coqui \
; fi && \
if [[ ( "${EXTRA_BACKENDS}" =~ "faster-whisper" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/faster-whisper \
; fi && \
if [[ ( "${EXTRA_BACKENDS}" =~ "diffusers" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/diffusers \
; fi && \
if [[ ( "${EXTRA_BACKENDS}" =~ "chatterbox" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" && "${BUILD_TYPE}" = "cublas" && "${CUDA_MAJOR_VERSION}" = "12" ]]; then \
make -C backend/python/chatterbox \
; fi
RUN if [[ ( "${EXTRA_BACKENDS}" =~ "kokoro" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/kokoro \
; fi && \
if [[ ( "${EXTRA_BACKENDS}" =~ "exllama2" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/exllama2 \
; fi && \
if [[ ( "${EXTRA_BACKENDS}" =~ "transformers" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/transformers \
; fi
RUN if [[ ( "${EXTRA_BACKENDS}" =~ "vllm" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/vllm \
; fi && \
if [[ ( "${EXTRA_BACKENDS}" =~ "bark" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/bark \
; fi && \
if [[ ( "${EXTRA_BACKENDS}" =~ "rerankers" || -z "${EXTRA_BACKENDS}" ) && "$IMAGE_TYPE" == "extras" ]]; then \
make -C backend/python/rerankers \
; fi
# Make sure the models directory exists
RUN mkdir -p /build/models
RUN mkdir -p /build/models /build/backends
# Define the health check command
HEALTHCHECK --interval=1m --timeout=10m --retries=10 \
CMD curl -f ${HEALTHCHECK_ENDPOINT} || exit 1
VOLUME /build/models
VOLUME /build/models /build/backends
EXPOSE 8080
ENTRYPOINT [ "/build/entrypoint.sh" ]


@@ -1,7 +1,13 @@
name: jina-reranker-v1-base-en
backend: rerankers
reranking: true
f16: true
parameters:
model: cross-encoder
model: jina-reranker-v1-tiny-en.f16.gguf
download_files:
- filename: jina-reranker-v1-tiny-en.f16.gguf
sha256: 5f696cf0d0f3d347c4a279eee8270e5918554cdac0ed1f632f2619e4e8341407
uri: huggingface://mradermacher/jina-reranker-v1-tiny-en-GGUF/jina-reranker-v1-tiny-en.f16.gguf
usage: |
You can test this model with curl like this:


@@ -1,7 +1,13 @@
name: jina-reranker-v1-base-en
backend: rerankers
reranking: true
f16: true
parameters:
model: cross-encoder
model: jina-reranker-v1-tiny-en.f16.gguf
download_files:
- filename: jina-reranker-v1-tiny-en.f16.gguf
sha256: 5f696cf0d0f3d347c4a279eee8270e5918554cdac0ed1f632f2619e4e8341407
uri: huggingface://mradermacher/jina-reranker-v1-tiny-en-GGUF/jina-reranker-v1-tiny-en.f16.gguf
usage: |
You can test this model with curl like this:


@@ -1,7 +1,13 @@
name: jina-reranker-v1-base-en
backend: rerankers
reranking: true
f16: true
parameters:
model: cross-encoder
model: jina-reranker-v1-tiny-en.f16.gguf
download_files:
- filename: jina-reranker-v1-tiny-en.f16.gguf
sha256: 5f696cf0d0f3d347c4a279eee8270e5918554cdac0ed1f632f2619e4e8341407
uri: huggingface://mradermacher/jina-reranker-v1-tiny-en-GGUF/jina-reranker-v1-tiny-en.f16.gguf
usage: |
You can test this model with curl like this:

backend/Dockerfile.python (new file, +123 lines)

@@ -0,0 +1,123 @@
ARG BASE_IMAGE=ubuntu:22.04
FROM ${BASE_IMAGE} AS builder
ARG BACKEND=rerankers
ARG BUILD_TYPE
ENV BUILD_TYPE=${BUILD_TYPE}
ARG CUDA_MAJOR_VERSION
ARG CUDA_MINOR_VERSION
ARG SKIP_DRIVERS=false
ENV CUDA_MAJOR_VERSION=${CUDA_MAJOR_VERSION}
ENV CUDA_MINOR_VERSION=${CUDA_MINOR_VERSION}
ENV DEBIAN_FRONTEND=noninteractive
ARG TARGETARCH
ARG TARGETVARIANT
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
ccache \
ca-certificates \
espeak-ng \
curl \
libssl-dev \
git \
git-lfs \
unzip \
upx-ucl \
curl python3-pip \
python-is-python3 \
python3-dev llvm \
python3-venv make && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
pip install --upgrade pip
# Cuda
ENV PATH=/usr/local/cuda/bin:${PATH}
# HipBLAS requirements
ENV PATH=/opt/rocm/bin:${PATH}
# Vulkan requirements
RUN <<EOT bash
if [ "${BUILD_TYPE}" = "vulkan" ] && [ "${SKIP_DRIVERS}" = "false" ]; then
apt-get update && \
apt-get install -y --no-install-recommends \
software-properties-common pciutils wget gpg-agent && \
wget -qO - https://packages.lunarg.com/lunarg-signing-key-pub.asc | apt-key add - && \
wget -qO /etc/apt/sources.list.d/lunarg-vulkan-jammy.list https://packages.lunarg.com/vulkan/lunarg-vulkan-jammy.list && \
apt-get update && \
apt-get install -y \
vulkan-sdk && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
fi
EOT
# CuBLAS requirements
RUN <<EOT bash
if [ "${BUILD_TYPE}" = "cublas" ] && [ "${SKIP_DRIVERS}" = "false" ]; then
apt-get update && \
apt-get install -y --no-install-recommends \
software-properties-common pciutils
if [ "amd64" = "$TARGETARCH" ]; then
curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
fi
if [ "arm64" = "$TARGETARCH" ]; then
curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/arm64/cuda-keyring_1.1-1_all.deb
fi
dpkg -i cuda-keyring_1.1-1_all.deb && \
rm -f cuda-keyring_1.1-1_all.deb && \
apt-get update && \
apt-get install -y --no-install-recommends \
cuda-nvcc-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcufft-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcurand-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
fi
EOT
# If we are building with clblas support, we need the libraries for the builds
RUN if [ "${BUILD_TYPE}" = "clblas" ] && [ "${SKIP_DRIVERS}" = "false" ]; then \
apt-get update && \
apt-get install -y --no-install-recommends \
libclblast-dev && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* \
; fi
RUN if [ "${BUILD_TYPE}" = "hipblas" ] && [ "${SKIP_DRIVERS}" = "false" ]; then \
apt-get update && \
apt-get install -y --no-install-recommends \
hipblas-dev \
rocblas-dev && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
# I have no idea why, but the ROCM lib packages don't trigger ldconfig after they install, which results in local-ai and others not being able
# to locate the libraries. We run ldconfig ourselves to work around this packaging deficiency
ldconfig \
; fi
# Install uv as a system package
RUN curl -LsSf https://astral.sh/uv/install.sh | UV_INSTALL_DIR=/usr/bin sh
ENV PATH="/root/.cargo/bin:${PATH}"
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
# Install grpcio-tools (the version in 22.04 is too old)
RUN pip install --user grpcio-tools==1.71.0 grpcio==1.71.0
COPY python/${BACKEND} /${BACKEND}
COPY backend.proto /${BACKEND}/backend.proto
COPY python/common/ /${BACKEND}/common
RUN cd /${BACKEND} && make
FROM scratch
ARG BACKEND=rerankers
COPY --from=builder /${BACKEND}/ /

backend/index.yaml (new file, +295 lines)

@@ -0,0 +1,295 @@
- name: "cuda11-rerankers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-rerankers"
alias: "cuda11-rerankers"
- name: "cuda11-vllm"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-vllm"
alias: "cuda11-vllm"
- name: "cuda11-transformers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-transformers"
alias: "cuda11-transformers"
- name: "cuda11-diffusers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-diffusers"
alias: "cuda11-diffusers"
- name: "cuda11-exllama2"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-exllama2"
alias: "cuda11-exllama2"
- name: "cuda12-rerankers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-rerankers"
alias: "cuda12-rerankers"
- name: "cuda12-vllm"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-vllm"
alias: "cuda12-vllm"
- name: "cuda12-transformers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-transformers"
alias: "cuda12-transformers"
- name: "cuda12-diffusers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-diffusers"
alias: "cuda12-diffusers"
- name: "cuda12-exllama2"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-exllama2"
alias: "cuda12-exllama2"
- name: "rocm-rerankers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-rerankers"
alias: "rocm-rerankers"
- name: "rocm-vllm"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-vllm"
alias: "rocm-vllm"
- name: "rocm-transformers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-transformers"
alias: "rocm-transformers"
- name: "rocm-diffusers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-diffusers"
alias: "rocm-diffusers"
- name: "intel-sycl-f32-rerankers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-rerankers"
alias: "intel-sycl-f32-rerankers"
- name: "intel-sycl-f16-rerankers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-rerankers"
alias: "intel-sycl-f16-rerankers"
- name: "intel-sycl-f32-vllm"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-vllm"
alias: "intel-sycl-f32-vllm"
- name: "intel-sycl-f16-vllm"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-vllm"
alias: "intel-sycl-f16-vllm"
- name: "intel-sycl-f32-transformers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-transformers"
alias: "intel-sycl-f32-transformers"
- name: "intel-sycl-f16-transformers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-transformers"
alias: "intel-sycl-f16-transformers"
- name: "intel-sycl-f32-diffusers"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-diffusers"
alias: "intel-sycl-f32-diffusers"
- name: "cuda11-rerankers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-rerankers"
alias: "rerankers"
- name: "cuda11-vllm-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-vllm"
alias: "vllm"
- name: "cuda11-transformers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-transformers"
alias: "transformers"
- name: "cuda11-diffusers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-diffusers"
alias: "diffusers"
- name: "cuda11-exllama2-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-exllama2"
alias: "exllama2"
- name: "cuda12-rerankers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-rerankers"
alias: "rerankers"
- name: "cuda12-vllm-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-vllm"
alias: "vllm"
- name: "cuda12-transformers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-transformers"
alias: "transformers"
- name: "cuda12-diffusers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-diffusers"
alias: "diffusers"
- name: "cuda12-exllama2-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-exllama2"
alias: "exllama2"
- name: "rocm-rerankers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-rerankers"
alias: "rerankers"
- name: "rocm-vllm-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-vllm"
alias: "vllm"
- name: "rocm-transformers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-transformers"
alias: "transformers"
- name: "rocm-diffusers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-diffusers"
alias: "diffusers"
- name: "intel-sycl-f32-rerankers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-rerankers"
alias: "rerankers"
- name: "intel-sycl-f16-rerankers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-rerankers"
alias: "rerankers"
- name: "intel-sycl-f32-vllm-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-vllm"
alias: "vllm"
- name: "intel-sycl-f16-vllm-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-vllm"
alias: "vllm"
- name: "intel-sycl-f32-transformers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-transformers"
alias: "transformers"
- name: "intel-sycl-f16-transformers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-transformers"
alias: "transformers"
- name: "intel-sycl-f32-diffusers-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-diffusers"
alias: "diffusers"
- name: "cuda11-kokoro-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-kokoro"
alias: "kokoro"
- name: "cuda12-kokoro-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-kokoro"
alias: "kokoro"
- name: "rocm-kokoro-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-kokoro"
alias: "kokoro"
- name: "sycl-f32-kokoro"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-kokoro"
alias: "kokoro"
- name: "sycl-f16-kokoro"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-kokoro"
alias: "kokoro"
- name: "sycl-f16-kokoro-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-kokoro"
alias: "kokoro"
- name: "sycl-f32-kokoro-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-kokoro"
alias: "kokoro"
- name: "cuda11-faster-whisper-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-faster-whisper"
alias: "faster-whisper"
- name: "cuda12-faster-whisper-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-faster-whisper"
alias: "faster-whisper"
- name: "rocm-faster-whisper-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-faster-whisper"
alias: "faster-whisper"
- name: "sycl-f32-faster-whisper"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-faster-whisper"
alias: "faster-whisper"
- name: "sycl-f16-faster-whisper"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-faster-whisper"
alias: "faster-whisper"
- name: "sycl-f32-faster-whisper-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-faster-whisper"
alias: "faster-whisper"
- name: "sycl-f16-faster-whisper-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-faster-whisper"
alias: "faster-whisper"
- name: "cuda11-coqui-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-coqui"
alias: "coqui"
- name: "cuda12-coqui-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-coqui"
alias: "coqui"
- name: "rocm-coqui-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-coqui"
alias: "coqui"
- name: "sycl-f32-coqui"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-coqui"
alias: "coqui"
- name: "sycl-f16-coqui"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-coqui"
alias: "coqui"
- name: "sycl-f32-coqui-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-coqui"
alias: "coqui"
- name: "sycl-f16-coqui-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-coqui"
alias: "coqui"
- name: "cuda11-bark-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-bark"
alias: "bark"
- name: "cuda12-bark-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-bark"
alias: "bark"
- name: "rocm-bark-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-bark"
alias: "bark"
- name: "sycl-f32-bark"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f32-bark"
alias: "bark"
- name: "sycl-f16-bark"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-intel-sycl-f16-bark"
alias: "bark"
- name: "sycl-f32-bark-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f32-bark"
alias: "bark"
- name: "sycl-f16-bark-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-intel-sycl-f16-bark"
alias: "bark"
- name: "cuda11-chatterbox-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-11-chatterbox"
alias: "chatterbox"
- name: "cuda12-chatterbox-master"
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-chatterbox"
alias: "chatterbox"
- name: "cuda11-chatterbox"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-11-chatterbox"
alias: "chatterbox"
- name: "cuda12-chatterbox"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-chatterbox"
alias: "chatterbox"


@@ -22,7 +22,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean


@@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.


@@ -1,4 +1,9 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@


@@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests


@@ -22,7 +22,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean

View file

@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,6 +1,6 @@
--extra-index-url https://download.pytorch.org/whl/rocm6.0
torch==2.6.0+rocm6.0
torchaudio==2.6.0+rocm6.0
torch==2.6.0+rocm6.1
torchaudio==2.6.0+rocm6.1
transformers==4.46.3
chatterbox-tts
accelerate

View file

@ -8,5 +8,4 @@ accelerate
oneccl_bind_pt==2.3.100+xpu
optimum[openvino]
setuptools
transformers==4.48.3
accelerate

View file

@ -1,4 +1,9 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto

View file

@ -1,4 +1,9 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -22,7 +22,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean

View file

@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,4 +1,9 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -32,7 +32,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean

View file

@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,4 +1,9 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -16,7 +16,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean

View file

@ -5,7 +5,12 @@ LIMIT_TARGETS="cublas"
EXTRA_PIP_INSTALL_FLAGS="--no-build-isolation"
EXLLAMA2_VERSION=c0ddebaaaf8ffd1b3529c2bb654e650bce2f790f
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
installRequirements

View file

@ -1,6 +1,11 @@
#!/bin/bash
LIMIT_TARGETS="cublas"
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto

View file

@ -1,4 +1,9 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto

View file

@ -1,4 +1,9 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -23,7 +23,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean

View file

@ -1,7 +1,13 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,4 +1,10 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -23,7 +23,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean

View file

@ -1,7 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,5 +1,10 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
if [ -d "/opt/intel" ]; then
# Assumes we are using the Intel oneAPI container image

View file

@ -1,6 +1,11 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -22,7 +22,7 @@ protogen-clean:
$(RM) backend_pb2_grpc.py backend_pb2.py
backend_pb2_grpc.py backend_pb2.py:
python3 -m grpc_tools.protoc -I../.. --python_out=. --grpc_python_out=. backend.proto
python3 -m grpc_tools.protoc -I../.. -I./ --python_out=. --grpc_python_out=. backend.proto
.PHONY: clean
clean: protogen-clean

View file

@ -3,7 +3,13 @@ set -e
EXTRA_PIP_INSTALL_FLAGS="--no-build-isolation"
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.

View file

@ -1,5 +1,5 @@
--extra-index-url https://download.pytorch.org/whl/rocm6.0
--extra-index-url https://download.pytorch.org/whl/nightly/rocm6.3
accelerate
torch==2.7.0+rocm6.3
torch
transformers
bitsandbytes

View file

@ -1,4 +1,11 @@
#!/bin/bash
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
startBackend $@

View file

@ -1,6 +1,12 @@
#!/bin/bash
set -e
source $(dirname $0)/../common/libbackend.sh
backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
source $backend_dir/common/libbackend.sh
else
source $backend_dir/../common/libbackend.sh
fi
runUnittests

View file

@ -6,6 +6,7 @@ import (
"github.com/mudler/LocalAI/core/backend"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/core/services"
"github.com/mudler/LocalAI/internal"
"github.com/mudler/LocalAI/pkg/assets"
@ -60,12 +61,20 @@ func New(opts ...config.AppOption) (*Application, error) {
log.Error().Err(err).Msg("error installing models")
}
if err := pkgStartup.InstallExternalBackends(options.BackendGalleries, options.BackendsPath, nil, options.ExternalBackends...); err != nil {
log.Error().Err(err).Msg("error installing external backends")
}
configLoaderOpts := options.ToConfigLoaderOptions()
if err := application.BackendLoader().LoadBackendConfigsFromPath(options.ModelPath, configLoaderOpts...); err != nil {
log.Error().Err(err).Msg("error loading config files")
}
if err := gallery.RegisterBackends(options.BackendsPath, application.ModelLoader()); err != nil {
log.Error().Err(err).Msg("error registering external backends")
}
if options.ConfigFile != "" {
if err := application.BackendLoader().LoadMultipleBackendConfigsSingleFile(options.ConfigFile, configLoaderOpts...); err != nil {
log.Error().Err(err).Msg("error loading config file")

View file

@ -86,7 +86,7 @@ func (mi *ModelsInstall) Run(ctx *cliContext.Context) error {
modelURI := downloader.URI(modelName)
if !modelURI.LooksLikeOCI() {
model := gallery.FindModel(models, modelName, mi.ModelsPath)
model := gallery.FindGalleryElement(models, modelName, mi.ModelsPath)
if model == nil {
log.Error().Str("model", modelName).Msg("model not found")
return err

View file

@ -3,6 +3,7 @@ package cli
import (
"context"
"fmt"
"os"
"strings"
"time"
@ -19,6 +20,8 @@ import (
type RunCMD struct {
ModelArgs []string `arg:"" optional:"" name:"models" help:"Model configuration URLs to load"`
ExternalBackends []string `env:"LOCALAI_EXTERNAL_BACKENDS,EXTERNAL_BACKENDS" help:"A list of external backends to load from gallery on boot" group:"backends"`
BackendsPath string `env:"LOCALAI_BACKENDS_PATH,BACKENDS_PATH" type:"path" default:"${basepath}/backends" help:"Path containing backends used for inferencing" group:"backends"`
ModelsPath string `env:"LOCALAI_MODELS_PATH,MODELS_PATH" type:"path" default:"${basepath}/models" help:"Path containing models used for inferencing" group:"storage"`
BackendAssetsPath string `env:"LOCALAI_BACKEND_ASSETS_PATH,BACKEND_ASSETS_PATH" type:"path" default:"/tmp/localai/backend_data" help:"Path used to extract libraries that are required by some of the backends in runtime" group:"storage"`
GeneratedContentPath string `env:"LOCALAI_GENERATED_CONTENT_PATH,GENERATED_CONTENT_PATH" type:"path" default:"/tmp/generated/content" help:"Location for generated content (e.g. images, audio, videos)" group:"storage"`
@ -27,8 +30,8 @@ type RunCMD struct {
LocalaiConfigDir string `env:"LOCALAI_CONFIG_DIR" type:"path" default:"${basepath}/configuration" help:"Directory for dynamic loading of certain configuration files (currently api_keys.json and external_backends.json)" group:"storage"`
LocalaiConfigDirPollInterval time.Duration `env:"LOCALAI_CONFIG_DIR_POLL_INTERVAL" help:"Typically the config path picks up changes automatically, but if your system has broken fsnotify events, set this to an interval to poll the LocalAI Config Dir (example: 1m)" group:"storage"`
// The alias on this option is there to preserve functionality with the old `--config-file` parameter
ModelsConfigFile string `env:"LOCALAI_MODELS_CONFIG_FILE,CONFIG_FILE" aliases:"config-file" help:"YAML file containing a list of model backend configs" group:"storage"`
ModelsConfigFile string `env:"LOCALAI_MODELS_CONFIG_FILE,CONFIG_FILE" aliases:"config-file" help:"YAML file containing a list of model backend configs" group:"storage"`
BackendGalleries string `env:"LOCALAI_BACKEND_GALLERIES,BACKEND_GALLERIES" help:"JSON list of backend galleries" group:"backends" default:"${backends}"`
Galleries string `env:"LOCALAI_GALLERIES,GALLERIES" help:"JSON list of galleries" group:"models" default:"${galleries}"`
AutoloadGalleries bool `env:"LOCALAI_AUTOLOAD_GALLERIES,AUTOLOAD_GALLERIES" group:"models"`
PreloadModels string `env:"LOCALAI_PRELOAD_MODELS,PRELOAD_MODELS" help:"A List of models to apply in JSON at start" group:"models"`
@ -73,11 +76,15 @@ type RunCMD struct {
}
func (r *RunCMD) Run(ctx *cliContext.Context) error {
os.MkdirAll(r.BackendsPath, 0750)
os.MkdirAll(r.ModelsPath, 0750)
opts := []config.AppOption{
config.WithConfigFile(r.ModelsConfigFile),
config.WithJSONStringPreload(r.PreloadModels),
config.WithYAMLConfigPreload(r.PreloadModelsConfig),
config.WithModelPath(r.ModelsPath),
config.WithBackendsPath(r.BackendsPath),
config.WithContextSize(r.ContextSize),
config.WithDebug(zerolog.GlobalLevel() <= zerolog.DebugLevel),
config.WithGeneratedContentDir(r.GeneratedContentPath),
@ -87,6 +94,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
config.WithDynamicConfigDirPollInterval(r.LocalaiConfigDirPollInterval),
config.WithF16(r.F16),
config.WithStringGalleries(r.Galleries),
config.WithBackendGalleries(r.BackendGalleries),
config.WithCors(r.CORS),
config.WithCorsAllowOrigins(r.CORSAllowOrigins),
config.WithCsrf(r.CSRF),
@ -97,6 +105,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
config.WithUploadLimitMB(r.UploadLimit),
config.WithApiKeys(r.APIKeys),
config.WithModelsURL(append(r.Models, r.ModelArgs...)...),
config.WithExternalBackends(r.ExternalBackends...),
config.WithOpaqueErrors(r.OpaqueErrors),
config.WithEnforcedPredownloadScans(!r.DisablePredownloadScan),
config.WithSubtleKeyComparison(r.UseSubtleKeyComparison),
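Taken together, the new options can be exercised straight from the environment. A hypothetical invocation (variable names come from the struct tags above; the gallery URL and backend name are examples):

```bash
# Install "cuda12-chatterbox" at boot from a custom backend gallery.
# The JSON shape is assumed to mirror the model-gallery format (name/url keys).
LOCALAI_BACKENDS_PATH=/var/lib/localai/backends \
LOCALAI_BACKEND_GALLERIES='[{"name":"localai","url":"https://example.com/backend-gallery.yaml"}]' \
LOCALAI_EXTERNAL_BACKENDS=cuda12-chatterbox \
local-ai run
```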

View file

@ -15,6 +15,8 @@ type ApplicationConfig struct {
Context context.Context
ConfigFile string
ModelPath string
BackendsPath string
ExternalBackends []string
LibPath string
UploadLimitMB, Threads, ContextSize int
F16 bool
@ -45,7 +47,8 @@ type ApplicationConfig struct {
DisableGalleryEndpoint bool
LoadToMemory []string
Galleries []Gallery
Galleries []Gallery
BackendGalleries []Gallery
BackendAssets *rice.Box
AssetsDestination string
@ -95,6 +98,18 @@ func WithModelPath(path string) AppOption {
}
}
func WithBackendsPath(path string) AppOption {
return func(o *ApplicationConfig) {
o.BackendsPath = path
}
}
func WithExternalBackends(backends ...string) AppOption {
return func(o *ApplicationConfig) {
o.ExternalBackends = backends
}
}
func WithMachineTag(tag string) AppOption {
return func(o *ApplicationConfig) {
o.MachineTag = tag
@ -218,6 +233,20 @@ func WithStringGalleries(galls string) AppOption {
}
}
func WithBackendGalleries(galls string) AppOption {
return func(o *ApplicationConfig) {
if galls == "" {
o.BackendGalleries = []Gallery{}
return
}
var galleries []Gallery
if err := json.Unmarshal([]byte(galls), &galleries); err != nil {
log.Error().Err(err).Msg("failed loading galleries")
}
o.BackendGalleries = append(o.BackendGalleries, galleries...)
}
}
func WithGalleries(galleries []Gallery) AppOption {
return func(o *ApplicationConfig) {
o.Galleries = append(o.Galleries, galleries...)

View file

@ -565,7 +565,7 @@ func (c *BackendConfig) GuessUsecases(u BackendConfigUsecases) bool {
}
}
if (u & FLAG_TTS) == FLAG_TTS {
ttsBackends := []string{"bark-cpp", "parler-tts", "piper", "transformers-musicgen"}
ttsBackends := []string{"bark-cpp", "piper", "transformers-musicgen"}
if !slices.Contains(ttsBackends, c.Backend) {
return false
}
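With `parler-tts` dropped, TTS capability detection keys off the remaining three backends; callers still just test the bitmask, as the specs below do. A fragment for orientation (`cfg` stands in for a loaded `*config.BackendConfig`):

```go
// FLAG_TTS only matches when cfg.Backend is one of the ttsBackends above.
if cfg.HasUsecases(config.FLAG_TTS) {
	// safe to route text-to-speech requests to this model
}
```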

View file

@ -159,6 +159,5 @@ parameters:
Expect(i.HasUsecases(FLAG_TTS)).To(BeFalse())
Expect(i.HasUsecases(FLAG_COMPLETION)).To(BeTrue())
Expect(i.HasUsecases(FLAG_CHAT)).To(BeTrue())
})
})

View file

@ -22,17 +22,25 @@ func guessDefaultsFromFile(cfg *BackendConfig, modelPath string, defaultCtx int)
// We try to guess only if we don't have a template defined already
guessPath := filepath.Join(modelPath, cfg.ModelFileName())
defer func() {
if r := recover(); r != nil {
log.Error().Msgf("guessDefaultsFromFile: %s", "panic while parsing gguf file")
}
}()
defer func() {
if cfg.ContextSize == nil {
if defaultCtx == 0 {
defaultCtx = defaultContextSize
}
cfg.ContextSize = &defaultCtx
}
}()
// try to parse the gguf file
f, err := gguf.ParseGGUFFile(guessPath)
if err == nil {
guessGGUFFromFile(cfg, f, defaultCtx)
return
}
if cfg.ContextSize == nil {
if defaultCtx == 0 {
defaultCtx = defaultContextSize
}
cfg.ContextSize = &defaultCtx
}
}
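Deferred functions run in reverse order of registration, so when the gguf parser panics the context-size default is still applied first, and only then does the earlier-registered defer recover. A self-contained illustration of that ordering (toy code, not from LocalAI):

```go
package main

import "fmt"

func parse() (result string) {
	defer func() { // registered first, runs last: swallows the panic
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
		}
	}()
	defer func() { // registered second, runs first: applies the default
		if result == "" {
			result = "default"
		}
	}()
	panic("boom") // stands in for a panic inside gguf-parser
}

func main() {
	fmt.Println(parse()) // prints "recovered: boom", then "default"
}
```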

View file

@ -0,0 +1,35 @@
package gallery
import "github.com/mudler/LocalAI/core/config"
type GalleryBackend struct {
Metadata `json:",inline" yaml:",inline"`
Alias string `json:"alias,omitempty" yaml:"alias,omitempty"`
URI string `json:"uri,omitempty" yaml:"uri,omitempty"`
}
type GalleryBackends []*GalleryBackend
func (m *GalleryBackend) SetGallery(gallery config.Gallery) {
m.Gallery = gallery
}
func (m *GalleryBackend) SetInstalled(installed bool) {
m.Installed = installed
}
func (m *GalleryBackend) GetName() string {
return m.Name
}
func (m *GalleryBackend) GetGallery() config.Gallery {
return m.Gallery
}
func (m *GalleryBackend) GetDescription() string {
return m.Description
}
func (m *GalleryBackend) GetTags() []string {
return m.Tags
}
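For reference, an index entry like the ones at the top of this diff deserializes directly into this struct: `name`, `uri`, and `alias` drive the installer, while the rest comes from the inline `Metadata` (the `description`/`tags` values below are illustrative, not from the real index):

```yaml
- name: "rocm-coqui-master"
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-coqui"
  alias: "coqui"
  description: "Coqui TTS backend built for ROCm"  # illustrative
  tags: ["tts", "rocm"]                            # illustrative
```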

core/gallery/backends.go (new file, 107 lines)
View file

@ -0,0 +1,107 @@
package gallery
import (
"fmt"
"os"
"path/filepath"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/LocalAI/pkg/oci"
)
// InstallBackendFromGallery installs a backend from the gallery
func InstallBackendFromGallery(galleries []config.Gallery, name string, basePath string, downloadStatus func(string, string, string, float64)) error {
backends, err := AvailableBackends(galleries, basePath)
if err != nil {
return err
}
backend := FindGalleryElement(backends, name, basePath)
if backend == nil {
return fmt.Errorf("no model found with name %q", name)
}
return InstallBackend(basePath, backend, downloadStatus)
}
func InstallBackend(basePath string, config *GalleryBackend, downloadStatus func(string, string, string, float64)) error {
// Create base path if it doesn't exist
err := os.MkdirAll(basePath, 0750)
if err != nil {
return fmt.Errorf("failed to create base path: %v", err)
}
name := config.Name
img, err := oci.GetImage(config.URI, "", nil, nil)
if err != nil {
return fmt.Errorf("failed to get image %q: %v", config.URI, err)
}
backendPath := filepath.Join(basePath, name)
if err := os.MkdirAll(backendPath, 0750); err != nil {
return fmt.Errorf("failed to create backend path %q: %v", backendPath, err)
}
if err := oci.ExtractOCIImage(img, backendPath); err != nil {
return fmt.Errorf("failed to extract image %q: %v", config.URI, err)
}
if config.Alias != "" {
// Write an alias file inside
aliasFile := filepath.Join(backendPath, "alias")
if err := os.WriteFile(aliasFile, []byte(config.Alias), 0644); err != nil {
return fmt.Errorf("failed to write alias file %q: %v", aliasFile, err)
}
}
return nil
}
func DeleteBackendFromSystem(basePath string, name string) error {
backendFile := filepath.Join(basePath, name)
return os.RemoveAll(backendFile)
}
func ListSystemBackends(basePath string) (map[string]string, error) {
backends, err := os.ReadDir(basePath)
if err != nil {
return nil, err
}
backendsNames := make(map[string]string)
for _, backend := range backends {
if backend.IsDir() {
runFile := filepath.Join(basePath, backend.Name(), "run.sh")
backendsNames[backend.Name()] = runFile
aliasFile := filepath.Join(basePath, backend.Name(), "alias")
if _, err := os.Stat(aliasFile); err == nil {
// read the alias file, and use it as key
alias, err := os.ReadFile(aliasFile)
if err != nil {
return nil, err
}
backendsNames[string(alias)] = runFile
}
}
}
return backendsNames, nil
}
func RegisterBackends(basePath string, modelLoader *model.ModelLoader) error {
backends, err := ListSystemBackends(basePath)
if err != nil {
return err
}
for name, runFile := range backends {
modelLoader.SetExternalBackend(name, runFile)
}
return nil
}
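Putting the helpers together: each `<basePath>/<name>/run.sh` is keyed by its directory name and, when an `alias` file exists, by the alias as well, so model configs can reference either. A short sketch (path assumed):

```go
package main

import (
	"fmt"
	"log"

	"github.com/mudler/LocalAI/core/gallery"
)

func main() {
	backends, err := gallery.ListSystemBackends("/var/lib/localai/backends")
	if err != nil {
		log.Fatal(err)
	}
	// After installing cuda12-chatterbox with alias "chatterbox", both keys
	// map to the same run.sh (see the ListSystemBackends specs below).
	for name, runFile := range backends {
		fmt.Printf("%s => %s\n", name, runFile)
	}
}
```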

View file

@ -0,0 +1,151 @@
package gallery
import (
"os"
"path/filepath"
"github.com/mudler/LocalAI/core/config"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
var _ = Describe("Gallery Backends", func() {
var (
tempDir string
galleries []config.Gallery
)
BeforeEach(func() {
var err error
tempDir, err = os.MkdirTemp("", "gallery-test-*")
Expect(err).NotTo(HaveOccurred())
// Setup test galleries
galleries = []config.Gallery{
{
Name: "test-gallery",
URL: "https://gist.githubusercontent.com/mudler/71d5376bc2aa168873fa519fa9f4bd56/raw/0557f9c640c159fa8e4eab29e8d98df6a3d6e80f/backend-gallery.yaml",
},
}
})
AfterEach(func() {
os.RemoveAll(tempDir)
})
Describe("InstallBackendFromGallery", func() {
It("should return error when backend is not found", func() {
err := InstallBackendFromGallery(galleries, "non-existent", tempDir, nil)
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("no model found with name"))
})
It("should install backend from gallery", func() {
err := InstallBackendFromGallery(galleries, "test-backend", tempDir, nil)
Expect(err).ToNot(HaveOccurred())
Expect(filepath.Join(tempDir, "test-backend", "run.sh")).To(BeARegularFile())
})
})
Describe("InstallBackend", func() {
It("should create base path if it doesn't exist", func() {
newPath := filepath.Join(tempDir, "new-path")
backend := GalleryBackend{
Metadata: Metadata{
Name: "test-backend",
},
URI: "test-uri",
}
err := InstallBackend(newPath, &backend, nil)
Expect(err).To(HaveOccurred()) // Will fail due to invalid URI, but path should be created
Expect(newPath).To(BeADirectory())
})
It("should create alias file when specified", func() {
backend := GalleryBackend{
Metadata: Metadata{
Name: "test-backend",
},
URI: "quay.io/mudler/tests:localai-backend-test",
Alias: "test-alias",
}
err := InstallBackend(tempDir, &backend, nil)
Expect(err).ToNot(HaveOccurred())
Expect(filepath.Join(tempDir, "test-backend", "alias")).To(BeARegularFile())
content, err := os.ReadFile(filepath.Join(tempDir, "test-backend", "alias"))
Expect(err).ToNot(HaveOccurred())
Expect(string(content)).To(ContainSubstring("test-alias"))
Expect(filepath.Join(tempDir, "test-backend", "run.sh")).To(BeARegularFile())
// Check that the alias was recognized
backends, err := ListSystemBackends(tempDir)
Expect(err).ToNot(HaveOccurred())
Expect(backends).To(HaveKeyWithValue("test-alias", filepath.Join(tempDir, "test-backend", "run.sh")))
Expect(backends).To(HaveKeyWithValue("test-backend", filepath.Join(tempDir, "test-backend", "run.sh")))
})
})
Describe("DeleteBackendFromSystem", func() {
It("should delete backend directory", func() {
backendName := "test-backend"
backendPath := filepath.Join(tempDir, backendName)
// Create a dummy backend directory
err := os.MkdirAll(backendPath, 0750)
Expect(err).NotTo(HaveOccurred())
err = DeleteBackendFromSystem(tempDir, backendName)
Expect(err).NotTo(HaveOccurred())
Expect(backendPath).NotTo(BeADirectory())
})
It("should not error when backend doesn't exist", func() {
err := DeleteBackendFromSystem(tempDir, "non-existent")
Expect(err).NotTo(HaveOccurred())
})
})
Describe("ListSystemBackends", func() {
It("should list backends without aliases", func() {
// Create some dummy backend directories
backendNames := []string{"backend1", "backend2", "backend3"}
for _, name := range backendNames {
err := os.MkdirAll(filepath.Join(tempDir, name), 0750)
Expect(err).NotTo(HaveOccurred())
}
backends, err := ListSystemBackends(tempDir)
Expect(err).NotTo(HaveOccurred())
Expect(backends).To(HaveLen(len(backendNames)))
for _, name := range backendNames {
Expect(backends).To(HaveKeyWithValue(name, filepath.Join(tempDir, name, "run.sh")))
}
})
It("should handle backends with aliases", func() {
backendName := "backend1"
alias := "alias1"
backendPath := filepath.Join(tempDir, backendName)
// Create backend directory
err := os.MkdirAll(backendPath, 0750)
Expect(err).NotTo(HaveOccurred())
// Create alias file
err = os.WriteFile(filepath.Join(backendPath, "alias"), []byte(alias), 0644)
Expect(err).NotTo(HaveOccurred())
backends, err := ListSystemBackends(tempDir)
Expect(err).NotTo(HaveOccurred())
Expect(backends).To(HaveKeyWithValue(alias, filepath.Join(tempDir, backendName, "run.sh")))
})
It("should return error when base path doesn't exist", func() {
_, err := ListSystemBackends(filepath.Join(tempDir, "non-existent"))
Expect(err).To(HaveOccurred())
})
})
})

View file

@ -1,109 +1,109 @@
package gallery
import (
"errors"
"fmt"
"os"
"path/filepath"
"strings"
"dario.cat/mergo"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/pkg/downloader"
"github.com/mudler/LocalAI/pkg/utils"
"github.com/rs/zerolog/log"
"gopkg.in/yaml.v2"
)
// Installs a model from the gallery
func InstallModelFromGallery(galleries []config.Gallery, name string, basePath string, req GalleryModel, downloadStatus func(string, string, string, float64), enforceScan bool) error {
applyModel := func(model *GalleryModel) error {
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
var config Config
if len(model.URL) > 0 {
var err error
config, err = GetGalleryConfigFromURL(model.URL, basePath)
if err != nil {
return err
}
config.Description = model.Description
config.License = model.License
} else if len(model.ConfigFile) > 0 {
// TODO: is this worse than using the override method with a blank cfg yaml?
reYamlConfig, err := yaml.Marshal(model.ConfigFile)
if err != nil {
return err
}
config = Config{
ConfigFile: string(reYamlConfig),
Description: model.Description,
License: model.License,
URLs: model.URLs,
Name: model.Name,
Files: make([]File, 0), // Real values get added below, must be blank
// Prompt Template Skipped for now - I expect in this mode that they will be delivered as files.
}
} else {
return fmt.Errorf("invalid gallery model %+v", model)
}
installName := model.Name
if req.Name != "" {
installName = req.Name
}
// Copy the model configuration from the request schema
config.URLs = append(config.URLs, model.URLs...)
config.Icon = model.Icon
config.Files = append(config.Files, req.AdditionalFiles...)
config.Files = append(config.Files, model.AdditionalFiles...)
// TODO model.Overrides could be merged with user overrides (not defined yet)
if err := mergo.Merge(&model.Overrides, req.Overrides, mergo.WithOverride); err != nil {
return err
}
if err := InstallModel(basePath, installName, &config, model.Overrides, downloadStatus, enforceScan); err != nil {
return err
}
return nil
}
models, err := AvailableGalleryModels(galleries, basePath)
func GetGalleryConfigFromURL[T any](url string, basePath string) (T, error) {
var config T
uri := downloader.URI(url)
err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error {
return yaml.Unmarshal(d, &config)
})
if err != nil {
return err
log.Error().Err(err).Str("url", url).Msg("failed to get gallery config for url")
return config, err
}
model := FindModel(models, name, basePath)
if model == nil {
return fmt.Errorf("no model found with name %q", name)
}
return applyModel(model)
return config, nil
}
func FindModel(models []*GalleryModel, name string, basePath string) *GalleryModel {
var model *GalleryModel
func ReadConfigFile[T any](filePath string) (*T, error) {
// Read the YAML file
yamlFile, err := os.ReadFile(filePath)
if err != nil {
return nil, fmt.Errorf("failed to read YAML file: %v", err)
}
// Unmarshal YAML data into a Config struct
var config T
err = yaml.Unmarshal(yamlFile, &config)
if err != nil {
return nil, fmt.Errorf("failed to unmarshal YAML: %v", err)
}
return &config, nil
}
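Both readers are now generic over the element type, so one code path loads model and backend configs alike. Sketched usage (paths and URL are hypothetical; only the `ModelConfig` instantiation appears in the tests, `GalleryBackend` here is an assumption):

```go
package main

import (
	"fmt"

	"github.com/mudler/LocalAI/core/gallery"
)

func main() {
	// Local file -> *ModelConfig.
	mc, err := gallery.ReadConfigFile[gallery.ModelConfig]("/tmp/models/gallery_simple.yaml")
	if err == nil {
		fmt.Println(mc.Name)
	}
	// Remote URL -> any YAML-decodable T.
	gb, err := gallery.GetGalleryConfigFromURL[gallery.GalleryBackend]("https://example.com/entry.yaml", "/tmp")
	if err == nil {
		fmt.Println(gb.Name)
	}
}
```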
type GalleryElement interface {
SetGallery(gallery config.Gallery)
SetInstalled(installed bool)
GetName() string
GetDescription() string
GetTags() []string
GetGallery() config.Gallery
}
type GalleryElements[T GalleryElement] []T
func (gm GalleryElements[T]) Search(term string) GalleryElements[T] {
var filteredModels GalleryElements[T]
for _, m := range gm {
if strings.Contains(m.GetName(), term) ||
strings.Contains(m.GetDescription(), term) ||
strings.Contains(m.GetGallery().Name, term) ||
strings.Contains(strings.Join(m.GetTags(), ","), term) {
filteredModels = append(filteredModels, m)
}
}
return filteredModels
}
func (gm GalleryElements[T]) FindByName(name string) T {
for _, m := range gm {
if strings.EqualFold(m.GetName(), name) {
return m
}
}
var zero T
return zero
}
func (gm GalleryElements[T]) Paginate(pageNum int, itemsNum int) GalleryElements[T] {
start := (pageNum - 1) * itemsNum
end := start + itemsNum
if start > len(gm) {
start = len(gm)
}
if end > len(gm) {
end = len(gm)
}
return gm[start:end]
}
func FindGalleryElement[T GalleryElement](models []T, name string, basePath string) T {
var model T
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
if !strings.Contains(name, "@") {
for _, m := range models {
if strings.EqualFold(m.Name, name) {
if strings.EqualFold(m.GetName(), name) {
model = m
break
}
}
if model == nil {
return nil
}
} else {
for _, m := range models {
if strings.EqualFold(name, fmt.Sprintf("%s@%s", m.Gallery.Name, m.Name)) {
if strings.EqualFold(name, fmt.Sprintf("%s@%s", m.GetGallery().Name, m.GetName())) {
model = m
break
}
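The generic container also carries search, pagination, and the `gallery@name` addressing that `FindGalleryElement` resolves. A sketch over backends (gallery name, URL, and paths assumed):

```go
package main

import (
	"fmt"

	"github.com/mudler/LocalAI/core/config"
	"github.com/mudler/LocalAI/core/gallery"
)

func main() {
	galleries := []config.Gallery{{Name: "localai", URL: "https://example.com/backend-gallery.yaml"}}
	backends, err := gallery.AvailableBackends(galleries, "/tmp/backends")
	if err != nil {
		panic(err)
	}
	// Substring search across name/description/gallery/tags, then page 1 (20 items).
	for _, b := range backends.Search("coqui").Paginate(1, 20) {
		fmt.Println(b.GetName())
	}
	// Qualified "gallery@name" lookup disambiguates across galleries.
	if b := gallery.FindGalleryElement(backends, "localai@rocm-coqui-master", "/tmp/backends"); b != nil {
		fmt.Println("found:", b.GetName())
	}
}
```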
@ -116,12 +116,28 @@ func FindModel(models []*GalleryModel, name string, basePath string) *GalleryMod
// List available models
// Model galleries are lists of YAML files hosted on a remote server (for example GitHub).
// Each yaml file contains a list of models that can be downloaded and optionally overrides to define a new model setting.
func AvailableGalleryModels(galleries []config.Gallery, basePath string) (GalleryModels, error) {
func AvailableGalleryModels(galleries []config.Gallery, basePath string) (GalleryElements[*GalleryModel], error) {
var models []*GalleryModel
// Get models from galleries
for _, gallery := range galleries {
galleryModels, err := getGalleryModels(gallery, basePath)
galleryModels, err := getGalleryElements[*GalleryModel](gallery, basePath)
if err != nil {
return nil, err
}
models = append(models, galleryModels...)
}
return models, nil
}
// List available backends
func AvailableBackends(galleries []config.Gallery, basePath string) (GalleryElements[*GalleryBackend], error) {
var models []*GalleryBackend
// Get backends from galleries
for _, gallery := range galleries {
galleryModels, err := getGalleryElements[*GalleryBackend](gallery, basePath)
if err != nil {
return nil, err
}
@ -146,8 +162,8 @@ func findGalleryURLFromReferenceURL(url string, basePath string) (string, error)
return refFile, err
}
func getGalleryModels(gallery config.Gallery, basePath string) ([]*GalleryModel, error) {
var models []*GalleryModel = []*GalleryModel{}
func getGalleryElements[T GalleryElement](gallery config.Gallery, basePath string) ([]T, error) {
var models []T = []T{}
if strings.HasSuffix(gallery.URL, ".ref") {
var err error
@ -170,97 +186,16 @@ func getGalleryModels(gallery config.Gallery, basePath string) ([]*GalleryModel,
// Add gallery to models
for _, model := range models {
model.Gallery = gallery
model.SetGallery(gallery)
// we check if the model was already installed by checking if the config file exists
// TODO: (what to do if the model doesn't install a config file?)
if _, err := os.Stat(filepath.Join(basePath, fmt.Sprintf("%s.yaml", model.Name))); err == nil {
model.Installed = true
// TODO: This is sub-optimal now that the gallery handles both backends and models - we need to abstract this away
if _, err := os.Stat(filepath.Join(basePath, fmt.Sprintf("%s.yaml", model.GetName()))); err == nil {
model.SetInstalled(true)
}
if _, err := os.Stat(filepath.Join(basePath, model.GetName())); err == nil {
model.SetInstalled(true)
}
}
return models, nil
}
func GetLocalModelConfiguration(basePath string, name string) (*Config, error) {
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
galleryFile := filepath.Join(basePath, galleryFileName(name))
return ReadConfigFile(galleryFile)
}
func DeleteModelFromSystem(basePath string, name string, additionalFiles []string) error {
// os.PathSeparator is not allowed in model names. Replace them with "__" to avoid conflicts with file paths.
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
configFile := filepath.Join(basePath, fmt.Sprintf("%s.yaml", name))
galleryFile := filepath.Join(basePath, galleryFileName(name))
for _, f := range []string{configFile, galleryFile} {
if err := utils.VerifyPath(f, basePath); err != nil {
return fmt.Errorf("failed to verify path %s: %w", f, err)
}
}
var err error
// Delete all the files associated to the model
// read the model config
galleryconfig, err := ReadConfigFile(galleryFile)
if err != nil {
log.Error().Err(err).Msgf("failed to read gallery file %s", configFile)
}
var filesToRemove []string
// Remove additional files
if galleryconfig != nil {
for _, f := range galleryconfig.Files {
fullPath := filepath.Join(basePath, f.Filename)
filesToRemove = append(filesToRemove, fullPath)
}
}
for _, f := range additionalFiles {
fullPath := filepath.Join(filepath.Join(basePath, f))
filesToRemove = append(filesToRemove, fullPath)
}
filesToRemove = append(filesToRemove, configFile)
filesToRemove = append(filesToRemove, galleryFile)
// skip duplicates
filesToRemove = utils.Unique(filesToRemove)
// Removing files
for _, f := range filesToRemove {
if e := os.Remove(f); e != nil {
err = errors.Join(err, fmt.Errorf("failed to remove file %s: %w", f, e))
}
}
return err
}
// This is ***NEVER*** going to be perfect or finished.
// This is a BEST EFFORT function to surface known-vulnerable models to users.
func SafetyScanGalleryModels(galleries []config.Gallery, basePath string) error {
galleryModels, err := AvailableGalleryModels(galleries, basePath)
if err != nil {
return err
}
for _, gM := range galleryModels {
if gM.Installed {
err = errors.Join(err, SafetyScanGalleryModel(gM))
}
}
return err
}
func SafetyScanGalleryModel(galleryModel *GalleryModel) error {
for _, file := range galleryModel.AdditionalFiles {
scanResults, err := downloader.HuggingFaceScan(downloader.URI(file.URI))
if err != nil && errors.Is(err, downloader.ErrUnsafeFilesFound) {
log.Error().Str("model", galleryModel.Name).Strs("clamAV", scanResults.ClamAVInfectedFiles).Strs("pickles", scanResults.DangerousPickles).Msg("Contains unsafe file(s)!")
return err
}
}
return nil
}

View file

@ -1,7 +1,6 @@
package gallery_test
import (
"os"
"testing"
. "github.com/onsi/ginkgo/v2"
@ -12,9 +11,3 @@ func TestGallery(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecs(t, "Gallery test suite")
}
var _ = BeforeSuite(func() {
if os.Getenv("FIXTURES") == "" {
Fail("FIXTURES env var not set")
}
})

View file

@ -0,0 +1,19 @@
package gallery
import "github.com/mudler/LocalAI/core/config"
type Metadata struct {
URL string `json:"url,omitempty" yaml:"url,omitempty"`
Name string `json:"name,omitempty" yaml:"name,omitempty"`
Description string `json:"description,omitempty" yaml:"description,omitempty"`
License string `json:"license,omitempty" yaml:"license,omitempty"`
URLs []string `json:"urls,omitempty" yaml:"urls,omitempty"`
Icon string `json:"icon,omitempty" yaml:"icon,omitempty"`
Tags []string `json:"tags,omitempty" yaml:"tags,omitempty"`
// AdditionalFiles are used to add additional files to the model
AdditionalFiles []File `json:"files,omitempty" yaml:"files,omitempty"`
// Gallery is a reference to the gallery which contains the model
Gallery config.Gallery `json:"gallery,omitempty" yaml:"gallery,omitempty"`
// Installed is used to indicate if the model is installed or not
Installed bool `json:"installed,omitempty" yaml:"installed,omitempty"`
}

View file

@ -5,8 +5,10 @@ import (
"fmt"
"os"
"path/filepath"
"strings"
"dario.cat/mergo"
"github.com/mudler/LocalAI/core/config"
lconfig "github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/pkg/downloader"
"github.com/mudler/LocalAI/pkg/utils"
@ -41,10 +43,10 @@ prompt_templates:
content: ""
*/
// Config is the model configuration which contains all the model details
// ModelConfig is the model configuration which contains all the model details
// This configuration is read from the gallery endpoint and is used to download and install the model
// It is the internal structure, separated from the request
type Config struct {
type ModelConfig struct {
Description string `yaml:"description"`
Icon string `yaml:"icon"`
License string `yaml:"license"`
@ -66,37 +68,78 @@ type PromptTemplate struct {
Content string `yaml:"content"`
}
func GetGalleryConfigFromURL(url string, basePath string) (Config, error) {
var config Config
uri := downloader.URI(url)
err := uri.DownloadWithCallback(basePath, func(url string, d []byte) error {
return yaml.Unmarshal(d, &config)
})
if err != nil {
log.Error().Err(err).Str("url", url).Msg("failed to get gallery config for url")
return config, err
// Installs a model from the gallery
func InstallModelFromGallery(galleries []config.Gallery, name string, basePath string, req GalleryModel, downloadStatus func(string, string, string, float64), enforceScan bool) error {
applyModel := func(model *GalleryModel) error {
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
var config ModelConfig
if len(model.URL) > 0 {
var err error
config, err = GetGalleryConfigFromURL[ModelConfig](model.URL, basePath)
if err != nil {
return err
}
config.Description = model.Description
config.License = model.License
} else if len(model.ConfigFile) > 0 {
// TODO: is this worse than using the override method with a blank cfg yaml?
reYamlConfig, err := yaml.Marshal(model.ConfigFile)
if err != nil {
return err
}
config = ModelConfig{
ConfigFile: string(reYamlConfig),
Description: model.Description,
License: model.License,
URLs: model.URLs,
Name: model.Name,
Files: make([]File, 0), // Real values get added below, must be blank
// Prompt Template Skipped for now - I expect in this mode that they will be delivered as files.
}
} else {
return fmt.Errorf("invalid gallery model %+v", model)
}
installName := model.Name
if req.Name != "" {
installName = req.Name
}
// Copy the model configuration from the request schema
config.URLs = append(config.URLs, model.URLs...)
config.Icon = model.Icon
config.Files = append(config.Files, req.AdditionalFiles...)
config.Files = append(config.Files, model.AdditionalFiles...)
// TODO model.Overrides could be merged with user overrides (not defined yet)
if err := mergo.Merge(&model.Overrides, req.Overrides, mergo.WithOverride); err != nil {
return err
}
if err := InstallModel(basePath, installName, &config, model.Overrides, downloadStatus, enforceScan); err != nil {
return err
}
return nil
}
return config, nil
models, err := AvailableGalleryModels(galleries, basePath)
if err != nil {
return err
}
model := FindGalleryElement(models, name, basePath)
if model == nil {
return fmt.Errorf("no model found with name %q", name)
}
return applyModel(model)
}
func ReadConfigFile(filePath string) (*Config, error) {
// Read the YAML file
yamlFile, err := os.ReadFile(filePath)
if err != nil {
return nil, fmt.Errorf("failed to read YAML file: %v", err)
}
// Unmarshal YAML data into a Config struct
var config Config
err = yaml.Unmarshal(yamlFile, &config)
if err != nil {
return nil, fmt.Errorf("failed to unmarshal YAML: %v", err)
}
return &config, nil
}
func InstallModel(basePath, nameOverride string, config *Config, configOverrides map[string]interface{}, downloadStatus func(string, string, string, float64), enforceScan bool) error {
func InstallModel(basePath, nameOverride string, config *ModelConfig, configOverrides map[string]interface{}, downloadStatus func(string, string, string, float64), enforceScan bool) error {
// Create base path if it doesn't exist
err := os.MkdirAll(basePath, 0750)
if err != nil {
@ -219,3 +262,88 @@ func InstallModel(basePath, nameOverride string, config *Config, configOverrides
func galleryFileName(name string) string {
return "._gallery_" + name + ".yaml"
}
func GetLocalModelConfiguration(basePath string, name string) (*ModelConfig, error) {
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
galleryFile := filepath.Join(basePath, galleryFileName(name))
return ReadConfigFile[ModelConfig](galleryFile)
}
func DeleteModelFromSystem(basePath string, name string, additionalFiles []string) error {
// os.PathSeparator is not allowed in model names. Replace them with "__" to avoid conflicts with file paths.
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
configFile := filepath.Join(basePath, fmt.Sprintf("%s.yaml", name))
galleryFile := filepath.Join(basePath, galleryFileName(name))
for _, f := range []string{configFile, galleryFile} {
if err := utils.VerifyPath(f, basePath); err != nil {
return fmt.Errorf("failed to verify path %s: %w", f, err)
}
}
var err error
// Delete all the files associated to the model
// read the model config
galleryconfig, err := ReadConfigFile[ModelConfig](galleryFile)
if err != nil {
log.Error().Err(err).Msgf("failed to read gallery file %s", configFile)
}
var filesToRemove []string
// Remove additional files
if galleryconfig != nil {
for _, f := range galleryconfig.Files {
fullPath := filepath.Join(basePath, f.Filename)
filesToRemove = append(filesToRemove, fullPath)
}
}
for _, f := range additionalFiles {
fullPath := filepath.Join(filepath.Join(basePath, f))
filesToRemove = append(filesToRemove, fullPath)
}
filesToRemove = append(filesToRemove, configFile)
filesToRemove = append(filesToRemove, galleryFile)
// skip duplicates
filesToRemove = utils.Unique(filesToRemove)
// Removing files
for _, f := range filesToRemove {
if e := os.Remove(f); e != nil {
err = errors.Join(err, fmt.Errorf("failed to remove file %s: %w", f, e))
}
}
return err
}
// This is ***NEVER*** going to be perfect or finished.
// This is a BEST EFFORT function to surface known-vulnerable models to users.
func SafetyScanGalleryModels(galleries []config.Gallery, basePath string) error {
galleryModels, err := AvailableGalleryModels(galleries, basePath)
if err != nil {
return err
}
for _, gM := range galleryModels {
if gM.Installed {
err = errors.Join(err, SafetyScanGalleryModel(gM))
}
}
return err
}
func SafetyScanGalleryModel(galleryModel *GalleryModel) error {
for _, file := range galleryModel.AdditionalFiles {
scanResults, err := downloader.HuggingFaceScan(downloader.URI(file.URI))
if err != nil && errors.Is(err, downloader.ErrUnsafeFilesFound) {
log.Error().Str("model", galleryModel.Name).Strs("clamAV", scanResults.ClamAVInfectedFiles).Strs("pickles", scanResults.DangerousPickles).Msg("Contains unsafe file(s)!")
return err
}
}
return nil
}

View file

@ -16,12 +16,18 @@ const bertEmbeddingsURL = `https://gist.githubusercontent.com/mudler/0a080b166b8
var _ = Describe("Model test", func() {
BeforeEach(func() {
if os.Getenv("FIXTURES") == "" {
Skip("FIXTURES env var not set, skipping model tests")
}
})
Context("Downloading", func() {
It("applies model correctly", func() {
tempdir, err := os.MkdirTemp("", "test")
Expect(err).ToNot(HaveOccurred())
defer os.RemoveAll(tempdir)
c, err := ReadConfigFile(filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
c, err := ReadConfigFile[ModelConfig](filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
Expect(err).ToNot(HaveOccurred())
err = InstallModel(tempdir, "", c, map[string]interface{}{}, func(string, string, string, float64) {}, true)
Expect(err).ToNot(HaveOccurred())
@ -107,7 +113,7 @@ var _ = Describe("Model test", func() {
tempdir, err := os.MkdirTemp("", "test")
Expect(err).ToNot(HaveOccurred())
defer os.RemoveAll(tempdir)
c, err := ReadConfigFile(filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
c, err := ReadConfigFile[ModelConfig](filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
Expect(err).ToNot(HaveOccurred())
err = InstallModel(tempdir, "foo", c, map[string]interface{}{}, func(string, string, string, float64) {}, true)
@ -123,7 +129,7 @@ var _ = Describe("Model test", func() {
tempdir, err := os.MkdirTemp("", "test")
Expect(err).ToNot(HaveOccurred())
defer os.RemoveAll(tempdir)
c, err := ReadConfigFile(filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
c, err := ReadConfigFile[ModelConfig](filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
Expect(err).ToNot(HaveOccurred())
err = InstallModel(tempdir, "foo", c, map[string]interface{}{"backend": "foo"}, func(string, string, string, float64) {}, true)
@ -149,7 +155,7 @@ var _ = Describe("Model test", func() {
tempdir, err := os.MkdirTemp("", "test")
Expect(err).ToNot(HaveOccurred())
defer os.RemoveAll(tempdir)
c, err := ReadConfigFile(filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
c, err := ReadConfigFile[ModelConfig](filepath.Join(os.Getenv("FIXTURES"), "gallery_simple.yaml"))
Expect(err).ToNot(HaveOccurred())
err = InstallModel(tempdir, "../../../foo", c, map[string]interface{}{}, func(string, string, string, float64) {}, true)

View file

@ -0,0 +1,46 @@
package gallery
import (
"fmt"
"github.com/mudler/LocalAI/core/config"
)
// GalleryModel is the struct used to represent a model in the gallery returned by the endpoint.
// It is used to install the model by resolving the URL and downloading the files.
// The other fields are used to override the configuration of the model.
type GalleryModel struct {
Metadata `json:",inline" yaml:",inline"`
// config_file is read in the situation where URL is blank - and therefore this is a base config.
ConfigFile map[string]interface{} `json:"config_file,omitempty" yaml:"config_file,omitempty"`
// Overrides are used to override the configuration of the model located at URL
Overrides map[string]interface{} `json:"overrides,omitempty" yaml:"overrides,omitempty"`
}
func (m *GalleryModel) SetGallery(gallery config.Gallery) {
m.Gallery = gallery
}
func (m *GalleryModel) SetInstalled(installed bool) {
m.Installed = installed
}
func (m *GalleryModel) GetName() string {
return m.Name
}
func (m *GalleryModel) GetGallery() config.Gallery {
return m.Gallery
}
func (m GalleryModel) ID() string {
return fmt.Sprintf("%s@%s", m.Gallery.Name, m.Name)
}
func (m *GalleryModel) GetTags() []string {
return m.Tags
}
func (m *GalleryModel) GetDescription() string {
return m.Description
}

View file

@ -1,25 +0,0 @@
package gallery
import "github.com/mudler/LocalAI/core/config"
type GalleryOp struct {
Id string
GalleryModelName string
ConfigURL string
Delete bool
Req GalleryModel
Galleries []config.Gallery
}
type GalleryOpStatus struct {
Deletion bool `json:"deletion"` // Deletion is true if the operation is a deletion
FileName string `json:"file_name"`
Error error `json:"error"`
Processed bool `json:"processed"`
Message string `json:"message"`
Progress float64 `json:"progress"`
TotalFileSize string `json:"file_size"`
DownloadedFileSize string `json:"downloaded_size"`
GalleryModelName string `json:"gallery_model_name"`
}

View file

@ -1,76 +0,0 @@
package gallery
import (
"fmt"
"strings"
"github.com/mudler/LocalAI/core/config"
)
// GalleryModel is the struct used to represent a model in the gallery returned by the endpoint.
// It is used to install the model by resolving the URL and downloading the files.
// The other fields are used to override the configuration of the model.
type GalleryModel struct {
Metadata `json:",inline" yaml:",inline"`
// config_file is read in the situation where URL is blank - and therefore this is a base config.
ConfigFile map[string]interface{} `json:"config_file,omitempty" yaml:"config_file,omitempty"`
// Overrides are used to override the configuration of the model located at URL
Overrides map[string]interface{} `json:"overrides,omitempty" yaml:"overrides,omitempty"`
}
type Metadata struct {
URL string `json:"url,omitempty" yaml:"url,omitempty"`
Name string `json:"name,omitempty" yaml:"name,omitempty"`
Description string `json:"description,omitempty" yaml:"description,omitempty"`
License string `json:"license,omitempty" yaml:"license,omitempty"`
URLs []string `json:"urls,omitempty" yaml:"urls,omitempty"`
Icon string `json:"icon,omitempty" yaml:"icon,omitempty"`
Tags []string `json:"tags,omitempty" yaml:"tags,omitempty"`
// AdditionalFiles are used to add additional files to the model
AdditionalFiles []File `json:"files,omitempty" yaml:"files,omitempty"`
// Gallery is a reference to the gallery which contains the model
Gallery config.Gallery `json:"gallery,omitempty" yaml:"gallery,omitempty"`
// Installed is used to indicate if the model is installed or not
Installed bool `json:"installed,omitempty" yaml:"installed,omitempty"`
}
func (m GalleryModel) ID() string {
return fmt.Sprintf("%s@%s", m.Gallery.Name, m.Name)
}
type GalleryModels []*GalleryModel
func (gm GalleryModels) Search(term string) GalleryModels {
var filteredModels GalleryModels
for _, m := range gm {
if strings.Contains(m.Name, term) ||
strings.Contains(m.Description, term) ||
strings.Contains(m.Gallery.Name, term) ||
strings.Contains(strings.Join(m.Tags, ","), term) {
filteredModels = append(filteredModels, m)
}
}
return filteredModels
}
func (gm GalleryModels) FindByName(name string) *GalleryModel {
for _, m := range gm {
if strings.EqualFold(m.Name, name) {
return m
}
}
return nil
}
func (gm GalleryModels) Paginate(pageNum int, itemsNum int) GalleryModels {
start := (pageNum - 1) * itemsNum
end := start + itemsNum
if start > len(gm) {
start = len(gm)
}
if end > len(gm) {
end = len(gm)
}
return gm[start:end]
}

View file

@ -14,7 +14,7 @@ var _ = Describe("Gallery API tests", func() {
URL: "github:go-skynet/model-gallery/gpt4all-j.yaml@main",
},
}
e, err := GetGalleryConfigFromURL(req.URL, "")
e, err := GetGalleryConfigFromURL[ModelConfig](req.URL, "")
Expect(err).ToNot(HaveOccurred())
Expect(e.Name).To(Equal("gpt4all-j"))
})

View file

@ -204,7 +204,7 @@ func API(application *application.Application) (*fiber.App, error) {
utils.LoadConfig(application.ApplicationConfig().ConfigsDir, openai.AssistantsConfigFile, &openai.Assistants)
utils.LoadConfig(application.ApplicationConfig().ConfigsDir, openai.AssistantsFileConfigFile, &openai.AssistantFiles)
galleryService := services.NewGalleryService(application.ApplicationConfig())
galleryService := services.NewGalleryService(application.ApplicationConfig(), application.ModelLoader())
galleryService.Start(application.ApplicationConfig().Context, application.BackendLoader())
requestExtractor := middleware.NewRequestExtractor(application.BackendLoader(), application.ModelLoader(), application.ApplicationConfig())

View file

@ -485,29 +485,6 @@ var _ = Describe("API test", func() {
Expect(err).ToNot(HaveOccurred())
Expect(content["backend"]).To(Equal("llama"))
})
It("apply models from config", func() {
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
ConfigURL: "https://raw.githubusercontent.com/mudler/LocalAI/v2.25.0/embedded/models/hermes-2-pro-mistral.yaml",
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
uuid := response["uuid"].(string)
Eventually(func() bool {
response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
return response["processed"].(bool)
}, "900s", "10s").Should(Equal(true))
Eventually(func() []string {
models, _ := client.ListModels(context.TODO())
modelList := []string{}
for _, m := range models.Models {
modelList = append(modelList, m.ID)
}
return modelList
}, "360s", "10s").Should(ContainElements("hermes-2-pro-mistral"))
})
It("apply models without overrides", func() {
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: bertEmbeddingsURL,
@ -533,80 +510,6 @@ var _ = Describe("API test", func() {
Expect(content["usage"]).To(ContainSubstring("You can test this model with curl like this"))
})
It("runs openllama gguf(llama-cpp)", Label("llama-gguf"), func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")
}
modelName := "hermes-2-pro-mistral"
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
ConfigURL: "https://raw.githubusercontent.com/mudler/LocalAI/v2.25.0/embedded/models/hermes-2-pro-mistral.yaml",
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
uuid := response["uuid"].(string)
Eventually(func() bool {
response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
return response["processed"].(bool)
}, "900s", "10s").Should(Equal(true))
By("testing chat")
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: modelName, Messages: []openai.ChatCompletionMessage{
{
Role: "user",
Content: "How much is 2+2?",
},
}})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp.Choices)).To(Equal(1))
Expect(resp.Choices[0].Message.Content).To(Or(ContainSubstring("4"), ContainSubstring("four")))
By("testing functions")
resp2, err := client.CreateChatCompletion(
context.TODO(),
openai.ChatCompletionRequest{
Model: modelName,
Messages: []openai.ChatCompletionMessage{
{
Role: "user",
Content: "What is the weather like in San Francisco (celsius)?",
},
},
Functions: []openai.FunctionDefinition{
openai.FunctionDefinition{
Name: "get_current_weather",
Description: "Get the current weather",
Parameters: jsonschema.Definition{
Type: jsonschema.Object,
Properties: map[string]jsonschema.Definition{
"location": {
Type: jsonschema.String,
Description: "The city and state, e.g. San Francisco, CA",
},
"unit": {
Type: jsonschema.String,
Enum: []string{"celcius", "fahrenheit"},
},
},
Required: []string{"location"},
},
},
},
})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp2.Choices)).To(Equal(1))
Expect(resp2.Choices[0].Message.FunctionCall).ToNot(BeNil())
Expect(resp2.Choices[0].Message.FunctionCall.Name).To(Equal("get_current_weather"), resp2.Choices[0].Message.FunctionCall.Name)
var res map[string]string
err = json.Unmarshal([]byte(resp2.Choices[0].Message.FunctionCall.Arguments), &res)
Expect(err).ToNot(HaveOccurred())
Expect(res["location"]).To(ContainSubstring("San Francisco"), fmt.Sprint(res))
Expect(res["unit"]).To(Equal("celcius"), fmt.Sprint(res))
Expect(string(resp2.Choices[0].FinishReason)).To(Equal("function_call"), fmt.Sprint(resp2.Choices[0].FinishReason))
})
})
})
@ -673,6 +576,82 @@ var _ = Describe("API test", func() {
_, err = os.ReadDir(tmpdir)
Expect(err).To(HaveOccurred())
})
It("runs gguf models (chat)", Label("llama-gguf"), func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")
}
modelName := "qwen3-1.7b"
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
ID: "localai@" + modelName,
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
uuid := response["uuid"].(string)
Eventually(func() bool {
response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
return response["processed"].(bool)
}, "900s", "10s").Should(Equal(true))
By("testing chat")
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: modelName, Messages: []openai.ChatCompletionMessage{
{
Role: "user",
Content: "How much is 2+2?",
},
}})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp.Choices)).To(Equal(1))
Expect(resp.Choices[0].Message.Content).To(Or(ContainSubstring("4"), ContainSubstring("four")))
By("testing functions")
resp2, err := client.CreateChatCompletion(
context.TODO(),
openai.ChatCompletionRequest{
Model: modelName,
Messages: []openai.ChatCompletionMessage{
{
Role: "user",
Content: "What is the weather like in San Francisco (celsius)?",
},
},
Functions: []openai.FunctionDefinition{
openai.FunctionDefinition{
Name: "get_current_weather",
Description: "Get the current weather",
Parameters: jsonschema.Definition{
Type: jsonschema.Object,
Properties: map[string]jsonschema.Definition{
"location": {
Type: jsonschema.String,
Description: "The city and state, e.g. San Francisco, CA",
},
"unit": {
Type: jsonschema.String,
Enum: []string{"celsius", "fahrenheit"},
},
},
Required: []string{"location"},
},
},
},
})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp2.Choices)).To(Equal(1))
Expect(resp2.Choices[0].Message.FunctionCall).ToNot(BeNil())
Expect(resp2.Choices[0].Message.FunctionCall.Name).To(Equal("get_current_weather"), resp2.Choices[0].Message.FunctionCall.Name)
var res map[string]string
err = json.Unmarshal([]byte(resp2.Choices[0].Message.FunctionCall.Arguments), &res)
Expect(err).ToNot(HaveOccurred())
Expect(res["location"]).To(ContainSubstring("San Francisco"), fmt.Sprint(res))
Expect(res["unit"]).To(Equal("celcius"), fmt.Sprint(res))
Expect(string(resp2.Choices[0].FinishReason)).To(Equal("function_call"), fmt.Sprint(resp2.Choices[0].FinishReason))
})
It("installs and is capable to run tts", Label("tts"), func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")


@ -331,7 +331,7 @@ func modelActionItems(m *gallery.GalleryModel, processTracker ProcessTracker, ga
elem.If(
currentlyProcessing,
elem.Node( // If currently installing, show progress bar
elem.Raw(StartProgressBar(jobID, "0", progressMessage)),
elem.Raw(StartModelProgressBar(jobID, "0", progressMessage)),
), // Otherwise, show install button (if not installed) or display "Installed"
elem.If(m.Installed,
elem.Node(elem.Div(
@ -418,3 +418,335 @@ func ListModels(models []*gallery.GalleryModel, processTracker ProcessTracker, g
return wrapper.Render()
}
func ListBackends(backends []*gallery.GalleryBackend, processTracker ProcessTracker, galleryService *services.GalleryService) string {
backendsElements := []elem.Node{}
for _, b := range backends {
elems := []elem.Node{}
if b.Icon == "" {
b.Icon = noImage
}
divProperties := attrs.Props{
"class": "flex justify-center items-center",
}
elems = append(elems,
elem.Div(divProperties,
elem.A(attrs.Props{
"href": "#!",
},
elem.Img(attrs.Props{
"class": "rounded-t-lg max-h-48 max-w-96 object-cover mt-3",
"src": b.Icon,
"loading": "lazy",
}),
),
),
)
elems = append(elems,
backendDescription(b),
backendActionItems(b, processTracker, galleryService),
)
backendsElements = append(backendsElements,
elem.Div(
attrs.Props{
"class": "me-4 mb-2 block rounded-lg bg-white shadow-secondary-1 dark:bg-gray-800 dark:bg-surface-dark dark:text-white text-surface pb-2 bg-gray-800/90 border border-gray-700/50 rounded-xl overflow-hidden transition-all duration-300 hover:shadow-lg hover:shadow-blue-900/20 hover:-translate-y-1 hover:border-blue-700/50",
},
elem.Div(
attrs.Props{},
elems...,
),
),
backendModal(b),
)
}
wrapper := elem.Div(attrs.Props{
"class": "dark grid grid-cols-1 grid-rows-1 md:grid-cols-3 block rounded-lg shadow-secondary-1 dark:bg-surface-dark",
}, backendsElements...)
return wrapper.Render()
}
func backendDescription(b *gallery.GalleryBackend) elem.Node {
return elem.Div(
attrs.Props{
"class": "p-6 text-surface dark:text-white",
},
elem.H5(
attrs.Props{
"class": "mb-2 text-xl font-bold leading-tight",
},
elem.Text(bluemonday.StrictPolicy().Sanitize(b.Name)),
),
elem.Div(
attrs.Props{
"class": "mb-4 text-sm truncate text-base",
},
elem.Text(bluemonday.StrictPolicy().Sanitize(b.Description)),
),
)
}
func backendActionItems(b *gallery.GalleryBackend, processTracker ProcessTracker, galleryService *services.GalleryService) elem.Node {
galleryID := fmt.Sprintf("%s@%s", b.Gallery.Name, b.Name)
currentlyProcessing := processTracker.Exists(galleryID)
jobID := ""
isDeletionOp := false
if currentlyProcessing {
status := galleryService.GetStatus(galleryID)
if status != nil && status.Deletion {
isDeletionOp = true
}
jobID = processTracker.Get(galleryID)
}
nodes := []elem.Node{
cardSpan("Repository: "+b.Gallery.Name, "fa-brands fa-git-alt"),
}
if b.License != "" {
nodes = append(nodes,
cardSpan("License: "+b.License, "fas fa-book"),
)
}
progressMessage := "Installation"
if isDeletionOp {
progressMessage = "Deletion"
}
return elem.Div(
attrs.Props{
"class": "px-6 pt-4 pb-2",
},
elem.P(
attrs.Props{
"class": "mb-4 text-base",
},
nodes...,
),
elem.Div(
attrs.Props{
"id": "action-div-" + dropBadChars(galleryID),
"class": "flow-root",
},
backendInfoButton(b),
elem.Div(
attrs.Props{
"class": "float-right",
},
elem.If(
currentlyProcessing,
elem.Node(
elem.Raw(StartModelProgressBar(jobID, "0", progressMessage)),
),
elem.If(b.Installed,
elem.Node(elem.Div(
attrs.Props{},
backendReInstallButton(galleryID),
backendDeleteButton(galleryID),
)),
backendInstallButton(galleryID),
),
),
),
),
)
}
func backendModal(b *gallery.GalleryBackend) elem.Node {
urls := []elem.Node{}
for _, url := range b.URLs {
urls = append(urls,
elem.Li(attrs.Props{}, link(url, url)),
)
}
tagsNodes := []elem.Node{}
for _, tag := range b.Tags {
tagsNodes = append(tagsNodes,
searchableElement(tag, "fas fa-tag"),
)
}
modalID := fmt.Sprintf("modal-%s", dropBadChars(fmt.Sprintf("%s@%s", b.Gallery.Name, b.Name)))
return elem.Div(
attrs.Props{
"id": modalID,
"tabindex": "-1",
"aria-hidden": "true",
"class": "hidden fixed top-0 right-0 left-0 z-50 justify-center items-center w-full md:inset-0 h-full max-h-full bg-gray-900/50",
},
elem.Div(
attrs.Props{
"class": "relative p-4 w-full max-w-2xl h-[90vh] mx-auto mt-[5vh]",
},
elem.Div(
attrs.Props{
"class": "relative bg-white rounded-lg shadow dark:bg-gray-700 h-full flex flex-col",
},
elem.Div(
attrs.Props{
"class": "flex items-center justify-between p-4 md:p-5 border-b rounded-t dark:border-gray-600",
},
elem.H3(
attrs.Props{
"class": "text-xl font-semibold text-gray-900 dark:text-white",
},
elem.Text(bluemonday.StrictPolicy().Sanitize(b.Name)),
),
elem.Button(
attrs.Props{
"class": "text-gray-400 bg-transparent hover:bg-gray-200 hover:text-gray-900 rounded-lg text-sm w-8 h-8 ms-auto inline-flex justify-center items-center dark:hover:bg-gray-600 dark:hover:text-white",
"data-modal-hide": modalID,
},
elem.Raw(
`<svg class="w-3 h-3" aria-hidden="true" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 14 14">
<path stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="m1 1 6 6m0 0 6 6M7 7l6-6M7 7l-6 6"/>
</svg>`,
),
elem.Span(
attrs.Props{
"class": "sr-only",
},
elem.Text("Close modal"),
),
),
),
elem.Div(
attrs.Props{
"class": "p-4 md:p-5 space-y-4 overflow-y-auto flex-grow",
},
elem.Div(
attrs.Props{
"class": "flex justify-center items-center",
},
elem.Img(attrs.Props{
"src": b.Icon,
"class": "rounded-t-lg max-h-48 max-w-96 object-cover mt-3",
"loading": "lazy",
}),
),
elem.P(
attrs.Props{
"class": "text-base leading-relaxed text-gray-500 dark:text-gray-400",
},
elem.Text(bluemonday.StrictPolicy().Sanitize(b.Description)),
),
elem.Div(
attrs.Props{
"class": "flex flex-wrap gap-2",
},
tagsNodes...,
),
elem.Div(
attrs.Props{
"class": "text-base leading-relaxed text-gray-500 dark:text-gray-400",
},
elem.Ul(attrs.Props{}, urls...),
),
),
elem.Div(
attrs.Props{
"class": "flex items-center p-4 md:p-5 border-t border-gray-200 rounded-b dark:border-gray-600",
},
elem.Button(
attrs.Props{
"data-modal-hide": modalID,
"type": "button",
"class": "text-white bg-blue-700 hover:bg-blue-800 focus:ring-4 focus:outline-none focus:ring-blue-300 font-medium rounded-lg text-sm px-5 py-2.5 text-center dark:bg-blue-600 dark:hover:bg-blue-700 dark:focus:ring-blue-800",
},
elem.Text("Close"),
),
),
),
),
)
}
func backendInfoButton(b *gallery.GalleryBackend) elem.Node {
modalID := fmt.Sprintf("modal-%s", dropBadChars(fmt.Sprintf("%s@%s", b.Gallery.Name, b.Name)))
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"class": "inline-flex items-center rounded-lg bg-gray-700 hover:bg-gray-600 px-4 py-2 text-sm font-medium text-white transition duration-300 ease-in-out",
"data-modal-target": modalID,
"data-modal-toggle": modalID,
},
elem.P(
attrs.Props{
"class": "flex items-center",
},
elem.I(
attrs.Props{
"class": "fas fa-info-circle pr-2",
},
),
elem.Text("Info"),
),
)
}
func backendInstallButton(galleryID string) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"class": "float-right inline-flex items-center rounded-lg bg-blue-600 hover:bg-blue-700 px-4 py-2 text-sm font-medium text-white transition duration-300 ease-in-out shadow hover:shadow-lg",
"hx-swap": "outerHTML",
"hx-post": "browse/install/backend/" + galleryID,
},
elem.I(
attrs.Props{
"class": "fa-solid fa-download pr-2",
},
),
elem.Text("Install"),
)
}
func backendReInstallButton(galleryID string) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"class": "float-right inline-block rounded bg-primary ml-2 px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong",
"hx-target": "#action-div-" + dropBadChars(galleryID),
"hx-swap": "outerHTML",
"hx-post": "browse/install/backend/" + galleryID,
},
elem.I(
attrs.Props{
"class": "fa-solid fa-arrow-rotate-right pr-2",
},
),
elem.Text("Reinstall"),
)
}
func backendDeleteButton(galleryID string) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"hx-confirm": "Are you sure you wish to delete the backend?",
"class": "float-right inline-block rounded bg-red-800 px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-red-accent-300 hover:shadow-red-2 focus:bg-red-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-red-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong",
"hx-target": "#action-div-" + dropBadChars(galleryID),
"hx-swap": "outerHTML",
"hx-post": "browse/delete/backend/" + galleryID,
},
elem.I(
attrs.Props{
"class": "fa-solid fa-cancel pr-2",
},
),
elem.Text("Delete"),
)
}
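These button helpers only emit markup; the behaviour lives in the hx-* attributes that HTMX interprets client-side (post to the install/delete route, then swap the action div with the returned fragment). A quick way to inspect what a card action renders, from within this package (the gallery ID is illustrative):

	// prints the raw <button ...> markup, including
	// hx-post="browse/install/backend/localai@vllm"
	fmt.Println(backendInstallButton("localai@vllm").Render())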


@ -6,7 +6,7 @@ import (
"github.com/microcosm-cc/bluemonday"
)
func DoneProgress(galleryID, text string, showDelete bool) string {
func DoneModelProgress(galleryID, text string, showDelete bool) string {
return elem.Div(
attrs.Props{
"id": "action-div-" + dropBadChars(galleryID),
@ -24,6 +24,24 @@ func DoneProgress(galleryID, text string, showDelete bool) string {
).Render()
}
func DoneBackendProgress(galleryID, text string, showDelete bool) string {
return elem.Div(
attrs.Props{
"id": "action-div-" + dropBadChars(galleryID),
},
elem.H3(
attrs.Props{
"role": "status",
"id": "pblabel",
"tabindex": "-1",
"autofocus": "",
},
elem.Text(bluemonday.StrictPolicy().Sanitize(text)),
),
elem.If(showDelete, backendDeleteButton(galleryID), backendReInstallButton(galleryID)),
).Render()
}
func ErrorProgress(err, galleryName string) string {
return elem.Div(
attrs.Props{},
@ -57,14 +75,22 @@ func ProgressBar(progress string) string {
).Render()
}
func StartProgressBar(uid, progress, text string) string {
func StartModelProgressBar(uid, progress, text string) string {
return progressBar(uid, "browse/job/", progress, text)
}
func StartBackendProgressBar(uid, progress, text string) string {
return progressBar(uid, "browse/backend/job/", progress, text)
}
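Both wrappers now delegate to progressBar and emit identical markup; the only difference is the URL prefix baked into the hx-get attributes, which decides whether the bar polls the model or the backend job routes. A minimal sketch (uid is assumed to be an existing job UUID):

	StartModelProgressBar(uid, "0", "Installation")   // polls browse/job/progress/<uid>
	StartBackendProgressBar(uid, "0", "Installation") // polls browse/backend/job/progress/<uid>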
func progressBar(uid, url, progress, text string) string {
if progress == "" {
progress = "0"
}
return elem.Div(
attrs.Props{
"hx-trigger": "done",
"hx-get": "browse/job/" + uid,
"hx-get": url + uid,
"hx-swap": "outerHTML",
"hx-target": "this",
},
@ -77,7 +103,7 @@ func StartProgressBar(uid, progress, text string) string {
},
elem.Text(bluemonday.StrictPolicy().Sanitize(text)), //Perhaps overly defensive
elem.Div(attrs.Props{
"hx-get": "browse/job/progress/" + uid,
"hx-get": url + "progress/" + uid,
"hx-trigger": "every 600ms",
"hx-target": "this",
"hx-swap": "innerHTML",


@ -0,0 +1,152 @@
package localai
import (
"encoding/json"
"fmt"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/core/http/utils"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services"
"github.com/rs/zerolog/log"
)
type BackendEndpointService struct {
galleries []config.Gallery
backendPath string
backendApplier *services.GalleryService
}
type GalleryBackend struct {
ID string `json:"id"`
}
func CreateBackendEndpointService(galleries []config.Gallery, backendPath string, backendApplier *services.GalleryService) BackendEndpointService {
return BackendEndpointService{
galleries: galleries,
backendPath: backendPath,
backendApplier: backendApplier,
}
}
// GetOpStatusEndpoint returns the job status
// @Summary Returns the job status
// @Success 200 {object} services.BackendOpStatus "Response"
// @Router /backends/jobs/{uuid} [get]
func (mgs *BackendEndpointService) GetOpStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
status := mgs.backendApplier.GetStatus(c.Params("uuid"))
if status == nil {
return fmt.Errorf("could not find any status for ID %q", c.Params("uuid"))
}
return c.JSON(status)
}
}
// GetAllStatusEndpoint returns the progress status of all jobs
// @Summary Returns the progress status of all jobs
// @Success 200 {object} map[string]services.BackendOpStatus "Response"
// @Router /backends/jobs [get]
func (mgs *BackendEndpointService) GetAllStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
return c.JSON(mgs.backendApplier.GetAllStatus())
}
}
// ApplyBackendEndpoint installs a new backend on a LocalAI instance
// @Summary Install backends to LocalAI.
// @Param request body GalleryBackend true "query params"
// @Success 200 {object} schema.BackendResponse "Response"
// @Router /backends/apply [post]
func (mgs *BackendEndpointService) ApplyBackendEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(GalleryBackend)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return err
}
uuid, err := uuid.NewUUID()
if err != nil {
return err
}
mgs.backendApplier.BackendGalleryChannel <- services.GalleryOp[gallery.GalleryBackend]{
ID: uuid.String(),
GalleryElementName: input.ID,
Galleries: mgs.galleries,
}
return c.JSON(schema.BackendResponse{ID: uuid.String(), StatusURL: fmt.Sprintf("%sbackends/jobs/%s", utils.BaseURL(c), uuid.String())})
}
}
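The apply endpoint is asynchronous: it queues a GalleryOp on the backend channel and returns immediately with a job ID plus a status URL to poll. A self-contained client sketch, assuming a LocalAI instance on localhost:8080 and an illustrative backend ID; the JSON field names of the response struct are assumptions, check schema.BackendResponse for the actual wire names:

	package main

	import (
		"bytes"
		"encoding/json"
		"fmt"
		"net/http"
	)

	// assumed wire shape of schema.BackendResponse
	type backendJob struct {
		ID        string `json:"id"`
		StatusURL string `json:"status_url"`
	}

	func main() {
		// queue the installation; the handler replies before the job finishes
		body := bytes.NewBufferString(`{"id": "localai@vllm"}`)
		resp, err := http.Post("http://localhost:8080/backends/apply", "application/json", body)
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()
		var job backendJob
		if err := json.NewDecoder(resp.Body).Decode(&job); err != nil {
			panic(err)
		}
		// progress can then be polled at /backends/jobs/<uuid>
		fmt.Println("poll:", job.StatusURL)
	}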
// DeleteBackendEndpoint deletes backends from a LocalAI instance
// @Summary Delete backends from LocalAI.
// @Param name path string true "Backend name"
// @Success 200 {object} schema.BackendResponse "Response"
// @Router /backends/delete/{name} [post]
func (mgs *BackendEndpointService) DeleteBackendEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
backendName := c.Params("name")
mgs.backendApplier.BackendGalleryChannel <- services.GalleryOp[gallery.GalleryBackend]{
Delete: true,
GalleryElementName: backendName,
Galleries: mgs.galleries,
}
uuid, err := uuid.NewUUID()
if err != nil {
return err
}
return c.JSON(schema.BackendResponse{ID: uuid.String(), StatusURL: fmt.Sprintf("%sbackends/jobs/%s", utils.BaseURL(c), uuid.String())})
}
}
// ListBackendsEndpoint lists the backends installed in LocalAI
// @Summary List all Backends
// @Success 200 {object} []gallery.GalleryBackend "Response"
// @Router /backends [get]
func (mgs *BackendEndpointService) ListBackendsEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
backends, err := gallery.ListSystemBackends(mgs.backendPath)
if err != nil {
return err
}
return c.JSON(backends)
}
}
// ListBackendGalleriesEndpoint lists the backend galleries configured in LocalAI
// @Summary List all Galleries
// @Success 200 {object} []config.Gallery "Response"
// @Router /backends/galleries [get]
// NOTE: This is different (and much simpler!) than above! This JUST lists the backend galleries that have been loaded, not their contents!
func (mgs *BackendEndpointService) ListBackendGalleriesEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
log.Debug().Msgf("Listing backend galleries %+v", mgs.galleries)
dat, err := json.Marshal(mgs.galleries)
if err != nil {
return err
}
return c.Send(dat)
}
}
// ListAvailableBackendsEndpoint lists the available backends in the galleries configured in LocalAI
// @Summary List all available Backends
// @Success 200 {object} []gallery.GalleryBackend "Response"
// @Router /backends/available [get]
func (mgs *BackendEndpointService) ListAvailableBackendsEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
backends, err := gallery.AvailableBackends(mgs.galleries, mgs.backendPath)
if err != nil {
return err
}
return c.JSON(backends)
}
}
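Note the split between the two listing routes: /backends reflects what ListSystemBackends finds installed under the configured backends path, while /backends/available aggregates the gallery indexes. A quick client-side sketch (base URL assumed, error handling elided):

	// installed on this instance (scanned from the backends path)
	installed, _ := http.Get("http://localhost:8080/backends")
	// installable from the configured galleries
	available, _ := http.Get("http://localhost:8080/backends/available")
	fmt.Println(installed.Status, available.Status)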


@ -3,7 +3,6 @@ package localai
import (
"encoding/json"
"fmt"
"slices"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
@ -22,8 +21,7 @@ type ModelGalleryEndpointService struct {
}
type GalleryModel struct {
ID string `json:"id"`
ConfigURL string `json:"config_url"`
ID string `json:"id"`
gallery.GalleryModel
}
@ -37,7 +35,7 @@ func CreateModelGalleryEndpointService(galleries []config.Gallery, modelPath str
// GetOpStatusEndpoint returns the job status
// @Summary Returns the job status
// @Success 200 {object} gallery.GalleryOpStatus "Response"
// @Success 200 {object} services.GalleryOpStatus "Response"
// @Router /models/jobs/{uuid} [get]
func (mgs *ModelGalleryEndpointService) GetOpStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
@ -51,7 +49,7 @@ func (mgs *ModelGalleryEndpointService) GetOpStatusEndpoint() func(c *fiber.Ctx)
// GetAllStatusEndpoint returns all the jobs status progress
// @Summary Returns all the jobs status progress
// @Success 200 {object} map[string]gallery.GalleryOpStatus "Response"
// @Success 200 {object} map[string]services.GalleryOpStatus "Response"
// @Router /models/jobs [get]
func (mgs *ModelGalleryEndpointService) GetAllStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
@ -76,12 +74,11 @@ func (mgs *ModelGalleryEndpointService) ApplyModelGalleryEndpoint() func(c *fibe
if err != nil {
return err
}
mgs.galleryApplier.C <- gallery.GalleryOp{
Req: input.GalleryModel,
Id: uuid.String(),
GalleryModelName: input.ID,
Galleries: mgs.galleries,
ConfigURL: input.ConfigURL,
mgs.galleryApplier.ModelGalleryChannel <- services.GalleryOp[gallery.GalleryModel]{
Req: input.GalleryModel,
ID: uuid.String(),
GalleryElementName: input.ID,
Galleries: mgs.galleries,
}
return c.JSON(schema.GalleryResponse{ID: uuid.String(), StatusURL: fmt.Sprintf("%smodels/jobs/%s", utils.BaseURL(c), uuid.String())})
@ -97,9 +94,9 @@ func (mgs *ModelGalleryEndpointService) DeleteModelGalleryEndpoint() func(c *fib
return func(c *fiber.Ctx) error {
modelName := c.Params("name")
mgs.galleryApplier.C <- gallery.GalleryOp{
Delete: true,
GalleryModelName: modelName,
mgs.galleryApplier.ModelGalleryChannel <- services.GalleryOp[gallery.GalleryModel]{
Delete: true,
GalleryElementName: modelName,
}
uuid, err := uuid.NewUUID()
@ -157,58 +154,3 @@ func (mgs *ModelGalleryEndpointService) ListModelGalleriesEndpoint() func(c *fib
return c.Send(dat)
}
}
// AddModelGalleryEndpoint adds a gallery in LocalAI
// @Summary Adds a gallery in LocalAI
// @Param request body config.Gallery true "Gallery details"
// @Success 200 {object} []config.Gallery "Response"
// @Router /models/galleries [post]
func (mgs *ModelGalleryEndpointService) AddModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(config.Gallery)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return err
}
if slices.ContainsFunc(mgs.galleries, func(gallery config.Gallery) bool {
return gallery.Name == input.Name
}) {
return fmt.Errorf("%s already exists", input.Name)
}
dat, err := json.Marshal(mgs.galleries)
if err != nil {
return err
}
log.Debug().Msgf("Adding %+v to gallery list", *input)
mgs.galleries = append(mgs.galleries, *input)
return c.Send(dat)
}
}
// RemoveModelGalleryEndpoint remove a gallery in LocalAI
// @Summary removes a gallery from LocalAI
// @Param request body config.Gallery true "Gallery details"
// @Success 200 {object} []config.Gallery "Response"
// @Router /models/galleries [delete]
func (mgs *ModelGalleryEndpointService) RemoveModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(config.Gallery)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return err
}
if !slices.ContainsFunc(mgs.galleries, func(gallery config.Gallery) bool {
return gallery.Name == input.Name
}) {
return fmt.Errorf("%s is not currently registered", input.Name)
}
mgs.galleries = slices.DeleteFunc(mgs.galleries, func(gallery config.Gallery) bool {
return gallery.Name == input.Name
})
dat, err := json.Marshal(mgs.galleries)
if err != nil {
return err
}
return c.Send(dat)
}
}


@ -21,6 +21,9 @@ func SystemInformations(ml *model.ModelLoader, appConfig *config.ApplicationConf
for b := range appConfig.ExternalGRPCBackends {
availableBackends = append(availableBackends, b)
}
for b := range ml.GetAllExternalBackends(nil) {
availableBackends = append(availableBackends, b)
}
sysmodels := []schema.SysInfoModel{}
for _, m := range loadedModels {


@ -12,10 +12,10 @@ import (
)
func WelcomeEndpoint(appConfig *config.ApplicationConfig,
cl *config.BackendConfigLoader, ml *model.ModelLoader, modelStatus func() (map[string]string, map[string]string)) func(*fiber.Ctx) error {
cl *config.BackendConfigLoader, ml *model.ModelLoader, opcache *services.OpCache) func(*fiber.Ctx) error {
return func(c *fiber.Ctx) error {
backendConfigs := cl.GetAllBackendConfigs()
galleryConfigs := map[string]*gallery.Config{}
galleryConfigs := map[string]*gallery.ModelConfig{}
for _, m := range backendConfigs {
cfg, err := gallery.GetLocalModelConfiguration(ml.ModelPath, m.Name)
@ -28,7 +28,7 @@ func WelcomeEndpoint(appConfig *config.ApplicationConfig,
modelsWithoutConfig, _ := services.ListModels(cl, ml, config.NoFilterFn, services.LOOSE_ONLY)
// Get model statuses to display in the UI the operation in progress
processingModels, taskTypes := modelStatus()
processingModels, taskTypes := opcache.GetStatus()
summary := fiber.Map{
"Title": "LocalAI API - " + internal.PrintableVersion(),


@ -30,10 +30,16 @@ func RegisterLocalAIRoutes(router *fiber.App,
router.Get("/models/available", modelGalleryEndpointService.ListModelFromGalleryEndpoint())
router.Get("/models/galleries", modelGalleryEndpointService.ListModelGalleriesEndpoint())
router.Post("/models/galleries", modelGalleryEndpointService.AddModelGalleryEndpoint())
router.Delete("/models/galleries", modelGalleryEndpointService.RemoveModelGalleryEndpoint())
router.Get("/models/jobs/:uuid", modelGalleryEndpointService.GetOpStatusEndpoint())
router.Get("/models/jobs", modelGalleryEndpointService.GetAllStatusEndpoint())
backendGalleryEndpointService := localai.CreateBackendEndpointService(appConfig.BackendGalleries, appConfig.BackendsPath, galleryService)
router.Post("/backends/apply", backendGalleryEndpointService.ApplyBackendEndpoint())
router.Post("/backends/delete/:name", backendGalleryEndpointService.DeleteBackendEndpoint())
router.Get("/backends", backendGalleryEndpointService.ListBackendsEndpoint())
router.Get("/backends/available", backendGalleryEndpointService.ListAvailableBackendsEndpoint())
router.Get("/backends/galleries", backendGalleryEndpointService.ListBackendGalleriesEndpoint())
router.Get("/backends/jobs/:uuid", backendGalleryEndpointService.GetOpStatusEndpoint())
}
router.Post("/tts",


@ -1,13 +1,6 @@
package routes
import (
"fmt"
"html/template"
"math"
"sort"
"strconv"
"strings"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/core/http/elements"
@ -17,78 +10,20 @@ import (
"github.com/mudler/LocalAI/core/services"
"github.com/mudler/LocalAI/internal"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/LocalAI/pkg/xsync"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/microcosm-cc/bluemonday"
"github.com/rs/zerolog/log"
)
type modelOpCache struct {
status *xsync.SyncedMap[string, string]
}
func NewModelOpCache() *modelOpCache {
return &modelOpCache{
status: xsync.NewSyncedMap[string, string](),
}
}
func (m *modelOpCache) Set(key string, value string) {
m.status.Set(key, value)
}
func (m *modelOpCache) Get(key string) string {
return m.status.Get(key)
}
func (m *modelOpCache) DeleteUUID(uuid string) {
for _, k := range m.status.Keys() {
if m.status.Get(k) == uuid {
m.status.Delete(k)
}
}
}
func (m *modelOpCache) Map() map[string]string {
return m.status.Map()
}
func (m *modelOpCache) Exists(key string) bool {
return m.status.Exists(key)
}
func RegisterUIRoutes(app *fiber.App,
cl *config.BackendConfigLoader,
ml *model.ModelLoader,
appConfig *config.ApplicationConfig,
galleryService *services.GalleryService) {
// keeps the state of models that are being installed from the UI
var processingModels = NewModelOpCache()
// keeps the state of ops that are started from the UI
var processingOps = services.NewOpCache(galleryService)
// modelStatus returns the current status of the models being processed (installation or deletion)
// it is called asynchronously from the UI
modelStatus := func() (map[string]string, map[string]string) {
processingModelsData := processingModels.Map()
taskTypes := map[string]string{}
for k, v := range processingModelsData {
status := galleryService.GetStatus(v)
taskTypes[k] = "Installation"
if status != nil && status.Deletion {
taskTypes[k] = "Deletion"
} else if status == nil {
taskTypes[k] = "Waiting"
}
}
return processingModelsData, taskTypes
}
app.Get("/", localai.WelcomeEndpoint(appConfig, cl, ml, modelStatus))
app.Get("/", localai.WelcomeEndpoint(appConfig, cl, ml, processingOps))
if p2p.IsP2PEnabled() {
app.Get("/p2p", func(c *fiber.Ctx) error {
@ -124,262 +59,8 @@ func RegisterUIRoutes(app *fiber.App,
}
if !appConfig.DisableGalleryEndpoint {
// Show the Models page (all models)
app.Get("/browse", func(c *fiber.Ctx) error {
term := c.Query("term")
page := c.Query("page")
items := c.Query("items")
models, err := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath)
if err != nil {
log.Error().Err(err).Msg("could not list models from galleries")
return c.Status(fiber.StatusInternalServerError).Render("views/error", fiber.Map{
"Title": "LocalAI - Models",
"BaseURL": utils.BaseURL(c),
"Version": internal.PrintableVersion(),
"ErrorCode": "500",
"ErrorMessage": err.Error(),
})
}
// Get all available tags
allTags := map[string]struct{}{}
tags := []string{}
for _, m := range models {
for _, t := range m.Tags {
allTags[t] = struct{}{}
}
}
for t := range allTags {
tags = append(tags, t)
}
sort.Strings(tags)
if term != "" {
models = gallery.GalleryModels(models).Search(term)
}
// Get model statuses
processingModelsData, taskTypes := modelStatus()
summary := fiber.Map{
"Title": "LocalAI - Models",
"BaseURL": utils.BaseURL(c),
"Version": internal.PrintableVersion(),
"Models": template.HTML(elements.ListModels(models, processingModels, galleryService)),
"Repositories": appConfig.Galleries,
"AllTags": tags,
"ProcessingModels": processingModelsData,
"AvailableModels": len(models),
"IsP2PEnabled": p2p.IsP2PEnabled(),
"TaskTypes": taskTypes,
// "ApplicationConfig": appConfig,
}
if page == "" {
page = "1"
}
if page != "" {
// return a subset of the models
pageNum, err := strconv.Atoi(page)
if err != nil {
return c.Status(fiber.StatusBadRequest).SendString("Invalid page number")
}
if pageNum == 0 {
return c.Render("views/models", summary)
}
itemsNum, err := strconv.Atoi(items)
if err != nil {
itemsNum = 21
}
totalPages := int(math.Ceil(float64(len(models)) / float64(itemsNum)))
models = models.Paginate(pageNum, itemsNum)
prevPage := pageNum - 1
nextPage := pageNum + 1
if prevPage < 1 {
prevPage = 1
}
if nextPage > totalPages {
nextPage = totalPages
}
if prevPage != pageNum {
summary["PrevPage"] = prevPage
}
summary["NextPage"] = nextPage
summary["TotalPages"] = totalPages
summary["CurrentPage"] = pageNum
summary["Models"] = template.HTML(elements.ListModels(models, processingModels, galleryService))
}
// Render index
return c.Render("views/models", summary)
})
// Show the models, filtered from the user input
// https://htmx.org/examples/active-search/
app.Post("/browse/search/models", func(c *fiber.Ctx) error {
page := c.Query("page")
items := c.Query("items")
form := struct {
Search string `form:"search"`
}{}
if err := c.BodyParser(&form); err != nil {
return c.Status(fiber.StatusBadRequest).SendString(bluemonday.StrictPolicy().Sanitize(err.Error()))
}
models, _ := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath)
if page != "" {
// return a subset of the models
pageNum, err := strconv.Atoi(page)
if err != nil {
return c.Status(fiber.StatusBadRequest).SendString("Invalid page number")
}
itemsNum, err := strconv.Atoi(items)
if err != nil {
itemsNum = 21
}
models = models.Paginate(pageNum, itemsNum)
}
if form.Search != "" {
models = models.Search(form.Search)
}
return c.SendString(elements.ListModels(models, processingModels, galleryService))
})
/*
Install routes
*/
// This route is used when the "Install" button is pressed, we submit here a new job to the gallery service
// https://htmx.org/examples/progress-bar/
app.Post("/browse/install/model/:id", func(c *fiber.Ctx) error {
galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests!
log.Debug().Msgf("UI job submitted to install : %+v\n", galleryID)
id, err := uuid.NewUUID()
if err != nil {
return err
}
uid := id.String()
processingModels.Set(galleryID, uid)
op := gallery.GalleryOp{
Id: uid,
GalleryModelName: galleryID,
Galleries: appConfig.Galleries,
}
go func() {
galleryService.C <- op
}()
return c.SendString(elements.StartProgressBar(uid, "0", "Installation"))
})
// This route is used when the "Install" button is pressed, we submit here a new job to the gallery service
// https://htmx.org/examples/progress-bar/
app.Post("/browse/delete/model/:id", func(c *fiber.Ctx) error {
galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests!
log.Debug().Msgf("UI job submitted to delete : %+v\n", galleryID)
var galleryName = galleryID
if strings.Contains(galleryID, "@") {
// if the galleryID contains a @ it means that it's a model from a gallery
// but we want to delete it from the local models which does not need
// a repository ID
galleryName = strings.Split(galleryID, "@")[1]
}
id, err := uuid.NewUUID()
if err != nil {
return err
}
uid := id.String()
// Track the deletion job by galleryID and galleryName
// The GalleryID contains information about the repository,
// while the GalleryName is ONLY the name of the model
processingModels.Set(galleryName, uid)
processingModels.Set(galleryID, uid)
op := gallery.GalleryOp{
Id: uid,
Delete: true,
GalleryModelName: galleryName,
}
go func() {
galleryService.C <- op
cl.RemoveBackendConfig(galleryName)
}()
return c.SendString(elements.StartProgressBar(uid, "0", "Deletion"))
})
// Display the job current progress status
// If the job is done, we trigger the /browse/job/:uid route
// https://htmx.org/examples/progress-bar/
app.Get("/browse/job/progress/:uid", func(c *fiber.Ctx) error {
jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests!
status := galleryService.GetStatus(jobUID)
if status == nil {
//fmt.Errorf("could not find any status for ID")
return c.SendString(elements.ProgressBar("0"))
}
if status.Progress == 100 {
c.Set("HX-Trigger", "done") // this triggers /browse/job/:uid (which is when the job is done)
return c.SendString(elements.ProgressBar("100"))
}
if status.Error != nil {
// TODO: instead of deleting the job, we should keep it in the cache and make it dismissable by the user
processingModels.DeleteUUID(jobUID)
return c.SendString(elements.ErrorProgress(status.Error.Error(), status.GalleryModelName))
}
return c.SendString(elements.ProgressBar(fmt.Sprint(status.Progress)))
})
// this route is hit when the job is done, and we display the
// final state (for now just displays "Installation completed")
app.Get("/browse/job/:uid", func(c *fiber.Ctx) error {
jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests!
status := galleryService.GetStatus(jobUID)
galleryID := ""
processingModels.DeleteUUID(jobUID)
if galleryID == "" {
log.Debug().Msgf("no processing model found for job : %+v\n", jobUID)
}
log.Debug().Msgf("JOB finished : %+v\n", status)
showDelete := true
displayText := "Installation completed"
if status.Deletion {
showDelete = false
displayText = "Deletion completed"
}
return c.SendString(elements.DoneProgress(galleryID, displayText, showDelete))
})
registerGalleryRoutes(app, cl, appConfig, galleryService, processingOps)
registerBackendGalleryRoutes(app, appConfig, galleryService, processingOps)
}
app.Get("/talk/", func(c *fiber.Ctx) error {
@ -412,7 +93,7 @@ func RegisterUIRoutes(app *fiber.App,
return c.Redirect(utils.BaseURL(c))
}
modelThatCanBeUsed := ""
galleryConfigs := map[string]*gallery.Config{}
galleryConfigs := map[string]*gallery.ModelConfig{}
for _, m := range backendConfigs {
cfg, err := gallery.GetLocalModelConfiguration(ml.ModelPath, m.Name)
@ -452,7 +133,7 @@ func RegisterUIRoutes(app *fiber.App,
backendConfigs := cl.GetAllBackendConfigs()
modelsWithoutConfig, _ := services.ListModels(cl, ml, config.NoFilterFn, services.LOOSE_ONLY)
galleryConfigs := map[string]*gallery.Config{}
galleryConfigs := map[string]*gallery.ModelConfig{}
for _, m := range backendConfigs {
cfg, err := gallery.GetLocalModelConfiguration(ml.ModelPath, m.Name)


@ -0,0 +1,258 @@
package routes
import (
"fmt"
"html/template"
"math"
"sort"
"strconv"
"strings"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/microcosm-cc/bluemonday"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/core/http/elements"
"github.com/mudler/LocalAI/core/http/utils"
"github.com/mudler/LocalAI/core/services"
"github.com/mudler/LocalAI/internal"
"github.com/rs/zerolog/log"
)
func registerBackendGalleryRoutes(app *fiber.App, appConfig *config.ApplicationConfig, galleryService *services.GalleryService, opcache *services.OpCache) {
// Show the Backends page (all backends)
app.Get("/browse/backends", func(c *fiber.Ctx) error {
term := c.Query("term")
page := c.Query("page")
items := c.Query("items")
backends, err := gallery.AvailableBackends(appConfig.BackendGalleries, appConfig.BackendsPath)
if err != nil {
log.Error().Err(err).Msg("could not list backends from galleries")
return c.Status(fiber.StatusInternalServerError).Render("views/error", fiber.Map{
"Title": "LocalAI - Backends",
"BaseURL": utils.BaseURL(c),
"Version": internal.PrintableVersion(),
"ErrorCode": "500",
"ErrorMessage": err.Error(),
})
}
// Get all available tags
allTags := map[string]struct{}{}
tags := []string{}
for _, b := range backends {
for _, t := range b.Tags {
allTags[t] = struct{}{}
}
}
for t := range allTags {
tags = append(tags, t)
}
sort.Strings(tags)
if term != "" {
backends = gallery.GalleryElements[*gallery.GalleryBackend](backends).Search(term)
}
// Get backend statuses
processingBackendsData, taskTypes := opcache.GetStatus()
summary := fiber.Map{
"Title": "LocalAI - Backends",
"BaseURL": utils.BaseURL(c),
"Version": internal.PrintableVersion(),
"Backends": template.HTML(elements.ListBackends(backends, opcache, galleryService)),
"Repositories": appConfig.BackendGalleries,
"AllTags": tags,
"ProcessingBackends": processingBackendsData,
"AvailableBackends": len(backends),
"TaskTypes": taskTypes,
}
if page == "" {
page = "1"
}
if page != "" {
// return a subset of the backends
pageNum, err := strconv.Atoi(page)
if err != nil {
return c.Status(fiber.StatusBadRequest).SendString("Invalid page number")
}
if pageNum == 0 {
return c.Render("views/backends", summary)
}
itemsNum, err := strconv.Atoi(items)
if err != nil {
itemsNum = 21
}
totalPages := int(math.Ceil(float64(len(backends)) / float64(itemsNum)))
backends = backends.Paginate(pageNum, itemsNum)
prevPage := pageNum - 1
nextPage := pageNum + 1
if prevPage < 1 {
prevPage = 1
}
if nextPage > totalPages {
nextPage = totalPages
}
if prevPage != pageNum {
summary["PrevPage"] = prevPage
}
summary["NextPage"] = nextPage
summary["TotalPages"] = totalPages
summary["CurrentPage"] = pageNum
summary["Backends"] = template.HTML(elements.ListBackends(backends, opcache, galleryService))
}
// Render index
return c.Render("views/backends", summary)
})
// Show the backends, filtered from the user input
app.Post("/browse/search/backends", func(c *fiber.Ctx) error {
page := c.Query("page")
items := c.Query("items")
form := struct {
Search string `form:"search"`
}{}
if err := c.BodyParser(&form); err != nil {
return c.Status(fiber.StatusBadRequest).SendString(bluemonday.StrictPolicy().Sanitize(err.Error()))
}
backends, _ := gallery.AvailableBackends(appConfig.BackendGalleries, appConfig.BackendsPath)
if page != "" {
// return a subset of the backends
pageNum, err := strconv.Atoi(page)
if err != nil {
return c.Status(fiber.StatusBadRequest).SendString("Invalid page number")
}
itemsNum, err := strconv.Atoi(items)
if err != nil {
itemsNum = 21
}
backends = backends.Paginate(pageNum, itemsNum)
}
if form.Search != "" {
backends = backends.Search(form.Search)
}
return c.SendString(elements.ListBackends(backends, opcache, galleryService))
})
// Install backend route
app.Post("/browse/install/backend/:id", func(c *fiber.Ctx) error {
backendID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests!
log.Debug().Msgf("UI job submitted to install backend: %+v\n", backendID)
id, err := uuid.NewUUID()
if err != nil {
return err
}
uid := id.String()
opcache.Set(backendID, uid)
op := services.GalleryOp[gallery.GalleryBackend]{
ID: uid,
GalleryElementName: backendID,
Galleries: appConfig.BackendGalleries,
}
go func() {
galleryService.BackendGalleryChannel <- op
}()
return c.SendString(elements.StartBackendProgressBar(uid, "0", "Backend Installation"))
})
// Delete backend route
app.Post("/browse/delete/backend/:id", func(c *fiber.Ctx) error {
backendID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests!
log.Debug().Msgf("UI job submitted to delete backend: %+v\n", backendID)
var backendName = backendID
if strings.Contains(backendID, "@") {
// TODO: this is an ugly workaround - we should handle this consistently across the codebase
backendName = strings.Split(backendID, "@")[1]
}
id, err := uuid.NewUUID()
if err != nil {
return err
}
uid := id.String()
opcache.Set(backendName, uid)
opcache.Set(backendID, uid)
op := services.GalleryOp[gallery.GalleryBackend]{
ID: uid,
Delete: true,
GalleryElementName: backendName,
Galleries: appConfig.BackendGalleries,
}
go func() {
galleryService.BackendGalleryChannel <- op
}()
return c.SendString(elements.StartBackendProgressBar(uid, "0", "Backend Deletion"))
})
// Display the job current progress status
app.Get("/browse/backend/job/progress/:uid", func(c *fiber.Ctx) error {
jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests!
status := galleryService.GetStatus(jobUID)
if status == nil {
return c.SendString(elements.ProgressBar("0"))
}
if status.Progress == 100 {
c.Set("HX-Trigger", "done") // this triggers /browse/backend/job/:uid
return c.SendString(elements.ProgressBar("100"))
}
if status.Error != nil {
opcache.DeleteUUID(jobUID)
return c.SendString(elements.ErrorProgress(status.Error.Error(), status.GalleryElementName))
}
return c.SendString(elements.ProgressBar(fmt.Sprint(status.Progress)))
})
// Job completion route
app.Get("/browse/backend/job/:uid", func(c *fiber.Ctx) error {
jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests!
status := galleryService.GetStatus(jobUID)
backendID := status.GalleryElementName
opcache.DeleteUUID(jobUID)
if backendID == "" {
log.Debug().Msgf("no processing backend found for job: %+v\n", jobUID)
}
log.Debug().Msgf("JOB finished: %+v\n", status)
showDelete := true
displayText := "Backend Installation completed"
if status.Deletion {
showDelete = false
displayText = "Backend Deletion completed"
}
return c.SendString(elements.DoneBackendProgress(backendID, displayText, showDelete))
})
}
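End to end, the backend pages follow the HTMX progress-bar pattern: the install POST returns a bar, the bar polls the progress route every 600ms, and the done trigger swaps in the final state. A hedged smoke-test sketch using Fiber's Test helper (imports from testing, net/http/httptest, io, and strings assumed); newTestApp is a hypothetical helper that wires registerBackendGalleryRoutes onto a fiber.App, and the backend ID is illustrative:

	func TestBackendInstallRoute(t *testing.T) {
		app := newTestApp(t) // hypothetical: builds the app with the routes above

		req := httptest.NewRequest("POST", "/browse/install/backend/localai@vllm", nil)
		res, err := app.Test(req)
		if err != nil {
			t.Fatal(err)
		}
		body, _ := io.ReadAll(res.Body)
		// the returned fragment is the HTMX bar that polls the backend job route
		if !strings.Contains(string(body), "browse/backend/job/progress/") {
			t.Fatalf("expected progress-bar markup, got: %s", body)
		}
	}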


@ -0,0 +1,282 @@
package routes
import (
"fmt"
"html/template"
"math"
"sort"
"strconv"
"strings"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/microcosm-cc/bluemonday"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/core/http/elements"
"github.com/mudler/LocalAI/core/http/utils"
"github.com/mudler/LocalAI/core/p2p"
"github.com/mudler/LocalAI/core/services"
"github.com/mudler/LocalAI/internal"
"github.com/rs/zerolog/log"
)
func registerGalleryRoutes(app *fiber.App, cl *config.BackendConfigLoader, appConfig *config.ApplicationConfig, galleryService *services.GalleryService, opcache *services.OpCache) {
// Show the Models page (all models)
app.Get("/browse", func(c *fiber.Ctx) error {
term := c.Query("term")
page := c.Query("page")
items := c.Query("items")
models, err := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath)
if err != nil {
log.Error().Err(err).Msg("could not list models from galleries")
return c.Status(fiber.StatusInternalServerError).Render("views/error", fiber.Map{
"Title": "LocalAI - Models",
"BaseURL": utils.BaseURL(c),
"Version": internal.PrintableVersion(),
"ErrorCode": "500",
"ErrorMessage": err.Error(),
})
}
// Get all available tags
allTags := map[string]struct{}{}
tags := []string{}
for _, m := range models {
for _, t := range m.Tags {
allTags[t] = struct{}{}
}
}
for t := range allTags {
tags = append(tags, t)
}
sort.Strings(tags)
if term != "" {
models = gallery.GalleryElements[*gallery.GalleryModel](models).Search(term)
}
// Get model statuses
processingModelsData, taskTypes := opcache.GetStatus()
summary := fiber.Map{
"Title": "LocalAI - Models",
"BaseURL": utils.BaseURL(c),
"Version": internal.PrintableVersion(),
"Models": template.HTML(elements.ListModels(models, opcache, galleryService)),
"Repositories": appConfig.Galleries,
"AllTags": tags,
"ProcessingModels": processingModelsData,
"AvailableModels": len(models),
"IsP2PEnabled": p2p.IsP2PEnabled(),
"TaskTypes": taskTypes,
// "ApplicationConfig": appConfig,
}
if page == "" {
page = "1"
}
if page != "" {
// return a subset of the models
pageNum, err := strconv.Atoi(page)
if err != nil {
return c.Status(fiber.StatusBadRequest).SendString("Invalid page number")
}
if pageNum == 0 {
return c.Render("views/models", summary)
}
itemsNum, err := strconv.Atoi(items)
if err != nil {
itemsNum = 21
}
totalPages := int(math.Ceil(float64(len(models)) / float64(itemsNum)))
models = models.Paginate(pageNum, itemsNum)
prevPage := pageNum - 1
nextPage := pageNum + 1
if prevPage < 1 {
prevPage = 1
}
if nextPage > totalPages {
nextPage = totalPages
}
if prevPage != pageNum {
summary["PrevPage"] = prevPage
}
summary["NextPage"] = nextPage
summary["TotalPages"] = totalPages
summary["CurrentPage"] = pageNum
summary["Models"] = template.HTML(elements.ListModels(models, opcache, galleryService))
}
// Render index
return c.Render("views/models", summary)
})
// Show the models, filtered from the user input
// https://htmx.org/examples/active-search/
app.Post("/browse/search/models", func(c *fiber.Ctx) error {
page := c.Query("page")
items := c.Query("items")
form := struct {
Search string `form:"search"`
}{}
if err := c.BodyParser(&form); err != nil {
return c.Status(fiber.StatusBadRequest).SendString(bluemonday.StrictPolicy().Sanitize(err.Error()))
}
models, _ := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath)
if page != "" {
// return a subset of the models
pageNum, err := strconv.Atoi(page)
if err != nil {
return c.Status(fiber.StatusBadRequest).SendString("Invalid page number")
}
itemsNum, err := strconv.Atoi(items)
if err != nil {
itemsNum = 21
}
models = models.Paginate(pageNum, itemsNum)
}
if form.Search != "" {
models = models.Search(form.Search)
}
return c.SendString(elements.ListModels(models, opcache, galleryService))
})
/*
Install routes
*/
// This route is used when the "Install" button is pressed: we submit a new job to the gallery service
// https://htmx.org/examples/progress-bar/
app.Post("/browse/install/model/:id", func(c *fiber.Ctx) error {
galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests!
log.Debug().Msgf("UI job submitted to install : %+v\n", galleryID)
id, err := uuid.NewUUID()
if err != nil {
return err
}
uid := id.String()
opcache.Set(galleryID, uid)
op := services.GalleryOp[gallery.GalleryModel]{
ID: uid,
GalleryElementName: galleryID,
Galleries: appConfig.Galleries,
}
go func() {
galleryService.ModelGalleryChannel <- op
}()
return c.SendString(elements.StartModelProgressBar(uid, "0", "Installation"))
})
// This route is used when the "Delete" button is pressed: we submit a new job to the gallery service
// https://htmx.org/examples/progress-bar/
app.Post("/browse/delete/model/:id", func(c *fiber.Ctx) error {
galleryID := strings.Clone(c.Params("id")) // note: strings.Clone is required for multiple requests!
log.Debug().Msgf("UI job submitted to delete : %+v\n", galleryID)
var galleryName = galleryID
if strings.Contains(galleryID, "@") {
// if the galleryID contains a "@", the model comes from a gallery;
// deletion targets the locally installed model, which does not need
// a repository prefix
galleryName = strings.Split(galleryID, "@")[1]
}
id, err := uuid.NewUUID()
if err != nil {
return err
}
uid := id.String()
// Track the deletion job by galleryID and galleryName
// The GalleryID contains information about the repository,
// while the GalleryName is ONLY the name of the model
opcache.Set(galleryName, uid)
opcache.Set(galleryID, uid)
op := services.GalleryOp[gallery.GalleryModel]{
ID: uid,
Delete: true,
GalleryElementName: galleryName,
Galleries: appConfig.Galleries,
}
go func() {
galleryService.ModelGalleryChannel <- op
cl.RemoveBackendConfig(galleryName)
}()
return c.SendString(elements.StartModelProgressBar(uid, "0", "Deletion"))
})
// Display the job current progress status
// If the job is done, we trigger the /browse/job/:uid route
// https://htmx.org/examples/progress-bar/
app.Get("/browse/job/progress/:uid", func(c *fiber.Ctx) error {
jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests!
status := galleryService.GetStatus(jobUID)
if status == nil {
// no status yet for this job; render an empty progress bar
return c.SendString(elements.ProgressBar("0"))
}
if status.Progress == 100 {
c.Set("HX-Trigger", "done") // this triggers /browse/job/:uid (which is when the job is done)
return c.SendString(elements.ProgressBar("100"))
}
if status.Error != nil {
// TODO: instead of deleting the job, we should keep it in the cache and make it dismissable by the user
opcache.DeleteUUID(jobUID)
return c.SendString(elements.ErrorProgress(status.Error.Error(), status.GalleryElementName))
}
return c.SendString(elements.ProgressBar(fmt.Sprint(status.Progress)))
})
// this route is hit when the job is done, and we display the
// final state (for now just displays "Installation completed")
app.Get("/browse/job/:uid", func(c *fiber.Ctx) error {
jobUID := strings.Clone(c.Params("uid")) // note: strings.Clone is required for multiple requests!
status := galleryService.GetStatus(jobUID)
galleryID := status.GalleryElementName
opcache.DeleteUUID(jobUID)
if galleryID == "" {
log.Debug().Msgf("no processing model found for job : %+v\n", jobUID)
}
log.Debug().Msgf("JOB finished : %+v\n", status)
showDelete := true
displayText := "Installation completed"
if status.Deletion {
showDelete = false
displayText = "Deletion completed"
}
return c.SendString(elements.DoneModelProgress(galleryID, displayText, showDelete))
})
}


@ -0,0 +1,148 @@
<!DOCTYPE html>
<html lang="en">
{{template "views/partials/head" .}}
<body class="bg-gradient-to-br from-gray-900 to-gray-950 text-gray-200">
<div class="flex flex-col min-h-screen">
{{template "views/partials/navbar" .}}
{{ $numBackendsPerPage := 21 }}
<div class="container mx-auto px-4 py-8 flex-grow">
<!-- Hero Header -->
<div class="bg-gradient-to-r from-indigo-900/30 to-purple-900/30 rounded-2xl shadow-xl p-6 mb-8">
<div class="max-w-4xl mx-auto text-center">
<h1 class="text-3xl md:text-4xl font-bold text-white mb-3">
<span class="bg-clip-text text-transparent bg-gradient-to-r from-indigo-400 to-purple-400">
Backend Management
</span>
</h1>
<p class="text-lg text-gray-300 mb-2">
<span class="font-semibold text-indigo-300">{{.AvailableBackends}}</span> backends available
<a href="https://localai.io/backends/" target="_blank" class="ml-2 text-blue-400 hover:text-blue-300 transition">
<i class="fas fa-circle-info"></i>
</a>
</p>
</div>
</div>
{{template "views/partials/inprogress" .}}
<!-- Search and Filter Section -->
<div class="bg-gray-800/70 rounded-xl p-6 mb-8 shadow-lg border border-gray-700/50">
<!-- Search Input -->
<div class="relative mb-6">
<div class="absolute inset-y-0 start-0 flex items-center ps-3 pointer-events-none">
<i class="fas fa-search text-gray-400"></i>
</div>
<input class="form-control block w-full pl-10 px-4 py-3 text-base font-normal text-gray-300 bg-gray-900/80 bg-clip-padding border border-gray-700/70 rounded-lg transition ease-in-out focus:text-gray-200 focus:bg-gray-900 focus:border-blue-500 focus:ring-1 focus:ring-blue-500/50 focus:outline-none"
type="search"
name="search"
placeholder="Search backends by name, description or type..."
hx-post="browse/search/backends"
hx-trigger="input changed delay:500ms, search"
hx-target="#search-results"
oninput="hidePagination()"
onchange="hidePagination()"
onsearch="hidePagination()"
hx-indicator=".htmx-indicator">
<span class="htmx-indicator absolute right-3 top-3">
<svg class="animate-spin h-5 w-5 text-blue-500" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4"></circle>
<path class="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
</svg>
</span>
</div>
<!-- Filter by Type -->
<div class="mb-4">
<h3 class="text-gray-200 font-medium mb-3">Filter by type:</h3>
<div class="flex flex-wrap gap-2">
<button hx-post="browse/search/backends"
class="inline-flex items-center rounded-full px-4 py-2 text-sm font-medium bg-indigo-900/60 text-indigo-200 border border-indigo-700/50 hover:bg-indigo-800 transition duration-200 ease-in-out"
hx-target="#search-results"
hx-vals='{"search": "llm"}'
onclick="hidePagination()"
hx-indicator=".htmx-indicator">
<i class="fas fa-brain mr-2"></i>LLM
</button>
<button hx-post="browse/search/backends"
class="inline-flex items-center rounded-full px-4 py-2 text-sm font-medium bg-purple-900/60 text-purple-200 border border-purple-700/50 hover:bg-purple-800 transition duration-200 ease-in-out"
hx-target="#search-results"
hx-vals='{"search": "diffusion"}'
onclick="hidePagination()"
hx-indicator=".htmx-indicator">
<i class="fas fa-image mr-2"></i>Diffusion
</button>
<button hx-post="browse/search/backends"
class="inline-flex items-center rounded-full px-4 py-2 text-sm font-medium bg-blue-900/60 text-blue-200 border border-blue-700/50 hover:bg-blue-800 transition duration-200 ease-in-out"
hx-target="#search-results"
hx-vals='{"search": "tts"}'
onclick="hidePagination()"
hx-indicator=".htmx-indicator">
<i class="fas fa-microphone mr-2"></i>TTS
</button>
<button hx-post="browse/search/backends"
class="inline-flex items-center rounded-full px-4 py-2 text-sm font-medium bg-green-900/60 text-green-200 border border-green-700/50 hover:bg-green-800 transition duration-200 ease-in-out"
hx-target="#search-results"
hx-vals='{"search": "whisper"}'
onclick="hidePagination()"
hx-indicator=".htmx-indicator">
<i class="fas fa-headphones mr-2"></i>Whisper
</button>
</div>
</div>
</div>
<!-- Results Section -->
<div id="search-results" class="transition-all duration-300">
{{.Backends}}
</div>
<!-- Pagination -->
{{ if gt .AvailableBackends $numBackendsPerPage }}
<div id="paginate" class="flex justify-center mt-8">
<div class="flex items-center gap-4">
{{ if .PrevPage }}
<button onclick="window.location.href='browse/backends?page={{.PrevPage}}'"
class="flex items-center justify-center h-10 w-10 bg-gray-800/80 text-gray-300 hover:bg-indigo-900/70 hover:text-white rounded-lg shadow transition duration-300 ease-in-out">
<i class="fas fa-chevron-left"></i>
</button>
{{ end }}
<div class="text-gray-400 text-sm">
Page <span class="text-white font-medium">{{.CurrentPage}}</span> of <span class="text-white font-medium">{{.TotalPages}}</span>
</div>
{{ if .NextPage }}
<button onclick="window.location.href='browse/backends?page={{.NextPage}}'"
class="flex items-center justify-center h-10 w-10 bg-gray-800/80 text-gray-300 hover:bg-indigo-900/70 hover:text-white rounded-lg shadow transition duration-300 ease-in-out">
<i class="fas fa-chevron-right"></i>
</button>
{{ end }}
</div>
</div>
{{ end }}
</div>
{{template "views/partials/footer" .}}
</div>
<script>
function hidePagination() {
const paginateDiv = document.getElementById('paginate');
if (paginateDiv) {
paginateDiv.style.display = 'none';
}
}
// Listen for the htmx:afterSwap event to handle cases when the search results are updated
document.body.addEventListener('htmx:afterSwap', function(event) {
if (event.detail.target.id === 'search-results') {
hidePagination();
}
});
</script>
</body>
</html>


@ -25,6 +25,9 @@
<a href="browse/" class="text-gray-300 hover:text-white px-3 py-2 rounded-lg transition duration-300 ease-in-out hover:bg-blue-900/30 flex items-center">
<i class="fas fa-brain text-blue-400 mr-2"></i>Models
</a>
<a href="browse/backends" class="text-gray-300 hover:text-white px-3 py-2 rounded-lg transition duration-300 ease-in-out hover:bg-blue-900/30 flex items-center">
<i class="fas fa-server text-blue-400 mr-2"></i>Backends
</a>
<a href="chat/" class="text-gray-300 hover:text-white px-3 py-2 rounded-lg transition duration-300 ease-in-out hover:bg-blue-900/30 flex items-center">
<i class="fa-solid fa-comments text-blue-400 mr-2"></i>Chat
</a>
@ -57,6 +60,9 @@
<a href="browse/" class="block text-gray-300 hover:text-white hover:bg-blue-900/30 px-3 py-2 rounded-lg transition duration-300 ease-in-out flex items-center">
<i class="fas fa-brain text-blue-400 mr-3 w-5 text-center"></i>Models
</a>
<a href="browse/backends" class="block text-gray-300 hover:text-white hover:bg-blue-900/30 px-3 py-2 rounded-lg transition duration-300 ease-in-out flex items-center">
<i class="fas fa-server text-blue-400 mr-3 w-5 text-center"></i>Backends
</a>
<a href="chat/" class="block text-gray-300 hover:text-white hover:bg-blue-900/30 px-3 py-2 rounded-lg transition duration-300 ease-in-out flex items-center">
<i class="fa-solid fa-comments text-blue-400 mr-3 w-5 text-center"></i>Chat
</a>

core/schema/backend.go Normal file

@ -0,0 +1,7 @@
package schema
// BackendResponse represents the response for backend operations
type BackendResponse struct {
ID string `json:"id"`
StatusURL string `json:"status_url"`
}

core/services/backends.go Normal file

@ -0,0 +1,44 @@
package services
import (
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/pkg/utils"
"github.com/rs/zerolog/log"
)
func (g *GalleryService) backendHandler(op *GalleryOp[gallery.GalleryBackend]) error {
utils.ResetDownloadTimers()
g.UpdateStatus(op.ID, &GalleryOpStatus{Message: "processing", Progress: 0})
// progressCallback reports download progress to the status map and the console
progressCallback := func(fileName string, current string, total string, percentage float64) {
g.UpdateStatus(op.ID, &GalleryOpStatus{Message: "processing", FileName: fileName, Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
utils.DisplayDownloadFunction(fileName, current, total, percentage)
}
var err error
if op.Delete {
err = gallery.DeleteBackendFromSystem(g.appConfig.BackendsPath, op.GalleryElementName)
g.modelLoader.DeleteExternalBackend(op.GalleryElementName)
} else {
log.Info().Msgf("installing backend %s", op.GalleryElementName)
err = gallery.InstallBackendFromGallery(g.appConfig.BackendGalleries, op.GalleryElementName, g.appConfig.BackendsPath, progressCallback)
if err == nil {
err = gallery.RegisterBackends(g.appConfig.BackendsPath, g.modelLoader)
}
}
if err != nil {
log.Error().Err(err).Msgf("error installing backend %s", op.GalleryElementName)
return err
}
g.UpdateStatus(op.ID,
&GalleryOpStatus{
Deletion: op.Delete,
Processed: true,
GalleryElementName: op.GalleryElementName,
Message: "completed",
Progress: 100})
return nil
}


@ -2,60 +2,48 @@ package services
import (
"context"
"encoding/json"
"fmt"
"os"
"path/filepath"
"sync"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/pkg/startup"
"github.com/mudler/LocalAI/pkg/utils"
"gopkg.in/yaml.v2"
"github.com/mudler/LocalAI/pkg/model"
)
type GalleryService struct {
appConfig *config.ApplicationConfig
sync.Mutex
C chan gallery.GalleryOp
statuses map[string]*gallery.GalleryOpStatus
ModelGalleryChannel chan GalleryOp[gallery.GalleryModel]
BackendGalleryChannel chan GalleryOp[gallery.GalleryBackend]
modelLoader *model.ModelLoader
statuses map[string]*GalleryOpStatus
}
func NewGalleryService(appConfig *config.ApplicationConfig) *GalleryService {
func NewGalleryService(appConfig *config.ApplicationConfig, ml *model.ModelLoader) *GalleryService {
return &GalleryService{
appConfig: appConfig,
C: make(chan gallery.GalleryOp),
statuses: make(map[string]*gallery.GalleryOpStatus),
appConfig: appConfig,
ModelGalleryChannel: make(chan GalleryOp[gallery.GalleryModel]),
BackendGalleryChannel: make(chan GalleryOp[gallery.GalleryBackend]),
modelLoader: ml,
statuses: make(map[string]*GalleryOpStatus),
}
}
func prepareModel(modelPath string, req gallery.GalleryModel, downloadStatus func(string, string, string, float64), enforceScan bool) error {
config, err := gallery.GetGalleryConfigFromURL(req.URL, modelPath)
if err != nil {
return err
}
config.Files = append(config.Files, req.AdditionalFiles...)
return gallery.InstallModel(modelPath, req.Name, &config, req.Overrides, downloadStatus, enforceScan)
}
func (g *GalleryService) UpdateStatus(s string, op *gallery.GalleryOpStatus) {
func (g *GalleryService) UpdateStatus(s string, op *GalleryOpStatus) {
g.Lock()
defer g.Unlock()
g.statuses[s] = op
}
func (g *GalleryService) GetStatus(s string) *gallery.GalleryOpStatus {
func (g *GalleryService) GetStatus(s string) *GalleryOpStatus {
g.Lock()
defer g.Unlock()
return g.statuses[s]
}
func (g *GalleryService) GetAllStatus() map[string]*gallery.GalleryOpStatus {
func (g *GalleryService) GetAllStatus() map[string]*GalleryOpStatus {
g.Lock()
defer g.Unlock()
@ -63,153 +51,35 @@ func (g *GalleryService) GetAllStatus() map[string]*gallery.GalleryOpStatus {
}
func (g *GalleryService) Start(c context.Context, cl *config.BackendConfigLoader) {
// updates the status with an error
var updateError func(id string, e error)
if !g.appConfig.OpaqueErrors {
updateError = func(id string, e error) {
g.UpdateStatus(id, &GalleryOpStatus{Error: e, Processed: true, Message: "error: " + e.Error()})
}
} else {
updateError = func(id string, _ error) {
g.UpdateStatus(id, &GalleryOpStatus{Error: fmt.Errorf("an error occurred"), Processed: true})
}
}
go func() {
for {
select {
case <-c.Done():
return
case op := <-g.C:
utils.ResetDownloadTimers()
g.UpdateStatus(op.Id, &gallery.GalleryOpStatus{Message: "processing", Progress: 0})
// updates the status with an error
var updateError func(e error)
if !g.appConfig.OpaqueErrors {
updateError = func(e error) {
g.UpdateStatus(op.Id, &gallery.GalleryOpStatus{Error: e, Processed: true, Message: "error: " + e.Error()})
}
} else {
updateError = func(_ error) {
g.UpdateStatus(op.Id, &gallery.GalleryOpStatus{Error: fmt.Errorf("an error occurred"), Processed: true})
}
}
// displayDownload displays the download progress
progressCallback := func(fileName string, current string, total string, percentage float64) {
g.UpdateStatus(op.Id, &gallery.GalleryOpStatus{Message: "processing", FileName: fileName, Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
utils.DisplayDownloadFunction(fileName, current, total, percentage)
}
var err error
// delete a model
if op.Delete {
modelConfig := &config.BackendConfig{}
// Galleryname is the name of the model in this case
dat, err := os.ReadFile(filepath.Join(g.appConfig.ModelPath, op.GalleryModelName+".yaml"))
if err != nil {
updateError(err)
continue
}
err = yaml.Unmarshal(dat, modelConfig)
if err != nil {
updateError(err)
continue
}
files := []string{}
// Remove the model from the config
if modelConfig.Model != "" {
files = append(files, modelConfig.ModelFileName())
}
if modelConfig.MMProj != "" {
files = append(files, modelConfig.MMProjFileName())
}
err = gallery.DeleteModelFromSystem(g.appConfig.ModelPath, op.GalleryModelName, files)
if err != nil {
updateError(err)
continue
}
} else {
// if the request contains a gallery name, we apply the gallery from the gallery list
if op.GalleryModelName != "" {
err = gallery.InstallModelFromGallery(op.Galleries, op.GalleryModelName, g.appConfig.ModelPath, op.Req, progressCallback, g.appConfig.EnforcePredownloadScans)
} else if op.ConfigURL != "" {
err = startup.InstallModels(op.Galleries, g.appConfig.ModelPath, g.appConfig.EnforcePredownloadScans, progressCallback, op.ConfigURL)
if err != nil {
updateError(err)
continue
}
err = cl.Preload(g.appConfig.ModelPath)
} else {
err = prepareModel(g.appConfig.ModelPath, op.Req, progressCallback, g.appConfig.EnforcePredownloadScans)
}
}
case op := <-g.BackendGalleryChannel:
err := g.backendHandler(&op)
if err != nil {
updateError(err)
continue
updateError(op.ID, err)
}
// Reload models
err = cl.LoadBackendConfigsFromPath(g.appConfig.ModelPath)
case op := <-g.ModelGalleryChannel:
err := g.modelHandler(&op, cl)
if err != nil {
updateError(err)
continue
updateError(op.ID, err)
}
err = cl.Preload(g.appConfig.ModelPath)
if err != nil {
updateError(err)
continue
}
g.UpdateStatus(op.Id,
&gallery.GalleryOpStatus{
Deletion: op.Delete,
Processed: true,
GalleryModelName: op.GalleryModelName,
Message: "completed",
Progress: 100})
}
}
}()
}
type galleryModel struct {
gallery.GalleryModel `yaml:",inline"` // https://github.com/go-yaml/yaml/issues/63
ID string `json:"id"`
}
func processRequests(modelPath string, enforceScan bool, galleries []config.Gallery, requests []galleryModel) error {
var err error
for _, r := range requests {
utils.ResetDownloadTimers()
if r.ID == "" {
err = prepareModel(modelPath, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan)
} else {
err = gallery.InstallModelFromGallery(
galleries, r.ID, modelPath, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan)
}
}
return err
}
func ApplyGalleryFromFile(modelPath, s string, enforceScan bool, galleries []config.Gallery) error {
dat, err := os.ReadFile(s)
if err != nil {
return err
}
var requests []galleryModel
if err := yaml.Unmarshal(dat, &requests); err != nil {
return err
}
return processRequests(modelPath, enforceScan, galleries, requests)
}
func ApplyGalleryFromString(modelPath, s string, enforceScan bool, galleries []config.Gallery) error {
var requests []galleryModel
err := json.Unmarshal([]byte(s), &requests)
if err != nil {
return err
}
return processRequests(modelPath, enforceScan, galleries, requests)
}

core/services/models.go Normal file

@ -0,0 +1,153 @@
package services
import (
"encoding/json"
"os"
"path/filepath"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery"
"github.com/mudler/LocalAI/pkg/utils"
"gopkg.in/yaml.v2"
)
func (g *GalleryService) modelHandler(op *GalleryOp[gallery.GalleryModel], cl *config.BackendConfigLoader) error {
utils.ResetDownloadTimers()
g.UpdateStatus(op.ID, &GalleryOpStatus{Message: "processing", Progress: 0})
// progressCallback reports download progress to the status map and the console
progressCallback := func(fileName string, current string, total string, percentage float64) {
g.UpdateStatus(op.ID, &GalleryOpStatus{Message: "processing", FileName: fileName, Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
utils.DisplayDownloadFunction(fileName, current, total, percentage)
}
err := processModelOperation(op, g.appConfig.ModelPath, g.appConfig.EnforcePredownloadScans, progressCallback)
if err != nil {
return err
}
// Reload models
err = cl.LoadBackendConfigsFromPath(g.appConfig.ModelPath)
if err != nil {
return err
}
err = cl.Preload(g.appConfig.ModelPath)
if err != nil {
return err
}
g.UpdateStatus(op.ID,
&GalleryOpStatus{
Deletion: op.Delete,
Processed: true,
GalleryElementName: op.GalleryElementName,
Message: "completed",
Progress: 100})
return nil
}
func prepareModel(modelPath string, req gallery.GalleryModel, downloadStatus func(string, string, string, float64), enforceScan bool) error {
config, err := gallery.GetGalleryConfigFromURL[gallery.ModelConfig](req.URL, modelPath)
if err != nil {
return err
}
config.Files = append(config.Files, req.AdditionalFiles...)
return gallery.InstallModel(modelPath, req.Name, &config, req.Overrides, downloadStatus, enforceScan)
}
type galleryModel struct {
gallery.GalleryModel `yaml:",inline"` // https://github.com/go-yaml/yaml/issues/63
ID string `json:"id"`
}
func processRequests(modelPath string, enforceScan bool, galleries []config.Gallery, requests []galleryModel) error {
var err error
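// NOTE: err is overwritten on each iteration, so if several requests fail only the last error is returned.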
for _, r := range requests {
utils.ResetDownloadTimers()
if r.ID == "" {
err = prepareModel(modelPath, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan)
} else {
err = gallery.InstallModelFromGallery(
galleries, r.ID, modelPath, r.GalleryModel, utils.DisplayDownloadFunction, enforceScan)
}
}
return err
}
func ApplyGalleryFromFile(modelPath, s string, enforceScan bool, galleries []config.Gallery) error {
dat, err := os.ReadFile(s)
if err != nil {
return err
}
var requests []galleryModel
if err := yaml.Unmarshal(dat, &requests); err != nil {
return err
}
return processRequests(modelPath, enforceScan, galleries, requests)
}
func ApplyGalleryFromString(modelPath, s string, enforceScan bool, galleries []config.Gallery) error {
var requests []galleryModel
err := json.Unmarshal([]byte(s), &requests)
if err != nil {
return err
}
return processRequests(modelPath, enforceScan, galleries, requests)
}
// processModelOperation handles the installation or deletion of a model
func processModelOperation(
op *GalleryOp[gallery.GalleryModel],
modelPath string,
enforcePredownloadScans bool,
progressCallback func(string, string, string, float64),
) error {
// delete a model
if op.Delete {
modelConfig := &config.BackendConfig{}
// GalleryElementName is the name of the model in this case
dat, err := os.ReadFile(filepath.Join(modelPath, op.GalleryElementName+".yaml"))
if err != nil {
return err
}
err = yaml.Unmarshal(dat, modelConfig)
if err != nil {
return err
}
files := []string{}
// Collect the files referenced by the config so they can be removed from disk
if modelConfig.Model != "" {
files = append(files, modelConfig.ModelFileName())
}
if modelConfig.MMProj != "" {
files = append(files, modelConfig.MMProjFileName())
}
return gallery.DeleteModelFromSystem(modelPath, op.GalleryElementName, files)
}
// if the request contains a gallery name, we apply the gallery from the gallery list
if op.GalleryElementName != "" {
return gallery.InstallModelFromGallery(op.Galleries, op.GalleryElementName, modelPath, op.Req, progressCallback, enforcePredownloadScans)
// } else if op.ConfigURL != "" {
// err := startup.InstallModels(op.Galleries, modelPath, enforcePredownloadScans, progressCallback, op.ConfigURL)
// if err != nil {
// return err
// }
// return cl.Preload(modelPath)
} else {
return prepareModel(modelPath, op.Req, progressCallback, enforcePredownloadScans)
}
}
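For illustration, a minimal sketch of calling the startup helper above. `ModelPath` and `EnforcePredownloadScans` appear in this diff; the `Galleries` field and the model ID are assumptions:
```go
package main

import (
	"fmt"

	"github.com/mudler/LocalAI/core/config"
	"github.com/mudler/LocalAI/core/services"
)

func preinstall(appConfig *config.ApplicationConfig) {
	// "example-gallery@example-model" is a hypothetical model ID.
	req := `[{"id": "example-gallery@example-model"}]`
	if err := services.ApplyGalleryFromString(
		appConfig.ModelPath,               // models directory
		req,
		appConfig.EnforcePredownloadScans, // scan files before installing
		appConfig.Galleries,               // assumed field listing configured model galleries
	); err != nil {
		fmt.Printf("failed to apply gallery request: %v\n", err)
	}
}
```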


@ -0,0 +1,81 @@
package services
import (
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/pkg/xsync"
)
type GalleryOp[T any] struct {
ID string
GalleryElementName string
Delete bool
Req T
Galleries []config.Gallery
}
type GalleryOpStatus struct {
Deletion bool `json:"deletion"` // Deletion is true if the operation is a deletion
FileName string `json:"file_name"`
Error error `json:"error"`
Processed bool `json:"processed"`
Message string `json:"message"`
Progress float64 `json:"progress"`
TotalFileSize string `json:"file_size"`
DownloadedFileSize string `json:"downloaded_size"`
GalleryElementName string `json:"gallery_element_name"`
}
type OpCache struct {
status *xsync.SyncedMap[string, string]
galleryService *GalleryService
}
func NewOpCache(galleryService *GalleryService) *OpCache {
return &OpCache{
status: xsync.NewSyncedMap[string, string](),
galleryService: galleryService,
}
}
func (m *OpCache) Set(key string, value string) {
m.status.Set(key, value)
}
func (m *OpCache) Get(key string) string {
return m.status.Get(key)
}
func (m *OpCache) DeleteUUID(uuid string) {
for _, k := range m.status.Keys() {
if m.status.Get(k) == uuid {
m.status.Delete(k)
}
}
}
func (m *OpCache) Map() map[string]string {
return m.status.Map()
}
func (m *OpCache) Exists(key string) bool {
return m.status.Exists(key)
}
func (m *OpCache) GetStatus() (map[string]string, map[string]string) {
processingModelsData := m.Map()
taskTypes := map[string]string{}
for k, v := range processingModelsData {
status := m.galleryService.GetStatus(v)
taskTypes[k] = "Installation"
if status != nil && status.Deletion {
taskTypes[k] = "Deletion"
} else if status == nil {
taskTypes[k] = "Waiting"
}
}
return processingModelsData, taskTypes
}
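Putting the pieces together, a rough sketch of how a caller could enqueue a backend installation and track it through `OpCache`. The UUID helper and variable names are assumptions; the channel, op fields, and status APIs are the ones defined above:
```go
package main

import (
	"github.com/google/uuid" // assumed UUID helper

	"github.com/mudler/LocalAI/core/config"
	"github.com/mudler/LocalAI/core/gallery"
	"github.com/mudler/LocalAI/core/services"
)

func installBackend(svc *services.GalleryService, appConfig *config.ApplicationConfig) {
	opID := uuid.New().String()

	// Track which gallery element the operation belongs to.
	cache := services.NewOpCache(svc)
	cache.Set("llm-backend", opID)

	// Enqueue the installation; the worker loop started by Start
	// reads BackendGalleryChannel and runs backendHandler.
	svc.BackendGalleryChannel <- services.GalleryOp[gallery.GalleryBackend]{
		ID:                 opID,
		GalleryElementName: "llm-backend",
		Galleries:          appConfig.BackendGalleries,
	}

	// Poll progress; GetStatus returns nil until the worker reports in.
	if status := svc.GetStatus(opID); status != nil && status.Processed {
		cache.DeleteUUID(opID)
	}
}
```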

docs/content/backends.md Normal file

@ -0,0 +1,118 @@
---
title: "Backends"
description: "Learn how to use, manage, and develop backends in LocalAI"
weight: 4
---
# Backends
LocalAI supports a variety of backends that can be used to run different types of AI models. Core backends ship with LocalAI itself, while additional backends are containerized applications that provide the runtime environment for specific model types, such as LLMs, diffusion models, or text-to-speech models.
## Managing Backends in the UI
The LocalAI web interface provides an intuitive way to manage your backends:
1. Navigate to the "Backends" section in the navigation menu
2. Browse available backends from configured galleries
3. Use the search bar to find specific backends by name, description, or type
4. Filter backends by type using the quick filter buttons (LLM, Diffusion, TTS, Whisper)
5. Install or delete backends with a single click
6. Monitor installation progress in real-time
Each backend card displays:
- Backend name and description
- Type of models it supports
- Installation status
- Action buttons (Install/Delete)
- Additional information via the info button
## Backend Galleries
Backend galleries are repositories that contain backend definitions. They work similarly to model galleries but are specifically for backends.
### Adding a Backend Gallery
You can add backend galleries by specifying the **Environment Variable** `LOCALAI_BACKEND_GALLERIES`:
```bash
export LOCALAI_BACKEND_GALLERIES='[{"name":"my-gallery","url":"https://raw.githubusercontent.com/username/repo/main/backends"}]'
```
The URL needs to point to a valid YAML file, for example:
```yaml
- name: "test-backend"
uri: "quay.io/image/tests:localai-backend-test"
alias: "foo-backend"
```
Here, `uri` is the path to an OCI container image.
### Backend Gallery Structure
A backend gallery is a collection of YAML files, each defining a backend. Here's an example structure:
```yaml
# backends/llm-backend.yaml
name: "llm-backend"
description: "A backend for running LLM models"
uri: "quay.io/username/llm-backend:latest"
alias: "llm"
tags:
- "llm"
- "text-generation"
```
## Pre-installing Backends
You can pre-install backends when starting LocalAI using the `LOCALAI_EXTERNAL_BACKENDS` environment variable:
```bash
export LOCALAI_EXTERNAL_BACKENDS="llm-backend,diffusion-backend"
local-ai run
```
## Creating a Backend
To create a new backend, you need to:
1. Create a container image that implements the LocalAI backend interface
2. Define a backend YAML file
3. Publish your backend to a container registry
### Backend Container Requirements
Your backend container should:
1. Implement the LocalAI backend interface (gRPC or HTTP)
2. Handle model loading and inference
3. Support the required model types
4. Include necessary dependencies
5. Have a top-level `run.sh` file that will be used to run the backend (a minimal sketch follows this list)
6. Be pushed to a registry so it can be used in a gallery
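As an illustration, a minimal `run.sh` sketch is shown below. This is a hedged example, not a reference implementation: the binary path and the way the listen address is passed are assumptions that depend on how your backend is built.
```bash
#!/bin/bash
# Hypothetical entrypoint for a custom backend container.
# The binary path and flag are illustrative; adapt them to your backend.
set -euo pipefail

# Listen address for the backend's gRPC server; assumed to be
# provided by the caller, with a fallback for local testing.
ADDR="${1:-127.0.0.1:50051}"

exec /opt/my-backend/server --addr "${ADDR}"
```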
### Publishing Your Backend
1. Build your container image:
```bash
docker build -t quay.io/username/my-backend:latest .
```
2. Push to a container registry:
```bash
docker push quay.io/username/my-backend:latest
```
3. Add your backend to a gallery:
- Create a YAML entry in your gallery repository
- Include the backend definition
- Make the gallery accessible via HTTP/HTTPS
## Backend Types
LocalAI supports various types of backends:
- **LLM Backends**: For running language models
- **Diffusion Backends**: For image generation
- **TTS Backends**: For text-to-speech conversion
- **Whisper Backends**: For speech-to-text conversion


@ -169,11 +169,6 @@ curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{
}' | aplay
```
### Parler-tts
`parler-tts` can be installed and configured directly from the gallery: https://github.com/huggingface/parler-tts
## Using config files
You can also use a `config-file` to specify TTS models and their parameters.
