feat: cuda transformers (#1401)

* Use cuda in transformers if available

tensorflow probably needs a different check.

Signed-off-by: Erich Schubert <kno10@users.noreply.github.com>

* feat: expose CUDA at top level

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tests: add to tests and create workflow for py extra backends

* doc: update note on how to use core images

---------

Signed-off-by: Erich Schubert <kno10@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Erich Schubert <kno10@users.noreply.github.com>
2025-05-28 06:25:00 +00:00 · 2023-12-08 15:45:04 +01:00 · 2023-12-08 15:45:04 +01:00 · 887b3dff04
commit 887b3dff04
parent 3822bd2369
9 changed files with 163 additions and 11 deletions
--- a/docs/content/advanced/_index.en.md
+++ b/docs/content/advanced/_index.en.md
@@ -207,6 +207,9 @@ lora_adapter: "/path/to/lora/adapter"
 lora_base: "/path/to/lora/base"
 # Disable mulmatq (CUDA)
 no_mulmatq: true
+
+# Diffusers/transformers
+cuda: true
 ```
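
The `cuda: true` option above pairs with the commit's stated intent to use CUDA "if available". How the Python backends actually consume the option is not shown in this diff, so the following is only a minimal sketch of that pattern. The function names `select_device` and `cuda_available` are hypothetical; the only real library call is PyTorch's `torch.cuda.is_available()`.

```python
def select_device(cuda_requested: bool, cuda_is_available: bool) -> str:
    """Return "cuda" only when it is both requested and available.

    Hypothetical helper: falls back to "cpu" in every other case, so a
    config with `cuda: true` still works on a machine without a GPU.
    """
    return "cuda" if cuda_requested and cuda_is_available else "cpu"


def cuda_available() -> bool:
    """Probe PyTorch for CUDA support; report False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    # Assumption: `cuda_requested` would come from the model's YAML config.
    print(select_device(cuda_requested=True, cuda_is_available=cuda_available()))
```

Keeping the availability probe separate from the decision makes the fallback logic easy to test without a GPU. As the commit message notes, a TensorFlow-based model would probably need a different check.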

 ### Prompt templates 
@@ -363,4 +366,32 @@ You can control the backends that are built by setting the `GRPC_BACKENDS` envir
 make GRPC_BACKENDS=backend-assets/grpc/llama-cpp build
 ```

-By default, all the backends are built.
+By default, all the backends are built.
+
+### Extra backends
+
+LocalAI can be extended with extra backends. The backends are implemented as `gRPC` services and can be written in any language. The container images that are built and published on [quay.io](https://quay.io/repository/go-skynet/local-ai?tab=tags) are split into core and extra images. The default images (we call those `extra` images) bring all the dependencies and backends supported by LocalAI. The `-core` images instead bring only the dependencies strictly necessary to run LocalAI with a core set of backends.
+
+If you wish to build a custom container image with extra backends, you can start from the core images and build only the backends you are interested in. For instance, to use the diffusers backend:
+
+```Dockerfile
+FROM quay.io/go-skynet/local-ai:master-ffmpeg-core
+
+RUN PATH=$PATH:/opt/conda/bin make -C backend/python/diffusers
+```
+
+Remember also to set the `EXTERNAL_GRPC_BACKENDS` environment variable (or the `--external-grpc-backends` CLI flag) to point to the backends you are using (`EXTERNAL_GRPC_BACKENDS="backend_name:/path/to/backend"`). For example, with diffusers:
+
+```Dockerfile
+FROM quay.io/go-skynet/local-ai:master-ffmpeg-core
+
+RUN PATH=$PATH:/opt/conda/bin make -C backend/python/diffusers
+
+ENV EXTERNAL_GRPC_BACKENDS="diffusers:/build/backend/python/diffusers/run.sh"
+```
+
+{{% notice note %}}
+
+You can specify remote external backends or paths to local files. The syntax is `backend-name:/path/to/backend` for a local file or `backend-name:host:port` for a remote backend.
+
+{{% /notice %}}
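
To make the two accepted forms concrete, here is a minimal sketch of a parser for a comma-separated value in the `EXTERNAL_GRPC_BACKENDS` style. It is not LocalAI's own parsing code. The names `ExternalBackend` and `parse_external_backends` are hypothetical, and the rule used to tell a path from `host:port` (a leading `/` or a non-numeric final component means path) is an assumption made for illustration.

```python
from typing import Dict, NamedTuple, Optional


class ExternalBackend(NamedTuple):
    """One parsed entry; either `path` or `host`/`port` is set."""
    name: str
    path: Optional[str] = None  # set for name:/path/to/backend
    host: Optional[str] = None  # set for name:host:port
    port: Optional[int] = None


def parse_external_backends(value: str) -> Dict[str, ExternalBackend]:
    """Parse a comma-separated list of `name:target` entries."""
    backends: Dict[str, ExternalBackend] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing commas
        name, sep, target = entry.partition(":")
        if not sep or not name or not target:
            raise ValueError(f"malformed backend entry: {entry!r}")
        host, colon, port = target.rpartition(":")
        # Assumption: absolute paths are local files; otherwise a numeric
        # last component marks a remote host:port address.
        if not target.startswith("/") and colon and host and port.isdigit():
            backends[name] = ExternalBackend(name, host=host, port=int(port))
        else:
            backends[name] = ExternalBackend(name, path=target)
    return backends


if __name__ == "__main__":
    example = "diffusers:/build/backend/python/diffusers/run.sh,remote:localhost:50051"
    for backend in parse_external_backends(example).values():
        print(backend)
```

The `remote:localhost:50051` entry is a made-up address used only to exercise the `host:port` form; the diffusers entry is the one from the Dockerfile above.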
--- a/docs/content/getting_started/_index.en.md
+++ b/docs/content/getting_started/_index.en.md
@@ -178,6 +178,7 @@ You can control LocalAI with command line arguments, to specify a binding addres
 | --watchdog-busy-timeout value | $WATCHDOG_BUSY_TIMEOUT | 5m | Watchdog timeout. This will restart the backend if it crashes.  |
 | --watchdog-idle-timeout value | $WATCHDOG_IDLE_TIMEOUT | 15m | Watchdog idle timeout. This will restart the backend if it crashes. |
 | --preload-backend-only | $PRELOAD_BACKEND_ONLY | false | If set, the api is NOT launched, and only the preloaded models / backends are started. This is intended for multi-node setups. |
+| --external-grpc-backends | $EXTERNAL_GRPC_BACKENDS | none | Comma-separated list of external gRPC backends to use. Format: `name:host:port` or `name:/path/to/file` |

 ### Container images