mirror of
https://github.com/mudler/LocalAI.git
synced 2025-05-28 06:25:00 +00:00
feat: cuda transformers (#1401)
* Use cuda in transformers if available

  tensorflow probably needs a different check.

  Signed-off-by: Erich Schubert <kno10@users.noreply.github.com>

* feat: expose CUDA at top level

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tests: add to tests and create workflow for py extra backends

* doc: update note on how to use core images

---------

Signed-off-by: Erich Schubert <kno10@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Erich Schubert <kno10@users.noreply.github.com>
parent 3822bd2369
commit 887b3dff04
9 changed files with 163 additions and 11 deletions
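The first item of the commit message, using CUDA in the transformers backend when it is available, might look roughly like the following. This is a minimal sketch and not the code from this commit: `pick_device` and its `cuda_requested` parameter are hypothetical names (the latter standing in for the `cuda: true` model option), and the sketch assumes a PyTorch-based backend. It falls back to the CPU when PyTorch is missing or reports no usable GPU.

```python
def pick_device(cuda_requested: bool) -> str:
    """Return "cuda" only if it was requested and PyTorch reports a usable GPU.

    Hypothetical helper for illustration; not taken from the LocalAI sources.
    """
    if not cuda_requested:
        return "cpu"
    try:
        import torch  # imported lazily so CPU-only installs still work
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    # Prints "cuda" on a machine with a working CUDA setup, "cpu" otherwise.
    print(pick_device(cuda_requested=True))
```

As the commit message notes, a TensorFlow-based model would probably need a different check.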
@ -207,6 +207,9 @@ lora_adapter: "/path/to/lora/adapter"
lora_base: "/path/to/lora/base"
# Disable mulmatq (CUDA)
no_mulmatq: true

# Diffusers/transformers
cuda: true
```

### Prompt templates

@ -363,4 +366,32 @@ You can control the backends that are built by setting the `GRPC_BACKENDS` envir
make GRPC_BACKENDS=backend-assets/grpc/llama-cpp build
```

By default, all the backends are built.

### Extra backends

LocalAI can be extended with extra backends. The backends are implemented as `gRPC` services and can be written in any language. The container images built and published on [quay.io](https://quay.io/repository/go-skynet/local-ai?tab=tags) are split into core and extra images. By default, images bring all the dependencies and backends supported by LocalAI (we call those `extra` images). The `-core` images instead bring only the dependencies strictly necessary to run LocalAI with only a core set of backends.

If you wish to build a custom container image with extra backends, you can use the core images and build only the backends you are interested in. For instance, to use the diffusers backend:

```Dockerfile
FROM quay.io/go-skynet/local-ai:master-ffmpeg-core

RUN PATH=$PATH:/opt/conda/bin make -C backend/python/diffusers
```

Remember also to set the `EXTERNAL_GRPC_BACKENDS` environment variable (or the `--external-grpc-backends` CLI flag) to point to the backends you are using (`EXTERNAL_GRPC_BACKENDS="backend_name:/path/to/backend"`), for example with diffusers:

```Dockerfile
FROM quay.io/go-skynet/local-ai:master-ffmpeg-core

RUN PATH=$PATH:/opt/conda/bin make -C backend/python/diffusers

ENV EXTERNAL_GRPC_BACKENDS="diffusers:/build/backend/python/diffusers/run.sh"
```

{{% notice note %}}

You can specify remote external backends or paths to local files. The syntax is `backend-name:/path/to/backend` or `backend-name:host:port`.

{{% /notice %}}
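
The syntax in the note above can be illustrated with a small parser. This is only a sketch of the format, not LocalAI's own parsing code: `parse_external_backends` is a hypothetical helper, and it assumes the comma-separated list form described for `--external-grpc-backends`, with everything after the first colon of each entry treated as the backend address (a file path or `host:port`). The `remote` backend name and its address in the example are made up for illustration.

```python
def parse_external_backends(value: str) -> dict[str, str]:
    """Map backend names to their address (a file path or host:port).

    Hypothetical helper for illustration; not taken from the LocalAI sources.
    """
    backends: dict[str, str] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty items such as a trailing comma
        # Split on the first colon only, so "host:port" stays intact.
        name, sep, target = entry.partition(":")
        if not sep or not name or not target:
            raise ValueError(f"expected name:target, got {entry!r}")
        backends[name] = target
    return backends


if __name__ == "__main__":
    # "remote" and its address are invented values for this example.
    example = "diffusers:/build/backend/python/diffusers/run.sh,remote:127.0.0.1:50051"
    for name, target in parse_external_backends(example).items():
        print(f"{name} -> {target}")
```

Splitting on the first colon keeps both forms unambiguous as long as backend names themselves contain no colon.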

@ -178,6 +178,7 @@ You can control LocalAI with command line arguments, to specify a binding addres
| --watchdog-busy-timeout value | $WATCHDOG_BUSY_TIMEOUT | 5m | Watchdog timeout. This will restart the backend if it crashes. |
| --watchdog-idle-timeout value | $WATCHDOG_IDLE_TIMEOUT | 15m | Watchdog idle timeout. This will restart the backend if it crashes. |
| --preload-backend-only | $PRELOAD_BACKEND_ONLY | false | If set, the api is NOT launched, and only the preloaded models / backends are started. This is intended for multi-node setups. |
| --external-grpc-backends | $EXTERNAL_GRPC_BACKENDS | none | Comma-separated list of external gRPC backends to use. Format: `name:host:port` or `name:/path/to/file` |

### Container images
