feat(llama.cpp): support embeddings endpoints (#2871)

* feat(llama.cpp): add embeddings

Also enable embeddings by default for llama.cpp models

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Makefile): prepare llama.cpp sources only once

Otherwise we keep cloning llama.cpp for each of the variants

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not set embeddings to false

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs: add embeddings to the YAML config reference

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto 2024-07-15 22:54:16 +02:00 committed by GitHub
parent 6564e7ea01
commit 35561edb6e
5 changed files with 44 additions and 12 deletions

@@ -112,6 +112,8 @@ name: "" # Model name, used to identify the model in API calls.
# Precision settings for the model, reducing precision can enhance performance on some hardware.
f16: null # Whether to use 16-bit floating-point precision.
embeddings: true # Enable embeddings for the model.
# Concurrency settings for the application.
threads: null # Number of threads to use for processing.
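
With `embeddings: true` set in a model's YAML config, the model can be queried through the embeddings endpoint. The following is a minimal client sketch, not part of this commit. It assumes the server listens on `http://localhost:8080` and exposes the OpenAI-compatible `/v1/embeddings` route with the usual `data[0].embedding` response shape. The helper names `build_embeddings_request` and `fetch_embedding` are hypothetical, and the model name passed in must match the `name` field of your config.

```python
import json
import urllib.request

# Assumption: default local address; change it to match your deployment.
BASE_URL = "http://localhost:8080"


def build_embeddings_request(model: str, text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the (assumed) OpenAI-compatible embeddings route."""
    # `model` must match the `name` field of the model's YAML config.
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(model: str, text: str, base_url: str = BASE_URL) -> list:
    """Send the request and return the first embedding vector.

    Assumes the response body follows the OpenAI embeddings schema.
    """
    request = build_embeddings_request(model, text, base_url)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["data"][0]["embedding"]
```

Calling `fetch_embedding("my-model", "some text")` against a running server should return a list of floats, where `my-model` is a placeholder for a configured model name. Request construction is kept separate from the network call so that the payload can be inspected without a server.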