feat(llama.cpp): support embeddings endpoints (#2871)

* feat(llama.cpp): add embeddings

  Also enable embeddings by default for llama.cpp models

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(Makefile): prepare llama.cpp sources only once

  Otherwise we keep cloning llama.cpp for each of the variants

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not set embeddings to false

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs: add embeddings to the YAML config reference

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-20 10:35:01 +00:00 · 2024-07-15 22:54:16 +02:00 · 2024-07-15 22:54:16 +02:00 · 35561edb6e
commit 35561edb6e
parent 6564e7ea01
5 changed files with 44 additions and 12 deletions
--- a/docs/content/docs/advanced/advanced-usage.md
+++ b/docs/content/docs/advanced/advanced-usage.md
@@ -112,6 +112,8 @@ name: "" # Model name, used to identify the model in API calls.
 # Precision settings for the model, reducing precision can enhance performance on some hardware.
 f16: null # Whether to use 16-bit floating-point precision.

+embeddings: true # Enable embeddings for the model.
+
 # Concurrency settings for the application.
 threads: null # Number of threads to use for processing.
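
With `embeddings: true` set in a model's YAML config, the model can be queried for vectors. The following is a minimal sketch of such a request, not code from this commit. It assumes the server exposes an OpenAI-style `/v1/embeddings` route at `http://localhost:8080`; the model name `my-embedding-model` and the helper names are hypothetical placeholders.

```python
import json
import urllib.request

# Assumption: the server listens here; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def build_embeddings_request(model: str, text: str,
                             base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for an OpenAI-style embeddings route (assumed path)."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_vector(response_body: dict) -> list:
    """Return the first embedding vector from an OpenAI-style response body."""
    return response_body["data"][0]["embedding"]


def embed(model: str, text: str) -> list:
    """Send the request. Requires a running server, so it is not called here."""
    with urllib.request.urlopen(build_embeddings_request(model, text)) as resp:
        return extract_vector(json.load(resp))
```

Calling `embed("my-embedding-model", "hello")` against a running instance should return a list of floats, provided the named model's config has `embeddings: true`. Request construction and response parsing are kept separate from the network call so they can be checked without a server.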