🔥 add LLaVA support and GPT vision API, Multiple requests for llama.cpp, return JSON types (#1254)

* wip

* wip

* Make it functional

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip

* Small fixups

* do not inject space on role encoding, encode img at beginning of messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add examples/config defaults

* Add include dir of current source dir

* cleanup

* fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

* Revert "fixups"

This reverts commit f1a4731cca.

* fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Author: Ettore Di Giacinto (committed by GitHub)
Date: 2023-11-11 13:14:59 +01:00
Parent: 3b4c5d54d8
Commit: 0eae727366
28 changed files with 26908 additions and 864 deletions
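
A rough sketch of the new GPT-vision-style request format, which follows the OpenAI chat API shape this commit wires up for LLaVA (per the message above, the image is encoded at the beginning of the messages). The base URL, model name, and image URL below are illustrative assumptions, not values taken from this commit:

import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # assumed local instance
    json={
        "model": "llava",  # hypothetical model name from your own config
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What is in this image?"},
                    {"type": "image_url",
                     "image_url": {"url": "https://example.com/cat.jpg"}},
                ],
            }
        ],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])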

.env (5 changes)

@@ -66,4 +66,7 @@ MODELS_PATH=/models
 ### Python backends GRPC max workers
 ### Default number of workers for GRPC Python backends.
 ### This actually controls whether a backend can process multiple requests or not.
-# PYTHON_GRPC_MAX_WORKERS=1
+# PYTHON_GRPC_MAX_WORKERS=1
+### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
+# LLAMACPP_PARALLEL=1
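
LLAMACPP_PARALLEL is what lets a single llama.cpp backend serve several requests at once instead of queueing them. A minimal sketch of exercising it, assuming an instance on localhost:8080 started with LLAMACPP_PARALLEL=4 and a hypothetical model name:

import concurrent.futures
import requests

def ask(prompt: str) -> str:
    # One OpenAI-compatible chat request; endpoint and model are assumptions.
    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json={
            "model": "llama",  # hypothetical model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=300,
    )
    return resp.json()["choices"][0]["message"]["content"]

# Fire four prompts concurrently; with LLAMACPP_PARALLEL=4 the backend
# should process them in parallel rather than one at a time.
prompts = [f"Count to {n}" for n in (3, 5, 7, 9)]
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    for answer in pool.map(ask, prompts):
        print(answer)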