feat: allow to run parallel requests (#1290)

* feat: allow to run parallel requests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto 2023-11-16 08:20:05 +01:00 committed by GitHub
parent 66a558ff41
commit fdd95d1d86
GPG key ID: 4AEE18F83AFDEB23
9 changed files with 91 additions and 44 deletions

5
.env

@@ -69,4 +69,7 @@ MODELS_PATH=/models
 # PYTHON_GRPC_MAX_WORKERS=1
 ### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
-# LLAMACPP_PARALLEL=1
+# LLAMACPP_PARALLEL=1
+### Enable to run parallel requests
+# PARALLEL_REQUESTS=true