chore(model-gallery): add more quants for popular models (#3365)

* models(gallery): add higher quants for some llama and hermes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* models(gallery): vllm: specify a reasonable max_tokens

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto 2024-08-24 00:29:24 +02:00 committed by GitHub
parent ac5f6f210b
commit 84d6e5a987
GPG key ID: B5690EEEBB952194
3 changed files with 53 additions and 0 deletions

@@ -2,6 +2,9 @@
name: "vllm"
config_file: |
  context_size: 8192
  parameters:
    max_tokens: 8192
  backend: vllm
  function:
    disable_no_action: true