Mirror of https://github.com/mudler/LocalAI.git, synced 2025-05-20 18:45:00 +00:00
chore(model gallery): add gemma-3-12b-it-qat (#5117)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Parent commit: 6af3f46bc3
Commit: 128612a6fc
1 changed file with 18 additions and 0 deletions
```diff
@@ -78,6 +78,24 @@
       - filename: gemma-3-1b-it-Q4_K_M.gguf
         sha256: 8ccc5cd1f1b3602548715ae25a66ed73fd5dc68a210412eea643eb20eb75a135
         uri: huggingface://ggml-org/gemma-3-1b-it-GGUF/gemma-3-1b-it-Q4_K_M.gguf
+- !!merge <<: *gemma3
+  name: "gemma-3-12b-it-qat"
+  urls:
+    - https://huggingface.co/google/gemma-3-12b-it
+    - https://huggingface.co/vinimuchulski/gemma-3-12b-it-qat-q4_0-gguf
+  description: |
+    This model corresponds to the 12B instruction-tuned version of the Gemma 3 model in GGUF format using Quantization Aware Training (QAT). The GGUF corresponds to Q4_0 quantization.
+
+    Thanks to QAT, the model is able to preserve similar quality as bfloat16 while significantly reducing the memory requirements to load the model.
+
+    You can find the half-precision version here.
+  overrides:
+    parameters:
+      model: gemma-3-12b-it-q4_0.gguf
+  files:
+    - filename: gemma-3-12b-it-q4_0.gguf
+      sha256: 6f1bb5f455414f7b46482bda51cbfdbf19786e21a5498c4403fdfc03d09b045c
+      uri: huggingface://vinimuchulski/gemma-3-12b-it-qat-q4_0-gguf/gemma-3-12b-it-q4_0.gguf
 - !!merge <<: *gemma3
   name: "qgallouedec_gemma-3-27b-it-codeforces-sft"
   urls:
```
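The `sha256` fields in the gallery entry let a client verify the downloaded GGUF file before loading it. A minimal sketch of that check in Python, using only the standard library (the local file path is a hypothetical download location, not something the gallery config specifies):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks
    so large model files are never loaded into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Expected digest taken from the gemma-3-12b-it-qat entry above.
EXPECTED = "6f1bb5f455414f7b46482bda51cbfdbf19786e21a5498c4403fdfc03d09b045c"

if __name__ == "__main__":
    # Hypothetical local path to the downloaded model file.
    digest = sha256_of("gemma-3-12b-it-q4_0.gguf")
    print("OK" if digest == EXPECTED else f"checksum mismatch: {digest}")
```

If the digest does not match, the download is corrupt or tampered with and should be discarded rather than loaded.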