docs: Initial import from localai-website (#1312)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This commit is contained in:
parent 763f94ca80
commit c5c77d2b0d
66 changed files with 6111 additions and 0 deletions
38 docs/content/model-compatibility/autogptq.md Normal file
@@ -0,0 +1,38 @@
+++
disableToc = false
title = "🦙 AutoGPTQ"
weight = 3
+++

[AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) is an easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

## Prerequisites

This is an extra backend: it is already available in the container images, so no additional setup is needed there.

If you are building LocalAI locally, you need to install [AutoGPTQ manually](https://github.com/PanQiWei/AutoGPTQ#quick-installation).
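
A minimal sketch of the manual installation, following the AutoGPTQ quick-installation instructions (the exact wheel depends on your Python and CUDA versions):

```bash
# Install AutoGPTQ from PyPI; see the AutoGPTQ quick-installation guide
# for CUDA/ROCm-specific wheels if this default does not fit your setup
pip install auto-gptq
```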

## Model setup

Models are automatically downloaded from Hugging Face the first time they are requested, if not already present locally. You can define models via a `YAML` config file, or simply query the endpoint with the Hugging Face repository model name. For example, create a `YAML` config file in `models/`:

```yaml
name: orca
backend: autogptq
model_base_name: "orca_mini_v2_13b-GPTQ-4bit-128g.no-act.order"
parameters:
  model: "TheBloke/orca_mini_v2_13b-GPTQ"
# ...
```

Test with:

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
   "model": "orca",
   "messages": [{"role": "user", "content": "How are you?"}],
   "temperature": 0.1
}'
```
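
Since models can also be referenced by their Hugging Face repository name (as mentioned above), a request without a `YAML` file may look roughly like the following sketch, which assumes the repository name is accepted directly as the `model` field:

```bash
# Assumes the Hugging Face repository name can be passed as the model name;
# the first request may trigger a download of the model weights
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
   "model": "TheBloke/orca_mini_v2_13b-GPTQ",
   "messages": [{"role": "user", "content": "How are you?"}],
   "temperature": 0.1
}'
```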