How To Updates / Model Used Switched / Removed "docker-compose" (RIP) (#1417)
* Update _index.md
* Update easy-model.md
* Update easy-setup-docker-cpu.md
* Update easy-setup-docker-gpu.md
* Update _index.en.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
This commit is contained in:
parent 4a965e1b0e
commit 9222bec8b1

5 changed files with 75 additions and 64 deletions
@@ -5,43 +5,52 @@ title = "Easy Model Setup"
weight = 2
+++

Let's learn how to set up a model. For this ``How To`` we are going to use the ``Dolphin 2.2.1 Mistral 7B`` model.

To download the model to your models folder, run this command in a command line of your choosing.
```bash
curl --location 'http://localhost:8080/models/apply' \
--header 'Content-Type: application/json' \
--data-raw '{
    "id": "TheBloke/dolphin-2.2.1-mistral-7B-GGUF/dolphin-2.2.1-mistral-7b.Q4_0.gguf"
}'
```
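
If you want to be sure the download landed, check the models folder. A quick sanity check could look like this sketch, assuming the default ``models`` directory from the Docker setup (adjust the path if your compose file mounts a different one):

```bash
# Assumes the default `models` folder from the LocalAI docker setup;
# adjust the path if your compose file mounts a different directory.
ls -lh models/dolphin-2.2.1-mistral-7b.Q4_0.gguf
```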

Each model needs at least ``5`` files. Without these files the model will run raw; what that means is you cannot change any of the model's settings.
```
File 1 - The model's GGUF file
File 2 - The model's .yaml file
File 3 - The Chat API .tmpl file
File 4 - The Chat API helper .tmpl file
File 5 - The Completion API .tmpl file
```
So let's fix that! We are using the ``lunademo`` name for this ``How To``, but you can name the files whatever you want! Let's make blank files to start with:

```bash
touch lunademo-chat.tmpl
touch lunademo-chat-block.tmpl
touch lunademo-completion.tmpl
touch lunademo.yaml
```
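
With the model file downloaded and the four blank files created, the models folder should look roughly like this sketch (only the file names matter; the tree layout is illustrative):

```txt
models/
├── dolphin-2.2.1-mistral-7b.Q4_0.gguf
├── lunademo-chat.tmpl
├── lunademo-chat-block.tmpl
├── lunademo-completion.tmpl
└── lunademo.yaml
```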

Now let's edit `"lunademo-chat.tmpl"`. This is the template that chat-trained models use, but changed for LocalAI:

```txt
<|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "user"}}user{{end}}
{{if .Content}}{{.Content}}{{end}}
<|im_end|>
```
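
To see what this template produces: for a user message of "How are you?", LocalAI fills in ``.RoleName`` and ``.Content``, so each chat message renders roughly like this (illustrative output, not a file you need to create):

```txt
<|im_start|>user
How are you?
<|im_end|>
```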

For `"lunademo-chat-block.tmpl"`: looking at the Hugging Face repo, this model uses the ``<|im_start|>assistant`` tag for when the AI replies, so let's make sure to add that to this file. Do not add the user, as we will be doing that in our yaml file!

```txt
{{.Input}}

<|im_start|>assistant
```
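
Putting the two templates together: ``{{.Input}}`` holds the already-rendered chat messages, so the final prompt handed to the model ends up looking roughly like this (the system line is a made-up example):

```txt
<|im_start|>system
You are a helpful assistant.
<|im_end|>
<|im_start|>user
How are you?
<|im_end|>
<|im_start|>assistant
```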

Now in the `"lunademo-completion.tmpl"` file let's add this. (This is a holdover from OpenAI V0.)

```txt
{{.Input}}
```
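
Because this template passes ``{{.Input}}`` straight through, a completion request's prompt reaches the model untouched. Once the yaml file below is in place, a quick test could look like this sketch (the prompt and ``max_tokens`` are arbitrary):

```bash
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
  "model": "lunademo",
  "prompt": "The capital of France is",
  "max_tokens": 10
}'
```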

@@ -58,25 +67,18 @@ What this does is tell ``LocalAI`` how to load the model. Then we are going to **
```yaml
name: lunademo
parameters:
  model: dolphin-2.2.1-mistral-7b.Q4_0.gguf
```

Now that LocalAI knows what file to load with our request, let's add the template files to our model's yaml file.

```yaml
template:
  chat: lunademo-chat-block
  chat_message: lunademo-chat
  completion: lunademo-completion
```
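
Putting the pieces so far together, ``lunademo.yaml`` now reads like this (a full reference copy, with tuning settings, appears further down):

```yaml
name: lunademo
parameters:
  model: dolphin-2.2.1-mistral-7b.Q4_0.gguf
template:
  chat: lunademo-chat-block
  chat_message: lunademo-chat
  completion: lunademo-completion
```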

If you are running on ``GPU`` or want to tune the model, you can add settings like the following (the higher the ``gpu_layers`` value, the more the GPU is used):
```yaml
f16: true
gpu_layers: 4
```

@@ -85,8 +87,7 @@ gpu_layers: 4

These settings let you fully tune the model to your liking. But be warned, you **must** restart ``LocalAI`` after changing a yaml file:

```bash
docker compose restart
```
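
If the model misbehaves after a yaml change, the container logs are the first place to look. For example, from the folder with your compose file:

```bash
# Follow LocalAI's logs while it reloads the model (Ctrl-C to stop)
docker compose logs -f
```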

If you want to check your model's yaml, here is a full copy!

@@ -96,19 +97,18 @@ context_size: 2000
```yaml
context_size: 2000
##Put settings right here for tuning!! Before name but after Backend!
name: lunademo
parameters:
  model: dolphin-2.2.1-mistral-7b.Q4_0.gguf
template:
  chat: lunademo-chat-block
  chat_message: lunademo-chat
  completion: lunademo-completion
```

Now that we have that set up, let's test it out by sending a [request]({{%relref "easy-request" %}}) to LocalAI!

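The request page covers the details, but a minimal chat call against this model looks something like the following (the message and ``temperature`` are just examples):

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "lunademo",
  "messages": [{"role": "user", "content": "How are you?"}],
  "temperature": 0.9
}'
```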

## ----- Adv Stuff -----

**(Please do not run these steps if you have already done the setup.)**

Alright, now that we have learned how to set up our own models, here is how to use the gallery to do a lot of this for us. This command will download and set up the model (mostly; we will **always** need to edit our yaml file to fit our computer / hardware):
```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{