docs: Initial import from localai-website (#1312)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This commit is contained in:
Ettore Di Giacinto 2023-11-22 18:13:50 +01:00 committed by GitHub
parent 763f94ca80
commit c5c77d2b0d
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
66 changed files with 6111 additions and 0 deletions

View file

@ -0,0 +1,23 @@
+++
disableToc = false
title = "How-tos"
weight = 9
+++
## How-tos
This section includes LocalAI end-to-end examples, tutorials, and how-tos curated by the community and maintained by [lunamidori5](https://github.com/lunamidori5).
- [Setup LocalAI with Docker on CPU]({{%relref "howtos/easy-setup-docker-cpu" %}})
- [Setup LocalAI with Docker With CUDA]({{%relref "howtos/easy-setup-docker-gpu" %}})
- [Setting up a Model]({{%relref "howtos/easy-model-import-downloaded" %}})
- [Making requests via Autogen]({{%relref "howtos/easy-request-autogen" %}})
- [Making requests via OpenAI API V0]({{%relref "howtos/easy-request-openai-v0" %}})
- [Making requests via OpenAI API V1]({{%relref "howtos/easy-request-openai-v1" %}})
- [Making requests via Curl]({{%relref "howtos/easy-request-curl" %}})
## Programs and Demos
This section covers other programs and how to set up, install, and use them with LocalAI.
- [Python LocalAI Demo]({{%relref "howtos/easy-setup-full" %}}) - [lunamidori5](https://github.com/lunamidori5)
- [Autogen]({{%relref "howtos/autogen-setup" %}}) - [lunamidori5](https://github.com/lunamidori5)

View file

@ -0,0 +1,91 @@
+++
disableToc = false
title = "Easy Demo - AutoGen"
weight = 2
+++
This is a short demo of using ``LocalAI`` with Autogen. It assumes you already have a model set up.
```python
import os
import openai
import autogen
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
config_list_json = [
{
"model": "gpt-3.5-turbo",
"api_base": "http://[YOURLOCALAIIPHERE]:8080/v1",
"api_type": "open_ai",
"api_key": "NULL",
}
]
print("models to use: ", [config_list_json[i]["model"] for i in range(len(config_list_json))])
llm_config = {"config_list": config_list_json, "seed": 42}
user_proxy = autogen.UserProxyAgent(
name="Admin",
system_message="A human admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.",
code_execution_config={
"work_dir": "coding",
"last_n_messages": 8,
"use_docker": "python:3",
},
human_input_mode="ALWAYS",
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
)
engineer = autogen.AssistantAgent(
name="Coder",
llm_config=llm_config,
)
scientist = autogen.AssistantAgent(
name="Scientist",
llm_config=llm_config,
system_message="""Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code."""
)
planner = autogen.AssistantAgent(
name="Planner",
system_message='''Planner. Suggest a plan. Revise the plan based on feedback from admin and critic, until admin approval.
The plan may involve an engineer who can write code and a scientist who doesn't write code.
Explain the plan first. Be clear which step is performed by an engineer, and which step is performed by a scientist.
''',
llm_config=llm_config,
)
executor = autogen.UserProxyAgent(
name="Executor",
system_message="Executor. Execute the code written by the engineer and report the result.",
human_input_mode="NEVER",
code_execution_config={
"work_dir": "coding",
"last_n_messages": 8,
"use_docker": "python:3",
}
)
critic = autogen.AssistantAgent(
name="Critic",
system_message="Critic. Double check plan, claims, code from other agents and provide feedback. Check whether the plan includes adding verifiable info such as source URL.",
llm_config=llm_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor, critic], messages=[], max_round=999)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
#autogen.ChatCompletion.start_logging()
#text_input = input("Please enter request: ")
text_input = ("Change this to a task you would like the group chat to do or comment this out and uncomment the other line!")
#Uncomment one of these two chats based on what you would like to do
#user_proxy.initiate_chat(engineer, message=str(text_input))
#For a one on one chat use this one ^
#user_proxy.initiate_chat(manager, message=str(text_input))
#To setup a group chat use this one ^
```

View file

@ -0,0 +1,135 @@
+++
disableToc = false
title = "Easy Model Setup"
weight = 2
+++
Let's learn how to set up a model. For this ``How To`` we are going to use the ``luna-ai-llama2`` model (yes, I know - ``Luna Midori`` writing a how-to using the ``luna-ai-llama2`` model, haha).
To download the model to your models folder, run this command in a command line of your choice.
```bash
curl --location 'http://localhost:8080/models/apply' \
--header 'Content-Type: application/json' \
--data-raw '{
"id": "TheBloke/Luna-AI-Llama2-Uncensored-GGUF/luna-ai-llama2-uncensored.Q4_K_M.gguf"
}'
```
Each model needs at least ``4`` files. Without these files the model will run "raw", which means you cannot change any of the model's settings.
```
File 1 - The model's GGUF file
File 2 - The model's .yaml file
File 3 - The Chat API .tmpl file
File 4 - The Completion API .tmpl file
```
So let's fix that! We are using the ``lunademo`` name for this ``How To``, but you can name the files whatever you want. Let's make blank files to start with:
```bash
touch lunademo-chat.tmpl
touch lunademo-completion.tmpl
touch lunademo.yaml
```
Now let's edit `"lunademo-chat.tmpl"`. Looking at the Hugging Face repo, this model uses the ``ASSISTANT:`` tag for the AI's replies, so let's make sure to add that to this file. Do not add the user tag here, as we will be doing that in our yaml file!
```txt
{{.Input}}
ASSISTANT:
```
Now in the `"lunademo-completion.tmpl"` file let's add this:
```txt
Complete the following sentence: {{.Input}}
```
For the `"lunademo.yaml"` file, let's set it up for your computer or hardware. (If you want to see advanced yaml configs, see the [advanced documentation](https://localai.io/advanced/).)
We are going to set up the backend and context size first.
```yaml
backend: llama
context_size: 2000
```
This tells ``LocalAI`` how to load the model. Next we are going to **add** our settings after that. Let's add the model's name and the model's settings. The model's ``name:`` is what you will put into your request when sending an ``OpenAI`` request to ``LocalAI``.
```yaml
name: lunademo
parameters:
model: luna-ai-llama2-uncensored.Q4_K_M.gguf
temperature: 0.2
top_k: 40
top_p: 0.65
```
Now that we have the model set up, there are a few things we should add to the yaml file to make it run better. This model uses the following roles:
```yaml
roles:
assistant: 'ASSISTANT:'
system: 'SYSTEM:'
user: 'USER:'
```
This makes sure that ``LocalAI`` adds the right role tag to each message in the request, so if a message is from ``system`` it shows up in the template as ``SYSTEM:``. Speaking of template files, let's add those to our model's yaml file now.
```yaml
template:
chat: lunademo-chat
completion: lunademo-completion
```
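Putting the roles and templates together: a chat request with a system and a user message is (roughly) rendered into a prompt like the one below before it is sent to the model. This is an illustrative sketch of the rendering; the exact spacing may differ.
```txt
SYSTEM: You are a helpful assistant.
USER: How are you?
ASSISTANT:
```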
If you are running on a ``GPU`` or want to tune the model, you can add settings like:
```yaml
f16: true
gpu_layers: 4
```
to fully tune the model to your liking. But be warned, you **must** restart ``LocalAI`` after changing a yaml file:
```bash
docker-compose restart # if you are using docker-compose (v1)
docker compose restart  # if you are using the docker compose plugin (v2)
```
If you want to check your model's yaml, here is a full copy!
```yaml
backend: llama
context_size: 2000
## Put tuning settings right here - before name but after backend!
name: lunademo
parameters:
model: luna-ai-llama2-uncensored.Q4_K_M.gguf
temperature: 0.2
top_k: 40
top_p: 0.65
roles:
assistant: 'ASSISTANT:'
system: 'SYSTEM:'
user: 'USER:'
template:
chat: lunademo-chat
completion: lunademo-completion
```
Now that we have that set up, let's test it out by sending a request using [Curl]({{%relref "easy-request-curl" %}}) or the [OpenAI Python API]({{%relref "easy-request-openai-v1" %}})!
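For example, a minimal chat request to the model we just configured could look like this with the ``openai`` Python package (v1). This is a sketch that assumes LocalAI is reachable at ``localhost:8080``:
```python
from openai import OpenAI

# The api_key is a dummy value; LocalAI does not validate it by default
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")

completion = client.chat.completions.create(
    model="lunademo",  # must match the name: field in lunademo.yaml
    messages=[{"role": "user", "content": "How are you?"}],
    temperature=0.9,
)
print(completion.choices[0].message.content)
```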
## Advanced Stuff
Alright, now that we have learned how to set up our own models, here is how to use the gallery to do a lot of this for us. This command will download and set up the model (mostly - we will **always** need to edit our yaml file to fit our computer / hardware):
```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
"id": "model-gallery@lunademo"
}'
```
This will set up the model, the model's yaml, and both template files (you will see it only created one, as the completions endpoint is out of date and not supported by ``OpenAI``; if you need one, just follow the steps from before to make one).
If you would like to download a raw model using the gallery API, you can run this command. You will still need to set up the 3 other files needed to run the model, though!
```bash
curl --location 'http://localhost:8080/models/apply' \
--header 'Content-Type: application/json' \
--data-raw '{
"id": "NAME_OFF_HUGGINGFACE/REPO_NAME/MODENAME.gguf",
"name": "REQUSTNAME"
}'
```
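If you prefer Python over curl, roughly the same request can be sent with the ``requests`` package. This is just a sketch of the call above, using the same placeholder values:
```python
import requests

# Same placeholders as the curl example above - replace them with real values
payload = {
    "id": "HUGGINGFACE_USERNAME/REPO_NAME/MODELNAME.gguf",
    "name": "REQUESTNAME",
}
response = requests.post("http://localhost:8080/models/apply", json=payload)
print(response.json())
```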

View file

@ -0,0 +1 @@

View file

@ -0,0 +1,35 @@
+++
disableToc = false
title = "Easy Request - Curl"
weight = 2
+++
Now we can make a curl request!
Curl Chat API -
```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "lunademo",
"messages": [{"role": "user", "content": "How are you?"}],
"temperature": 0.9
}'
```
Curl Completion API -
```bash
curl --request POST \
--url http://localhost:8080/v1/completions \
--header 'Content-Type: application/json' \
--data '{
"model": "lunademo",
"prompt": "function downloadFile(string url, string outputPath) {",
"max_tokens": 256,
"temperature": 0.5
}'
```
See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!

View file

@ -0,0 +1,50 @@
+++
disableToc = false
title = "Easy Request - Openai V0"
weight = 2
+++
This is for Python with ``OpenAI`` == ``0.28.1``. If you are on ``OpenAI`` >= ``V1``, please use this [How to]({{%relref "howtos/easy-request-openai-v1" %}}).
OpenAI Chat API Python -
```python
import os
import openai
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
completion = openai.ChatCompletion.create(
model="lunademo",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "How are you?"}
]
)
print(completion.choices[0].message.content)
```
OpenAI Completion API Python -
```python
import os
import openai
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
completion = openai.Completion.create(
model="lunademo",
prompt="function downloadFile(string url, string outputPath) ",
max_tokens=256,
temperature=0.5)
print(completion.choices[0].text)
```
See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!

View file

@ -0,0 +1,28 @@
+++
disableToc = false
title = "Easy Request - Openai V1"
weight = 2
+++
This is for Python with ``OpenAI`` >= ``V1``. If you are on ``OpenAI`` < ``V1``, please use this [How to]({{%relref "howtos/easy-request-openai-v0" %}}).
OpenAI Chat API Python -
```python
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")
messages = [
{"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
{"role": "user", "content": "Hello How are you today LocalAI"}
]
completion = client.chat.completions.create(
model="lunademo",
messages=messages,
)
print(completion.choices[0].message)
```
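If you also need the legacy completions endpoint (as in the V0 how-to), the v1 client exposes it too. A minimal sketch, assuming the same ``lunademo`` model:
```python
from openai import OpenAI

# Dummy key; LocalAI does not validate it by default
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")

completion = client.completions.create(
    model="lunademo",
    prompt="function downloadFile(string url, string outputPath) {",
    max_tokens=256,
    temperature=0.5,
)
print(completion.choices[0].text)
```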
See [OpenAI API](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!

View file

@ -0,0 +1,131 @@
+++
disableToc = false
title = "Easy Setup - CPU Docker"
weight = 2
+++
{{% notice Note %}}
- You will need about 10GB of free RAM
- You will need about 15GB of free disk space (on your C drive, if on Windows) for ``Docker``
{{% /notice %}}
We are going to run `LocalAI` with `docker-compose` for this setup.
Let's clone `LocalAI` with git.
```bash
git clone https://github.com/go-skynet/LocalAI
```
Then we will cd into the ``LocalAI`` folder.
```bash
cd LocalAI
```
At this point we want to set up our `.env` file; here is a copy for you to use if you wish. Please make sure it stays consistent with the `docker-compose` file we set up later.
```bash
## Set number of threads.
## Note: prefer the number of physical cores. Overbooking the CPU degrades performance notably.
THREADS=2
## Specify a different bind address (defaults to ":8080")
# ADDRESS=127.0.0.1:8080
## Define galleries.
## Models to install will be visible in `/models/available`
GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]
## Default path for models
MODELS_PATH=/models
## Enable debug mode
# DEBUG=true
## Disable COMPEL (lets Stable Diffusion work; uncomment if you plan on using it)
# COMPEL=0
## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true
## Specify a build type. Available: cublas, openblas, clblas.
BUILD_TYPE=cublas
## Uncomment and set to true to enable rebuilding from source
# REBUILD=true
## Enable go tags, available: stablediffusion, tts
## stablediffusion: image generation with stablediffusion
## tts: enables text-to-speech with go-piper
## (requires REBUILD=true)
#
#GO_TAGS=tts
## Path where to store generated images
# IMAGE_PATH=/tmp
## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT
# HUGGINGFACEHUB_API_TOKEN=Token here
```
Now that we have the `.env` set, let's set up our `docker-compose` file.
It will use a container from [quay.io](https://quay.io/repository/go-skynet/local-ai?tab=tags).
Also note this `docker-compose` file is for `CPU` only.
```yaml
version: '3.6'
services:
api:
image: quay.io/go-skynet/local-ai:v1.40.0
tty: true # enable colorized logs
restart: always # should this be on-failure ?
ports:
- 8080:8080
env_file:
- .env
volumes:
- ./models:/models
- ./images/:/tmp/generated/images/
command: ["/usr/bin/local-ai" ]
```
Make sure to save that in the root of the `LocalAI` folder as your `docker-compose` file. Then let's spin up the Docker container; run this in a `CMD` or `BASH` prompt:
```bash
docker-compose up -d --pull always
```
Now we are going to let that set up. Once it is done, let's check to make sure our huggingface / localai galleries are working (wait until you see the screen below before doing this).
You should see:
```
┌───────────────────────────────────────────────────┐
│ Fiber v2.42.0 │
│ http://127.0.0.1:8080 │
│ (bound on host 0.0.0.0 and port 8080) │
│ │
│ Handlers ............. 1 Processes ........... 1 │
│ Prefork ....... Disabled PID ................. 1 │
└───────────────────────────────────────────────────┘
```
```bash
curl http://localhost:8080/models/available
```
Output will look like this:
![](https://cdn.discordapp.com/attachments/1116933141895053322/1134037542845566976/image.png)
Now that we have that set up, let's go set up a [model]({{%relref "easy-model-import-downloaded" %}}).

View file

@ -0,0 +1,146 @@
+++
disableToc = false
title = "Easy Setup - GPU Docker"
weight = 2
+++
{{% notice Note %}}
- You will need about 10GB of free RAM
- You will need about 15GB of free disk space (on your C drive, if on Windows) for ``Docker``
{{% /notice %}}
We are going to run `LocalAI` with `docker-compose` for this setup.
Let's clone `LocalAI` with git.
```bash
git clone https://github.com/go-skynet/LocalAI
```
Then we will cd into the `LocalAI` folder.
```bash
cd LocalAI
```
At this point we want to set up our `.env` file; here is a copy for you to use if you wish. Please make sure it stays consistent with the `docker-compose` file we set up later.
```bash
## Set number of threads.
## Note: prefer the number of physical cores. Overbooking the CPU degrades performance notably.
THREADS=2
## Specify a different bind address (defaults to ":8080")
# ADDRESS=127.0.0.1:8080
## Define galleries.
## Models to install will be visible in `/models/available`
GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]
## Default path for models
MODELS_PATH=/models
## Enable debug mode
# DEBUG=true
## Disable COMPEL (lets Stable Diffusion work; uncomment if you plan on using it)
# COMPEL=0
## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true
## Specify a build type. Available: cublas, openblas, clblas.
BUILD_TYPE=cublas
## Uncomment and set to true to enable rebuilding from source
# REBUILD=true
## Enable go tags, available: stablediffusion, tts
## stablediffusion: image generation with stablediffusion
## tts: enables text-to-speech with go-piper
## (requires REBUILD=true)
#
#GO_TAGS=tts
## Path where to store generated images
# IMAGE_PATH=/tmp
## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT
# HUGGINGFACEHUB_API_TOKEN=Token here
```
Now that we have the `.env` set, let's set up our `docker-compose` file.
It will use a container from [quay.io](https://quay.io/repository/go-skynet/local-ai?tab=tags).
Also note this `docker-compose` file is for `CUDA` only.
Please change the image tag to the one you need:
```
Cuda 11 - v1.40.0-cublas-cuda11
Cuda 12 - v1.40.0-cublas-cuda12
Cuda 11 with TTS - v1.40.0-cublas-cuda11-ffmpeg
Cuda 12 with TTS - v1.40.0-cublas-cuda12-ffmpeg
```
```yaml
version: '3.6'
services:
api:
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
image: quay.io/go-skynet/local-ai:[CHANGEMETOIMAGENEEDED]
tty: true # enable colorized logs
restart: always # should this be on-failure ?
ports:
- 8080:8080
env_file:
- .env
volumes:
- ./models:/models
- ./images/:/tmp/generated/images/
command: ["/usr/bin/local-ai" ]
```
Make sure to save that in the root of the `LocalAI` folder as your `docker-compose` file. Then let's spin up the Docker container; run this in a `CMD` or `BASH` prompt:
```bash
docker-compose up -d --pull always
```
Now we are going to let that set up. Once it is done, let's check to make sure our huggingface / localai galleries are working (wait until you see the screen below before doing this).
You should see:
```
┌───────────────────────────────────────────────────┐
│ Fiber v2.42.0 │
│ http://127.0.0.1:8080 │
│ (bound on host 0.0.0.0 and port 8080) │
│ │
│ Handlers ............. 1 Processes ........... 1 │
│ Prefork ....... Disabled PID ................. 1 │
└───────────────────────────────────────────────────┘
```
```bash
curl http://localhost:8080/models/available
```
Output will look like this:
![](https://cdn.discordapp.com/attachments/1116933141895053322/1134037542845566976/image.png)
Now that we have that set up, let's go set up a [model]({{%relref "easy-model-import-downloaded" %}}).

View file

@ -0,0 +1,37 @@
+++
disableToc = false
title = "Easy Setup - Embeddings"
weight = 2
+++
To install an embedding model, run the following command:
```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
"id": "model-gallery@bert-embeddings"
}'
```
Now we need to make a ``bert.yaml`` file in the models folder:
```yaml
backend: bert-embeddings
embeddings: true
name: text-embedding-ada-002
parameters:
model: bert
```
**Restart LocalAI after you change a yaml file**
When you would like to query the model from the CLI, you can run:
```bash
curl http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": "The food was delicious and the waiter...",
"model": "text-embedding-ada-002"
}'
```
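The same request can also be made from Python. A minimal sketch with the ``openai`` v1 package, assuming LocalAI is running on ``localhost:8080``:
```python
from openai import OpenAI

# Dummy key; LocalAI does not validate it by default
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")

embedding = client.embeddings.create(
    model="text-embedding-ada-002",  # the name: we set in bert.yaml
    input="The food was delicious and the waiter...",
)
print(embedding.data[0].embedding[:8])  # first few values of the vector
```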
See [OpenAI Embedding](https://platform.openai.com/docs/api-reference/embeddings/object) for more info!

View file

@ -0,0 +1,61 @@
+++
disableToc = false
title = "Easy Demo - Full Chat Python AI"
weight = 2
+++
{{% notice Note %}}
- You will need about 10GB of free RAM
- You will need about 15GB of free disk space (on your C drive, if on Windows) for ``Docker``
{{% /notice %}}
This is for `Linux`, `Mac OS`, or `Windows` Hosts. - [Docker Desktop](https://docs.docker.com/engine/install/), [Python 3.11](https://www.python.org/downloads/release/python-3110/), [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
Linux Hosts:
There are Full_Auto installers compatible with some Linux distributions; feel free to use them, but note that they may not fully work. If you need to install something manually, please use the links at the top.
```bash
git clone https://github.com/lunamidori5/localai-lunademo.git
cd localai-lunademo
# Pick the Full_Auto script for your type of Linux. If you already have python, docker, and docker-compose installed, skip this chmod - but make sure you still chmod the Setup_Linux file.
chmod +x Full_Auto_setup_Debian.sh
# or
chmod +x Full_Auto_setup_Ubutnu.sh
chmod +x Setup_Linux.sh
#Make sure to install cuda to your host OS and to Docker if you plan on using GPU
./(the setupfile you wish to run)
```
Windows Hosts:
```batch
REM Make sure you have git, docker-desktop, and python 3.11 installed
git clone https://github.com/lunamidori5/localai-lunademo.git
cd localai-lunademo
call Setup.bat
```
MacOS Hosts:
- I need some help working on a MacOS Setup file, if you are willing to help out, please contact Luna Midori on [discord](https://discord.com/channels/1096914990004457512/1099364883755171890/1147591145057157200) or put in a PR on [Luna Midori's github](https://github.com/lunamidori5/localai-lunademo).
Video How Tos
- Ubuntu - ``COMING SOON``
- Debian - ``COMING SOON``
- Windows - ``COMING SOON``
- MacOS - ``PLANNED - NEED HELP``
Enjoy localai! (If you need help contact Luna Midori on [Discord](https://discord.com/channels/1096914990004457512/1099364883755171890/1147591145057157200))
{{% notice Issues %}}
- Trying to run ``Setup.bat`` or ``Setup_Linux.sh`` from `Git Bash` on Windows does not work. (Somewhat fixed)
- Running over `SSH` or other remote command line based apps may bug out, load slowly, or crash.
{{% /notice %}}

View file

@ -0,0 +1,46 @@
+++
disableToc = false
title = "Easy Setup - Stable Diffusion"
weight = 2
+++
Setting up a Stable Diffusion model is super easy.
In your models folder make a file called ``stablediffusion.yaml``, then edit that file with the following. (You can replace ``Linaqruf/animagine-xl`` with whatever ``SDXL`` model you would like.)
```yaml
name: animagine-xl
parameters:
model: Linaqruf/animagine-xl
backend: diffusers
# Force CPU usage - set to true for GPU
f16: false
diffusers:
pipeline_type: StableDiffusionXLPipeline
cuda: false # Enable for GPU usage (CUDA)
scheduler_type: dpm_2_a
```
If you are using Docker, you will need to run the following in the LocalAI folder that contains the ``docker-compose.yaml`` file:
```bash
docker-compose down # if you are using docker-compose (v1)
docker compose down  # if you are using the docker compose plugin (v2)
```
Then in your ``.env`` file, uncomment this line:
```yaml
COMPEL=0
```
After that we can bring the LocalAI Docker container back up by running the following in the LocalAI folder that contains the ``docker-compose.yaml`` file:
```bash
docker-compose up # if you are using docker-compose (v1)
docker compose up  # if you are using the docker compose plugin (v2)
```
Then, to download and set up the model, just send a normal ``OpenAI`` image generation request! LocalAI will do the rest!
```bash
curl http://localhost:8080/v1/images/generations -H "Content-Type: application/json" -d '{
"prompt": "Two Boxes, 1blue, 1red",
"size": "256x256"
}'
```
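The same request can be made from Python with the ``openai`` v1 package. A minimal sketch, assuming LocalAI is reachable at ``localhost:8080``:
```python
from openai import OpenAI

# Dummy key; LocalAI does not validate it by default
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")

image = client.images.generate(
    prompt="Two Boxes, 1blue, 1red",
    size="256x256",
)
# LocalAI returns a URL (or path) to the generated image
print(image.data[0].url)
```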