docs: Initial import from localai-website (#1312)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Parent 763f94ca80 - commit c5c77d2b0d
66 changed files with 6111 additions and 0 deletions
23 docs/content/howtos/_index.md Normal file
@@ -0,0 +1,23 @@
+++
disableToc = false
title = "How-tos"
weight = 9
+++

## How-tos

This section includes LocalAI end-to-end examples, tutorials, and how-tos curated by the community and maintained by [lunamidori5](https://github.com/lunamidori5).

- [Setup LocalAI with Docker on CPU]({{%relref "howtos/easy-setup-docker-cpu" %}})
- [Setup LocalAI with Docker with CUDA]({{%relref "howtos/easy-setup-docker-gpu" %}})
- [Setting up a Model]({{%relref "howtos/easy-model-import-downloaded" %}})
- [Making requests via Autogen]({{%relref "howtos/easy-request-autogen" %}})
- [Making requests via the OpenAI API V0]({{%relref "howtos/easy-request-openai-v0" %}})
- [Making requests via the OpenAI API V1]({{%relref "howtos/easy-request-openai-v1" %}})
- [Making requests via Curl]({{%relref "howtos/easy-request-curl" %}})

## Programs and Demos

This section covers other programs and how to set up, install, and use them with LocalAI.

- [Python LocalAI Demo]({{%relref "howtos/easy-setup-full" %}}) - [lunamidori5](https://github.com/lunamidori5)
- [Autogen]({{%relref "howtos/autogen-setup" %}}) - [lunamidori5](https://github.com/lunamidori5)
91 docs/content/howtos/autogen-setup.md Normal file
@@ -0,0 +1,91 @@
+++
disableToc = false
title = "Easy Demo - AutoGen"
weight = 2
+++

This is a short demo of setting up ``LocalAI`` with Autogen. It assumes you already have a model set up.

```python
import os
import openai
import autogen

openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

config_list_json = [
    {
        "model": "gpt-3.5-turbo",
        "api_base": "http://[YOURLOCALAIIPHERE]:8080/v1",
        "api_type": "open_ai",
        "api_key": "NULL",
    }
]

print("models to use: ", [config_list_json[i]["model"] for i in range(len(config_list_json))])

llm_config = {"config_list": config_list_json, "seed": 42}
user_proxy = autogen.UserProxyAgent(
    name="Admin",
    system_message="A human admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.",
    code_execution_config={
        "work_dir": "coding",
        "last_n_messages": 8,
        "use_docker": "python:3",
    },
    human_input_mode="ALWAYS",
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
)
engineer = autogen.AssistantAgent(
    name="Coder",
    llm_config=llm_config,
)
scientist = autogen.AssistantAgent(
    name="Scientist",
    llm_config=llm_config,
    system_message="""Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code."""
)
planner = autogen.AssistantAgent(
    name="Planner",
    system_message='''Planner. Suggest a plan. Revise the plan based on feedback from admin and critic, until admin approval.
The plan may involve an engineer who can write code and a scientist who doesn't write code.
Explain the plan first. Be clear which step is performed by an engineer, and which step is performed by a scientist.
''',
    llm_config=llm_config,
)
executor = autogen.UserProxyAgent(
    name="Executor",
    system_message="Executor. Execute the code written by the engineer and report the result.",
    human_input_mode="NEVER",
    code_execution_config={
        "work_dir": "coding",
        "last_n_messages": 8,
        "use_docker": "python:3",
    }
)
critic = autogen.AssistantAgent(
    name="Critic",
    system_message="Critic. Double check plan, claims, code from other agents and provide feedback. Check whether the plan includes adding verifiable info such as source URL.",
    llm_config=llm_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, engineer, scientist, planner, executor, critic], messages=[], max_round=999)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

# autogen.ChatCompletion.start_logging()

# text_input = input("Please enter request: ")
text_input = ("Change this to a task you would like the group chat to do or comment this out and uncomment the other line!")

# Uncomment one of these two chats based on what you would like to do:

# user_proxy.initiate_chat(engineer, message=str(text_input))
# For a one on one chat, use this one ^

# user_proxy.initiate_chat(manager, message=str(text_input))
# To set up a group chat, use this one ^
```
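To actually run the demo, a minimal sketch is shown below. The file name ``autogen_demo.py`` is only a placeholder for wherever you saved the script above, and the pinned package versions are assumptions based on the pre-1.0 OpenAI client style used in the example - adjust them to match your setup.

```bash
# Sketch: install the dependencies and run the demo script.
# "autogen_demo.py" is a placeholder name; the pinned versions are assumptions.
pip install pyautogen "openai==0.28.1"
python autogen_demo.py
```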
135 docs/content/howtos/easy-model-import-downloaded.md Normal file
@@ -0,0 +1,135 @@
+++
disableToc = false
title = "Easy Model Setup"
weight = 2
+++

Lets learn how to set up a model. For this ``How To`` we are going to use the ``Luna-Ai`` model (yes, I know - ``Luna Midori`` writing a how-to using the ``luna-ai-llama2`` model - haha).

To download the model to your models folder, run this command in a command line of your choosing.
```bash
curl --location 'http://localhost:8080/models/apply' \
--header 'Content-Type: application/json' \
--data-raw '{
    "id": "TheBloke/Luna-AI-Llama2-Uncensored-GGUF/luna-ai-llama2-uncensored.Q4_K_M.gguf"
}'
```

Each model needs at least ``4`` files. Without these files the model will run raw, which means you cannot change any of the model's settings.
```
File 1 - The model's GGUF file
File 2 - The model's .yaml file
File 3 - The Chat API .tmpl file
File 4 - The Completion API .tmpl file
```
So lets fix that! We are using the ``lunademo`` name for this ``How To``, but you can name the files whatever you want! Lets make blank files to start with.

```bash
touch lunademo-chat.tmpl
touch lunademo-completion.tmpl
touch lunademo.yaml
```
Now lets edit ``lunademo-chat.tmpl``. Looking at the Hugging Face repo, this model uses the ``ASSISTANT:`` tag for when the AI replies, so lets make sure to add that to this file. Do not add the user tag, as we will be doing that in our yaml file!

```txt
{{.Input}}

ASSISTANT:
```

Now in the ``lunademo-completion.tmpl`` file lets add this.

```txt
Complete the following sentence: {{.Input}}
```

For the ``lunademo.yaml`` file, lets set it up for your computer or hardware. (If you want to see advanced yaml configs - [Link](https://localai.io/advanced/))

We are going to first set up the backend and context size.

```yaml
backend: llama
context_size: 2000
```

This tells ``LocalAI`` how to load the model. Then we are going to **add** our settings after that. Lets add the model's name and the model's settings. The model's ``name:`` is what you will put into your request when sending an ``OpenAI`` request to ``LocalAI``.
```yaml
name: lunademo
parameters:
  model: luna-ai-llama2-uncensored.Q4_K_M.gguf
  temperature: 0.2
  top_k: 40
  top_p: 0.65
```

Now that we have the model set up, there are a few things we should add to the yaml file to make it run better. This model uses the following roles.
```yaml
roles:
  assistant: 'ASSISTANT:'
  system: 'SYSTEM:'
  user: 'USER:'
```

This makes sure that ``LocalAI`` adds the right tag to each message in the request, so if a message is from ``system`` it shows up in the template as ``SYSTEM:``. Speaking of template files, lets add those to our model's yaml file now.
```yaml
template:
  chat: lunademo-chat
  completion: lunademo-completion
```

If you are running on ``GPU`` or want to tune the model, you can add settings like these to fully tune it to your liking.
```yaml
f16: true
gpu_layers: 4
```

But be warned, you **must** restart ``LocalAI`` after changing a yaml file.

```bash
docker-compose restart ## windows
docker compose restart ## linux / mac
```

If you want to check your model's yaml, here is a full copy!
```yaml
backend: llama
context_size: 2000
## Put tuning settings right here, after backend but before name!
name: lunademo
parameters:
  model: luna-ai-llama2-uncensored.Q4_K_M.gguf
  temperature: 0.2
  top_k: 40
  top_p: 0.65
roles:
  assistant: 'ASSISTANT:'
  system: 'SYSTEM:'
  user: 'USER:'
template:
  chat: lunademo-chat
  completion: lunademo-completion
```

Now that we have that set up, lets test it out by sending a request with [Curl]({{%relref "easy-request-curl" %}}) or with the [OpenAI Python API]({{%relref "easy-request-openai-v1" %}})!

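For example, a quick chat request against the ``lunademo`` model looks like this (assuming LocalAI is reachable on ``localhost:8080``); the Curl how-to linked above covers this in more detail.

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
     "model": "lunademo",
     "messages": [{"role": "user", "content": "How are you?"}],
     "temperature": 0.9
}'
```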
## Adv Stuff

Alright, now that we have learned how to set up our own models, here is how to use the gallery to do a lot of this for us. This command will download and set up the model (mostly - we will **always** need to edit our yaml file to fit our computer / hardware).
```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
     "id": "model-gallery@lunademo"
}'
```

This will set up the model, the model's yaml, and both template files (you will see it only did one, as the completions template is out of date and not supported by ``OpenAI``; if you need one, just follow the steps from before to make one).
If you would like to download a raw model using the gallery api, you can run this command. You will need to set up the 3 other files needed to run the model though!
```bash
curl --location 'http://localhost:8080/models/apply' \
--header 'Content-Type: application/json' \
--data-raw '{
    "id": "NAME_OF_HUGGINGFACE/REPO_NAME/MODELNAME.gguf",
    "name": "REQUESTNAME"
}'
```

1 docs/content/howtos/easy-request-autogen.md Normal file
@@ -0,0 +1 @@

35 docs/content/howtos/easy-request-curl.md Normal file
@@ -0,0 +1,35 @@
+++
disableToc = false
title = "Easy Request - Curl"
weight = 2
+++

Now we can make a curl request!

Curl Chat API -

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
     "model": "lunademo",
     "messages": [{"role": "user", "content": "How are you?"}],
     "temperature": 0.9
}'
```

Curl Completion API -

```bash
curl --request POST \
    --url http://localhost:8080/v1/completions \
    --header 'Content-Type: application/json' \
    --data '{
        "model": "lunademo",
        "prompt": "function downloadFile(string url, string outputPath) {",
        "max_tokens": 256,
        "temperature": 0.5
    }'
```

See the [OpenAI API reference](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!
50 docs/content/howtos/easy-request-openai-v0.md Normal file
@@ -0,0 +1,50 @@
+++
disableToc = false
title = "Easy Request - Openai V0"
weight = 2
+++

This is for Python with ``OpenAI`` ``0.28.1``. If you are on ``OpenAI`` >= ``V1``, please use this [How to]({{%relref "howtos/easy-request-openai-v1" %}}) instead.

OpenAI Chat API Python -

```python
import os
import openai

openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

completion = openai.ChatCompletion.create(
    model="lunademo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How are you?"}
    ]
)

print(completion.choices[0].message.content)
```

OpenAI Completion API Python -

```python
import os
import openai

openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sx-xxx"
OPENAI_API_KEY = "sx-xxx"
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

completion = openai.Completion.create(
    model="lunademo",
    prompt="function downloadFile(string url, string outputPath) ",
    max_tokens=256,
    temperature=0.5)

print(completion.choices[0].text)
```
See the [OpenAI API reference](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!
28 docs/content/howtos/easy-request-openai-v1.md Normal file
@@ -0,0 +1,28 @@
+++
disableToc = false
title = "Easy Request - Openai V1"
weight = 2
+++

This is for Python with ``OpenAI`` >= ``V1``. If you are on ``OpenAI`` < ``V1``, please use this [How to]({{%relref "howtos/easy-request-openai-v0" %}}) instead.

OpenAI Chat API Python -
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-xxx")

messages = [
    {"role": "system", "content": "You are LocalAI, a helpful, but really confused ai, you will only reply with confused emotes"},
    {"role": "user", "content": "Hello How are you today LocalAI"}
]
completion = client.chat.completions.create(
    model="lunademo",
    messages=messages,
)

print(completion.choices[0].message)
```
See the [OpenAI API reference](https://platform.openai.com/docs/api-reference) for more info!
Have fun using LocalAI!
131 docs/content/howtos/easy-setup-docker-cpu.md Normal file
@@ -0,0 +1,131 @@
+++
disableToc = false
title = "Easy Setup - CPU Docker"
weight = 2
+++

{{% notice Note %}}
- You will need about 10GB of free RAM
- You will need about 15GB of free space on your C drive for ``Docker-compose``
{{% /notice %}}

We are going to run `LocalAI` with `docker-compose` for this setup.

Lets clone `LocalAI` with git.

```bash
git clone https://github.com/go-skynet/LocalAI
```

Then we will cd into the ``LocalAI`` folder.

```bash
cd LocalAI
```

At this point we want to set up our `.env` file. Here is a copy for you to use if you wish; please make sure it matches the `docker-compose` file we set up later.

```bash
## Set number of threads.
## Note: prefer the number of physical cores. Overbooking the CPU degrades performance notably.
THREADS=2

## Specify a different bind address (defaults to ":8080")
# ADDRESS=127.0.0.1:8080

## Define galleries.
## Models available to install will be visible in `/models/available`
GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]

## Default path for models
MODELS_PATH=/models

## Enable debug mode
# DEBUG=true

## Disables COMPEL (lets Stable Diffusion work; uncomment if you plan on using it)
# COMPEL=0

## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true

## Specify a build type. Available: cublas, openblas, clblas.
BUILD_TYPE=cublas

## Uncomment and set to true to enable rebuilding from source
# REBUILD=true

## Enable go tags, available: stablediffusion, tts
## stablediffusion: image generation with stablediffusion
## tts: enables text-to-speech with go-piper
## (requires REBUILD=true)
#
#GO_TAGS=tts

## Path where to store generated images
# IMAGE_PATH=/tmp

## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT

# HUGGINGFACEHUB_API_TOKEN=Token here
```

Now that we have the `.env` set, lets set up our `docker-compose` file.
It will use a container from [quay.io](https://quay.io/repository/go-skynet/local-ai?tab=tags).
Also note this `docker-compose` file is for `CPU` only.

```docker
version: '3.6'

services:
  api:
    image: quay.io/go-skynet/local-ai:v1.40.0
    tty: true # enable colorized logs
    restart: always # should this be on-failure ?
    ports:
      - 8080:8080
    env_file:
      - .env
    volumes:
      - ./models:/models
      - ./images/:/tmp/generated/images/
    command: ["/usr/bin/local-ai"]
```

Make sure to save that as your `docker-compose` file in the root of the `LocalAI` folder. Then lets spin up the Docker stack - run this in a `CMD` or `BASH` prompt.

```bash
docker-compose up -d --pull always
```

Now we let that finish setting up. Once it is done, lets check that our huggingface / localai galleries are working (wait until you see the screen below before doing this).

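While it starts up, you can optionally follow the container logs; the ``api`` service name below comes from the compose file above.

```bash
# Follow the LocalAI container logs (Ctrl+C to stop following).
docker-compose logs -f api
```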
You should see:
```
┌───────────────────────────────────────────────────┐
│                   Fiber v2.42.0                   │
│               http://127.0.0.1:8080               │
│       (bound on host 0.0.0.0 and port 8080)       │
│                                                   │
│ Handlers ............. 1  Processes ........... 1 │
│ Prefork ....... Disabled  PID ................. 1 │
└───────────────────────────────────────────────────┘
```

```bash
curl http://localhost:8080/models/available
```

Output will look like this:

![](/howto-models-gallery.png)

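The full gallery listing is quite long, so as an optional sketch (assuming ``jq`` is installed and that each gallery entry exposes a ``name`` field) you can list just the model names:

```bash
# Optional: print only the model names from the gallery output (assumes jq and a "name" field).
curl -s http://localhost:8080/models/available | jq '.[].name'
```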
Now that we have that set up, lets go set up a [model]({{%relref "easy-model-import-downloaded" %}})!
146 docs/content/howtos/easy-setup-docker-gpu.md Normal file
@@ -0,0 +1,146 @@
+++
disableToc = false
title = "Easy Setup - GPU Docker"
weight = 2
+++

{{% notice Note %}}
- You will need about 10GB of free RAM
- You will need about 15GB of free space on your C drive for ``Docker-compose``
{{% /notice %}}

We are going to run `LocalAI` with `docker-compose` for this setup.

Lets clone `LocalAI` with git.

```bash
git clone https://github.com/go-skynet/LocalAI
```

Then we will cd into the `LocalAI` folder.

```bash
cd LocalAI
```

At this point we want to set up our `.env` file. Here is a copy for you to use if you wish; please make sure it matches the `docker-compose` file we set up later.

```bash
## Set number of threads.
## Note: prefer the number of physical cores. Overbooking the CPU degrades performance notably.
THREADS=2

## Specify a different bind address (defaults to ":8080")
# ADDRESS=127.0.0.1:8080

## Define galleries.
## Models available to install will be visible in `/models/available`
GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]

## Default path for models
MODELS_PATH=/models

## Enable debug mode
# DEBUG=true

## Disables COMPEL (lets Stable Diffusion work; uncomment if you plan on using it)
# COMPEL=0

## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true

## Specify a build type. Available: cublas, openblas, clblas.
BUILD_TYPE=cublas

## Uncomment and set to true to enable rebuilding from source
# REBUILD=true

## Enable go tags, available: stablediffusion, tts
## stablediffusion: image generation with stablediffusion
## tts: enables text-to-speech with go-piper
## (requires REBUILD=true)
#
#GO_TAGS=tts

## Path where to store generated images
# IMAGE_PATH=/tmp

## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT

# HUGGINGFACEHUB_API_TOKEN=Token here
```

Now that we have the `.env` set, lets set up our `docker-compose` file.
It will use a container from [quay.io](https://quay.io/repository/go-skynet/local-ai?tab=tags).
Also note this `docker-compose` file is for `CUDA` only.

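Before going further, it can help to confirm that Docker can actually see your GPU. This check is not part of the original guide and assumes the NVIDIA Container Toolkit is installed and configured on the host.

```bash
# If GPU passthrough works, this prints the same table as running nvidia-smi on the host.
docker run --rm --gpus all ubuntu nvidia-smi
```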
Please change the image to what you need.
```
Cuda 11 - v1.40.0-cublas-cuda11
Cuda 12 - v1.40.0-cublas-cuda12
Cuda 11 with TTS - v1.40.0-cublas-cuda11-ffmpeg
Cuda 12 with TTS - v1.40.0-cublas-cuda12-ffmpeg
```

```docker
version: '3.6'

services:
  api:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    image: quay.io/go-skynet/local-ai:[CHANGEMETOIMAGENEEDED]
    tty: true # enable colorized logs
    restart: always # should this be on-failure ?
    ports:
      - 8080:8080
    env_file:
      - .env
    volumes:
      - ./models:/models
      - ./images/:/tmp/generated/images/
    command: ["/usr/bin/local-ai"]
```

Make sure to save that as your `docker-compose` file in the root of the `LocalAI` folder. Then lets spin up the Docker stack - run this in a `CMD` or `BASH` prompt.

```bash
docker-compose up -d --pull always
```

Now we let that finish setting up. Once it is done, lets check that our huggingface / localai galleries are working (wait until you see the screen below before doing this).

You should see:
```
┌───────────────────────────────────────────────────┐
│                   Fiber v2.42.0                   │
│               http://127.0.0.1:8080               │
│       (bound on host 0.0.0.0 and port 8080)       │
│                                                   │
│ Handlers ............. 1  Processes ........... 1 │
│ Prefork ....... Disabled  PID ................. 1 │
└───────────────────────────────────────────────────┘
```

```bash
curl http://localhost:8080/models/available
```

Output will look like this:

![](/howto-models-gallery.png)

Now that we have that set up, lets go set up a [model]({{%relref "easy-model-import-downloaded" %}})!
37 docs/content/howtos/easy-setup-embeddings.md Normal file
@@ -0,0 +1,37 @@
+++
disableToc = false
title = "Easy Setup - Embeddings"
weight = 2
+++

To install an embedding model, run the following command

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
     "id": "model-gallery@bert-embeddings"
}'
```

Now we need to make a ``bert.yaml`` in the models folder
```yaml
backend: bert-embeddings
embeddings: true
name: text-embedding-ada-002
parameters:
  model: bert
```

**Restart LocalAI after you change a yaml file**

When you would like to request the model from the CLI you can do

```bash
curl http://localhost:8080/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "input": "The food was delicious and the waiter...",
        "model": "text-embedding-ada-002"
    }'
```

See [OpenAI Embedding](https://platform.openai.com/docs/api-reference/embeddings/object) for more info!
61 docs/content/howtos/easy-setup-full.md Normal file
@@ -0,0 +1,61 @@
+++
disableToc = false
title = "Easy Demo - Full Chat Python AI"
weight = 2
+++

{{% notice Note %}}
- You will need about 10GB of free RAM
- You will need about 15GB of free space on your C drive for ``Docker-compose``
{{% /notice %}}

This works on `Linux`, `Mac OS`, and `Windows` hosts. You will need [Docker Desktop](https://docs.docker.com/engine/install/), [Python 3.11](https://www.python.org/downloads/release/python-3110/), and [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).

Linux Hosts:

There is a Full_Auto installer compatible with some types of Linux distributions; feel free to use it, but note that it may not fully work. If you need to install something, please use the links at the top.

```bash
git clone https://github.com/lunamidori5/localai-lunademo.git

cd localai-lunademo

# Pick your type of Linux for the Full Auto scripts. If you already have python, docker,
# and docker-compose installed you can skip that chmod, but make sure you chmod the Setup_Linux file.
chmod +x Full_Auto_setup_Debian.sh
# or
chmod +x Full_Auto_setup_Ubutnu.sh

chmod +x Setup_Linux.sh

# Make sure to install CUDA on your host OS and in Docker if you plan on using a GPU.

./(the setup file you wish to run)
```

Windows Hosts:

```batch
REM Make sure you have git, docker-desktop, and python 3.11 installed

git clone https://github.com/lunamidori5/localai-lunademo.git

cd localai-lunademo

call Setup.bat
```

MacOS Hosts:
- I need some help working on a MacOS setup file. If you are willing to help out, please contact Luna Midori on [discord](https://discord.com/channels/1096914990004457512/1099364883755171890/1147591145057157200) or put in a PR on [Luna Midori's github](https://github.com/lunamidori5/localai-lunademo).

Video How Tos

- Ubuntu - ``COMING SOON``
- Debian - ``COMING SOON``
- Windows - ``COMING SOON``
- MacOS - ``PLANNED - NEED HELP``

Enjoy localai! (If you need help, contact Luna Midori on [Discord](https://discord.com/channels/1096914990004457512/1099364883755171890/1147591145057157200))

{{% notice Issues %}}
- Trying to run ``Setup.bat`` or ``Setup_Linux.sh`` from `Git Bash` on Windows does not work. (Somewhat fixed)
- Running over `SSH` or other remote command-line based apps may bug out, load slowly, or crash.
{{% /notice %}}
46 docs/content/howtos/easy-setup-sd.md Normal file
@@ -0,0 +1,46 @@
+++
disableToc = false
title = "Easy Setup - Stable Diffusion"
weight = 2
+++

Setting up a Stable Diffusion model is super easy.
In your models folder make a file called ``stablediffusion.yaml``, then edit that file with the following. (You can change ``Linaqruf/animagine-xl`` to whatever ``sd-xl`` model you would like.)
```yaml
name: animagine-xl
parameters:
  model: Linaqruf/animagine-xl
backend: diffusers

# Force CPU usage - set to true for GPU
f16: false
diffusers:
  pipeline_type: StableDiffusionXLPipeline
  cuda: false # Enable for GPU usage (CUDA)
  scheduler_type: dpm_2_a
```

If you are using docker, you will need to run these commands in the localai folder that has the ``docker-compose.yaml`` file in it
```bash
docker-compose down # windows
docker compose down # linux/mac
```

Then in your ``.env`` file uncomment this line.
```yaml
COMPEL=0
```

After that we can bring the LocalAI docker containers back up by running this in the same folder with the ``docker-compose.yaml`` file in it
```bash
docker-compose up # windows
docker compose up # linux/mac
```

Then, to download and set up the model, just send in a normal ``OpenAI`` request! LocalAI will do the rest!
```bash
curl http://localhost:8080/v1/images/generations -H "Content-Type: application/json" -d '{
    "prompt": "Two Boxes, 1blue, 1red",
    "size": "256x256"
}'
```