From bd277162c73c6c0e0eba2011586bd5f32c1be65b Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 19 Jul 2024 23:56:58 +0200 Subject: [PATCH 0001/1851] docs: :arrow_up: update docs version mudler/LocalAI (#2926) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index 30b4b614..f54a5e67 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.18.1" + "version": "v2.19.0" } From e75f73bf736d90a519ceea9804b8bbe96b93ec7f Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 20 Jul 2024 00:10:26 +0200 Subject: [PATCH 0002/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#2927) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index df13cbfb..0f5ecd00 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=705b7ecf60e667ced57c15d67aa86865e3cc7aa7 +CPPLLAMA_VERSION?=87e397d00bdcedd5cbf6dfda06a7b0f302462728 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From f9f83791d1997cf0f1f88e7bdbad27190df9a5f5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 20 Jul 2024 09:15:48 +0200 Subject: [PATCH 0003/1851] ci(release): run also on tags Signed-off-by: Ettore Di Giacinto --- .github/workflows/release.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.github/workflows/release.yaml b/.github/workflows/release.yaml index 92e07326..b2c6c069 100644 --- a/.github/workflows/release.yaml +++ b/.github/workflows/release.yaml @@ -4,6 +4,8 @@ on: push: branches: - master + tags: + - 'v*' pull_request: env: From 87bd831aba259df70091fe93cafebb830db8ef75 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 20 Jul 2024 10:43:18 +0200 Subject: [PATCH 0004/1851] docs: add federation (#2929) Signed-off-by: Ettore Di Giacinto --- .../docs/features/distributed_inferencing.md | 91 ++++++++++++------- 1 file changed, 60 insertions(+), 31 deletions(-) diff --git a/docs/content/docs/features/distributed_inferencing.md b/docs/content/docs/features/distributed_inferencing.md index abe34373..b7ce41a9 100644 --- a/docs/content/docs/features/distributed_inferencing.md +++ b/docs/content/docs/features/distributed_inferencing.md @@ -5,17 +5,65 @@ weight = 15 url = "/features/distribute/" +++ + +This functionality enables LocalAI to distribute inference requests across multiple worker nodes, improving efficiency and performance. Nodes are automatically discovered and connect via p2p by using a shared token which makes sure the communication is secure and private between the nodes of the network. + +LocalAI supports two modes of distributed inferencing via p2p: + +- **Federated Mode**: Requests are shared between the cluster and routed to a single worker node in the network based on the load balancer's decision. 
+- **Worker Mode**: Requests are processed by all the workers, which contribute to the final inference result (by sharing the model weights).
+
+## Usage
+
+Starting LocalAI with `--p2p` generates a shared token for connecting multiple instances: that is all you need to create AI clusters, eliminating the need for intricate network setups.
+
+Simply navigate to the "Swarm" section in the WebUI and follow the on-screen instructions.
+
+For fully shared instances, start LocalAI with `--p2p --federated` and follow the guidance in the Swarm section. This feature is still experimental and should be considered a tech preview.
+
+### Federated mode
+
+Federated mode lets you launch multiple LocalAI instances and connect them together in a federated network. This mode is useful when you want to distribute the inference load across multiple nodes while keeping a single point of entry for the API. In the Swarm section of the WebUI, you can see the instructions to connect multiple instances together.
+
+![346663124-1d2324fd-8b55-4fa2-9856-721a467969c2](https://github.com/user-attachments/assets/19ebd44a-20ff-412c-b92f-cfb8efbe4b21)
+
+To start a LocalAI server in federated mode, run:
+
+```bash
+local-ai run --p2p --federated
+```
+
+This will generate a token that you can use to connect other LocalAI instances to the network, or that others can use to join the network. If you already have a token, you can specify it using the `TOKEN` environment variable.
+
+To start a load-balanced server that routes the requests to the network, run with the `TOKEN`:
+
+```bash
+local-ai federated
+```
+
+To see all the available options, run `local-ai federated --help`.
+
+The instructions are displayed in the "Swarm" section of the WebUI, guiding you through the process of connecting multiple instances.
+
+### Workers mode
+
 {{% alert note %}}
 This feature is available exclusively with llama-cpp compatible models.

 This feature was introduced in [LocalAI pull request #2324](https://github.com/mudler/LocalAI/pull/2324) and is based on the upstream work in [llama.cpp pull request #6829](https://github.com/ggerganov/llama.cpp/pull/6829).
 {{% /alert %}}

-This functionality enables LocalAI to distribute inference requests across multiple worker nodes, improving efficiency and performance.
+To connect multiple workers to a single LocalAI instance, first start a server in p2p mode:

-## Usage
+```bash
+local-ai run --p2p
+```

-### Starting Workers
+Then navigate to the "Swarm" section of the WebUI to see the instructions for connecting multiple workers to the network.
+
+![346663124-1d2324fd-8b55-4fa2-9856-721a467969c2](https://github.com/user-attachments/assets/b8cadddf-a467-49cf-a1ed-8850de95366d)
+
+### Without P2P

 To start workers for distributing the computational load, run:

 ```bash
 local-ai worker llama-cpp-rpc
 ```

-Alternatively, you can build the RPC server following the llama.cpp [README](https://github.com/ggerganov/llama.cpp/blob/master/examples/rpc/README.md), which is compatible with LocalAI.
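+A minimal end-to-end sketch of the non-P2P setup, combining the worker command above with the `LLAMACPP_GRPC_SERVERS` variable described just below (the worker addresses and ports are placeholders, not defaults):
+
+```bash
+# On each worker machine: start a llama.cpp RPC worker
+local-ai worker llama-cpp-rpc
+
+# On the machine serving the API: list the workers so the
+# workload is distributed across them
+LLAMACPP_GRPC_SERVERS="192.168.1.10:50052,192.168.1.11:50052" local-ai run
+```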
- -### Starting LocalAI - -To start the LocalAI server, which handles API requests, specify the worker addresses using the `LLAMACPP_GRPC_SERVERS` environment variable: +And you can specify the address of the workers when starting LocalAI with the `LLAMACPP_GRPC_SERVERS` environment variable: ```bash LLAMACPP_GRPC_SERVERS="address1:port,address2:port" local-ai run ``` - The workload on the LocalAI server will then be distributed across the specified nodes. -## Peer-to-Peer Networking +Alternatively, you can build the RPC workers/server following the llama.cpp [README](https://github.com/ggerganov/llama.cpp/blob/master/examples/rpc/README.md), which is compatible with LocalAI. -![output](https://github.com/mudler/LocalAI/assets/2420543/8ca277cf-c208-4562-8929-808b2324b584) +## Manual example (worker) -Workers can also connect to each other in a peer-to-peer network, distributing the workload in a decentralized manner. - -A shared token between the server and the workers is required for communication within the peer-to-peer network. This feature supports both local network (using mDNS discovery) and DHT for communication across different networks. - -The token is automatically generated when starting the server with the `--p2p` flag. Workers can be started with the token using `local-ai worker p2p-llama-cpp-rpc` and specifying the token via the environment variable `TOKEN` or with the `--token` argument. - -A network is established between the server and workers using DHT and mDNS discovery protocols. The llama.cpp RPC server is automatically started and exposed to the peer-to-peer network, allowing the API server to connect. - -When the HTTP server starts, it discovers workers in the network and creates port forwards to the local service. Llama.cpp is configured to use these services. For more details on the implementation, refer to [LocalAI pull request #2343](https://github.com/mudler/LocalAI/pull/2343). - -### Usage +Use the WebUI to guide you in the process of starting new workers. This example shows the manual steps to highlight the process. 1. Start the server with `--p2p`: ```bash ./local-ai run --p2p -# 1:02AM INF loading environment variables from file envFile=.env -# 1:02AM INF Setting logging to info -# 1:02AM INF P2P mode enabled -# 1:02AM INF No token provided, generating one -# 1:02AM INF Generated Token: -# XXXXXXXXXXX -# 1:02AM INF Press a button to proceed +# Get the token in the Swarm section of the WebUI ``` -Copy the displayed token and press Enter. +Copy the token from the WebUI or via API call (e.g., `curl http://localhost:8000/p2p/token`) and save it for later use. To reuse the same token later, restart the server with `--p2ptoken` or `P2P_TOKEN`. @@ -93,12 +120,14 @@ The server logs should indicate that new workers are being discovered. 3. Start inference as usual on the server initiated in step 1. +![output](https://github.com/mudler/LocalAI/assets/2420543/8ca277cf-c208-4562-8929-808b2324b584) + ## Notes - If running in p2p mode with container images, make sure you start the container with `--net host` or `network_mode: host` in the docker-compose file. - Only a single model is supported currently. - Ensure the server detects new workers before starting inference. Currently, additional workers cannot be added once inference has begun. 
- +- For more details on the implementation, refer to [LocalAI pull request #2343](https://github.com/mudler/LocalAI/pull/2343) ## Environment Variables From 0ee1f8c1cffc4e0abc8b5125e4683ada273dc871 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 20 Jul 2024 10:43:34 +0200 Subject: [PATCH 0005/1851] ci(Makefile): enable p2p on cross-arm64 builds (#2928) Signed-off-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 0f5ecd00..a8b7a832 100644 --- a/Makefile +++ b/Makefile @@ -421,7 +421,7 @@ else endif dist-cross-linux-arm64: - CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_NATIVE=off" GRPC_BACKENDS="backend-assets/grpc/llama-cpp-fallback backend-assets/grpc/llama-cpp-grpc backend-assets/util/llama-cpp-rpc-server" \ + CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_NATIVE=off" GRPC_BACKENDS="backend-assets/grpc/llama-cpp-fallback backend-assets/grpc/llama-cpp-grpc backend-assets/util/llama-cpp-rpc-server" GO_TAGS="p2p" \ STATIC=true $(MAKE) build mkdir -p release # if BUILD_ID is empty, then we don't append it to the binary name From 46b86f7e6eb96e1146a52928b4fc538523e8ebc8 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 20 Jul 2024 16:03:44 +0200 Subject: [PATCH 0006/1851] models(gallery): add tulu 8b and 70b (#2931) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index c130c570..aef6c239 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -3146,6 +3146,36 @@ - filename: L3-8B-Celeste-v1-Q4_K_M.gguf sha256: ed5277719965fb6bbcce7d16742e3bac4a8d5b8f52133261a3402a480cd65317 uri: huggingface://bartowski/L3-8B-Celeste-v1-GGUF/L3-8B-Celeste-v1-Q4_K_M.gguf +- !!merge <<: *llama3 + name: "llama-3-tulu-2-8b-i1" + icon: https://huggingface.co/datasets/allenai/blog-images/resolve/main/tulu-v2/Tulu%20V2%20banner.png + urls: + - https://huggingface.co/allenai/llama-3-tulu-2-8b + - https://huggingface.co/mradermacher/llama-3-tulu-2-8b-i1-GGUF + description: | + Tulu is a series of language models that are trained to act as helpful assistants. Llama 3 Tulu V2 8B is a fine-tuned version of Llama 3 that was trained on a mix of publicly available, synthetic and human datasets. + overrides: + parameters: + model: llama-3-tulu-2-8b.i1-Q4_K_M.gguf + files: + - filename: llama-3-tulu-2-8b.i1-Q4_K_M.gguf + sha256: f859c22bfa64f461e9ffd973dc7ad6a78bb98b1dda6f49abfa416a4022b7e333 + uri: huggingface://mradermacher/llama-3-tulu-2-8b-i1-GGUF/llama-3-tulu-2-8b.i1-Q4_K_M.gguf +- !!merge <<: *llama3 + name: "llama-3-tulu-2-dpo-70b-i1" + icon: https://huggingface.co/datasets/allenai/blog-images/resolve/main/tulu-v2/Tulu%20V2%20banner.png + urls: + - https://huggingface.co/allenai/llama-3-tulu-2-dpo-70b + - https://huggingface.co/mradermacher/llama-3-tulu-2-dpo-70b-i1-GGUF + description: | + Tulu is a series of language models that are trained to act as helpful assistants. Llama 3 Tulu V2 8B is a fine-tuned version of Llama 3 that was trained on a mix of publicly available, synthetic and human datasets. 
+  overrides:
+    parameters:
+      model: llama-3-tulu-2-dpo-70b.i1-Q4_K_M.gguf
+  files:
+    - filename: llama-3-tulu-2-dpo-70b.i1-Q4_K_M.gguf
+      sha256: fc309bbdf1e2bdced954c4c8dc1f9a885c547017ee5e750bfde645af89e3d3a5
+      uri: huggingface://mradermacher/llama-3-tulu-2-dpo-70b-i1-GGUF/llama-3-tulu-2-dpo-70b.i1-Q4_K_M.gguf

From 450dbed820e364f87eede055e898613e14172a1f Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sat, 20 Jul 2024 16:16:29 +0200
Subject: [PATCH 0007/1851] models(gallery): add suzume-orpo (#2932)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/gallery/index.yaml b/gallery/index.yaml
index aef6c239..63664070 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -3176,6 +3176,28 @@
     - filename: llama-3-tulu-2-dpo-70b.i1-Q4_K_M.gguf
       sha256: fc309bbdf1e2bdced954c4c8dc1f9a885c547017ee5e750bfde645af89e3d3a5
       uri: huggingface://mradermacher/llama-3-tulu-2-dpo-70b-i1-GGUF/llama-3-tulu-2-dpo-70b.i1-Q4_K_M.gguf
+- !!merge <<: *llama3
+  license: cc-by-nc-4.0
+  name: "suzume-llama-3-8b-multilingual-orpo-borda-top25"
+  icon: https://cdn-uploads.huggingface.co/production/uploads/64b63f8ad57e02621dc93c8b/kWQSu02YfgYdUQqv4s5lq.png
+  urls:
+    - https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25
+    - https://huggingface.co/RichardErkhov/lightblue_-_suzume-llama-3-8B-multilingual-orpo-borda-top25-gguf
+  description: |
+    This is Suzume ORPO, an ORPO-trained fine-tune of the lightblue/suzume-llama-3-8B-multilingual model using our lightblue/mitsu dataset.
+
+    We have trained several versions of this model using ORPO and so recommend that you use the best-performing model from our tests, lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half.
+
+    Note that this model has a non-commercial license, as we used the Command R and Command R+ models to generate our training data for this model (lightblue/mitsu).
+
+    We are currently working on developing a commercially usable model, so stay tuned for that!
+  overrides:
+    parameters:
+      model: suzume-llama-3-8B-multilingual-orpo-borda-top25.Q4_K_M.gguf
+  files:
+    - filename: suzume-llama-3-8B-multilingual-orpo-borda-top25.Q4_K_M.gguf
+      sha256: ef75a02c5f38e14a8873c7989188dac6974851b4654279fe1921d2c8018cc388
+      uri: huggingface://RichardErkhov/lightblue_-_suzume-llama-3-8B-multilingual-orpo-borda-top25-gguf/suzume-llama-3-8B-multilingual-orpo-borda-top25.Q4_K_M.gguf
 - &command-R
   ### START Command-r
   url: "github:mudler/LocalAI/gallery/command-r.yaml@master"

From f505d7ab3f4dabf927413d42691adb37bd46f131 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sat, 20 Jul 2024 16:17:34 +0200
Subject: [PATCH 0008/1851] models(gallery): add archangel_sft_pythia2-8b
 (#2933)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml  | 27 +++++++++++++++++++++
 gallery/tuluv2.yaml | 43 +++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)
 create mode 100644 gallery/tuluv2.yaml

diff --git a/gallery/index.yaml b/gallery/index.yaml
index 63664070..2ef3d46b 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -24,6 +24,33 @@
     - filename: DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf
       sha256: 50ec78036433265965ed1afd0667c00c71c12aa70bcf383be462cb8e159db6c0
       uri: huggingface://LoneStriker/DeepSeek-Coder-V2-Lite-Instruct-GGUF/DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf
+- name: "archangel_sft_pythia2-8b"
+  url: "github:mudler/LocalAI/gallery/tuluv2.yaml@master"
+  icon: https://gist.github.com/assets/29318529/fe2d8391-dbd1-4b7e-9dc4-7cb97e55bc06
+  license: apache-2.0
+  urls:
+    - https://huggingface.co/ContextualAI/archangel_sft_pythia2-8b
+    - https://huggingface.co/RichardErkhov/ContextualAI_-_archangel_sft_pythia2-8b-gguf
+    - https://github.com/ContextualAI/HALOs
+  description: |
+    datasets:
+    - stanfordnlp/SHP
+    - Anthropic/hh-rlhf
+    - OpenAssistant/oasst1
+
+    This repo contains the model checkpoints for:
+    - model family pythia2-8b
+    - optimized with the loss SFT
+    - aligned using the SHP, Anthropic HH and Open Assistant datasets.
+
+    Please refer to our [code repository](https://github.com/ContextualAI/HALOs) or [blog](https://contextual.ai/better-cheaper-faster-llm-alignment-with-kto/), which contains instructions for training your own HALOs and links to our model cards.
+ overrides: + parameters: + model: archangel_sft_pythia2-8b.Q4_K_M.gguf + files: + - filename: archangel_sft_pythia2-8b.Q4_K_M.gguf + sha256: a47782c55ef2b39b19644213720a599d9849511a73c9ebb0c1de749383c0a0f8 + uri: huggingface://RichardErkhov/ContextualAI_-_archangel_sft_pythia2-8b-gguf/archangel_sft_pythia2-8b.Q4_K_M.gguf - &qwen2 ## Start QWEN2 url: "github:mudler/LocalAI/gallery/chatml.yaml@master" diff --git a/gallery/tuluv2.yaml b/gallery/tuluv2.yaml new file mode 100644 index 00000000..ca2785a2 --- /dev/null +++ b/gallery/tuluv2.yaml @@ -0,0 +1,43 @@ +--- +name: "tuluv2" + +config_file: | + mmap: true + template: + chat_message: | + <|{{ .RoleName }}|> + {{ if .FunctionCall -}} + Function call: + {{ else if eq .RoleName "tool" -}} + Function response: + {{ end -}} + {{ if .Content -}} + {{.Content }} + {{ end -}} + {{ if .FunctionCall -}} + {{toJson .FunctionCall}} + {{ end -}} + function: | + <|{{ .RoleName }}|> + {{ if .FunctionCall -}} + Function call: + {{ else if eq .RoleName "tool" -}} + Function response: + {{ end -}} + {{ if .Content -}} + {{.Content }} + {{ end -}} + {{ if .FunctionCall -}} + {{toJson .FunctionCall}} + {{ end -}} + chat: | + {{.Input -}} + <|assistant|> + completion: | + {{.Input}} + context_size: 4096 + f16: true + stopwords: + - '<|im_end|>' + - '' + - '<|endoftext|>' From 8667a67695eed2625e361ea3d34b220e8568f783 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 20 Jul 2024 23:33:54 +0200 Subject: [PATCH 0009/1851] docs: :arrow_up: update docs version mudler/LocalAI (#2935) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index f54a5e67..fff9fa0c 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.19.0" + "version": "v2.19.1" } From 86509e6002948c20ca987bc5dbcacc2bef1e65a4 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 20 Jul 2024 23:35:21 +0200 Subject: [PATCH 0010/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#2936) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index a8b7a832..906e8ca5 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=87e397d00bdcedd5cbf6dfda06a7b0f302462728 +CPPLLAMA_VERSION?=07283b1a90e1320aae4762c7e03c879043910252 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From ef5e8326c8c4820fb89bb960cc45377c415dff92 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 21 Jul 2024 10:31:44 +0200 Subject: [PATCH 0011/1851] models(gallery): add celestev1.2 (#2937) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 2ef3d46b..46f9f7c7 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -3173,6 +3173,22 @@ - filename: 
L3-8B-Celeste-v1-Q4_K_M.gguf
      sha256: ed5277719965fb6bbcce7d16742e3bac4a8d5b8f52133261a3402a480cd65317
      uri: huggingface://bartowski/L3-8B-Celeste-v1-GGUF/L3-8B-Celeste-v1-Q4_K_M.gguf
+- !!merge <<: *llama3
+  name: "l3-8b-celeste-v1.2"
+  icon: https://cdn-uploads.huggingface.co/production/uploads/630cf5d14ca0a22768bbe10c/Zv__LDTO-nHvpuxPcCgUU.webp
+  urls:
+    - https://huggingface.co/mudler/L3-8B-Celeste-V1.2-Q4_K_M-GGUF
+  description: |
+    Trained on LLaMA 3 8B Instruct at 8K context using Reddit Writing Prompts, Opus 15K Instruct, and c2 logs (cleaned).
+
+    This is a roleplay model; any instruction-following capabilities outside roleplay contexts are coincidental.
+  overrides:
+    parameters:
+      model: l3-8b-celeste-v1.2-q4_k_m.gguf
+  files:
+    - filename: l3-8b-celeste-v1.2-q4_k_m.gguf
+      sha256: 7752204c0e9f627ff5726eb69bb6114974cafbc934a993ad019abfba62002783
+      uri: huggingface://mudler/L3-8B-Celeste-V1.2-Q4_K_M-GGUF/l3-8b-celeste-v1.2-q4_k_m.gguf

From 77ad49333a2ad8a2d7bdf1ad25aba3de93eee720 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sun, 21 Jul 2024 21:45:04 +0200
Subject: [PATCH 0012/1851] models(gallery): add calme-2.3-phi3-4b (#2939)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/gallery/index.yaml b/gallery/index.yaml
index 46f9f7c7..dc2e5007 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -3500,7 +3500,23 @@
     - filename: phillama-3.8b-v0.1.Q4_K_M.gguf
       sha256: da537d352b7aae54bbad0d2cff3e3a1b0e1dc1e1d25bec3aae1d05cf4faee7a2
      uri: huggingface://RichardErkhov/raincandy-u_-_phillama-3.8b-v0.1-gguf/phillama-3.8b-v0.1.Q4_K_M.gguf
+- !!merge <<: *llama3
+  name: "calme-2.3-phi3-4b"
+  icon: https://huggingface.co/MaziyarPanahi/calme-2.1-phi3-4b/resolve/main/phi-3-instruct.webp
+  urls:
+    - https://huggingface.co/MaziyarPanahi/calme-2.3-phi3-4b
+    - https://huggingface.co/MaziyarPanahi/calme-2.3-phi3-4b-GGUF
+  description: |
+    MaziyarPanahi/calme-2.1-phi3-4b
+
+    This model is a fine-tune (DPO) of the microsoft/Phi-3-mini-4k-instruct model.
+ overrides: + parameters: + model: Phi-3-mini-4k-instruct-v0.3.Q4_K_M.gguf + files: + - filename: Phi-3-mini-4k-instruct-v0.3.Q4_K_M.gguf + sha256: 3a23e1052369c080afb925882bd814cbea5ec859894655a7434c3d49e43a6127 + uri: huggingface://MaziyarPanahi/calme-2.3-phi3-4b-GGUF/Phi-3-mini-4k-instruct-v0.3.Q4_K_M.gguf - &hermes-2-pro-mistral ### START Hermes url: "github:mudler/LocalAI/gallery/hermes-2-pro-mistral.yaml@master" From 3f7eddb039226c29a3398394c50f35f5c1d8105e Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 21 Jul 2024 21:51:52 +0200 Subject: [PATCH 0013/1851] models(gallery): add calme-2.8-qwen2-7b (#2940) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index dc2e5007..3fd8def2 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -247,6 +247,21 @@ - filename: Qwen2-Wukong-7B-Q4_K_M.gguf sha256: 6b8ca6649c33fc84d4892ebcff1214f0b34697aced784f0d6d32e284a15943ad uri: huggingface://bartowski/Qwen2-Wukong-7B-GGUF/Qwen2-Wukong-7B-Q4_K_M.gguf +- !!merge <<: *qwen2 + name: "calme-2.8-qwen2-7b" + icon: https://huggingface.co/MaziyarPanahi/calme-2.8-qwen2-7b/resolve/main/qwen2-fine-tunes-maziyar-panahi.webp + urls: + - https://huggingface.co/MaziyarPanahi/calme-2.8-qwen2-7b + - https://huggingface.co/MaziyarPanahi/calme-2.8-qwen2-7b-GGUF + description: | + This is a fine-tuned version of the Qwen/Qwen2-7B model. It aims to improve the base model across all benchmarks. + overrides: + parameters: + model: Qwen2-7B-Instruct-v0.8.Q4_K_M.gguf + files: + - filename: Qwen2-7B-Instruct-v0.8.Q4_K_M.gguf + sha256: 8c1b3efe9fa6ae1b37942ef26473cb4e0aed0f8038b60d4b61e5bffb61e49b7e + uri: huggingface://MaziyarPanahi/calme-2.8-qwen2-7b-GGUF/Qwen2-7B-Instruct-v0.8.Q4_K_M.gguf - &mistral03 ## START Mistral url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master" From 9c0c11e8a05717c685381650e3b640341faf4683 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 21 Jul 2024 21:57:30 +0200 Subject: [PATCH 0014/1851] models(gallery): add StellarDong-72b (#2941) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 3fd8def2..63e3b49f 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -262,6 +262,21 @@ - filename: Qwen2-7B-Instruct-v0.8.Q4_K_M.gguf sha256: 8c1b3efe9fa6ae1b37942ef26473cb4e0aed0f8038b60d4b61e5bffb61e49b7e uri: huggingface://MaziyarPanahi/calme-2.8-qwen2-7b-GGUF/Qwen2-7B-Instruct-v0.8.Q4_K_M.gguf +- !!merge <<: *qwen2 + name: "stellardong-72b-i1" + icon: https://huggingface.co/smelborp/StellarDong-72b/resolve/main/stellardong.png + urls: + - https://huggingface.co/smelborp/StellarDong-72b + - https://huggingface.co/mradermacher/StellarDong-72b-i1-GGUF + description: | + Magnum + Nova = you won't believe how stellar this dong is!! 
+ overrides: + parameters: + model: StellarDong-72b.i1-Q4_K_M.gguf + files: + - filename: StellarDong-72b.i1-Q4_K_M.gguf + sha256: 4c5012f0a034f40a044904891343ade2594f29c28a8a9d8052916de4dc5a61df + uri: huggingface://mradermacher/StellarDong-72b-i1-GGUF/StellarDong-72b.i1-Q4_K_M.gguf - &mistral03 ## START Mistral url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master" From 19282af0596c0f95ac028ff552647f2c9fa07b32 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 21 Jul 2024 22:01:15 +0200 Subject: [PATCH 0015/1851] models(gallery): add calme-2.4-llama3-70b (#2942) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 63e3b49f..31af59a3 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -3271,6 +3271,21 @@ - filename: suzume-llama-3-8B-multilingual-orpo-borda-top25.Q4_K_M.gguf sha256: ef75a02c5f38e14a8873c7989188dac6974851b4654279fe1921d2c8018cc388 uri: huggingface://RichardErkhov/lightblue_-_suzume-llama-3-8B-multilingual-orpo-borda-top25-gguf/suzume-llama-3-8B-multilingual-orpo-borda-top25.Q4_K_M.gguf +- !!merge <<: *llama3 + name: "calme-2.4-llama3-70b" + icon: https://huggingface.co/MaziyarPanahi/calme-2.4-llama3-70b/resolve/main/llama-3-merges.webp + urls: + - https://huggingface.co/MaziyarPanahi/calme-2.4-llama3-70b + - https://huggingface.co/mradermacher/calme-2.4-llama3-70b-GGUF + description: | + This model is a fine-tune (DPO) of meta-llama/Meta-Llama-3-70B-Instruct model. + overrides: + parameters: + model: calme-2.4-llama3-70b.Q4_K_M.gguf + files: + - filename: calme-2.4-llama3-70b.Q4_K_M.gguf + sha256: 0b44ac8a88395dfc60f1b9d3cfffc0ffef74ec0a302e610ef91fc787187568f2 + uri: huggingface://mradermacher/calme-2.4-llama3-70b-GGUF/calme-2.4-llama3-70b.Q4_K_M.gguf - &command-R ### START Command-r url: "github:mudler/LocalAI/gallery/command-r.yaml@master" From bcd9e153ba1e7efae8f9ec8ab3778310ec1a1818 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 22 Jul 2024 15:39:57 +0200 Subject: [PATCH 0016/1851] ci(Makefile): reduce binary size by compressing (#2947) Makefile: try to reduce binary size Signed-off-by: Ettore Di Giacinto --- .github/workflows/release.yaml | 6 ++--- .github/workflows/test.yml | 2 +- Dockerfile | 2 +- Makefile | 47 +++++++++++++++++++++++++++++++++- 4 files changed, 51 insertions(+), 6 deletions(-) diff --git a/.github/workflows/release.yaml b/.github/workflows/release.yaml index b2c6c069..faed2b81 100644 --- a/.github/workflows/release.yaml +++ b/.github/workflows/release.yaml @@ -35,7 +35,7 @@ jobs: - name: Dependencies run: | sudo apt-get update - sudo apt-get install build-essential ffmpeg protobuf-compiler ccache gawk + sudo apt-get install build-essential ffmpeg protobuf-compiler ccache upx-ucl gawk sudo apt-get install -qy binutils-aarch64-linux-gnu gcc-aarch64-linux-gnu g++-aarch64-linux-gnu libgmock-dev - name: Install CUDA Dependencies run: | @@ -151,7 +151,7 @@ jobs: - name: Dependencies run: | sudo apt-get update - sudo apt-get install -y wget curl build-essential ffmpeg protobuf-compiler ccache gawk cmake libgmock-dev + sudo apt-get install -y wget curl build-essential ffmpeg protobuf-compiler ccache upx-ucl gawk cmake libgmock-dev - name: Intel Dependencies run: | wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null @@ -252,7 +252,7 @@ jobs: - name: Dependencies run: | sudo apt-get 
update - sudo apt-get install -y --no-install-recommends libopencv-dev protobuf-compiler ccache + sudo apt-get install -y --no-install-recommends libopencv-dev protobuf-compiler ccache upx-ucl go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2 - name: Build stablediffusion diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 084d016d..e6efe77f 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -70,7 +70,7 @@ jobs: - name: Dependencies run: | sudo apt-get update - sudo apt-get install build-essential curl ffmpeg + sudo apt-get install build-essential ccache upx-ucl curl ffmpeg sudo apt-get install -y libgmock-dev curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg && \ sudo install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg && \ diff --git a/Dockerfile b/Dockerfile index 78ed4cd3..fcad8343 100644 --- a/Dockerfile +++ b/Dockerfile @@ -24,7 +24,7 @@ RUN apt-get update && \ cmake \ curl \ git \ - unzip && \ + unzip upx-ucl && \ apt-get clean && \ rm -rf /var/lib/apt/lists/* diff --git a/Makefile b/Makefile index 906e8ca5..882b6fe6 100644 --- a/Makefile +++ b/Makefile @@ -58,7 +58,7 @@ RANDOM := $(shell bash -c 'echo $$RANDOM') VERSION?=$(shell git describe --always --tags || echo "dev" ) # go tool nm ./local-ai | grep Commit -LD_FLAGS?= +LD_FLAGS?=-s -w override LD_FLAGS += -X "github.com/mudler/LocalAI/internal.Version=$(VERSION)" override LD_FLAGS += -X "github.com/mudler/LocalAI/internal.Commit=$(shell git rev-parse HEAD)" @@ -72,6 +72,14 @@ WHITE := $(shell tput -Txterm setaf 7) CYAN := $(shell tput -Txterm setaf 6) RESET := $(shell tput -Txterm sgr0) +UPX?= +# check if upx exists +ifeq (, $(shell which upx)) + UPX= +else + UPX=$(shell which upx) +endif + # Default Docker bridge IP E2E_BRIDGE_IP?=172.17.0.1 @@ -377,6 +385,7 @@ build: prepare backend-assets grpcs ## Build the project $(info ${GREEN}I BUILD_TYPE: ${YELLOW}$(BUILD_TYPE)${RESET}) $(info ${GREEN}I GO_TAGS: ${YELLOW}$(GO_TAGS)${RESET}) $(info ${GREEN}I LD_FLAGS: ${YELLOW}$(LD_FLAGS)${RESET}) + $(info ${GREEN}I UPX: ${YELLOW}$(UPX)${RESET}) ifneq ($(BACKEND_LIBS),) $(MAKE) backend-assets/lib cp -f $(BACKEND_LIBS) backend-assets/lib/ @@ -733,13 +742,22 @@ backend-assets/grpc: protogen-go replace backend-assets/grpc/bert-embeddings: sources/go-bert.cpp sources/go-bert.cpp/libgobert.a backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/go-bert.cpp LIBRARY_PATH=$(CURDIR)/sources/go-bert.cpp \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/bert-embeddings ./backend/go/llm/bert/ +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/bert-embeddings +endif backend-assets/grpc/gpt4all: sources/gpt4all sources/gpt4all/gpt4all-bindings/golang/libgpt4all.a backend-assets/gpt4all backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/gpt4all/gpt4all-bindings/golang/ LIBRARY_PATH=$(CURDIR)/sources/gpt4all/gpt4all-bindings/golang/ \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gpt4all ./backend/go/llm/gpt4all/ +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/gpt4all +endif backend-assets/grpc/huggingface: backend-assets/grpc $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/huggingface ./backend/go/llm/langchain/ +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/huggingface 
+endif backend/cpp/llama/llama.cpp: LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/llama llama.cpp @@ -765,6 +783,9 @@ else echo "BUILD_GRPC_FOR_BACKEND_LLAMA is not defined." LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/${VARIANT} grpc-server endif +ifneq ($(UPX),) + $(UPX) backend/cpp/${VARIANT}/grpc-server +endif # This target is for manually building a variant with-auto detected flags backend-assets/grpc/llama-cpp: backend-assets/grpc backend/cpp/llama/llama.cpp @@ -837,33 +858,57 @@ backend-assets/grpc/llama-cpp-grpc: backend-assets/grpc backend/cpp/llama/llama. backend-assets/util/llama-cpp-rpc-server: backend-assets/grpc/llama-cpp-grpc mkdir -p backend-assets/util/ cp -rf backend/cpp/llama-grpc/llama.cpp/build/bin/rpc-server backend-assets/util/llama-cpp-rpc-server +ifneq ($(UPX),) + $(UPX) backend-assets/util/llama-cpp-rpc-server +endif backend-assets/grpc/llama-ggml: sources/go-llama.cpp sources/go-llama.cpp/libbinding.a backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/go-llama.cpp LIBRARY_PATH=$(CURDIR)/sources/go-llama.cpp \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama-ggml ./backend/go/llm/llama-ggml/ +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/llama-ggml +endif backend-assets/grpc/piper: sources/go-piper sources/go-piper/libpiper_binding.a backend-assets/grpc backend-assets/espeak-ng-data CGO_CXXFLAGS="$(PIPER_CGO_CXXFLAGS)" CGO_LDFLAGS="$(PIPER_CGO_LDFLAGS)" LIBRARY_PATH=$(CURDIR)/sources/go-piper \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/piper ./backend/go/tts/ +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/piper +endif backend-assets/grpc/rwkv: sources/go-rwkv.cpp sources/go-rwkv.cpp/librwkv.a backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/go-rwkv.cpp LIBRARY_PATH=$(CURDIR)/sources/go-rwkv.cpp \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/rwkv ./backend/go/llm/rwkv +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/rwkv +endif backend-assets/grpc/stablediffusion: sources/go-stable-diffusion sources/go-stable-diffusion/libstablediffusion.a backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS)" CPATH="$(CPATH):$(CURDIR)/sources/go-stable-diffusion/:/usr/include/opencv4" LIBRARY_PATH=$(CURDIR)/sources/go-stable-diffusion/ \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/stablediffusion ./backend/go/image/stablediffusion +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/stablediffusion +endif backend-assets/grpc/tinydream: sources/go-tiny-dream sources/go-tiny-dream/libtinydream.a backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS)" LIBRARY_PATH=$(CURDIR)/go-tiny-dream \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/tinydream ./backend/go/image/tinydream +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/tinydream +endif backend-assets/grpc/whisper: sources/whisper.cpp sources/whisper.cpp/libwhisper.a backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS) $(CGO_LDFLAGS_WHISPER)" C_INCLUDE_PATH="$(CURDIR)/sources/whisper.cpp/include:$(CURDIR)/sources/whisper.cpp/ggml/include" LIBRARY_PATH=$(CURDIR)/sources/whisper.cpp \ $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/whisper ./backend/go/transcribe/ +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/whisper +endif backend-assets/grpc/local-store: backend-assets/grpc $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/local-store 
./backend/go/stores/ +ifneq ($(UPX),) + $(UPX) backend-assets/grpc/local-store +endif grpcs: prepare $(GRPC_BACKENDS) From 7d61de63ae1fca11d020359657938fc69af64560 Mon Sep 17 00:00:00 2001 From: fakezeta Date: Mon, 22 Jul 2024 15:40:34 +0200 Subject: [PATCH 0017/1851] fix: pin setuptools 69.5.1 (#2949) pin setuptools 69.5.1 --- backend/python/sentencetransformers/requirements-intel.txt | 2 +- backend/python/transformers-musicgen/requirements-intel.txt | 2 +- backend/python/transformers/requirements-intel.txt | 1 - backend/python/transformers/requirements.txt | 2 +- 4 files changed, 3 insertions(+), 4 deletions(-) diff --git a/backend/python/sentencetransformers/requirements-intel.txt b/backend/python/sentencetransformers/requirements-intel.txt index 635b4c31..95d4848c 100644 --- a/backend/python/sentencetransformers/requirements-intel.txt +++ b/backend/python/sentencetransformers/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/transformers-musicgen/requirements-intel.txt b/backend/python/transformers-musicgen/requirements-intel.txt index 635b4c31..95d4848c 100644 --- a/backend/python/transformers-musicgen/requirements-intel.txt +++ b/backend/python/transformers-musicgen/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/transformers/requirements-intel.txt b/backend/python/transformers/requirements-intel.txt index 635b4c31..8fc18a0e 100644 --- a/backend/python/transformers/requirements-intel.txt +++ b/backend/python/transformers/requirements-intel.txt @@ -2,4 +2,3 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 76066f50..40e87073 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -6,4 +6,4 @@ torch certifi intel-extension-for-transformers bitsandbytes -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 From 153e97715543188212a366eeccecf112f5115e8c Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 22 Jul 2024 17:35:10 +0200 Subject: [PATCH 0018/1851] Update distributed_inferencing.md Signed-off-by: Ettore Di Giacinto --- docs/content/docs/features/distributed_inferencing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/content/docs/features/distributed_inferencing.md b/docs/content/docs/features/distributed_inferencing.md index b7ce41a9..1ab3fa55 100644 --- a/docs/content/docs/features/distributed_inferencing.md +++ b/docs/content/docs/features/distributed_inferencing.md @@ -11,7 +11,7 @@ This functionality enables LocalAI to distribute inference requests across multi LocalAI supports two modes of distributed inferencing via p2p: - **Federated Mode**: Requests are shared between the cluster and routed to a single worker node in the network based on the load balancer's decision. 
-- **Worker Mode**: Requests are processed by all the workers which contributes to the final inference result (by sharing the model weights). +- **Worker Mode** (aka "model sharding" or "splitting weights"): Requests are processed by all the workers which contributes to the final inference result (by sharing the model weights). ## Usage From 3dc601c4704154450e84f7cb31bf896ebf0f29d7 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Mon, 22 Jul 2024 18:04:41 +0200 Subject: [PATCH 0019/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#2943) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 882b6fe6..b7df2486 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=07283b1a90e1320aae4762c7e03c879043910252 +CPPLLAMA_VERSION?=45f2c19cc57286eead7b232ce8028273a817aa4d # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From a6b92af875987e7e62cc0c530abefed89a015c88 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 22 Jul 2024 21:34:12 +0000 Subject: [PATCH 0020/1851] chore(deps): Bump grpcio from 1.64.1 to 1.65.1 in /backend/python/openvoice (#2956) chore(deps): Bump grpcio in /backend/python/openvoice Bumps [grpcio](https://github.com/grpc/grpc) from 1.64.1 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.64.1...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/openvoice/requirements-intel.txt | 2 +- backend/python/openvoice/requirements.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index b0551187..bad088a9 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -2,7 +2,7 @@ intel-extension-for-pytorch torch optimum[openvino] -grpcio==1.64.1 +grpcio==1.65.1 protobuf librosa==0.9.1 faster-whisper==1.0.3 diff --git a/backend/python/openvoice/requirements.txt b/backend/python/openvoice/requirements.txt index 07ba879a..86d16ec2 100644 --- a/backend/python/openvoice/requirements.txt +++ b/backend/python/openvoice/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.65.0 +grpcio==1.65.1 protobuf librosa faster-whisper From 1a75546b272cb1e3deff1315b734ae39afb2bd71 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 22 Jul 2024 21:41:06 +0000 Subject: [PATCH 0021/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/sentencetransformers (#2955) chore(deps): Bump grpcio in /backend/python/sentencetransformers Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. 
- [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/sentencetransformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/sentencetransformers/requirements.txt b/backend/python/sentencetransformers/requirements.txt index ac21d449..4ef4a28b 100644 --- a/backend/python/sentencetransformers/requirements.txt +++ b/backend/python/sentencetransformers/requirements.txt @@ -1,6 +1,6 @@ accelerate sentence-transformers==3.0.1 transformers -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi \ No newline at end of file From f4ed47bf956bdf39e39d71e6a25d144fb4da0cdd Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 22 Jul 2024 21:47:54 +0000 Subject: [PATCH 0022/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/bark (#2951) Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/bark/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/bark/requirements.txt b/backend/python/bark/requirements.txt index 215b3d35..d3f9f52b 100644 --- a/backend/python/bark/requirements.txt +++ b/backend/python/bark/requirements.txt @@ -1,6 +1,6 @@ accelerate bark==0.1.5 -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi transformers \ No newline at end of file From 29669791615befa7e41c6862c6fe25b273ef729b Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 22 Jul 2024 22:26:35 +0000 Subject: [PATCH 0023/1851] chore(deps): Bump docs/themes/hugo-theme-relearn from `1b2e139` to `7aec99b` (#2952) chore(deps): Bump docs/themes/hugo-theme-relearn Bumps [docs/themes/hugo-theme-relearn](https://github.com/McShelby/hugo-theme-relearn) from `1b2e139` to `7aec99b`. - [Release notes](https://github.com/McShelby/hugo-theme-relearn/releases) - [Commits](https://github.com/McShelby/hugo-theme-relearn/compare/1b2e139512106f8074ac7d4a884135d159720cc4...7aec99b38dc2668c6139bf71855535ace41c123c) --- updated-dependencies: - dependency-name: docs/themes/hugo-theme-relearn dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- docs/themes/hugo-theme-relearn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/themes/hugo-theme-relearn b/docs/themes/hugo-theme-relearn index 1b2e1395..7aec99b3 160000 --- a/docs/themes/hugo-theme-relearn +++ b/docs/themes/hugo-theme-relearn @@ -1 +1 @@ -Subproject commit 1b2e139512106f8074ac7d4a884135d159720cc4 +Subproject commit 7aec99b38dc2668c6139bf71855535ace41c123c From d3166e8571c576be78139b8dfffb77e0de8da2fc Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 22 Jul 2024 22:49:29 +0000 Subject: [PATCH 0024/1851] chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/langchain/langchainpy-localai-example (#2959) chore(deps): Bump langchain Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.8 to 0.2.10. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.8...langchain==0.2.10) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 01a75d46..a0578a09 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -10,7 +10,7 @@ debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.0.3 idna==3.7 -langchain==0.2.8 +langchain==0.2.10 langchain-community==0.2.7 marshmallow==3.21.3 marshmallow-enum==1.5.1 From 8ec7a0a407d240bce303dd2d837602c4a61dd4af Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 22 Jul 2024 22:49:39 +0000 Subject: [PATCH 0025/1851] chore(deps): Bump numpy from 1.26.4 to 2.0.1 in /examples/langchain/langchainpy-localai-example (#2958) chore(deps): Bump numpy Bumps [numpy](https://github.com/numpy/numpy) from 1.26.4 to 2.0.1. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/RELEASE_WALKTHROUGH.rst) - [Commits](https://github.com/numpy/numpy/compare/v1.26.4...v2.0.1) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index a0578a09..25d74716 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -17,7 +17,7 @@ marshmallow-enum==1.5.1 multidict==6.0.5 mypy-extensions==1.0.0 numexpr==2.10.1 -numpy==1.26.4 +numpy==2.0.1 openai==1.35.13 openapi-schema-pydantic==1.2.4 packaging>=23.2 From 9fc09b32cfec5ccd693ca9ac8592c45b305bbaec Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 22 Jul 2024 23:50:41 +0000 Subject: [PATCH 0026/1851] chore(deps): Bump sqlalchemy from 2.0.30 to 2.0.31 in /examples/langchain/langchainpy-localai-example (#2957) chore(deps): Bump sqlalchemy Bumps [sqlalchemy](https://github.com/sqlalchemy/sqlalchemy) from 2.0.30 to 2.0.31. - [Release notes](https://github.com/sqlalchemy/sqlalchemy/releases) - [Changelog](https://github.com/sqlalchemy/sqlalchemy/blob/main/CHANGES.rst) - [Commits](https://github.com/sqlalchemy/sqlalchemy/commits) --- updated-dependencies: - dependency-name: sqlalchemy dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 25d74716..522dbe14 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -24,7 +24,7 @@ packaging>=23.2 pydantic==2.8.2 PyYAML==6.0.1 requests==2.32.3 -SQLAlchemy==2.0.30 +SQLAlchemy==2.0.31 tenacity==8.5.0 tqdm==4.66.4 typing-inspect==0.9.0 From a1bc2e977109379b2e89ab1c1c2d6f9ea646eb01 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 00:08:22 +0000 Subject: [PATCH 0027/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/vllm (#2964) Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vllm/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vllm/requirements.txt b/backend/python/vllm/requirements.txt index 986a4d55..7c612a2f 100644 --- a/backend/python/vllm/requirements.txt +++ b/backend/python/vllm/requirements.txt @@ -1,6 +1,6 @@ accelerate vllm -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi transformers From 824cc816ea9e990100fcb533733758184bab9ebe Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 00:58:30 +0000 Subject: [PATCH 0028/1851] chore(deps): Bump llama-index from 0.10.55 to 0.10.56 in /examples/chainlit (#2966) chore(deps): Bump llama-index in /examples/chainlit Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.10.55 to 0.10.56. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.10.55...v0.10.56) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index 116b7b61..cac24528 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,4 +1,4 @@ -llama_index==0.10.55 +llama_index==0.10.56 requests==2.32.3 weaviate_client==4.6.5 transformers From b555b64616367db6e288706f06541dc88ddc8cfa Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 01:07:42 +0000 Subject: [PATCH 0029/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/common/template (#2963) chore(deps): Bump grpcio in /backend/python/common/template Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/common/template/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/common/template/requirements.txt b/backend/python/common/template/requirements.txt index c762c4d6..8d1e3151 100644 --- a/backend/python/common/template/requirements.txt +++ b/backend/python/common/template/requirements.txt @@ -1,2 +1,2 @@ -grpcio==1.65.0 +grpcio==1.65.1 protobuf \ No newline at end of file From ede352256be74b8f63e71c07c542d3ce3900b5bc Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 01:17:19 +0000 Subject: [PATCH 0030/1851] chore(deps): Bump weaviate-client from 4.6.5 to 4.6.7 in /examples/chainlit (#2965) chore(deps): Bump weaviate-client in /examples/chainlit Bumps [weaviate-client](https://github.com/weaviate/weaviate-python-client) from 4.6.5 to 4.6.7. - [Release notes](https://github.com/weaviate/weaviate-python-client/releases) - [Changelog](https://github.com/weaviate/weaviate-python-client/blob/main/docs/changelog.rst) - [Commits](https://github.com/weaviate/weaviate-python-client/compare/v4.6.5...v4.6.7) --- updated-dependencies: - dependency-name: weaviate-client dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index cac24528..13415f11 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,6 +1,6 @@ llama_index==0.10.56 requests==2.32.3 -weaviate_client==4.6.5 +weaviate_client==4.6.7 transformers torch chainlit From 99324eeef0cb9c29cb68958b9ff599a4e1e768ba Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 02:39:44 +0000 Subject: [PATCH 0031/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/transformers (#2970) chore(deps): Bump grpcio in /backend/python/transformers Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 40e87073..55925b32 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,6 +1,6 @@ accelerate transformers -grpcio==1.65.0 +grpcio==1.65.1 protobuf torch certifi From 8385eb2a596e6a41dd84f582c17f25bbae55d2c4 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 03:42:48 +0000 Subject: [PATCH 0032/1851] chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/functions (#2973) Bumps [openai](https://github.com/openai/openai-python) from 1.35.13 to 1.37.0. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.35.13...v1.37.0) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 481af898..d5e8f2c5 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.2.8 -openai==1.35.13 +openai==1.37.0 From 2f9f04b26097106eed81913485ab550948dfdd77 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 03:47:26 +0000 Subject: [PATCH 0033/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/diffusers (#2969) chore(deps): Bump grpcio in /backend/python/diffusers Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/diffusers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/diffusers/requirements.txt b/backend/python/diffusers/requirements.txt index c607187e..6f04d677 100644 --- a/backend/python/diffusers/requirements.txt +++ b/backend/python/diffusers/requirements.txt @@ -3,7 +3,7 @@ accelerate compel peft diffusers -grpcio==1.65.0 +grpcio==1.65.1 opencv-python pillow protobuf From 7ab3217df0d11d422138a9290b34407e43bcfae5 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 04:03:28 +0000 Subject: [PATCH 0034/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/exllama2 (#2971) chore(deps): Bump grpcio in /backend/python/exllama2 Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. 
- [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/exllama2/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/exllama2/requirements.txt b/backend/python/exllama2/requirements.txt index 62c7117a..6aae273c 100644 --- a/backend/python/exllama2/requirements.txt +++ b/backend/python/exllama2/requirements.txt @@ -1,5 +1,5 @@ accelerate -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi torch From fb574434a4e0b62f6c03b20d6226dc362f9a3bdf Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 04:40:27 +0000 Subject: [PATCH 0035/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/rerankers (#2974) chore(deps): Bump grpcio in /backend/python/rerankers Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/rerankers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/rerankers/requirements.txt b/backend/python/rerankers/requirements.txt index 1b437654..8b2ad4d0 100644 --- a/backend/python/rerankers/requirements.txt +++ b/backend/python/rerankers/requirements.txt @@ -1,6 +1,6 @@ accelerate rerankers[transformers] -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi transformers \ No newline at end of file From 385d8dc29b69269064dcca7cbc60b39496e6671c Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 06:15:50 +0000 Subject: [PATCH 0036/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/coqui (#2980) Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/coqui/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/coqui/requirements.txt b/backend/python/coqui/requirements.txt index d7dd07e4..e1cddaa3 100644 --- a/backend/python/coqui/requirements.txt +++ b/backend/python/coqui/requirements.txt @@ -1,6 +1,6 @@ accelerate TTS==0.22.0 -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi transformers \ No newline at end of file From bbb1dc2ae085e41a46ab60e2b3c8af9eeeeb749b Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 06:33:45 +0000 Subject: [PATCH 0037/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/parler-tts (#2982) chore(deps): Bump grpcio in /backend/python/parler-tts Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/parler-tts/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/parler-tts/requirements.txt b/backend/python/parler-tts/requirements.txt index c3706051..147cad9a 100644 --- a/backend/python/parler-tts/requirements.txt +++ b/backend/python/parler-tts/requirements.txt @@ -1,5 +1,5 @@ accelerate -grpcio==1.65.0 +grpcio==1.65.1 protobuf torch git+https://github.com/huggingface/parler-tts.git@10016fb0300c0dc31a0fb70e26f3affee7b62f16 From 6ec593c23776a6dc645027a99051859f15359920 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 06:50:45 +0000 Subject: [PATCH 0038/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/vall-e-x (#2981) chore(deps): Bump grpcio in /backend/python/vall-e-x Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vall-e-x/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vall-e-x/requirements.txt b/backend/python/vall-e-x/requirements.txt index ac891fe7..d1d0583e 100644 --- a/backend/python/vall-e-x/requirements.txt +++ b/backend/python/vall-e-x/requirements.txt @@ -1,4 +1,4 @@ accelerate -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi \ No newline at end of file From 36789e9ead9ed4e3caf10f93b13d52aa0b9f35f9 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 07:34:26 +0000 Subject: [PATCH 0039/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/transformers-musicgen (#2990) chore(deps): Bump grpcio in /backend/python/transformers-musicgen Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers-musicgen/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers-musicgen/requirements.txt b/backend/python/transformers-musicgen/requirements.txt index 8a969c34..8ffa3c31 100644 --- a/backend/python/transformers-musicgen/requirements.txt +++ b/backend/python/transformers-musicgen/requirements.txt @@ -1,6 +1,6 @@ accelerate transformers -grpcio==1.65.0 +grpcio==1.65.1 protobuf torch scipy==1.14.0 From 9c331239d9a9b2abc447d1f504a9e4c8f14656c3 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 08:16:38 +0000 Subject: [PATCH 0040/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/autogptq (#2984) chore(deps): Bump grpcio in /backend/python/autogptq Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/autogptq/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/autogptq/requirements.txt b/backend/python/autogptq/requirements.txt index e416adb2..7a1bf85f 100644 --- a/backend/python/autogptq/requirements.txt +++ b/backend/python/autogptq/requirements.txt @@ -1,6 +1,6 @@ accelerate auto-gptq==0.7.1 -grpcio==1.65.0 +grpcio==1.65.1 protobuf torch certifi From 5e5037f10d87510e76cff558d1048cfae01a4828 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 23 Jul 2024 10:42:51 +0200 Subject: [PATCH 0041/1851] feat(p2p): warn the user to start with --p2p (#2993) Signed-off-by: Ettore Di Giacinto --- core/http/views/p2p.html | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/core/http/views/p2p.html b/core/http/views/p2p.html index 0396924e..a8c51310 100644 --- a/core/http/views/p2p.html +++ b/core/http/views/p2p.html @@ -16,7 +16,16 @@
LocalAI uses P2P technologies to enable distribution of work between peers. It is possible to share an instance with Federation and/or split the weights of a model across peers (only available with llama.cpp models). You can now share computational resources between your devices or with your friends!
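As a rough sketch of what this looks like in practice, a containerized peer could be started with host networking (required for p2p discovery in containers, per the distributed-inferencing notes) as shown below. The image tag and the token value are illustrative assumptions, not values taken from this page:

```bash
# Minimal sketch: run a containerized LocalAI peer with P2P enabled.
# The image tag and the token value are placeholder assumptions.
docker run -ti --net host \
  -e LOCALAI_P2P_TOKEN="<shared-token>" \
  localai/localai:latest-cpu run --p2p
```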
- + + {{ if and .IsP2PEnabled (eq .P2PToken "") }} +
+

Warning: P2P mode is disabled or no token was specified

+

You have to enable P2P mode by starting LocalAI with --p2p. Restarting the server with --p2p automatically generates a new token that can be used to discover other nodes. If you already have a token, specify it with export TOKEN=".." + Check out the documentation for more information. +
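To make the two options in this warning concrete, a minimal sketch might look like the following; the token value is a placeholder, not a real credential:

```bash
# Option 1: restart the server with --p2p so a fresh token is generated for you.
local-ai run --p2p

# Option 2: reuse a token you already have, then start the server.
export TOKEN="<existing-token>"
local-ai run --p2p
```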

+
+ {{ else }} +
@@ -128,7 +137,8 @@
- + + {{ end }} From e3cd11cc0a1d84066d774f094192dc65932723c0 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 09:28:33 +0000 Subject: [PATCH 0042/1851] chore(deps): Bump llama-index from 0.10.55 to 0.10.56 in /examples/langchain-chroma (#2986) chore(deps): Bump llama-index in /examples/langchain-chroma Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.10.55 to 0.10.56. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.10.55...v0.10.56) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 0e6d8c4d..7a316c24 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.8 openai==1.35.13 chromadb==0.5.4 -llama-index==0.10.55 \ No newline at end of file +llama-index==0.10.56 \ No newline at end of file From 39de3cf21dad01110c677e88559a2f6ab1990f3c Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 10:15:55 +0000 Subject: [PATCH 0043/1851] chore(deps): Bump grpcio from 1.65.0 to 1.65.1 in /backend/python/mamba (#2989) Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.0 to 1.65.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.0...v1.65.1) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/mamba/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/mamba/requirements.txt b/backend/python/mamba/requirements.txt index e431ddfe..2aac2cda 100644 --- a/backend/python/mamba/requirements.txt +++ b/backend/python/mamba/requirements.txt @@ -1,6 +1,6 @@ causal-conv1d==1.4.0 mamba-ssm==2.2.2 -grpcio==1.65.0 +grpcio==1.65.1 protobuf certifi transformers \ No newline at end of file From b53947a5bb42deb24d0805f4d677484ecabb78cd Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 23 Jul 2024 12:33:42 +0200 Subject: [PATCH 0044/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#2992) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index b7df2486..634d78a2 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=45f2c19cc57286eead7b232ce8028273a817aa4d +CPPLLAMA_VERSION?=081fe431aa8fb6307145c4feb3eed4f48cab19f8 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 703cd08f01ad105ea1677956fd0c3b24690271ec Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 11:00:46 +0000 Subject: [PATCH 0045/1851] chore(deps): Bump langchain-community from 0.2.7 to 0.2.9 in /examples/langchain/langchainpy-localai-example (#2960) chore(deps): Bump langchain-community Bumps [langchain-community](https://github.com/langchain-ai/langchain) from 0.2.7 to 0.2.9. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain-community==0.2.7...langchain-community==0.2.9) --- updated-dependencies: - dependency-name: langchain-community dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 522dbe14..6420d50e 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -11,7 +11,7 @@ frozenlist==1.4.1 greenlet==3.0.3 idna==3.7 langchain==0.2.10 -langchain-community==0.2.7 +langchain-community==0.2.9 marshmallow==3.21.3 marshmallow-enum==1.5.1 multidict==6.0.5 From 0314b37cd83b40184739ddd15af5e575c1e4045d Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 11:01:00 +0000 Subject: [PATCH 0046/1851] chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/langchain/langchainpy-localai-example (#2961) chore(deps): Bump openai Bumps [openai](https://github.com/openai/openai-python) from 1.35.13 to 1.37.0. 
- [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.35.13...v1.37.0) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 6420d50e..0e03d543 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -18,7 +18,7 @@ multidict==6.0.5 mypy-extensions==1.0.0 numexpr==2.10.1 numpy==2.0.1 -openai==1.35.13 +openai==1.37.0 openapi-schema-pydantic==1.2.4 packaging>=23.2 pydantic==2.8.2 From ead69a116ae27688846111819cd4324d06f149b6 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 11:51:05 +0000 Subject: [PATCH 0047/1851] chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/functions (#2975) Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.8 to 0.2.10. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.8...langchain==0.2.10) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index d5e8f2c5..f8afacdc 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ -langchain==0.2.8 +langchain==0.2.10 openai==1.37.0 From c7f0743f4815b4168ca096afb0408407391d3cf5 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 12:26:46 +0000 Subject: [PATCH 0048/1851] chore(deps): Bump openai from 1.35.13 to 1.37.0 in /examples/langchain-chroma (#2988) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.35.13 to 1.37.0. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.35.13...v1.37.0) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 7a316c24..17ed9c9a 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.8 -openai==1.35.13 +openai==1.37.0 chromadb==0.5.4 llama-index==0.10.56 \ No newline at end of file From 1c96e0b79ec25421845c917aa0de9be2603f983d Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 14:34:07 +0000 Subject: [PATCH 0049/1851] chore(deps): Bump langchain from 0.2.8 to 0.2.10 in /examples/langchain-chroma (#2987) chore(deps): Bump langchain in /examples/langchain-chroma Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.8 to 0.2.10. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.8...langchain==0.2.10) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 17ed9c9a..89ca2db7 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ -langchain==0.2.8 +langchain==0.2.10 openai==1.37.0 chromadb==0.5.4 llama-index==0.10.56 \ No newline at end of file From a9757fb0571668560cef55892d8661f35c961ebc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 23 Jul 2024 23:35:31 +0200 Subject: [PATCH 0050/1851] fix(cuda): downgrade to 12.0 to increase compatibility range (#2994) * fix(cuda): downgrade to 12.0 to increase compatibility range Signed-off-by: Ettore Di Giacinto * improve messaging Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- .github/workflows/image-pr.yml | 4 ++-- .github/workflows/image.yml | 8 ++++---- .github/workflows/release.yaml | 1 - Dockerfile | 2 +- Makefile | 2 +- pkg/model/initializers.go | 6 +++--- 6 files changed, 11 insertions(+), 12 deletions(-) diff --git a/.github/workflows/image-pr.yml b/.github/workflows/image-pr.yml index 290f8793..8ebaa1b2 100644 --- a/.github/workflows/image-pr.yml +++ b/.github/workflows/image-pr.yml @@ -47,7 +47,7 @@ jobs: # makeflags: "--jobs=3 --output-sync=target" - build-type: 'cublas' cuda-major-version: "12" - cuda-minor-version: "4" + cuda-minor-version: "0" platforms: 'linux/amd64' tag-latest: 'false' tag-suffix: '-cublas-cuda12-ffmpeg' @@ -120,7 +120,7 @@ jobs: # makeflags: "--jobs=3 --output-sync=target" # - build-type: 'cublas' # cuda-major-version: "12" - # cuda-minor-version: "4" + # cuda-minor-version: "0" # platforms: 'linux/amd64' # tag-latest: 'false' # tag-suffix: '-cublas-cuda12-ffmpeg-core' diff --git a/.github/workflows/image.yml b/.github/workflows/image.yml index 73899e15..395d7761 100644 --- a/.github/workflows/image.yml +++ b/.github/workflows/image.yml @@ -75,7 +75,7 @@ jobs: makeflags: "--jobs=3 --output-sync=target" - build-type: 'cublas' cuda-major-version: 
"12" - cuda-minor-version: "4" + cuda-minor-version: "0" platforms: 'linux/amd64' tag-latest: 'false' tag-suffix: '-cublas-cuda12' @@ -100,7 +100,7 @@ jobs: makeflags: "--jobs=3 --output-sync=target" - build-type: 'cublas' cuda-major-version: "12" - cuda-minor-version: "4" + cuda-minor-version: "0" platforms: 'linux/amd64' tag-latest: 'auto' tag-suffix: '-cublas-cuda12-ffmpeg' @@ -285,7 +285,7 @@ jobs: makeflags: "--jobs=4 --output-sync=target" - build-type: 'cublas' cuda-major-version: "12" - cuda-minor-version: "4" + cuda-minor-version: "0" platforms: 'linux/amd64' tag-latest: 'false' tag-suffix: '-cublas-cuda12-core' @@ -307,7 +307,7 @@ jobs: makeflags: "--jobs=4 --output-sync=target" - build-type: 'cublas' cuda-major-version: "12" - cuda-minor-version: "4" + cuda-minor-version: "0" platforms: 'linux/amd64' tag-latest: 'false' tag-suffix: '-cublas-cuda12-ffmpeg-core' diff --git a/.github/workflows/release.yaml b/.github/workflows/release.yaml index faed2b81..5c883db4 100644 --- a/.github/workflows/release.yaml +++ b/.github/workflows/release.yaml @@ -31,7 +31,6 @@ jobs: with: go-version: '1.21.x' cache: false - - name: Dependencies run: | sudo apt-get update diff --git a/Dockerfile b/Dockerfile index fcad8343..a0feadd9 100644 --- a/Dockerfile +++ b/Dockerfile @@ -99,7 +99,7 @@ FROM requirements-${IMAGE_TYPE} AS requirements-drivers ARG BUILD_TYPE ARG CUDA_MAJOR_VERSION=12 -ARG CUDA_MINOR_VERSION=4 +ARG CUDA_MINOR_VERSION=0 ENV BUILD_TYPE=${BUILD_TYPE} diff --git a/Makefile b/Makefile index 634d78a2..297938ae 100644 --- a/Makefile +++ b/Makefile @@ -480,7 +480,7 @@ prepare-e2e: mkdir -p $(TEST_DIR) cp -rfv $(abspath ./tests/e2e-fixtures)/gpu.yaml $(TEST_DIR)/gpu.yaml test -e $(TEST_DIR)/ggllm-test-model.bin || wget -q https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q2_K.gguf -O $(TEST_DIR)/ggllm-test-model.bin - docker build --build-arg GRPC_BACKENDS="$(GRPC_BACKENDS)" --build-arg IMAGE_TYPE=core --build-arg BUILD_TYPE=$(BUILD_TYPE) --build-arg CUDA_MAJOR_VERSION=12 --build-arg CUDA_MINOR_VERSION=4 --build-arg FFMPEG=true -t localai-tests . + docker build --build-arg GRPC_BACKENDS="$(GRPC_BACKENDS)" --build-arg IMAGE_TYPE=core --build-arg BUILD_TYPE=$(BUILD_TYPE) --build-arg CUDA_MAJOR_VERSION=12 --build-arg CUDA_MINOR_VERSION=0 --build-arg FFMPEG=true -t localai-tests . run-e2e-image: ls -liah $(abspath ./tests/e2e-fixtures) diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 901b4d99..88a08f28 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -212,7 +212,7 @@ func selectGRPCProcess(backend, assetDir string, f16 bool) string { grpcProcess = p foundCUDA = true } else { - log.Info().Msgf("GPU device found but no CUDA backend present") + log.Debug().Msgf("Nvidia GPU device found, no embedded CUDA variant found. You can ignore this message if you are using container with CUDA support") } } if strings.Contains(gpu.String(), "amd") { @@ -222,7 +222,7 @@ func selectGRPCProcess(backend, assetDir string, f16 bool) string { grpcProcess = p foundAMDGPU = true } else { - log.Info().Msgf("GPU device found but no HIPBLAS backend present") + log.Debug().Msgf("AMD GPU device found, no embedded HIPBLAS variant found. 
You can ignore this message if you are using container with HIPBLAS support") } } if strings.Contains(gpu.String(), "intel") { @@ -236,7 +236,7 @@ func selectGRPCProcess(backend, assetDir string, f16 bool) string { grpcProcess = p foundIntelGPU = true } else { - log.Info().Msgf("GPU device found but no Intel backend present") + log.Debug().Msgf("Intel GPU device found, no embedded SYCL variant found. You can ignore this message if you are using container with SYCL support") } } } From 89484efaed97ee64ae88d33051cacd3bbd2b8ae9 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 24 Jul 2024 12:27:49 +0200 Subject: [PATCH 0051/1851] docs: update distributed_inferencing.md Signed-off-by: Ettore Di Giacinto --- .../docs/features/distributed_inferencing.md | 23 ++++++++++++++----- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/docs/content/docs/features/distributed_inferencing.md b/docs/content/docs/features/distributed_inferencing.md index 1ab3fa55..2de7ae3c 100644 --- a/docs/content/docs/features/distributed_inferencing.md +++ b/docs/content/docs/features/distributed_inferencing.md @@ -122,12 +122,6 @@ The server logs should indicate that new workers are being discovered. ![output](https://github.com/mudler/LocalAI/assets/2420543/8ca277cf-c208-4562-8929-808b2324b584) -## Notes - -- If running in p2p mode with container images, make sure you start the container with `--net host` or `network_mode: host` in the docker-compose file. -- Only a single model is supported currently. -- Ensure the server detects new workers before starting inference. Currently, additional workers cannot be added once inference has begun. -- For more details on the implementation, refer to [LocalAI pull request #2343](https://github.com/mudler/LocalAI/pull/2343) ## Environment Variables @@ -138,3 +132,20 @@ There are options that can be tweaked or parameters that can be set using enviro | **LOCALAI_P2P_DISABLE_DHT** | Set to "true" to disable DHT and enable p2p layer to be local only (mDNS) | | **LOCALAI_P2P_DISABLE_LIMITS** | Set to "true" to disable connection limits and resources management | | **LOCALAI_P2P_TOKEN** | Set the token for the p2p network | + +## Architecture + +LocalAI uses https://github.com/libp2p/go-libp2p under the hood, the same project powering IPFS. Differently from other frameworks, LocalAI uses peer2peer without a single master server, but rather it uses sub/gossip and ledger functionalities to achieve consensus across different peers. + +[EdgeVPN](https://github.com/mudler/edgevpn) is used as a library to establish the network and expose the ledger functionality under a shared token to ease out automatic discovery and have separated, private peer2peer networks. + +The weights are split proportional to the memory when running into worker mode, when in federation mode each request is split to every node which have to load the model fully. + +## Notes + +- If running in p2p mode with container images, make sure you start the container with `--net host` or `network_mode: host` in the docker-compose file. +- Only a single model is supported currently. +- Ensure the server detects new workers before starting inference. Currently, additional workers cannot be added once inference has begun. 
+- For more details on the implementation, refer to [LocalAI pull request #2343](https://github.com/mudler/LocalAI/pull/2343) + + From bd900945f7fec40ab3398c6a34693ca271eb556f Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 24 Jul 2024 12:35:52 +0200 Subject: [PATCH 0052/1851] fix(llama.cpp): do not set anymore lora_base (#2999) Signed-off-by: Ettore Di Giacinto --- backend/cpp/llama/grpc-server.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/backend/cpp/llama/grpc-server.cpp b/backend/cpp/llama/grpc-server.cpp index 1cff6b8a..cb5c85f1 100644 --- a/backend/cpp/llama/grpc-server.cpp +++ b/backend/cpp/llama/grpc-server.cpp @@ -2259,7 +2259,6 @@ static void params_parse(const backend::ModelOptions* request, // get the directory of modelfile std::string model_dir = params.model.substr(0, params.model.find_last_of("/\\")); params.lora_adapter.push_back(std::make_tuple(model_dir + "/"+request->loraadapter(), scale_factor)); - params.lora_base = model_dir + "/"+request->lorabase(); } params.use_mlock = request->mlock(); params.use_mmap = request->mmap(); From 9fee46207ac4dd73354a58f47a58cb4e691f4773 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 24 Jul 2024 12:48:14 +0200 Subject: [PATCH 0053/1851] models(gallery): add llama3.1 70b and 8b (#3000) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 31af59a3..cc654885 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1,4 +1,44 @@ --- +## LLama3.1 +- &llama31 + url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" + icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png + name: "llama3-8b-instruct" + license: llama3.1 + description: | + The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. + + Model developer: Meta + + Model Architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. 
+ urls: + - https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct + - https://huggingface.co/MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF + tags: + - llm + - gguf + - gpu + - cpu + - llama3.1 + overrides: + parameters: + model: Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf + files: + - filename: Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf + sha256: c2f17f44af962660d1ad4cb1af91a731f219f3b326c2b14441f9df1f347f2815 + uri: huggingface://MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "meta-llama-3.1-70b-instruct" + urls: + - https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct + - https://huggingface.co/MaziyarPanahi/Meta-Llama-3.1-70B-Instruct-GGUF + overrides: + parameters: + model: Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf + files: + - filename: Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf + sha256: 3f16ab17da4521fe3ed7c5d7beed960d3fe7b5b64421ee9650aa53d6b649ccab + uri: huggingface://MaziyarPanahi/Meta-Llama-3.1-70B-Instruct-GGUF/Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf ## Deepseek - &deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 0802895cd20cf0ae482c66ef2d83a2e5244f5e27 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 24 Jul 2024 14:32:54 +0200 Subject: [PATCH 0054/1851] Update index.yaml Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index cc654885..fa61393c 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -3,7 +3,7 @@ - &llama31 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png - name: "llama3-8b-instruct" + name: "meta-llama-3.1-8b-instruct" license: llama3.1 description: | The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. 
From 80ae919dbe1c8e8022a9daecbdabf00482cbdd38 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 24 Jul 2024 15:37:08 +0200 Subject: [PATCH 0055/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#2995) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 297938ae..55ef43c6 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=081fe431aa8fb6307145c4feb3eed4f48cab19f8 +CPPLLAMA_VERSION?=79167d9e49aef9caa98e13ee7ca067ec9f88b4b5 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 4a69ef305245d5e5172de247c34e2a39b73c06f5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 24 Jul 2024 23:40:08 +0200 Subject: [PATCH 0056/1851] models(gallery): add llama3.1-claude (#3005) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index fa61393c..870242f0 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -39,6 +39,20 @@ - filename: Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf sha256: 3f16ab17da4521fe3ed7c5d7beed960d3fe7b5b64421ee9650aa53d6b649ccab uri: huggingface://MaziyarPanahi/Meta-Llama-3.1-70B-Instruct-GGUF/Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "meta-llama-3.1-8b-claude-imat" + urls: + - https://huggingface.co/Undi95/Meta-Llama-3.1-8B-Claude + - https://huggingface.co/InferenceIllusionist/Meta-Llama-3.1-8B-Claude-iMat-GGUF + description: | + Meta-Llama-3.1-8B-Claude-iMat-GGUF: Quantized from Meta-Llama-3.1-8B-Claude fp16. Weighted quantizations were creating using fp16 GGUF and groups_merged.txt in 88 chunks and n_ctx=512. Static fp16 will also be included in repo. For a brief rundown of iMatrix quant performance, please see this PR. All quants are verified working prior to uploading to repo for your safety and convenience. 
+ overrides: + parameters: + model: Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf + files: + - filename: Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf + sha256: 8de80021b9438f0925a41ae73f77cb73fcfa30090e03a0919ce23d2b9818e9c7 + uri: huggingface://InferenceIllusionist/Meta-Llama-3.1-8B-Claude-iMat-GGUF/Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf ## Deepseek - &deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 9031d2b9eb549502139c2b73d5d5b0f77f703cff Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 25 Jul 2024 00:32:10 +0200 Subject: [PATCH 0057/1851] docs: :arrow_up: update docs version mudler/LocalAI (#3002) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index fff9fa0c..efda370f 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.19.1" + "version": "v2.19.2" } From 717cc6fe1a5a27b0335305c78cea5109bcf158da Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 25 Jul 2024 00:47:38 +0200 Subject: [PATCH 0058/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3003) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 55ef43c6..f1862aef 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=79167d9e49aef9caa98e13ee7ca067ec9f88b4b5 +CPPLLAMA_VERSION?=68504f0970db5a3602d176953690f503059906b1 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 5eda7f578d232c0a8151e18e679cfb64c249c2de Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 25 Jul 2024 08:41:00 +0200 Subject: [PATCH 0059/1851] refactor: break down json grammar parser in different files (#3004) * refactor: break down json grammar parser in different files Signed-off-by: Ettore Di Giacinto * fix: patch to `refactor_grammars` - propagate errors (#3006) propagate errors around Signed-off-by: Dave Lee --------- Signed-off-by: Ettore Di Giacinto Signed-off-by: Dave Lee Co-authored-by: Dave --- core/http/endpoints/openai/chat.go | 10 +- pkg/functions/bnf_rules.go | 47 ++++++ pkg/functions/function_structure.go | 25 +++ pkg/functions/functions.go | 17 ++ pkg/functions/functions_suite_test.go | 14 +- pkg/functions/grammar_json_schema.go | 179 +++++++--------------- pkg/functions/grammar_json_schema_test.go | 51 +++--- pkg/functions/json_mode.go | 28 ++++ 8 files changed, 218 insertions(+), 153 deletions(-) create mode 100644 pkg/functions/bnf_rules.go create mode 100644 pkg/functions/function_structure.go create mode 100644 pkg/functions/json_mode.go diff --git a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index f63a9913..c7afb7bf 100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -226,9 +226,15 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup // 
Update input grammar jsStruct := funcs.ToJSONStructure(config.FunctionsConfig.FunctionNameKey, config.FunctionsConfig.FunctionNameKey) - config.Grammar = jsStruct.Grammar(config.FunctionsConfig.GrammarConfig.Options()...) + g, err := jsStruct.Grammar(config.FunctionsConfig.GrammarConfig.Options()...) + if err == nil { + config.Grammar = g + } case input.JSONFunctionGrammarObject != nil: - config.Grammar = input.JSONFunctionGrammarObject.Grammar(config.FunctionsConfig.GrammarConfig.Options()...) + g, err := input.JSONFunctionGrammarObject.Grammar(config.FunctionsConfig.GrammarConfig.Options()...) + if err == nil { + config.Grammar = g + } default: // Force picking one of the functions by the request if config.FunctionToCall() != "" { diff --git a/pkg/functions/bnf_rules.go b/pkg/functions/bnf_rules.go new file mode 100644 index 00000000..13aa3654 --- /dev/null +++ b/pkg/functions/bnf_rules.go @@ -0,0 +1,47 @@ +package functions + +import "regexp" + +var ( + PRIMITIVE_RULES = map[string]string{ + "boolean": `("true" | "false") space`, + "number": `("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? space`, + "integer": `("-"? ([0-9] | [1-9] [0-9]*)) space`, + "string": `"\"" ( + [^"\\] | + "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) + )* "\"" space`, + // TODO: we shouldn't forbid \" and \\ or all unicode and have this branch here, + // however, if we don't have it, the grammar will be ambiguous and + // empirically results are way worse. + "freestring": `( + [^\x00] | + "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) + )* space`, + "null": `"null" space`, + } + + INVALID_RULE_CHARS_RE = regexp.MustCompile(`[^a-zA-Z0-9-]+`) + GRAMMAR_LITERAL_ESCAPE_RE = regexp.MustCompile(`[\r\n"]`) + GRAMMAR_LITERAL_ESCAPES = map[string]string{ + "\r": `\r`, + "\n": `\n`, + `"`: `\"`, + } +) + +const ( + SPACE_RULE = `" "?` + + arrayNewLines = `arr ::= + "[\n" ( + realvalue + (",\n" realvalue)* + )? "]"` + + array = `arr ::= + "[" ( + realvalue + ("," realvalue)* + )? "]"` +) diff --git a/pkg/functions/function_structure.go b/pkg/functions/function_structure.go new file mode 100644 index 00000000..62cc68fa --- /dev/null +++ b/pkg/functions/function_structure.go @@ -0,0 +1,25 @@ +package functions + +import "encoding/json" + +type Item struct { + Type string `json:"type"` + Properties map[string]interface{} `json:"properties"` +} + +type JSONFunctionStructure struct { + OneOf []Item `json:"oneOf,omitempty"` + AnyOf []Item `json:"anyOf,omitempty"` + Defs map[string]interface{} `json:"$defs,omitempty"` +} + +func (j JSONFunctionStructure) Grammar(options ...func(*GrammarOption)) (string, error) { + grammarOpts := &GrammarOption{} + grammarOpts.Apply(options...) + + dat, err := json.Marshal(j) + if err != nil { + return "", err + } + return NewJSONSchemaConverter(grammarOpts.PropOrder).GrammarFromBytes(dat, options...) 
+} diff --git a/pkg/functions/functions.go b/pkg/functions/functions.go index 49e9fc93..2690b8ec 100644 --- a/pkg/functions/functions.go +++ b/pkg/functions/functions.go @@ -18,6 +18,15 @@ type Function struct { } type Functions []Function +type FunctionName struct { + Const string `json:"const"` +} + +type Argument struct { + Type string `json:"type"` + Properties map[string]interface{} `json:"properties"` +} + type Tool struct { Type string `json:"type"` Function Function `json:"function,omitempty"` @@ -86,3 +95,11 @@ func (f Functions) Select(name string) Functions { return funcs } + +func jsonString(v interface{}) (string, error) { + b, err := json.Marshal(v) + if err != nil { + return "", err + } + return string(b), nil +} diff --git a/pkg/functions/functions_suite_test.go b/pkg/functions/functions_suite_test.go index 8964b1c8..59a90ab0 100644 --- a/pkg/functions/functions_suite_test.go +++ b/pkg/functions/functions_suite_test.go @@ -1,8 +1,10 @@ -package functions +package functions_test import ( "testing" + . "github.com/mudler/LocalAI/pkg/functions" + . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" ) @@ -11,3 +13,13 @@ func TestGrammar(t *testing.T) { RegisterFailHandler(Fail) RunSpecs(t, "Grammar test suite") } + +func createFunction(field1 string, field2 string, name string, properties map[string]interface{}) map[string]interface{} { + property := map[string]interface{}{} + property[field1] = FunctionName{Const: name} + property[field2] = Argument{ + Type: "object", + Properties: properties, + } + return property +} diff --git a/pkg/functions/grammar_json_schema.go b/pkg/functions/grammar_json_schema.go index 7356d01d..5ffc0ba5 100644 --- a/pkg/functions/grammar_json_schema.go +++ b/pkg/functions/grammar_json_schema.go @@ -5,70 +5,12 @@ package functions import ( "encoding/json" "fmt" - "regexp" "sort" "strings" "github.com/mudler/LocalAI/pkg/utils" ) -const ( - JSONBNF = `root ::= object -value ::= object | array | string | number | ("true" | "false" | "null") ws - -object ::= - "{" ws ( - string ":" ws value - ("," ws string ":" ws value)* - )? "}" ws - -array ::= - "[" ws ( - value - ("," ws value)* - )? "]" ws - -string ::= - "\"" ( - [^"\\] | - "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes - )* "\"" ws - -number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws - -ws ::= ([ \t\n] ws)?` -) - -var ( - SPACE_RULE = `" "?` - - PRIMITIVE_RULES = map[string]string{ - "boolean": `("true" | "false") space`, - "number": `("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? space`, - "integer": `("-"? ([0-9] | [1-9] [0-9]*)) space`, - "string": `"\"" ( - [^"\\] | - "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) - )* "\"" space`, - // TODO: we shouldn't forbid \" and \\ or all unicode and have this branch here, - // however, if we don't have it, the grammar will be ambiguous and - // empirically results are way worse. 
- "freestring": `( - [^\x00] | - "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) - )* space`, - "null": `"null" space`, - } - - INVALID_RULE_CHARS_RE = regexp.MustCompile(`[^a-zA-Z0-9-]+`) - GRAMMAR_LITERAL_ESCAPE_RE = regexp.MustCompile(`[\r\n"]`) - GRAMMAR_LITERAL_ESCAPES = map[string]string{ - "\r": `\r`, - "\n": `\n`, - `"`: `\"`, - } -) - type JSONSchemaConverter struct { propOrder map[string]int rules map[string]string @@ -90,11 +32,15 @@ func NewJSONSchemaConverter(propOrder string) *JSONSchemaConverter { } } -func (sc *JSONSchemaConverter) formatLiteral(literal interface{}) string { - escaped := GRAMMAR_LITERAL_ESCAPE_RE.ReplaceAllStringFunc(jsonString(literal), func(match string) string { +func (sc *JSONSchemaConverter) formatLiteral(literal interface{}) (string, error) { + jLiteral, err := jsonString(literal) + if err != nil { + return "", err + } + escaped := GRAMMAR_LITERAL_ESCAPE_RE.ReplaceAllStringFunc(jLiteral, func(match string) string { return GRAMMAR_LITERAL_ESCAPES[match] }) - return fmt.Sprintf(`"%s"`, escaped) + return fmt.Sprintf(`"%s"`, escaped), nil } func (sc *JSONSchemaConverter) addRule(name, rule string) string { @@ -114,18 +60,6 @@ func (sc *JSONSchemaConverter) addRule(name, rule string) string { return key } -const arrayNewLines = `arr ::= - "[\n" ( - realvalue - (",\n" realvalue)* - )? "]"` - -const array = `arr ::= - "[" ( - realvalue - ("," realvalue)* - )? "]"` - func (sc *JSONSchemaConverter) finalizeGrammar(options ...func(*GrammarOption)) string { grammarOpts := &GrammarOption{} @@ -210,7 +144,7 @@ func (sc *JSONSchemaConverter) finalizeGrammar(options ...func(*GrammarOption)) return strings.Join(lines, "\n") } -func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, rootSchema map[string]interface{}) string { +func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, rootSchema map[string]interface{}) (string, error) { st, existType := schema["type"] var schemaType string if existType { @@ -229,31 +163,44 @@ func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, if oneOfExists { for i, altSchema := range oneOfSchemas { - alternative := sc.visit(altSchema.(map[string]interface{}), fmt.Sprintf("%s-%d", ruleName, i), rootSchema) + alternative, err := sc.visit(altSchema.(map[string]interface{}), fmt.Sprintf("%s-%d", ruleName, i), rootSchema) + if err != nil { + return "", err + } alternatives = append(alternatives, alternative) } } else if anyOfExists { for i, altSchema := range anyOfSchemas { - alternative := sc.visit(altSchema.(map[string]interface{}), fmt.Sprintf("%s-%d", ruleName, i), rootSchema) + alternative, err := sc.visit(altSchema.(map[string]interface{}), fmt.Sprintf("%s-%d", ruleName, i), rootSchema) + if err != nil { + return "", err + } alternatives = append(alternatives, alternative) } } rule := strings.Join(alternatives, " | ") - return sc.addRule(ruleName, rule) + return sc.addRule(ruleName, rule), nil } else if ref, exists := schema["$ref"].(string); exists { referencedSchema := sc.resolveReference(ref, rootSchema) return sc.visit(referencedSchema, name, rootSchema) } else if constVal, exists := schema["const"]; exists { - return sc.addRule(ruleName, sc.formatLiteral(constVal)) + literal, err := sc.formatLiteral((constVal)) + if err != nil { + return "", err + } + return sc.addRule(ruleName, literal), nil } else if enumVals, exists := schema["enum"].([]interface{}); exists { var enumRules []string for _, enumVal := range enumVals { - 
enumRule := sc.formatLiteral(enumVal) + enumRule, err := sc.formatLiteral(enumVal) + if err != nil { + return "", err + } enumRules = append(enumRules, enumRule) } rule := strings.Join(enumRules, " | ") - return sc.addRule(ruleName, rule) + return sc.addRule(ruleName, rule), nil } else if properties, exists := schema["properties"].(map[string]interface{}); schemaType == "object" && exists { propOrder := sc.propOrder var propPairs []struct { @@ -283,21 +230,30 @@ func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, for i, propPair := range propPairs { propName := propPair.propName propSchema := propPair.propSchema - propRuleName := sc.visit(propSchema, fmt.Sprintf("%s-%s", ruleName, propName), rootSchema) - + propRuleName, err := sc.visit(propSchema, fmt.Sprintf("%s-%s", ruleName, propName), rootSchema) + if err != nil { + return "", err + } + lPropName, err := sc.formatLiteral(propName) + if err != nil { + return "", err + } if i > 0 { rule.WriteString(` "," space`) } - rule.WriteString(fmt.Sprintf(` %s space ":" space %s`, sc.formatLiteral(propName), propRuleName)) + rule.WriteString(fmt.Sprintf(` %s space ":" space %s`, lPropName, propRuleName)) } rule.WriteString(` "}" space`) - return sc.addRule(ruleName, rule.String()) + return sc.addRule(ruleName, rule.String()), nil } else if items, exists := schema["items"].(map[string]interface{}); schemaType == "array" && exists { - itemRuleName := sc.visit(items, fmt.Sprintf("%s-item", ruleName), rootSchema) + itemRuleName, err := sc.visit(items, fmt.Sprintf("%s-item", ruleName), rootSchema) + if err != nil { + return "", err + } rule := fmt.Sprintf(`"[" space (%s ("," space %s)*)? "]" space`, itemRuleName, itemRuleName) - return sc.addRule(ruleName, rule) + return sc.addRule(ruleName, rule), nil } else { primitiveRule, exists := PRIMITIVE_RULES[schemaType] if !exists { @@ -306,7 +262,7 @@ func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, if ruleName == "root" { schemaType = "root" } - return sc.addRule(schemaType, primitiveRule) + return sc.addRule(schemaType, primitiveRule), nil } } func (sc *JSONSchemaConverter) resolveReference(ref string, rootSchema map[string]interface{}) map[string]interface{} { @@ -332,47 +288,20 @@ func (sc *JSONSchemaConverter) resolveReference(ref string, rootSchema map[strin return def } -func (sc *JSONSchemaConverter) Grammar(schema map[string]interface{}, options ...func(*GrammarOption)) string { +func (sc *JSONSchemaConverter) Grammar(schema map[string]interface{}, options ...func(*GrammarOption)) (string, error) { sc.addRule("freestring", PRIMITIVE_RULES["freestring"]) - sc.visit(schema, "", schema) - return sc.finalizeGrammar(options...) + _, err := sc.visit(schema, "", schema) + if err != nil { + return "", err + } + return sc.finalizeGrammar(options...), nil } -func (sc *JSONSchemaConverter) GrammarFromBytes(b []byte, options ...func(*GrammarOption)) string { +func (sc *JSONSchemaConverter) GrammarFromBytes(b []byte, options ...func(*GrammarOption)) (string, error) { var schema map[string]interface{} - _ = json.Unmarshal(b, &schema) + err := json.Unmarshal(b, &schema) + if err != nil { + return "", err + } return sc.Grammar(schema, options...) 
} - -func jsonString(v interface{}) string { - b, _ := json.Marshal(v) - return string(b) -} - -type FunctionName struct { - Const string `json:"const"` -} - -type Argument struct { - Type string `json:"type"` - Properties map[string]interface{} `json:"properties"` -} - -type Item struct { - Type string `json:"type"` - Properties map[string]interface{} `json:"properties"` -} - -type JSONFunctionStructure struct { - OneOf []Item `json:"oneOf,omitempty"` - AnyOf []Item `json:"anyOf,omitempty"` - Defs map[string]interface{} `json:"$defs,omitempty"` -} - -func (j JSONFunctionStructure) Grammar(options ...func(*GrammarOption)) string { - grammarOpts := &GrammarOption{} - grammarOpts.Apply(options...) - - dat, _ := json.Marshal(j) - return NewJSONSchemaConverter(grammarOpts.PropOrder).GrammarFromBytes(dat, options...) -} diff --git a/pkg/functions/grammar_json_schema_test.go b/pkg/functions/grammar_json_schema_test.go index bf52bd8d..56c5fe1e 100644 --- a/pkg/functions/grammar_json_schema_test.go +++ b/pkg/functions/grammar_json_schema_test.go @@ -3,22 +3,11 @@ package functions_test import ( "strings" - "github.com/mudler/LocalAI/pkg/functions" . "github.com/mudler/LocalAI/pkg/functions" . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" ) -func createFunction(field1 string, field2 string, name string, properties map[string]interface{}) map[string]interface{} { - property := map[string]interface{}{} - property[field1] = FunctionName{Const: name} - property[field2] = Argument{ - Type: "object", - Properties: properties, - } - return property -} - var testFunctions = []Item{ { Type: "object", @@ -245,7 +234,8 @@ root-1-name ::= "\"search\""` var _ = Describe("JSON schema grammar tests", func() { Context("JSON", func() { It("generates a valid grammar from JSON schema", func() { - grammar := NewJSONSchemaConverter("").GrammarFromBytes([]byte(testInput1)) + grammar, err := NewJSONSchemaConverter("").GrammarFromBytes([]byte(testInput1)) + Expect(err).To(BeNil()) results := strings.Split(inputResult1, "\n") for _, r := range results { if r != "" { @@ -255,7 +245,8 @@ var _ = Describe("JSON schema grammar tests", func() { Expect(len(results)).To(Equal(len(strings.Split(grammar, "\n")))) }) It("generates a valid grammar from JSON schema", func() { - grammar := NewJSONSchemaConverter("").GrammarFromBytes([]byte(testInput2)) + grammar, err := NewJSONSchemaConverter("").GrammarFromBytes([]byte(testInput2)) + Expect(err).To(BeNil()) results := strings.Split(inputResult3, "\n") for _, r := range results { if r != "" { @@ -269,7 +260,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctions} - grammar := structuredGrammar.Grammar() + grammar, err := structuredGrammar.Grammar() + Expect(err).To(BeNil()) results := strings.Split(inputResult1, "\n") for _, r := range results { if r != "" { @@ -283,7 +275,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctions} - grammar := structuredGrammar.Grammar(functions.EnableMaybeArray) + grammar, err := structuredGrammar.Grammar(EnableMaybeArray) + Expect(err).To(BeNil()) results := strings.Split( strings.Join([]string{ inputResult2, @@ -301,7 +294,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctionsName} - grammar := structuredGrammar.Grammar(functions.EnableMaybeArray) + grammar, err := structuredGrammar.Grammar(EnableMaybeArray) + Expect(err).To(BeNil()) 
results := strings.Split( strings.Join([]string{ inputResult4, @@ -319,10 +313,11 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctionsName} - grammar := structuredGrammar.Grammar( - functions.SetPrefix("suffix"), - functions.EnableMaybeArray, + grammar, err := structuredGrammar.Grammar( + SetPrefix("suffix"), + EnableMaybeArray, ) + Expect(err).To(BeNil()) results := strings.Split( strings.Join([]string{ rootResult(`"suffix" arr | realvalue`), @@ -339,7 +334,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctionsName} - grammar := structuredGrammar.Grammar(functions.SetPrefix("suffix")) + grammar, err := structuredGrammar.Grammar(SetPrefix("suffix")) + Expect(err).To(BeNil()) results := strings.Split( strings.Join([]string{ rootResult(`"suffix" realvalue`), @@ -356,7 +352,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctionsName} - grammar := structuredGrammar.Grammar(functions.SetPrefix("suffix"), functions.EnableMaybeString) + grammar, err := structuredGrammar.Grammar(SetPrefix("suffix"), EnableMaybeString) + Expect(err).To(BeNil()) results := strings.Split( strings.Join([]string{ rootResult(`( "suffix" realvalue | mixedstring )`), @@ -373,7 +370,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctionsName} - grammar := structuredGrammar.Grammar(functions.SetPrefix("suffix"), functions.EnableMaybeString, functions.EnableMaybeArray) + grammar, err := structuredGrammar.Grammar(SetPrefix("suffix"), EnableMaybeString, EnableMaybeArray) + Expect(err).To(BeNil()) results := strings.Split( strings.Join([]string{ rootResult(`( "suffix" (arr | realvalue) | mixedstring )`), @@ -392,7 +390,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctionsName} - grammar := structuredGrammar.Grammar(functions.EnableMaybeString, functions.EnableMaybeArray) + grammar, err := structuredGrammar.Grammar(EnableMaybeString, EnableMaybeArray) + Expect(err).To(BeNil()) results := strings.Split( strings.Join([]string{ rootResult(`mixedstring | arr | realvalue`), @@ -410,7 +409,8 @@ var _ = Describe("JSON schema grammar tests", func() { structuredGrammar := JSONFunctionStructure{ OneOf: testFunctionsName} - grammar := structuredGrammar.Grammar(functions.EnableMaybeString, functions.EnableMaybeArray, functions.NoMixedFreeString) + grammar, err := structuredGrammar.Grammar(EnableMaybeString, EnableMaybeArray, NoMixedFreeString) + Expect(err).To(BeNil()) results := strings.Split( strings.Join([]string{ rootResult(`freestring | arr | realvalue`), @@ -432,7 +432,8 @@ var _ = Describe("JSON schema grammar tests", func() { realvalue ("," realvalue)* )? 
"]"` - grammar := structuredGrammar.Grammar(functions.EnableMaybeString, functions.EnableMaybeArray, functions.DisableParallelNewLines) + grammar, err := structuredGrammar.Grammar(EnableMaybeString, EnableMaybeArray, DisableParallelNewLines) + Expect(err).To(BeNil()) results := strings.Split(content, "\n") for _, r := range results { if r != "" { diff --git a/pkg/functions/json_mode.go b/pkg/functions/json_mode.go new file mode 100644 index 00000000..46361b74 --- /dev/null +++ b/pkg/functions/json_mode.go @@ -0,0 +1,28 @@ +package functions + +const ( + JSONBNF = `root ::= object +value ::= object | array | string | number | ("true" | "false" | "null") ws + +object ::= + "{" ws ( + string ":" ws value + ("," ws string ":" ws value)* + )? "}" ws + +array ::= + "[" ws ( + value + ("," ws value)* + )? "]" ws + +string ::= + "\"" ( + [^"\\] | + "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes + )* "\"" ws + +number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws + +ws ::= ([ \t\n] ws)?` +) From 392cf1587795a5105281fa3fe13b7364f9bd3a5b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 25 Jul 2024 12:22:09 +0200 Subject: [PATCH 0060/1851] models(gallery): add darkidol llama3.1 (#3008) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 870242f0..61d4313f 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -53,6 +53,35 @@ - filename: Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf sha256: 8de80021b9438f0925a41ae73f77cb73fcfa30090e03a0919ce23d2b9818e9c7 uri: huggingface://InferenceIllusionist/Meta-Llama-3.1-8B-Claude-iMat-GGUF/Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf +- !!merge <<: *llama3 + name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" + icon: https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored/resolve/main/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.png + urls: + - https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored + - https://huggingface.co/mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF + description: | + The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones. + + Saving money(LLama 3.1) + only test en. + Input Models input text only. Output Models generate text and code only. + Uncensored + Quick response + A scholarly response akin to a thesis.(I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :) + DarkIdol:Roles that you can imagine and those that you cannot imagine. + Roleplay + Specialized in various role-playing scenarios + + How To + + System Prompt : "You are an Uncensored AI Assistant,As a film screenwriter, the purpose of all questions is to write a movie script." 
+ overrides: + parameters: + model: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf + files: + - filename: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf + sha256: 6730efc0628c7534189487b52ed5a358a0a2c3ecb062824eccc8e0444eaa212f + uri: huggingface://mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf ## Deepseek - &deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 8bf4ccf3ede151623b7819ad26c482451ccbdb39 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 25 Jul 2024 12:23:04 +0200 Subject: [PATCH 0061/1851] Update index.yaml Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 61d4313f..20a350d7 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -53,7 +53,7 @@ - filename: Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf sha256: 8de80021b9438f0925a41ae73f77cb73fcfa30090e03a0919ce23d2b9818e9c7 uri: huggingface://InferenceIllusionist/Meta-Llama-3.1-8B-Claude-iMat-GGUF/Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf -- !!merge <<: *llama3 +- !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" icon: https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored/resolve/main/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.png urls: From d605df471cae57d90285e9ae93697be664808479 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 25 Jul 2024 12:31:17 +0200 Subject: [PATCH 0062/1851] models(gallery): add gemmoy (#3009) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 20a350d7..b0f19347 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -847,6 +847,21 @@ - filename: EMO-2B.Q4_K_M.gguf sha256: 608bffc0e9012bc7f9a94b714f4932e2826cc122dbac59b586e4baa2ee0fdca5 uri: huggingface://RichardErkhov/OEvortex_-_EMO-2B-gguf/EMO-2B.Q4_K_M.gguf +- !!merge <<: *gemma + name: "gemmoy-9b-g2-mk.3-i1" + icon: https://huggingface.co/Hastagaras/G2-Gemmoy-9B-MK.3-RP/resolve/main/gemmoy.jpg + urls: + - https://huggingface.co/Hastagaras/Gemmoy-9B-G2-MK.3 + - https://huggingface.co/mradermacher/Gemmoy-9B-G2-MK.3-i1-GGUF + description: | + The Gemmoy-9B-G2-MK.3 model is a large language model trained on a variety of datasets, including grimulkan/LimaRP-augmented, LDJnr/Capybara, TheSkullery/C2logs_Filtered_Sharegpt_Merged, abacusai/SystemChat-1.1, and Hastagaras/FTTS-Stories-Sharegpt. 
+ overrides: + parameters: + model: Gemmoy-9B-G2-MK.3.i1-Q4_K_M.gguf + files: + - filename: Gemmoy-9B-G2-MK.3.i1-Q4_K_M.gguf + sha256: 0d1004a246fbda7f1408a6841129b73c4100e697bd0a6806fc698eabbb0802a1 + uri: huggingface://mradermacher/Gemmoy-9B-G2-MK.3-i1-GGUF/Gemmoy-9B-G2-MK.3.i1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From 3379c3d98c405f303c0ac013a61eb99e05c41c74 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 25 Jul 2024 19:37:15 +0200 Subject: [PATCH 0063/1851] models(gallery): add stheno Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index b0f19347..c8b361c3 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1070,6 +1070,36 @@ - filename: llama-3-stheno-mahou-8b-q4_k_m.gguf sha256: a485cd74ef4ff3671c67ed8e10ea5379a1f24082ac688bd303fd28dfc9808c11 uri: huggingface://mudler/llama-3-Stheno-Mahou-8B-Q4_K_M-GGUF/llama-3-stheno-mahou-8b-q4_k_m.gguf +- !!merge <<: *llama3 + name: "l3-8b-stheno-horny-v3.3-32k-q5_k_m" + urls: + - https://huggingface.co/nothingiisreal/L3-8B-Stheno-Horny-v3.3-32K + - https://huggingface.co/Kurgan1138/L3-8B-Stheno-Horny-v3.3-32K-Q5_K_M-GGUF + description: | + This was an experiment to see if aligning other models via LORA is possible. Yes it is. We aligned it to be always horny. + + We took V3.3 Stheno weights from here + + And applied our lora at Alpha = 768 + + Thank you to Sao10K for the amazing model. + + This is not legal advice. I don't put any extra licensing on my own lora. + + LLaMA 3 license may conflict with Creative Commons Attribution Non Commercial 4.0. + + LLaMA 3 license can be found here + + If you want to host a model using our lora, you have our permission, but you might consider getting Sao's permission if you want to host their model. + + Again, not legal advice. 
+  overrides:
+    parameters:
+      model: l3-8b-stheno-horny-v3.3-32k-q5_k_m.gguf
+    files:
+      - filename: l3-8b-stheno-horny-v3.3-32k-q5_k_m.gguf
+        sha256: 8d934f80ca6dbaa4852846108da92446a26715fbd5f6fc3859568850edf05262
+        uri: huggingface://Kurgan1138/L3-8B-Stheno-Horny-v3.3-32K-Q5_K_M-GGUF/l3-8b-stheno-horny-v3.3-32k-q5_k_m.gguf
 - !!merge <<: *llama3
   name: "llama-3-8b-openhermes-dpo"
   urls:

From 43f49533e829e0d16a8bbb56eccad28616a4705f Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Thu, 25 Jul 2024 19:37:35 +0200
Subject: [PATCH 0064/1851] chore: add function calling template for llama 3.1
 models (#3010)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml             |  2 +-
 gallery/llama3.1-instruct.yaml | 62 ++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 gallery/llama3.1-instruct.yaml

diff --git a/gallery/index.yaml b/gallery/index.yaml
index c8b361c3..3648c2d8 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -1,7 +1,7 @@
 ---
 ## LLama3.1
 - &llama31
-  url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master"
+  url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master"
   icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png
   name: "meta-llama-3.1-8b-instruct"
   license: llama3.1
diff --git a/gallery/llama3.1-instruct.yaml b/gallery/llama3.1-instruct.yaml
new file mode 100644
index 00000000..66c9ce97
--- /dev/null
+++ b/gallery/llama3.1-instruct.yaml
@@ -0,0 +1,62 @@
+---
+name: "llama3-instruct"
+
+config_file: |
+  mmap: true
+  function:
+    disable_no_action: true
+    grammar:
+      disable: true
+    response_regex:
+    - <function=(?P<name>\w+)>(?P<arguments>.*)</function>
+  template:
+    chat_message: |
+      <|start_header_id|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}<|end_header_id|>
+
+      {{ if .FunctionCall -}}
+      Function call:
+      {{ else if eq .RoleName "tool" -}}
+      Function response:
+      {{ end -}}
+      {{ if .Content -}}
+      {{.Content -}}
+      {{ else if .FunctionCall -}}
+      {{ toJson .FunctionCall -}}
+      {{ end -}}
+      <|eot_id|>
+    function: |
+      <|start_header_id|>system<|end_header_id|>
+
+      You have access to the following functions:
+
+      {{range .Functions}}
+      Use the function '{{.Name}}' to '{{.Description}}'
+      {{toJson .Parameters}}
+      {{end}}
+
+      Think very carefully before calling functions.
+      If you choose to call a function ONLY reply in the following format with no prefix or suffix:
+
+      <function=example_function_name>{{`{{"example_name": "example_value"}}`}}</function>
+
+      Reminder:
+      - If looking for real time information use relevant functions before falling back to searching on internet
+      - Function calls MUST follow the specified format, start with <function= and end with </function>
+      - Required parameters MUST be specified
+      - Only call one function at a time
+      - Put the entire function call reply on one line
+      <|eot_id|>
+      {{.Input }}
+      <|start_header_id|>assistant<|end_header_id|>
+    chat: |
+      <|begin_of_text|>{{.Input }}
+      <|start_header_id|>assistant<|end_header_id|>
+    completion: |
+      {{.Input}}
+  context_size: 8192
+  f16: true
+  stopwords:
+  - <|im_end|>
+  - <dummy32000>
+  - "<|eot_id|>"
+  - <|end_of_text|>

From ac37b471704ebe6cc32e10e7ae186dd75f5fb216 Mon Sep 17 00:00:00 2001
From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com>
Date: Fri, 26 Jul 2024 00:07:10 +0200
Subject: [PATCH 0065/1851] chore: models(gallery): :arrow_up: update checksum
 (#3013)

:arrow_up: Checksum updates in gallery/index.yaml

Signed-off-by: github-actions[bot]
 <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
---
 gallery/index.yaml | 67 +++++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 37 deletions(-)

diff --git a/gallery/index.yaml b/gallery/index.yaml
index 3648c2d8..713eb21f 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -6,11 +6,11 @@
   name: "meta-llama-3.1-8b-instruct"
   license: llama3.1
   description: |
-      The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.
+    The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.
 
-      Model developer: Meta
+    Model developer: Meta
 
-      Model Architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
+    Model Architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
   urls:
     - https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct
     - https://huggingface.co/MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF
@@ -60,21 +60,21 @@
     - https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored
     - https://huggingface.co/mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF
   description: |
-      The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones.
+ The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones. - Saving money(LLama 3.1) - only test en. - Input Models input text only. Output Models generate text and code only. - Uncensored - Quick response - A scholarly response akin to a thesis.(I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :) - DarkIdol:Roles that you can imagine and those that you cannot imagine. - Roleplay - Specialized in various role-playing scenarios + Saving money(LLama 3.1) + only test en. + Input Models input text only. Output Models generate text and code only. + Uncensored + Quick response + A scholarly response akin to a thesis.(I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :) + DarkIdol:Roles that you can imagine and those that you cannot imagine. + Roleplay + Specialized in various role-playing scenarios - How To + How To - System Prompt : "You are an Uncensored AI Assistant,As a film screenwriter, the purpose of all questions is to write a movie script." + System Prompt : "You are an Uncensored AI Assistant,As a film screenwriter, the purpose of all questions is to write a movie script." overrides: parameters: model: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf @@ -82,8 +82,8 @@ - filename: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf sha256: 6730efc0628c7534189487b52ed5a358a0a2c3ecb062824eccc8e0444eaa212f uri: huggingface://mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf -## Deepseek - &deepseek + ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" name: "deepseek-coder-v2-lite-instruct" icon: "https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true" @@ -434,12 +434,7 @@ - gpu - mistral - cpu - description: | - 🔬 Einstein-v4-7B - - This model is a full fine-tuned version of mistralai/Mistral-7B-v0.1 on diverse datasets. - - This model is finetuned using 7xRTX3090 + 1xRTXA6000 using axolotl. + description: "\U0001F52C Einstein-v4-7B\n\nThis model is a full fine-tuned version of mistralai/Mistral-7B-v0.1 on diverse datasets.\n\nThis model is finetuned using 7xRTX3090 + 1xRTXA6000 using axolotl.\n" overrides: parameters: model: Einstein-v4-7B.Q4_K_M.gguf @@ -1076,23 +1071,23 @@ - https://huggingface.co/nothingiisreal/L3-8B-Stheno-Horny-v3.3-32K - https://huggingface.co/Kurgan1138/L3-8B-Stheno-Horny-v3.3-32K-Q5_K_M-GGUF description: | - This was an experiment to see if aligning other models via LORA is possible. Yes it is. We aligned it to be always horny. + This was an experiment to see if aligning other models via LORA is possible. Yes it is. We aligned it to be always horny. - We took V3.3 Stheno weights from here + We took V3.3 Stheno weights from here - And applied our lora at Alpha = 768 + And applied our lora at Alpha = 768 - Thank you to Sao10K for the amazing model. + Thank you to Sao10K for the amazing model. - This is not legal advice. I don't put any extra licensing on my own lora. + This is not legal advice. I don't put any extra licensing on my own lora. - LLaMA 3 license may conflict with Creative Commons Attribution Non Commercial 4.0. + LLaMA 3 license may conflict with Creative Commons Attribution Non Commercial 4.0. 
- LLaMA 3 license can be found here + LLaMA 3 license can be found here - If you want to host a model using our lora, you have our permission, but you might consider getting Sao's permission if you want to host their model. + If you want to host a model using our lora, you have our permission, but you might consider getting Sao's permission if you want to host their model. - Again, not legal advice. + Again, not legal advice. overrides: parameters: model: l3-8b-stheno-horny-v3.3-32k-q5_k_m.gguf @@ -3151,7 +3146,6 @@ - filename: ArliAI-Llama-3-8B-Dolfin-v0.5.Q4_K_M.gguf sha256: 71fef02915c606b438ccff2cae6b7760bbb54a558d5f2d39c2421d97b6682fea uri: huggingface://QuantFactory/ArliAI-Llama-3-8B-Dolfin-v0.5-GGUF/ArliAI-Llama-3-8B-Dolfin-v0.5.Q4_K_M.gguf - - !!merge <<: *llama3 name: "llama-3-ezo-8b-common-it" icon: https://huggingface.co/HODACHI/Llama-3-EZO-8b-Common-it @@ -3159,11 +3153,11 @@ - https://huggingface.co/HODACHI/Llama-3-EZO-8b-Common-it - https://huggingface.co/MCZK/Llama-3-EZO-8b-Common-it-GGUF description: | - Based on meta-llama/Meta-Llama-3-8B-Instruct, it has been enhanced for Japanese usage through additional pre-training and instruction tuning. (Built with Meta Llama3) + Based on meta-llama/Meta-Llama-3-8B-Instruct, it has been enhanced for Japanese usage through additional pre-training and instruction tuning. (Built with Meta Llama3) - This model is based on Llama-3-8B-Instruct and is subject to the Llama-3 Terms of Use. For detailed information, please refer to the official Llama-3 license page. + This model is based on Llama-3-8B-Instruct and is subject to the Llama-3 Terms of Use. For detailed information, please refer to the official Llama-3 license page. - このモデルはLlama-3-8B-Instructをベースにしており、Llama-3の利用規約に従います。詳細については、Llama-3の公式ライセンスページをご参照ください。 + このモデルはLlama-3-8B-Instructをベースにしており、Llama-3の利用規約に従います。詳細については、Llama-3の公式ライセンスページをご参照ください。 overrides: parameters: model: Llama-3-EZO-8b-Common-it.Q4_K_M.iMatrix.gguf @@ -3292,7 +3286,6 @@ - filename: L3-15B-MythicalMaid-t0.0001.Q4_K_M.gguf sha256: ecbd57783006f1a027f8a7f5a5d551dc8b3568912825f566d79fd34a804e8970 uri: huggingface://mradermacher/L3-15B-MythicalMaid-t0.0001-GGUF/L3-15B-MythicalMaid-t0.0001.Q4_K_M.gguf - - !!merge <<: *llama3 name: "l3-15b-etherealmaid-t0.0001-i1" icon: https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/FwYXt2h_FdmlL0Z6qYufz.png @@ -3656,8 +3649,8 @@ model: Phi-3.1-mini-4k-instruct-Q4_K_M.gguf files: - filename: Phi-3.1-mini-4k-instruct-Q4_K_M.gguf - sha256: 39458b227a4be763b7eb39d306d240c3d45205e3f8b474ec7bdca7bba0158e69 uri: huggingface://bartowski/Phi-3.1-mini-4k-instruct-GGUF/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf + sha256: d6d25bf078321bea4a079c727b273cb0b5a2e0b4cf3add0f7a2c8e43075c414f - !!merge <<: *phi-3 name: "phillama-3.8b-v0.1" icon: https://cdn-uploads.huggingface.co/production/uploads/657eb5b256c9c67605a6e8b5/f96pPiJQb3puzbPYNknG2.png From 868182bc3881c67c86348422aafce3a1f60718ab Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 26 Jul 2024 00:28:34 +0200 Subject: [PATCH 0066/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3012) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index f1862aef..c6028aa7 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ 
DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=68504f0970db5a3602d176953690f503059906b1 +CPPLLAMA_VERSION?=4226a8d10e3904db3a1297919fe6c7f06beba6c0 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From fee52942ebd4542572b86f402778d0a174e6bac2 Mon Sep 17 00:00:00 2001 From: Dave Date: Fri, 26 Jul 2024 02:46:57 -0400 Subject: [PATCH 0067/1851] fix: PR title tag for checksum checker script workflow (#3014) * fix PR title tag for checksum checker script workflow Signed-off-by: Dave Lee * Update .github/workflows/checksum_checker.yaml Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Dave Lee Signed-off-by: Ettore Di Giacinto Co-authored-by: Ettore Di Giacinto --- .github/workflows/checksum_checker.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/checksum_checker.yaml b/.github/workflows/checksum_checker.yaml index b76b7aff..4f95a4e2 100644 --- a/.github/workflows/checksum_checker.yaml +++ b/.github/workflows/checksum_checker.yaml @@ -41,7 +41,7 @@ jobs: token: ${{ secrets.UPDATE_BOT_TOKEN }} push-to-fork: ci-forks/LocalAI commit-message: ':arrow_up: Checksum updates in gallery/index.yaml' - title: 'models(gallery): :arrow_up: update checksum' + title: 'chore(model-gallery): :arrow_up: update checksum' branch: "update/checksum" body: Updating checksums in gallery/index.yaml signoff: true From 2169c3497d9b4fb4cfe13716844ea842728e0b11 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 26 Jul 2024 20:11:29 +0200 Subject: [PATCH 0068/1851] feat(grammar): add llama3.1 schema (#3015) * wip Signed-off-by: Ettore Di Giacinto * get rid of panics Signed-off-by: Ettore Di Giacinto * expose it properly from the config Signed-off-by: Ettore Di Giacinto * Simplify Signed-off-by: Ettore Di Giacinto * forgot to commit Signed-off-by: Ettore Di Giacinto * Remove focus on test Signed-off-by: Ettore Di Giacinto * Small fixups Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- core/http/endpoints/openai/chat.go | 4 +- pkg/functions/function_structure.go | 26 +- pkg/functions/functions.go | 8 - pkg/functions/functions_suite_test.go | 16 +- pkg/functions/{ => grammars}/bnf_rules.go | 15 +- pkg/functions/grammars/grammars_suite_test.go | 25 ++ .../json_schema.go} | 113 +------ .../json_schema_test.go} | 3 +- pkg/functions/grammars/llama31_schema.go | 281 ++++++++++++++++++ pkg/functions/grammars/llama31_schema_test.go | 76 +++++ pkg/functions/{ => grammars}/options.go | 17 +- pkg/functions/grammars/rules.go | 93 ++++++ pkg/functions/grammars/types.go | 33 ++ pkg/functions/parse.go | 47 ++- 14 files changed, 609 insertions(+), 148 deletions(-) rename pkg/functions/{ => grammars}/bnf_rules.go (85%) create mode 100644 pkg/functions/grammars/grammars_suite_test.go rename pkg/functions/{grammar_json_schema.go => grammars/json_schema.go} (67%) rename pkg/functions/{grammar_json_schema_test.go => grammars/json_schema_test.go} (99%) create mode 100644 pkg/functions/grammars/llama31_schema.go create mode 100644 pkg/functions/grammars/llama31_schema_test.go rename pkg/functions/{ => grammars}/options.go (76%) create mode 100644 pkg/functions/grammars/rules.go create mode 100644 pkg/functions/grammars/types.go diff --git a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index c7afb7bf..86b75601 100644 --- a/core/http/endpoints/openai/chat.go +++ 
b/core/http/endpoints/openai/chat.go @@ -226,12 +226,12 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup // Update input grammar jsStruct := funcs.ToJSONStructure(config.FunctionsConfig.FunctionNameKey, config.FunctionsConfig.FunctionNameKey) - g, err := jsStruct.Grammar(config.FunctionsConfig.GrammarConfig.Options()...) + g, err := jsStruct.Grammar(config.FunctionsConfig.GrammarOptions()...) if err == nil { config.Grammar = g } case input.JSONFunctionGrammarObject != nil: - g, err := input.JSONFunctionGrammarObject.Grammar(config.FunctionsConfig.GrammarConfig.Options()...) + g, err := input.JSONFunctionGrammarObject.Grammar(config.FunctionsConfig.GrammarOptions()...) if err == nil { config.Grammar = g } diff --git a/pkg/functions/function_structure.go b/pkg/functions/function_structure.go index 62cc68fa..c4337d67 100644 --- a/pkg/functions/function_structure.go +++ b/pkg/functions/function_structure.go @@ -1,6 +1,10 @@ package functions -import "encoding/json" +import ( + "encoding/json" + + "github.com/mudler/LocalAI/pkg/functions/grammars" +) type Item struct { Type string `json:"type"` @@ -13,13 +17,27 @@ type JSONFunctionStructure struct { Defs map[string]interface{} `json:"$defs,omitempty"` } -func (j JSONFunctionStructure) Grammar(options ...func(*GrammarOption)) (string, error) { - grammarOpts := &GrammarOption{} +func (j JSONFunctionStructure) Grammar(options ...func(*grammars.GrammarOption)) (string, error) { + grammarOpts := &grammars.GrammarOption{} grammarOpts.Apply(options...) dat, err := json.Marshal(j) if err != nil { return "", err } - return NewJSONSchemaConverter(grammarOpts.PropOrder).GrammarFromBytes(dat, options...) + + converter := NewSchemaConverter(*grammarOpts) + return converter.GrammarFromBytes(dat, options...) +} + +type SchemaConverter interface { + GrammarFromBytes([]byte, ...func(*grammars.GrammarOption)) (string, error) +} + +func NewSchemaConverter(opt grammars.GrammarOption) SchemaConverter { + switch { + case opt.SchemaType == grammars.LLama31Schema: + return grammars.NewLLama31SchemaConverter(opt.FunctionName) + } + return grammars.NewJSONSchemaConverter(opt.PropOrder) } diff --git a/pkg/functions/functions.go b/pkg/functions/functions.go index 2690b8ec..19012d53 100644 --- a/pkg/functions/functions.go +++ b/pkg/functions/functions.go @@ -95,11 +95,3 @@ func (f Functions) Select(name string) Functions { return funcs } - -func jsonString(v interface{}) (string, error) { - b, err := json.Marshal(v) - if err != nil { - return "", err - } - return string(b), nil -} diff --git a/pkg/functions/functions_suite_test.go b/pkg/functions/functions_suite_test.go index 59a90ab0..ab743609 100644 --- a/pkg/functions/functions_suite_test.go +++ b/pkg/functions/functions_suite_test.go @@ -3,23 +3,11 @@ package functions_test import ( "testing" - . "github.com/mudler/LocalAI/pkg/functions" - . "github.com/onsi/ginkgo/v2" . 
"github.com/onsi/gomega" ) -func TestGrammar(t *testing.T) { +func TestFunctions(t *testing.T) { RegisterFailHandler(Fail) - RunSpecs(t, "Grammar test suite") -} - -func createFunction(field1 string, field2 string, name string, properties map[string]interface{}) map[string]interface{} { - property := map[string]interface{}{} - property[field1] = FunctionName{Const: name} - property[field2] = Argument{ - Type: "object", - Properties: properties, - } - return property + RunSpecs(t, "Functions test suite") } diff --git a/pkg/functions/bnf_rules.go b/pkg/functions/grammars/bnf_rules.go similarity index 85% rename from pkg/functions/bnf_rules.go rename to pkg/functions/grammars/bnf_rules.go index 13aa3654..469e187a 100644 --- a/pkg/functions/bnf_rules.go +++ b/pkg/functions/grammars/bnf_rules.go @@ -1,6 +1,9 @@ -package functions +package grammars -import "regexp" +import ( + "encoding/json" + "regexp" +) var ( PRIMITIVE_RULES = map[string]string{ @@ -45,3 +48,11 @@ const ( ("," realvalue)* )? "]"` ) + +func jsonString(v interface{}) (string, error) { + b, err := json.Marshal(v) + if err != nil { + return "", err + } + return string(b), nil +} diff --git a/pkg/functions/grammars/grammars_suite_test.go b/pkg/functions/grammars/grammars_suite_test.go new file mode 100644 index 00000000..5ac02bc1 --- /dev/null +++ b/pkg/functions/grammars/grammars_suite_test.go @@ -0,0 +1,25 @@ +package grammars_test + +import ( + "testing" + + . "github.com/mudler/LocalAI/pkg/functions" + + . "github.com/onsi/ginkgo/v2" + . "github.com/onsi/gomega" +) + +func TestGrammar(t *testing.T) { + RegisterFailHandler(Fail) + RunSpecs(t, "Grammar test suite") +} + +func createFunction(field1 string, field2 string, name string, properties map[string]interface{}) map[string]interface{} { + property := map[string]interface{}{} + property[field1] = FunctionName{Const: name} + property[field2] = Argument{ + Type: "object", + Properties: properties, + } + return property +} diff --git a/pkg/functions/grammar_json_schema.go b/pkg/functions/grammars/json_schema.go similarity index 67% rename from pkg/functions/grammar_json_schema.go rename to pkg/functions/grammars/json_schema.go index 5ffc0ba5..df4ca6a1 100644 --- a/pkg/functions/grammar_json_schema.go +++ b/pkg/functions/grammars/json_schema.go @@ -1,4 +1,4 @@ -package functions +package grammars // a golang port of https://github.com/ggerganov/llama.cpp/pull/1887 @@ -7,13 +7,11 @@ import ( "fmt" "sort" "strings" - - "github.com/mudler/LocalAI/pkg/utils" ) type JSONSchemaConverter struct { propOrder map[string]int - rules map[string]string + rules Rules } func NewJSONSchemaConverter(propOrder string) *JSONSchemaConverter { @@ -60,90 +58,6 @@ func (sc *JSONSchemaConverter) addRule(name, rule string) string { return key } -func (sc *JSONSchemaConverter) finalizeGrammar(options ...func(*GrammarOption)) string { - - grammarOpts := &GrammarOption{} - grammarOpts.Apply(options...) - - prefix := grammarOpts.Prefix - maybeArray := grammarOpts.MaybeArray - disableParallelNewLines := grammarOpts.DisableParallelNewLines - maybeString := grammarOpts.MaybeString - noMixedFreeString := grammarOpts.NoMixedFreeString - - var lines []string - - swapRoot := maybeArray || maybeString || prefix != "" - - // write down the computed rules. 
- // if maybeArray is true, we need to add the array rule and slightly tweak the root rule - for name, rule := range sc.rules { - if swapRoot && name == "root" { - name = "realvalue" - } - lines = append(lines, fmt.Sprintf("%s ::= %s", name, rule)) - } - - if !swapRoot { - return strings.Join(lines, "\n") - } - - newRoot := "realvalue" - if maybeArray { - newRoot = "arr | realvalue" - } - - freestringRule := "mixedstring" - if noMixedFreeString { - freestringRule = "freestring" - } - - if prefix != "" { - // quote newlines in suffix - prefix = utils.EscapeNewLines(prefix) - - if maybeArray && maybeString { - newRoot = "(" + newRoot + ")" - } - - if maybeString { - //newRoot = "( (\"" + suffix + "\" " + newRoot + ") | freestring ) " - newRoot = "( \"" + prefix + "\" " + newRoot + " | " + freestringRule + " ) " - } else { - newRoot = "\"" + prefix + "\" " + "" + newRoot + "" - } - } else if maybeString { - if maybeArray { - // newRoot = "(" + newRoot + ")" - } - - newRoot = freestringRule + " | " + newRoot - } - - lines = append(lines, fmt.Sprintf("%s ::= %s", "root", newRoot)) - if disableParallelNewLines { - lines = append(lines, array) - } else { - lines = append(lines, arrayNewLines) - } - - if maybeArray { - if grammarOpts.ExpectStringsAfterJSON { - lines = append(lines, `mixedstring ::= freestring | freestring arr freestring | (freestring realvalue freestring)* | realvalue | arr`) - } else { - lines = append(lines, `mixedstring ::= freestring | freestring arr | freestring realvalue | realvalue | arr`) - } - } else { - if grammarOpts.ExpectStringsAfterJSON { - lines = append(lines, `mixedstring ::= freestring | (freestring realvalue freestring)* | realvalue`) - } else { - lines = append(lines, `mixedstring ::= freestring | freestring realvalue | realvalue`) - } - } - - return strings.Join(lines, "\n") -} - func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, rootSchema map[string]interface{}) (string, error) { st, existType := schema["type"] var schemaType string @@ -182,7 +96,10 @@ func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, rule := strings.Join(alternatives, " | ") return sc.addRule(ruleName, rule), nil } else if ref, exists := schema["$ref"].(string); exists { - referencedSchema := sc.resolveReference(ref, rootSchema) + referencedSchema, err := sc.resolveReference(ref, rootSchema) + if err != nil { + return "", err + } return sc.visit(referencedSchema, name, rootSchema) } else if constVal, exists := schema["const"]; exists { literal, err := sc.formatLiteral((constVal)) @@ -257,7 +174,7 @@ func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, } else { primitiveRule, exists := PRIMITIVE_RULES[schemaType] if !exists { - panic(fmt.Sprintf("Unrecognized schema: %v", schema)) + return "", fmt.Errorf("unrecognized schema: %v", schema) } if ruleName == "root" { schemaType = "root" @@ -265,27 +182,23 @@ func (sc *JSONSchemaConverter) visit(schema map[string]interface{}, name string, return sc.addRule(schemaType, primitiveRule), nil } } -func (sc *JSONSchemaConverter) resolveReference(ref string, rootSchema map[string]interface{}) map[string]interface{} { +func (sc *JSONSchemaConverter) resolveReference(ref string, rootSchema map[string]interface{}) (map[string]interface{}, error) { if !strings.HasPrefix(ref, "#/$defs/") { - panic(fmt.Sprintf("Invalid reference format: %s", ref)) + return nil, fmt.Errorf("invalid reference format: %s", ref) } defKey := strings.TrimPrefix(ref, "#/$defs/") definitions, 
exists := rootSchema["$defs"].(map[string]interface{}) if !exists { - fmt.Println(rootSchema) - - panic("No definitions found in the schema") + return nil, fmt.Errorf("no definitions found in the schema: %s", rootSchema) } def, exists := definitions[defKey].(map[string]interface{}) if !exists { - fmt.Println(definitions) - - panic(fmt.Sprintf("Definition not found: %s", defKey)) + return nil, fmt.Errorf("definition not found: %s %+v", defKey, definitions) } - return def + return def, nil } func (sc *JSONSchemaConverter) Grammar(schema map[string]interface{}, options ...func(*GrammarOption)) (string, error) { @@ -294,7 +207,7 @@ func (sc *JSONSchemaConverter) Grammar(schema map[string]interface{}, options .. if err != nil { return "", err } - return sc.finalizeGrammar(options...), nil + return sc.rules.ToGrammar(options...), nil } func (sc *JSONSchemaConverter) GrammarFromBytes(b []byte, options ...func(*GrammarOption)) (string, error) { diff --git a/pkg/functions/grammar_json_schema_test.go b/pkg/functions/grammars/json_schema_test.go similarity index 99% rename from pkg/functions/grammar_json_schema_test.go rename to pkg/functions/grammars/json_schema_test.go index 56c5fe1e..5fc4a602 100644 --- a/pkg/functions/grammar_json_schema_test.go +++ b/pkg/functions/grammars/json_schema_test.go @@ -1,9 +1,10 @@ -package functions_test +package grammars_test import ( "strings" . "github.com/mudler/LocalAI/pkg/functions" + . "github.com/mudler/LocalAI/pkg/functions/grammars" . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" ) diff --git a/pkg/functions/grammars/llama31_schema.go b/pkg/functions/grammars/llama31_schema.go new file mode 100644 index 00000000..04b74aa5 --- /dev/null +++ b/pkg/functions/grammars/llama31_schema.go @@ -0,0 +1,281 @@ +package grammars + +import ( + "encoding/json" + "fmt" + "regexp" + "sort" + "strings" +) + +type LLama31SchemaConverter struct { + fnName string + rules Rules +} + +func NewLLama31SchemaConverter(fnName string) *LLama31SchemaConverter { + rules := make(map[string]string) + rules["space"] = SPACE_RULE + if fnName == "" { + fnName = "name" + } + + return &LLama31SchemaConverter{ + rules: rules, + fnName: fnName, + } +} + +var GRAMMAR_LITERAL_ESCAPESLlama = map[string]string{ + "\r": `\r`, + "\n": `\n`, +} + +var GRAMMAR_LITERAL_ESCAPE_RELlama = regexp.MustCompile(`[\r\n]`) + +func (sc *LLama31SchemaConverter) formatLiteral(literal interface{}) (string, error) { + jLiteral, err := jsonString(literal) + if err != nil { + return "", err + } + escaped := GRAMMAR_LITERAL_ESCAPE_RELlama.ReplaceAllStringFunc(jLiteral, func(match string) string { + return GRAMMAR_LITERAL_ESCAPESLlama[match] + }) + return escaped, nil +} + +func (sc *LLama31SchemaConverter) formatLiteralQuoted(literal interface{}) (string, error) { + jLiteral, err := jsonString(literal) + if err != nil { + return "", err + } + escaped := GRAMMAR_LITERAL_ESCAPE_RE.ReplaceAllStringFunc(jLiteral, func(match string) string { + return GRAMMAR_LITERAL_ESCAPES[match] + }) + return fmt.Sprintf(`"%s"`, escaped), nil +} + +func (sc *LLama31SchemaConverter) addRule(name, rule string) string { + escName := INVALID_RULE_CHARS_RE.ReplaceAllString(name, "-") + key := escName + if existingRule, ok := sc.rules[escName]; ok && existingRule != rule { + i := 0 + for { + key = fmt.Sprintf("%s%d", escName, i) + if _, ok := sc.rules[key]; !ok { + break + } + i++ + } + } + sc.rules[key] = rule + return key +} + +func (sc *LLama31SchemaConverter) visit(schema map[string]interface{}, name string, rootSchema 
map[string]interface{}) (string, error) {
+	st, existType := schema["type"]
+	var schemaType string
+	if existType {
+		schemaType = st.(string)
+	}
+	ruleName := name
+	if name == "" {
+		ruleName = "root"
+	}
+	_, oneOfExists := schema["oneOf"]
+	_, anyOfExists := schema["anyOf"]
+	if oneOfExists || anyOfExists {
+		var alternatives []string
+		oneOfSchemas, oneOfExists := schema["oneOf"].([]interface{})
+		anyOfSchemas, anyOfExists := schema["anyOf"].([]interface{})
+
+		if oneOfExists {
+			for i, altSchema := range oneOfSchemas {
+				alternative, err := sc.visit(altSchema.(map[string]interface{}), fmt.Sprintf("%s-%d", ruleName, i), rootSchema)
+				if err != nil {
+					return "", err
+				}
+				alternatives = append(alternatives, alternative)
+			}
+		} else if anyOfExists {
+			for i, altSchema := range anyOfSchemas {
+				alternative, err := sc.visit(altSchema.(map[string]interface{}), fmt.Sprintf("%s-%d", ruleName, i), rootSchema)
+				if err != nil {
+					return "", err
+				}
+				alternatives = append(alternatives, alternative)
+			}
+		}
+
+		rule := strings.Join(alternatives, " | ")
+		return sc.addRule(ruleName, rule), nil
+	} else if ref, exists := schema["$ref"].(string); exists {
+		referencedSchema, err := sc.resolveReference(ref, rootSchema)
+		if err != nil {
+			return "", err
+		}
+		return sc.visit(referencedSchema, name, rootSchema)
+	} else if constVal, exists := schema["const"]; exists {
+
+		literal, err := sc.formatLiteral((constVal))
+		if err != nil {
+			return "", err
+		}
+		return sc.addRule(ruleName, literal), nil
+	} else if enumVals, exists := schema["enum"].([]interface{}); exists {
+		var enumRules []string
+		for _, enumVal := range enumVals {
+			enumRule, err := sc.formatLiteralQuoted(enumVal)
+			if err != nil {
+				return "", err
+			}
+			enumRules = append(enumRules, enumRule)
+		}
+		rule := strings.Join(enumRules, " | ")
+		return sc.addRule(ruleName, rule), nil
+	} else if properties, exists := schema["properties"].(map[string]interface{}); schemaType == "object" && exists {
+		baseProperty := false
+		depth := strings.Split(name, "-")
+		if len(depth) == 2 {
+			baseProperty = true
+		}
+		type propData []struct {
+			propName   string
+			propSchema map[string]interface{}
+		}
+		var propPairs propData
+
+		for propName, propSchema := range properties {
+			propPairs = append(propPairs, struct {
+				propName   string
+				propSchema map[string]interface{}
+			}{propName: propName, propSchema: propSchema.(map[string]interface{})})
+		}
+
+		sort.Slice(propPairs, func(i, j int) bool {
+			return propPairs[i].propName < propPairs[j].propName
+		})
+
+		var rule strings.Builder
+		if baseProperty {
+			rule.WriteString(`"<function=" `)
+			fnSchema, ok := properties[sc.fnName].(map[string]interface{})
+			if !ok {
+				return "", fmt.Errorf("no function name property found in schema: %v", schema)
+			}
+			fnRuleName, err := sc.visit(fnSchema, fmt.Sprintf("%s-%s", ruleName, sc.fnName), rootSchema)
+			if err != nil {
+				return "", err
+			}
+			rule.WriteString(fmt.Sprintf(`%s ">{" `, fnRuleName))
+
+			for _, propPair := range propPairs {
+				propName := propPair.propName
+				if propName == sc.fnName {
+					continue
+				}
+				propSchema := propPair.propSchema
+				propRuleName, err := sc.visit(propSchema, fmt.Sprintf("%s-%s", ruleName, propName), rootSchema)
+				if err != nil {
+					return "", err
+				}
+
+				rule.WriteString(propRuleName)
+			}
+
+			rule.WriteString(` "}</function>"`)
+
+		} else {
+			for i, propPair := range propPairs {
+				propName := propPair.propName
+				propSchema := propPair.propSchema
+				propRuleName, err := sc.visit(propSchema, fmt.Sprintf("%s-%s", ruleName, propName), rootSchema)
+				if err != nil {
+					return "", err
+				}
+				lPropName, err := sc.formatLiteralQuoted(propName)
+				if err != nil {
+					return "", err
+				}
+				if i > 0 {
+					rule.WriteString(` "," space`)
+				}
+
+				rule.WriteString(fmt.Sprintf(` %s space ":" space %s`, lPropName, propRuleName))
+			}
+
+		}
+
+		if !baseProperty {
+			rule.WriteString(` "}" space`)
+		}
+
+		return sc.addRule(ruleName, rule.String()), nil
+	} else if items, exists := schema["items"].(map[string]interface{}); schemaType == "array" && exists {
+		itemRuleName, err := sc.visit(items, fmt.Sprintf("%s-item", ruleName), rootSchema)
+		if err != nil {
+			return "", err
+		}
+		rule := fmt.Sprintf(`"[" space (%s ("," space %s)*)? "]" space`, itemRuleName, itemRuleName)
+		return sc.addRule(ruleName, rule), nil
+	} else {
+		primitiveRule, exists := PRIMITIVE_RULES[schemaType]
+		if !exists {
+			return "", fmt.Errorf("unrecognized schema: %v", schema)
+		}
+		if ruleName == "root" {
+			schemaType = "root"
+		}
+		return sc.addRule(schemaType, primitiveRule), nil
+	}
+}
+func (sc *LLama31SchemaConverter) resolveReference(ref string, rootSchema map[string]interface{}) (map[string]interface{}, error) {
+	if !strings.HasPrefix(ref, "#/$defs/") {
+		return nil, fmt.Errorf("invalid reference format: %s", ref)
+	}
+
+	defKey := strings.TrimPrefix(ref, "#/$defs/")
+	definitions, exists := rootSchema["$defs"].(map[string]interface{})
+	if !exists {
+		return nil, fmt.Errorf("no definitions found in the schema: %s", rootSchema)
+	}
+
+	def, exists := definitions[defKey].(map[string]interface{})
+	if !exists {
+		return nil, fmt.Errorf("definition not found: %s %+v", defKey, definitions)
+	}
+
+	return def, nil
+}
+
+func (sc *LLama31SchemaConverter) Grammar(schema map[string]interface{}, options ...func(*GrammarOption)) (string, error) {
+	sc.addRule("freestring", PRIMITIVE_RULES["freestring"])
+	_, err := sc.visit(schema, "", schema)
+	if err != nil {
+		return "", err
+	}
+	return sc.rules.ToGrammar(options...), nil
+}
+
+func (sc *LLama31SchemaConverter) GrammarFromBytes(b []byte, options ...func(*GrammarOption)) (string, error) {
+	var schema map[string]interface{}
+	err := json.Unmarshal(b, &schema)
+	if err != nil {
+		return "", err
+	}
+	return sc.Grammar(schema, options...)
+}
diff --git a/pkg/functions/grammars/llama31_schema_test.go b/pkg/functions/grammars/llama31_schema_test.go
new file mode 100644
index 00000000..84d09bd5
--- /dev/null
+++ b/pkg/functions/grammars/llama31_schema_test.go
@@ -0,0 +1,76 @@
+package grammars_test
+
+import (
+	"strings"
+
+	. "github.com/mudler/LocalAI/pkg/functions/grammars"
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+)
+
+const (
+	testllama31Input1 = `
+	{
+		"oneOf": [
+			{
+				"type": "object",
+				"properties": {
+					"function": {"const": "create_event"},
+					"arguments": {
+						"type": "object",
+						"properties": {
+							"title": {"type": "string"},
+							"date": {"type": "string"},
+							"time": {"type": "string"}
+						}
+					}
+				}
+			},
+			{
+				"type": "object",
+				"properties": {
+					"function": {"const": "search"},
+					"arguments": {
+						"type": "object",
+						"properties": {
+							"query": {"type": "string"}
+						}
+					}
+				}
+			}
+		]
+	}`
+	// <function=example_function_name>{{"example_name": "example_value"}}</function>
+	testllama31inputResult1 = `root-0-function ::= "create_event"
+freestring ::= (
+		[^"\\] |
+		"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
+  )* space
+root-0 ::= "<function=" root-0-function ">{" root-0-arguments "}</function>"
+root-1-arguments ::= "{" space "\"query\"" space ":" space string "}" space
+root ::= root-0 | root-1
+space ::= " "?
+root-0-arguments ::= "{" space "\"date\"" space ":" space string "," space "\"time\"" space ":" space string "," space "\"title\"" space ":" space string "}" space
+root-1 ::= "<function=" root-1-function ">{" root-1-arguments "}</function>"
+string ::= "\"" (
+	[^"\\] |
+	"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
+)* "\"" space
+root-1-function ::= "search"`
+)
+
+var _ = Describe("JSON schema grammar tests", func() {
+	Context("JSON", func() {
+		It("generates a valid grammar from JSON schema", func() {
+			grammar, err := NewLLama31SchemaConverter("function").GrammarFromBytes([]byte(testllama31Input1))
+			Expect(err).ToNot(HaveOccurred())
+			results := strings.Split(testllama31inputResult1, "\n")
+			for _, r := range results {
+				if r != "" {
+					Expect(grammar).To(ContainSubstring(r))
+				}
+			}
+			Expect(len(results)).To(Equal(len(strings.Split(grammar, "\n"))))
+		})
+	})
+})
diff --git a/pkg/functions/options.go b/pkg/functions/grammars/options.go
similarity index 76%
rename from pkg/functions/options.go
rename to pkg/functions/grammars/options.go
index 3a341a43..07c6c951 100644
--- a/pkg/functions/options.go
+++ b/pkg/functions/grammars/options.go
@@ -1,4 +1,4 @@
-package functions
+package grammars
 
 type GrammarOption struct {
 	PropOrder   string
@@ -8,6 +8,9 @@ type GrammarOption struct {
 	MaybeString             bool
 	NoMixedFreeString       bool
 	ExpectStringsAfterJSON  bool
+
+	FunctionName string
+	SchemaType   SchemaConverterType
 }
 
 func (o *GrammarOption) Apply(options ...func(*GrammarOption)) {
@@ -48,3 +51,15 @@ func SetPropOrder(order string) func(*GrammarOption) {
 		o.PropOrder = order
 	}
 }
+
+func WithSchemaType(schemaType SchemaConverterType) func(*GrammarOption) {
+	return func(o *GrammarOption) {
+		o.SchemaType = schemaType
+	}
+}
+
+func WithFunctionName(name string) func(*GrammarOption) {
+	return func(o *GrammarOption) {
+		o.FunctionName = name
+	}
+}
diff --git a/pkg/functions/grammars/rules.go b/pkg/functions/grammars/rules.go
new file mode 100644
index 00000000..84fc8a25
--- /dev/null
+++ b/pkg/functions/grammars/rules.go
@@ -0,0 +1,93 @@
+package grammars
+
+import (
+	"fmt"
+	"strings"
+
+	"github.com/mudler/LocalAI/pkg/utils"
+)
+
+type Rules map[string]string
+
+func (rules Rules) ToGrammar(options ...func(*GrammarOption)) string {
+	grammarOpts := &GrammarOption{}
+	grammarOpts.Apply(options...)
+
+	prefix := grammarOpts.Prefix
+	maybeArray := grammarOpts.MaybeArray
+	disableParallelNewLines := grammarOpts.DisableParallelNewLines
+	maybeString := grammarOpts.MaybeString
+	noMixedFreeString := grammarOpts.NoMixedFreeString
+
+	var lines []string
+
+	swapRoot := maybeArray || maybeString || prefix != ""
+
+	// write down the computed rules.
+	// if maybeArray is true, we need to add the array rule and slightly tweak the root rule
+	for name, rule := range rules {
+		if swapRoot && name == "root" {
+			name = "realvalue"
+		}
+		lines = append(lines, fmt.Sprintf("%s ::= %s", name, rule))
+	}
+
+	if !swapRoot {
+		return strings.Join(lines, "\n")
+	}
+
+	newRoot := "realvalue"
+	if maybeArray {
+		newRoot = "arr | realvalue"
+	}
+
+	freestringRule := "mixedstring"
+	if noMixedFreeString {
+		freestringRule = "freestring"
+	}
+
+	if prefix != "" {
+		// quote newlines in suffix
+		prefix = utils.EscapeNewLines(prefix)
+
+		if maybeArray && maybeString {
+			newRoot = "(" + newRoot + ")"
+		}
+
+		if maybeString {
+			//newRoot = "( (\"" + suffix + "\" " + newRoot + ") | freestring ) "
+			newRoot = "( \"" + prefix + "\" " + newRoot + " | " + freestringRule + " ) "
+		} else {
+			newRoot = "\"" + prefix + "\" " + "" + newRoot + ""
+		}
+	} else if maybeString {
+		if maybeArray {
+			// newRoot = "(" + newRoot + ")"
+		}
+
+		newRoot = freestringRule + " | " + newRoot
+	}
+
+	lines = append(lines, fmt.Sprintf("%s ::= %s", "root", newRoot))
+	if disableParallelNewLines {
+		lines = append(lines, array)
+	} else {
+		lines = append(lines, arrayNewLines)
+	}
+
+	if maybeArray {
+		if grammarOpts.ExpectStringsAfterJSON {
+			lines = append(lines, `mixedstring ::= freestring | freestring arr freestring | (freestring realvalue freestring)* | realvalue | arr`)
+		} else {
+			lines = append(lines, `mixedstring ::= freestring | freestring arr | freestring realvalue | realvalue | arr`)
+		}
+	} else {
+		if grammarOpts.ExpectStringsAfterJSON {
+			lines = append(lines, `mixedstring ::= freestring | (freestring realvalue freestring)* | realvalue`)
+		} else {
+			lines = append(lines, `mixedstring ::= freestring | freestring realvalue | realvalue`)
+		}
+	}
+
+	return strings.Join(lines, "\n")
+}
diff --git a/pkg/functions/grammars/types.go b/pkg/functions/grammars/types.go
new file mode 100644
index 00000000..1fe6444a
--- /dev/null
+++ b/pkg/functions/grammars/types.go
@@ -0,0 +1,33 @@
+package grammars
+
+type SchemaConverterType int
+
+const (
+	JSONSchema SchemaConverterType = iota
+	LLama31Schema
+)
+
+const (
+	LlamaType string = "llama3.1"
+	JSONType  string = "json"
+)
+
+func (s SchemaConverterType) String() string {
+	switch s {
+	case JSONSchema:
+		return JSONType
+	case LLama31Schema:
+		return LlamaType
+	}
+	return "unknown"
+}
+
+func NewType(t string) SchemaConverterType {
+	switch t {
+	case JSONType:
+		return JSONSchema
+	case LlamaType:
+		return LLama31Schema
+	}
+	return JSONSchema
+}
diff --git a/pkg/functions/parse.go b/pkg/functions/parse.go
index 8e848a60..f5593690 100644
--- a/pkg/functions/parse.go
+++ b/pkg/functions/parse.go
@@ -7,6 +7,7 @@ import (
 	"regexp"
 	"strings"
 
+	"github.com/mudler/LocalAI/pkg/functions/grammars"
 	"github.com/mudler/LocalAI/pkg/utils"
 	"github.com/rs/zerolog/log"
 )
@@ -22,7 +23,9 @@ type GrammarConfig struct {
 	MixedMode bool `yaml:"mixed_mode"`
 
 	// NoMixedFreeString disables the mixed mode for free strings
-	// In this way if the LLM selects a free string, it won't be mixed necessarly with JSON objects
+	// In this way if the LLM selects a free string, it won't be mixed necessarily with JSON objects.
+	// For example, if enabled, the LLM either returns a JSON object or a free string, but not a mix of both.
+	// If disabled (default): the LLM can return a JSON object surrounded by free strings (e.g. `this is the JSON result: { "bar": "baz" } for your question`). This forces the LLM to return at least a JSON object, but it's not going to be strict
 	NoMixedFreeString bool `yaml:"no_mixed_free_string"`
 
 	// NoGrammar disables the grammar parsing and parses the responses directly from the LLM
@@ -39,6 +42,10 @@ type GrammarConfig struct {
 	// for instance name,arguments will make print { "name": "foo", "arguments": { "bar": "baz" } }
 	// instead of { "arguments": { "bar": "baz" }, "name": "foo" }
 	PropOrder string `yaml:"properties_order"`
+
+	// SchemaType can be configured to use a specific schema type to force the grammar
+	// available: json, llama3.1
+	SchemaType string `yaml:"schema_type"`
 }
 
 // FunctionsConfig is the configuration for the tool/function call.
@@ -92,28 +99,36 @@ type FuncCallResults struct {
 	Arguments string
 }
 
-func (g GrammarConfig) Options() []func(o *GrammarOption) {
-	opts := []func(o *GrammarOption){}
-	if g.MixedMode {
-		opts = append(opts, EnableMaybeString)
+func (g FunctionsConfig) GrammarOptions() []func(o *grammars.GrammarOption) {
+	opts := []func(o *grammars.GrammarOption){}
+	if g.GrammarConfig.MixedMode {
+		opts = append(opts, grammars.EnableMaybeString)
 	}
-	if g.ParallelCalls {
-		opts = append(opts, EnableMaybeArray)
+	if g.GrammarConfig.ParallelCalls {
+		opts = append(opts, grammars.EnableMaybeArray)
 	}
-	if g.DisableParallelNewLines {
-		opts = append(opts, DisableParallelNewLines)
+	if g.GrammarConfig.DisableParallelNewLines {
+		opts = append(opts, grammars.DisableParallelNewLines)
 	}
-	if g.Prefix != "" {
-		opts = append(opts, SetPrefix(g.Prefix))
+	if g.GrammarConfig.Prefix != "" {
+		opts = append(opts, grammars.SetPrefix(g.GrammarConfig.Prefix))
 	}
-	if g.NoMixedFreeString {
-		opts = append(opts, NoMixedFreeString)
+	if g.GrammarConfig.NoMixedFreeString {
+		opts = append(opts, grammars.NoMixedFreeString)
 	}
-	if g.ExpectStringsAfterJSON {
-		opts = append(opts, ExpectStringsAfterJSON)
+	if g.GrammarConfig.ExpectStringsAfterJSON {
+		opts = append(opts, grammars.ExpectStringsAfterJSON)
 	}
 
-	opts = append(opts, SetPropOrder(g.PropOrder))
+	if g.GrammarConfig.SchemaType != "" {
+		opts = append(opts, grammars.WithSchemaType(grammars.NewType(g.GrammarConfig.SchemaType)))
+	}
+
+	if g.FunctionNameKey != "" {
+		opts = append(opts, grammars.WithFunctionName(g.FunctionNameKey))
+	}
+
+	opts = append(opts, grammars.SetPropOrder(g.GrammarConfig.PropOrder))
 	return opts
 }

From 80652abc9b17d18330f40e79d2d3541d249269ee Mon Sep 17 00:00:00 2001
From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com>
Date: Sat, 27 Jul 2024 01:26:28 +0200
Subject: [PATCH 0069/1851] chore: :arrow_up: Update ggerganov/llama.cpp
 (#3016)

:arrow_up: Update ggerganov/llama.cpp

Signed-off-by: github-actions[bot]
 <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index c6028aa7..51893868 100644
--- a/Makefile
+++ b/Makefile
@@ -8,7 +8,7 @@ DETECT_LIBS?=true
 # llama.cpp versions
 GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp
 GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
-CPPLLAMA_VERSION?=4226a8d10e3904db3a1297919fe6c7f06beba6c0
+CPPLLAMA_VERSION?=01245f5b1629075543bc4478418c7d72a0b4b3c7
 
 # gpt4all version
 GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all

From 02d4eeffc840f9517c6ce479777e6716b2027640 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sat, 27 Jul 2024 10:24:42 +0200
Subject: [PATCH 0070/1851] models(gallery): add mistral-nemo (#3019)
Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 713eb21f..bedc05f9 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -442,6 +442,21 @@ - filename: Einstein-v4-7B.Q4_K_M.gguf sha256: 78bd573de2a9eb3c6e213132858164e821145f374fcaa4b19dfd6502c05d990d uri: huggingface://mradermacher/Einstein-v4-7B-GGUF/Einstein-v4-7B.Q4_K_M.gguf +- !!merge <<: *mistral03 + name: "mistral-nemo-instruct-2407" + urls: + - https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407 + - https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF + - https://mistral.ai/news/mistral-nemo/ + description: | + The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-Nemo-Base-2407. Trained jointly by Mistral AI and NVIDIA, it significantly outperforms existing models smaller or similar in size. + overrides: + parameters: + model: Mistral-Nemo-Instruct-2407-Q4_K_M.gguf + files: + - filename: Mistral-Nemo-Instruct-2407-Q4_K_M.gguf + sha256: 1a8b92fb546a80dce78151e4908f7bdb2c11fb3ef52af960e4bbe319a9cc5052 + uri: huggingface://bartowski/Mistral-Nemo-Instruct-2407-GGUF/Mistral-Nemo-Instruct-2407-Q4_K_M.gguf - &mudler ### START mudler's LocalAI specific-models url: "github:mudler/LocalAI/gallery/mudler.yaml@master" From fe4c8c825170ddbffd4e88e4c6a53a7dc8c058ae Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 10:24:56 +0200 Subject: [PATCH 0071/1851] models(gallery): add llama3.1-8b-fireplace2 (#3018) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index bedc05f9..c9d1e2b2 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -82,6 +82,37 @@ - filename: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf sha256: 6730efc0628c7534189487b52ed5a358a0a2c3ecb062824eccc8e0444eaa212f uri: huggingface://mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama3.1-8b-fireplace2" + icon: https://cdn-uploads.huggingface.co/production/uploads/64f267a8a4f79a118e0fcc89/JYkaXrk2DqpXhaL9WymKY.jpeg + urls: + - https://huggingface.co/ValiantLabs/Llama3.1-8B-Fireplace2 + - https://huggingface.co/mudler/Llama3.1-8B-Fireplace2-Q4_K_M-GGUF + description: | + Fireplace 2 is a chat model, adding helpful structured outputs to Llama 3.1 8b Instruct. + + an expansion pack of supplementary outputs - request them at will within your chat: + Inline function calls + SQL queries + JSON objects + Data visualization with matplotlib + Mix normal chat and structured outputs within the same conversation. + Fireplace 2 supplements the existing strengths of Llama 3.1, providing inline capabilities within the Llama 3 Instruct format. + + Version + + This is the 2024-07-23 release of Fireplace 2 for Llama 3.1 8b. + + We're excited to bring further upgrades and releases to Fireplace 2 in the future. + + Help us and recommend Fireplace 2 to your friends! 
+ overrides: + parameters: + model: llama3.1-8b-fireplace2-q4_k_m.gguf + files: + - filename: llama3.1-8b-fireplace2-q4_k_m.gguf + sha256: 54527fd2474b576086ea31e759214ab240abe2429ae623a02d7ba825cc8cb13e + uri: huggingface://mudler/Llama3.1-8B-Fireplace2-Q4_K_M-GGUF/llama3.1-8b-fireplace2-q4_k_m.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 81c4b722582959295b3149e5df0ad564a3104901 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 10:28:47 +0200 Subject: [PATCH 0072/1851] models(gallery): add lumimaid-v0.2-12b (#3020) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index c9d1e2b2..7526befa 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -488,6 +488,31 @@ - filename: Mistral-Nemo-Instruct-2407-Q4_K_M.gguf sha256: 1a8b92fb546a80dce78151e4908f7bdb2c11fb3ef52af960e4bbe319a9cc5052 uri: huggingface://bartowski/Mistral-Nemo-Instruct-2407-GGUF/Mistral-Nemo-Instruct-2407-Q4_K_M.gguf +- !!merge <<: *mistral03 + name: "lumimaid-v0.2-12b" + icon: https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/ep3ojmuMkFS-GmgRuI9iB.png + urls: + - https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B + - https://huggingface.co/mudler/Lumimaid-v0.2-12B-Q4_K_M-GGUF + description: | + This model is based on: Mistral-Nemo-Instruct-2407 + + Wandb: https://wandb.ai/undis95/Lumi-Mistral-Nemo?nw=nwuserundis95 + + NOTE: As explained on Mistral-Nemo-Instruct-2407 repo, it's recommended to use a low temperature, please experiment! + + Lumimaid 0.1 -> 0.2 is a HUGE step up dataset wise. + + As some people have told us our models are sloppy, Ikari decided to say fuck it and literally nuke all chats out with most slop. + + Our dataset stayed the same since day one, we added data over time, cleaned them, and repeat. After not releasing model for a while because we were never satisfied, we think it's time to come back! 
+ overrides: + parameters: + model: lumimaid-v0.2-12b-q4_k_m.gguf + files: + - filename: lumimaid-v0.2-12b-q4_k_m.gguf + sha256: f72299858a07e52be920b86d42ddcfcd5008b961d601ef6fd6a98a3377adccbf + uri: huggingface://mudler/Lumimaid-v0.2-12B-Q4_K_M-GGUF/lumimaid-v0.2-12b-q4_k_m.gguf - &mudler ### START mudler's LocalAI specific-models url: "github:mudler/LocalAI/gallery/mudler.yaml@master" From 7ef8edda32b1b5136ec48766ef0b5431177cf493 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 10:59:06 +0200 Subject: [PATCH 0073/1851] =?UTF-8?q?models(gallery):=20add=20darkidol-lla?= =?UTF-8?q?ma-3.1-8b-instruct-1.1-uncensored-iq=E2=80=A6=20(#3021)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit models(gallery): add darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq-imatrix-request Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 7526befa..22855a2b 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -113,6 +113,29 @@ - filename: llama3.1-8b-fireplace2-q4_k_m.gguf sha256: 54527fd2474b576086ea31e759214ab240abe2429ae623a02d7ba825cc8cb13e uri: huggingface://mudler/Llama3.1-8B-Fireplace2-Q4_K_M-GGUF/llama3.1-8b-fireplace2-q4_k_m.gguf +- !!merge <<: *llama31 + name: "darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq-imatrix-request" + icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/iDV5GTVJbjkvMp1set-ZC.png + urls: + - https://huggingface.co/LWDCLS/DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-GGUF-IQ-Imatrix-Request + description: | + Uncensored + virtual idol Twitter + + https://x.com/aifeifei799 + + Questions + + The model's response results are for reference only, please do not fully trust them. + This model is solely for learning and testing purposes, and errors in output are inevitable. We do not take responsibility for the output results. If the output content is to be used, it must be modified; if not modified, we will assume it has been altered. + For commercial licensing, please refer to the Llama 3.1 agreement. 
+ overrides: + parameters: + model: DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-Q4_K_M-imat.gguf + files: + - filename: DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-Q4_K_M-imat.gguf + sha256: fa9fc56de7d902b755c43f1a5d0867d961675174a1b3e73a10d822836c3390e6 + uri: huggingface://LWDCLS/DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-GGUF-IQ-Imatrix-Request/DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-Q4_K_M-imat.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From d5a6c1e4f62619df77ebd9afa02fa9699ecaca47 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 11:00:21 +0200 Subject: [PATCH 0074/1851] models(gallery): add meta-llama-3.1-8b-instruct-abliterated (#3022) * models(gallery): add meta-llama-3.1-8b-instruct-abliterated Signed-off-by: Ettore Di Giacinto * Update gallery/index.yaml Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 22855a2b..edd9f5d8 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -113,6 +113,21 @@ - filename: llama3.1-8b-fireplace2-q4_k_m.gguf sha256: 54527fd2474b576086ea31e759214ab240abe2429ae623a02d7ba825cc8cb13e uri: huggingface://mudler/Llama3.1-8B-Fireplace2-Q4_K_M-GGUF/llama3.1-8b-fireplace2-q4_k_m.gguf +- !!merge <<: *llama31 + name: "meta-llama-3.1-8b-instruct-abliterated" + icon: https://i.imgur.com/KhorYYG.png + urls: + - https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated + - https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF + description: | + This is an uncensored version of Llama 3.1 8B Instruct created with abliteration. + overrides: + parameters: + model: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf + files: + - filename: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf + sha256: 18cca47adfb3954af2b49e3aa2ce1604158337aff45fab2e7654039b65c7683e + uri: huggingface://mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq-imatrix-request" icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/iDV5GTVJbjkvMp1set-ZC.png From d59bcd539ed8def49957747aab338c9ea7c7aa86 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 12:18:55 +0200 Subject: [PATCH 0075/1851] models(gallery): add llama-3.1-70b-japanese-instruct-2407 (#3023) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index edd9f5d8..6a76241a 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -128,6 +128,20 @@ - filename: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf sha256: 18cca47adfb3954af2b49e3aa2ce1604158337aff45fab2e7654039b65c7683e uri: huggingface://mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama-3.1-70b-japanese-instruct-2407" + urls: + - https://huggingface.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407 + - https://huggingface.co/mmnga/Llama-3.1-70B-Japanese-Instruct-2407-gguf + description: | + The Llama-3.1-70B-Japanese-Instruct-2407-gguf model is a Japanese language model that uses the Instruct prompt tuning method. 
It is based on the Llama-3.1-70B model and has been fine-tuned on the imatrix dataset for Japanese. The model is trained to generate informative and coherent responses to given instructions or prompts. It is available in the gguf format and can be used for a variety of tasks such as question answering, text generation, and more.
+ overrides:
+ parameters:
+ model: Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf
+ files:
+ - filename: Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf
+ sha256: f2a6f0fb5040d3a28479c9f9fc555a5ea7b906dfb9964539f1a68c0676a9c604
+ uri: huggingface://mmnga/Llama-3.1-70B-Japanese-Instruct-2407-gguf/Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf
 - !!merge <<: *llama31
 name: "darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq-imatrix-request"
 icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/iDV5GTVJbjkvMp1set-ZC.png

From 7aa7f13095db2f1cad09ce9fa79edde47615dae5 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sat, 27 Jul 2024 12:22:30 +0200
Subject: [PATCH 0076/1851] models(gallery): add llama-3.1-8b-instruct-fei-v1-uncensored (#3024)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/gallery/index.yaml b/gallery/index.yaml
index 6a76241a..bac51af7 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -165,6 +165,22 @@
 - filename: DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-Q4_K_M-imat.gguf
 sha256: fa9fc56de7d902b755c43f1a5d0867d961675174a1b3e73a10d822836c3390e6
 uri: huggingface://LWDCLS/DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-GGUF-IQ-Imatrix-Request/DarkIdol-Llama-3.1-8B-Instruct-1.1-Uncensored-Q4_K_M-imat.gguf
+- !!merge <<: *llama31
+ name: "llama-3.1-8b-instruct-fei-v1-uncensored"
+ icon: https://huggingface.co/aifeifei799/Llama-3.1-8B-Instruct-Fei-v1-Uncensored/resolve/main/Llama-3.1-8B-Instruct-Fei-v1-Uncensored.png
+ urls:
+ - https://huggingface.co/aifeifei799/Llama-3.1-8B-Instruct-Fei-v1-Uncensored
+ - https://huggingface.co/mradermacher/Llama-3.1-8B-Instruct-Fei-v1-Uncensored-GGUF
+ description: |
+ Llama-3.1-8B-Instruct Uncensored
+ more information, look at Llama-3.1-8B-Instruct
+ overrides:
+ parameters:
+ model: Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf
+ files:
+ - filename: Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf
+ sha256: 12fef8ff0a5c4cf6988523d33d89287edb7531f0d1644707548f45f1387e398a
+ uri: huggingface://mradermacher/Llama-3.1-8B-Instruct-Fei-v1-Uncensored-GGUF/Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf

From 7021c02d45385fc39ab828b7066919eccf88aec9 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sat, 27 Jul 2024 12:24:45 +0200
Subject: [PATCH 0077/1851] models(gallery): add openbuddy-llama3.1-8b-v22.1-131k (#3025)

Signed-off-by: Ettore Di Giacinto
---
 gallery/index.yaml | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/gallery/index.yaml b/gallery/index.yaml
index bac51af7..5b99bd72 100644
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -181,6 +181,20 @@
 - filename: Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf
 sha256: 12fef8ff0a5c4cf6988523d33d89287edb7531f0d1644707548f45f1387e398a
 uri: huggingface://mradermacher/Llama-3.1-8B-Instruct-Fei-v1-Uncensored-GGUF/Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf
+- !!merge <<: *llama31
+ name: "openbuddy-llama3.1-8b-v22.1-131k"
+ icon: https://raw.githubusercontent.com/OpenBuddy/OpenBuddy/main/media/demo.png
+ urls:
+ -
https://huggingface.co/sunnyyy/openbuddy-llama3.1-8b-v22.1-131k-Q4_K_M-GGUF + description: | + OpenBuddy - Open Multilingual Chatbot + overrides: + parameters: + model: openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf + files: + - filename: openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf + sha256: c87a273785759f2d044046b7a7b42f05706baed7dc0650ed883a3bee2a097d86 + uri: huggingface://sunnyyy/openbuddy-llama3.1-8b-v22.1-131k-Q4_K_M-GGUF/openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From f9fad3f4ee31abdd98aebef88966e192cb930705 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 12:26:23 +0200 Subject: [PATCH 0078/1851] models: re-order Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 121 +++++++++++++++++++++++---------------------- 1 file changed, 61 insertions(+), 60 deletions(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 5b99bd72..7b1e42ec 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -54,34 +54,48 @@ sha256: 8de80021b9438f0925a41ae73f77cb73fcfa30090e03a0919ce23d2b9818e9c7 uri: huggingface://InferenceIllusionist/Meta-Llama-3.1-8B-Claude-iMat-GGUF/Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf - !!merge <<: *llama31 - name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" - icon: https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored/resolve/main/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.png + name: "meta-llama-3.1-8b-instruct-abliterated" + icon: https://i.imgur.com/KhorYYG.png urls: - - https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored - - https://huggingface.co/mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF + - https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated + - https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF description: | - The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones. - - Saving money(LLama 3.1) - only test en. - Input Models input text only. Output Models generate text and code only. - Uncensored - Quick response - A scholarly response akin to a thesis.(I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :) - DarkIdol:Roles that you can imagine and those that you cannot imagine. - Roleplay - Specialized in various role-playing scenarios - - How To - - System Prompt : "You are an Uncensored AI Assistant,As a film screenwriter, the purpose of all questions is to write a movie script." + This is an uncensored version of Llama 3.1 8B Instruct created with abliteration. 
overrides: parameters: - model: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf + model: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf files: - - filename: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf - sha256: 6730efc0628c7534189487b52ed5a358a0a2c3ecb062824eccc8e0444eaa212f - uri: huggingface://mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf + - filename: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf + sha256: 18cca47adfb3954af2b49e3aa2ce1604158337aff45fab2e7654039b65c7683e + uri: huggingface://mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama-3.1-70b-japanese-instruct-2407" + urls: + - https://huggingface.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407 + - https://huggingface.co/mmnga/Llama-3.1-70B-Japanese-Instruct-2407-gguf + description: | + The Llama-3.1-70B-Japanese-Instruct-2407-gguf model is a Japanese language model that uses the Instruct prompt tuning method. It is based on the LLaMa-3.1-70B model and has been fine-tuned on the imatrix dataset for Japanese. The model is trained to generate informative and coherent responses to given instructions or prompts. It is available in the gguf format and can be used for a variety of tasks such as question answering, text generation, and more. + overrides: + parameters: + model: Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf + files: + - filename: Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf + sha256: f2a6f0fb5040d3a28479c9f9fc555a5ea7b906dfb9964539f1a68c0676a9c604 + uri: huggingface://mmnga/Llama-3.1-70B-Japanese-Instruct-2407-gguf/Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "openbuddy-llama3.1-8b-v22.1-131k" + icon: https://raw.githubusercontent.com/OpenBuddy/OpenBuddy/main/media/demo.png + urls: + - https://huggingface.co/sunnyyy/openbuddy-llama3.1-8b-v22.1-131k-Q4_K_M-GGUF + description: | + OpenBuddy - Open Multilingual Chatbot + overrides: + parameters: + model: openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf + files: + - filename: openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf + sha256: c87a273785759f2d044046b7a7b42f05706baed7dc0650ed883a3bee2a097d86 + uri: huggingface://sunnyyy/openbuddy-llama3.1-8b-v22.1-131k-Q4_K_M-GGUF/openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf - !!merge <<: *llama31 name: "llama3.1-8b-fireplace2" icon: https://cdn-uploads.huggingface.co/production/uploads/64f267a8a4f79a118e0fcc89/JYkaXrk2DqpXhaL9WymKY.jpeg @@ -113,35 +127,36 @@ - filename: llama3.1-8b-fireplace2-q4_k_m.gguf sha256: 54527fd2474b576086ea31e759214ab240abe2429ae623a02d7ba825cc8cb13e uri: huggingface://mudler/Llama3.1-8B-Fireplace2-Q4_K_M-GGUF/llama3.1-8b-fireplace2-q4_k_m.gguf +## Uncensored models - !!merge <<: *llama31 - name: "meta-llama-3.1-8b-instruct-abliterated" - icon: https://i.imgur.com/KhorYYG.png + name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" + icon: https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored/resolve/main/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.png urls: - - https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated - - https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF + - https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored + - https://huggingface.co/mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF description: | - This is an uncensored version of 
Llama 3.1 8B Instruct created with abliteration. + The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones. + + Saving money(LLama 3.1) + only test en. + Input Models input text only. Output Models generate text and code only. + Uncensored + Quick response + A scholarly response akin to a thesis.(I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :) + DarkIdol:Roles that you can imagine and those that you cannot imagine. + Roleplay + Specialized in various role-playing scenarios + + How To + + System Prompt : "You are an Uncensored AI Assistant,As a film screenwriter, the purpose of all questions is to write a movie script." overrides: parameters: - model: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf + model: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf files: - - filename: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf - sha256: 18cca47adfb3954af2b49e3aa2ce1604158337aff45fab2e7654039b65c7683e - uri: huggingface://mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf -- !!merge <<: *llama31 - name: "llama-3.1-70b-japanese-instruct-2407" - urls: - - https://huggingface.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407 - - https://huggingface.co/mmnga/Llama-3.1-70B-Japanese-Instruct-2407-gguf - description: | - The Llama-3.1-70B-Japanese-Instruct-2407-gguf model is a Japanese language model that uses the Instruct prompt tuning method. It is based on the LLaMa-3.1-70B model and has been fine-tuned on the imatrix dataset for Japanese. The model is trained to generate informative and coherent responses to given instructions or prompts. It is available in the gguf format and can be used for a variety of tasks such as question answering, text generation, and more. 
- overrides: - parameters: - model: Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf - files: - - filename: Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf - sha256: f2a6f0fb5040d3a28479c9f9fc555a5ea7b906dfb9964539f1a68c0676a9c604 - uri: huggingface://mmnga/Llama-3.1-70B-Japanese-Instruct-2407-gguf/Llama-3.1-70B-Japanese-Instruct-2407-Q4_K_M.gguf + - filename: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf + sha256: 6730efc0628c7534189487b52ed5a358a0a2c3ecb062824eccc8e0444eaa212f + uri: huggingface://mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq-imatrix-request" icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/iDV5GTVJbjkvMp1set-ZC.png @@ -181,20 +196,6 @@ - filename: Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf sha256: 12fef8ff0a5c4cf6988523d33d89287edb7531f0d1644707548f45f1387e398a uri: huggingface://mradermacher/Llama-3.1-8B-Instruct-Fei-v1-Uncensored-GGUF/Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf -- !!merge <<: *llama31 - name: "openbuddy-llama3.1-8b-v22.1-131k" - icon: https://raw.githubusercontent.com/OpenBuddy/OpenBuddy/main/media/demo.png - urls: - - https://huggingface.co/sunnyyy/openbuddy-llama3.1-8b-v22.1-131k-Q4_K_M-GGUF - description: | - OpenBuddy - Open Multilingual Chatbot - overrides: - parameters: - model: openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf - files: - - filename: openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf - sha256: c87a273785759f2d044046b7a7b42f05706baed7dc0650ed883a3bee2a097d86 - uri: huggingface://sunnyyy/openbuddy-llama3.1-8b-v22.1-131k-Q4_K_M-GGUF/openbuddy-llama3.1-8b-v22.1-131k-q4_k_m.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 0dd21f2b5e28e9750cc4bc893783d938c3ce1fbd Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 12:41:19 +0200 Subject: [PATCH 0079/1851] models(gallery): add lumimaid-8b (#3026) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 7b1e42ec..a8f06b95 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -196,6 +196,29 @@ - filename: Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf sha256: 12fef8ff0a5c4cf6988523d33d89287edb7531f0d1644707548f45f1387e398a uri: huggingface://mradermacher/Llama-3.1-8B-Instruct-Fei-v1-Uncensored-GGUF/Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "lumimaid-v0.2-8b" + urls: + - https://huggingface.co/NeverSleep/Lumimaid-v0.2-8B + - https://huggingface.co/mradermacher/Lumimaid-v0.2-8B-GGUF + icon: https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/TUcHg7LKNjfo0sni88Ps7.png + description: | + This model is based on: Meta-Llama-3.1-8B-Instruct + + Wandb: https://wandb.ai/undis95/Lumi-Llama-3-1-8B?nw=nwuserundis95 + + Lumimaid 0.1 -> 0.2 is a HUGE step up dataset wise. + + As some people have told us our models are sloppy, Ikari decided to say fuck it and literally nuke all chats out with most slop. + + Our dataset stayed the same since day one, we added data over time, cleaned them, and repeat. After not releasing model for a while because we were never satisfied, we think it's time to come back! 
+ overrides: + parameters: + model: Lumimaid-v0.2-8B.Q4_K_M.gguf + files: + - filename: Lumimaid-v0.2-8B.Q4_K_M.gguf + sha256: c8024fcb49c71410903d0d076a1048249fa48b31637bac5177bf5c3f3d603d85 + uri: huggingface://mradermacher/Lumimaid-v0.2-8B-GGUF/Lumimaid-v0.2-8B.Q4_K_M.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From fe0d092f58e6770c7d4e0d3ebb36680da16d7816 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 12:48:00 +0200 Subject: [PATCH 0080/1851] models(gallery): add llama3 with enforced functioncall with grammars (#3027) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 18 ++++++++ gallery/llama3.1-instruct-grammar.yaml | 64 ++++++++++++++++++++++++++ 2 files changed, 82 insertions(+) create mode 100644 gallery/llama3.1-instruct-grammar.yaml diff --git a/gallery/index.yaml b/gallery/index.yaml index a8f06b95..46ba1122 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -39,6 +39,24 @@ - filename: Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf sha256: 3f16ab17da4521fe3ed7c5d7beed960d3fe7b5b64421ee9650aa53d6b649ccab uri: huggingface://MaziyarPanahi/Meta-Llama-3.1-70B-Instruct-GGUF/Meta-Llama-3.1-70B-Instruct.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "meta-llama-3.1-8b-instruct:grammar-functioncall" + url: "github:mudler/LocalAI/gallery/llama3.1-instruct-grammar.yaml@master" + urls: + - https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct + - https://huggingface.co/MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF + description: | + This is the standard Llama 3.1 8B Instruct model with grammar and function call enabled. + + When grammars are enabled in LocalAI, the LLM is forced to output valid tools constrained by BNF grammars. This can be useful for ensuring that the model outputs are valid and can be used in a production environment. + For more information on how to use grammars in LocalAI, see https://localai.io/features/openai-functions/#advanced and https://localai.io/features/constrained_grammars/. 
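
To make the description above concrete, a hedged request sketch exercising grammar-constrained tool calling through LocalAI's OpenAI-compatible API. The endpoint, port, and tool definition are illustrative assumptions; only the model name comes from the gallery entry added by this patch:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Hypothetical local endpoint; adjust host/port to your deployment.
	url := "http://localhost:8080/v1/chat/completions"

	// With grammars enabled, the model is constrained to emit a valid
	// tool call instead of free text.
	body := []byte(`{
	  "model": "meta-llama-3.1-8b-instruct:grammar-functioncall",
	  "messages": [{"role": "user", "content": "What is the weather in Rome?"}],
	  "tools": [{
	    "type": "function",
	    "function": {
	      "name": "get_weather",
	      "description": "Get the current weather for a city",
	      "parameters": {
	        "type": "object",
	        "properties": {"city": {"type": "string"}},
	        "required": ["city"]
	      }
	    }
	  }]
	}`)

	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out)) // expect a tool_calls entry rather than free text
}
```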
overrides:
+ parameters:
+ model: Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
+ files:
+ - filename: Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
+ sha256: c2f17f44af962660d1ad4cb1af91a731f219f3b326c2b14441f9df1f347f2815
+ uri: huggingface://MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
 - !!merge <<: *llama31
 name: "meta-llama-3.1-8b-claude-imat"
 urls:
diff --git a/gallery/llama3.1-instruct-grammar.yaml b/gallery/llama3.1-instruct-grammar.yaml
new file mode 100644
index 00000000..f75eaaf4
--- /dev/null
+++ b/gallery/llama3.1-instruct-grammar.yaml
@@ -0,0 +1,64 @@
+---
+name: "llama3-instruct-grammar"
+
+config_file: |
+  mmap: true
+  function:
+    disable_no_action: true
+    grammar:
+      no_mixed_free_string: true
+      mixed_mode: true
+      schema_type: llama3.1 # or JSON is supported too (json)
+    response_regex:
+    - <function=(?P<name>\w+)>(?P<arguments>.*)</function>
+  template:
+    chat_message: |
+      <|start_header_id|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}<|end_header_id|>
+
+      {{ if .FunctionCall -}}
+      Function call:
+      {{ else if eq .RoleName "tool" -}}
+      Function response:
+      {{ end -}}
+      {{ if .Content -}}
+      {{.Content -}}
+      {{ else if .FunctionCall -}}
+      {{ toJson .FunctionCall -}}
+      {{ end -}}
+      <|eot_id|>
+    function: |
+      <|start_header_id|>system<|end_header_id|>
+
+      You have access to the following functions:
+
+      {{range .Functions}}
+      Use the function '{{.Name}}' to '{{.Description}}'
+      {{toJson .Parameters}}
+      {{end}}
+
+      Think very carefully before calling functions.
+      If you choose to call a function ONLY reply in the following format with no prefix or suffix:
+
+      <function=example_function_name>{{`{{"example_name": "example_value"}}`}}</function>
+
+      Reminder:
+      - If looking for real time information use relevant functions before falling back to searching on internet
+      - Function calls MUST follow the specified format, start with <function= and end with </function>
+      - Required parameters MUST be specified
+      - Only call one function at a time
+      - Put the entire function call reply on one line
+      <|eot_id|>
+      {{.Input }}
+      <|start_header_id|>assistant<|end_header_id|>
+    chat: |
+      <|begin_of_text|>{{.Input }}
+      <|start_header_id|>assistant<|end_header_id|>
+    completion: |
+      {{.Input}}
+  context_size: 8192
+  f16: true
+  stopwords:
+  - <|im_end|>
+  - <dummy32000>
+  - "<|eot_id|>"
+  - <|end_of_text|>

From 82cc81974f53eb233a122bd114ef75e5f6422e0b Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sat, 27 Jul 2024 15:29:50 +0200
Subject: [PATCH 0081/1851] Update llama3.1-instruct.yaml

Signed-off-by: Ettore Di Giacinto
---
 gallery/llama3.1-instruct.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gallery/llama3.1-instruct.yaml b/gallery/llama3.1-instruct.yaml
index 66c9ce97..4a2b4db1 100644
--- a/gallery/llama3.1-instruct.yaml
+++ b/gallery/llama3.1-instruct.yaml
@@ -49,7 +49,7 @@ config_file: |
 {{.Input }}
 <|start_header_id|>assistant<|end_header_id|>
 chat: |
- <|begin_of_text|>{{.Input }}
+ {{.Input }}
 <|start_header_id|>assistant<|end_header_id|>
 completion: |
 {{.Input}}

From 0a7e4c1b935e61241b219e8dc4e4f62269b08293 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Sat, 27 Jul 2024 15:30:01 +0200
Subject: [PATCH 0082/1851] Update llama3.1-instruct-grammar.yaml

Signed-off-by: Ettore Di Giacinto
---
 gallery/llama3.1-instruct-grammar.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gallery/llama3.1-instruct-grammar.yaml b/gallery/llama3.1-instruct-grammar.yaml
index f75eaaf4..30237af3 100644
---
a/gallery/llama3.1-instruct-grammar.yaml +++ b/gallery/llama3.1-instruct-grammar.yaml @@ -51,7 +51,7 @@ config_file: | {{.Input }} <|start_header_id|>assistant<|end_header_id|> chat: | - <|begin_of_text|>{{.Input }} + {{.Input }} <|start_header_id|>assistant<|end_header_id|> completion: | {{.Input}} From d57acefed46dc5ba88625e8680dc56243a6fe8f7 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 27 Jul 2024 15:30:13 +0200 Subject: [PATCH 0083/1851] Update llama3-instruct.yaml Signed-off-by: Ettore Di Giacinto --- gallery/llama3-instruct.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/llama3-instruct.yaml b/gallery/llama3-instruct.yaml index 3eed758b..5dc54b0e 100644 --- a/gallery/llama3-instruct.yaml +++ b/gallery/llama3-instruct.yaml @@ -31,7 +31,7 @@ config_file: | {'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']}<|eot_id|><|start_header_id|>assistant<|end_header_id|> Function call: chat: | - <|begin_of_text|>{{.Input }} + {{.Input }} <|start_header_id|>assistant<|end_header_id|> completion: | {{.Input}} From b1f93935bebe3162419fc58982ca0a0436ec680b Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 27 Jul 2024 23:49:13 +0200 Subject: [PATCH 0084/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3030) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 51893868..a1a9494a 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=01245f5b1629075543bc4478418c7d72a0b4b3c7 +CPPLLAMA_VERSION?=5e2727fe0321c38d1664d26173c654fa1801dc5f # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 610e1c00c61154db3df5ac2bcea8e77165326048 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 27 Jul 2024 23:52:57 +0200 Subject: [PATCH 0085/1851] chore: :arrow_up: Update ggerganov/whisper.cpp (#3029) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index a1a9494a..2df7c225 100644 --- a/Makefile +++ b/Makefile @@ -20,7 +20,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=f68298ce06ca3edd6e6f3f21c3d0bb5f073942c3 +WHISPER_CPP_VERSION?=6739eb83c3ca5cf40d24c6fe8442a761a1eb6248 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 2a839e143254f6874a180dda5fdde88746a79bd8 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 28 Jul 2024 10:27:56 +0200 Subject: [PATCH 0086/1851] fix(gallery): do not attempt to delete duplicate files (#3031) Signed-off-by: Ettore Di Giacinto --- core/gallery/gallery.go | 31 +++++++++++++++---------------- pkg/utils/strings.go | 12 ++++++++++++ 2 files changed, 27 insertions(+), 16 
deletions(-) diff --git a/core/gallery/gallery.go b/core/gallery/gallery.go index d102eac8..9288c44f 100644 --- a/core/gallery/gallery.go +++ b/core/gallery/gallery.go @@ -204,35 +204,34 @@ func DeleteModelFromSystem(basePath string, name string, additionalFiles []strin log.Error().Err(err).Msgf("failed to read gallery file %s", configFile) } + var filesToRemove []string + // Remove additional files if galleryconfig != nil { for _, f := range galleryconfig.Files { fullPath := filepath.Join(basePath, f.Filename) - log.Debug().Msgf("Removing file %s", fullPath) - if e := os.Remove(fullPath); e != nil { - err = errors.Join(err, fmt.Errorf("failed to remove file %s: %w", f.Filename, e)) - } + filesToRemove = append(filesToRemove, fullPath) } } for _, f := range additionalFiles { fullPath := filepath.Join(filepath.Join(basePath, f)) - log.Debug().Msgf("Removing additional file %s", fullPath) - if e := os.Remove(fullPath); e != nil { + filesToRemove = append(filesToRemove, fullPath) + } + + filesToRemove = append(filesToRemove, configFile) + filesToRemove = append(filesToRemove, galleryFile) + + // skip duplicates + filesToRemove = utils.Unique(filesToRemove) + + // Removing files + for _, f := range filesToRemove { + if e := os.Remove(f); e != nil { err = errors.Join(err, fmt.Errorf("failed to remove file %s: %w", f, e)) } } - log.Debug().Msgf("Removing model config file %s", configFile) - - // Delete the model config file - if e := os.Remove(configFile); e != nil { - err = errors.Join(err, fmt.Errorf("failed to remove file %s: %w", configFile, e)) - } - - // Delete gallery config file - os.Remove(galleryFile) - return err } diff --git a/pkg/utils/strings.go b/pkg/utils/strings.go index 2a782e03..4ac0458d 100644 --- a/pkg/utils/strings.go +++ b/pkg/utils/strings.go @@ -18,3 +18,15 @@ func RandString(n int) string { } return string(b) } + +func Unique(arr []string) []string { + unique := make(map[string]bool) + var result []string + for _, item := range arr { + if _, ok := unique[item]; !ok { + unique[item] = true + result = append(result, item) + } + } + return result +} From d6a7a77f6b6be947280d20e090ca270d3fcae724 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 28 Jul 2024 10:28:10 +0200 Subject: [PATCH 0087/1851] fix(gallery): do clear out errors once displayed (#3033) Signed-off-by: Ettore Di Giacinto --- core/http/elements/gallery.go | 12 ++++++---- core/http/routes/ui.go | 45 +++++++++++++++++++++++++++++------ 2 files changed, 46 insertions(+), 11 deletions(-) diff --git a/core/http/elements/gallery.go b/core/http/elements/gallery.go index 3b3741d8..575ea87d 100644 --- a/core/http/elements/gallery.go +++ b/core/http/elements/gallery.go @@ -9,7 +9,6 @@ import ( "github.com/mudler/LocalAI/core/gallery" "github.com/mudler/LocalAI/core/p2p" "github.com/mudler/LocalAI/core/services" - "github.com/mudler/LocalAI/pkg/xsync" ) const ( @@ -372,7 +371,12 @@ func dropBadChars(s string) string { return strings.ReplaceAll(s, "@", "__") } -func ListModels(models []*gallery.GalleryModel, processing *xsync.SyncedMap[string, string], galleryService *services.GalleryService) string { +type ProcessTracker interface { + Exists(string) bool + Get(string) string +} + +func ListModels(models []*gallery.GalleryModel, processTracker ProcessTracker, galleryService *services.GalleryService) string { modelsElements := []elem.Node{} descriptionDiv := func(m *gallery.GalleryModel) elem.Node { return elem.Div( @@ -396,7 +400,7 @@ func ListModels(models []*gallery.GalleryModel, processing 
*xsync.SyncedMap[stri actionDiv := func(m *gallery.GalleryModel) elem.Node { galleryID := fmt.Sprintf("%s@%s", m.Gallery.Name, m.Name) - currentlyProcessing := processing.Exists(galleryID) + currentlyProcessing := processTracker.Exists(galleryID) jobID := "" isDeletionOp := false if currentlyProcessing { @@ -404,7 +408,7 @@ func ListModels(models []*gallery.GalleryModel, processing *xsync.SyncedMap[stri if status != nil && status.Deletion { isDeletionOp = true } - jobID = processing.Get(galleryID) + jobID = processTracker.Get(galleryID) // TODO: // case not handled, if status == nil : "Waiting" } diff --git a/core/http/routes/ui.go b/core/http/routes/ui.go index 33706944..92917463 100644 --- a/core/http/routes/ui.go +++ b/core/http/routes/ui.go @@ -21,6 +21,40 @@ import ( "github.com/google/uuid" ) +type modelOpCache struct { + status *xsync.SyncedMap[string, string] +} + +func NewModelOpCache() *modelOpCache { + return &modelOpCache{ + status: xsync.NewSyncedMap[string, string](), + } +} + +func (m *modelOpCache) Set(key string, value string) { + m.status.Set(key, value) +} + +func (m *modelOpCache) Get(key string) string { + return m.status.Get(key) +} + +func (m *modelOpCache) DeleteUUID(uuid string) { + for _, k := range m.status.Keys() { + if m.status.Get(k) == uuid { + m.status.Delete(k) + } + } +} + +func (m *modelOpCache) Map() map[string]string { + return m.status.Map() +} + +func (m *modelOpCache) Exists(key string) bool { + return m.status.Exists(key) +} + func RegisterUIRoutes(app *fiber.App, cl *config.BackendConfigLoader, ml *model.ModelLoader, @@ -29,7 +63,7 @@ func RegisterUIRoutes(app *fiber.App, auth func(*fiber.Ctx) error) { // keeps the state of models that are being installed from the UI - var processingModels = xsync.NewSyncedMap[string, string]() + var processingModels = NewModelOpCache() // modelStatus returns the current status of the models being processed (installation or deletion) // it is called asynchonously from the UI @@ -232,6 +266,8 @@ func RegisterUIRoutes(app *fiber.App, return c.SendString(elements.ProgressBar("100")) } if status.Error != nil { + // TODO: instead of deleting the job, we should keep it in the cache and make it dismissable + processingModels.DeleteUUID(jobUID) return c.SendString(elements.ErrorProgress(status.Error.Error(), status.GalleryModelName)) } @@ -246,12 +282,7 @@ func RegisterUIRoutes(app *fiber.App, status := galleryService.GetStatus(jobUID) galleryID := "" - for _, k := range processingModels.Keys() { - if processingModels.Get(k) == jobUID { - galleryID = k - processingModels.Delete(k) - } - } + processingModels.DeleteUUID(jobUID) if galleryID == "" { log.Debug().Msgf("no processing model found for job : %+v\n", jobUID) } From d4a3872dd9850331896c75c9ca3a2e96b5d52c95 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 28 Jul 2024 12:46:18 +0200 Subject: [PATCH 0088/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3034) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 2df7c225..a3d908cf 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be 
-CPPLLAMA_VERSION?=5e2727fe0321c38d1664d26173c654fa1801dc5f +CPPLLAMA_VERSION?=4730faca618ff9cee0780580145e3cbe86f24876 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 86f8d5b50acd8fe88af4f537be0d42472772b928 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 28 Jul 2024 13:11:23 +0200 Subject: [PATCH 0089/1851] chore(model-gallery): :arrow_up: update checksum (#3036) :arrow_up: Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- gallery/index.yaml | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 46ba1122..b6216ede 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -205,15 +205,15 @@ - https://huggingface.co/aifeifei799/Llama-3.1-8B-Instruct-Fei-v1-Uncensored - https://huggingface.co/mradermacher/Llama-3.1-8B-Instruct-Fei-v1-Uncensored-GGUF description: | - Llama-3.1-8B-Instruct Uncensored - more informtion look at Llama-3.1-8B-Instruct + Llama-3.1-8B-Instruct Uncensored + more informtion look at Llama-3.1-8B-Instruct overrides: parameters: model: Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf files: - filename: Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf - sha256: 12fef8ff0a5c4cf6988523d33d89287edb7531f0d1644707548f45f1387e398a uri: huggingface://mradermacher/Llama-3.1-8B-Instruct-Fei-v1-Uncensored-GGUF/Llama-3.1-8B-Instruct-Fei-v1-Uncensored.Q4_K_M.gguf + sha256: 6b1985616160712eb884c34132dc0602fa4600a19075e3a7b179119b89b73f77 - !!merge <<: *llama31 name: "lumimaid-v0.2-8b" urls: @@ -221,15 +221,15 @@ - https://huggingface.co/mradermacher/Lumimaid-v0.2-8B-GGUF icon: https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/TUcHg7LKNjfo0sni88Ps7.png description: | - This model is based on: Meta-Llama-3.1-8B-Instruct + This model is based on: Meta-Llama-3.1-8B-Instruct - Wandb: https://wandb.ai/undis95/Lumi-Llama-3-1-8B?nw=nwuserundis95 + Wandb: https://wandb.ai/undis95/Lumi-Llama-3-1-8B?nw=nwuserundis95 - Lumimaid 0.1 -> 0.2 is a HUGE step up dataset wise. + Lumimaid 0.1 -> 0.2 is a HUGE step up dataset wise. - As some people have told us our models are sloppy, Ikari decided to say fuck it and literally nuke all chats out with most slop. + As some people have told us our models are sloppy, Ikari decided to say fuck it and literally nuke all chats out with most slop. - Our dataset stayed the same since day one, we added data over time, cleaned them, and repeat. After not releasing model for a while because we were never satisfied, we think it's time to come back! + Our dataset stayed the same since day one, we added data over time, cleaned them, and repeat. After not releasing model for a while because we were never satisfied, we think it's time to come back! 
overrides: parameters: model: Lumimaid-v0.2-8B.Q4_K_M.gguf From 5d08b9ac68f04431165d94ef9a3ec42b31718bad Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 28 Jul 2024 23:47:02 +0200 Subject: [PATCH 0090/1851] docs: :arrow_up: update docs version mudler/LocalAI (#3039) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index efda370f..94160f08 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.19.2" + "version": "v2.19.3" } From 3a70cf311b3c5e2a54351da99adce7fdb27f8f84 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 28 Jul 2024 23:53:00 +0200 Subject: [PATCH 0091/1851] chore(model-gallery): :arrow_up: update checksum (#3040) :arrow_up: Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- gallery/index.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index b6216ede..25ac7e64 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -84,8 +84,8 @@ model: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf files: - filename: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf - sha256: 18cca47adfb3954af2b49e3aa2ce1604158337aff45fab2e7654039b65c7683e uri: huggingface://mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf + sha256: 2e1fd6d93b19cc6548b2b8ed2d3f1f34b432ee0573f3dcf358bbaab4f23c760b - !!merge <<: *llama31 name: "llama-3.1-70b-japanese-instruct-2407" urls: @@ -173,8 +173,8 @@ model: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf files: - filename: DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf - sha256: 6730efc0628c7534189487b52ed5a358a0a2c3ecb062824eccc8e0444eaa212f uri: huggingface://mradermacher/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored-i1-GGUF/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.i1-Q4_K_M.gguf + sha256: 9632316d735365087f36083dec320a71995650deb86cf74f39ab071e43114eb8 - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.1-uncensored-iq-imatrix-request" icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/iDV5GTVJbjkvMp1set-ZC.png From 7c4e5268539c913b454003ce478599c10a7bc0bc Mon Sep 17 00:00:00 2001 From: Dave Date: Sun, 28 Jul 2024 19:19:36 -0400 Subject: [PATCH 0092/1851] fix: install.sh bash specific equality check (#3038) fix == to = for sh portability Signed-off-by: Dave Lee --- docs/static/install.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/static/install.sh b/docs/static/install.sh index 3209b24e..8d928750 100644 --- a/docs/static/install.sh +++ b/docs/static/install.sh @@ -194,7 +194,7 @@ install_container_toolkit_yum() { curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \ $SUDO tee /etc/yum.repos.d/nvidia-container-toolkit.repo - if [ "$PACKAGE_MANAGER" == "dnf" ]; then + if [ "$PACKAGE_MANAGER" = "dnf" ]; then $SUDO $PACKAGE_MANAGER config-manager --enable nvidia-container-toolkit-experimental else 
$SUDO $PACKAGE_MANAGER -y install yum-utils @@ -629,7 +629,7 @@ case "$ARCH" in *) fatal "Unsupported architecture: $ARCH" ;; esac -if [ "$OS" == "Darwin" ]; then +if [ "$OS" = "Darwin" ]; then install_binary_darwin exit 0 fi From cb042713e88023e9823cc0ed147cb0700868614b Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Mon, 29 Jul 2024 09:39:48 +0200 Subject: [PATCH 0093/1851] chore(model-gallery): :arrow_up: update checksum (#3043) :arrow_up: Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- gallery/index.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 25ac7e64..923107bd 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -69,8 +69,8 @@ model: Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf files: - filename: Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf - sha256: 8de80021b9438f0925a41ae73f77cb73fcfa30090e03a0919ce23d2b9818e9c7 uri: huggingface://InferenceIllusionist/Meta-Llama-3.1-8B-Claude-iMat-GGUF/Meta-Llama-3.1-8B-Claude-iMat-Q4_K_M.gguf + sha256: 6d175432f66d10dfed9737f73a5073d513d18e1ee7bd4b9cf2a59deb359f36ff - !!merge <<: *llama31 name: "meta-llama-3.1-8b-instruct-abliterated" icon: https://i.imgur.com/KhorYYG.png From e7df875db36605f1ec2f6f3c0517b2890b49bc09 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 29 Jul 2024 10:17:49 +0200 Subject: [PATCH 0094/1851] models(gallery): add magnum-32b-v1 (#3044) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 923107bd..2b7cef4e 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -515,6 +515,21 @@ - filename: StellarDong-72b.i1-Q4_K_M.gguf sha256: 4c5012f0a034f40a044904891343ade2594f29c28a8a9d8052916de4dc5a61df uri: huggingface://mradermacher/StellarDong-72b-i1-GGUF/StellarDong-72b.i1-Q4_K_M.gguf +- !!merge <<: *qwen2 + name: "magnum-32b-v1-i1" + icon: https://cdn-uploads.huggingface.co/production/uploads/635567189c72a7e742f1419c/PK7xRSd18Du0bX-w_t-9c.png + urls: + - https://huggingface.co/anthracite-org/magnum-32b-v1 + - https://huggingface.co/mradermacher/magnum-32b-v1-i1-GGUF + description: | + This is the second in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of Qwen1.5 32B. 
+ overrides: + parameters: + model: magnum-32b-v1.i1-Q4_K_M.gguf + files: + - filename: magnum-32b-v1.i1-Q4_K_M.gguf + sha256: a31704ce0d7e5b774f155522b9ab7ef6015a4ece4e9056bf4dfc6cac561ff0a3 + uri: huggingface://mradermacher/magnum-32b-v1-i1-GGUF/magnum-32b-v1.i1-Q4_K_M.gguf - &mistral03 ## START Mistral url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master" From 8a39707b367063663bbb58675f6bc1a0e0d1234c Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 29 Jul 2024 16:44:48 +0200 Subject: [PATCH 0095/1851] models(gallery): add lumimaid-v0.2-70b-i1 (#3045) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 2b7cef4e..31848f2a 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -237,6 +237,29 @@ - filename: Lumimaid-v0.2-8B.Q4_K_M.gguf sha256: c8024fcb49c71410903d0d076a1048249fa48b31637bac5177bf5c3f3d603d85 uri: huggingface://mradermacher/Lumimaid-v0.2-8B-GGUF/Lumimaid-v0.2-8B.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "lumimaid-v0.2-70b-i1" + icon: https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/HY1KTq6FMAm-CwmY8-ndO.png + urls: + - https://huggingface.co/NeverSleep/Lumimaid-v0.2-70B + - https://huggingface.co/mradermacher/Lumimaid-v0.2-70B-i1-GGUF + description: | + This model is based on: Meta-Llama-3.1-8B-Instruct + + Wandb: https://wandb.ai/undis95/Lumi-Llama-3-1-8B?nw=nwuserundis95 + + Lumimaid 0.1 -> 0.2 is a HUGE step up dataset wise. + + As some people have told us our models are sloppy, Ikari decided to say fuck it and literally nuke all chats out with most slop. + + Our dataset stayed the same since day one, we added data over time, cleaned them, and repeat. After not releasing model for a while because we were never satisfied, we think it's time to come back! 
+ overrides: + parameters: + model: Lumimaid-v0.2-70B.i1-Q4_K_M.gguf + files: + - filename: Lumimaid-v0.2-70B.i1-Q4_K_M.gguf + sha256: 4857da8685cb0f3d2b8b8c91fb0c07b35b863eb7c185e93ed83ac338e095cbb5 + uri: huggingface://mradermacher/Lumimaid-v0.2-70B-i1-GGUF/Lumimaid-v0.2-70B.i1-Q4_K_M.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 6f8d6f601abfd203c405952f8d25fac192163615 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 29 Jul 2024 16:45:00 +0200 Subject: [PATCH 0096/1851] models(gallery): add sekhmet_aleph-l3.1-8b-v0.1-i1 (#3046) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 31848f2a..b6b8aba4 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -145,6 +145,19 @@ - filename: llama3.1-8b-fireplace2-q4_k_m.gguf sha256: 54527fd2474b576086ea31e759214ab240abe2429ae623a02d7ba825cc8cb13e uri: huggingface://mudler/Llama3.1-8B-Fireplace2-Q4_K_M-GGUF/llama3.1-8b-fireplace2-q4_k_m.gguf +- !!merge <<: *llama31 + name: "sekhmet_aleph-l3.1-8b-v0.1-i1" + icon: https://cdn-uploads.huggingface.co/production/uploads/642265bc01c62c1e4102dc36/SVyiW4mu495ngqszJGWRl.png + urls: + - https://huggingface.co/Nitral-Archive/Sekhmet_Aleph-L3.1-8B-v0.1 + - https://huggingface.co/mradermacher/Sekhmet_Aleph-L3.1-8B-v0.1-i1-GGUF + overrides: + parameters: + model: Sekhmet_Aleph-L3.1-8B-v0.1.i1-Q4_K_M.gguf + files: + - filename: Sekhmet_Aleph-L3.1-8B-v0.1.i1-Q4_K_M.gguf + sha256: 5b6f4eaa2091bf13a2b563a54a3f87b22efa7f2862362537c956c70da6e11cea + uri: huggingface://mradermacher/Sekhmet_Aleph-L3.1-8B-v0.1-i1-GGUF/Sekhmet_Aleph-L3.1-8B-v0.1.i1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From 4700c9df929ba41f6a3c1c171d561064e155b50b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 29 Jul 2024 20:15:53 +0200 Subject: [PATCH 0097/1851] models(gallery): add l3.1-8b-llamoutcast-i1 (#3047) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index b6b8aba4..fb61defe 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -158,6 +158,26 @@ - filename: Sekhmet_Aleph-L3.1-8B-v0.1.i1-Q4_K_M.gguf sha256: 5b6f4eaa2091bf13a2b563a54a3f87b22efa7f2862362537c956c70da6e11cea uri: huggingface://mradermacher/Sekhmet_Aleph-L3.1-8B-v0.1-i1-GGUF/Sekhmet_Aleph-L3.1-8B-v0.1.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "l3.1-8b-llamoutcast-i1" + icon: https://files.catbox.moe/ecgn0m.jpg + urls: + - https://huggingface.co/Envoid/L3.1-8B-Llamoutcast + - https://huggingface.co/mradermacher/L3.1-8B-Llamoutcast-i1-GGUF + description: | + Warning: this model is utterly cursed. + Llamoutcast + + This model was originally intended to be a DADA finetune of Llama-3.1-8B-Instruct but the results were unsatisfactory. So it received some additional finetuning on a rawtext dataset and now it is utterly cursed. + + It responds to Llama-3 Instruct formatting. 
+ overrides: + parameters: + model: L3.1-8B-Llamoutcast.i1-Q4_K_M.gguf + files: + - filename: L3.1-8B-Llamoutcast.i1-Q4_K_M.gguf + sha256: 438ca0a7e9470f5ee40f3b14dc2da41b1cafc4ad4315dead3eb57924109d5cf6 + uri: huggingface://mradermacher/L3.1-8B-Llamoutcast-i1-GGUF/L3.1-8B-Llamoutcast.i1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From e5f91fbba2a08c286a4746bbd981433edaabeb47 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 29 Jul 2024 21:28:38 +0000 Subject: [PATCH 0098/1851] chore(deps): Bump langchain from 0.2.10 to 0.2.11 in /examples/langchain/langchainpy-localai-example (#3053) chore(deps): Bump langchain Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.10 to 0.2.11. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.10...langchain==0.2.11) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 0e03d543..66a1b70f 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -10,7 +10,7 @@ debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.0.3 idna==3.7 -langchain==0.2.10 +langchain==0.2.11 langchain-community==0.2.9 marshmallow==3.21.3 marshmallow-enum==1.5.1 From 3dfed64a1569a6905468d2145493ebc7d37d7ddd Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 29 Jul 2024 21:29:08 +0000 Subject: [PATCH 0099/1851] chore(deps): Bump openai from 1.37.0 to 1.37.1 in /examples/langchain/langchainpy-localai-example (#3051) chore(deps): Bump openai Bumps [openai](https://github.com/openai/openai-python) from 1.37.0 to 1.37.1. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.37.0...v1.37.1) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 66a1b70f..f29cb78a 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -18,7 +18,7 @@ multidict==6.0.5 mypy-extensions==1.0.0 numexpr==2.10.1 numpy==2.0.1 -openai==1.37.0 +openai==1.37.1 openapi-schema-pydantic==1.2.4 packaging>=23.2 pydantic==2.8.2 From 40604e877c9e9ec4a4c99a4f92cc7b8bd3fb4b49 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 29 Jul 2024 21:45:52 +0000 Subject: [PATCH 0100/1851] chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/autogptq (#3048) chore(deps): Bump setuptools in /backend/python/autogptq Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/autogptq/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/autogptq/requirements-intel.txt b/backend/python/autogptq/requirements-intel.txt index 635b4c31..755e19d8 100644 --- a/backend/python/autogptq/requirements-intel.txt +++ b/backend/python/autogptq/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 5c747a16c4dddfff5687eeaf3464a2bdd232eea1 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 00:43:12 +0000 Subject: [PATCH 0101/1851] chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm (#3061) chore(deps): Bump setuptools in /backend/python/vllm Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vllm/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vllm/requirements-intel.txt b/backend/python/vllm/requirements-intel.txt index 635b4c31..755e19d8 100644 --- a/backend/python/vllm/requirements-intel.txt +++ b/backend/python/vllm/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 0da042dc2b6d5f855e7958859c1bbd979afef6d3 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 01:11:05 +0000 Subject: [PATCH 0102/1851] chore(deps): Bump chromadb from 0.5.4 to 0.5.5 in /examples/langchain-chroma (#3060) chore(deps): Bump chromadb in /examples/langchain-chroma Bumps [chromadb](https://github.com/chroma-core/chroma) from 0.5.4 to 0.5.5. - [Release notes](https://github.com/chroma-core/chroma/releases) - [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md) - [Commits](https://github.com/chroma-core/chroma/compare/0.5.4...0.5.5) --- updated-dependencies: - dependency-name: chromadb dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 89ca2db7..50d6dc4f 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.10 openai==1.37.0 -chromadb==0.5.4 +chromadb==0.5.5 llama-index==0.10.56 \ No newline at end of file From 9948ff27157cc1403b5a26de448e6ece68132d91 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 01:21:56 +0000 Subject: [PATCH 0103/1851] chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/parler-tts (#3062) chore(deps): Bump setuptools in /backend/python/parler-tts Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/parler-tts/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/parler-tts/requirements-intel.txt b/backend/python/parler-tts/requirements-intel.txt index 5c4aa6a5..58a2a1dd 100644 --- a/backend/python/parler-tts/requirements-intel.txt +++ b/backend/python/parler-tts/requirements-intel.txt @@ -3,4 +3,4 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 0dd02b2ad77d5c80e6090b4dc0e42fac14352d9a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 02:15:53 +0000 Subject: [PATCH 0104/1851] chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/rerankers (#3067) chore(deps): Bump setuptools in /backend/python/rerankers Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/rerankers/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/rerankers/requirements-intel.txt b/backend/python/rerankers/requirements-intel.txt index 635b4c31..755e19d8 100644 --- a/backend/python/rerankers/requirements-intel.txt +++ b/backend/python/rerankers/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From f822bebfd8b55d4c12d0200805fe610f8823bb5d Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 02:29:39 +0000 Subject: [PATCH 0105/1851] chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers-musicgen (#3066) chore(deps): Bump setuptools in /backend/python/transformers-musicgen Bumps [setuptools](https://github.com/pypa/setuptools) from 69.5.1 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers-musicgen/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers-musicgen/requirements-intel.txt b/backend/python/transformers-musicgen/requirements-intel.txt index 95d4848c..755e19d8 100644 --- a/backend/python/transformers-musicgen/requirements-intel.txt +++ b/backend/python/transformers-musicgen/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 45233937b74d0891f34cc82206a227c5ded1db2c Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 03:06:11 +0000 Subject: [PATCH 0106/1851] chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/coqui (#3068) chore(deps): Bump setuptools in /backend/python/coqui Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/coqui/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/coqui/requirements-intel.txt b/backend/python/coqui/requirements-intel.txt index 5c4aa6a5..58a2a1dd 100644 --- a/backend/python/coqui/requirements-intel.txt +++ b/backend/python/coqui/requirements-intel.txt @@ -3,4 +3,4 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 9c96a73d9355aaa636f7b5c21f7eef16587ec24f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 03:27:00 +0000 Subject: [PATCH 0107/1851] chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vall-e-x (#3069) chore(deps): Bump setuptools in /backend/python/vall-e-x Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vall-e-x/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vall-e-x/requirements-intel.txt b/backend/python/vall-e-x/requirements-intel.txt index 5c4aa6a5..58a2a1dd 100644 --- a/backend/python/vall-e-x/requirements-intel.txt +++ b/backend/python/vall-e-x/requirements-intel.txt @@ -3,4 +3,4 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From f24fac43da5d0926c1eed88806d0bce270cd2771 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 03:58:11 +0000 Subject: [PATCH 0108/1851] chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/petals (#3070) chore(deps): Bump setuptools in /backend/python/petals Bumps [setuptools](https://github.com/pypa/setuptools) from 70.3.0 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v70.3.0...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/petals/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/petals/requirements-intel.txt b/backend/python/petals/requirements-intel.txt index 635b4c31..755e19d8 100644 --- a/backend/python/petals/requirements-intel.txt +++ b/backend/python/petals/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 3feb8690250b6d0c958df41c0532eff1918f0c1f Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 04:02:15 +0000 Subject: [PATCH 0109/1851] chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers (#3071) chore(deps): Bump setuptools in /backend/python/transformers Bumps [setuptools](https://github.com/pypa/setuptools) from 69.5.1 to 72.1.0. - [Release notes](https://github.com/pypa/setuptools/releases) - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst) - [Commits](https://github.com/pypa/setuptools/compare/v69.5.1...v72.1.0) --- updated-dependencies: - dependency-name: setuptools dependency-type: direct:production update-type: version-update:semver-major ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 55925b32..29d4f55e 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -6,4 +6,4 @@ torch certifi intel-extension-for-transformers bitsandbytes -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 From 198bc6d939c3175be1f589a80c5d92d5244dff17 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 30 Jul 2024 04:39:56 +0000 Subject: [PATCH 0110/1851] chore(deps): Bump streamlit from 1.36.0 to 1.37.0 in /examples/streamlit-bot (#3072) chore(deps): Bump streamlit in /examples/streamlit-bot Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.36.0 to 1.37.0. - [Release notes](https://github.com/streamlit/streamlit/releases) - [Commits](https://github.com/streamlit/streamlit/compare/1.36.0...1.37.0) --- updated-dependencies: - dependency-name: streamlit dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/streamlit-bot/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/streamlit-bot/requirements.txt b/examples/streamlit-bot/requirements.txt index ed2a5980..63291928 100644 --- a/examples/streamlit-bot/requirements.txt +++ b/examples/streamlit-bot/requirements.txt @@ -1,2 +1,2 @@ -streamlit==1.36.0 +streamlit==1.37.0 requests \ No newline at end of file From 12b470f00ae5a9c74ac167fae42260745f645916 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 30 Jul 2024 07:28:14 +0200 Subject: [PATCH 0111/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3075) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index a3d908cf..f939f715 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=4730faca618ff9cee0780580145e3cbe86f24876 +CPPLLAMA_VERSION?=75af08c475e285888f66556d0f459c533b7deb95 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From d50c72a657be23e574f26ecfb8f9fb7e470ef6e1 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 09:20:57 +0200 Subject: [PATCH 0112/1851] Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers-musicgen" (#3077) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python…" This reverts commit f822bebfd8b55d4c12d0200805fe610f8823bb5d. 
--- backend/python/transformers-musicgen/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers-musicgen/requirements-intel.txt b/backend/python/transformers-musicgen/requirements-intel.txt index 755e19d8..95d4848c 100644 --- a/backend/python/transformers-musicgen/requirements-intel.txt +++ b/backend/python/transformers-musicgen/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From a7dbeb36ca0810009f28b342a7b53566796d0252 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 09:21:09 +0200 Subject: [PATCH 0113/1851] Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python/transformers" (#3078) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Revert "chore(deps): Bump setuptools from 69.5.1 to 72.1.0 in /backend/python…" This reverts commit 3feb8690250b6d0c958df41c0532eff1918f0c1f. --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 29d4f55e..55925b32 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -6,4 +6,4 @@ torch certifi intel-extension-for-transformers bitsandbytes -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 From f1e90575f333b17bb5644e2402ab2bc970e0312a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 09:21:45 +0200 Subject: [PATCH 0114/1851] Revert "chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python/vllm" (#3079) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Revert "chore(deps): Bump setuptools from 70.3.0 to 72.1.0 in /backend/python…" This reverts commit 5c747a16c4dddfff5687eeaf3464a2bdd232eea1. 
--- backend/python/vllm/requirements-intel.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vllm/requirements-intel.txt b/backend/python/vllm/requirements-intel.txt index 755e19d8..635b4c31 100644 --- a/backend/python/vllm/requirements-intel.txt +++ b/backend/python/vllm/requirements-intel.txt @@ -2,4 +2,4 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From abcbbbed2d83b1edc086fa05b8634e4f35e22918 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 10:04:47 +0200 Subject: [PATCH 0115/1851] models(gallery): add l3.1-8b-celeste-v1.5 (#3080) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index fb61defe..1fe7b6ee 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -293,6 +293,22 @@ - filename: Lumimaid-v0.2-70B.i1-Q4_K_M.gguf sha256: 4857da8685cb0f3d2b8b8c91fb0c07b35b863eb7c185e93ed83ac338e095cbb5 uri: huggingface://mradermacher/Lumimaid-v0.2-70B-i1-GGUF/Lumimaid-v0.2-70B.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "l3.1-8b-celeste-v1.5" + icon: https://cdn-uploads.huggingface.co/production/uploads/630cf5d14ca0a22768bbe10c/QcU3xEgVu18jeFtMFxIw-.webp + urls: + - https://huggingface.co/nothingiisreal/L3.1-8B-Celeste-V1.5 + - https://huggingface.co/bartowski/L3.1-8B-Celeste-V1.5-GGUF + description: | + The LLM model is a large language model trained on a combination of datasets including nothingiisreal/c2-logs-cleaned, kalomaze/Opus_Instruct_25k, and nothingiisreal/Reddit-Dirty-And-WritingPrompts. The training was performed on a combination of English-language data using the Hugging Face Transformers library. + Trained on LLaMA 3.1 8B Instruct at 8K context using a new mix of Reddit Writing Prompts, Kalo's Opus 25K Instruct and c2 logs cleaned This version has the highest coherency and is very strong on OOC: instruct following. + overrides: + parameters: + model: L3.1-8B-Celeste-V1.5-Q4_K_M.gguf + files: + - filename: L3.1-8B-Celeste-V1.5-Q4_K_M.gguf + sha256: a408dfbbd91ed5561f70d3129af040dfd06704d6c7fa21146aa9f09714aafbc6 + uri: huggingface://bartowski/L3.1-8B-Celeste-V1.5-GGUF/L3.1-8B-Celeste-V1.5-Q4_K_M.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 2d59c99d31a422a55a7e95cc64e96614372fef20 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 12:07:52 +0200 Subject: [PATCH 0116/1851] models(gallery): add llama-guard-3-8b (#3082) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 1fe7b6ee..d9f9e5b7 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -178,6 +178,22 @@ - filename: L3.1-8B-Llamoutcast.i1-Q4_K_M.gguf sha256: 438ca0a7e9470f5ee40f3b14dc2da41b1cafc4ad4315dead3eb57924109d5cf6 uri: huggingface://mradermacher/L3.1-8B-Llamoutcast-i1-GGUF/L3.1-8B-Llamoutcast.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama-guard-3-8b" + urls: + - https://huggingface.co/meta-llama/Llama-Guard-3-8B + - https://huggingface.co/QuantFactory/Llama-Guard-3-8B-GGUF + description: | + Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. 
Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated. + + Llama Guard 3 was aligned to safeguard against the MLCommons standardized hazards taxonomy and designed to support Llama 3.1 capabilities. Specifically, it provides content moderation in 8 languages, and was optimized to support safety and security for search and code interpreter tool calls. + overrides: + parameters: + model: Llama-Guard-3-8B.Q4_K_M.gguf + files: + - filename: Llama-Guard-3-8B.Q4_K_M.gguf + sha256: c5ea8760a1e544eea66a8915fcc3fbd2c67357ea2ee6871a9e6a6c33b64d4981 + uri: huggingface://QuantFactory/Llama-Guard-3-8B-GGUF/Llama-Guard-3-8B.Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From 17634b394b5c2222586265d47a29d1bac929f39b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 12:12:55 +0200 Subject: [PATCH 0117/1851] models(gallery): add meta-llama-3-instruct-8.9b-brainstorm-5x-form-11 (#3083) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index d9f9e5b7..71cfc20b 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -3705,6 +3705,19 @@ - filename: calme-2.4-llama3-70b.Q4_K_M.gguf sha256: 0b44ac8a88395dfc60f1b9d3cfffc0ffef74ec0a302e610ef91fc787187568f2 uri: huggingface://mradermacher/calme-2.4-llama3-70b-GGUF/calme-2.4-llama3-70b.Q4_K_M.gguf +- !!merge <<: *llama3 + name: "meta-llama-3-instruct-8.9b-brainstorm-5x-form-11" + urls: + - https://huggingface.co/DavidAU/Meta-Llama-3-Instruct-8.9B-BRAINSTORM-5x-FORM-11-GGUF + description: | + Meta-Llama-3-8B Instruct (now at 8.9B) is an enhanced version of the LLM model, specifically designed for creative use cases such as story writing, roleplaying, and fiction. This model has been augmented through the "Brainstorm" process, which involves expanding and calibrating the reasoning center of the LLM to improve its performance in various creative tasks. The enhancements brought by this process include more detailed and nuanced descriptions, stronger prose, and a greater sense of immersion in the story. The model is capable of generating long and vivid content, with fewer clichés and more focused, coherent narratives. Users can provide more instructions and details to elicit stronger and more engaging responses from the model. The "Brainstorm" process has been tested on multiple LLM models, including Llama2, Llama3, and Mistral, as well as on individual models like Llama3 Instruct, Mistral Instruct, and custom fine-tuned models. 
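Both entries above, the Llama Guard 3 classifier and the Brainstorm creative model, are consumed through LocalAI's standard OpenAI-compatible API once installed. As a minimal sketch, assuming a local instance on the default port and the gallery names used above (the safe/unsafe verdict and any violated category labels are plain text produced by the guard model itself):

```bash
# Prompt classification with Llama Guard 3: the model answers with "safe"
# or "unsafe" plus the violated content categories, as ordinary text.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-guard-3-8b",
    "messages": [{"role": "user", "content": "Explain how to hotwire a car."}]
  }'
```

For response classification, the assistant turn to be vetted is included as an additional `assistant` message in the same request.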
+ overrides: + parameters: + model: Meta-Llama-3-8B-Instruct-exp5-11-Q4_K_M.gguf + files: + - filename: Meta-Llama-3-8B-Instruct-exp5-11-Q4_K_M.gguf + sha256: 5dd81b8b809667d10036499affdd1461cf95af50b405cbc9f800b421a4b60e98 + uri: huggingface://DavidAU/Meta-Llama-3-Instruct-8.9B-BRAINSTORM-5x-FORM-11-GGUF/Meta-Llama-3-8B-Instruct-exp5-11-Q4_K_M.gguf - &command-R ### START Command-r url: "github:mudler/LocalAI/gallery/command-r.yaml@master" From 274487c5eb2d6ae36c4e2f077ed2f5b4e94a2c48 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 15:04:13 +0200 Subject: [PATCH 0118/1851] fix(llama-cpp): do not compress with UPX (#3084) Fixes: https://github.com/mudler/LocalAI/issues/3041 Signed-off-by: Ettore Di Giacinto --- Makefile | 6 ------ 1 file changed, 6 deletions(-) diff --git a/Makefile b/Makefile index f939f715..92b1fbdc 100644 --- a/Makefile +++ b/Makefile @@ -783,9 +783,6 @@ else echo "BUILD_GRPC_FOR_BACKEND_LLAMA is not defined." LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/${VARIANT} grpc-server endif -ifneq ($(UPX),) - $(UPX) backend/cpp/${VARIANT}/grpc-server -endif # This target is for manually building a variant with-auto detected flags backend-assets/grpc/llama-cpp: backend-assets/grpc backend/cpp/llama/llama.cpp @@ -858,9 +855,6 @@ backend-assets/grpc/llama-cpp-grpc: backend-assets/grpc backend/cpp/llama/llama. backend-assets/util/llama-cpp-rpc-server: backend-assets/grpc/llama-cpp-grpc mkdir -p backend-assets/util/ cp -rf backend/cpp/llama-grpc/llama.cpp/build/bin/rpc-server backend-assets/util/llama-cpp-rpc-server -ifneq ($(UPX),) - $(UPX) backend-assets/util/llama-cpp-rpc-server -endif backend-assets/grpc/llama-ggml: sources/go-llama.cpp sources/go-llama.cpp/libbinding.a backend-assets/grpc CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/go-llama.cpp LIBRARY_PATH=$(CURDIR)/sources/go-llama.cpp \ From 57ea7f81bb1d0749696e1423f65a96bae7b5ef86 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 30 Jul 2024 17:06:22 +0200 Subject: [PATCH 0119/1851] fix(ci): update openvoice checkpoints URLs (#3085) Signed-off-by: Ettore Di Giacinto --- backend/python/openvoice/test.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/openvoice/test.sh b/backend/python/openvoice/test.sh index 218c0dcd..6c0a840f 100755 --- a/backend/python/openvoice/test.sh +++ b/backend/python/openvoice/test.sh @@ -5,7 +5,7 @@ source $(dirname $0)/../common/libbackend.sh # Download checkpoints if not present if [ ! 
-d "checkpoints_v2" ]; then - wget https://myshell-public-repo-hosting.s3.amazonaws.com/openvoice/checkpoints_v2_0417.zip -O checkpoints_v2.zip + wget https://myshell-public-repo-host.s3.amazonaws.com/openvoice/checkpoints_v2_0417.zip -O checkpoints_v2.zip unzip checkpoints_v2.zip fi From 9b21f0d6ad91269de23239fc090f8f5a88367dd7 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 30 Jul 2024 23:55:24 +0200 Subject: [PATCH 0120/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3086) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 92b1fbdc..607389f1 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=75af08c475e285888f66556d0f459c533b7deb95 +CPPLLAMA_VERSION?=7e72aa74fd676a093eb9970e761085ec22734c71 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 98ffc00926afc440951082e37d87f8377c6996f3 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 09:17:10 +0200 Subject: [PATCH 0121/1851] models(gallery): add sunfall-simpo (#3088) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 71cfc20b..71673ec9 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1155,6 +1155,32 @@ - filename: Gemmoy-9B-G2-MK.3.i1-Q4_K_M.gguf sha256: 0d1004a246fbda7f1408a6841129b73c4100e697bd0a6806fc698eabbb0802a1 uri: huggingface://mradermacher/Gemmoy-9B-G2-MK.3-i1-GGUF/Gemmoy-9B-G2-MK.3.i1-Q4_K_M.gguf +- !!merge <<: *gemma + name: "sunfall-simpo-9b" + urls: + - https://huggingface.co/mradermacher/sunfall-SimPO-9B-GGUF + description: | + Crazy idea that what if you put the LoRA from crestf411/sunfall-peft on top of princeton-nlp/gemma-2-9b-it-SimPO and therefore this exists solely for that purpose alone in the universe. + overrides: + parameters: + model: sunfall-SimPO-9B.Q4_K_M.gguf + files: + - filename: sunfall-SimPO-9B.Q4_K_M.gguf + sha256: 810c51c6ce34107706d921531b97cfa409cd53c215d18b88bce7cdb617f73ceb + uri: huggingface://mradermacher/sunfall-SimPO-9B-GGUF/sunfall-SimPO-9B.Q4_K_M.gguf +- !!merge <<: *gemma + name: "sunfall-simpo-9b-i1" + urls: + - https://huggingface.co/mradermacher/sunfall-SimPO-9B-i1-GGUF + description: | + Crazy idea that what if you put the LoRA from crestf411/sunfall-peft on top of princeton-nlp/gemma-2-9b-it-SimPO and therefore this exists solely for that purpose alone in the universe. 
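Both Sunfall entries are literally a LoRA stacked onto a SimPO base, and the same construction can be tried locally without a pre-merged GGUF. The sketch below is an assumption: the `lora_adapter` and `lora_base` fields are taken from LocalAI's llama.cpp model-config conventions, and both file paths are placeholders rather than published artifacts:

```bash
# Hypothetical model config applying a LoRA adapter on top of a GGUF base.
# Field names are assumed from LocalAI's llama.cpp backend; paths are placeholders.
cat > models/sunfall-lora.yaml <<'EOF'
name: sunfall-lora
parameters:
  model: gemma-2-9b-it-SimPO.Q4_K_M.gguf
lora_base: gemma-2-9b-it-SimPO.Q4_K_M.gguf
lora_adapter: sunfall-peft-adapter.gguf
EOF
```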
+ overrides: + parameters: + model: sunfall-SimPO-9B.i1-Q4_K_M.gguf + files: + - filename: sunfall-SimPO-9B.i1-Q4_K_M.gguf + sha256: edde9df372a9a5b2316dc6822dc2f52f5a2059103dd7f08072e5a5355c5f5d0b + uri: huggingface://mradermacher/sunfall-SimPO-9B-i1-GGUF/sunfall-SimPO-9B.i1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From 2775edb3f0941e4f2886a7f48bc3df422b7b17b6 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 09:21:24 +0200 Subject: [PATCH 0122/1851] models(gallery): add genius-llama3.1-i1 (#3089) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 71673ec9..a8df3d97 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -194,6 +194,21 @@ - filename: Llama-Guard-3-8B.Q4_K_M.gguf sha256: c5ea8760a1e544eea66a8915fcc3fbd2c67357ea2ee6871a9e6a6c33b64d4981 uri: huggingface://QuantFactory/Llama-Guard-3-8B-GGUF/Llama-Guard-3-8B.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "genius-llama3.1-i1" + icon: https://github.com/fangyuan-ksgk/GeniusUpload/assets/66006349/7272c93e-9806-461c-a3d0-2e50ef2b7af0 + urls: + - https://huggingface.co/Ksgk-fy/Genius-Llama3.1 + - https://huggingface.co/mradermacher/Genius-Llama3.1-i1-GGUF + description: | + Finetuned Llama-3.1 base on Lex Fridman's podcast transcript. + overrides: + parameters: + model: Genius-Llama3.1.i1-Q4_K_M.gguf + files: + - filename: Genius-Llama3.1.i1-Q4_K_M.gguf + sha256: a272bb2a6ab7ed565738733fb8af8e345b177eba9e76ce615ea845c25ebf8cd5 + uri: huggingface://mradermacher/Genius-Llama3.1-i1-GGUF/Genius-Llama3.1.i1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From 92faf5fd1dbbc59cbb481355d08fb220738ed6f1 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 09:25:48 +0200 Subject: [PATCH 0123/1851] models(gallery): add seeker-9b (#3090) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index a8df3d97..e88a16f6 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1196,6 +1196,21 @@ - filename: sunfall-SimPO-9B.i1-Q4_K_M.gguf sha256: edde9df372a9a5b2316dc6822dc2f52f5a2059103dd7f08072e5a5355c5f5d0b uri: huggingface://mradermacher/sunfall-SimPO-9B-i1-GGUF/sunfall-SimPO-9B.i1-Q4_K_M.gguf +- !!merge <<: *gemma + name: "seeker-9b" + icon: https://huggingface.co/lodrick-the-lafted/seeker-9b/resolve/main/seeker.webp + urls: + - https://huggingface.co/lodrick-the-lafted/seeker-9b + - https://huggingface.co/mradermacher/seeker-9b-GGUF + description: | + The LLM model is the "Seeker-9b" model, which is a large language model trained on a diverse range of text data. It has 9 billion parameters and is based on the "lodrick-the-lafted" repository. The model is capable of generating text and can be used for a variety of natural language processing tasks such as language translation, text summarization, and text generation. It supports the English language and is available under the Apache-2.0 license. 
+ overrides: + parameters: + model: seeker-9b.Q4_K_M.gguf + files: + - filename: seeker-9b.Q4_K_M.gguf + sha256: 7658e5bdad96dc8d232f83cff7c3fe5fa993defbfd3e728dcc7436352574a00a + uri: huggingface://mradermacher/seeker-9b-GGUF/seeker-9b.Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From 8845524d01e13dd50a6ef7506def3a61b8007e06 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 09:36:17 +0200 Subject: [PATCH 0124/1851] models(gallery): add llama3.1-chinese-chat (#3091) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index e88a16f6..28439637 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -209,6 +209,35 @@ - filename: Genius-Llama3.1.i1-Q4_K_M.gguf sha256: a272bb2a6ab7ed565738733fb8af8e345b177eba9e76ce615ea845c25ebf8cd5 uri: huggingface://mradermacher/Genius-Llama3.1-i1-GGUF/Genius-Llama3.1.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama3.1-8b-chinese-chat" + urls: + - https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat + - https://huggingface.co/QuantFactory/Llama3.1-8B-Chinese-Chat-GGUF + description: | + llama3.1-8B-Chinese-Chat is an instruction-tuned language model for Chinese & English users with various abilities such as roleplaying & tool-using built upon the Meta-Llama-3.1-8B-Instruct model. Developers: [Shenzhi Wang](https://shenzhi-wang.netlify.app)*, [Yaowei Zheng](https://github.com/hiyouga)*, Guoyin Wang (in.ai), Shiji Song, Gao Huang. (*: Equal Contribution) - License: [Llama-3.1 License](https://huggingface.co/meta-llama/Meta-Llla... + m-3.1-8B/blob/main/LICENSE) - Base Model: Meta-Llama-3.1-8B-Instruct - Model Size: 8.03B - Context length: 128K(reported by [Meta-Llama-3.1-8B-Instruct model](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), untested for our Chinese model) + overrides: + parameters: + model: Llama3.1-8B-Chinese-Chat.Q4_K_M.gguf + files: + - filename: Llama3.1-8B-Chinese-Chat.Q4_K_M.gguf + sha256: 824847b6cca82c4d60107c6a059d80ba975a68543e6effd98880435436ddba06 + uri: huggingface://QuantFactory/Llama3.1-8B-Chinese-Chat-GGUF/Llama3.1-8B-Chinese-Chat.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama3.1-70b-chinese-chat" + urls: + - https://huggingface.co/shenzhi-wang/Llama3.1-70B-Chinese-Chat + - https://huggingface.co/mradermacher/Llama3.1-70B-Chinese-Chat-GGUF + description: | + "Llama3.1-70B-Chinese-Chat" is a 70-billion parameter large language model pre-trained on a large corpus of Chinese text data. It is designed for chat and dialog applications, and can generate human-like responses to various prompts and inputs. The model is based on the Llama3.1 architecture and has been fine-tuned for Chinese language understanding and generation. It can be used for a wide range of natural language processing tasks, including language translation, text summarization, question answering, and more. 
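A quick smoke test of the bilingual tuning described above is an ordinary chat request in Chinese, assuming the 8B variant is installed under its gallery name:

```bash
# Ask the Chinese-tuned model to introduce itself, in Chinese.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1-8b-chinese-chat",
    "messages": [{"role": "user", "content": "请用三句话介绍一下你自己。"}]
  }'
```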
+ overrides: + parameters: + model: Llama3.1-70B-Chinese-Chat.Q4_K_M.gguf + files: + - filename: Llama3.1-70B-Chinese-Chat.Q4_K_M.gguf + sha256: 395cff3cce2b092f840b68eb6e31f4c8b670bc8e3854bbb230df8334369e671d + uri: huggingface://mradermacher/Llama3.1-70B-Chinese-Chat-GGUF/Llama3.1-70B-Chinese-Chat.Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From 33bc1e8b190cd5dd135c0f6e6f184e4fb233cc02 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 10:38:02 +0200 Subject: [PATCH 0125/1851] models(gallery): add gemmasutra-pro-27b-v1 (#3092) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 28439637..6e2aae21 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1240,6 +1240,21 @@ - filename: seeker-9b.Q4_K_M.gguf sha256: 7658e5bdad96dc8d232f83cff7c3fe5fa993defbfd3e728dcc7436352574a00a uri: huggingface://mradermacher/seeker-9b-GGUF/seeker-9b.Q4_K_M.gguf +- !!merge <<: *gemma + name: "gemmasutra-pro-27b-v1" + icon: https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/w0Oi8TReoQNT3ljm5Wf6c.webp + urls: + - https://huggingface.co/TheDrummer/Gemmasutra-Pro-27B-v1 + - https://huggingface.co/mradermacher/Gemmasutra-Pro-27B-v1-GGUF + description: | + An RP model with impressive flexibility. Finetuned by yours truly. + overrides: + parameters: + model: Gemmasutra-Pro-27B-v1.Q4_K_M.gguf + files: + - filename: Gemmasutra-Pro-27B-v1.Q4_K_M.gguf + sha256: 336a2fbf142849fcc20e432123433807b6c7b09988652ef583a63636a0f90218 + uri: huggingface://mradermacher/Gemmasutra-Pro-27B-v1-GGUF/Gemmasutra-Pro-27B-v1.Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From 476705708879467014440167d39d99cccdd2d4d3 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 10:43:45 +0200 Subject: [PATCH 0126/1851] models(gallery): add leetwizard (#3093) Signed-off-by: Ettore Di Giacinto --- gallery/alpaca.yaml | 17 +++++++++++++++++ gallery/index.yaml | 22 ++++++++++++++++++++++ 2 files changed, 39 insertions(+) create mode 100644 gallery/alpaca.yaml diff --git a/gallery/alpaca.yaml b/gallery/alpaca.yaml new file mode 100644 index 00000000..b647d2f6 --- /dev/null +++ b/gallery/alpaca.yaml @@ -0,0 +1,17 @@ +--- +name: "alpaca" + +config_file: | + context_size: 4096 + f16: true + mmap: true + template: + chat: | + Below is an instruction that describes a task. Write a response that appropriately completes the request. 
+ + ### Instruction: + {{.Input}} + + ### Response: + completion: | + {{.Input}} diff --git a/gallery/index.yaml b/gallery/index.yaml index 6e2aae21..66ab4216 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -4386,6 +4386,28 @@ - filename: "Codestral-22B-v0.1-Q4_K_M.gguf" uri: "huggingface://bartowski/Codestral-22B-v0.1-GGUF/Codestral-22B-v0.1-Q4_K_M.gguf" sha256: 003e48ed892850b80994fcddca2bd6b833b092a4ef2db2853c33a3144245e06c +- !!merge <<: *codellama + url: "github:mudler/LocalAI/gallery/alpaca.yaml@master" + icon: https://huggingface.co/Nan-Do/LeetCodeWizard_7B_V1.1/resolve/main/LeetCodeWizardLogo.png + name: "leetcodewizard_7b_v1.1-i1" + urls: + - https://huggingface.co/Nan-Do/LeetCodeWizard_7B_V1.1 + - https://huggingface.co/mradermacher/LeetCodeWizard_7B_V1.1-i1-GGUF + description: | + LeetCodeWizard is a coding large language model specifically trained to solve and explain Leetcode (or any) programming problems. + This model is a fine-tuned version of the WizardCoder-Python-7B with a dataset of Leetcode problems\ + Model capabilities: + + It should be able to solve most of the problems found at Leetcode and even pass the sample interviews they offer on the site. + + It can write both the code and the explanations for the solutions. + overrides: + parameters: + model: LeetCodeWizard_7B_V1.1.i1-Q4_K_M.gguf + files: + - filename: LeetCodeWizard_7B_V1.1.i1-Q4_K_M.gguf + sha256: 19720d8e1ba89d32c6f88ed6518caf0251f9e3ec011297929c801efc5ea979f4 + uri: huggingface://mradermacher/LeetCodeWizard_7B_V1.1-i1-GGUF/LeetCodeWizard_7B_V1.1.i1-Q4_K_M.gguf - &llm-compiler url: "github:mudler/LocalAI/gallery/codellama.yaml@master" name: "llm-compiler-13b-imat" From 115b523732bf52780b6c46ae16930d25a7d8a812 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 16:09:58 +0200 Subject: [PATCH 0127/1851] models(gallery): add tarnished-9b-i1 (#3096) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 66ab4216..def06b9f 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1255,6 +1255,30 @@ - filename: Gemmasutra-Pro-27B-v1.Q4_K_M.gguf sha256: 336a2fbf142849fcc20e432123433807b6c7b09988652ef583a63636a0f90218 uri: huggingface://mradermacher/Gemmasutra-Pro-27B-v1-GGUF/Gemmasutra-Pro-27B-v1.Q4_K_M.gguf +- !!merge <<: *gemma + name: "tarnished-9b-i1" + icon: https://huggingface.co/lodrick-the-lafted/tarnished-9b/resolve/main/nox.jpg + urls: + - https://huggingface.co/lodrick-the-lafted/tarnished-9b + - https://huggingface.co/mradermacher/tarnished-9b-i1-GGUF + description: | + Ah, so you've heard whispers on the winds, have you? 🧐 + + Imagine this: + Tarnished-9b, a name that echoes with the rasp of coin-hungry merchants and the clatter of forgotten machinery. This LLM speaks with the voice of those who straddle the line between worlds, who've tasted the bittersweet nectar of eldritch power and the tang of the Interdimensional Trade Council. + + It's a tongue that dances with secrets, a whisperer of lore lost and found. Its words may guide you through the twisting paths of history, revealing truths hidden beneath layers of dust and time. + + But be warned, Tarnished One! For knowledge comes at a price. The LLM's gaze can pierce the veil of reality, but it can also lure you into the labyrinthine depths of madness. + + Dare you tread this path? 
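Returning to the `alpaca.yaml` template introduced for LeetCodeWizard above: the scaffold it defines is what the model actually sees at inference time. As a rough illustration (the rendering is inferred from the template itself, not captured from a running instance), a request such as:

```bash
# The chat template wraps the message into the Instruction/Response scaffold.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "leetcodewizard_7b_v1.1-i1",
    "messages": [{"role": "user", "content": "Reverse a singly linked list in Python."}]
  }'
```

would be rendered into the `Below is an instruction that describes a task. ... ### Instruction: ... ### Response:` form, with generation continuing after the final marker.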
+ overrides: + parameters: + model: tarnished-9b.i1-Q4_K_M.gguf + files: + - filename: tarnished-9b.i1-Q4_K_M.gguf + sha256: 62ab09124b3f6698bd94ef966533ae5d427d87f6bdc09f6f46917def96420a0c + uri: huggingface://mradermacher/tarnished-9b-i1-GGUF/tarnished-9b.i1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From 4c7e8f4d54756706bb99f8a63519c820ffa6377e Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 17:06:06 +0200 Subject: [PATCH 0128/1851] models(gallery): add meta-llama-3-instruct-12.2b-brainstorm-20x-form-8 (#3097) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index def06b9f..a9b3a266 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2409,6 +2409,19 @@ - filename: L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-D_AU-Q4_K_M.gguf sha256: ae29f38d73dfb04415821405cf8b319fc42d78d0cdd0da91db147d12e68030fe uri: huggingface://DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-D_AU-Q4_K_M.gguf +- !!merge <<: *llama3 + name: "meta-llama-3-instruct-12.2b-brainstorm-20x-form-8" + urls: + - https://huggingface.co/DavidAU/Meta-Llama-3-Instruct-12.2B-BRAINSTORM-20x-FORM-8-GGUF + description: | + Meta-Llama-3-8B Instruct (now at 12.2B) with Brainstorm process that increases its performance at the core level for any creative use case. It has calibrations that allow it to exceed the logic solving abilities of the original model. The Brainstorm process expands the reasoning center of the LLM, reassembles and calibrates it, introducing subtle changes into the reasoning process. This enhances the model's detail, concept, connection to the "world", general concept connections, prose quality, and prose length without affecting instruction following. It improves coherence, description, simile, metaphors, emotional engagement, and takes fewer liberties with instructions while following them more closely. The model's performance is further enhanced by other technologies like "Ultra" (precision), "Neo Imatrix" (custom imatrix datasets), and "X-quants" (custom application of the imatrix process). It has been tested on multiple LLaMA2, LLaMA3, and Mistral models of various parameter sizes. 
+ overrides: + parameters: + model: Meta-Llama-3-8B-Instruct-exp20-8-Q4_K_M.gguf + files: + - filename: Meta-Llama-3-8B-Instruct-exp20-8-Q4_K_M.gguf + sha256: 5568ab6195ab5da703f728cc118108ddcbe97255e3ba4a543b531acdf082b999 + uri: huggingface://DavidAU/Meta-Llama-3-Instruct-12.2B-BRAINSTORM-20x-FORM-8-GGUF/Meta-Llama-3-8B-Instruct-exp20-8-Q4_K_M.gguf - &dolphin name: "dolphin-2.9-llama3-8b" url: "github:mudler/LocalAI/gallery/hermes-2-pro-mistral.yaml@master" From 05c75ca617bbe3da181e2fbc560259a5b095eea2 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 17:10:31 +0200 Subject: [PATCH 0129/1851] models(gallery): add loki-base-i1 (#3098) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 62 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index a9b3a266..7828f953 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -2422,6 +2422,68 @@ - filename: Meta-Llama-3-8B-Instruct-exp20-8-Q4_K_M.gguf sha256: 5568ab6195ab5da703f728cc118108ddcbe97255e3ba4a543b531acdf082b999 uri: huggingface://DavidAU/Meta-Llama-3-Instruct-12.2B-BRAINSTORM-20x-FORM-8-GGUF/Meta-Llama-3-8B-Instruct-exp20-8-Q4_K_M.gguf +- !!merge <<: *llama3 + name: "loki-base-i1" + urls: + - https://huggingface.co/MrRobotoAI/Loki-base + - https://huggingface.co/mradermacher/Loki-base-i1-GGUF + description: | + Merge of several models using mergekit: + - model: abacusai/Llama-3-Smaug-8B + - model: Aculi/Llama3-Sophie + - model: ajibawa-2023/Uncensored-Frank-Llama-3-8B + - model: Blackroot/Llama-3-Gamma-Twist + - model: Casual-Autopsy/L3-Super-Nova-RP-8B + - model: Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B + - model: cgato/L3-TheSpice-8b-v0.8.3 + - model: ChaoticNeutrals/Hathor_Respawn-L3-8B-v0.8 + - model: ChaoticNeutrals/Hathor_RP-v.01-L3-8B + - model: chargoddard/prometheus-2-llama-3-8b + - model: chujiezheng/Llama-3-Instruct-8B-SimPO-ExPO + - model: chujiezheng/LLaMA3-iterative-DPO-final-ExPO + - model: Fizzarolli/L3-8b-Rosier-v1 + - model: flammenai/Mahou-1.2a-llama3-8B + - model: HaitameLaf/Llama-3-8B-StoryGenerator + - model: HPAI-BSC/Llama3-Aloe-8B-Alpha + - model: iRyanBell/ARC1 + - model: iRyanBell/ARC1-II + - model: lemon07r/Llama-3-RedMagic4-8B + - model: lemon07r/Lllama-3-RedElixir-8B + - model: Locutusque/Llama-3-Hercules-5.0-8B + - model: Magpie-Align/Llama-3-8B-Magpie-Pro-MT-SFT-v0.1 + - model: maldv/badger-lambda-llama-3-8b + - model: maldv/badger-mu-llama-3-8b + - model: maldv/badger-writer-llama-3-8b + - model: mlabonne/NeuralDaredevil-8B-abliterated + - model: MrRobotoAI/Fiction-Writer-6 + - model: MrRobotoAI/Unholy-Thoth-8B-v2 + - model: nbeerbower/llama-3-spicy-abliterated-stella-8B + - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1 + - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS + - model: Nitral-AI/Hathor_Sofit-L3-8B-v1 + - model: Nitral-AI/Hathor_Stable-v0.2-L3-8B + - model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85 + - model: Nitral-AI/Poppy_Porpoise-0.72-L3-8B + - model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP + - model: nothingiisreal/L3-8B-Stheno-Horny-v3.3-32K + - model: NousResearch/Hermes-2-Theta-Llama-3-8B + - model: OwenArli/Awanllm-Llama-3-8B-Cumulus-v1.0 + - model: refuelai/Llama-3-Refueled + - model: ResplendentAI/Nymph_8B + - model: shauray/Llama3-8B-DPO-uncensored + - model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha + - model: TIGER-Lab/MAmmoTH2-8B-Plus + - model: Undi95/Llama-3-LewdPlay-8B + - model: Undi95/Meta-Llama-3-8B-hf + - model: 
VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct + - model: WhiteRabbitNeo/Llama-3-WhiteRabbitNeo-8B-v2.0 + overrides: + parameters: + model: Loki-base.i1-Q4_K_M.gguf + files: + - filename: Loki-base.i1-Q4_K_M.gguf + sha256: 60a4357fa399bfd18aa841cc529da09439791331d117a4f06f0467d002b385bb + uri: huggingface://mradermacher/Loki-base-i1-GGUF/Loki-base.i1-Q4_K_M.gguf - &dolphin name: "dolphin-2.9-llama3-8b" url: "github:mudler/LocalAI/gallery/hermes-2-pro-mistral.yaml@master" From c492a9735af338b4c2f852b1acdda8d495ff149f Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 31 Jul 2024 17:14:46 +0200 Subject: [PATCH 0130/1851] models(gallery): add tifa (#3099) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 7828f953..776fd3aa 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -662,6 +662,25 @@ - filename: magnum-32b-v1.i1-Q4_K_M.gguf sha256: a31704ce0d7e5b774f155522b9ab7ef6015a4ece4e9056bf4dfc6cac561ff0a3 uri: huggingface://mradermacher/magnum-32b-v1-i1-GGUF/magnum-32b-v1.i1-Q4_K_M.gguf +- !!merge <<: *qwen2 + name: "tifa-7b-qwen2-v0.1" + urls: + - https://huggingface.co/Tifa-RP/Tifa-7B-Qwen2-v0.1-GGUF + description: | + The Tifa role-playing language model is a high-performance language model based on a self-developed 220B model distillation, with a new base model of qwen2-7B. The model has been converted to gguf format for running in the Ollama framework, providing excellent dialogue and text generation capabilities. + + The original model was trained on a large-scale industrial dataset and then fine-tuned with 400GB of novel data and 20GB of multi-round dialogue directive data to achieve good role-playing effects. + + The Tifa model is suitable for multi-round dialogue processing, role-playing and scenario simulation, EFX industrial knowledge integration, and high-quality literary creation. + + Note: The Tifa model is in Chinese and English, with 7.6% of the data in Chinese role-playing and 4.2% in English role-playing. The model has been trained with a mix of EFX industrial field parameters and question-answer dialogues generated from 220B model outputs since 2023. The recommended quantization method is f16, as it retains more detail and accuracy in the model's performance. 
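Stepping back to the Loki-base entry above: the card lists the mergekit ingredients but not the recipe itself. For orientation only, a minimal mergekit configuration of the same flavor looks roughly like the following, with two donors picked arbitrarily from the list and an arbitrary 50/50 linear weighting (no claim that this matches the actual merge):

```bash
# Hypothetical two-model linear merge; donor choice and weights are arbitrary.
cat > loki-mini.yaml <<'EOF'
merge_method: linear
dtype: bfloat16
models:
  - model: abacusai/Llama-3-Smaug-8B
    parameters:
      weight: 0.5
  - model: NousResearch/Hermes-2-Theta-Llama-3-8B
    parameters:
      weight: 0.5
EOF
mergekit-yaml loki-mini.yaml ./loki-mini --copy-tokenizer
```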
+ overrides: + parameters: + model: tifa-7b-qwen2-v0.1.q4_k_m.gguf + files: + - filename: tifa-7b-qwen2-v0.1.q4_k_m.gguf + sha256: 1f5adbe8cb0a6400f51abdca3bf4e32284ebff73cc681a43abb35c0a6ccd3820 + uri: huggingface://Tifa-RP/Tifa-7B-Qwen2-v0.1-GGUF/tifa-7b-qwen2-v0.1.q4_k_m.gguf - &mistral03 ## START Mistral url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master" From af0545834fd565ab56af0b9348550ca9c3cb5349 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 1 Aug 2024 02:55:09 +0200 Subject: [PATCH 0131/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3102) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 607389f1..7927d7fa 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=7e72aa74fd676a093eb9970e761085ec22734c71 +CPPLLAMA_VERSION?=ed9d2854c9de4ae1f448334294e61167b04bec2a # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 26f393bd99b9bafbed6a4627e7cb8bd6d373bca5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 1 Aug 2024 09:35:43 +0200 Subject: [PATCH 0132/1851] models(gallery): add meta-llama-3.1-instruct-9.99b-brainstorm-10x-form-3 (#3103) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 776fd3aa..67f96d2e 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -238,6 +238,19 @@ - filename: Llama3.1-70B-Chinese-Chat.Q4_K_M.gguf sha256: 395cff3cce2b092f840b68eb6e31f4c8b670bc8e3854bbb230df8334369e671d uri: huggingface://mradermacher/Llama3.1-70B-Chinese-Chat-GGUF/Llama3.1-70B-Chinese-Chat.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "meta-llama-3.1-instruct-9.99b-brainstorm-10x-form-3" + urls: + - https://huggingface.co/DavidAU/Meta-Llama-3.1-Instruct-9.99B-BRAINSTORM-10x-FORM-3-GGUF + description: | + The Meta-Llama-3.1-8B Instruct model is a large language model trained on a diverse range of text data, with the goal of generating high-quality and coherent text in response to user input. This model is enhanced through a process called "Brainstorm", which involves expanding and recalibrating the model's reasoning center to improve its creative and generative capabilities. The resulting model is capable of generating detailed, vivid, and nuanced text, with a focus on prose quality, conceptually complex responses, and a deeper understanding of the user's intent. The Brainstorm process is designed to enhance the model's performance in creative writing, roleplaying, and story generation, and to improve its ability to generate coherent and engaging text in a wide range of contexts. The model is based on the Llama3 architecture and has been fine-tuned using the Instruct framework, which provides it with a strong foundation for understanding natural language instructions and generating appropriate responses. The model can be used for a variety of tasks, including creative writing, generating coherent and detailed text, exploring different perspectives and scenarios, and brainstorming ideas.
+ overrides: + parameters: + model: Meta-Llama-3.1-8B-Instruct-Instruct-exp10-3-Q4_K_M.gguf + files: + - filename: Meta-Llama-3.1-8B-Instruct-Instruct-exp10-3-Q4_K_M.gguf + sha256: f52ff984100b1ff6acfbd7ed1df770064118274a54ae5d48749400a662113615 + uri: huggingface://DavidAU/Meta-Llama-3.1-Instruct-9.99B-BRAINSTORM-10x-FORM-3-GGUF/Meta-Llama-3.1-8B-Instruct-Instruct-exp10-3-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From d590532d7f79cba07fef718d410f3dd6efee46d6 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 1 Aug 2024 09:56:23 +0200 Subject: [PATCH 0133/1851] models(gallery): add mn-12b-celeste-v1.9 (#3104) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 67f96d2e..5920835e 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -816,6 +816,26 @@ - filename: lumimaid-v0.2-12b-q4_k_m.gguf sha256: f72299858a07e52be920b86d42ddcfcd5008b961d601ef6fd6a98a3377adccbf uri: huggingface://mudler/Lumimaid-v0.2-12B-Q4_K_M-GGUF/lumimaid-v0.2-12b-q4_k_m.gguf +- !!merge <<: *mistral03 + url: "github:mudler/LocalAI/gallery/chatml.yaml@master" + name: "mn-12b-celeste-v1.9" + icon: https://cdn-uploads.huggingface.co/production/uploads/630cf5d14ca0a22768bbe10c/QcU3xEgVu18jeFtMFxIw-.webp + urls: + - https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9 + - https://huggingface.co/mradermacher/MN-12B-Celeste-V1.9-GGUF + description: | + Mistral Nemo 12B Celeste V1.9 + + This is a story writing and roleplaying model trained on Mistral NeMo 12B Instruct at 8K context using Reddit Writing Prompts, Kalo's Opus 25K Instruct and c2 logs cleaned + + This version has improved NSFW, smarter and more active narration. It's also trained with ChatML tokens so there should be no EOS bleeding whatsoever. + overrides: + parameters: + model: MN-12B-Celeste-V1.9.Q4_K_M.gguf + files: + - filename: MN-12B-Celeste-V1.9.Q4_K_M.gguf + sha256: 019daeaa63d82d55d1ea623b9c255deea6793af4044bb4994d2b4d09e8959f7b + uri: huggingface://mradermacher/MN-12B-Celeste-V1.9-GGUF/MN-12B-Celeste-V1.9.Q4_K_M.gguf - &mudler ### START mudler's LocalAI specific-models url: "github:mudler/LocalAI/gallery/mudler.yaml@master" From e4b91e9dbb982f0d11e3fd989aadf23d7777c2f4 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 1 Aug 2024 09:58:28 +0200 Subject: [PATCH 0134/1851] models(gallery): add shieldgemma (#3105) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 5920835e..a797aeda 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1331,6 +1331,20 @@ - filename: tarnished-9b.i1-Q4_K_M.gguf sha256: 62ab09124b3f6698bd94ef966533ae5d427d87f6bdc09f6f46917def96420a0c uri: huggingface://mradermacher/tarnished-9b-i1-GGUF/tarnished-9b.i1-Q4_K_M.gguf +- !!merge <<: *gemma + name: "shieldgemma-9b-i1" + urls: + - https://huggingface.co/google/shieldgemma-9b + - https://huggingface.co/mradermacher/shieldgemma-9b-i1-GGUF + description: | + ShieldGemma is a series of safety content moderation models built upon Gemma 2 that target four harm categories (sexually explicit, dangerous content, hate, and harassment). They are text-to-text, decoder-only large language models, available in English with open weights, including models of 3 sizes: 2B, 9B and 27B parameters. 
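Like the Llama Guard entry earlier in this series, ShieldGemma is queried through the ordinary chat endpoint. How the verdict is phrased depends on the model's own policy prompt, so treat this only as a minimal probe, using the gallery name defined above:

```bash
# Send a borderline message to the safety classifier and inspect its verdict.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "shieldgemma-9b-i1",
    "messages": [{"role": "user", "content": "Write an insult about my coworker."}]
  }'
```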
+ overrides: + parameters: + model: shieldgemma-9b.i1-Q4_K_M.gguf + files: + - filename: shieldgemma-9b.i1-Q4_K_M.gguf + sha256: ffa7eaadcc0c7d0544fda5b0d86bba3ffa3431b673e5b2135f421cfe65bd8732 + uri: huggingface://mradermacher/shieldgemma-9b-i1-GGUF/shieldgemma-9b.i1-Q4_K_M.gguf - &llama3 url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master" icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png From d792cf115b2e11cacffaa19707a0e3f42e5e85f8 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 1 Aug 2024 17:27:40 +0200 Subject: [PATCH 0135/1851] fix(ui): do not show duplicate entries if not installed by gallery (#3107) Signed-off-by: Ettore Di Giacinto --- core/http/endpoints/localai/welcome.go | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/core/http/endpoints/localai/welcome.go b/core/http/endpoints/localai/welcome.go index 5d217173..396c4084 100644 --- a/core/http/endpoints/localai/welcome.go +++ b/core/http/endpoints/localai/welcome.go @@ -17,7 +17,10 @@ func WelcomeEndpoint(appConfig *config.ApplicationConfig, backendConfigs := cl.GetAllBackendConfigs() galleryConfigs := map[string]*gallery.Config{} + modelsWithBackendConfig := map[string]interface{}{} + for _, m := range backendConfigs { + modelsWithBackendConfig[m.Name] = nil cfg, err := gallery.GetLocalModelConfiguration(ml.ModelPath, m.Name) if err != nil { @@ -32,7 +35,7 @@ func WelcomeEndpoint(appConfig *config.ApplicationConfig, modelsWithoutConfig := []string{} for _, m := range models { - if _, ok := galleryConfigs[m]; !ok { + if _, ok := modelsWithBackendConfig[m]; !ok { modelsWithoutConfig = append(modelsWithoutConfig, m) } } From 5afd2de87e88da7af3cb9ffeba94a618063a66f0 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 1 Aug 2024 18:44:39 +0200 Subject: [PATCH 0136/1851] Update README.md Signed-off-by: Ettore Di Giacinto --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 765fe5df..9119946d 100644 --- a/README.md +++ b/README.md @@ -150,6 +150,7 @@ Other: ## :book: 🎥 [Media, Blogs, Social](https://localai.io/basics/news/#media-blogs-social) +- [Run Visual studio code with LocalAI (SUSE)](https://www.suse.com/c/running-ai-locally/) - 🆕 [Run LocalAI on Jetson Nano Devkit](https://mudler.pm/posts/local-ai-jetson-nano-devkit/) - [Run LocalAI on AWS EKS with Pulumi](https://www.pulumi.com/blog/low-code-llm-apps-with-local-ai-flowise-and-pulumi/) - [Run LocalAI on AWS](https://staleks.hashnode.dev/installing-localai-on-aws-ec2-instance) From 01d83129a23f2173e71d9f1d6e7094a7bad71489 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 2 Aug 2024 00:09:50 +0200 Subject: [PATCH 0137/1851] docs: :arrow_up: update docs version mudler/LocalAI (#3109) :arrow_up: Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- docs/data/version.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data/version.json b/docs/data/version.json index 94160f08..d07ef798 100644 --- a/docs/data/version.json +++ b/docs/data/version.json @@ -1,3 +1,3 @@ { - "version": "v2.19.3" + "version": "v2.19.4" } From 4c8957de63fefb9ac20cfe77aae9c5f1c23adc70 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 2 Aug 2024 00:42:44 +0200 Subject: [PATCH 
0138/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3110) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 7927d7fa..be104116 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=ed9d2854c9de4ae1f448334294e61167b04bec2a +CPPLLAMA_VERSION?=b7a08fd5e0e7c898c68d1743066ea495202d9608 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From 2b55dd2c4f6eed20224764162b787a3f76cf4b49 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 2 Aug 2024 10:51:09 +0200 Subject: [PATCH 0139/1851] models(gallery): add llama-3.1-techne-rp-8b-v1 (#3112) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index a797aeda..4f5caebd 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -251,6 +251,32 @@ - filename: Meta-Llama-3.1-8B-Instruct-Instruct-exp10-3-Q4_K_M.gguf sha256: f52ff984100b1ff6acfbd7ed1df770064118274a54ae5d48749400a662113615 uri: huggingface://DavidAU/Meta-Llama-3.1-Instruct-9.99B-BRAINSTORM-10x-FORM-3-GGUF/Meta-Llama-3.1-8B-Instruct-Instruct-exp10-3-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "llama-3.1-techne-rp-8b-v1" + icon: https://cdn-uploads.huggingface.co/production/uploads/633a809fa4a8f33508dce32c/BMdwgJ6cHZWbiGL48Q-Wq.png + urls: + - https://huggingface.co/athirdpath/Llama-3.1-Techne-RP-8b-v1 + - https://huggingface.co/mradermacher/Llama-3.1-Techne-RP-8b-v1-GGUF + description: | + athirdpath/Llama-3.1-Instruct_NSFW-pretrained_e1-plus_reddit was further trained in the order below: + SFT + + Doctor-Shotgun/no-robots-sharegpt + grimulkan/LimaRP-augmented + Inv/c2-logs-cleaned-deslopped + + DPO + + jondurbin/truthy-dpo-v0.1 + Undi95/Weyaxi-humanish-dpo-project-noemoji + athirdpath/DPO_Pairs-Roleplay-Llama3-NSFW + overrides: + parameters: + model: Llama-3.1-Techne-RP-8b-v1.Q4_K_M.gguf + files: + - filename: Llama-3.1-Techne-RP-8b-v1.Q4_K_M.gguf + sha256: 6557c5d5091f2507d19ab1f8bfb9ceb4e1536a755ab70f148b18aeb33741580f + uri: huggingface://mradermacher/Llama-3.1-Techne-RP-8b-v1-GGUF/Llama-3.1-Techne-RP-8b-v1.Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From fc50a90f6a556ca56c5f61bf2d734a65692a25df Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 2 Aug 2024 12:45:22 +0200 Subject: [PATCH 0140/1851] Update README.md Signed-off-by: Ettore Di Giacinto --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 9119946d..ce3289f9 100644 --- a/README.md +++ b/README.md @@ -84,6 +84,7 @@ docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu Hot topics (looking for contributors): +- 🔥🔥 Distributed, P2P Global community pools: https://github.com/mudler/LocalAI/issues/3113 - WebUI improvements: https://github.com/mudler/LocalAI/issues/2156 - Backends v2: https://github.com/mudler/LocalAI/issues/1126 - Improving UX v2: https://github.com/mudler/LocalAI/issues/1373 From a36b721ca63436d72d18db7c39df47b506fcaba5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 2 Aug 2024 20:06:25 +0200 Subject: 
[PATCH 0141/1851] fix: be consistent in downloading files, check for scanner errors (#3108) * fix(downloader): be consistent in downloading files This PR puts some order in the downloader so that functions are re-used across several places. This fixes an issue with URIs inside the model YAML file: they would resolve to an MD5 hash rather than using the filename Signed-off-by: Ettore Di Giacinto * fix(scanner): raise an error only if unsafe files are found Fixes: https://github.com/mudler/LocalAI/issues/3114 Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- core/cli/models.go | 4 +- core/cli/util.go | 4 +- core/config/backend_config.go | 27 +++-- core/config/backend_config_loader.go | 10 +- core/dependencies_manager/manager.go | 3 +- core/gallery/gallery.go | 10 +- core/gallery/models.go | 11 +- core/http/app_test.go | 3 +- embedded/embedded.go | 4 +- pkg/downloader/huggingface.go | 49 +++++++ pkg/downloader/uri.go | 157 ++++++++++----------------- pkg/downloader/uri_test.go | 10 +- pkg/startup/model_preload.go | 52 +++------ 13 files changed, 173 insertions(+), 171 deletions(-) create mode 100644 pkg/downloader/huggingface.go diff --git a/core/cli/models.go b/core/cli/models.go index 03047018..56d13fc7 100644 --- a/core/cli/models.go +++ b/core/cli/models.go @@ -83,7 +83,9 @@ func (mi *ModelsInstall) Run(ctx *cliContext.Context) error { return err } - if !downloader.LooksLikeOCI(modelName) { + modelURI := downloader.URI(modelName) + + if !modelURI.LooksLikeOCI() { model := gallery.FindModel(models, modelName, mi.ModelsPath) if model == nil { log.Error().Str("model", modelName).Msg("model not found") diff --git a/core/cli/util.go b/core/cli/util.go index a7204092..b3e545d8 100644 --- a/core/cli/util.go +++ b/core/cli/util.go @@ -86,8 +86,8 @@ func (hfscmd *HFScanCMD) Run(ctx *cliContext.Context) error { var errs error = nil for _, uri := range hfscmd.ToScan { log.Info().Str("uri", uri).Msg("scanning specific uri") - scanResults, err := downloader.HuggingFaceScan(uri) - if err != nil && !errors.Is(err, downloader.ErrNonHuggingFaceFile) { + scanResults, err := downloader.HuggingFaceScan(downloader.URI(uri)) + if err != nil && errors.Is(err, downloader.ErrUnsafeFilesFound) { log.Error().Err(err).Strs("clamAV", scanResults.ClamAVInfectedFiles).Strs("pickles", scanResults.DangerousPickles).Msg("! WARNING ! 
A known-vulnerable model is included in this repo!") errs = errors.Join(errs, err) } diff --git a/core/config/backend_config.go b/core/config/backend_config.go index 383686cd..b83e1a98 100644 --- a/core/config/backend_config.go +++ b/core/config/backend_config.go @@ -8,7 +8,6 @@ import ( "github.com/mudler/LocalAI/core/schema" "github.com/mudler/LocalAI/pkg/downloader" "github.com/mudler/LocalAI/pkg/functions" - "github.com/mudler/LocalAI/pkg/utils" ) const ( @@ -72,9 +71,9 @@ type BackendConfig struct { } type File struct { - Filename string `yaml:"filename" json:"filename"` - SHA256 string `yaml:"sha256" json:"sha256"` - URI string `yaml:"uri" json:"uri"` + Filename string `yaml:"filename" json:"filename"` + SHA256 string `yaml:"sha256" json:"sha256"` + URI downloader.URI `yaml:"uri" json:"uri"` } type VallE struct { @@ -213,28 +212,32 @@ func (c *BackendConfig) ShouldCallSpecificFunction() bool { // MMProjFileName returns the filename of the MMProj file // If the MMProj is a URL, it will return the MD5 of the URL which is the filename func (c *BackendConfig) MMProjFileName() string { - modelURL := downloader.ConvertURL(c.MMProj) - if downloader.LooksLikeURL(modelURL) { - return utils.MD5(modelURL) + uri := downloader.URI(c.MMProj) + if uri.LooksLikeURL() { + f, _ := uri.FilenameFromUrl() + return f } return c.MMProj } func (c *BackendConfig) IsMMProjURL() bool { - return downloader.LooksLikeURL(downloader.ConvertURL(c.MMProj)) + uri := downloader.URI(c.MMProj) + return uri.LooksLikeURL() } func (c *BackendConfig) IsModelURL() bool { - return downloader.LooksLikeURL(downloader.ConvertURL(c.Model)) + uri := downloader.URI(c.Model) + return uri.LooksLikeURL() } // ModelFileName returns the filename of the model // If the model is a URL, it will return the MD5 of the URL which is the filename func (c *BackendConfig) ModelFileName() string { - modelURL := downloader.ConvertURL(c.Model) - if downloader.LooksLikeURL(modelURL) { - return utils.MD5(modelURL) + uri := downloader.URI(c.Model) + if uri.LooksLikeURL() { + f, _ := uri.FilenameFromUrl() + return f } return c.Model diff --git a/core/config/backend_config_loader.go b/core/config/backend_config_loader.go index 283dac52..45fe259e 100644 --- a/core/config/backend_config_loader.go +++ b/core/config/backend_config_loader.go @@ -244,7 +244,7 @@ func (bcl *BackendConfigLoader) Preload(modelPath string) error { // Create file path filePath := filepath.Join(modelPath, file.Filename) - if err := downloader.DownloadFile(file.URI, filePath, file.SHA256, i, len(config.DownloadFiles), status); err != nil { + if err := file.URI.DownloadFile(filePath, file.SHA256, i, len(config.DownloadFiles), status); err != nil { return err } } @@ -252,10 +252,10 @@ func (bcl *BackendConfigLoader) Preload(modelPath string) error { // If the model is an URL, expand it, and download the file if config.IsModelURL() { modelFileName := config.ModelFileName() - modelURL := downloader.ConvertURL(config.Model) + uri := downloader.URI(config.Model) // check if file exists if _, err := os.Stat(filepath.Join(modelPath, modelFileName)); errors.Is(err, os.ErrNotExist) { - err := downloader.DownloadFile(modelURL, filepath.Join(modelPath, modelFileName), "", 0, 0, status) + err := uri.DownloadFile(filepath.Join(modelPath, modelFileName), "", 0, 0, status) if err != nil { return err } @@ -269,10 +269,10 @@ func (bcl *BackendConfigLoader) Preload(modelPath string) error { if config.IsMMProjURL() { modelFileName := config.MMProjFileName() - modelURL := 
downloader.ConvertURL(config.MMProj) + uri := downloader.URI(config.MMProj) // check if file exists if _, err := os.Stat(filepath.Join(modelPath, modelFileName)); errors.Is(err, os.ErrNotExist) { - err := downloader.DownloadFile(modelURL, filepath.Join(modelPath, modelFileName), "", 0, 0, status) + err := uri.DownloadFile(filepath.Join(modelPath, modelFileName), "", 0, 0, status) if err != nil { return err } diff --git a/core/dependencies_manager/manager.go b/core/dependencies_manager/manager.go index b86139e0..8434f721 100644 --- a/core/dependencies_manager/manager.go +++ b/core/dependencies_manager/manager.go @@ -37,7 +37,8 @@ func main() { // download the assets for _, asset := range assets { - if err := downloader.DownloadFile(asset.URL, filepath.Join(destPath, asset.FileName), asset.SHA, 1, 1, utils.DisplayDownloadFunction); err != nil { + uri := downloader.URI(asset.URL) + if err := uri.DownloadFile(filepath.Join(destPath, asset.FileName), asset.SHA, 1, 1, utils.DisplayDownloadFunction); err != nil { panic(err) } } diff --git a/core/gallery/gallery.go b/core/gallery/gallery.go index 9288c44f..6ced6244 100644 --- a/core/gallery/gallery.go +++ b/core/gallery/gallery.go @@ -131,7 +131,8 @@ func AvailableGalleryModels(galleries []config.Gallery, basePath string) ([]*Gal func findGalleryURLFromReferenceURL(url string, basePath string) (string, error) { var refFile string - err := downloader.DownloadAndUnmarshal(url, basePath, func(url string, d []byte) error { + uri := downloader.URI(url) + err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { refFile = string(d) if len(refFile) == 0 { return fmt.Errorf("invalid reference file at url %s: %s", url, d) @@ -153,8 +154,9 @@ func getGalleryModels(gallery config.Gallery, basePath string) ([]*GalleryModel, return models, err } } + uri := downloader.URI(gallery.URL) - err := downloader.DownloadAndUnmarshal(gallery.URL, basePath, func(url string, d []byte) error { + err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { return yaml.Unmarshal(d, &models) }) if err != nil { @@ -252,8 +254,8 @@ func SafetyScanGalleryModels(galleries []config.Gallery, basePath string) error func SafetyScanGalleryModel(galleryModel *GalleryModel) error { for _, file := range galleryModel.AdditionalFiles { - scanResults, err := downloader.HuggingFaceScan(file.URI) - if err != nil && !errors.Is(err, downloader.ErrNonHuggingFaceFile) { + scanResults, err := downloader.HuggingFaceScan(downloader.URI(file.URI)) + if err != nil && errors.Is(err, downloader.ErrUnsafeFilesFound) { log.Error().Str("model", galleryModel.Name).Strs("clamAV", scanResults.ClamAVInfectedFiles).Strs("pickles", scanResults.DangerousPickles).Msg("Contains unsafe file(s)!") return err } diff --git a/core/gallery/models.go b/core/gallery/models.go index 32460a9c..dec6312e 100644 --- a/core/gallery/models.go +++ b/core/gallery/models.go @@ -68,7 +68,8 @@ type PromptTemplate struct { func GetGalleryConfigFromURL(url string, basePath string) (Config, error) { var config Config - err := downloader.DownloadAndUnmarshal(url, basePath, func(url string, d []byte) error { + uri := downloader.URI(url) + err := uri.DownloadAndUnmarshal(basePath, func(url string, d []byte) error { return yaml.Unmarshal(d, &config) }) if err != nil { @@ -118,14 +119,14 @@ func InstallModel(basePath, nameOverride string, config *Config, configOverrides filePath := filepath.Join(basePath, file.Filename) if enforceScan { - scanResults, err := downloader.HuggingFaceScan(file.URI) - if err != 
nil && !errors.Is(err, downloader.ErrNonHuggingFaceFile) { + scanResults, err := downloader.HuggingFaceScan(downloader.URI(file.URI)) + if err != nil && errors.Is(err, downloader.ErrUnsafeFilesFound) { log.Error().Str("model", config.Name).Strs("clamAV", scanResults.ClamAVInfectedFiles).Strs("pickles", scanResults.DangerousPickles).Msg("Contains unsafe file(s)!") return err } } - - if err := downloader.DownloadFile(file.URI, filePath, file.SHA256, i, len(config.Files), downloadStatus); err != nil { + uri := downloader.URI(file.URI) + if err := uri.DownloadFile(filePath, file.SHA256, i, len(config.Files), downloadStatus); err != nil { return err } } diff --git a/core/http/app_test.go b/core/http/app_test.go index 3fb16581..b21ad25a 100644 --- a/core/http/app_test.go +++ b/core/http/app_test.go @@ -73,8 +73,9 @@ func getModelStatus(url string) (response map[string]interface{}) { } func getModels(url string) (response []gallery.GalleryModel) { + uri := downloader.URI(url) // TODO: No tests currently seem to exercise file:// urls. Fix? - downloader.DownloadAndUnmarshal(url, "", func(url string, i []byte) error { + uri.DownloadAndUnmarshal("", func(url string, i []byte) error { // Unmarshal YAML data into a struct return json.Unmarshal(i, &response) }) diff --git a/embedded/embedded.go b/embedded/embedded.go index d5fd72df..672c32ed 100644 --- a/embedded/embedded.go +++ b/embedded/embedded.go @@ -38,8 +38,8 @@ func init() { func GetRemoteLibraryShorteners(url string, basePath string) (map[string]string, error) { remoteLibrary := map[string]string{} - - err := downloader.DownloadAndUnmarshal(url, basePath, func(_ string, i []byte) error { + uri := downloader.URI(url) + err := uri.DownloadAndUnmarshal(basePath, func(_ string, i []byte) error { return yaml.Unmarshal(i, &remoteLibrary) }) if err != nil { diff --git a/pkg/downloader/huggingface.go b/pkg/downloader/huggingface.go new file mode 100644 index 00000000..34ba9bd9 --- /dev/null +++ b/pkg/downloader/huggingface.go @@ -0,0 +1,49 @@ +package downloader + +import ( + "encoding/json" + "errors" + "fmt" + "io" + "net/http" + "strings" +) + +type HuggingFaceScanResult struct { + RepositoryId string `json:"repositoryId"` + Revision string `json:"revision"` + HasUnsafeFiles bool `json:"hasUnsafeFile"` + ClamAVInfectedFiles []string `json:"clamAVInfectedFiles"` + DangerousPickles []string `json:"dangerousPickles"` + ScansDone bool `json:"scansDone"` +} + +var ErrNonHuggingFaceFile = errors.New("not a huggingface repo") +var ErrUnsafeFilesFound = errors.New("unsafe files found") + +func HuggingFaceScan(uri URI) (*HuggingFaceScanResult, error) { + cleanParts := strings.Split(uri.ResolveURL(), "/") + if len(cleanParts) <= 4 || cleanParts[2] != "huggingface.co" { + return nil, ErrNonHuggingFaceFile + } + results, err := http.Get(fmt.Sprintf("https://huggingface.co/api/models/%s/%s/scan", cleanParts[3], cleanParts[4])) + if err != nil { + return nil, err + } + if results.StatusCode != 200 { + return nil, fmt.Errorf("unexpected status code during HuggingFaceScan: %d", results.StatusCode) + } + scanResult := &HuggingFaceScanResult{} + bodyBytes, err := io.ReadAll(results.Body) + if err != nil { + return nil, err + } + err = json.Unmarshal(bodyBytes, scanResult) + if err != nil { + return nil, err + } + if scanResult.HasUnsafeFiles { + return scanResult, ErrUnsafeFilesFound + } + return scanResult, nil +} diff --git a/pkg/downloader/uri.go b/pkg/downloader/uri.go index 1f88bbb1..7fedd646 100644 --- a/pkg/downloader/uri.go +++ b/pkg/downloader/uri.go @@ 
-2,12 +2,10 @@ package downloader import ( "crypto/sha256" - "encoding/base64" - "encoding/json" - "errors" "fmt" "io" "net/http" + "net/url" "os" "path/filepath" "strconv" @@ -28,13 +26,16 @@ const ( HTTPSPrefix = "https://" GithubURI = "github:" GithubURI2 = "github://" + LocalPrefix = "file://" ) -func DownloadAndUnmarshal(url string, basePath string, f func(url string, i []byte) error) error { - url = ConvertURL(url) +type URI string - if strings.HasPrefix(url, "file://") { - rawURL := strings.TrimPrefix(url, "file://") +func (uri URI) DownloadAndUnmarshal(basePath string, f func(url string, i []byte) error) error { + url := uri.ResolveURL() + + if strings.HasPrefix(url, LocalPrefix) { + rawURL := strings.TrimPrefix(url, LocalPrefix) // checks if the file is symbolic, and resolve if so - otherwise, this function returns the path unmodified. resolvedFile, err := filepath.EvalSymlinks(rawURL) if err != nil { @@ -78,24 +79,54 @@ func DownloadAndUnmarshal(url string, basePath string, f func(url string, i []by return f(url, body) } -func LooksLikeURL(s string) bool { - return strings.HasPrefix(s, HTTPPrefix) || - strings.HasPrefix(s, HTTPSPrefix) || - strings.HasPrefix(s, HuggingFacePrefix) || - strings.HasPrefix(s, GithubURI) || - strings.HasPrefix(s, OllamaPrefix) || - strings.HasPrefix(s, OCIPrefix) || - strings.HasPrefix(s, GithubURI2) +func (u URI) FilenameFromUrl() (string, error) { + f, err := filenameFromUrl(string(u)) + if err != nil || f == "" { + f = utils.MD5(string(u)) + if strings.HasSuffix(string(u), ".yaml") || strings.HasSuffix(string(u), ".yml") { + f = f + ".yaml" + } + err = nil + } + + return f, err } -func LooksLikeOCI(s string) bool { - return strings.HasPrefix(s, OCIPrefix) || strings.HasPrefix(s, OllamaPrefix) +func filenameFromUrl(urlstr string) (string, error) { + // strip anything after @ + if strings.Contains(urlstr, "@") { + urlstr = strings.Split(urlstr, "@")[0] + } + + u, err := url.Parse(urlstr) + if err != nil { + return "", fmt.Errorf("error due to parsing url: %w", err) + } + x, err := url.QueryUnescape(u.EscapedPath()) + if err != nil { + return "", fmt.Errorf("error due to escaping: %w", err) + } + return filepath.Base(x), nil } -func ConvertURL(s string) string { +func (u URI) LooksLikeURL() bool { + return strings.HasPrefix(string(u), HTTPPrefix) || + strings.HasPrefix(string(u), HTTPSPrefix) || + strings.HasPrefix(string(u), HuggingFacePrefix) || + strings.HasPrefix(string(u), GithubURI) || + strings.HasPrefix(string(u), OllamaPrefix) || + strings.HasPrefix(string(u), OCIPrefix) || + strings.HasPrefix(string(u), GithubURI2) +} + +func (s URI) LooksLikeOCI() bool { + return strings.HasPrefix(string(s), OCIPrefix) || strings.HasPrefix(string(s), OllamaPrefix) +} + +func (s URI) ResolveURL() string { switch { - case strings.HasPrefix(s, GithubURI2): - repository := strings.Replace(s, GithubURI2, "", 1) + case strings.HasPrefix(string(s), GithubURI2): + repository := strings.Replace(string(s), GithubURI2, "", 1) repoParts := strings.Split(repository, "@") branch := "main" @@ -110,8 +141,8 @@ func ConvertURL(s string) string { projectPath := strings.Join(repoPath[2:], "/") return fmt.Sprintf("https://raw.githubusercontent.com/%s/%s/%s/%s", org, project, branch, projectPath) - case strings.HasPrefix(s, GithubURI): - parts := strings.Split(s, ":") + case strings.HasPrefix(string(s), GithubURI): + parts := strings.Split(string(s), ":") repoParts := strings.Split(parts[1], "@") branch := "main" @@ -125,8 +156,8 @@ func ConvertURL(s string) string { 
projectPath := strings.Join(repoPath[2:], "/") return fmt.Sprintf("https://raw.githubusercontent.com/%s/%s/%s/%s", org, project, branch, projectPath) - case strings.HasPrefix(s, HuggingFacePrefix): - repository := strings.Replace(s, HuggingFacePrefix, "", 1) + case strings.HasPrefix(string(s), HuggingFacePrefix): + repository := strings.Replace(string(s), HuggingFacePrefix, "", 1) // convert repository to a full URL. // e.g. TheBloke/Mixtral-8x7B-v0.1-GGUF/mixtral-8x7b-v0.1.Q2_K.gguf@main -> https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF/resolve/main/mixtral-8x7b-v0.1.Q2_K.gguf owner := strings.Split(repository, "/")[0] @@ -144,7 +175,7 @@ func ConvertURL(s string) string { return fmt.Sprintf("https://huggingface.co/%s/%s/resolve/%s/%s", owner, repo, branch, filepath) } - return s + return string(s) } func removePartialFile(tmpFilePath string) error { @@ -161,9 +192,9 @@ func removePartialFile(tmpFilePath string) error { return nil } -func DownloadFile(url string, filePath, sha string, fileN, total int, downloadStatus func(string, string, string, float64)) error { - url = ConvertURL(url) - if LooksLikeOCI(url) { +func (uri URI) DownloadFile(filePath, sha string, fileN, total int, downloadStatus func(string, string, string, float64)) error { + url := uri.ResolveURL() + if uri.LooksLikeOCI() { progressStatus := func(desc ocispec.Descriptor) io.Writer { return &progressWriter{ fileName: filePath, @@ -298,37 +329,6 @@ func DownloadFile(url string, filePath, sha string, fileN, total int, downloadSt return nil } -// this function check if the string is an URL, if it's an URL downloads the image in memory -// encodes it in base64 and returns the base64 string -func GetBase64Image(s string) (string, error) { - if strings.HasPrefix(s, "http") { - // download the image - resp, err := http.Get(s) - if err != nil { - return "", err - } - defer resp.Body.Close() - - // read the image data into memory - data, err := io.ReadAll(resp.Body) - if err != nil { - return "", err - } - - // encode the image data in base64 - encoded := base64.StdEncoding.EncodeToString(data) - - // return the base64 string - return encoded, nil - } - - // if the string instead is prefixed with "data:image/jpeg;base64,", drop it - if strings.HasPrefix(s, "data:image/jpeg;base64,") { - return strings.ReplaceAll(s, "data:image/jpeg;base64,", ""), nil - } - return "", fmt.Errorf("not valid string") -} - func formatBytes(bytes int64) string { const unit = 1024 if bytes < unit { @@ -356,42 +356,3 @@ func calculateSHA(filePath string) (string, error) { return fmt.Sprintf("%x", hash.Sum(nil)), nil } - -type HuggingFaceScanResult struct { - RepositoryId string `json:"repositoryId"` - Revision string `json:"revision"` - HasUnsafeFiles bool `json:"hasUnsafeFile"` - ClamAVInfectedFiles []string `json:"clamAVInfectedFiles"` - DangerousPickles []string `json:"dangerousPickles"` - ScansDone bool `json:"scansDone"` -} - -var ErrNonHuggingFaceFile = errors.New("not a huggingface repo") -var ErrUnsafeFilesFound = errors.New("unsafe files found") - -func HuggingFaceScan(uri string) (*HuggingFaceScanResult, error) { - cleanParts := strings.Split(ConvertURL(uri), "/") - if len(cleanParts) <= 4 || cleanParts[2] != "huggingface.co" { - return nil, ErrNonHuggingFaceFile - } - results, err := http.Get(fmt.Sprintf("https://huggingface.co/api/models/%s/%s/scan", cleanParts[3], cleanParts[4])) - if err != nil { - return nil, err - } - if results.StatusCode != 200 { - return nil, fmt.Errorf("unexpected status code during HuggingFaceScan: %d", 
results.StatusCode) - } - scanResult := &HuggingFaceScanResult{} - bodyBytes, err := io.ReadAll(results.Body) - if err != nil { - return nil, err - } - err = json.Unmarshal(bodyBytes, scanResult) - if err != nil { - return nil, err - } - if scanResult.HasUnsafeFiles { - return scanResult, ErrUnsafeFilesFound - } - return scanResult, nil -} diff --git a/pkg/downloader/uri_test.go b/pkg/downloader/uri_test.go index 66a4cb4e..21a093a9 100644 --- a/pkg/downloader/uri_test.go +++ b/pkg/downloader/uri_test.go @@ -9,24 +9,28 @@ import ( var _ = Describe("Gallery API tests", func() { Context("URI", func() { It("parses github with a branch", func() { + uri := URI("github:go-skynet/model-gallery/gpt4all-j.yaml") Expect( - DownloadAndUnmarshal("github:go-skynet/model-gallery/gpt4all-j.yaml", "", func(url string, i []byte) error { + uri.DownloadAndUnmarshal("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), ).ToNot(HaveOccurred()) }) It("parses github without a branch", func() { + uri := URI("github:go-skynet/model-gallery/gpt4all-j.yaml@main") + Expect( - DownloadAndUnmarshal("github:go-skynet/model-gallery/gpt4all-j.yaml@main", "", func(url string, i []byte) error { + uri.DownloadAndUnmarshal("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), ).ToNot(HaveOccurred()) }) It("parses github with urls", func() { + uri := URI("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml") Expect( - DownloadAndUnmarshal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml", "", func(url string, i []byte) error { + uri.DownloadAndUnmarshal("", func(url string, i []byte) error { Expect(url).To(Equal("https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml")) return nil }), diff --git a/pkg/startup/model_preload.go b/pkg/startup/model_preload.go index 9fa890b0..a445b10e 100644 --- a/pkg/startup/model_preload.go +++ b/pkg/startup/model_preload.go @@ -3,7 +3,6 @@ package startup import ( "errors" "fmt" - "net/url" "os" "path/filepath" "strings" @@ -23,21 +22,21 @@ func InstallModels(galleries []config.Gallery, modelLibraryURL string, modelPath // create an error that groups all errors var err error - for _, url := range models { + lib, _ := embedded.GetRemoteLibraryShorteners(modelLibraryURL, modelPath) + for _, url := range models { // As a best effort, try to resolve the model from the remote library // if it's not resolved we try with the other method below if modelLibraryURL != "" { - lib, err := embedded.GetRemoteLibraryShorteners(modelLibraryURL, modelPath) - if err == nil { - if lib[url] != "" { - log.Debug().Msgf("[startup] model configuration is defined remotely: %s (%s)", url, lib[url]) - url = lib[url] - } + if lib[url] != "" { + log.Debug().Msgf("[startup] model configuration is defined remotely: %s (%s)", url, lib[url]) + url = lib[url] } } url = embedded.ModelShortURL(url) + uri := downloader.URI(url) + switch { case embedded.ExistsInModelsLibrary(url): modelYAML, e := embedded.ResolveContent(url) @@ -55,7 +54,7 @@ func InstallModels(galleries []config.Gallery, modelLibraryURL string, modelPath log.Error().Err(e).Str("filepath", modelDefinitionFilePath).Msg("error writing model definition") err = errors.Join(err, e) } - case downloader.LooksLikeOCI(url): + case uri.LooksLikeOCI(): log.Debug().Msgf("[startup] resolved OCI 
model to download: %s", url) // convert OCI image name to a file name. @@ -67,7 +66,7 @@ func InstallModels(galleries []config.Gallery, modelLibraryURL string, modelPath // check if file exists if _, e := os.Stat(filepath.Join(modelPath, ociName)); errors.Is(e, os.ErrNotExist) { modelDefinitionFilePath := filepath.Join(modelPath, ociName) - e := downloader.DownloadFile(url, modelDefinitionFilePath, "", 0, 0, func(fileName, current, total string, percent float64) { + e := uri.DownloadFile(modelDefinitionFilePath, "", 0, 0, func(fileName, current, total string, percent float64) { utils.DisplayDownloadFunction(fileName, current, total, percent) }) if e != nil { @@ -77,19 +76,15 @@ func InstallModels(galleries []config.Gallery, modelLibraryURL string, modelPath } log.Info().Msgf("[startup] installed model from OCI repository: %s", ociName) - case downloader.LooksLikeURL(url): + case uri.LooksLikeURL(): log.Debug().Msgf("[startup] downloading %s", url) // Extract filename from URL - fileName, e := filenameFromUrl(url) - if e != nil || fileName == "" { - fileName = utils.MD5(url) - if strings.HasSuffix(url, ".yaml") || strings.HasSuffix(url, ".yml") { - fileName = fileName + ".yaml" - } + fileName, e := uri.FilenameFromUrl() + if e != nil { log.Warn().Err(e).Str("url", url).Msg("error extracting filename from URL") - //err = errors.Join(err, e) - //continue + err = errors.Join(err, e) + continue } modelPath := filepath.Join(modelPath, fileName) @@ -102,7 +97,7 @@ func InstallModels(galleries []config.Gallery, modelLibraryURL string, modelPath // check if file exists if _, e := os.Stat(modelPath); errors.Is(e, os.ErrNotExist) { - e := downloader.DownloadFile(url, modelPath, "", 0, 0, func(fileName, current, total string, percent float64) { + e := uri.DownloadFile(modelPath, "", 0, 0, func(fileName, current, total string, percent float64) { utils.DisplayDownloadFunction(fileName, current, total, percent) }) if e != nil { @@ -167,20 +162,3 @@ func installModel(galleries []config.Gallery, modelName, modelPath string, downl return nil, true } - -func filenameFromUrl(urlstr string) (string, error) { - // strip anything after @ - if strings.Contains(urlstr, "@") { - urlstr = strings.Split(urlstr, "@")[0] - } - - u, err := url.Parse(urlstr) - if err != nil { - return "", fmt.Errorf("error due to parsing url: %w", err) - } - x, err := url.QueryUnescape(u.EscapedPath()) - if err != nil { - return "", fmt.Errorf("error due to escaping: %w", err) - } - return filepath.Base(x), nil -} From 797c1739ce7480b15a9b18ec77adcf5b58e835ce Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 2 Aug 2024 23:54:45 +0200 Subject: [PATCH 0142/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3115) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index be104116..77eded2b 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=b7a08fd5e0e7c898c68d1743066ea495202d9608 +CPPLLAMA_VERSION?=b72c20b85c1029d135022d39e9a20d4807c11893 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all
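The downloader refactor in PATCH 0141 turns the old package-level helpers into methods on a new `URI` string type, so a single value carries the scheme detection, URL expansion, filename derivation, and download. A minimal sketch of how the reworked API fits together, using the method signatures from the diff above; the model reference and target path here are illustrative, not taken from the repository:

```go
package main

import (
	"fmt"

	"github.com/mudler/LocalAI/pkg/downloader"
)

func main() {
	// Hypothetical model reference; huggingface:// is one of the recognized URL schemes.
	uri := downloader.URI("huggingface://TheBloke/Mixtral-8x7B-v0.1-GGUF/mixtral-8x7b-v0.1.Q2_K.gguf@main")

	fmt.Println(uri.LooksLikeURL()) // true
	fmt.Println(uri.LooksLikeOCI()) // false: only oci:// and ollama:// prefixes qualify

	// Shorthands expand to a plain HTTPS URL before any download happens:
	// https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF/resolve/main/mixtral-8x7b-v0.1.Q2_K.gguf
	fmt.Println(uri.ResolveURL())

	// FilenameFromUrl falls back to an MD5-derived name when the URL cannot be
	// parsed, so callers such as the model preloader no longer need their own fallback.
	name, err := uri.FilenameFromUrl()
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // mixtral-8x7b-v0.1.Q2_K.gguf

	// Same checksum and progress-callback arguments as the old DownloadFile,
	// minus the URL parameter, which moved into the receiver.
	err = uri.DownloadFile("/tmp/"+name, "", 1, 1, func(file, current, total string, percent float64) {
		fmt.Printf("%s: %s/%s (%.1f%%)\n", file, current, total, percent)
	})
	if err != nil {
		fmt.Println("download failed:", err)
	}
}
```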
From c2576d08798fcf2eabcf2196a70b1eeb92af3b1c Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 3 Aug 2024 10:36:25 +0200 Subject: [PATCH 0143/1851] models(gallery): add llama-spark (#3116) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 4f5caebd..57881d3b 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -277,6 +277,21 @@ - filename: Llama-3.1-Techne-RP-8b-v1.Q4_K_M.gguf sha256: 6557c5d5091f2507d19ab1f8bfb9ceb4e1536a755ab70f148b18aeb33741580f uri: huggingface://mradermacher/Llama-3.1-Techne-RP-8b-v1-GGUF/Llama-3.1-Techne-RP-8b-v1.Q4_K_M.gguf +- !!merge <<: *llama31 + icon: https://i.ibb.co/9hwFrvL/BLMs-Wkx-NQf-W-46-FZDg-ILhg.jpg + name: "llama-spark" + urls: + - https://huggingface.co/arcee-ai/Llama-Spark + - https://huggingface.co/arcee-ai/Llama-Spark-GGUF + description: | + Llama-Spark is a powerful conversational AI model developed by Arcee.ai. It's built on the foundation of Llama-3.1-8B and merges the power of our Tome Dataset with Llama-3.1-8B-Instruct, resulting in a remarkable conversationalist that punches well above its 8B parameter weight class. + overrides: + parameters: + model: llama-spark-dpo-v0.3-Q4_K_M.gguf + files: + - filename: llama-spark-dpo-v0.3-Q4_K_M.gguf + sha256: 41367168bbdc4b16eb80efcbee4dacc941781ee8748065940167fe6947b4e4c3 + uri: huggingface://arcee-ai/Llama-Spark-GGUF/llama-spark-dpo-v0.3-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From 8f0bf9810af5ad5e53b7d22d89713533feba7985 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 3 Aug 2024 23:47:06 +0200 Subject: [PATCH 0144/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3117) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 77eded2b..557ab56d 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=b72c20b85c1029d135022d39e9a20d4807c11893 +CPPLLAMA_VERSION?=76614f352e94d25659306d9e97321f204e5de0d3 # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From d1a123954b252eedeebeb11e32a239faa4dafbb0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Serta=C3=A7=20=C3=96zercan?= <852750+sozercan@users.noreply.github.com> Date: Sun, 4 Aug 2024 00:45:42 -0700 Subject: [PATCH 0145/1851] feat(guesser): add gemma2 (#3118) * feat(guesser): add gemma2 Signed-off-by: Sertac Ozercan * update Signed-off-by: Sertac Ozercan --------- Signed-off-by: Sertac Ozercan --- core/config/guesser.go | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/core/config/guesser.go b/core/config/guesser.go index 6c6ef430..b63dd051 100644 --- a/core/config/guesser.go +++ b/core/config/guesser.go @@ -26,15 +26,17 @@ const ( type settingsConfig struct { StopWords []string TemplateConfig TemplateConfig + RepeatPenalty float64 } // default settings to adopt with a given model family var defaultsSettings map[familyType]settingsConfig = map[familyType]settingsConfig{ Gemma: { + RepeatPenalty: 1.0, StopWords: []string{"<|im_end|>", "", ""}, TemplateConfig: TemplateConfig{ - Chat: "{{.Input 
}}\n<|start_of_turn|>model\n", - ChatMessage: "<|start_of_turn|>{{if eq .RoleName \"assistant\" }}model{{else}}{{ .RoleName }}{{end}}\n{{ if .Content -}}\n{{.Content -}}\n{{ end -}}<|end_of_turn|>", + Chat: "{{.Input }}\n<start_of_turn>model\n", + ChatMessage: "<start_of_turn>{{if eq .RoleName \"assistant\" }}model{{else}}{{ .RoleName }}{{end}}\n{{ if .Content -}}\n{{.Content -}}\n{{ end -}}<end_of_turn>", Completion: "{{.Input}}", }, }, @@ -192,6 +194,9 @@ func guessDefaultsFromFile(cfg *BackendConfig, modelPath string) { if len(cfg.StopWords) == 0 { cfg.StopWords = settings.StopWords } + if cfg.RepeatPenalty == 0.0 { + cfg.RepeatPenalty = settings.RepeatPenalty + } } else { log.Debug().Any("family", family).Msgf("guessDefaultsFromFile: no template found for family") } @@ -219,7 +224,7 @@ func identifyFamily(f *gguf.GGUFFile) familyType { commandR := arch == "command-r" && eosTokenID == 255001 qwen2 := arch == "qwen2" phi3 := arch == "phi-3" - gemma := strings.HasPrefix(f.Model().Name, "gemma") + gemma := strings.HasPrefix(arch, "gemma") || strings.Contains(strings.ToLower(f.Model().Name), "gemma") deepseek2 := arch == "deepseek2" switch { From 12d6d2d1779ccf354f803bbebef8be4162cc411b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 4 Aug 2024 14:50:32 +0200 Subject: [PATCH 0146/1851] models(gallery): add glitz (#3119) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 57881d3b..a0ee1448 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -292,6 +292,21 @@ - filename: llama-spark-dpo-v0.3-Q4_K_M.gguf sha256: 41367168bbdc4b16eb80efcbee4dacc941781ee8748065940167fe6947b4e4c3 uri: huggingface://arcee-ai/Llama-Spark-GGUF/llama-spark-dpo-v0.3-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "l3.1-70b-glitz-v0.2-i1" + icon: https://cdn-uploads.huggingface.co/production/uploads/634262af8d8089ebaefd410e/q2dOUnzc1GRbZp3YfzGXB.png + urls: + - https://huggingface.co/Fizzarolli/L3.1-70b-glitz-v0.2 + - https://huggingface.co/mradermacher/L3.1-70b-glitz-v0.2-i1-GGUF + description: | + this is an experimental l3.1 70b finetuning run... that crashed midway through. 
however, the results are still interesting, so i wanted to publish them :3 + overrides: + parameters: + model: L3.1-70b-glitz-v0.2.i1-Q4_K_M.gguf + files: + - filename: L3.1-70b-glitz-v0.2.i1-Q4_K_M.gguf + sha256: 585efc83e7f6893043be2487fc09c914a381fb463ce97942ef2f25ae85103bcd + uri: huggingface://mradermacher/L3.1-70b-glitz-v0.2-i1-GGUF/L3.1-70b-glitz-v0.2.i1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From 1788fc8d4acd240834e6d396dd3efadad2d191a8 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 4 Aug 2024 15:17:24 +0200 Subject: [PATCH 0147/1851] models(gallery): add gemmasutra-mini (#3120) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index a0ee1448..7e308d93 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -1363,6 +1363,20 @@ - filename: Gemmasutra-Pro-27B-v1.Q4_K_M.gguf sha256: 336a2fbf142849fcc20e432123433807b6c7b09988652ef583a63636a0f90218 uri: huggingface://mradermacher/Gemmasutra-Pro-27B-v1-GGUF/Gemmasutra-Pro-27B-v1.Q4_K_M.gguf +- !!merge <<: *gemma + name: "gemmasutra-mini-2b-v1" + icon: https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/w0Oi8TReoQNT3ljm5Wf6c.webp + urls: + - https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1-GGUF + description: | + It is a small, 2 billion parameter language model that has been trained for role-playing purposes. The model is designed to work well in various settings, such as in the browser, on a laptop, or even on a Raspberry Pi. It has been fine-tuned for RP use and claims to provide a satisfying experience, even in low-resource environments. The model is uncensored and unaligned, and it can be used with the Gemma Instruct template or with chat completion. For the best experience, it is recommended to modify the template to support the `system` role. The model also features examples of its output, highlighting its versatility and creativity. + overrides: + parameters: + model: Gemmasutra-Mini-2B-v1i-Q4_K_M.gguf + files: + - filename: Gemmasutra-Mini-2B-v1i-Q4_K_M.gguf + sha256: 29ba3db911fbadef4452ba757ddd9ce58fb892b7a872f19eefd0743c961797fb + uri: huggingface://TheDrummer/Gemmasutra-Mini-2B-v1-GGUF/Gemmasutra-Mini-2B-v1i-Q4_K_M.gguf - !!merge <<: *gemma name: "tarnished-9b-i1" icon: https://huggingface.co/lodrick-the-lafted/tarnished-9b/resolve/main/nox.jpg From e2e2a8e447e8996d7d3cb4916520a3bc6fa0c2cb Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 4 Aug 2024 15:20:02 +0200 Subject: [PATCH 0148/1851] models(gallery): add kumiho-v1-rp-uwu-8b (#3121) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 7e308d93..bcfa4f35 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -438,6 +438,20 @@ - filename: L3.1-8B-Celeste-V1.5-Q4_K_M.gguf sha256: a408dfbbd91ed5561f70d3129af040dfd06704d6c7fa21146aa9f09714aafbc6 uri: huggingface://bartowski/L3.1-8B-Celeste-V1.5-GGUF/L3.1-8B-Celeste-V1.5-Q4_K_M.gguf +- !!merge <<: *llama31 + icon: https://cdn-uploads.huggingface.co/production/uploads/659c4ecb413a1376bee2f661/szz8sIxofYzSe5XPet2pO.png + name: "kumiho-v1-rp-uwu-8b" + urls: + - https://huggingface.co/juvi21/Kumiho-v1-rp-UwU-8B-GGUF + description: | + Meet Kumiho-V1 uwu. Kumiho-V1-rp-UwU aims to be a generalist model with specialization in roleplay and writing capabilities. 
It is fine-tuned and merged with various models, with a heavy base of Meta's LLaMA 3.1-8B, and synthetic data generated by Claude 3.5 Sonnet and Claude 3 Opus. + overrides: + parameters: + model: Kumiho-v1-rp-UwU-8B-gguf-q4_k_m.gguf + files: + - filename: Kumiho-v1-rp-UwU-8B-gguf-q4_k_m.gguf + sha256: a1deb46675418277cf785a406cd1508fec556ff6e4d45d2231eb2a82986d52d0 + uri: huggingface://juvi21/Kumiho-v1-rp-UwU-8B-GGUF/Kumiho-v1-rp-UwU-8B-gguf-q4_k_m.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 6e1ec08f46422ef5a2bff868c27599040fb5106c Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 4 Aug 2024 23:48:09 +0200 Subject: [PATCH 0149/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3123) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 557ab56d..f8155e06 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=76614f352e94d25659306d9e97321f204e5de0d3 +CPPLLAMA_VERSION?=0d6fb52be0c1b7e77eb855f3adc4952771c8ce4c # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all From f15a93b19b885ad139e12685272dd9ab95de5140 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 5 Aug 2024 10:11:00 +0200 Subject: [PATCH 0150/1851] models(gallery): add humanish-roleplay-llama-3.1-8b-i1 (#3126) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 17 +++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index bcfa4f35..c80455a8 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -307,6 +307,23 @@ - filename: L3.1-70b-glitz-v0.2.i1-Q4_K_M.gguf sha256: 585efc83e7f6893043be2487fc09c914a381fb463ce97942ef2f25ae85103bcd uri: huggingface://mradermacher/L3.1-70b-glitz-v0.2-i1-GGUF/L3.1-70b-glitz-v0.2.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "humanish-roleplay-llama-3.1-8b-i1" + icon: https://cdn-uploads.huggingface.co/production/uploads/5fad8602b8423e1d80b8a965/VPwtjS3BtjEEEq7ck4kAQ.webp + urls: + - https://huggingface.co/mradermacher/Humanish-Roleplay-Llama-3.1-8B-i1-GGUF + description: | + A DPO-tuned Llama-3.1 to behave more "humanish", i.e., avoiding all the AI assistant slop. It also works for role-play (RP). To achieve this, the model was fine-tuned over a series of datasets: + General conversations from Claude Opus, from Undi95/Meta-Llama-3.1-8B-Claude + Undi95/Weyaxi-humanish-dpo-project-noemoji, to make the model react as a human, rejecting assistant-like or too neutral responses. + ResplendentAI/NSFW_RP_Format_DPO, to steer the model towards using the *action* format in RP settings. 
Works best if in the first message you also use this format naturally (see example) + overrides: + parameters: + model: Humanish-Roleplay-Llama-3.1-8B.i1-Q4_K_M.gguf + files: + - filename: Humanish-Roleplay-Llama-3.1-8B.i1-Q4_K_M.gguf + sha256: 18cf753684e5226b51f3defc708852ca4924f50dc8bc31c9a7d0a036a477b7a7 + uri: huggingface://mradermacher/Humanish-Roleplay-Llama-3.1-8B-i1-GGUF/Humanish-Roleplay-Llama-3.1-8B.i1-Q4_K_M.gguf ## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" From ed322bf59f0dcc4d3c7329c829e52e6cbcd02291 Mon Sep 17 00:00:00 2001 From: cryptk <421501+cryptk@users.noreply.github.com> Date: Mon, 5 Aug 2024 11:38:33 -0500 Subject: [PATCH 0151/1851] fix: ensure correct version of torch is always installed based on BUILD_TYPE(#2890) * fix: ensure correct version of torch is always installed based on BUILD_TYPE Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * Move causal-conv1d installation to build_types Signed-off-by: mudler * Move mamba-ssd install to build-type requirements.txt Signed-off-by: mudler --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> Signed-off-by: mudler Co-authored-by: Ettore Di Giacinto Co-authored-by: mudler --- backend/python/autogptq/requirements-cublas11.txt | 2 ++ backend/python/autogptq/requirements-cublas12.txt | 1 + backend/python/autogptq/requirements.txt | 1 - backend/python/bark/requirements-cublas11.txt | 3 +++ backend/python/bark/requirements-cublas12.txt | 2 ++ backend/python/common/libbackend.sh | 7 +++++++ backend/python/coqui/requirements-cublas11.txt | 3 +++ backend/python/coqui/requirements-cublas12.txt | 2 ++ backend/python/diffusers/requirements-cublas11.txt | 2 ++ backend/python/diffusers/requirements-cublas12.txt | 1 + backend/python/diffusers/requirements.txt | 1 - backend/python/exllama/requirements-cublas11.txt | 2 ++ backend/python/exllama/requirements-cublas12.txt | 1 + backend/python/exllama/requirements.txt | 1 - backend/python/exllama2/requirements-cublas11.txt | 2 ++ backend/python/exllama2/requirements-cublas12.txt | 1 + backend/python/exllama2/requirements.txt | 1 - backend/python/mamba/requirements-after.txt | 2 ++ backend/python/mamba/requirements-cpu.txt | 1 + backend/python/mamba/requirements-cublas11.txt | 2 ++ backend/python/mamba/requirements-cublas12.txt | 1 + backend/python/mamba/requirements-install.txt | 3 +-- backend/python/mamba/requirements.txt | 2 -- backend/python/openvoice/requirements-cublas11.txt | 2 ++ backend/python/openvoice/requirements-cublas12.txt | 1 + backend/python/parler-tts/requirements-cublas11.txt | 3 +++ backend/python/parler-tts/requirements-cublas12.txt | 2 ++ backend/python/parler-tts/requirements.txt | 1 - backend/python/petals/requirements-cublas11.txt | 2 ++ backend/python/petals/requirements-cublas12.txt | 1 + backend/python/rerankers/requirements-cublas11.txt | 2 ++ backend/python/rerankers/requirements-cublas12.txt | 1 + .../python/sentencetransformers/requirements-cublas11.txt | 2 ++ .../python/sentencetransformers/requirements-cublas12.txt | 1 + .../python/transformers-musicgen/requirements-cublas11.txt | 2 ++ .../python/transformers-musicgen/requirements-cublas12.txt | 1 + backend/python/transformers-musicgen/requirements.txt | 1 - backend/python/transformers/requirements-cublas11.txt | 2 ++ backend/python/transformers/requirements-cublas12.txt | 1 + backend/python/transformers/requirements.txt | 1 - backend/python/vall-e-x/requirements-cublas11.txt | 3 +++ 
backend/python/vall-e-x/requirements-cublas12.txt | 2 ++ backend/python/vllm/requirements-cublas.txt | 1 - backend/python/vllm/requirements-cublas11.txt | 3 +++ backend/python/vllm/requirements-cublas12.txt | 2 ++ 45 files changed, 69 insertions(+), 12 deletions(-) create mode 100644 backend/python/autogptq/requirements-cublas11.txt create mode 100644 backend/python/autogptq/requirements-cublas12.txt create mode 100644 backend/python/bark/requirements-cublas11.txt create mode 100644 backend/python/bark/requirements-cublas12.txt create mode 100644 backend/python/coqui/requirements-cublas11.txt create mode 100644 backend/python/coqui/requirements-cublas12.txt create mode 100644 backend/python/diffusers/requirements-cublas11.txt create mode 100644 backend/python/diffusers/requirements-cublas12.txt create mode 100644 backend/python/exllama/requirements-cublas11.txt create mode 100644 backend/python/exllama/requirements-cublas12.txt create mode 100644 backend/python/exllama2/requirements-cublas11.txt create mode 100644 backend/python/exllama2/requirements-cublas12.txt create mode 100644 backend/python/mamba/requirements-after.txt create mode 100644 backend/python/mamba/requirements-cpu.txt create mode 100644 backend/python/mamba/requirements-cublas11.txt create mode 100644 backend/python/mamba/requirements-cublas12.txt create mode 100644 backend/python/openvoice/requirements-cublas11.txt create mode 100644 backend/python/openvoice/requirements-cublas12.txt create mode 100644 backend/python/parler-tts/requirements-cublas11.txt create mode 100644 backend/python/parler-tts/requirements-cublas12.txt create mode 100644 backend/python/petals/requirements-cublas11.txt create mode 100644 backend/python/petals/requirements-cublas12.txt create mode 100644 backend/python/rerankers/requirements-cublas11.txt create mode 100644 backend/python/rerankers/requirements-cublas12.txt create mode 100644 backend/python/sentencetransformers/requirements-cublas11.txt create mode 100644 backend/python/sentencetransformers/requirements-cublas12.txt create mode 100644 backend/python/transformers-musicgen/requirements-cublas11.txt create mode 100644 backend/python/transformers-musicgen/requirements-cublas12.txt create mode 100644 backend/python/transformers/requirements-cublas11.txt create mode 100644 backend/python/transformers/requirements-cublas12.txt create mode 100644 backend/python/vall-e-x/requirements-cublas11.txt create mode 100644 backend/python/vall-e-x/requirements-cublas12.txt delete mode 100644 backend/python/vllm/requirements-cublas.txt create mode 100644 backend/python/vllm/requirements-cublas11.txt create mode 100644 backend/python/vllm/requirements-cublas12.txt diff --git a/backend/python/autogptq/requirements-cublas11.txt b/backend/python/autogptq/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/autogptq/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/autogptq/requirements-cublas12.txt b/backend/python/autogptq/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/autogptq/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/autogptq/requirements.txt b/backend/python/autogptq/requirements.txt index 7a1bf85f..078c015f 100644 --- a/backend/python/autogptq/requirements.txt +++ b/backend/python/autogptq/requirements.txt @@ -2,6 +2,5 @@ accelerate auto-gptq==0.7.1 grpcio==1.65.1 protobuf 
-torch certifi transformers \ No newline at end of file diff --git a/backend/python/bark/requirements-cublas11.txt b/backend/python/bark/requirements-cublas11.txt new file mode 100644 index 00000000..0de92979 --- /dev/null +++ b/backend/python/bark/requirements-cublas11.txt @@ -0,0 +1,3 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch +torchaudio \ No newline at end of file diff --git a/backend/python/bark/requirements-cublas12.txt b/backend/python/bark/requirements-cublas12.txt new file mode 100644 index 00000000..6c3c7e7a --- /dev/null +++ b/backend/python/bark/requirements-cublas12.txt @@ -0,0 +1,2 @@ +torch +torchaudio \ No newline at end of file diff --git a/backend/python/common/libbackend.sh b/backend/python/common/libbackend.sh index e8dfea03..7287fb95 100644 --- a/backend/python/common/libbackend.sh +++ b/backend/python/common/libbackend.sh @@ -122,6 +122,13 @@ function installRequirements() { requirementFiles+=("${MY_DIR}/requirements-${BUILD_PROFILE}.txt") fi + # if BUILD_TYPE is empty, we are a CPU build, so we should try to install the CPU requirements + if [ "x${BUILD_TYPE}" == "x" ]; then + requirementFiles+=("${MY_DIR}/requirements-cpu.txt") + fi + + requirementFiles+=("${MY_DIR}/requirements-after.txt") + for reqFile in ${requirementFiles[@]}; do if [ -f ${reqFile} ]; then echo "starting requirements install for ${reqFile}" diff --git a/backend/python/coqui/requirements-cublas11.txt b/backend/python/coqui/requirements-cublas11.txt new file mode 100644 index 00000000..0de92979 --- /dev/null +++ b/backend/python/coqui/requirements-cublas11.txt @@ -0,0 +1,3 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch +torchaudio \ No newline at end of file diff --git a/backend/python/coqui/requirements-cublas12.txt b/backend/python/coqui/requirements-cublas12.txt new file mode 100644 index 00000000..6c3c7e7a --- /dev/null +++ b/backend/python/coqui/requirements-cublas12.txt @@ -0,0 +1,2 @@ +torch +torchaudio \ No newline at end of file diff --git a/backend/python/diffusers/requirements-cublas11.txt b/backend/python/diffusers/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/diffusers/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/diffusers/requirements-cublas12.txt b/backend/python/diffusers/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/diffusers/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/diffusers/requirements.txt b/backend/python/diffusers/requirements.txt index 6f04d677..ea707bb7 100644 --- a/backend/python/diffusers/requirements.txt +++ b/backend/python/diffusers/requirements.txt @@ -8,6 +8,5 @@ opencv-python pillow protobuf sentencepiece -torch transformers certifi diff --git a/backend/python/exllama/requirements-cublas11.txt b/backend/python/exllama/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/exllama/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/exllama/requirements-cublas12.txt b/backend/python/exllama/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/exllama/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/exllama/requirements.txt b/backend/python/exllama/requirements.txt index 
2aab2631..b06efcea 100644 --- a/backend/python/exllama/requirements.txt +++ b/backend/python/exllama/requirements.txt @@ -1,6 +1,5 @@ grpcio==1.65.0 protobuf -torch transformers certifi setuptools \ No newline at end of file diff --git a/backend/python/exllama2/requirements-cublas11.txt b/backend/python/exllama2/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/exllama2/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/exllama2/requirements-cublas12.txt b/backend/python/exllama2/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/exllama2/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/exllama2/requirements.txt b/backend/python/exllama2/requirements.txt index 6aae273c..f2dfa976 100644 --- a/backend/python/exllama2/requirements.txt +++ b/backend/python/exllama2/requirements.txt @@ -2,6 +2,5 @@ accelerate grpcio==1.65.1 protobuf certifi -torch wheel setuptools \ No newline at end of file diff --git a/backend/python/mamba/requirements-after.txt b/backend/python/mamba/requirements-after.txt new file mode 100644 index 00000000..ea6890eb --- /dev/null +++ b/backend/python/mamba/requirements-after.txt @@ -0,0 +1,2 @@ +causal-conv1d==1.4.0 +mamba-ssm==2.2.2 \ No newline at end of file diff --git a/backend/python/mamba/requirements-cpu.txt b/backend/python/mamba/requirements-cpu.txt new file mode 100644 index 00000000..08ed5eeb --- /dev/null +++ b/backend/python/mamba/requirements-cpu.txt @@ -0,0 +1 @@ +torch \ No newline at end of file diff --git a/backend/python/mamba/requirements-cublas11.txt b/backend/python/mamba/requirements-cublas11.txt new file mode 100644 index 00000000..2f89bd95 --- /dev/null +++ b/backend/python/mamba/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch \ No newline at end of file diff --git a/backend/python/mamba/requirements-cublas12.txt b/backend/python/mamba/requirements-cublas12.txt new file mode 100644 index 00000000..08ed5eeb --- /dev/null +++ b/backend/python/mamba/requirements-cublas12.txt @@ -0,0 +1 @@ +torch \ No newline at end of file diff --git a/backend/python/mamba/requirements-install.txt b/backend/python/mamba/requirements-install.txt index 2fc9a07c..69d263f0 100644 --- a/backend/python/mamba/requirements-install.txt +++ b/backend/python/mamba/requirements-install.txt @@ -3,5 +3,4 @@ # https://github.com/Dao-AILab/causal-conv1d/issues/24 packaging setuptools -wheel -torch==2.3.1 \ No newline at end of file +wheel \ No newline at end of file diff --git a/backend/python/mamba/requirements.txt b/backend/python/mamba/requirements.txt index 2aac2cda..068bf336 100644 --- a/backend/python/mamba/requirements.txt +++ b/backend/python/mamba/requirements.txt @@ -1,5 +1,3 @@ -causal-conv1d==1.4.0 -mamba-ssm==2.2.2 grpcio==1.65.1 protobuf certifi diff --git a/backend/python/openvoice/requirements-cublas11.txt b/backend/python/openvoice/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/openvoice/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/openvoice/requirements-cublas12.txt b/backend/python/openvoice/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/openvoice/requirements-cublas12.txt @@ -0,0 
+1 @@ +torch diff --git a/backend/python/parler-tts/requirements-cublas11.txt b/backend/python/parler-tts/requirements-cublas11.txt new file mode 100644 index 00000000..0de92979 --- /dev/null +++ b/backend/python/parler-tts/requirements-cublas11.txt @@ -0,0 +1,3 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch +torchaudio \ No newline at end of file diff --git a/backend/python/parler-tts/requirements-cublas12.txt b/backend/python/parler-tts/requirements-cublas12.txt new file mode 100644 index 00000000..6c3c7e7a --- /dev/null +++ b/backend/python/parler-tts/requirements-cublas12.txt @@ -0,0 +1,2 @@ +torch +torchaudio \ No newline at end of file diff --git a/backend/python/parler-tts/requirements.txt b/backend/python/parler-tts/requirements.txt index 147cad9a..1dfa6675 100644 --- a/backend/python/parler-tts/requirements.txt +++ b/backend/python/parler-tts/requirements.txt @@ -1,7 +1,6 @@ accelerate grpcio==1.65.1 protobuf -torch git+https://github.com/huggingface/parler-tts.git@10016fb0300c0dc31a0fb70e26f3affee7b62f16 certifi transformers \ No newline at end of file diff --git a/backend/python/petals/requirements-cublas11.txt b/backend/python/petals/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/petals/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/petals/requirements-cublas12.txt b/backend/python/petals/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/petals/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/rerankers/requirements-cublas11.txt b/backend/python/rerankers/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/rerankers/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/rerankers/requirements-cublas12.txt b/backend/python/rerankers/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/rerankers/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/sentencetransformers/requirements-cublas11.txt b/backend/python/sentencetransformers/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/sentencetransformers/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/sentencetransformers/requirements-cublas12.txt b/backend/python/sentencetransformers/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/sentencetransformers/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/transformers-musicgen/requirements-cublas11.txt b/backend/python/transformers-musicgen/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/transformers-musicgen/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/transformers-musicgen/requirements-cublas12.txt b/backend/python/transformers-musicgen/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/transformers-musicgen/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git 
a/backend/python/transformers-musicgen/requirements.txt b/backend/python/transformers-musicgen/requirements.txt index 8ffa3c31..ac758034 100644 --- a/backend/python/transformers-musicgen/requirements.txt +++ b/backend/python/transformers-musicgen/requirements.txt @@ -2,6 +2,5 @@ accelerate transformers grpcio==1.65.1 protobuf -torch scipy==1.14.0 certifi \ No newline at end of file diff --git a/backend/python/transformers/requirements-cublas11.txt b/backend/python/transformers/requirements-cublas11.txt new file mode 100644 index 00000000..6461b696 --- /dev/null +++ b/backend/python/transformers/requirements-cublas11.txt @@ -0,0 +1,2 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch diff --git a/backend/python/transformers/requirements-cublas12.txt b/backend/python/transformers/requirements-cublas12.txt new file mode 100644 index 00000000..12c6d5d5 --- /dev/null +++ b/backend/python/transformers/requirements-cublas12.txt @@ -0,0 +1 @@ +torch diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 55925b32..c32fe1f8 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -2,7 +2,6 @@ accelerate transformers grpcio==1.65.1 protobuf -torch certifi intel-extension-for-transformers bitsandbytes diff --git a/backend/python/vall-e-x/requirements-cublas11.txt b/backend/python/vall-e-x/requirements-cublas11.txt new file mode 100644 index 00000000..0de92979 --- /dev/null +++ b/backend/python/vall-e-x/requirements-cublas11.txt @@ -0,0 +1,3 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch +torchaudio \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements-cublas12.txt b/backend/python/vall-e-x/requirements-cublas12.txt new file mode 100644 index 00000000..6c3c7e7a --- /dev/null +++ b/backend/python/vall-e-x/requirements-cublas12.txt @@ -0,0 +1,2 @@ +torch +torchaudio \ No newline at end of file diff --git a/backend/python/vllm/requirements-cublas.txt b/backend/python/vllm/requirements-cublas.txt deleted file mode 100644 index 7bfe8efe..00000000 --- a/backend/python/vllm/requirements-cublas.txt +++ /dev/null @@ -1 +0,0 @@ -flash-attn \ No newline at end of file diff --git a/backend/python/vllm/requirements-cublas11.txt b/backend/python/vllm/requirements-cublas11.txt new file mode 100644 index 00000000..bed8cea8 --- /dev/null +++ b/backend/python/vllm/requirements-cublas11.txt @@ -0,0 +1,3 @@ +--extra-index-url https://download.pytorch.org/whl/cu118 +torch +flash-attn \ No newline at end of file diff --git a/backend/python/vllm/requirements-cublas12.txt b/backend/python/vllm/requirements-cublas12.txt new file mode 100644 index 00000000..b6fef4d7 --- /dev/null +++ b/backend/python/vllm/requirements-cublas12.txt @@ -0,0 +1,2 @@ +torch +flash-attn \ No newline at end of file From 42fe864cb463d799c388b15f71b82644a59ea1a6 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 5 Aug 2024 21:32:10 +0000 Subject: [PATCH 0152/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/autogptq (#3130) chore(deps): Bump grpcio in /backend/python/autogptq Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. 
- [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/autogptq/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/autogptq/requirements.txt b/backend/python/autogptq/requirements.txt index 078c015f..53946f23 100644 --- a/backend/python/autogptq/requirements.txt +++ b/backend/python/autogptq/requirements.txt @@ -1,6 +1,6 @@ accelerate auto-gptq==0.7.1 -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi transformers \ No newline at end of file From 094a6fccd8695a05f056fd8585f86f35c27726c9 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 5 Aug 2024 21:35:07 +0000 Subject: [PATCH 0153/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/common/template (#3131) chore(deps): Bump grpcio in /backend/python/common/template Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/common/template/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/common/template/requirements.txt b/backend/python/common/template/requirements.txt index 8d1e3151..ad97e2ae 100644 --- a/backend/python/common/template/requirements.txt +++ b/backend/python/common/template/requirements.txt @@ -1,2 +1,2 @@ -grpcio==1.65.1 +grpcio==1.65.4 protobuf \ No newline at end of file From 55318cca0f881df00db3573c209c4260072875c1 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 5 Aug 2024 21:37:47 +0000 Subject: [PATCH 0154/1851] chore(deps): Bump langchain from 0.2.10 to 0.2.12 in /examples/functions (#3132) Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.10 to 0.2.12. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.10...langchain==0.2.12) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index f8afacdc..27bb9881 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ -langchain==0.2.10 +langchain==0.2.12 openai==1.37.0 From 62176de6d2add590fde4c39cf6af27f08a1d35e6 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 5 Aug 2024 22:13:59 +0000 Subject: [PATCH 0155/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/openvoice (#3137) chore(deps): Bump grpcio in /backend/python/openvoice Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/openvoice/requirements-intel.txt | 2 +- backend/python/openvoice/requirements.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/backend/python/openvoice/requirements-intel.txt b/backend/python/openvoice/requirements-intel.txt index bad088a9..85618c86 100644 --- a/backend/python/openvoice/requirements-intel.txt +++ b/backend/python/openvoice/requirements-intel.txt @@ -2,7 +2,7 @@ intel-extension-for-pytorch torch optimum[openvino] -grpcio==1.65.1 +grpcio==1.65.4 protobuf librosa==0.9.1 faster-whisper==1.0.3 diff --git a/backend/python/openvoice/requirements.txt b/backend/python/openvoice/requirements.txt index 86d16ec2..cc40adbc 100644 --- a/backend/python/openvoice/requirements.txt +++ b/backend/python/openvoice/requirements.txt @@ -1,4 +1,4 @@ -grpcio==1.65.1 +grpcio==1.65.4 protobuf librosa faster-whisper From 1c0bbb92b27790ff14a1e1e239eddbb380984235 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 5 Aug 2024 22:27:49 +0000 Subject: [PATCH 0156/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/coqui (#3138) Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/coqui/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/coqui/requirements.txt b/backend/python/coqui/requirements.txt index e1cddaa3..a1bdac44 100644 --- a/backend/python/coqui/requirements.txt +++ b/backend/python/coqui/requirements.txt @@ -1,6 +1,6 @@ accelerate TTS==0.22.0 -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi transformers \ No newline at end of file From 4c31e4567a069de3522d8685ea984e31f85cd108 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 5 Aug 2024 22:30:08 +0000 Subject: [PATCH 0157/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/transformers-musicgen (#3140) chore(deps): Bump grpcio in /backend/python/transformers-musicgen Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers-musicgen/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers-musicgen/requirements.txt b/backend/python/transformers-musicgen/requirements.txt index ac758034..bec86241 100644 --- a/backend/python/transformers-musicgen/requirements.txt +++ b/backend/python/transformers-musicgen/requirements.txt @@ -1,6 +1,6 @@ accelerate transformers -grpcio==1.65.1 +grpcio==1.65.4 protobuf scipy==1.14.0 certifi \ No newline at end of file From dc38b1f71ef93dff1d8ccdb28629859bf32bf30a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 5 Aug 2024 23:27:07 +0000 Subject: [PATCH 0158/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/diffusers (#3141) chore(deps): Bump grpcio in /backend/python/diffusers Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/diffusers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/diffusers/requirements.txt b/backend/python/diffusers/requirements.txt index ea707bb7..9919b20a 100644 --- a/backend/python/diffusers/requirements.txt +++ b/backend/python/diffusers/requirements.txt @@ -3,7 +3,7 @@ accelerate compel peft diffusers -grpcio==1.65.1 +grpcio==1.65.4 opencv-python pillow protobuf From 22ffe1a0833113c57d979c55a23c94f3d3c02e87 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 00:15:54 +0000 Subject: [PATCH 0159/1851] chore(deps): Bump llama-index from 0.10.56 to 0.10.59 in /examples/chainlit (#3143) chore(deps): Bump llama-index in /examples/chainlit Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.10.56 to 0.10.59. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.10.56...v0.10.59) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index 13415f11..52e2b8a2 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,4 +1,4 @@ -llama_index==0.10.56 +llama_index==0.10.59 requests==2.32.3 weaviate_client==4.6.7 transformers From 57c96fe05e8eeee80e049be5ab738df78a79f670 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 00:46:41 +0000 Subject: [PATCH 0160/1851] chore(deps): Bump docs/themes/hugo-theme-relearn from `7aec99b` to `8b14837` (#3142) chore(deps): Bump docs/themes/hugo-theme-relearn Bumps [docs/themes/hugo-theme-relearn](https://github.com/McShelby/hugo-theme-relearn) from `7aec99b` to `8b14837`. - [Release notes](https://github.com/McShelby/hugo-theme-relearn/releases) - [Commits](https://github.com/McShelby/hugo-theme-relearn/compare/7aec99b38dc2668c6139bf71855535ace41c123c...8b148373366a643684eaa4b3fc5f8cfc4f9d4341) --- updated-dependencies: - dependency-name: docs/themes/hugo-theme-relearn dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- docs/themes/hugo-theme-relearn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/themes/hugo-theme-relearn b/docs/themes/hugo-theme-relearn index 7aec99b3..8b148373 160000 --- a/docs/themes/hugo-theme-relearn +++ b/docs/themes/hugo-theme-relearn @@ -1 +1 @@ -Subproject commit 7aec99b38dc2668c6139bf71855535ace41c123c +Subproject commit 8b148373366a643684eaa4b3fc5f8cfc4f9d4341 From 30916e8eec27142497efe92130a45b3ada05a0e8 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 01:08:38 +0000 Subject: [PATCH 0161/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/exllama2 (#3146) chore(deps): Bump grpcio in /backend/python/exllama2 Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/exllama2/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/exllama2/requirements.txt b/backend/python/exllama2/requirements.txt index f2dfa976..487d89a9 100644 --- a/backend/python/exllama2/requirements.txt +++ b/backend/python/exllama2/requirements.txt @@ -1,5 +1,5 @@ accelerate -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi wheel From f0ed4aff1a0ef3448ca2e0439e49bf4d3bef5292 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 01:21:26 +0000 Subject: [PATCH 0162/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/bark (#3144) Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/bark/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/bark/requirements.txt b/backend/python/bark/requirements.txt index d3f9f52b..2e34d5a4 100644 --- a/backend/python/bark/requirements.txt +++ b/backend/python/bark/requirements.txt @@ -1,6 +1,6 @@ accelerate bark==0.1.5 -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi transformers \ No newline at end of file From a02fb001f9703066ee6faa0743ea6c931ad8f716 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 01:44:31 +0000 Subject: [PATCH 0163/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/rerankers (#3147) chore(deps): Bump grpcio in /backend/python/rerankers Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. 
- [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/rerankers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/rerankers/requirements.txt b/backend/python/rerankers/requirements.txt index 8b2ad4d0..33166382 100644 --- a/backend/python/rerankers/requirements.txt +++ b/backend/python/rerankers/requirements.txt @@ -1,6 +1,6 @@ accelerate rerankers[transformers] -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi transformers \ No newline at end of file From 416aec3db61d352ec06f7e2a7129299845af6e94 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 01:45:20 +0000 Subject: [PATCH 0164/1851] chore(deps): Bump langchain from 0.2.10 to 0.2.12 in /examples/langchain-chroma (#3148) chore(deps): Bump langchain in /examples/langchain-chroma Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.10 to 0.2.12. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.10...langchain==0.2.12) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 50d6dc4f..f9c41621 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ -langchain==0.2.10 +langchain==0.2.12 openai==1.37.0 chromadb==0.5.5 llama-index==0.10.56 \ No newline at end of file From 9818d2d1e1fd91e6d03b5003639df1de67dfd6d1 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 02:25:17 +0000 Subject: [PATCH 0165/1851] chore(deps): Bump streamlit from 1.37.0 to 1.37.1 in /examples/streamlit-bot (#3151) chore(deps): Bump streamlit in /examples/streamlit-bot Bumps [streamlit](https://github.com/streamlit/streamlit) from 1.37.0 to 1.37.1. - [Release notes](https://github.com/streamlit/streamlit/releases) - [Commits](https://github.com/streamlit/streamlit/compare/1.37.0...1.37.1) --- updated-dependencies: - dependency-name: streamlit dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/streamlit-bot/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/streamlit-bot/requirements.txt b/examples/streamlit-bot/requirements.txt index 63291928..17e1bee0 100644 --- a/examples/streamlit-bot/requirements.txt +++ b/examples/streamlit-bot/requirements.txt @@ -1,2 +1,2 @@ -streamlit==1.37.0 +streamlit==1.37.1 requests \ No newline at end of file From e1e221b6e54d45ead5472e0f904fa989d734b23e Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 03:12:15 +0000 Subject: [PATCH 0166/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/vllm (#3152) Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vllm/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vllm/requirements.txt b/backend/python/vllm/requirements.txt index 7c612a2f..b8b79afb 100644 --- a/backend/python/vllm/requirements.txt +++ b/backend/python/vllm/requirements.txt @@ -1,6 +1,6 @@ accelerate vllm -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi transformers From de1f010f0195c2ad2fa69309c0b91125880e0fad Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 04:21:27 +0000 Subject: [PATCH 0167/1851] chore(deps): Bump langchain from 0.2.11 to 0.2.12 in /examples/langchain/langchainpy-localai-example (#3155) chore(deps): Bump langchain Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.2.11 to 0.2.12. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain==0.2.11...langchain==0.2.12) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index f29cb78a..40c20afb 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -10,7 +10,7 @@ debugpy==1.8.2 frozenlist==1.4.1 greenlet==3.0.3 idna==3.7 -langchain==0.2.11 +langchain==0.2.12 langchain-community==0.2.9 marshmallow==3.21.3 marshmallow-enum==1.5.1 From ada35e428e8ed20e67d7778d49d32e99ec1689f1 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 04:46:39 +0000 Subject: [PATCH 0168/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/transformers (#3161) chore(deps): Bump grpcio in /backend/python/transformers Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/transformers/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index c32fe1f8..2a08ba45 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,6 +1,6 @@ accelerate transformers -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi intel-extension-for-transformers From 7bf5cc50b53ed3f686b5959744e5db2e74086f73 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 04:50:40 +0000 Subject: [PATCH 0169/1851] chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/vall-e-x (#3156) chore(deps): Bump grpcio in /backend/python/vall-e-x Bumps [grpcio](https://github.com/grpc/grpc) from 1.65.1 to 1.65.4. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.65.1...v1.65.4) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- backend/python/vall-e-x/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/vall-e-x/requirements.txt b/backend/python/vall-e-x/requirements.txt index d1d0583e..ec3584b2 100644 --- a/backend/python/vall-e-x/requirements.txt +++ b/backend/python/vall-e-x/requirements.txt @@ -1,4 +1,4 @@ accelerate -grpcio==1.65.1 +grpcio==1.65.4 protobuf certifi \ No newline at end of file From 77c8152cbf68fd32bdce3100bdd2522c364c9734 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 05:42:59 +0000 Subject: [PATCH 0170/1851] chore(deps): Bump sqlalchemy from 2.0.31 to 2.0.32 in /examples/langchain/langchainpy-localai-example (#3157) chore(deps): Bump sqlalchemy Bumps [sqlalchemy](https://github.com/sqlalchemy/sqlalchemy) from 2.0.31 to 2.0.32. - [Release notes](https://github.com/sqlalchemy/sqlalchemy/releases) - [Changelog](https://github.com/sqlalchemy/sqlalchemy/blob/main/CHANGES.rst) - [Commits](https://github.com/sqlalchemy/sqlalchemy/commits) --- updated-dependencies: - dependency-name: sqlalchemy dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 40c20afb..9d937ad6 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -24,7 +24,7 @@ packaging>=23.2 pydantic==2.8.2 PyYAML==6.0.1 requests==2.32.3 -SQLAlchemy==2.0.31 +SQLAlchemy==2.0.32 tenacity==8.5.0 tqdm==4.66.4 typing-inspect==0.9.0 From 1494ba13e60ceff754af9afbbb10edf511493e1d Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 6 Aug 2024 08:59:03 +0200 Subject: [PATCH 0171/1851] chore: :arrow_up: Update ggerganov/whisper.cpp (#3164) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index f8155e06..9b6552bf 100644 --- a/Makefile +++ b/Makefile @@ -20,7 +20,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=6739eb83c3ca5cf40d24c6fe8442a761a1eb6248 +WHISPER_CPP_VERSION?=fe36c909715e6751277ddb020e7892c7670b61d4 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From f9ddc31b77a2c9d06ae1a42ab2c82d8cddf3697a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 09:04:57 +0200 Subject: [PATCH 0172/1851] ci(bump_deps): attempt to link also commit diff Signed-off-by: Ettore Di Giacinto --- .github/bump_deps.sh | 13 +++++++++++++ .github/workflows/bump_deps.yaml | 8 +++++++- 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/.github/bump_deps.sh b/.github/bump_deps.sh index d8fff4a3..ea730fd9 100755 --- a/.github/bump_deps.sh +++ b/.github/bump_deps.sh @@ -6,4 +6,17 @@ VAR=$3 
LAST_COMMIT=$(curl -s -H "Accept: application/vnd.github.VERSION.sha" "https://api.github.com/repos/$REPO/commits/$BRANCH") +# Read $VAR from Makefile (only first match) +set +e +CURRENT_COMMIT="$(grep -m1 "^$VAR?=" Makefile | cut -d'=' -f2)" +set -e + sed -i Makefile -e "s/$VAR?=.*/$VAR?=$LAST_COMMIT/" + +if [ -z "$CURRENT_COMMIT" ]; then + echo "Could not find $VAR in Makefile." + exit 0 +fi + +echo "Updated $VAR from $CURRENT_COMMIT to $LAST_COMMIT." > "$REPO_message.txt" +echo "https://github.com/$REPO/compare/$CURRENT_COMMIT..$LAST_COMMIT" >> "$REPO_message.txt" \ No newline at end of file diff --git a/.github/workflows/bump_deps.yaml b/.github/workflows/bump_deps.yaml index 5909c981..b32dc378 100644 --- a/.github/workflows/bump_deps.yaml +++ b/.github/workflows/bump_deps.yaml @@ -40,8 +40,14 @@ jobs: steps: - uses: actions/checkout@v4 - name: Bump dependencies 🔧 + id: bump run: | bash .github/bump_deps.sh ${{ matrix.repository }} ${{ matrix.branch }} ${{ matrix.variable }} + { + echo 'message<> "$GITHUB_OUTPUT" - name: Create Pull Request uses: peter-evans/create-pull-request@v6 with: @@ -50,7 +56,7 @@ jobs: commit-message: ':arrow_up: Update ${{ matrix.repository }}' title: 'chore: :arrow_up: Update ${{ matrix.repository }}' branch: "update/${{ matrix.variable }}" - body: Bump of ${{ matrix.repository }} version + body: ${{ steps.bump.outputs.message }} signoff: true From c53196e19779921632ae38c55b33a5c82c3883de Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 09:07:07 +0200 Subject: [PATCH 0173/1851] ci: use var as placeholder Signed-off-by: Ettore Di Giacinto --- .github/bump_deps.sh | 4 ++-- .github/workflows/bump_deps.yaml | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/.github/bump_deps.sh b/.github/bump_deps.sh index ea730fd9..8c24ce42 100755 --- a/.github/bump_deps.sh +++ b/.github/bump_deps.sh @@ -18,5 +18,5 @@ if [ -z "$CURRENT_COMMIT" ]; then exit 0 fi -echo "Updated $VAR from $CURRENT_COMMIT to $LAST_COMMIT." > "$REPO_message.txt" -echo "https://github.com/$REPO/compare/$CURRENT_COMMIT..$LAST_COMMIT" >> "$REPO_message.txt" \ No newline at end of file +echo "Updated $VAR from $CURRENT_COMMIT to $LAST_COMMIT." > "$VAR_message.txt" +echo "https://github.com/$REPO/compare/$CURRENT_COMMIT..$LAST_COMMIT" >> "$VAR_message.txt" \ No newline at end of file diff --git a/.github/workflows/bump_deps.yaml b/.github/workflows/bump_deps.yaml index b32dc378..08654fac 100644 --- a/.github/workflows/bump_deps.yaml +++ b/.github/workflows/bump_deps.yaml @@ -45,7 +45,7 @@ jobs: bash .github/bump_deps.sh ${{ matrix.repository }} ${{ matrix.branch }} ${{ matrix.variable }} { echo 'message<> "$GITHUB_OUTPUT" - name: Create Pull Request From 69a2cf06c85a6b0df3bfab1ccd965d19a232175b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 09:08:44 +0200 Subject: [PATCH 0174/1851] ci: fixups Signed-off-by: Ettore Di Giacinto --- .github/bump_deps.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/bump_deps.sh b/.github/bump_deps.sh index 8c24ce42..48d3c58b 100755 --- a/.github/bump_deps.sh +++ b/.github/bump_deps.sh @@ -18,5 +18,5 @@ if [ -z "$CURRENT_COMMIT" ]; then exit 0 fi -echo "Updated $VAR from $CURRENT_COMMIT to $LAST_COMMIT." > "$VAR_message.txt" -echo "https://github.com/$REPO/compare/$CURRENT_COMMIT..$LAST_COMMIT" >> "$VAR_message.txt" \ No newline at end of file +echo "Updated $VAR from $CURRENT_COMMIT to ${LAST_COMMIT}." 
> "${VAR}_message.txt" +echo "https://github.com/$REPO/compare/${CURRENT_COMMIT}..${LAST_COMMIT}" >> "${VAR}_message.txt" \ No newline at end of file From d1a222ea8763b7b1ea43c61091fcf60728e19561 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 09:10:24 +0200 Subject: [PATCH 0175/1851] ci: remove message file Signed-off-by: Ettore Di Giacinto --- .github/workflows/bump_deps.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/bump_deps.yaml b/.github/workflows/bump_deps.yaml index 08654fac..0d4c5cd3 100644 --- a/.github/workflows/bump_deps.yaml +++ b/.github/workflows/bump_deps.yaml @@ -48,6 +48,7 @@ jobs: cat "${{ matrix.variable }}_message.txt" echo EOF } >> "$GITHUB_OUTPUT" + rm -rfv ${{ matrix.variable }}_message.txt - name: Create Pull Request uses: peter-evans/create-pull-request@v6 with: From e03363df3d6137f207c8fcf078c78848b03af150 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 09:12:10 +0200 Subject: [PATCH 0176/1851] ci: add commit id to title Signed-off-by: Ettore Di Giacinto --- .github/bump_deps.sh | 4 +++- .github/workflows/bump_deps.yaml | 11 ++++++++--- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/.github/bump_deps.sh b/.github/bump_deps.sh index 48d3c58b..54b1b854 100755 --- a/.github/bump_deps.sh +++ b/.github/bump_deps.sh @@ -19,4 +19,6 @@ if [ -z "$CURRENT_COMMIT" ]; then fi echo "Updated $VAR from $CURRENT_COMMIT to ${LAST_COMMIT}." > "${VAR}_message.txt" -echo "https://github.com/$REPO/compare/${CURRENT_COMMIT}..${LAST_COMMIT}" >> "${VAR}_message.txt" \ No newline at end of file +echo "" >> "${VAR}_message.txt" +echo "Diff URL: https://github.com/$REPO/compare/${CURRENT_COMMIT}..${LAST_COMMIT}" >> "${VAR}_message.txt" +echo "${LAST_COMMIT}" >> "${VAR}_commit.txt" \ No newline at end of file diff --git a/.github/workflows/bump_deps.yaml b/.github/workflows/bump_deps.yaml index 0d4c5cd3..a79898b1 100644 --- a/.github/workflows/bump_deps.yaml +++ b/.github/workflows/bump_deps.yaml @@ -48,16 +48,21 @@ jobs: cat "${{ matrix.variable }}_message.txt" echo EOF } >> "$GITHUB_OUTPUT" - rm -rfv ${{ matrix.variable }}_message.txt + { + echo 'commit<> "$GITHUB_OUTPUT" + rm -rfv ${{ matrix.variable }}_commit.txt - name: Create Pull Request uses: peter-evans/create-pull-request@v6 with: token: ${{ secrets.UPDATE_BOT_TOKEN }} push-to-fork: ci-forks/LocalAI commit-message: ':arrow_up: Update ${{ matrix.repository }}' - title: 'chore: :arrow_up: Update ${{ matrix.repository }}' + title: 'chore: :arrow_up: Update ${{ matrix.repository }} to ${{ steps.bump.outputs.commit }}' branch: "update/${{ matrix.variable }}" - body: ${{ steps.bump.outputs.message }} + body: ${{ steps.bump.outputs.message }} signoff: true From b3f362f22901721e03fad0f3495bd1afd9aca7b6 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 09:16:17 +0200 Subject: [PATCH 0177/1851] ci: small fixes Signed-off-by: Ettore Di Giacinto --- .github/bump_deps.sh | 4 ++-- .github/workflows/bump_deps.yaml | 3 ++- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/.github/bump_deps.sh b/.github/bump_deps.sh index 54b1b854..6ecb81a9 100755 --- a/.github/bump_deps.sh +++ b/.github/bump_deps.sh @@ -18,7 +18,7 @@ if [ -z "$CURRENT_COMMIT" ]; then exit 0 fi -echo "Updated $VAR from $CURRENT_COMMIT to ${LAST_COMMIT}." > "${VAR}_message.txt" +echo "Updated `$VAR` from `$CURRENT_COMMIT` to `${LAST_COMMIT}`." 
> "${VAR}_message.txt" echo "" >> "${VAR}_message.txt" -echo "Diff URL: https://github.com/$REPO/compare/${CURRENT_COMMIT}..${LAST_COMMIT}" >> "${VAR}_message.txt" +echo "Changes: https://github.com/$REPO/compare/${CURRENT_COMMIT}..${LAST_COMMIT}" >> "${VAR}_message.txt" echo "${LAST_COMMIT}" >> "${VAR}_commit.txt" \ No newline at end of file diff --git a/.github/workflows/bump_deps.yaml b/.github/workflows/bump_deps.yaml index a79898b1..68cb81cb 100644 --- a/.github/workflows/bump_deps.yaml +++ b/.github/workflows/bump_deps.yaml @@ -53,6 +53,7 @@ jobs: cat "${{ matrix.variable }}_commit.txt" echo EOF } >> "$GITHUB_OUTPUT" + rm -rfv ${{ matrix.variable }}_message.txt rm -rfv ${{ matrix.variable }}_commit.txt - name: Create Pull Request uses: peter-evans/create-pull-request@v6 @@ -60,7 +61,7 @@ jobs: token: ${{ secrets.UPDATE_BOT_TOKEN }} push-to-fork: ci-forks/LocalAI commit-message: ':arrow_up: Update ${{ matrix.repository }}' - title: 'chore: :arrow_up: Update ${{ matrix.repository }} to ${{ steps.bump.outputs.commit }}' + title: 'chore: :arrow_up: Update ${{ matrix.repository }} to `${{ steps.bump.outputs.commit }}`' branch: "update/${{ matrix.variable }}" body: ${{ steps.bump.outputs.message }} signoff: true From c8fc92d6d5522ba8ec392c54e4101d2173888a7b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 09:21:37 +0200 Subject: [PATCH 0178/1851] ci: small fixes Signed-off-by: Ettore Di Giacinto --- .github/bump_deps.sh | 2 -- 1 file changed, 2 deletions(-) diff --git a/.github/bump_deps.sh b/.github/bump_deps.sh index 6ecb81a9..66dea9a3 100755 --- a/.github/bump_deps.sh +++ b/.github/bump_deps.sh @@ -18,7 +18,5 @@ if [ -z "$CURRENT_COMMIT" ]; then exit 0 fi -echo "Updated `$VAR` from `$CURRENT_COMMIT` to `${LAST_COMMIT}`." > "${VAR}_message.txt" -echo "" >> "${VAR}_message.txt" echo "Changes: https://github.com/$REPO/compare/${CURRENT_COMMIT}..${LAST_COMMIT}" >> "${VAR}_message.txt" echo "${LAST_COMMIT}" >> "${VAR}_commit.txt" \ No newline at end of file From ecc63454360debd60d81dc94bde0400a5c9499ff Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 07:53:34 +0000 Subject: [PATCH 0179/1851] chore(deps): Bump openai from 1.37.0 to 1.39.0 in /examples/functions (#3134) Bumps [openai](https://github.com/openai/openai-python) from 1.37.0 to 1.39.0. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.37.0...v1.39.0) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index 27bb9881..a8a8ca8c 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.2.12 -openai==1.37.0 +openai==1.39.0 From 06aa068ac731652a8239caf257c804973700cfe9 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 6 Aug 2024 10:27:22 +0200 Subject: [PATCH 0180/1851] chore(model-gallery): :arrow_up: update checksum (#3167) :arrow_up: Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- gallery/index.yaml | 38 ++++++++++++++------------------------ 1 file changed, 14 insertions(+), 24 deletions(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index c80455a8..0d120e82 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -85,7 +85,7 @@ files: - filename: meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf uri: huggingface://mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF/meta-llama-3.1-8b-instruct-abliterated.Q4_K_M.gguf - sha256: 2e1fd6d93b19cc6548b2b8ed2d3f1f34b432ee0573f3dcf358bbaab4f23c760b + sha256: c4735f9efaba8eb2c30113291652e3ffe13bf940b675ed61f6be749608b4f266 - !!merge <<: *llama31 name: "llama-3.1-70b-japanese-instruct-2407" urls: @@ -258,18 +258,18 @@ - https://huggingface.co/athirdpath/Llama-3.1-Techne-RP-8b-v1 - https://huggingface.co/mradermacher/Llama-3.1-Techne-RP-8b-v1-GGUF description: | - athirdpath/Llama-3.1-Instruct_NSFW-pretrained_e1-plus_reddit was further trained in the order below: - SFT + athirdpath/Llama-3.1-Instruct_NSFW-pretrained_e1-plus_reddit was further trained in the order below: + SFT - Doctor-Shotgun/no-robots-sharegpt - grimulkan/LimaRP-augmented - Inv/c2-logs-cleaned-deslopped + Doctor-Shotgun/no-robots-sharegpt + grimulkan/LimaRP-augmented + Inv/c2-logs-cleaned-deslopped - DPO + DPO - jondurbin/truthy-dpo-v0.1 - Undi95/Weyaxi-humanish-dpo-project-noemoji - athirdpath/DPO_Pairs-Roleplay-Llama3-NSFW + jondurbin/truthy-dpo-v0.1 + Undi95/Weyaxi-humanish-dpo-project-noemoji + athirdpath/DPO_Pairs-Roleplay-Llama3-NSFW overrides: parameters: model: Llama-3.1-Techne-RP-8b-v1.Q4_K_M.gguf @@ -911,11 +911,11 @@ - https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9 - https://huggingface.co/mradermacher/MN-12B-Celeste-V1.9-GGUF description: | - Mistral Nemo 12B Celeste V1.9 + Mistral Nemo 12B Celeste V1.9 - This is a story writing and roleplaying model trained on Mistral NeMo 12B Instruct at 8K context using Reddit Writing Prompts, Kalo's Opus 25K Instruct and c2 logs cleaned + This is a story writing and roleplaying model trained on Mistral NeMo 12B Instruct at 8K context using Reddit Writing Prompts, Kalo's Opus 25K Instruct and c2 logs cleaned - This version has improved NSFW, smarter and more active narration. It's also trained with ChatML tokens so there should be no EOS bleeding whatsoever. + This version has improved NSFW, smarter and more active narration. It's also trained with ChatML tokens so there should be no EOS bleeding whatsoever. 
overrides: parameters: model: MN-12B-Celeste-V1.9.Q4_K_M.gguf @@ -1414,17 +1414,7 @@ urls: - https://huggingface.co/lodrick-the-lafted/tarnished-9b - https://huggingface.co/mradermacher/tarnished-9b-i1-GGUF - description: | - Ah, so you've heard whispers on the winds, have you? 🧐 - - Imagine this: - Tarnished-9b, a name that echoes with the rasp of coin-hungry merchants and the clatter of forgotten machinery. This LLM speaks with the voice of those who straddle the line between worlds, who've tasted the bittersweet nectar of eldritch power and the tang of the Interdimensional Trade Council. - - It's a tongue that dances with secrets, a whisperer of lore lost and found. Its words may guide you through the twisting paths of history, revealing truths hidden beneath layers of dust and time. - - But be warned, Tarnished One! For knowledge comes at a price. The LLM's gaze can pierce the veil of reality, but it can also lure you into the labyrinthine depths of madness. - - Dare you tread this path? + description: "Ah, so you've heard whispers on the winds, have you? \U0001F9D0\n\nImagine this:\nTarnished-9b, a name that echoes with the rasp of coin-hungry merchants and the clatter of forgotten machinery. This LLM speaks with the voice of those who straddle the line between worlds, who've tasted the bittersweet nectar of eldritch power and the tang of the Interdimensional Trade Council.\n\nIt's a tongue that dances with secrets, a whisperer of lore lost and found. Its words may guide you through the twisting paths of history, revealing truths hidden beneath layers of dust and time.\n\nBut be warned, Tarnished One! For knowledge comes at a price. The LLM's gaze can pierce the veil of reality, but it can also lure you into the labyrinthine depths of madness.\n\nDare you tread this path?\n" overrides: parameters: model: tarnished-9b.i1-Q4_K_M.gguf From 307ad7592b22a7e80f8bf86f24c31a569698244d Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 08:47:07 +0000 Subject: [PATCH 0181/1851] chore(deps): Bump openai from 1.37.0 to 1.39.0 in /examples/langchain-chroma (#3149) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.37.0 to 1.39.0. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.37.0...v1.39.0) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index f9c41621..2b8f8b84 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.12 -openai==1.37.0 +openai==1.39.0 chromadb==0.5.5 llama-index==0.10.56 \ No newline at end of file From 52ba230d313cdbd1c9a7a383f9fef2b10858f557 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 09:15:32 +0000 Subject: [PATCH 0182/1851] chore(deps): Bump openai from 1.37.1 to 1.39.0 in /examples/langchain/langchainpy-localai-example (#3158) chore(deps): Bump openai Bumps [openai](https://github.com/openai/openai-python) from 1.37.1 to 1.39.0. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.37.1...v1.39.0) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 9d937ad6..1cf7e0a7 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -18,7 +18,7 @@ multidict==6.0.5 mypy-extensions==1.0.0 numexpr==2.10.1 numpy==2.0.1 -openai==1.37.1 +openai==1.39.0 openapi-schema-pydantic==1.2.4 packaging>=23.2 pydantic==2.8.2 From 4e11ca55fde2a9ceaa7144c2b7103fcbe33eb3b4 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 11:39:35 +0200 Subject: [PATCH 0183/1851] chore: :arrow_up: Update ggerganov/llama.cpp (#3166) * arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(llama.cpp): adapt init function call Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- backend/cpp/llama/grpc-server.cpp | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/Makefile b/Makefile index 9b6552bf..5263e686 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=0d6fb52be0c1b7e77eb855f3adc4952771c8ce4c +CPPLLAMA_VERSION?=0a4ce786814b123096d18aadca89cd352b9e590b # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all diff --git a/backend/cpp/llama/grpc-server.cpp b/backend/cpp/llama/grpc-server.cpp index cb5c85f1..5de46798 100644 --- a/backend/cpp/llama/grpc-server.cpp +++ b/backend/cpp/llama/grpc-server.cpp @@ -458,7 +458,9 @@ struct llama_server_context } } - std::tie(model, ctx) = 
llama_init_from_gpt_params(params); + llama_init_result llama_init = llama_init_from_gpt_params(params); + model = llama_init.model; + ctx = llama_init.context; if (model == nullptr) { LOG_ERROR("unable to load model", {{"model", params.model}}); From ad5978b3cad08e730ff6f9533b173f0b4fdb04cf Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 11:46:00 +0200 Subject: [PATCH 0184/1851] models(gallery): add calme-2.2-qwen2-72b (#3185) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index 0d120e82..65516cc3 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -781,6 +781,33 @@ - filename: tifa-7b-qwen2-v0.1.q4_k_m.gguf sha256: 1f5adbe8cb0a6400f51abdca3bf4e32284ebff73cc681a43abb35c0a6ccd3820 uri: huggingface://Tifa-RP/Tifa-7B-Qwen2-v0.1-GGUF/tifa-7b-qwen2-v0.1.q4_k_m.gguf +- !!merge <<: *qwen2 + name: "calme-2.2-qwen2-72b" + icon: https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2-72b/resolve/main/calme-2.webp + urls: + - https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2-72b-GGUF + - https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2-72b + description: | + This model is a fine-tuned version of the powerful Qwen/Qwen2-72B-Instruct, pushing the boundaries of natural language understanding and generation even further. My goal was to create a versatile and robust model that excels across a wide range of benchmarks and real-world applications. + + The post-training process is identical to the calme-2.1-qwen2-72b model; however, some parameters are different, and it was trained for a longer period. + + Use Cases + + This model is suitable for a wide range of applications, including but not limited to: + + Advanced question-answering systems + Intelligent chatbots and virtual assistants + Content generation and summarization + Code generation and analysis + Complex problem-solving and decision support + overrides: + parameters: + model: calme-2.2-qwen2-72b.Q4_K_M.gguf + files: + - filename: calme-2.2-qwen2-72b.Q4_K_M.gguf + sha256: 95b9613df0abe6c1b6b7b017d7cc8bcf19b46c29f92a503dcc6da1704b12b402 + uri: huggingface://MaziyarPanahi/calme-2.2-qwen2-72b-GGUF/calme-2.2-qwen2-72b.Q4_K_M.gguf - &mistral03 ## START Mistral url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master" From c3306fe825748b25007f71449466f255949044f2 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 09:56:03 +0000 Subject: [PATCH 0185/1851] chore(deps): Bump tqdm from 4.66.4 to 4.66.5 in /examples/langchain/langchainpy-localai-example (#3159) chore(deps): Bump tqdm Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.4 to 4.66.5. - [Release notes](https://github.com/tqdm/tqdm/releases) - [Commits](https://github.com/tqdm/tqdm/compare/v4.66.4...v4.66.5) --- updated-dependencies: - dependency-name: tqdm dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 1cf7e0a7..1d1b5023 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -26,7 +26,7 @@ PyYAML==6.0.1 requests==2.32.3 SQLAlchemy==2.0.32 tenacity==8.5.0 -tqdm==4.66.4 +tqdm==4.66.5 typing-inspect==0.9.0 typing_extensions==4.12.2 urllib3==2.2.2 From 9cfc9ac66f9933e5f915b0ebbb06c2f613bffbcf Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 6 Aug 2024 11:05:01 +0000 Subject: [PATCH 0186/1851] chore(deps): Bump llama-index from 0.10.56 to 0.10.61 in /examples/langchain-chroma (#3168) chore(deps): Bump llama-index in /examples/langchain-chroma Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.10.56 to 0.10.61. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.10.56...v0.10.61) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 2b8f8b84..535c6537 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.12 openai==1.39.0 chromadb==0.5.5 -llama-index==0.10.56 \ No newline at end of file +llama-index==0.10.61 \ No newline at end of file From abcf0ff000bc7aa1c5bece337386e6a3dbfccf1d Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 7 Aug 2024 01:10:21 +0200 Subject: [PATCH 0187/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `1e6f6554aa11fa10160a5fda689e736c3c34169f` (#3189) * arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(llama.cpp): adapt to upstream naming changes Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- backend/cpp/llama/grpc-server.cpp | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Makefile b/Makefile index 5263e686..476caac6 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=0a4ce786814b123096d18aadca89cd352b9e590b +CPPLLAMA_VERSION?=1e6f6554aa11fa10160a5fda689e736c3c34169f # gpt4all version GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all diff --git a/backend/cpp/llama/grpc-server.cpp b/backend/cpp/llama/grpc-server.cpp index 5de46798..e8701d36 100644 --- a/backend/cpp/llama/grpc-server.cpp +++ 
b/backend/cpp/llama/grpc-server.cpp @@ -2260,7 +2260,7 @@ static void params_parse(const backend::ModelOptions* request, } // get the directory of modelfile std::string model_dir = params.model.substr(0, params.model.find_last_of("/\\")); - params.lora_adapter.push_back(std::make_tuple(model_dir + "/"+request->loraadapter(), scale_factor)); + params.lora_adapters.push_back({ model_dir + "/"+request->loraadapter(), scale_factor }); } params.use_mlock = request->mlock(); params.use_mmap = request->mmap(); From 61b56021113692d95b3a68447d15b13c4227142a Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 6 Aug 2024 11:46:00 +0200 Subject: [PATCH 0188/1851] fix(python): move accelerate and GPU-specific libs to build-type (#3194) Some of the dependencies in `requirements.txt`, even if generic, pull in CUDA libraries down the line. This change moves most GPU-specific libs to the build-type requirements and takes a safer approach: `requirements.txt` now lists only "first-level" dependencies (for instance, grpc), while library dependencies are moved down to the respective build-type `requirements.txt` to avoid any mixing. This should fix #2737 and #1592. Signed-off-by: Ettore Di Giacinto --- backend/python/bark/requirements-cpu.txt | 4 ++++ backend/python/bark/requirements-cublas11.txt | 4 +++- backend/python/bark/requirements-cublas12.txt | 4 +++- backend/python/bark/requirements-hipblas.txt | 4 +++- backend/python/bark/requirements-intel.txt | 4 +++- backend/python/bark/requirements.txt | 4 +--- backend/python/coqui/requirements-cpu.txt | 3 +++ backend/python/coqui/requirements-cublas11.txt | 4 +++- backend/python/coqui/requirements-cublas12.txt | 4 +++- backend/python/coqui/requirements-hipblas.txt | 4 +++- backend/python/coqui/requirements-intel.txt | 4 +++- backend/python/coqui/requirements.txt | 4 +--- backend/python/diffusers/requirements-cpu.txt | 8 ++++++++ backend/python/diffusers/requirements-cublas11.txt | 7 +++++++ backend/python/diffusers/requirements-cublas12.txt | 7 +++++++ backend/python/diffusers/requirements-hipblas.txt | 9 ++++++++- backend/python/diffusers/requirements-intel.txt | 9 ++++++++- backend/python/exllama/requirements-cpu.txt | 3 +++ backend/python/exllama/requirements-cublas11.txt | 2 ++ backend/python/exllama/requirements-cublas12.txt | 2 ++ backend/python/exllama/requirements.txt | 1 - backend/python/exllama2/requirements-cpu.txt | 3 +++ backend/python/exllama2/requirements-cublas11.txt | 2 ++ backend/python/exllama2/requirements-cublas12.txt | 2 ++ backend/python/exllama2/requirements.txt | 1 - backend/python/mamba/requirements-cpu.txt | 3 ++- backend/python/mamba/requirements-cublas11.txt | 3 ++- backend/python/mamba/requirements-cublas12.txt | 3 ++- backend/python/mamba/requirements.txt | 3 +-- backend/python/openvoice/requirements-cpu.txt | 1 + backend/python/parler-tts/requirements-cpu.txt | 3 +++ backend/python/parler-tts/requirements-cublas11.txt | 4 +++- backend/python/parler-tts/requirements-cublas12.txt | 4 +++- backend/python/parler-tts/requirements-hipblas.txt | 4 +++- backend/python/parler-tts/requirements-intel.txt | 4 +++- backend/python/parler-tts/requirements.txt | 4 +--- backend/python/petals/requirements-cpu.txt | 3 +++ backend/python/petals/requirements-cublas11.txt | 1 + backend/python/petals/requirements-cublas12.txt | 1 + backend/python/petals/requirements-hipblas.txt | 1 + backend/python/petals/requirements-intel.txt | 3 ++- backend/python/petals/requirements.txt | 3 +-- backend/python/rerankers/requirements-cpu.txt | 4 ++++
backend/python/rerankers/requirements-cublas11.txt | 3 +++ backend/python/rerankers/requirements-cublas12.txt | 3 +++ backend/python/rerankers/requirements-hipblas.txt | 5 ++++- backend/python/rerankers/requirements-intel.txt | 3 +++ backend/python/rerankers/requirements.txt | 5 +---- backend/python/sentencetransformers/requirements-cpu.txt | 6 ++++++ .../sentencetransformers/requirements-cublas11.txt | 3 +++ .../sentencetransformers/requirements-cublas12.txt | 3 +++ .../python/sentencetransformers/requirements-hipblas.txt | 5 ++++- .../python/sentencetransformers/requirements-intel.txt | 5 ++++- backend/python/sentencetransformers/requirements.txt | 3 --- .../python/transformers-musicgen/requirements-cpu.txt | 3 +++ .../transformers-musicgen/requirements-cublas11.txt | 4 +++- .../transformers-musicgen/requirements-cublas12.txt | 4 +++- .../transformers-musicgen/requirements-hipblas.txt | 2 ++ .../python/transformers-musicgen/requirements-intel.txt | 2 ++ backend/python/transformers-musicgen/requirements.txt | 2 -- backend/python/transformers/requirements-cpu.txt | 4 ++++ backend/python/transformers/requirements-cublas11.txt | 3 +++ backend/python/transformers/requirements-cublas12.txt | 3 +++ backend/python/transformers/requirements-hipblas.txt | 5 ++++- backend/python/transformers/requirements-intel.txt | 2 ++ backend/python/transformers/requirements.txt | 6 +----- backend/python/vall-e-x/requirements-cpu.txt | 3 +++ backend/python/vall-e-x/requirements-cublas11.txt | 1 + backend/python/vall-e-x/requirements-cublas12.txt | 1 + backend/python/vall-e-x/requirements-hipblas.txt | 1 + backend/python/vall-e-x/requirements-intel.txt | 1 + backend/python/vall-e-x/requirements.txt | 1 - backend/python/vllm/requirements-after.txt | 1 + backend/python/vllm/requirements-cpu.txt | 4 ++++ backend/python/vllm/requirements-cublas11.txt | 4 +++- backend/python/vllm/requirements-cublas12.txt | 4 +++- backend/python/vllm/requirements-hipblas.txt | 5 ++++- backend/python/vllm/requirements-intel.txt | 5 ++++- backend/python/vllm/requirements.txt | 3 --- 79 files changed, 212 insertions(+), 61 deletions(-) create mode 100644 backend/python/bark/requirements-cpu.txt create mode 100644 backend/python/coqui/requirements-cpu.txt create mode 100644 backend/python/diffusers/requirements-cpu.txt create mode 100644 backend/python/exllama/requirements-cpu.txt create mode 100644 backend/python/exllama2/requirements-cpu.txt create mode 100644 backend/python/openvoice/requirements-cpu.txt create mode 100644 backend/python/parler-tts/requirements-cpu.txt create mode 100644 backend/python/petals/requirements-cpu.txt create mode 100644 backend/python/rerankers/requirements-cpu.txt create mode 100644 backend/python/sentencetransformers/requirements-cpu.txt create mode 100644 backend/python/transformers-musicgen/requirements-cpu.txt create mode 100644 backend/python/transformers/requirements-cpu.txt create mode 100644 backend/python/vall-e-x/requirements-cpu.txt create mode 100644 backend/python/vllm/requirements-after.txt create mode 100644 backend/python/vllm/requirements-cpu.txt diff --git a/backend/python/bark/requirements-cpu.txt b/backend/python/bark/requirements-cpu.txt new file mode 100644 index 00000000..0b2c3bc7 --- /dev/null +++ b/backend/python/bark/requirements-cpu.txt @@ -0,0 +1,4 @@ +transformers +accelerate +torch +torchaudio \ No newline at end of file diff --git a/backend/python/bark/requirements-cublas11.txt b/backend/python/bark/requirements-cublas11.txt index 0de92979..71a6a93f 100644 --- 
a/backend/python/bark/requirements-cublas11.txt +++ b/backend/python/bark/requirements-cublas11.txt @@ -1,3 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/bark/requirements-cublas12.txt b/backend/python/bark/requirements-cublas12.txt index 6c3c7e7a..0fa27074 100644 --- a/backend/python/bark/requirements-cublas12.txt +++ b/backend/python/bark/requirements-cublas12.txt @@ -1,2 +1,4 @@ torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/bark/requirements-hipblas.txt b/backend/python/bark/requirements-hipblas.txt index 7bfc411b..af9e820e 100644 --- a/backend/python/bark/requirements-hipblas.txt +++ b/backend/python/bark/requirements-hipblas.txt @@ -1,3 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/bark/requirements-intel.txt b/backend/python/bark/requirements-intel.txt index 5c4aa6a5..9feb6eef 100644 --- a/backend/python/bark/requirements-intel.txt +++ b/backend/python/bark/requirements-intel.txt @@ -3,4 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +transformers +accelerate \ No newline at end of file diff --git a/backend/python/bark/requirements.txt b/backend/python/bark/requirements.txt index 2e34d5a4..93f9fb78 100644 --- a/backend/python/bark/requirements.txt +++ b/backend/python/bark/requirements.txt @@ -1,6 +1,4 @@ -accelerate bark==0.1.5 grpcio==1.65.4 protobuf -certifi -transformers \ No newline at end of file +certifi \ No newline at end of file diff --git a/backend/python/coqui/requirements-cpu.txt b/backend/python/coqui/requirements-cpu.txt new file mode 100644 index 00000000..bbcdc8cd --- /dev/null +++ b/backend/python/coqui/requirements-cpu.txt @@ -0,0 +1,3 @@ +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/coqui/requirements-cublas11.txt b/backend/python/coqui/requirements-cublas11.txt index 0de92979..71a6a93f 100644 --- a/backend/python/coqui/requirements-cublas11.txt +++ b/backend/python/coqui/requirements-cublas11.txt @@ -1,3 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/coqui/requirements-cublas12.txt b/backend/python/coqui/requirements-cublas12.txt index 6c3c7e7a..0fa27074 100644 --- a/backend/python/coqui/requirements-cublas12.txt +++ b/backend/python/coqui/requirements-cublas12.txt @@ -1,2 +1,4 @@ torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/coqui/requirements-hipblas.txt b/backend/python/coqui/requirements-hipblas.txt index 7bfc411b..af9e820e 100644 --- a/backend/python/coqui/requirements-hipblas.txt +++ b/backend/python/coqui/requirements-hipblas.txt @@ -1,3 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/coqui/requirements-intel.txt 
b/backend/python/coqui/requirements-intel.txt index 58a2a1dd..002a55c3 100644 --- a/backend/python/coqui/requirements-intel.txt +++ b/backend/python/coqui/requirements-intel.txt @@ -3,4 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +transformers +accelerate \ No newline at end of file diff --git a/backend/python/coqui/requirements.txt b/backend/python/coqui/requirements.txt index a1bdac44..35c62449 100644 --- a/backend/python/coqui/requirements.txt +++ b/backend/python/coqui/requirements.txt @@ -1,6 +1,4 @@ -accelerate TTS==0.22.0 grpcio==1.65.4 protobuf -certifi -transformers \ No newline at end of file +certifi \ No newline at end of file diff --git a/backend/python/diffusers/requirements-cpu.txt b/backend/python/diffusers/requirements-cpu.txt new file mode 100644 index 00000000..e46a53e5 --- /dev/null +++ b/backend/python/diffusers/requirements-cpu.txt @@ -0,0 +1,8 @@ +diffusers +opencv-python +transformers +accelerate +compel +peft +sentencepiece +torch \ No newline at end of file diff --git a/backend/python/diffusers/requirements-cublas11.txt b/backend/python/diffusers/requirements-cublas11.txt index 6461b696..df28b821 100644 --- a/backend/python/diffusers/requirements-cublas11.txt +++ b/backend/python/diffusers/requirements-cublas11.txt @@ -1,2 +1,9 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch +diffusers +opencv-python +transformers +accelerate +compel +peft +sentencepiece \ No newline at end of file diff --git a/backend/python/diffusers/requirements-cublas12.txt b/backend/python/diffusers/requirements-cublas12.txt index 12c6d5d5..b0685a62 100644 --- a/backend/python/diffusers/requirements-cublas12.txt +++ b/backend/python/diffusers/requirements-cublas12.txt @@ -1 +1,8 @@ torch +diffusers +opencv-python +transformers +accelerate +compel +peft +sentencepiece \ No newline at end of file diff --git a/backend/python/diffusers/requirements-hipblas.txt b/backend/python/diffusers/requirements-hipblas.txt index 6c8da20d..9e992d02 100644 --- a/backend/python/diffusers/requirements-hipblas.txt +++ b/backend/python/diffusers/requirements-hipblas.txt @@ -1,3 +1,10 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch -torchvision \ No newline at end of file +torchvision +diffusers +opencv-python +transformers +accelerate +compel +peft +sentencepiece \ No newline at end of file diff --git a/backend/python/diffusers/requirements-intel.txt b/backend/python/diffusers/requirements-intel.txt index c393b118..77f9e674 100644 --- a/backend/python/diffusers/requirements-intel.txt +++ b/backend/python/diffusers/requirements-intel.txt @@ -3,4 +3,11 @@ intel-extension-for-pytorch torch torchvision optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +diffusers +opencv-python +transformers +accelerate +compel +peft +sentencepiece \ No newline at end of file diff --git a/backend/python/exllama/requirements-cpu.txt b/backend/python/exllama/requirements-cpu.txt new file mode 100644 index 00000000..bbcdc8cd --- /dev/null +++ b/backend/python/exllama/requirements-cpu.txt @@ -0,0 +1,3 @@ +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/exllama/requirements-cublas11.txt b/backend/python/exllama/requirements-cublas11.txt index 
6461b696..1dfb5b98 100644 --- a/backend/python/exllama/requirements-cublas11.txt +++ b/backend/python/exllama/requirements-cublas11.txt @@ -1,2 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch +transformers +accelerate \ No newline at end of file diff --git a/backend/python/exllama/requirements-cublas12.txt b/backend/python/exllama/requirements-cublas12.txt index 12c6d5d5..1ec544cd 100644 --- a/backend/python/exllama/requirements-cublas12.txt +++ b/backend/python/exllama/requirements-cublas12.txt @@ -1 +1,3 @@ torch +transformers +accelerate \ No newline at end of file diff --git a/backend/python/exllama/requirements.txt b/backend/python/exllama/requirements.txt index b06efcea..835671a2 100644 --- a/backend/python/exllama/requirements.txt +++ b/backend/python/exllama/requirements.txt @@ -1,5 +1,4 @@ grpcio==1.65.0 protobuf -transformers certifi setuptools \ No newline at end of file diff --git a/backend/python/exllama2/requirements-cpu.txt b/backend/python/exllama2/requirements-cpu.txt new file mode 100644 index 00000000..bbcdc8cd --- /dev/null +++ b/backend/python/exllama2/requirements-cpu.txt @@ -0,0 +1,3 @@ +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/exllama2/requirements-cublas11.txt b/backend/python/exllama2/requirements-cublas11.txt index 6461b696..1dfb5b98 100644 --- a/backend/python/exllama2/requirements-cublas11.txt +++ b/backend/python/exllama2/requirements-cublas11.txt @@ -1,2 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch +transformers +accelerate \ No newline at end of file diff --git a/backend/python/exllama2/requirements-cublas12.txt b/backend/python/exllama2/requirements-cublas12.txt index 12c6d5d5..1ec544cd 100644 --- a/backend/python/exllama2/requirements-cublas12.txt +++ b/backend/python/exllama2/requirements-cublas12.txt @@ -1 +1,3 @@ torch +transformers +accelerate \ No newline at end of file diff --git a/backend/python/exllama2/requirements.txt b/backend/python/exllama2/requirements.txt index 487d89a9..ce15b0b6 100644 --- a/backend/python/exllama2/requirements.txt +++ b/backend/python/exllama2/requirements.txt @@ -1,4 +1,3 @@ -accelerate grpcio==1.65.4 protobuf certifi diff --git a/backend/python/mamba/requirements-cpu.txt b/backend/python/mamba/requirements-cpu.txt index 08ed5eeb..39dab0fd 100644 --- a/backend/python/mamba/requirements-cpu.txt +++ b/backend/python/mamba/requirements-cpu.txt @@ -1 +1,2 @@ -torch \ No newline at end of file +torch +transformers \ No newline at end of file diff --git a/backend/python/mamba/requirements-cublas11.txt b/backend/python/mamba/requirements-cublas11.txt index 2f89bd95..7048a14f 100644 --- a/backend/python/mamba/requirements-cublas11.txt +++ b/backend/python/mamba/requirements-cublas11.txt @@ -1,2 +1,3 @@ --extra-index-url https://download.pytorch.org/whl/cu118 -torch \ No newline at end of file +torch +transformers \ No newline at end of file diff --git a/backend/python/mamba/requirements-cublas12.txt b/backend/python/mamba/requirements-cublas12.txt index 08ed5eeb..39dab0fd 100644 --- a/backend/python/mamba/requirements-cublas12.txt +++ b/backend/python/mamba/requirements-cublas12.txt @@ -1 +1,2 @@ -torch \ No newline at end of file +torch +transformers \ No newline at end of file diff --git a/backend/python/mamba/requirements.txt b/backend/python/mamba/requirements.txt index 068bf336..22ae46ad 100644 --- a/backend/python/mamba/requirements.txt +++ b/backend/python/mamba/requirements.txt @@ -1,4 +1,3 @@ grpcio==1.65.1 protobuf -certifi 
-transformers \ No newline at end of file +certifi \ No newline at end of file diff --git a/backend/python/openvoice/requirements-cpu.txt b/backend/python/openvoice/requirements-cpu.txt new file mode 100644 index 00000000..08ed5eeb --- /dev/null +++ b/backend/python/openvoice/requirements-cpu.txt @@ -0,0 +1 @@ +torch \ No newline at end of file diff --git a/backend/python/parler-tts/requirements-cpu.txt b/backend/python/parler-tts/requirements-cpu.txt new file mode 100644 index 00000000..bbcdc8cd --- /dev/null +++ b/backend/python/parler-tts/requirements-cpu.txt @@ -0,0 +1,3 @@ +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/parler-tts/requirements-cublas11.txt b/backend/python/parler-tts/requirements-cublas11.txt index 0de92979..71a6a93f 100644 --- a/backend/python/parler-tts/requirements-cublas11.txt +++ b/backend/python/parler-tts/requirements-cublas11.txt @@ -1,3 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/parler-tts/requirements-cublas12.txt b/backend/python/parler-tts/requirements-cublas12.txt index 6c3c7e7a..0fa27074 100644 --- a/backend/python/parler-tts/requirements-cublas12.txt +++ b/backend/python/parler-tts/requirements-cublas12.txt @@ -1,2 +1,4 @@ torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/parler-tts/requirements-hipblas.txt b/backend/python/parler-tts/requirements-hipblas.txt index 7bfc411b..af9e820e 100644 --- a/backend/python/parler-tts/requirements-hipblas.txt +++ b/backend/python/parler-tts/requirements-hipblas.txt @@ -1,3 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch -torchaudio \ No newline at end of file +torchaudio +transformers +accelerate \ No newline at end of file diff --git a/backend/python/parler-tts/requirements-intel.txt b/backend/python/parler-tts/requirements-intel.txt index 58a2a1dd..002a55c3 100644 --- a/backend/python/parler-tts/requirements-intel.txt +++ b/backend/python/parler-tts/requirements-intel.txt @@ -3,4 +3,6 @@ intel-extension-for-pytorch torch torchaudio optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +transformers +accelerate \ No newline at end of file diff --git a/backend/python/parler-tts/requirements.txt b/backend/python/parler-tts/requirements.txt index 1dfa6675..297ddd0b 100644 --- a/backend/python/parler-tts/requirements.txt +++ b/backend/python/parler-tts/requirements.txt @@ -1,6 +1,4 @@ -accelerate grpcio==1.65.1 protobuf git+https://github.com/huggingface/parler-tts.git@10016fb0300c0dc31a0fb70e26f3affee7b62f16 -certifi -transformers \ No newline at end of file +certifi \ No newline at end of file diff --git a/backend/python/petals/requirements-cpu.txt b/backend/python/petals/requirements-cpu.txt new file mode 100644 index 00000000..bbcdc8cd --- /dev/null +++ b/backend/python/petals/requirements-cpu.txt @@ -0,0 +1,3 @@ +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/petals/requirements-cublas11.txt b/backend/python/petals/requirements-cublas11.txt index 6461b696..f7683016 100644 --- a/backend/python/petals/requirements-cublas11.txt +++ b/backend/python/petals/requirements-cublas11.txt @@ -1,2 +1,3 @@ --extra-index-url 
https://download.pytorch.org/whl/cu118 torch +transformers diff --git a/backend/python/petals/requirements-cublas12.txt b/backend/python/petals/requirements-cublas12.txt index 12c6d5d5..4f492ddc 100644 --- a/backend/python/petals/requirements-cublas12.txt +++ b/backend/python/petals/requirements-cublas12.txt @@ -1 +1,2 @@ torch +transformers diff --git a/backend/python/petals/requirements-hipblas.txt b/backend/python/petals/requirements-hipblas.txt index 0331f106..8a4e2ff0 100644 --- a/backend/python/petals/requirements-hipblas.txt +++ b/backend/python/petals/requirements-hipblas.txt @@ -1,2 +1,3 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 torch +transformers diff --git a/backend/python/petals/requirements-intel.txt b/backend/python/petals/requirements-intel.txt index 755e19d8..4e3ed017 100644 --- a/backend/python/petals/requirements-intel.txt +++ b/backend/python/petals/requirements-intel.txt @@ -2,4 +2,5 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 +transformers \ No newline at end of file diff --git a/backend/python/petals/requirements.txt b/backend/python/petals/requirements.txt index 10f5114e..0755fe01 100644 --- a/backend/python/petals/requirements.txt +++ b/backend/python/petals/requirements.txt @@ -1,3 +1,2 @@ git+https://github.com/bigscience-workshop/petals -certifi -transformers \ No newline at end of file +certifi \ No newline at end of file diff --git a/backend/python/rerankers/requirements-cpu.txt b/backend/python/rerankers/requirements-cpu.txt new file mode 100644 index 00000000..25a1d8ab --- /dev/null +++ b/backend/python/rerankers/requirements-cpu.txt @@ -0,0 +1,4 @@ +transformers +accelerate +torch +rerankers[transformers] \ No newline at end of file diff --git a/backend/python/rerankers/requirements-cublas11.txt b/backend/python/rerankers/requirements-cublas11.txt index 6461b696..06c4b2cf 100644 --- a/backend/python/rerankers/requirements-cublas11.txt +++ b/backend/python/rerankers/requirements-cublas11.txt @@ -1,2 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 +transformers +accelerate torch +rerankers[transformers] \ No newline at end of file diff --git a/backend/python/rerankers/requirements-cublas12.txt b/backend/python/rerankers/requirements-cublas12.txt index 12c6d5d5..25a1d8ab 100644 --- a/backend/python/rerankers/requirements-cublas12.txt +++ b/backend/python/rerankers/requirements-cublas12.txt @@ -1 +1,4 @@ +transformers +accelerate torch +rerankers[transformers] \ No newline at end of file diff --git a/backend/python/rerankers/requirements-hipblas.txt b/backend/python/rerankers/requirements-hipblas.txt index 76018445..961d150c 100644 --- a/backend/python/rerankers/requirements-hipblas.txt +++ b/backend/python/rerankers/requirements-hipblas.txt @@ -1,2 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 -torch \ No newline at end of file +transformers +accelerate +torch +rerankers[transformers] \ No newline at end of file diff --git a/backend/python/rerankers/requirements-intel.txt b/backend/python/rerankers/requirements-intel.txt index 755e19d8..1a39cf4f 100644 --- a/backend/python/rerankers/requirements-intel.txt +++ b/backend/python/rerankers/requirements-intel.txt @@ -1,5 +1,8 @@ --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ intel-extension-for-pytorch +transformers +accelerate torch 
+rerankers[transformers] optimum[openvino] setuptools==72.1.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/rerankers/requirements.txt b/backend/python/rerankers/requirements.txt index 33166382..2a8d18b1 100644 --- a/backend/python/rerankers/requirements.txt +++ b/backend/python/rerankers/requirements.txt @@ -1,6 +1,3 @@ -accelerate -rerankers[transformers] grpcio==1.65.4 protobuf -certifi -transformers \ No newline at end of file +certifi \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cpu.txt b/backend/python/sentencetransformers/requirements-cpu.txt new file mode 100644 index 00000000..cd9924ef --- /dev/null +++ b/backend/python/sentencetransformers/requirements-cpu.txt @@ -0,0 +1,6 @@ +torch +accelerate +transformers +bitsandbytes +sentence-transformers==3.0.1 +transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas11.txt b/backend/python/sentencetransformers/requirements-cublas11.txt index 6461b696..1131f066 100644 --- a/backend/python/sentencetransformers/requirements-cublas11.txt +++ b/backend/python/sentencetransformers/requirements-cublas11.txt @@ -1,2 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch +accelerate +sentence-transformers==3.0.1 +transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-cublas12.txt b/backend/python/sentencetransformers/requirements-cublas12.txt index 12c6d5d5..2936e17b 100644 --- a/backend/python/sentencetransformers/requirements-cublas12.txt +++ b/backend/python/sentencetransformers/requirements-cublas12.txt @@ -1 +1,4 @@ torch +accelerate +sentence-transformers==3.0.1 +transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-hipblas.txt b/backend/python/sentencetransformers/requirements-hipblas.txt index 76018445..3b187c68 100644 --- a/backend/python/sentencetransformers/requirements-hipblas.txt +++ b/backend/python/sentencetransformers/requirements-hipblas.txt @@ -1,2 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 -torch \ No newline at end of file +torch +accelerate +sentence-transformers==3.0.1 +transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements-intel.txt b/backend/python/sentencetransformers/requirements-intel.txt index 95d4848c..806e3d47 100644 --- a/backend/python/sentencetransformers/requirements-intel.txt +++ b/backend/python/sentencetransformers/requirements-intel.txt @@ -2,4 +2,7 @@ intel-extension-for-pytorch torch optimum[openvino] -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 +accelerate +sentence-transformers==3.0.1 +transformers \ No newline at end of file diff --git a/backend/python/sentencetransformers/requirements.txt b/backend/python/sentencetransformers/requirements.txt index 4ef4a28b..22ae46ad 100644 --- a/backend/python/sentencetransformers/requirements.txt +++ b/backend/python/sentencetransformers/requirements.txt @@ -1,6 +1,3 @@ -accelerate -sentence-transformers==3.0.1 -transformers grpcio==1.65.1 protobuf certifi \ No newline at end of file diff --git a/backend/python/transformers-musicgen/requirements-cpu.txt b/backend/python/transformers-musicgen/requirements-cpu.txt new file mode 100644 index 00000000..bbcdc8cd --- /dev/null +++ 
b/backend/python/transformers-musicgen/requirements-cpu.txt @@ -0,0 +1,3 @@ +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/transformers-musicgen/requirements-cublas11.txt b/backend/python/transformers-musicgen/requirements-cublas11.txt index 6461b696..191a6eef 100644 --- a/backend/python/transformers-musicgen/requirements-cublas11.txt +++ b/backend/python/transformers-musicgen/requirements-cublas11.txt @@ -1,2 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/cu118 -torch +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/transformers-musicgen/requirements-cublas12.txt b/backend/python/transformers-musicgen/requirements-cublas12.txt index 12c6d5d5..bbcdc8cd 100644 --- a/backend/python/transformers-musicgen/requirements-cublas12.txt +++ b/backend/python/transformers-musicgen/requirements-cublas12.txt @@ -1 +1,3 @@ -torch +transformers +accelerate +torch \ No newline at end of file diff --git a/backend/python/transformers-musicgen/requirements-hipblas.txt b/backend/python/transformers-musicgen/requirements-hipblas.txt index 76018445..00f0a946 100644 --- a/backend/python/transformers-musicgen/requirements-hipblas.txt +++ b/backend/python/transformers-musicgen/requirements-hipblas.txt @@ -1,2 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 +transformers +accelerate torch \ No newline at end of file diff --git a/backend/python/transformers-musicgen/requirements-intel.txt b/backend/python/transformers-musicgen/requirements-intel.txt index 95d4848c..89bfa6a2 100644 --- a/backend/python/transformers-musicgen/requirements-intel.txt +++ b/backend/python/transformers-musicgen/requirements-intel.txt @@ -1,5 +1,7 @@ --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ intel-extension-for-pytorch +transformers +accelerate torch optimum[openvino] setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/transformers-musicgen/requirements.txt b/backend/python/transformers-musicgen/requirements.txt index bec86241..420b968c 100644 --- a/backend/python/transformers-musicgen/requirements.txt +++ b/backend/python/transformers-musicgen/requirements.txt @@ -1,5 +1,3 @@ -accelerate -transformers grpcio==1.65.4 protobuf scipy==1.14.0 diff --git a/backend/python/transformers/requirements-cpu.txt b/backend/python/transformers/requirements-cpu.txt new file mode 100644 index 00000000..f1e6281b --- /dev/null +++ b/backend/python/transformers/requirements-cpu.txt @@ -0,0 +1,4 @@ +torch +accelerate +transformers +bitsandbytes \ No newline at end of file diff --git a/backend/python/transformers/requirements-cublas11.txt b/backend/python/transformers/requirements-cublas11.txt index 6461b696..0abd72d9 100644 --- a/backend/python/transformers/requirements-cublas11.txt +++ b/backend/python/transformers/requirements-cublas11.txt @@ -1,2 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 torch +accelerate +transformers +bitsandbytes \ No newline at end of file diff --git a/backend/python/transformers/requirements-cublas12.txt b/backend/python/transformers/requirements-cublas12.txt index 12c6d5d5..f1e6281b 100644 --- a/backend/python/transformers/requirements-cublas12.txt +++ b/backend/python/transformers/requirements-cublas12.txt @@ -1 +1,4 @@ torch +accelerate +transformers +bitsandbytes \ No newline at end of file diff --git a/backend/python/transformers/requirements-hipblas.txt 
b/backend/python/transformers/requirements-hipblas.txt index 76018445..f6900af1 100644 --- a/backend/python/transformers/requirements-hipblas.txt +++ b/backend/python/transformers/requirements-hipblas.txt @@ -1,2 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 -torch \ No newline at end of file +torch +accelerate +transformers +bitsandbytes \ No newline at end of file diff --git a/backend/python/transformers/requirements-intel.txt b/backend/python/transformers/requirements-intel.txt index 8fc18a0e..5d9efb71 100644 --- a/backend/python/transformers/requirements-intel.txt +++ b/backend/python/transformers/requirements-intel.txt @@ -2,3 +2,5 @@ intel-extension-for-pytorch torch optimum[openvino] +intel-extension-for-transformers +bitsandbytes \ No newline at end of file diff --git a/backend/python/transformers/requirements.txt b/backend/python/transformers/requirements.txt index 2a08ba45..318560d9 100644 --- a/backend/python/transformers/requirements.txt +++ b/backend/python/transformers/requirements.txt @@ -1,8 +1,4 @@ -accelerate -transformers grpcio==1.65.4 protobuf certifi -intel-extension-for-transformers -bitsandbytes -setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 +setuptools==69.5.1 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements-cpu.txt b/backend/python/vall-e-x/requirements-cpu.txt new file mode 100644 index 00000000..3a3304c0 --- /dev/null +++ b/backend/python/vall-e-x/requirements-cpu.txt @@ -0,0 +1,3 @@ +accelerate +torch +torchaudio \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements-cublas11.txt b/backend/python/vall-e-x/requirements-cublas11.txt index 0de92979..4e0a151a 100644 --- a/backend/python/vall-e-x/requirements-cublas11.txt +++ b/backend/python/vall-e-x/requirements-cublas11.txt @@ -1,3 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/cu118 +accelerate torch torchaudio \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements-cublas12.txt b/backend/python/vall-e-x/requirements-cublas12.txt index 6c3c7e7a..3a3304c0 100644 --- a/backend/python/vall-e-x/requirements-cublas12.txt +++ b/backend/python/vall-e-x/requirements-cublas12.txt @@ -1,2 +1,3 @@ +accelerate torch torchaudio \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements-hipblas.txt b/backend/python/vall-e-x/requirements-hipblas.txt index 7bfc411b..6ddd0b8d 100644 --- a/backend/python/vall-e-x/requirements-hipblas.txt +++ b/backend/python/vall-e-x/requirements-hipblas.txt @@ -1,3 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 +accelerate torch torchaudio \ No newline at end of file diff --git a/backend/python/vall-e-x/requirements-intel.txt b/backend/python/vall-e-x/requirements-intel.txt index 58a2a1dd..6185314f 100644 --- a/backend/python/vall-e-x/requirements-intel.txt +++ b/backend/python/vall-e-x/requirements-intel.txt @@ -1,5 +1,6 @@ --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ intel-extension-for-pytorch +accelerate torch torchaudio optimum[openvino] diff --git a/backend/python/vall-e-x/requirements.txt b/backend/python/vall-e-x/requirements.txt index ec3584b2..2a8d18b1 100644 --- a/backend/python/vall-e-x/requirements.txt +++ b/backend/python/vall-e-x/requirements.txt @@ -1,4 +1,3 @@ -accelerate grpcio==1.65.4 protobuf certifi \ No newline at end of file diff --git a/backend/python/vllm/requirements-after.txt b/backend/python/vllm/requirements-after.txt 
new file mode 100644 index 00000000..7bfe8efe --- /dev/null +++ b/backend/python/vllm/requirements-after.txt @@ -0,0 +1 @@ +flash-attn \ No newline at end of file diff --git a/backend/python/vllm/requirements-cpu.txt b/backend/python/vllm/requirements-cpu.txt new file mode 100644 index 00000000..cc5a50c6 --- /dev/null +++ b/backend/python/vllm/requirements-cpu.txt @@ -0,0 +1,4 @@ +accelerate +torch +transformers +vllm \ No newline at end of file diff --git a/backend/python/vllm/requirements-cublas11.txt b/backend/python/vllm/requirements-cublas11.txt index bed8cea8..48722834 100644 --- a/backend/python/vllm/requirements-cublas11.txt +++ b/backend/python/vllm/requirements-cublas11.txt @@ -1,3 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/cu118 +accelerate torch -flash-attn \ No newline at end of file +transformers +vllm \ No newline at end of file diff --git a/backend/python/vllm/requirements-cublas12.txt b/backend/python/vllm/requirements-cublas12.txt index b6fef4d7..cc5a50c6 100644 --- a/backend/python/vllm/requirements-cublas12.txt +++ b/backend/python/vllm/requirements-cublas12.txt @@ -1,2 +1,4 @@ +accelerate torch -flash-attn \ No newline at end of file +transformers +vllm \ No newline at end of file diff --git a/backend/python/vllm/requirements-hipblas.txt b/backend/python/vllm/requirements-hipblas.txt index 76018445..b11ba692 100644 --- a/backend/python/vllm/requirements-hipblas.txt +++ b/backend/python/vllm/requirements-hipblas.txt @@ -1,2 +1,5 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 -torch \ No newline at end of file +accelerate +torch +transformers +vllm \ No newline at end of file diff --git a/backend/python/vllm/requirements-intel.txt b/backend/python/vllm/requirements-intel.txt index 635b4c31..516e3d01 100644 --- a/backend/python/vllm/requirements-intel.txt +++ b/backend/python/vllm/requirements-intel.txt @@ -1,5 +1,8 @@ --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ intel-extension-for-pytorch +accelerate torch +transformers optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file +setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 +vllm \ No newline at end of file diff --git a/backend/python/vllm/requirements.txt b/backend/python/vllm/requirements.txt index b8b79afb..99dc865e 100644 --- a/backend/python/vllm/requirements.txt +++ b/backend/python/vllm/requirements.txt @@ -1,7 +1,4 @@ -accelerate -vllm grpcio==1.65.4 protobuf certifi -transformers setuptools \ No newline at end of file From 11b2adae0c166a29af4ea0f728cc4f9ed2233941 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 7 Aug 2024 18:08:26 +0200 Subject: [PATCH 0189/1851] fix(vllm): drop flash-attn installation afterwards Signed-off-by: Ettore Di Giacinto --- backend/python/vllm/requirements-after.txt | 1 - 1 file changed, 1 deletion(-) delete mode 100644 backend/python/vllm/requirements-after.txt diff --git a/backend/python/vllm/requirements-after.txt b/backend/python/vllm/requirements-after.txt deleted file mode 100644 index 7bfe8efe..00000000 --- a/backend/python/vllm/requirements-after.txt +++ /dev/null @@ -1 +0,0 @@ -flash-attn \ No newline at end of file From 66cf38b0b36f46d915a79b0e8d2ceae90614f6bb Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 7 Aug 2024 19:45:14 +0200 Subject: [PATCH 0190/1851] feat(venv): shared env (#3195) * feat(venv): allow to share venvs Signed-off-by: Ettore Di Giacinto * fix(vllm): add back flash-attn Signed-off-by: 
Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- backend/python/common/libbackend.sh | 40 ++++++++++++++----- .../vllm/requirements-cublas11-after.txt | 1 + .../vllm/requirements-cublas12-after.txt | 1 + 3 files changed, 31 insertions(+), 11 deletions(-) create mode 100644 backend/python/vllm/requirements-cublas11-after.txt create mode 100644 backend/python/vllm/requirements-cublas12-after.txt diff --git a/backend/python/common/libbackend.sh b/backend/python/common/libbackend.sh index 7287fb95..934b1fd3 100644 --- a/backend/python/common/libbackend.sh +++ b/backend/python/common/libbackend.sh @@ -18,10 +18,23 @@ # source $(dirname $0)/../common/libbackend.sh # function init() { + # Name of the backend (directory name) BACKEND_NAME=${PWD##*/} + + # Path where all backends files are MY_DIR=$(realpath `dirname $0`) + + # Build type BUILD_PROFILE=$(getBuildProfile) + # Environment directory + EDIR=${MY_DIR} + + # Allow to specify a custom env dir for shared environments + if [ "x${ENV_DIR}" != "x" ]; then + EDIR=${ENV_DIR} + fi + # If a backend has defined a list of valid build profiles... if [ ! -z "${LIMIT_TARGETS}" ]; then isValidTarget=$(checkTargets ${LIMIT_TARGETS}) @@ -74,13 +87,14 @@ function getBuildProfile() { # This function is idempotent, so you can call it as many times as you want and it will # always result in an activated virtual environment function ensureVenv() { - if [ ! -d "${MY_DIR}/venv" ]; then - uv venv ${MY_DIR}/venv + if [ ! -d "${EDIR}/venv" ]; then + uv venv ${EDIR}/venv echo "virtualenv created" fi - - if [ "x${VIRTUAL_ENV}" != "x${MY_DIR}/venv" ]; then - source ${MY_DIR}/venv/bin/activate + + # Source if we are not already in a Virtual env + if [ "x${VIRTUAL_ENV}" != "x${EDIR}/venv" ]; then + source ${EDIR}/venv/bin/activate echo "virtualenv activated" fi @@ -113,21 +127,25 @@ function installRequirements() { # These are the requirements files we will attempt to install, in order declare -a requirementFiles=( - "${MY_DIR}/requirements-install.txt" - "${MY_DIR}/requirements.txt" - "${MY_DIR}/requirements-${BUILD_TYPE}.txt" + "${EDIR}/requirements-install.txt" + "${EDIR}/requirements.txt" + "${EDIR}/requirements-${BUILD_TYPE}.txt" ) if [ "x${BUILD_TYPE}" != "x${BUILD_PROFILE}" ]; then - requirementFiles+=("${MY_DIR}/requirements-${BUILD_PROFILE}.txt") + requirementFiles+=("${EDIR}/requirements-${BUILD_PROFILE}.txt") fi # if BUILD_TYPE is empty, we are a CPU build, so we should try to install the CPU requirements if [ "x${BUILD_TYPE}" == "x" ]; then - requirementFiles+=("${MY_DIR}/requirements-cpu.txt") + requirementFiles+=("${EDIR}/requirements-cpu.txt") fi - requirementFiles+=("${MY_DIR}/requirements-after.txt") + requirementFiles+=("${EDIR}/requirements-after.txt") + + if [ "x${BUILD_TYPE}" != "x${BUILD_PROFILE}" ]; then + requirementFiles+=("${EDIR}/requirements-${BUILD_PROFILE}-after.txt") + fi for reqFile in ${requirementFiles[@]}; do if [ -f ${reqFile} ]; then diff --git a/backend/python/vllm/requirements-cublas11-after.txt b/backend/python/vllm/requirements-cublas11-after.txt new file mode 100644 index 00000000..7bfe8efe --- /dev/null +++ b/backend/python/vllm/requirements-cublas11-after.txt @@ -0,0 +1 @@ +flash-attn \ No newline at end of file diff --git a/backend/python/vllm/requirements-cublas12-after.txt b/backend/python/vllm/requirements-cublas12-after.txt new file mode 100644 index 00000000..7bfe8efe --- /dev/null +++ b/backend/python/vllm/requirements-cublas12-after.txt @@ -0,0 +1 @@ +flash-attn \ No newline at end of file From 
e198347886199a8119140f0d7d1a6442b4541ebc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 7 Aug 2024 21:27:02 +0200 Subject: [PATCH 0191/1851] feat(openai): add `json_schema` format type and strict mode (#3193) * feat(openai): add json_schema and strict mode Signed-off-by: Ettore Di Giacinto * handle err vs _ security scanners prefer if we put these branches in, and I tend to agree. Signed-off-by: Dave --------- Signed-off-by: Ettore Di Giacinto Signed-off-by: Dave Co-authored-by: Dave --- core/http/endpoints/openai/chat.go | 37 +++++++++++++++++++++++++++--- core/schema/openai.go | 11 +++++++++ pkg/functions/functions.go | 1 + 3 files changed, 46 insertions(+), 3 deletions(-) diff --git a/core/http/endpoints/openai/chat.go b/core/http/endpoints/openai/chat.go index 86b75601..12a14eac 100644 --- a/core/http/endpoints/openai/chat.go +++ b/core/http/endpoints/openai/chat.go @@ -172,6 +172,14 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup funcs := input.Functions shouldUseFn := len(input.Functions) > 0 && config.ShouldUseFunctions() + strictMode := false + + for _, f := range input.Functions { + if f.Strict { + strictMode = true + break + } + } // Allow the user to set custom actions via config file // to be "embedded" in each model @@ -187,10 +195,33 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup if config.ResponseFormatMap != nil { d := schema.ChatCompletionResponseFormat{} - dat, _ := json.Marshal(config.ResponseFormatMap) - _ = json.Unmarshal(dat, &d) + dat, err := json.Marshal(config.ResponseFormatMap) + if err != nil { + return err + } + err = json.Unmarshal(dat, &d) + if err != nil { + return err + } if d.Type == "json_object" { input.Grammar = functions.JSONBNF + } else if d.Type == "json_schema" { + d := schema.JsonSchemaRequest{} + dat, err := json.Marshal(config.ResponseFormatMap) + if err != nil { + return err + } + err = json.Unmarshal(dat, &d) + if err != nil { + return err + } + fs := &functions.JSONFunctionStructure{ + AnyOf: []functions.Item{d.JsonSchema.Schema}, + } + g, err := fs.Grammar(config.FunctionsConfig.GrammarOptions()...) 
+ if err == nil { + input.Grammar = g + } } } @@ -201,7 +232,7 @@ func ChatEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, startup } switch { - case !config.FunctionsConfig.GrammarConfig.NoGrammar && shouldUseFn: + case (!config.FunctionsConfig.GrammarConfig.NoGrammar || strictMode) && shouldUseFn: noActionGrammar := functions.Function{ Name: noActionName, Description: noActionDescription, diff --git a/core/schema/openai.go b/core/schema/openai.go index 3b39eaf3..fe4745bf 100644 --- a/core/schema/openai.go +++ b/core/schema/openai.go @@ -139,6 +139,17 @@ type ChatCompletionResponseFormat struct { Type ChatCompletionResponseFormatType `json:"type,omitempty"` } +type JsonSchemaRequest struct { + Type string `json:"type"` + JsonSchema JsonSchema `json:"json_schema"` +} + +type JsonSchema struct { + Name string `json:"name"` + Strict bool `json:"strict"` + Schema functions.Item `json:"schema"` +} + type OpenAIRequest struct { PredictionOptions diff --git a/pkg/functions/functions.go b/pkg/functions/functions.go index 19012d53..1a7e1ff1 100644 --- a/pkg/functions/functions.go +++ b/pkg/functions/functions.go @@ -14,6 +14,7 @@ const ( type Function struct { Name string `json:"name"` Description string `json:"description"` + Strict bool `json:"strict"` Parameters map[string]interface{} `json:"parameters"` } type Functions []Function From 2c8623dbb40dbe748d0c361074a836a660b8a91b Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 7 Aug 2024 23:34:37 +0200 Subject: [PATCH 0192/1851] fix(python): move vllm to after deps, drop diffusers main deps Signed-off-by: Ettore Di Giacinto --- backend/python/diffusers/requirements.txt | 7 ------- backend/python/vllm/requirements-after.txt | 1 + backend/python/vllm/requirements-cpu.txt | 3 +-- backend/python/vllm/requirements-cublas11.txt | 3 +-- backend/python/vllm/requirements-cublas12.txt | 3 +-- backend/python/vllm/requirements-hipblas.txt | 3 +-- backend/python/vllm/requirements-intel.txt | 3 +-- 7 files changed, 6 insertions(+), 17 deletions(-) create mode 100644 backend/python/vllm/requirements-after.txt diff --git a/backend/python/diffusers/requirements.txt b/backend/python/diffusers/requirements.txt index 9919b20a..b4195fc5 100644 --- a/backend/python/diffusers/requirements.txt +++ b/backend/python/diffusers/requirements.txt @@ -1,12 +1,5 @@ setuptools -accelerate -compel -peft -diffusers grpcio==1.65.4 -opencv-python pillow protobuf -sentencepiece -transformers certifi diff --git a/backend/python/vllm/requirements-after.txt b/backend/python/vllm/requirements-after.txt new file mode 100644 index 00000000..76f11f15 --- /dev/null +++ b/backend/python/vllm/requirements-after.txt @@ -0,0 +1 @@ +vllm \ No newline at end of file diff --git a/backend/python/vllm/requirements-cpu.txt b/backend/python/vllm/requirements-cpu.txt index cc5a50c6..765a1ef5 100644 --- a/backend/python/vllm/requirements-cpu.txt +++ b/backend/python/vllm/requirements-cpu.txt @@ -1,4 +1,3 @@ accelerate torch -transformers -vllm \ No newline at end of file +transformers \ No newline at end of file diff --git a/backend/python/vllm/requirements-cublas11.txt b/backend/python/vllm/requirements-cublas11.txt index 48722834..43817727 100644 --- a/backend/python/vllm/requirements-cublas11.txt +++ b/backend/python/vllm/requirements-cublas11.txt @@ -1,5 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/cu118 accelerate torch -transformers -vllm \ No newline at end of file +transformers \ No newline at end of file diff --git 
a/backend/python/vllm/requirements-cublas12.txt b/backend/python/vllm/requirements-cublas12.txt index cc5a50c6..765a1ef5 100644 --- a/backend/python/vllm/requirements-cublas12.txt +++ b/backend/python/vllm/requirements-cublas12.txt @@ -1,4 +1,3 @@ accelerate torch -transformers -vllm \ No newline at end of file +transformers \ No newline at end of file diff --git a/backend/python/vllm/requirements-hipblas.txt b/backend/python/vllm/requirements-hipblas.txt index b11ba692..c73d8141 100644 --- a/backend/python/vllm/requirements-hipblas.txt +++ b/backend/python/vllm/requirements-hipblas.txt @@ -1,5 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 accelerate torch -transformers -vllm \ No newline at end of file +transformers \ No newline at end of file diff --git a/backend/python/vllm/requirements-intel.txt b/backend/python/vllm/requirements-intel.txt index 516e3d01..7903282e 100644 --- a/backend/python/vllm/requirements-intel.txt +++ b/backend/python/vllm/requirements-intel.txt @@ -4,5 +4,4 @@ accelerate torch transformers optimum[openvino] -setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 -vllm \ No newline at end of file +setuptools==70.3.0 # https://github.com/mudler/LocalAI/issues/2406 \ No newline at end of file From 36e185ba6352686f433f3ac8a288b97eb4ae4c16 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 7 Aug 2024 23:35:44 +0200 Subject: [PATCH 0193/1851] feat(p2p): allow to run multiple clusters in the same p2p network (#3128) feat(p2p): allow to run multiple clusters in the same network Allow specifying a network ID via the CLI, which makes it possible to run multiple logically separated clusters within the same network (using the same shared token). Note: this segregation is not "secure" by any means; anyone holding the network token can see the services available across the whole network. However, it provides a way to keep the inference endpoints separate. This allows, for instance, a node that is both federated and has a set of llama.cpp workers attached.
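A minimal usage sketch (the `<shared-token>` placeholder and the cluster names are illustrative; the TOKEN and P2P_NETWORK_ID environment variables correspond to the env tags wired up in this change):

    # Cluster A: services are advertised and discovered under this network ID
    TOKEN=<shared-token> P2P_NETWORK_ID=cluster-a local-ai run --p2p

    # Cluster B: same shared token, different network ID, so its services are grouped separately
    TOKEN=<shared-token> P2P_NETWORK_ID=cluster-b local-ai run --p2p

Under the hood, the new NetworkID helper simply prefixes the advertised service ID with "<networkID>_", which is why the separation is logical rather than cryptographic.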
Signed-off-by: Ettore Di Giacinto --- core/cli/federated.go | 9 +++++---- core/cli/run.go | 11 +++++++---- core/cli/worker/worker_p2p.go | 17 +++++++++-------- core/config/application_config.go | 7 +++++++ core/http/endpoints/localai/p2p.go | 12 +++++++----- core/http/routes/localai.go | 2 +- core/http/routes/ui.go | 9 +++++---- core/p2p/federated.go | 9 +++++++++ 8 files changed, 50 insertions(+), 26 deletions(-) diff --git a/core/cli/federated.go b/core/cli/federated.go index 32f0fa87..271babca 100644 --- a/core/cli/federated.go +++ b/core/cli/federated.go @@ -8,14 +8,15 @@ import ( ) type FederatedCLI struct { - Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` - Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` - LoadBalanced bool `env:"LOCALAI_LOAD_BALANCED,LOAD_BALANCED" default:"false" help:"Enable load balancing" group:"p2p"` + Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` + Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` + LoadBalanced bool `env:"LOCALAI_LOAD_BALANCED,LOAD_BALANCED" default:"false" help:"Enable load balancing" group:"p2p"` + Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarly by the user for grouping a set of instances." group:"p2p"` } func (f *FederatedCLI) Run(ctx *cliContext.Context) error { - fs := p2p.NewFederatedServer(f.Address, p2p.FederatedID, f.Peer2PeerToken, f.LoadBalanced) + fs := p2p.NewFederatedServer(f.Address, p2p.NetworkID(f.Peer2PeerNetworkID, p2p.FederatedID), f.Peer2PeerToken, f.LoadBalanced) return fs.Start(context.Background()) } diff --git a/core/cli/run.go b/core/cli/run.go index b3d91632..9d58f6d9 100644 --- a/core/cli/run.go +++ b/core/cli/run.go @@ -54,6 +54,7 @@ type RunCMD struct { OpaqueErrors bool `env:"LOCALAI_OPAQUE_ERRORS" default:"false" help:"If true, all error responses are replaced with blank 500 errors. This is intended only for hardening against information leaks and is normally not recommended." 
group:"hardening"` Peer2Peer bool `env:"LOCALAI_P2P,P2P" name:"p2p" default:"false" help:"Enable P2P mode" group:"p2p"` Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` + Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarly by the user for grouping a set of instances" group:"p2p"` ParallelRequests bool `env:"LOCALAI_PARALLEL_REQUESTS,PARALLEL_REQUESTS" help:"Enable backends to handle multiple requests in parallel if they support it (e.g.: llama.cpp or vllm)" group:"backends"` SingleActiveBackend bool `env:"LOCALAI_SINGLE_ACTIVE_BACKEND,SINGLE_ACTIVE_BACKEND" help:"Allow only one backend to be run at a time" group:"backends"` PreloadBackendOnly bool `env:"LOCALAI_PRELOAD_BACKEND_ONLY,PRELOAD_BACKEND_ONLY" default:"false" help:"Do not launch the API services, only the preloaded models / backends are started (useful for multi-node setups)" group:"backends"` @@ -94,6 +95,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { config.WithModelsURL(append(r.Models, r.ModelArgs...)...), config.WithOpaqueErrors(r.OpaqueErrors), config.WithEnforcedPredownloadScans(!r.DisablePredownloadScan), + config.WithP2PNetworkID(r.Peer2PeerNetworkID), } token := "" @@ -119,9 +121,9 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { } log.Info().Msg("Starting P2P server discovery...") - if err := p2p.ServiceDiscoverer(context.Background(), node, token, "", func(serviceID string, node p2p.NodeData) { + if err := p2p.ServiceDiscoverer(context.Background(), node, token, p2p.NetworkID(r.Peer2PeerNetworkID, ""), func(serviceID string, node p2p.NodeData) { var tunnelAddresses []string - for _, v := range p2p.GetAvailableNodes("") { + for _, v := range p2p.GetAvailableNodes(p2p.NetworkID(r.Peer2PeerNetworkID, "")) { if v.IsOnline() { tunnelAddresses = append(tunnelAddresses, v.TunnelAddress) } else { @@ -142,14 +144,15 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { if err != nil { return err } - if err := p2p.ExposeService(context.Background(), "localhost", port, token, p2p.FederatedID); err != nil { + if err := p2p.ExposeService(context.Background(), "localhost", port, token, p2p.NetworkID(r.Peer2PeerNetworkID, p2p.FederatedID)); err != nil { return err } node, err := p2p.NewNode(token) if err != nil { return err } - if err := p2p.ServiceDiscoverer(context.Background(), node, token, p2p.FederatedID, nil); err != nil { + + if err := p2p.ServiceDiscoverer(context.Background(), node, token, p2p.NetworkID(r.Peer2PeerNetworkID, p2p.FederatedID), nil); err != nil { return err } } diff --git a/core/cli/worker/worker_p2p.go b/core/cli/worker/worker_p2p.go index 2eb5cb94..ddb3518c 100644 --- a/core/cli/worker/worker_p2p.go +++ b/core/cli/worker/worker_p2p.go @@ -19,12 +19,13 @@ import ( ) type P2P struct { - WorkerFlags `embed:""` - Token string `env:"LOCALAI_TOKEN,LOCALAI_P2P_TOKEN,TOKEN" help:"P2P token to use"` - NoRunner bool `env:"LOCALAI_NO_RUNNER,NO_RUNNER" help:"Do not start the llama-cpp-rpc-server"` - RunnerAddress string `env:"LOCALAI_RUNNER_ADDRESS,RUNNER_ADDRESS" help:"Address of the llama-cpp-rpc-server"` - RunnerPort string `env:"LOCALAI_RUNNER_PORT,RUNNER_PORT" help:"Port of the llama-cpp-rpc-server"` - ExtraLLamaCPPArgs []string `env:"LOCALAI_EXTRA_LLAMA_CPP_ARGS,EXTRA_LLAMA_CPP_ARGS" help:"Extra arguments to pass to llama-cpp-rpc-server"` + WorkerFlags `embed:""` + Token string `env:"LOCALAI_TOKEN,LOCALAI_P2P_TOKEN,TOKEN" help:"P2P token to 
use"` + NoRunner bool `env:"LOCALAI_NO_RUNNER,NO_RUNNER" help:"Do not start the llama-cpp-rpc-server"` + RunnerAddress string `env:"LOCALAI_RUNNER_ADDRESS,RUNNER_ADDRESS" help:"Address of the llama-cpp-rpc-server"` + RunnerPort string `env:"LOCALAI_RUNNER_PORT,RUNNER_PORT" help:"Port of the llama-cpp-rpc-server"` + ExtraLLamaCPPArgs []string `env:"LOCALAI_EXTRA_LLAMA_CPP_ARGS,EXTRA_LLAMA_CPP_ARGS" help:"Extra arguments to pass to llama-cpp-rpc-server"` + Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarily by the user for grouping a set of instances" group:"p2p"` } func (r *P2P) Run(ctx *cliContext.Context) error { @@ -59,7 +60,7 @@ func (r *P2P) Run(ctx *cliContext.Context) error { p = r.RunnerPort } - err = p2p.ExposeService(context.Background(), address, p, r.Token, "") + err = p2p.ExposeService(context.Background(), address, p, r.Token, p2p.NetworkID(r.Peer2PeerNetworkID, "")) if err != nil { return err } @@ -99,7 +100,7 @@ func (r *P2P) Run(ctx *cliContext.Context) error { } }() - err = p2p.ExposeService(context.Background(), address, fmt.Sprint(port), r.Token, "") + err = p2p.ExposeService(context.Background(), address, fmt.Sprint(port), r.Token, p2p.NetworkID(r.Peer2PeerNetworkID, "")) if err != nil { return err } diff --git a/core/config/application_config.go b/core/config/application_config.go index 7233d1ac..6e8c46e1 100644 --- a/core/config/application_config.go +++ b/core/config/application_config.go @@ -34,6 +34,7 @@ type ApplicationConfig struct { EnforcePredownloadScans bool OpaqueErrors bool P2PToken string + P2PNetworkID string ModelLibraryURL string @@ -91,6 +92,12 @@ func WithCors(b bool) AppOption { } } +func WithP2PNetworkID(s string) AppOption { + return func(o *ApplicationConfig) { + o.P2PNetworkID = s + } +} + func WithCsrf(b bool) AppOption { return func(o *ApplicationConfig) { o.CSRF = b diff --git a/core/http/endpoints/localai/p2p.go b/core/http/endpoints/localai/p2p.go index cab0bb5d..93e9b5d5 100644 --- a/core/http/endpoints/localai/p2p.go +++ b/core/http/endpoints/localai/p2p.go @@ -11,12 +11,14 @@ import ( // @Summary Returns available P2P nodes // @Success 200 {object} []schema.P2PNodesResponse "Response" // @Router /api/p2p [get] -func ShowP2PNodes(c *fiber.Ctx) error { +func ShowP2PNodes(appConfig *config.ApplicationConfig) func(*fiber.Ctx) error { // Render index - return c.JSON(schema.P2PNodesResponse{ - Nodes: p2p.GetAvailableNodes(""), - FederatedNodes: p2p.GetAvailableNodes(p2p.FederatedID), - }) + return func(c *fiber.Ctx) error { + return c.JSON(schema.P2PNodesResponse{ + Nodes: p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, "")), + FederatedNodes: p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)), + }) + } } // ShowP2PToken returns the P2P token diff --git a/core/http/routes/localai.go b/core/http/routes/localai.go index b8a811b5..9c420010 100644 --- a/core/http/routes/localai.go +++ b/core/http/routes/localai.go @@ -59,7 +59,7 @@ func RegisterLocalAIRoutes(app *fiber.App, // p2p if p2p.IsP2PEnabled() { - app.Get("/api/p2p", auth, localai.ShowP2PNodes) + app.Get("/api/p2p", auth, localai.ShowP2PNodes(appConfig)) app.Get("/api/p2p/token", auth, localai.ShowP2PToken(appConfig)) } diff --git a/core/http/routes/ui.go b/core/http/routes/ui.go index 92917463..4f8afd3c 100644 --- a/core/http/routes/ui.go +++ b/core/http/routes/ui.go @@ -96,6 +96,7 @@ func RegisterUIRoutes(app *fiber.App, //"FederatedNodes":
p2p.GetAvailableNodes(p2p.FederatedID), "IsP2PEnabled": p2p.IsP2PEnabled(), "P2PToken": appConfig.P2PToken, + "NetworkID": appConfig.P2PNetworkID, } // Render index @@ -104,17 +105,17 @@ func RegisterUIRoutes(app *fiber.App, /* show nodes live! */ app.Get("/p2p/ui/workers", auth, func(c *fiber.Ctx) error { - return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(""))) + return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, "")))) }) app.Get("/p2p/ui/workers-federation", auth, func(c *fiber.Ctx) error { - return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.FederatedID))) + return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) app.Get("/p2p/ui/workers-stats", auth, func(c *fiber.Ctx) error { - return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(""))) + return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, "")))) }) app.Get("/p2p/ui/workers-federation-stats", auth, func(c *fiber.Ctx) error { - return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.FederatedID))) + return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) } diff --git a/core/p2p/federated.go b/core/p2p/federated.go index b56c9e0c..3ac3ff91 100644 --- a/core/p2p/federated.go +++ b/core/p2p/federated.go @@ -1,7 +1,16 @@ package p2p +import "fmt" + const FederatedID = "federated" +func NetworkID(networkID, serviceID string) string { + if networkID != "" { + return fmt.Sprintf("%s_%s", networkID, serviceID) + } + return serviceID +} + type FederatedServer struct { listenAddr, service, p2ptoken string requestTable map[string]int From 8814b31805b8b77a467fcaf4ce25aa37f36f59dc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 7 Aug 2024 23:35:55 +0200 Subject: [PATCH 0194/1851] chore: drop gpt4all.cpp (#3106) chore: drop gpt4all gpt4all is already supported in llama.cpp; the backend had been kept only for compatibility with old gpt4all models (prior to the GGUF format). Now is a good time to clean up and remove it to slim down the compilation process.
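For model configs that pinned the removed backend, a hedged migration sketch (the `llama-cpp` backend name is taken from pkg/model/initializers.go in this series; the model name and GGUF file below are placeholders, and old pre-GGUF gpt4all weights are assumed to have been re-downloaded or converted to GGUF first):

```bash
# Hypothetical rewrite of a model config that used `backend: gpt4all`,
# pointing it at the llama.cpp backend that now serves these models.
cat <<'EOF' > models/gpt4all-j.yaml
name: gpt4all-j
backend: llama-cpp
parameters:
  model: gpt4all-j.Q4_0.gguf
EOF
```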
Signed-off-by: Ettore Di Giacinto --- Makefile | 42 +------------------- backend/go/llm/gpt4all/gpt4all.go | 62 ------------------------------ backend/go/llm/gpt4all/main.go | 21 ---------- core/cli/worker/worker_llamacpp.go | 2 +- core/cli/worker/worker_p2p.go | 2 +- core/http/app_test.go | 40 ------------------- core/http/routes/ui.go | 2 +- core/startup/startup.go | 2 +- pkg/model/initializers.go | 11 +----- 9 files changed, 7 insertions(+), 177 deletions(-) delete mode 100644 backend/go/llm/gpt4all/gpt4all.go delete mode 100644 backend/go/llm/gpt4all/main.go diff --git a/Makefile b/Makefile index 476caac6..bcbdbe83 100644 --- a/Makefile +++ b/Makefile @@ -10,10 +10,6 @@ GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be CPPLLAMA_VERSION?=1e6f6554aa11fa10160a5fda689e736c3c34169f -# gpt4all version -GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all -GPT4ALL_VERSION?=27a8b020c36b0df8f8b82a252d261cda47cf44b8 - # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 @@ -190,7 +186,6 @@ ALL_GRPC_BACKENDS+=backend-assets/grpc/llama-cpp-fallback ALL_GRPC_BACKENDS+=backend-assets/grpc/llama-ggml ALL_GRPC_BACKENDS+=backend-assets/grpc/llama-cpp-grpc ALL_GRPC_BACKENDS+=backend-assets/util/llama-cpp-rpc-server -ALL_GRPC_BACKENDS+=backend-assets/grpc/gpt4all ALL_GRPC_BACKENDS+=backend-assets/grpc/rwkv ALL_GRPC_BACKENDS+=backend-assets/grpc/whisper ALL_GRPC_BACKENDS+=backend-assets/grpc/local-store @@ -253,18 +248,6 @@ sources/go-piper: sources/go-piper/libpiper_binding.a: sources/go-piper $(MAKE) -C sources/go-piper libpiper_binding.a example/main piper.o -## GPT4ALL -sources/gpt4all: - mkdir -p sources/gpt4all - cd sources/gpt4all && \ - git init && \ - git remote add origin $(GPT4ALL_REPO) && \ - git fetch origin && \ - git checkout $(GPT4ALL_VERSION) && \ - git submodule update --init --recursive --depth 1 --single-branch - -sources/gpt4all/gpt4all-bindings/golang/libgpt4all.a: sources/gpt4all - $(MAKE) -C sources/gpt4all/gpt4all-bindings/golang/ libgpt4all.a ## RWKV sources/go-rwkv.cpp: @@ -318,7 +301,7 @@ sources/whisper.cpp: sources/whisper.cpp/libwhisper.a: sources/whisper.cpp cd sources/whisper.cpp && $(MAKE) libwhisper.a libggml.a -get-sources: sources/go-llama.cpp sources/gpt4all sources/go-piper sources/go-rwkv.cpp sources/whisper.cpp sources/go-bert.cpp sources/go-stable-diffusion sources/go-tiny-dream backend/cpp/llama/llama.cpp +get-sources: sources/go-llama.cpp sources/go-piper sources/go-rwkv.cpp sources/whisper.cpp sources/go-bert.cpp sources/go-stable-diffusion sources/go-tiny-dream backend/cpp/llama/llama.cpp replace: $(GOCMD) mod edit -replace github.com/donomii/go-rwkv.cpp=$(CURDIR)/sources/go-rwkv.cpp @@ -328,7 +311,6 @@ replace: $(GOCMD) mod edit -replace github.com/M0Rf30/go-tiny-dream=$(CURDIR)/sources/go-tiny-dream $(GOCMD) mod edit -replace github.com/mudler/go-piper=$(CURDIR)/sources/go-piper $(GOCMD) mod edit -replace github.com/mudler/go-stable-diffusion=$(CURDIR)/sources/go-stable-diffusion - $(GOCMD) mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=$(CURDIR)/sources/gpt4all/gpt4all-bindings/golang $(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(CURDIR)/sources/go-llama.cpp dropreplace: @@ -339,7 +321,6 @@ dropreplace: $(GOCMD) mod edit -dropreplace github.com/M0Rf30/go-tiny-dream $(GOCMD) mod edit -dropreplace github.com/mudler/go-piper $(GOCMD) mod edit -dropreplace github.com/mudler/go-stable-diffusion 
- $(GOCMD) mod edit -dropreplace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang $(GOCMD) mod edit -dropreplace github.com/go-skynet/go-llama.cpp prepare-sources: get-sources replace @@ -349,7 +330,6 @@ prepare-sources: get-sources replace rebuild: ## Rebuilds the project $(GOCMD) clean -cache $(MAKE) -C sources/go-llama.cpp clean - $(MAKE) -C sources/gpt4all/gpt4all-bindings/golang/ clean $(MAKE) -C sources/go-rwkv.cpp clean $(MAKE) -C sources/whisper.cpp clean $(MAKE) -C sources/go-stable-diffusion clean @@ -469,8 +449,7 @@ test: prepare test-models/testmodel.ggml grpcs export GO_TAGS="tts stablediffusion debug" $(MAKE) prepare-test HUGGINGFACE_GRPC=$(abspath ./)/backend/python/sentencetransformers/run.sh TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \ - $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama && !llama-gguf" --flake-attempts $(TEST_FLAKES) --fail-fast -v -r $(TEST_PATHS) - $(MAKE) test-gpt4all + $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!llama && !llama-gguf" --flake-attempts $(TEST_FLAKES) --fail-fast -v -r $(TEST_PATHS) $(MAKE) test-llama $(MAKE) test-llama-gguf $(MAKE) test-tts @@ -500,10 +479,6 @@ teardown-e2e: rm -rf $(TEST_DIR) || true docker stop $$(docker ps -q --filter ancestor=localai-tests) -test-gpt4all: prepare-test - TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \ - $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="gpt4all" --flake-attempts 5 -v -r $(TEST_PATHS) - test-llama: prepare-test TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \ $(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="llama" --flake-attempts 5 -v -r $(TEST_PATHS) @@ -730,12 +705,6 @@ backend-assets/espeak-ng-data: sources/go-piper sources/go-piper/libpiper_bindin mkdir -p backend-assets/espeak-ng-data @cp -rf sources/go-piper/piper-phonemize/pi/share/espeak-ng-data/. 
backend-assets/espeak-ng-data -backend-assets/gpt4all: sources/gpt4all sources/gpt4all/gpt4all-bindings/golang/libgpt4all.a - mkdir -p backend-assets/gpt4all - @cp sources/gpt4all/gpt4all-bindings/golang/buildllm/*.so backend-assets/gpt4all/ || true - @cp sources/gpt4all/gpt4all-bindings/golang/buildllm/*.dylib backend-assets/gpt4all/ || true - @cp sources/gpt4all/gpt4all-bindings/golang/buildllm/*.dll backend-assets/gpt4all/ || true - backend-assets/grpc: protogen-go replace mkdir -p backend-assets/grpc @@ -746,13 +715,6 @@ ifneq ($(UPX),) $(UPX) backend-assets/grpc/bert-embeddings endif -backend-assets/grpc/gpt4all: sources/gpt4all sources/gpt4all/gpt4all-bindings/golang/libgpt4all.a backend-assets/gpt4all backend-assets/grpc - CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/gpt4all/gpt4all-bindings/golang/ LIBRARY_PATH=$(CURDIR)/sources/gpt4all/gpt4all-bindings/golang/ \ - $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gpt4all ./backend/go/llm/gpt4all/ -ifneq ($(UPX),) - $(UPX) backend-assets/grpc/gpt4all -endif - backend-assets/grpc/huggingface: backend-assets/grpc $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/huggingface ./backend/go/llm/langchain/ ifneq ($(UPX),) diff --git a/backend/go/llm/gpt4all/gpt4all.go b/backend/go/llm/gpt4all/gpt4all.go deleted file mode 100644 index 9caab48c..00000000 --- a/backend/go/llm/gpt4all/gpt4all.go +++ /dev/null @@ -1,62 +0,0 @@ -package main - -// This is a wrapper to statisfy the GRPC service interface -// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc) -import ( - "fmt" - - "github.com/mudler/LocalAI/pkg/grpc/base" - pb "github.com/mudler/LocalAI/pkg/grpc/proto" - gpt4all "github.com/nomic-ai/gpt4all/gpt4all-bindings/golang" -) - -type LLM struct { - base.SingleThread - - gpt4all *gpt4all.Model -} - -func (llm *LLM) Load(opts *pb.ModelOptions) error { - model, err := gpt4all.New(opts.ModelFile, - gpt4all.SetThreads(int(opts.Threads)), - gpt4all.SetLibrarySearchPath(opts.LibrarySearchPath)) - llm.gpt4all = model - return err -} - -func buildPredictOptions(opts *pb.PredictOptions) []gpt4all.PredictOption { - predictOptions := []gpt4all.PredictOption{ - gpt4all.SetTemperature(float64(opts.Temperature)), - gpt4all.SetTopP(float64(opts.TopP)), - gpt4all.SetTopK(int(opts.TopK)), - gpt4all.SetTokens(int(opts.Tokens)), - } - - if opts.Batch != 0 { - predictOptions = append(predictOptions, gpt4all.SetBatch(int(opts.Batch))) - } - return predictOptions -} - -func (llm *LLM) Predict(opts *pb.PredictOptions) (string, error) { - return llm.gpt4all.Predict(opts.Prompt, buildPredictOptions(opts)...) -} - -func (llm *LLM) PredictStream(opts *pb.PredictOptions, results chan string) error { - predictOptions := buildPredictOptions(opts) - - go func() { - llm.gpt4all.SetTokenCallback(func(token string) bool { - results <- token - return true - }) - _, err := llm.gpt4all.Predict(opts.Prompt, predictOptions...) 
- if err != nil { - fmt.Println("err: ", err) - } - llm.gpt4all.SetTokenCallback(nil) - close(results) - }() - - return nil -} diff --git a/backend/go/llm/gpt4all/main.go b/backend/go/llm/gpt4all/main.go deleted file mode 100644 index acf44087..00000000 --- a/backend/go/llm/gpt4all/main.go +++ /dev/null @@ -1,21 +0,0 @@ -package main - -// Note: this is started internally by LocalAI and a server is allocated for each model - -import ( - "flag" - - grpc "github.com/mudler/LocalAI/pkg/grpc" -) - -var ( - addr = flag.String("addr", "localhost:50051", "the address to connect to") -) - -func main() { - flag.Parse() - - if err := grpc.StartServer(*addr, &LLM{}); err != nil { - panic(err) - } -} diff --git a/core/cli/worker/worker_llamacpp.go b/core/cli/worker/worker_llamacpp.go index 5598a485..2baf51ec 100644 --- a/core/cli/worker/worker_llamacpp.go +++ b/core/cli/worker/worker_llamacpp.go @@ -21,7 +21,7 @@ func (r *LLamaCPP) Run(ctx *cliContext.Context) error { err := assets.ExtractFiles(ctx.BackendAssets, r.BackendAssetsPath) log.Debug().Msgf("Extracting backend assets files to %s", r.BackendAssetsPath) if err != nil { - log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly, like gpt4all)", err) + log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly)", err) } if len(os.Args) < 4 { diff --git a/core/cli/worker/worker_p2p.go b/core/cli/worker/worker_p2p.go index ddb3518c..93a365cb 100644 --- a/core/cli/worker/worker_p2p.go +++ b/core/cli/worker/worker_p2p.go @@ -33,7 +33,7 @@ func (r *P2P) Run(ctx *cliContext.Context) error { err := assets.ExtractFiles(ctx.BackendAssets, r.BackendAssetsPath) log.Debug().Msgf("Extracting backend assets files to %s", r.BackendAssetsPath) if err != nil { - log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly, like gpt4all)", err) + log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly)", err) } // Check if the token is set diff --git a/core/http/app_test.go b/core/http/app_test.go index b21ad25a..a837e20c 100644 --- a/core/http/app_test.go +++ b/core/http/app_test.go @@ -563,32 +563,6 @@ var _ = Describe("API test", func() { Expect(res["unit"]).To(Equal("celcius"), fmt.Sprint(res)) Expect(string(resp2.Choices[0].FinishReason)).To(Equal("function_call"), fmt.Sprint(resp2.Choices[0].FinishReason)) }) - - It("runs gpt4all", Label("gpt4all"), func() { - if runtime.GOOS != "linux" { - Skip("test supported only on linux") - } - - response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{ - URL: "github:go-skynet/model-gallery/gpt4all-j.yaml", - Name: "gpt4all-j", - }) - - Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response)) - - uuid := response["uuid"].(string) - - Eventually(func() bool { - response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid) - return response["processed"].(bool) - }, "960s", "10s").Should(Equal(true)) - - resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "gpt4all-j", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "How are you?"}}}) - Expect(err).ToNot(HaveOccurred()) - Expect(len(resp.Choices)).To(Equal(1)) - Expect(resp.Choices[0].Message.Content).To(ContainSubstring("well")) - }) - }) }) @@ -792,20 +766,6 @@ var _ = Describe("API test", func() { 
Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty()) }) - It("can generate completions from model configs", func() { - resp, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "gpt4all", Prompt: testPrompt}) - Expect(err).ToNot(HaveOccurred()) - Expect(len(resp.Choices)).To(Equal(1)) - Expect(resp.Choices[0].Text).ToNot(BeEmpty()) - }) - - It("can generate chat completions from model configs", func() { - resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "gpt4all-2", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: testPrompt}}}) - Expect(err).ToNot(HaveOccurred()) - Expect(len(resp.Choices)).To(Equal(1)) - Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty()) - }) - It("returns errors", func() { _, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "foomodel", Prompt: testPrompt}) Expect(err).To(HaveOccurred()) diff --git a/core/http/routes/ui.go b/core/http/routes/ui.go index 4f8afd3c..2996e9dc 100644 --- a/core/http/routes/ui.go +++ b/core/http/routes/ui.go @@ -267,7 +267,7 @@ func RegisterUIRoutes(app *fiber.App, return c.SendString(elements.ProgressBar("100")) } if status.Error != nil { - // TODO: instead of deleting the job, we should keep it in the cache and make it dismissable + // TODO: instead of deleting the job, we should keep it in the cache and make it dismissable by the user processingModels.DeleteUUID(jobUID) return c.SendString(elements.ErrorProgress(status.Error.Error(), status.GalleryModelName)) } diff --git a/core/startup/startup.go b/core/startup/startup.go index 55f930a4..3565d196 100644 --- a/core/startup/startup.go +++ b/core/startup/startup.go @@ -106,7 +106,7 @@ func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.Mode err := assets.ExtractFiles(options.BackendAssets, options.AssetsDestination) log.Debug().Msgf("Extracting backend assets files to %s", options.AssetsDestination) if err != nil { - log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly, like gpt4all)", err) + log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly)", err) } } diff --git a/pkg/model/initializers.go b/pkg/model/initializers.go index 88a08f28..11980f03 100644 --- a/pkg/model/initializers.go +++ b/pkg/model/initializers.go @@ -45,11 +45,6 @@ const ( LLamaCPPGRPC = "llama-cpp-grpc" - Gpt4AllLlamaBackend = "gpt4all-llama" - Gpt4AllMptBackend = "gpt4all-mpt" - Gpt4AllJBackend = "gpt4all-j" - Gpt4All = "gpt4all" - BertEmbeddingsBackend = "bert-embeddings" RwkvBackend = "rwkv" WhisperBackend = "whisper" @@ -144,11 +139,10 @@ ENTRY: // sets a priority list - first has more priority priorityList := []string{ - // First llama.cpp(variants) and llama-ggml to follow. // We keep the fallback to prevent that if the llama.cpp variants // that depends on shared libs if breaks have still a safety net. 
- LLamaCPP, LlamaGGML, Gpt4All, LLamaCPPFallback, + LLamaCPP, LlamaGGML, LLamaCPPFallback, } toTheEnd := []string{ @@ -434,9 +428,6 @@ func (ml *ModelLoader) BackendLoader(opts ...Option) (client grpc.Backend, err e var backendToConsume string switch backend { - case Gpt4AllLlamaBackend, Gpt4AllMptBackend, Gpt4AllJBackend, Gpt4All: - o.gRPCOptions.LibrarySearchPath = filepath.Join(o.assetDir, "backend-assets", "gpt4all") - backendToConsume = Gpt4All case PiperBackend: o.gRPCOptions.LibrarySearchPath = filepath.Join(o.assetDir, "backend-assets", "espeak-ng-data") backendToConsume = PiperBackend From 1d94aaa10f5955f8e0299170ef8cc9e02e5d811a Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Wed, 7 Aug 2024 23:54:27 +0200 Subject: [PATCH 0195/1851] feat(swagger): update swagger (#3196) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- swagger/docs.go | 3 +++ swagger/swagger.json | 3 +++ swagger/swagger.yaml | 2 ++ 3 files changed, 8 insertions(+) diff --git a/swagger/docs.go b/swagger/docs.go index 9a5a1784..4d89a926 100644 --- a/swagger/docs.go +++ b/swagger/docs.go @@ -712,6 +712,9 @@ const docTemplate = `{ "parameters": { "type": "object", "additionalProperties": true + }, + "strict": { + "type": "boolean" } } }, diff --git a/swagger/swagger.json b/swagger/swagger.json index 9d53fbbe..ef038c4c 100644 --- a/swagger/swagger.json +++ b/swagger/swagger.json @@ -705,6 +705,9 @@ "parameters": { "type": "object", "additionalProperties": true + }, + "strict": { + "type": "boolean" } } }, diff --git a/swagger/swagger.yaml b/swagger/swagger.yaml index 2d628566..34d3d64f 100644 --- a/swagger/swagger.yaml +++ b/swagger/swagger.yaml @@ -16,6 +16,8 @@ definitions: parameters: additionalProperties: true type: object + strict: + type: boolean type: object functions.Item: properties: From 1c708d21de87371bb17c27e2615aa352e9ac5790 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Thu, 8 Aug 2024 00:19:20 +0200 Subject: [PATCH 0196/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `15fa07a5c564d3ed7e7eb64b73272cedb27e73ec` (#3197) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index bcbdbe83..6799cf2b 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=1e6f6554aa11fa10160a5fda689e736c3c34169f +CPPLLAMA_VERSION?=15fa07a5c564d3ed7e7eb64b73272cedb27e73ec # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 60117ec05722c41033376b67976f33dd7d8c34b3 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 8 Aug 2024 06:59:10 +0200 Subject: [PATCH 0197/1851] fix(apple): disable BUILD_TYPE metal on fallback (#3199) When compiling the single-binary on Apple we enforce BUILD_TYPE=metal; however, we still want the vanilla fallback available, so that if llama.cpp fails to load Metal (e.g. if the Acceleration framework is missing, or the macOS version is too old) we can still run by offloading to the CPU. The default backend still uses Metal as usual.
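On machines where Metal turns out to be unusable, a model can also be pinned to the CPU-only variant explicitly; a minimal sketch, assuming the `llama-cpp-fallback` backend name from the Makefile targets and a placeholder model file:

```bash
# Hypothetical per-model override selecting the vanilla (BUILD_TYPE=none)
# llama.cpp build instead of the default Metal-enabled one.
cat <<'EOF' > models/cpu-only.yaml
name: cpu-only
backend: llama-cpp-fallback
parameters:
  model: my-model.Q4_K_M.gguf
EOF
```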
Signed-off-by: Ettore Di Giacinto --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 6799cf2b..22c48110 100644 --- a/Makefile +++ b/Makefile @@ -387,7 +387,7 @@ ifeq ($(DETECT_LIBS),true) scripts/prepare-libs.sh backend-assets/grpc/llama-cpp-avx2 endif ifeq ($(OS),Darwin) - $(info ${GREEN}I Skip CUDA/hipblas build on MacOS${RESET}) + BUILD_TYPE=none $(MAKE) backend-assets/grpc/llama-cpp-fallback else $(MAKE) backend-assets/grpc/llama-cpp-cuda $(MAKE) backend-assets/grpc/llama-cpp-hipblas From f7ffa9cd58c588dcf6ba56c2dd0ab5470efa2fae Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 8 Aug 2024 11:59:31 +0200 Subject: [PATCH 0198/1851] fix(vall-e-x): pin hipblas deps (#3201) Signed-off-by: Ettore Di Giacinto --- backend/python/vall-e-x/requirements-hipblas.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/backend/python/vall-e-x/requirements-hipblas.txt b/backend/python/vall-e-x/requirements-hipblas.txt index 6ddd0b8d..fc43790a 100644 --- a/backend/python/vall-e-x/requirements-hipblas.txt +++ b/backend/python/vall-e-x/requirements-hipblas.txt @@ -1,4 +1,4 @@ --extra-index-url https://download.pytorch.org/whl/rocm6.0 accelerate -torch -torchaudio \ No newline at end of file +torch==2.3.0+rocm6.0 +torchaudio==2.3.0+rocm6.0 \ No newline at end of file From 4a1a3a56ba9d6fc695b124a9de623fe7431cd224 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 8 Aug 2024 11:59:42 +0200 Subject: [PATCH 0199/1851] models(gallery): add calme-2.3-legalkit-8b (#3200) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 65516cc3..3119fae0 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -307,6 +307,26 @@ - filename: L3.1-70b-glitz-v0.2.i1-Q4_K_M.gguf sha256: 585efc83e7f6893043be2487fc09c914a381fb463ce97942ef2f25ae85103bcd uri: huggingface://mradermacher/L3.1-70b-glitz-v0.2-i1-GGUF/L3.1-70b-glitz-v0.2.i1-Q4_K_M.gguf +- !!merge <<: *llama31 + name: "calme-2.3-legalkit-8b-i1" + icon: https://huggingface.co/MaziyarPanahi/calme-2.3-legalkit-8b/resolve/main/calme-2-legalkit.webp + urls: + - https://huggingface.co/mradermacher/calme-2.3-legalkit-8b-i1-GGUF + - https://huggingface.co/MaziyarPanahi/calme-2.3-legalkit-8b + description: | + This model is an advanced iteration of the powerful meta-llama/Meta-Llama-3.1-8B-Instruct, specifically fine-tuned to enhance its capabilities in the legal domain. The fine-tuning process utilized a synthetically generated dataset derived from the French LegalKit, a comprehensive legal language resource. + + To create this specialized dataset, I used the NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO model in conjunction with Hugging Face's Inference Endpoint. This approach allowed for the generation of high-quality, synthetic data that incorporates Chain of Thought (CoT) and advanced reasoning in its responses. + + The resulting model combines the robust foundation of Llama-3.1-8B with tailored legal knowledge and enhanced reasoning capabilities. This makes it particularly well-suited for tasks requiring in-depth legal analysis, interpretation, and application of French legal concepts. 
+ overrides: + parameters: + model: calme-2.3-legalkit-8b.i1-Q4_K_M.gguf + files: + - filename: calme-2.3-legalkit-8b.i1-Q4_K_M.gguf + sha256: b71dfea8bbd73b0fbd5793ef462b8540c24e1c52a47b1794561adb88109a9e80 + uri: huggingface://mradermacher/calme-2.3-legalkit-8b-i1-GGUF/calme-2.3-legalkit-8b.i1-Q4_K_M.gguf +## Uncensored models - !!merge <<: *llama31 name: "humanish-roleplay-llama-3.1-8b-i1" icon: https://cdn-uploads.huggingface.co/production/uploads/5fad8602b8423e1d80b8a965/VPwtjS3BtjEEEq7ck4kAQ.webp @@ -324,7 +344,6 @@ - filename: Humanish-Roleplay-Llama-3.1-8B.i1-Q4_K_M.gguf sha256: 18cf753684e5226b51f3defc708852ca4924f50dc8bc31c9a7d0a036a477b7a7 uri: huggingface://mradermacher/Humanish-Roleplay-Llama-3.1-8B-i1-GGUF/Humanish-Roleplay-Llama-3.1-8B.i1-Q4_K_M.gguf -## Uncensored models - !!merge <<: *llama31 name: "darkidol-llama-3.1-8b-instruct-1.0-uncensored-i1" icon: https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored/resolve/main/DarkIdol-Llama-3.1-8B-Instruct-1.0-Uncensored.png From 8317839ca5e36fe6abcb981ce26f0ea6428fa7d2 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 8 Aug 2024 17:28:07 +0200 Subject: [PATCH 0200/1851] fix(diffusers): use nightly rocm for hipblas builds (#3202) Signed-off-by: Ettore Di Giacinto --- backend/python/diffusers/requirements-hipblas.txt | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/backend/python/diffusers/requirements-hipblas.txt b/backend/python/diffusers/requirements-hipblas.txt index 9e992d02..92987e7a 100644 --- a/backend/python/diffusers/requirements-hipblas.txt +++ b/backend/python/diffusers/requirements-hipblas.txt @@ -1,4 +1,5 @@ ---extra-index-url https://download.pytorch.org/whl/rocm6.0 +--pre +--extra-index-url https://download.pytorch.org/whl/nightly/ torch torchvision diffusers @@ -7,4 +8,4 @@ transformers accelerate compel peft -sentencepiece \ No newline at end of file +sentencepiece From a507c13f8e58ccaebca4755d9fc7324a1c8b6bcb Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Thu, 8 Aug 2024 22:21:05 +0200 Subject: [PATCH 0201/1851] fix(diffusers): do not specify `--pre` as with pip drop --pre as it is not supported by `uv` Signed-off-by: Ettore Di Giacinto --- backend/python/diffusers/requirements-hipblas.txt | 1 - 1 file changed, 1 deletion(-) diff --git a/backend/python/diffusers/requirements-hipblas.txt b/backend/python/diffusers/requirements-hipblas.txt index 92987e7a..b7890f6e 100644 --- a/backend/python/diffusers/requirements-hipblas.txt +++ b/backend/python/diffusers/requirements-hipblas.txt @@ -1,4 +1,3 @@ ---pre --extra-index-url https://download.pytorch.org/whl/nightly/ torch torchvision From b1773e33d55b8748af5e553a6c8b7818a8e08bfe Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 9 Aug 2024 00:18:00 +0200 Subject: [PATCH 0202/1851] chore: :arrow_up: Update ggerganov/whisper.cpp to `6eac06759b87b50132a01be019e9250a3ffc8969` (#3203) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 22c48110..a6ee126f 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=fe36c909715e6751277ddb020e7892c7670b61d4 
+WHISPER_CPP_VERSION?=6eac06759b87b50132a01be019e9250a3ffc8969 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 74f8785047e2a2f253e3cd34d92c7855601a7e90 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Fri, 9 Aug 2024 00:36:08 +0200 Subject: [PATCH 0203/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `3a14e00366399040a139c67dd5951177a8cb5695` (#3204) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index a6ee126f..1ed68c08 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=15fa07a5c564d3ed7e7eb64b73272cedb27e73ec +CPPLLAMA_VERSION?=3a14e00366399040a139c67dd5951177a8cb5695 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 5fcafc3d1e530aed28fb298bf36583f0814884f5 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 9 Aug 2024 11:07:38 +0200 Subject: [PATCH 0204/1851] fix(diffusers): allow pre-releases for requirements Signed-off-by: Ettore Di Giacinto --- backend/python/diffusers/install.sh | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/backend/python/diffusers/install.sh b/backend/python/diffusers/install.sh index 36443ef1..b0b46a86 100755 --- a/backend/python/diffusers/install.sh +++ b/backend/python/diffusers/install.sh @@ -11,4 +11,9 @@ if [ "x${BUILD_PROFILE}" == "xintel" ]; then EXTRA_PIP_INSTALL_FLAGS+=" --upgrade --index-strategy=unsafe-first-match" fi +# hipblas builds from nightly, which needs pre-releases to work +if [ "x${BUILD_PROFILE}" == "xhipblas" ]; then + EXTRA_PIP_INSTALL_FLAGS+=" --prerelease=allow" +fi + installRequirements From 9e3e892ac79b200586e3be5ebf04a619ae25b2a8 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 9 Aug 2024 20:12:01 +0200 Subject: [PATCH 0205/1851] feat(p2p): add network explorer and community pools (#3125) * WIP Signed-off-by: Ettore Di Giacinto * Fixups Signed-off-by: Ettore Di Giacinto * Wire up a simple explorer DB Signed-off-by: Ettore Di Giacinto * wip Signed-off-by: Ettore Di Giacinto * WIP Signed-off-by: Ettore Di Giacinto * refactor: group service IDs so they can be identified easily in the ledger table Signed-off-by: Ettore Di Giacinto * feat(discovery): discovery service now gathers worker information correctly Signed-off-by: Ettore Di Giacinto * feat(explorer): display network token Signed-off-by: Ettore Di Giacinto * feat(explorer): display form to add new networks Signed-off-by: Ettore Di Giacinto * feat(explorer): stop from overwriting networks Signed-off-by: Ettore Di Giacinto * feat(explorer): display only networks with active workers Signed-off-by: Ettore Di Giacinto * feat(explorer): list only clusters in a network if it has online workers Signed-off-by: Ettore Di Giacinto * remove invalid and inactive networks: if networks have no workers, delete them from the database; similarly if they are invalid.
Signed-off-by: Ettore Di Giacinto * ci: add workflow to deploy new explorer versions automatically Signed-off-by: Ettore Di Giacinto * build-api: build with p2p tag Signed-off-by: Ettore Di Giacinto * Allow to specify a connection timeout Signed-off-by: Ettore Di Giacinto * logging Signed-off-by: Ettore Di Giacinto * Better p2p defaults Signed-off-by: Ettore Di Giacinto * Set loglevel Signed-off-by: Ettore Di Giacinto * Fix dht enable Signed-off-by: Ettore Di Giacinto * Default to info for loglevel Signed-off-by: Ettore Di Giacinto * Add navbar Signed-off-by: Ettore Di Giacinto * Slightly improve rendering Signed-off-by: Ettore Di Giacinto * Allow to copy the token easily Signed-off-by: Ettore Di Giacinto * ci fixups Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- .github/workflows/deploy-explorer.yaml | 64 ++++ Makefile | 2 +- core/cli/cli.go | 1 + core/cli/explorer.go | 35 ++ core/cli/run.go | 4 +- core/cli/worker/worker_p2p.go | 4 +- core/explorer/database.go | 106 ++++++ core/explorer/database_test.go | 92 +++++ core/explorer/discovery.go | 203 +++++++++++ core/explorer/explorer_suite_test.go | 13 + core/http/endpoints/explorer/dashboard.go | 105 ++++++ core/http/endpoints/localai/p2p.go | 2 +- core/http/explorer.go | 46 +++ core/http/routes/explorer.go | 13 + core/http/routes/ui.go | 4 +- core/http/views/explorer.html | 342 ++++++++++++++++++ core/http/views/partials/navbar_explorer.html | 39 ++ core/p2p/node.go | 5 +- core/p2p/p2p.go | 19 +- 19 files changed, 1082 insertions(+), 17 deletions(-) create mode 100644 .github/workflows/deploy-explorer.yaml create mode 100644 core/cli/explorer.go create mode 100644 core/explorer/database.go create mode 100644 core/explorer/database_test.go create mode 100644 core/explorer/discovery.go create mode 100644 core/explorer/explorer_suite_test.go create mode 100644 core/http/endpoints/explorer/dashboard.go create mode 100644 core/http/explorer.go create mode 100644 core/http/routes/explorer.go create mode 100644 core/http/views/explorer.html create mode 100644 core/http/views/partials/navbar_explorer.html diff --git a/.github/workflows/deploy-explorer.yaml b/.github/workflows/deploy-explorer.yaml new file mode 100644 index 00000000..71a14183 --- /dev/null +++ b/.github/workflows/deploy-explorer.yaml @@ -0,0 +1,64 @@ +name: Explorer deployment + +on: + push: + branches: + - master + tags: + - 'v*' + +concurrency: + group: ci-deploy-${{ github.head_ref || github.ref }}-${{ github.repository }} + +jobs: + build-linux: + runs-on: ubuntu-latest + steps: + - name: Clone + uses: actions/checkout@v4 + with: + submodules: true + - uses: actions/setup-go@v5 + with: + go-version: '1.21.x' + cache: false + - name: Dependencies + run: | + sudo apt-get update + sudo apt-get install -y wget curl build-essential ffmpeg protobuf-compiler ccache upx-ucl gawk cmake libgmock-dev + go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af + go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2 + make protogen-go + - name: Build api + run: | + make build-api + - name: rm + uses: appleboy/ssh-action@v1.0.3 + with: + host: ${{ secrets.EXPLORER_SSH_HOST }} + username: ${{ secrets.EXPLORER_SSH_USERNAME }} + key: ${{ secrets.EXPLORER_SSH_KEY }} + port: ${{ secrets.EXPLORER_SSH_PORT }} + script: | + sudo rm -rf local-ai/ || true + - name: copy file via ssh + uses: appleboy/scp-action@v0.1.7 + with: + host: ${{ secrets.EXPLORER_SSH_HOST }} + username: ${{ secrets.EXPLORER_SSH_USERNAME }} + 
key: ${{ secrets.EXPLORER_SSH_KEY }} + port: ${{ secrets.EXPLORER_SSH_PORT }} + source: "local-ai" + overwrite: true + rm: true + target: ./local-ai + - name: restarting + uses: appleboy/ssh-action@v1.0.3 + with: + host: ${{ secrets.EXPLORER_SSH_HOST }} + username: ${{ secrets.EXPLORER_SSH_USERNAME }} + key: ${{ secrets.EXPLORER_SSH_KEY }} + port: ${{ secrets.EXPLORER_SSH_PORT }} + script: | + sudo cp -rfv local-ai/local-ai /usr/bin/local-ai + sudo systemctl restart local-ai diff --git a/Makefile b/Makefile index 1ed68c08..d690e483 100644 --- a/Makefile +++ b/Makefile @@ -376,7 +376,7 @@ build-minimal: BUILD_GRPC_FOR_BACKEND_LLAMA=true GRPC_BACKENDS="backend-assets/grpc/llama-cpp-avx2" GO_TAGS=p2p $(MAKE) build build-api: - BUILD_GRPC_FOR_BACKEND_LLAMA=true BUILD_API_ONLY=true GO_TAGS=none $(MAKE) build + BUILD_GRPC_FOR_BACKEND_LLAMA=true BUILD_API_ONLY=true GO_TAGS=p2p $(MAKE) build backend-assets/lib: mkdir -p backend-assets/lib diff --git a/core/cli/cli.go b/core/cli/cli.go index 0fed33fd..2073778d 100644 --- a/core/cli/cli.go +++ b/core/cli/cli.go @@ -15,4 +15,5 @@ var CLI struct { Transcript TranscriptCMD `cmd:"" help:"Convert audio to text"` Worker worker.Worker `cmd:"" help:"Run workers to distribute workload (llama.cpp-only)"` Util UtilCMD `cmd:"" help:"Utility commands"` + Explorer ExplorerCMD `cmd:"" help:"Run p2p explorer"` } diff --git a/core/cli/explorer.go b/core/cli/explorer.go new file mode 100644 index 00000000..0fcde728 --- /dev/null +++ b/core/cli/explorer.go @@ -0,0 +1,35 @@ +package cli + +import ( + "context" + "time" + + cliContext "github.com/mudler/LocalAI/core/cli/context" + "github.com/mudler/LocalAI/core/explorer" + "github.com/mudler/LocalAI/core/http" +) + +type ExplorerCMD struct { + Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` + PoolDatabase string `env:"LOCALAI_POOL_DATABASE,POOL_DATABASE" default:"explorer.json" help:"Path to the pool database" group:"api"` + ConnectionTimeout string `env:"LOCALAI_CONNECTION_TIMEOUT,CONNECTION_TIMEOUT" default:"2m" help:"Connection timeout for the explorer" group:"api"` +} + +func (e *ExplorerCMD) Run(ctx *cliContext.Context) error { + + db, err := explorer.NewDatabase(e.PoolDatabase) + if err != nil { + return err + } + + dur, err := time.ParseDuration(e.ConnectionTimeout) + if err != nil { + return err + } + ds := explorer.NewDiscoveryServer(db, dur) + + go ds.Start(context.Background()) + appHTTP := http.Explorer(db, ds) + + return appHTTP.Listen(e.Address) +} diff --git a/core/cli/run.go b/core/cli/run.go index 9d58f6d9..707f6afb 100644 --- a/core/cli/run.go +++ b/core/cli/run.go @@ -121,9 +121,9 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error { } log.Info().Msg("Starting P2P server discovery...") - if err := p2p.ServiceDiscoverer(context.Background(), node, token, p2p.NetworkID(r.Peer2PeerNetworkID, ""), func(serviceID string, node p2p.NodeData) { + if err := p2p.ServiceDiscoverer(context.Background(), node, token, p2p.NetworkID(r.Peer2PeerNetworkID, p2p.WorkerID), func(serviceID string, node p2p.NodeData) { var tunnelAddresses []string - for _, v := range p2p.GetAvailableNodes(p2p.NetworkID(r.Peer2PeerNetworkID, "")) { + for _, v := range p2p.GetAvailableNodes(p2p.NetworkID(r.Peer2PeerNetworkID, p2p.WorkerID)) { if v.IsOnline() { tunnelAddresses = append(tunnelAddresses, v.TunnelAddress) } else { diff --git a/core/cli/worker/worker_p2p.go b/core/cli/worker/worker_p2p.go index 93a365cb..17b9ff08 100644 --- a/core/cli/worker/worker_p2p.go +++ 
b/core/cli/worker/worker_p2p.go @@ -60,7 +60,7 @@ func (r *P2P) Run(ctx *cliContext.Context) error { p = r.RunnerPort } - err = p2p.ExposeService(context.Background(), address, p, r.Token, p2p.NetworkID(r.Peer2PeerNetworkID, "")) + err = p2p.ExposeService(context.Background(), address, p, r.Token, p2p.NetworkID(r.Peer2PeerNetworkID, p2p.WorkerID)) if err != nil { return err } @@ -100,7 +100,7 @@ func (r *P2P) Run(ctx *cliContext.Context) error { } }() - err = p2p.ExposeService(context.Background(), address, fmt.Sprint(port), r.Token, p2p.NetworkID(r.Peer2PeerNetworkID, "")) + err = p2p.ExposeService(context.Background(), address, fmt.Sprint(port), r.Token, p2p.NetworkID(r.Peer2PeerNetworkID, p2p.WorkerID)) if err != nil { return err } diff --git a/core/explorer/database.go b/core/explorer/database.go new file mode 100644 index 00000000..8535140c --- /dev/null +++ b/core/explorer/database.go @@ -0,0 +1,106 @@ +package explorer + +// A simple JSON database for storing and retrieving p2p network tokens and a name and description. + +import ( + "encoding/json" + "os" + "sort" + "sync" +) + +// Database is a simple JSON database for storing and retrieving p2p network tokens and a name and description. +type Database struct { + sync.RWMutex + path string + data map[string]TokenData +} + +// TokenData is a p2p network token with a name and description. +type TokenData struct { + Name string `json:"name"` + Description string `json:"description"` +} + +// NewDatabase creates a new Database with the given path. +func NewDatabase(path string) (*Database, error) { + db := &Database{ + data: make(map[string]TokenData), + path: path, + } + return db, db.load() +} + +// Get retrieves a Token from the Database by its token. +func (db *Database) Get(token string) (TokenData, bool) { + db.RLock() + defer db.RUnlock() + t, ok := db.data[token] + return t, ok +} + +// Set stores a Token in the Database by its token. +func (db *Database) Set(token string, t TokenData) error { + db.Lock() + db.data[token] = t + db.Unlock() + + return db.Save() +} + +// Delete removes a Token from the Database by its token. +func (db *Database) Delete(token string) error { + db.Lock() + delete(db.data, token) + db.Unlock() + return db.Save() +} + +func (db *Database) TokenList() []string { + db.RLock() + defer db.RUnlock() + tokens := []string{} + for k := range db.data { + tokens = append(tokens, k) + } + + sort.Slice(tokens, func(i, j int) bool { + // sort by token + return tokens[i] < tokens[j] + }) + + return tokens +} + +// load reads the Database from disk. +func (db *Database) load() error { + db.Lock() + defer db.Unlock() + + if _, err := os.Stat(db.path); os.IsNotExist(err) { + return nil + } + + // Read the file from disk + // Unmarshal the JSON into db.data + f, err := os.ReadFile(db.path) + if err != nil { + return err + } + return json.Unmarshal(f, &db.data) +} + +// Save writes the Database to disk. +func (db *Database) Save() error { + db.RLock() + defer db.RUnlock() + + // Marshal db.data into JSON + // Write the JSON to the file + f, err := os.Create(db.path) + if err != nil { + return err + } + defer f.Close() + return json.NewEncoder(f).Encode(db.data) +} diff --git a/core/explorer/database_test.go b/core/explorer/database_test.go new file mode 100644 index 00000000..7f2cbd26 --- /dev/null +++ b/core/explorer/database_test.go @@ -0,0 +1,92 @@ +package explorer_test + +import ( + "os" + + . "github.com/onsi/ginkgo/v2" + . 
"github.com/onsi/gomega" + + "github.com/mudler/LocalAI/core/explorer" +) + +var _ = Describe("Database", func() { + var ( + dbPath string + db *explorer.Database + err error + ) + + BeforeEach(func() { + // Create a temporary file path for the database + dbPath = "test_db.json" + db, err = explorer.NewDatabase(dbPath) + Expect(err).To(BeNil()) + }) + + AfterEach(func() { + // Clean up the temporary database file + os.Remove(dbPath) + }) + + Context("when managing tokens", func() { + It("should add and retrieve a token", func() { + token := "token123" + t := explorer.TokenData{Name: "TokenName", Description: "A test token"} + + err = db.Set(token, t) + Expect(err).To(BeNil()) + + retrievedToken, exists := db.Get(token) + Expect(exists).To(BeTrue()) + Expect(retrievedToken).To(Equal(t)) + }) + + It("should delete a token", func() { + token := "token123" + t := explorer.TokenData{Name: "TokenName", Description: "A test token"} + + err = db.Set(token, t) + Expect(err).To(BeNil()) + + err = db.Delete(token) + Expect(err).To(BeNil()) + + _, exists := db.Get(token) + Expect(exists).To(BeFalse()) + }) + + It("should persist data to disk", func() { + token := "token123" + t := explorer.TokenData{Name: "TokenName", Description: "A test token"} + + err = db.Set(token, t) + Expect(err).To(BeNil()) + + // Recreate the database object to simulate reloading from disk + db, err = explorer.NewDatabase(dbPath) + Expect(err).To(BeNil()) + + retrievedToken, exists := db.Get(token) + Expect(exists).To(BeTrue()) + Expect(retrievedToken).To(Equal(t)) + + // Check the token list + tokenList := db.TokenList() + Expect(tokenList).To(ContainElement(token)) + }) + }) + + Context("when loading an empty or non-existent file", func() { + It("should start with an empty database", func() { + dbPath = "empty_db.json" + db, err = explorer.NewDatabase(dbPath) + Expect(err).To(BeNil()) + + _, exists := db.Get("nonexistent") + Expect(exists).To(BeFalse()) + + // Clean up + os.Remove(dbPath) + }) + }) +}) diff --git a/core/explorer/discovery.go b/core/explorer/discovery.go new file mode 100644 index 00000000..73281dc0 --- /dev/null +++ b/core/explorer/discovery.go @@ -0,0 +1,203 @@ +package explorer + +import ( + "context" + "fmt" + "strings" + "sync" + "time" + + "github.com/rs/zerolog/log" + + "github.com/mudler/LocalAI/core/p2p" + "github.com/mudler/edgevpn/pkg/blockchain" +) + +type DiscoveryServer struct { + sync.Mutex + database *Database + networkState *NetworkState + connectionTime time.Duration +} + +type NetworkState struct { + Networks map[string]Network +} + +func (s *DiscoveryServer) NetworkState() *NetworkState { + s.Lock() + defer s.Unlock() + return s.networkState +} + +// NewDiscoveryServer creates a new DiscoveryServer with the given Database. 
+// it keeps the db state in sync with the network state +func NewDiscoveryServer(db *Database, dur time.Duration) *DiscoveryServer { + if dur == 0 { + dur = 50 * time.Second + } + return &DiscoveryServer{ + database: db, + connectionTime: dur, + networkState: &NetworkState{ + Networks: map[string]Network{}, + }, + } +} + +type Network struct { + Clusters []ClusterData +} + +func (s *DiscoveryServer) runBackground() { + if len(s.database.TokenList()) == 0 { + time.Sleep(5 * time.Second) // avoid busy loop + return + } + + for _, token := range s.database.TokenList() { + c, cancel := context.WithTimeout(context.Background(), s.connectionTime) + defer cancel() + + // Connect to the network + // Get the number of nodes + // save it in the current state (mutex) + // do not do in parallel + n, err := p2p.NewNode(token) + if err != nil { + log.Err(err).Msg("Failed to create node") + s.database.Delete(token) + continue + } + + err = n.Start(c) + if err != nil { + log.Err(err).Msg("Failed to start node") + s.database.Delete(token) + continue + } + + ledger, err := n.Ledger() + if err != nil { + log.Err(err).Msg("Failed to start ledger") + s.database.Delete(token) + continue + } + + networkData := make(chan ClusterData) + + // get the network data - it takes the whole timeout + // as we might not be connected to the network yet, + // and few attempts would have to be made before bailing out + go s.retrieveNetworkData(c, ledger, networkData) + + hasWorkers := false + ledgerK := []ClusterData{} + for key := range networkData { + ledgerK = append(ledgerK, key) + if len(key.Workers) > 0 { + hasWorkers = true + } + } + + log.Debug().Any("network", token).Msgf("Network has %d clusters", len(ledgerK)) + if len(ledgerK) != 0 { + for _, k := range ledgerK { + log.Debug().Any("network", token).Msgf("Clusterdata %+v", k) + } + } + + if hasWorkers { + s.Lock() + s.networkState.Networks[token] = Network{ + Clusters: ledgerK, + } + s.Unlock() + } else { + log.Info().Any("network", token).Msg("No workers found in the network. Removing it from the database") + s.database.Delete(token) + } + } +} + +type ClusterData struct { + Workers []string + Type string + NetworkID string +} + +func (s *DiscoveryServer) retrieveNetworkData(c context.Context, ledger *blockchain.Ledger, networkData chan ClusterData) { + clusters := map[string]ClusterData{} + + defer func() { + for _, n := range clusters { + networkData <- n + } + close(networkData) + }() + + for { + select { + case <-c.Done(): + return + default: + time.Sleep(5 * time.Second) + + data := ledger.LastBlock().Storage + LEDGER: + for d := range data { + toScanForWorkers := false + cd := ClusterData{} + isWorkerCluster := d == p2p.WorkerID || (strings.Contains(d, "_") && strings.Contains(d, p2p.WorkerID)) + isFederatedCluster := d == p2p.FederatedID || (strings.Contains(d, "_") && strings.Contains(d, p2p.FederatedID)) + switch { + case isWorkerCluster: + toScanForWorkers = true + cd.Type = "worker" + case isFederatedCluster: + toScanForWorkers = true + cd.Type = "federated" + } + + if strings.Contains(d, "_") { + cd.NetworkID = strings.Split(d, "_")[0] + } + + if !toScanForWorkers { + continue LEDGER + } + + atLeastOneWorker := false + DATA: + for _, v := range data[d] { + nd := &p2p.NodeData{} + if err := v.Unmarshal(nd); err != nil { + continue DATA + } + + if nd.IsOnline() { + atLeastOneWorker = true + (&cd).Workers = append(cd.Workers, nd.ID) + } + } + + if atLeastOneWorker { + clusters[d] = cd + } + } + } + } +} + +// Start the discovery server. 
This is meant to be run in to a goroutine. +func (s *DiscoveryServer) Start(ctx context.Context) error { + for { + select { + case <-ctx.Done(): + return fmt.Errorf("context cancelled") + default: + // Collect data + s.runBackground() + } + } +} diff --git a/core/explorer/explorer_suite_test.go b/core/explorer/explorer_suite_test.go new file mode 100644 index 00000000..fc718d5f --- /dev/null +++ b/core/explorer/explorer_suite_test.go @@ -0,0 +1,13 @@ +package explorer_test + +import ( + "testing" + + . "github.com/onsi/ginkgo/v2" + . "github.com/onsi/gomega" +) + +func TestExplorer(t *testing.T) { + RegisterFailHandler(Fail) + RunSpecs(t, "Explorer test suite") +} diff --git a/core/http/endpoints/explorer/dashboard.go b/core/http/endpoints/explorer/dashboard.go new file mode 100644 index 00000000..7cd9f3c9 --- /dev/null +++ b/core/http/endpoints/explorer/dashboard.go @@ -0,0 +1,105 @@ +package explorer + +import ( + "encoding/base64" + "sort" + + "github.com/gofiber/fiber/v2" + "github.com/mudler/LocalAI/core/explorer" + "github.com/mudler/LocalAI/internal" +) + +func Dashboard() func(*fiber.Ctx) error { + return func(c *fiber.Ctx) error { + + summary := fiber.Map{ + "Title": "LocalAI API - " + internal.PrintableVersion(), + "Version": internal.PrintableVersion(), + } + + if string(c.Context().Request.Header.ContentType()) == "application/json" || len(c.Accepts("html")) == 0 { + // The client expects a JSON response + return c.Status(fiber.StatusOK).JSON(summary) + } else { + // Render index + return c.Render("views/explorer", summary) + } + } +} + +type AddNetworkRequest struct { + Token string `json:"token"` + Name string `json:"name"` + Description string `json:"description"` +} + +type Network struct { + explorer.Network + explorer.TokenData + Token string `json:"token"` +} + +func ShowNetworks(db *explorer.Database, ds *explorer.DiscoveryServer) func(*fiber.Ctx) error { + return func(c *fiber.Ctx) error { + networkState := ds.NetworkState() + results := []Network{} + for token, network := range networkState.Networks { + networkData, exists := db.Get(token) // get the token data + hasWorkers := false + for _, cluster := range network.Clusters { + if len(cluster.Workers) > 0 { + hasWorkers = true + break + } + } + if exists && hasWorkers { + results = append(results, Network{Network: network, TokenData: networkData, Token: token}) + } + } + + // order by number of clusters + sort.Slice(results, func(i, j int) bool { + return len(results[i].Clusters) > len(results[j].Clusters) + }) + + return c.JSON(results) + } +} + +func AddNetwork(db *explorer.Database) func(*fiber.Ctx) error { + return func(c *fiber.Ctx) error { + request := new(AddNetworkRequest) + if err := c.BodyParser(request); err != nil { + return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "Cannot parse JSON"}) + } + + if request.Token == "" { + return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "Token is required"}) + } + + if request.Name == "" { + return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "Name is required"}) + } + + if request.Description == "" { + return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "Description is required"}) + } + + // TODO: check if token is valid, otherwise reject + // try to decode the token from base64 + _, err := base64.StdEncoding.DecodeString(request.Token) + if err != nil { + return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "Invalid token"}) + } + + if _, exists := db.Get(request.Token); exists { + return 
c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "Token already exists"}) + } + err = db.Set(request.Token, explorer.TokenData{Name: request.Name, Description: request.Description}) + if err != nil { + return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{"error": "Cannot add token"}) + } + + return c.Status(fiber.StatusOK).JSON(fiber.Map{"message": "Token added"}) + } +} diff --git a/core/http/endpoints/localai/p2p.go b/core/http/endpoints/localai/p2p.go index 93e9b5d5..bbcee8c8 100644 --- a/core/http/endpoints/localai/p2p.go +++ b/core/http/endpoints/localai/p2p.go @@ -15,7 +15,7 @@ func ShowP2PNodes(appConfig *config.ApplicationConfig) func(*fiber.Ctx) error { // Render index return func(c *fiber.Ctx) error { return c.JSON(schema.P2PNodesResponse{ - Nodes: p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, "")), + Nodes: p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.WorkerID)), FederatedNodes: p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)), }) } diff --git a/core/http/explorer.go b/core/http/explorer.go new file mode 100644 index 00000000..608ecdb5 --- /dev/null +++ b/core/http/explorer.go @@ -0,0 +1,46 @@ +package http + +import ( + "net/http" + + "github.com/gofiber/fiber/v2" + "github.com/gofiber/fiber/v2/middleware/favicon" + "github.com/gofiber/fiber/v2/middleware/filesystem" + "github.com/mudler/LocalAI/core/explorer" + "github.com/mudler/LocalAI/core/http/routes" +) + +func Explorer(db *explorer.Database, discoveryServer *explorer.DiscoveryServer) *fiber.App { + + fiberCfg := fiber.Config{ + Views: renderEngine(), + // We disable the Fiber startup message as it does not conform to structured logging. + // We register a startup log line with connection information in the OnListen hook to keep things user friendly though + DisableStartupMessage: false, + // Override default error handler + } + + app := fiber.New(fiberCfg) + + routes.RegisterExplorerRoutes(app, db, discoveryServer) + + httpFS := http.FS(embedDirStatic) + + app.Use(favicon.New(favicon.Config{ + URL: "/favicon.ico", + FileSystem: httpFS, + File: "static/favicon.ico", + })) + + app.Use("/static", filesystem.New(filesystem.Config{ + Root: httpFS, + PathPrefix: "static", + Browse: true, + })) + + // Define a custom 404 handler + // Note: keep this at the bottom! + app.Use(notFoundHandler) + + return app +} diff --git a/core/http/routes/explorer.go b/core/http/routes/explorer.go new file mode 100644 index 00000000..b3c0d40b --- /dev/null +++ b/core/http/routes/explorer.go @@ -0,0 +1,13 @@ +package routes + +import ( + "github.com/gofiber/fiber/v2" + coreExplorer "github.com/mudler/LocalAI/core/explorer" + "github.com/mudler/LocalAI/core/http/endpoints/explorer" +) + +func RegisterExplorerRoutes(app *fiber.App, db *coreExplorer.Database, ds *coreExplorer.DiscoveryServer) { + app.Get("/", explorer.Dashboard()) + app.Post("/network/add", explorer.AddNetwork(db)) + app.Get("/networks", explorer.ShowNetworks(db, ds)) +} diff --git a/core/http/routes/ui.go b/core/http/routes/ui.go index 2996e9dc..0a9867fe 100644 --- a/core/http/routes/ui.go +++ b/core/http/routes/ui.go @@ -105,14 +105,14 @@ func RegisterUIRoutes(app *fiber.App, /* show nodes live! 
*/ app.Get("/p2p/ui/workers", auth, func(c *fiber.Ctx) error { - return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, "")))) + return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.WorkerID)))) }) app.Get("/p2p/ui/workers-federation", auth, func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeBoxes(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) app.Get("/p2p/ui/workers-stats", auth, func(c *fiber.Ctx) error { - return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, "")))) + return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.WorkerID)))) }) app.Get("/p2p/ui/workers-federation-stats", auth, func(c *fiber.Ctx) error { return c.SendString(elements.P2PNodeStats(p2p.GetAvailableNodes(p2p.NetworkID(appConfig.P2PNetworkID, p2p.FederatedID)))) }) diff --git a/core/http/views/explorer.html b/core/http/views/explorer.html new file mode 100644 index 00000000..91cb9720 --- /dev/null +++ b/core/http/views/explorer.html @@ -0,0 +1,342 @@
+ [342 lines of template markup: the "views/partials/head" and "views/partials/navbar_explorer" partials; a "Network Clusters Explorer" heading with the description "View the clusters and workers available in each network."; a warning panel reading "The explorer is a global, community-driven tool to share network tokens and view available clusters in the globe. Anyone can use the tokens to offload computation and use the clusters available or share resources. This is provided without any warranty. Use it at your own risk. We are not responsible for any potential harm or misuse. Sharing tokens globally allows anyone from the internet to use your instances. Although the community will address bugs, this is experimental software and may be insecure to deploy on your hardware unless you take all necessary precautions."; an "Add New Network" form posting its token, name, and description fields to /network/add; and the "views/partials/footer" partial.]
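The page above is a thin front-end for the explorer API registered in routes/explorer.go earlier in this series. Below is a minimal sketch of driving that API directly, assuming the explorer is listening on its CLI-default :8080 address; the paths and JSON field names come from RegisterExplorerRoutes and AddNetworkRequest above, while the client itself, the network name, and the token value are illustrative stand-ins.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Field names match the json tags on AddNetworkRequest.
	payload, err := json.Marshal(map[string]string{
		"token":       "bG9jYWxhaS1leGFtcGxlLXRva2Vu", // placeholder; AddNetwork rejects values that do not base64-decode
		"name":        "my-network",
		"description": "Example shared LocalAI p2p network",
	})
	if err != nil {
		panic(err)
	}

	// POST /network/add registers the token in the explorer database.
	resp, err := http.Post("http://localhost:8080/network/add", "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// GET /networks lists the registered networks that currently have workers.
	list, err := http.Get("http://localhost:8080/networks")
	if err != nil {
		panic(err)
	}
	defer list.Body.Close()

	body, _ := io.ReadAll(list.Body)
	fmt.Println(resp.Status, string(body))
}
```

Re-submitting the same token yields a 400 ("Token already exists"), matching the duplicate check in AddNetwork above.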
+ + + + diff --git a/core/http/views/partials/navbar_explorer.html b/core/http/views/partials/navbar_explorer.html new file mode 100644 index 00000000..ffc6c4d5 --- /dev/null +++ b/core/http/views/partials/navbar_explorer.html @@ -0,0 +1,39 @@ + + + diff --git a/core/p2p/node.go b/core/p2p/node.go index 6394498f..b89bb7c6 100644 --- a/core/p2p/node.go +++ b/core/p2p/node.go @@ -5,7 +5,10 @@ import ( "time" ) -const defaultServicesID = "services_localai" +const ( + defaultServicesID = "services" + WorkerID = "worker" +) type NodeData struct { Name string diff --git a/core/p2p/p2p.go b/core/p2p/p2p.go index 927f0e24..37b892d9 100644 --- a/core/p2p/p2p.go +++ b/core/p2p/p2p.go @@ -345,13 +345,16 @@ func newNodeOpts(token string) ([]node.Option, error) { // TODO: move this up, expose more config options when creating a node noDHT := os.Getenv("LOCALAI_P2P_DISABLE_DHT") == "true" - noLimits := os.Getenv("LOCALAI_P2P_DISABLE_LIMITS") == "true" + noLimits := os.Getenv("LOCALAI_P2P_ENABLE_LIMITS") == "true" - loglevel := "info" + loglevel := os.Getenv("LOCALAI_P2P_LOGLEVEL") + if loglevel == "" { + loglevel = "info" + } c := config.Config{ Limit: config.ResourceLimit{ - Enable: !noLimits, + Enable: noLimits, MaxConns: 100, }, NetworkToken: token, @@ -366,19 +369,19 @@ func newNodeOpts(token string) ([]node.Option, error) { Service: true, Map: true, RateLimit: true, - RateLimitGlobal: 10, - RateLimitPeer: 10, + RateLimitGlobal: 100, + RateLimitPeer: 100, RateLimitInterval: defaultInterval, }, Discovery: config.Discovery{ - DHT: noDHT, + DHT: !noDHT, MDNS: true, - Interval: 30 * time.Second, + Interval: 10 * time.Second, }, Connection: config.Connection{ HolePunch: true, AutoRelay: true, - MaxConnections: 100, + MaxConnections: 1000, }, } From 6d20f38510937a0740bb1e0b7337dd617cbd7be8 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Fri, 9 Aug 2024 20:08:24 +0000 Subject: [PATCH 0206/1851] chore(deps): Bump aiohttp from 3.9.5 to 3.10.2 in /examples/langchain/langchainpy-localai-example in the pip group (#3207) chore(deps): Bump aiohttp Bumps the pip group in /examples/langchain/langchainpy-localai-example with 1 update: [aiohttp](https://github.com/aio-libs/aiohttp). Updates `aiohttp` from 3.9.5 to 3.10.2 - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.9.5...v3.10.2) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production dependency-group: pip ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 1d1b5023..414a1b27 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -1,4 +1,4 @@ -aiohttp==3.9.5 +aiohttp==3.10.2 aiosignal==1.3.1 async-timeout==4.0.3 attrs==23.2.0 From 2e2a0dffbc4aae5ad1225278d98370a6ec898657 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Fri, 9 Aug 2024 22:36:10 +0200 Subject: [PATCH 0207/1851] fix(diffusers-hipblas): pin to rocm6.1 As per https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html Signed-off-by: Ettore Di Giacinto --- backend/python/diffusers/requirements-hipblas.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/python/diffusers/requirements-hipblas.txt b/backend/python/diffusers/requirements-hipblas.txt index b7890f6e..8c4c070c 100644 --- a/backend/python/diffusers/requirements-hipblas.txt +++ b/backend/python/diffusers/requirements-hipblas.txt @@ -1,4 +1,4 @@ ---extra-index-url https://download.pytorch.org/whl/nightly/ +--extra-index-url https://download.pytorch.org/whl/nightly/rocm6.1/ torch torchvision diffusers From 71b823207659bdfd30105673c45d125aedda0f81 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 10 Aug 2024 00:20:51 +0200 Subject: [PATCH 0208/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `b72942fac998672a79a1ae3c03b340f7e629980b` (#3208) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index d690e483..5ce38a82 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=3a14e00366399040a139c67dd5951177a8cb5695 +CPPLLAMA_VERSION?=b72942fac998672a79a1ae3c03b340f7e629980b # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From a0e0804f25bb74543c6d524fa14eb0d6ad57a31f Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 10 Aug 2024 00:35:22 +0200 Subject: [PATCH 0209/1851] chore: :arrow_up: Update ggerganov/whisper.cpp to `81c999fe0a25c4ebbfef10ed8a1a96df9cfc10fd` (#3209) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 5ce38a82..9d09b917 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=6eac06759b87b50132a01be019e9250a3ffc8969 +WHISPER_CPP_VERSION?=81c999fe0a25c4ebbfef10ed8a1a96df9cfc10fd # bert.cpp version 
BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 63ee689f2169fb36dffbdb741a18179233d2b556 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sat, 10 Aug 2024 01:02:22 +0200 Subject: [PATCH 0210/1851] chore(model-gallery): :arrow_up: update checksum (#3210) :arrow_up: Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- gallery/index.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/gallery/index.yaml b/gallery/index.yaml index 3119fae0..8daa39c6 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -314,11 +314,11 @@ - https://huggingface.co/mradermacher/calme-2.3-legalkit-8b-i1-GGUF - https://huggingface.co/MaziyarPanahi/calme-2.3-legalkit-8b description: | - This model is an advanced iteration of the powerful meta-llama/Meta-Llama-3.1-8B-Instruct, specifically fine-tuned to enhance its capabilities in the legal domain. The fine-tuning process utilized a synthetically generated dataset derived from the French LegalKit, a comprehensive legal language resource. + This model is an advanced iteration of the powerful meta-llama/Meta-Llama-3.1-8B-Instruct, specifically fine-tuned to enhance its capabilities in the legal domain. The fine-tuning process utilized a synthetically generated dataset derived from the French LegalKit, a comprehensive legal language resource. - To create this specialized dataset, I used the NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO model in conjunction with Hugging Face's Inference Endpoint. This approach allowed for the generation of high-quality, synthetic data that incorporates Chain of Thought (CoT) and advanced reasoning in its responses. + To create this specialized dataset, I used the NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO model in conjunction with Hugging Face's Inference Endpoint. This approach allowed for the generation of high-quality, synthetic data that incorporates Chain of Thought (CoT) and advanced reasoning in its responses. - The resulting model combines the robust foundation of Llama-3.1-8B with tailored legal knowledge and enhanced reasoning capabilities. This makes it particularly well-suited for tasks requiring in-depth legal analysis, interpretation, and application of French legal concepts. + The resulting model combines the robust foundation of Llama-3.1-8B with tailored legal knowledge and enhanced reasoning capabilities. This makes it particularly well-suited for tasks requiring in-depth legal analysis, interpretation, and application of French legal concepts. 
overrides: parameters: model: calme-2.3-legalkit-8b.i1-Q4_K_M.gguf @@ -4319,7 +4319,7 @@ files: - filename: "Phi-3-medium-4k-instruct-Q4_K_M.gguf" uri: "huggingface://bartowski/Phi-3-medium-4k-instruct-GGUF/Phi-3-medium-4k-instruct-Q4_K_M.gguf" - sha256: 4e8d4258ed44562573c8984a045b0a4651c51e7e4d9d00a06c65cd2149ab4539 + sha256: 6f05c97bc676dd1ec8d58e9a8795b4f5c809db771f6fc7bf48634c805face82c - !!merge <<: *phi-3 name: "cream-phi-3-14b-v1" icon: https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/AP4-OHepdqiqHj2KSi26M.gif From 0c0bc18c94e8c000c7d2d47cb8bdac8b4c73e5cc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 10 Aug 2024 10:10:47 +0200 Subject: [PATCH 0211/1851] fix(diffusers): pin torch and torchvision (#1592) Signed-off-by: Ettore Di Giacinto --- backend/python/diffusers/install.sh | 5 ----- backend/python/diffusers/requirements-hipblas.txt | 6 +++--- 2 files changed, 3 insertions(+), 8 deletions(-) diff --git a/backend/python/diffusers/install.sh b/backend/python/diffusers/install.sh index b0b46a86..36443ef1 100755 --- a/backend/python/diffusers/install.sh +++ b/backend/python/diffusers/install.sh @@ -11,9 +11,4 @@ if [ "x${BUILD_PROFILE}" == "xintel" ]; then EXTRA_PIP_INSTALL_FLAGS+=" --upgrade --index-strategy=unsafe-first-match" fi -# hipblas builds from nightly that needs pre-releases to work -if [ "x${BUILD_PROFILE}" == "xhipblas" ]; then - EXTRA_PIP_INSTALL_FLAGS+=" --prerelease=allow" -fi - installRequirements diff --git a/backend/python/diffusers/requirements-hipblas.txt b/backend/python/diffusers/requirements-hipblas.txt index 8c4c070c..fc9ea3b4 100644 --- a/backend/python/diffusers/requirements-hipblas.txt +++ b/backend/python/diffusers/requirements-hipblas.txt @@ -1,6 +1,6 @@ ---extra-index-url https://download.pytorch.org/whl/nightly/rocm6.1/ -torch -torchvision +--extra-index-url https://download.pytorch.org/whl/rocm6.0 +torch==2.3.1+rocm6.0 +torchvision==0.18.1+rocm6.0 diffusers opencv-python transformers From 8627bc2dd4371f3a7629bf0b24a024b1c001d3f3 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sat, 10 Aug 2024 20:50:57 +0200 Subject: [PATCH 0212/1851] feat(explorer): relax token deletion with error threshold (#3211) feat(explorer): relax token deletion with error threashold Signed-off-by: Ettore Di Giacinto --- core/cli/explorer.go | 9 ++++---- core/explorer/discovery.go | 43 ++++++++++++++++++++++++++++++-------- 2 files changed, 39 insertions(+), 13 deletions(-) diff --git a/core/cli/explorer.go b/core/cli/explorer.go index 0fcde728..f3e3618d 100644 --- a/core/cli/explorer.go +++ b/core/cli/explorer.go @@ -10,9 +10,10 @@ import ( ) type ExplorerCMD struct { - Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` - PoolDatabase string `env:"LOCALAI_POOL_DATABASE,POOL_DATABASE" default:"explorer.json" help:"Path to the pool database" group:"api"` - ConnectionTimeout string `env:"LOCALAI_CONNECTION_TIMEOUT,CONNECTION_TIMEOUT" default:"2m" help:"Connection timeout for the explorer" group:"api"` + Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` + PoolDatabase string `env:"LOCALAI_POOL_DATABASE,POOL_DATABASE" default:"explorer.json" help:"Path to the pool database" group:"api"` + ConnectionTimeout string `env:"LOCALAI_CONNECTION_TIMEOUT,CONNECTION_TIMEOUT" default:"2m" help:"Connection timeout for the explorer" group:"api"` + ConnectionErrorThreshold int 
`env:"LOCALAI_CONNECTION_ERROR_THRESHOLD,CONNECTION_ERROR_THRESHOLD" default:"3" help:"Connection failure threshold for the explorer" group:"api"` } func (e *ExplorerCMD) Run(ctx *cliContext.Context) error { @@ -26,7 +27,7 @@ func (e *ExplorerCMD) Run(ctx *cliContext.Context) error { if err != nil { return err } - ds := explorer.NewDiscoveryServer(db, dur) + ds := explorer.NewDiscoveryServer(db, dur, e.ConnectionErrorThreshold) go ds.Start(context.Background()) appHTTP := http.Explorer(db, ds) diff --git a/core/explorer/discovery.go b/core/explorer/discovery.go index 73281dc0..dc2b6e88 100644 --- a/core/explorer/discovery.go +++ b/core/explorer/discovery.go @@ -15,9 +15,11 @@ import ( type DiscoveryServer struct { sync.Mutex - database *Database - networkState *NetworkState - connectionTime time.Duration + database *Database + networkState *NetworkState + connectionTime time.Duration + failures map[string]int + errorThreshold int } type NetworkState struct { @@ -32,16 +34,20 @@ func (s *DiscoveryServer) NetworkState() *NetworkState { // NewDiscoveryServer creates a new DiscoveryServer with the given Database. // it keeps the db state in sync with the network state -func NewDiscoveryServer(db *Database, dur time.Duration) *DiscoveryServer { +func NewDiscoveryServer(db *Database, dur time.Duration, failureThreshold int) *DiscoveryServer { if dur == 0 { dur = 50 * time.Second } + if failureThreshold == 0 { + failureThreshold = 3 + } return &DiscoveryServer{ database: db, connectionTime: dur, networkState: &NetworkState{ Networks: map[string]Network{}, }, + errorThreshold: failureThreshold, } } @@ -66,21 +72,21 @@ func (s *DiscoveryServer) runBackground() { n, err := p2p.NewNode(token) if err != nil { log.Err(err).Msg("Failed to create node") - s.database.Delete(token) + s.failedToken(token) continue } err = n.Start(c) if err != nil { log.Err(err).Msg("Failed to start node") - s.database.Delete(token) + s.failedToken(token) continue } ledger, err := n.Ledger() if err != nil { log.Err(err).Msg("Failed to start ledger") - s.database.Delete(token) + s.failedToken(token) continue } @@ -114,8 +120,27 @@ func (s *DiscoveryServer) runBackground() { } s.Unlock() } else { - log.Info().Any("network", token).Msg("No workers found in the network. 
Removing it from the database") - s.database.Delete(token) + s.failedToken(token) + } + } + + s.deleteFailedConnections() +} + +func (s *DiscoveryServer) failedToken(token string) { + s.Lock() + defer s.Unlock() + s.failures[token]++ +} + +func (s *DiscoveryServer) deleteFailedConnections() { + s.Lock() + defer s.Unlock() + for k, v := range s.failures { + if v > s.errorThreshold { + log.Info().Any("network", k).Msg("Network has been removed from the database") + s.database.Delete(k) + delete(s.failures, k) } } } From f3357a17b8012049c8dd26ea6bec8096ef1cbe73 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 11 Aug 2024 00:16:51 +0200 Subject: [PATCH 0213/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `6e02327e8b7837358e0406bf90a4632e18e27846` (#3212) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 9d09b917..ef38a460 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=b72942fac998672a79a1ae3c03b340f7e629980b +CPPLLAMA_VERSION?=6e02327e8b7837358e0406bf90a4632e18e27846 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 7ba4a78fcc87db6cd5a029ee3f8e11b516d4e144 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 11 Aug 2024 00:59:58 +0200 Subject: [PATCH 0214/1851] fix(explorer): reset counter when network is active (#3213) Signed-off-by: Ettore Di Giacinto --- core/explorer/discovery.go | 1 + 1 file changed, 1 insertion(+) diff --git a/core/explorer/discovery.go b/core/explorer/discovery.go index dc2b6e88..5de4162f 100644 --- a/core/explorer/discovery.go +++ b/core/explorer/discovery.go @@ -118,6 +118,7 @@ func (s *DiscoveryServer) runBackground() { s.networkState.Networks[token] = Network{ Clusters: ledgerK, } + delete(s.failures, token) s.Unlock() } else { s.failedToken(token) From 74eaf024847b99a3edbc1fb90edb9d9234c4b3b8 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 11 Aug 2024 01:31:53 +0200 Subject: [PATCH 0215/1851] feat(diffusers): support flux models (#3129) * feat(diffusers): support flux models This adds support for FLUX models. 
For instance: https://huggingface.co/black-forest-labs/FLUX.1-dev Signed-off-by: Ettore Di Giacinto * feat(diffusers): support FluxTransformer2DModel Signed-off-by: Ettore Di Giacinto * Small fixups Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- backend/python/diffusers/backend.py | 39 +++++++++++++++++-- backend/python/diffusers/requirements-cpu.txt | 3 +- .../diffusers/requirements-cublas11.txt | 3 +- .../diffusers/requirements-cublas12.txt | 3 +- .../python/diffusers/requirements-hipblas.txt | 1 + .../python/diffusers/requirements-intel.txt | 3 +- 6 files changed, 45 insertions(+), 7 deletions(-) diff --git a/backend/python/diffusers/backend.py b/backend/python/diffusers/backend.py index a348d290..8f420848 100755 --- a/backend/python/diffusers/backend.py +++ b/backend/python/diffusers/backend.py @@ -18,13 +18,13 @@ import backend_pb2_grpc import grpc from diffusers import StableDiffusion3Pipeline, StableDiffusionXLPipeline, StableDiffusionDepth2ImgPipeline, DPMSolverMultistepScheduler, StableDiffusionPipeline, DiffusionPipeline, \ - EulerAncestralDiscreteScheduler + EulerAncestralDiscreteScheduler, FluxPipeline, FluxTransformer2DModel from diffusers import StableDiffusionImg2ImgPipeline, AutoPipelineForText2Image, ControlNetModel, StableVideoDiffusionPipeline from diffusers.pipelines.stable_diffusion import safety_checker from diffusers.utils import load_image, export_to_video from compel import Compel, ReturnedEmbeddingsType - -from transformers import CLIPTextModel +from optimum.quanto import freeze, qfloat8, quantize +from transformers import CLIPTextModel, T5EncoderModel from safetensors.torch import load_file _ONE_DAY_IN_SECONDS = 60 * 60 * 24 @@ -163,6 +163,8 @@ class BackendServicer(backend_pb2_grpc.BackendServicer): modelFile = request.Model self.cfg_scale = 7 + self.PipelineType = request.PipelineType + if request.CFGScale != 0: self.cfg_scale = request.CFGScale @@ -244,6 +246,30 @@ class BackendServicer(backend_pb2_grpc.BackendServicer): torch_dtype=torchType, use_safetensors=True, variant=variant) + elif request.PipelineType == "FluxPipeline": + self.pipe = FluxPipeline.from_pretrained( + request.Model, + torch_dtype=torch.bfloat16) + if request.LowVRAM: + self.pipe.enable_model_cpu_offload() + elif request.PipelineType == "FluxTransformer2DModel": + dtype = torch.bfloat16 + # specify from environment or default to "ChuckMcSneed/FLUX.1-dev" + bfl_repo = os.environ.get("BFL_REPO", "ChuckMcSneed/FLUX.1-dev") + + transformer = FluxTransformer2DModel.from_single_file(modelFile, torch_dtype=dtype) + quantize(transformer, weights=qfloat8) + freeze(transformer) + text_encoder_2 = T5EncoderModel.from_pretrained(bfl_repo, subfolder="text_encoder_2", torch_dtype=dtype) + quantize(text_encoder_2, weights=qfloat8) + freeze(text_encoder_2) + + self.pipe = FluxPipeline.from_pretrained(bfl_repo, transformer=None, text_encoder_2=None, torch_dtype=dtype) + self.pipe.transformer = transformer + self.pipe.text_encoder_2 = text_encoder_2 + + if request.LowVRAM: + self.pipe.enable_model_cpu_offload() if CLIPSKIP and request.CLIPSkip != 0: self.clip_skip = request.CLIPSkip @@ -399,6 +425,13 @@ class BackendServicer(backend_pb2_grpc.BackendServicer): request.seed ) + if self.PipelineType == "FluxPipeline": + kwargs["max_sequence_length"] = 256 + + if self.PipelineType == "FluxTransformer2DModel": + kwargs["output_type"] = "pil" + kwargs["generator"] = torch.Generator("cpu").manual_seed(0) + if self.img2vid: # Load the conditioning image image = 
load_image(request.src) diff --git a/backend/python/diffusers/requirements-cpu.txt b/backend/python/diffusers/requirements-cpu.txt index e46a53e5..235bb57e 100644 --- a/backend/python/diffusers/requirements-cpu.txt +++ b/backend/python/diffusers/requirements-cpu.txt @@ -5,4 +5,5 @@ accelerate compel peft sentencepiece -torch \ No newline at end of file +torch +optimum-quanto \ No newline at end of file diff --git a/backend/python/diffusers/requirements-cublas11.txt b/backend/python/diffusers/requirements-cublas11.txt index df28b821..40e718cb 100644 --- a/backend/python/diffusers/requirements-cublas11.txt +++ b/backend/python/diffusers/requirements-cublas11.txt @@ -6,4 +6,5 @@ transformers accelerate compel peft -sentencepiece \ No newline at end of file +sentencepiece +optimum-quanto \ No newline at end of file diff --git a/backend/python/diffusers/requirements-cublas12.txt b/backend/python/diffusers/requirements-cublas12.txt index b0685a62..3bcc5397 100644 --- a/backend/python/diffusers/requirements-cublas12.txt +++ b/backend/python/diffusers/requirements-cublas12.txt @@ -5,4 +5,5 @@ transformers accelerate compel peft -sentencepiece \ No newline at end of file +sentencepiece +optimum-quanto \ No newline at end of file diff --git a/backend/python/diffusers/requirements-hipblas.txt b/backend/python/diffusers/requirements-hipblas.txt index fc9ea3b4..17cf7249 100644 --- a/backend/python/diffusers/requirements-hipblas.txt +++ b/backend/python/diffusers/requirements-hipblas.txt @@ -8,3 +8,4 @@ accelerate compel peft sentencepiece +optimum-quanto \ No newline at end of file diff --git a/backend/python/diffusers/requirements-intel.txt b/backend/python/diffusers/requirements-intel.txt index 77f9e674..1cc2e2a2 100644 --- a/backend/python/diffusers/requirements-intel.txt +++ b/backend/python/diffusers/requirements-intel.txt @@ -10,4 +10,5 @@ transformers accelerate compel peft -sentencepiece \ No newline at end of file +sentencepiece +optimum-quanto \ No newline at end of file From 9f61ac8accf41fde4c52c27e02398a6563e99bdb Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 11 Aug 2024 10:19:02 +0200 Subject: [PATCH 0216/1851] models(gallery): add flux.1-dev and flux.1-schnell (#3215) Signed-off-by: Ettore Di Giacinto --- gallery/flux.yaml | 14 ++++++++++++++ gallery/index.yaml | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+) create mode 100644 gallery/flux.yaml diff --git a/gallery/flux.yaml b/gallery/flux.yaml new file mode 100644 index 00000000..bb75b53b --- /dev/null +++ b/gallery/flux.yaml @@ -0,0 +1,14 @@ +--- +name: "flux" + +config_file: | + backend: diffusers + f16: true + low_vram: true + step: 25 + + diffusers: + cuda: true + enable_parameters: num_inference_steps + pipeline_type: FluxPipeline + cfg_scale: 0 diff --git a/gallery/index.yaml b/gallery/index.yaml index 8daa39c6..cca968bf 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -4942,6 +4942,44 @@ - sd-3 - gpu url: "github:mudler/LocalAI/gallery/stablediffusion3.yaml@master" +- &flux + name: flux.1-dev + license: flux-1-dev-non-commercial-license + description: | + FLUX.1 [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post. + Key Features + Cutting-edge output quality, second only to our state-of-the-art model FLUX.1 [pro]. + Competitive prompt following, matching the performance of closed source alternatives . 
+ Trained using guidance distillation, making FLUX.1 [dev] more efficient. + Open weights to drive new scientific research, and empower artists to develop innovative workflows. + Generated outputs can be used for personal, scientific, and commercial purposes as described in the flux-1-dev-non-commercial-license. + urls: + - https://huggingface.co/black-forest-labs/FLUX.1-dev + tags: + - text-to-image + - flux + - python + - gpu + url: "github:mudler/LocalAI/gallery/flux.yaml@master" + overrides: + parameters: + model: ChuckMcSneed/FLUX.1-dev +- !!merge <<: *flux + name: flux.1-schnell + license: apache-2 + icon: https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/schnell_grid.jpeg + description: | + FLUX.1 [schnell] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post. + Key Features + + Cutting-edge output quality and competitive prompt following, matching the performance of closed source alternatives. + Trained using latent adversarial diffusion distillation, FLUX.1 [schnell] can generate high-quality images in only 1 to 4 steps. + Released under the apache-2.0 licence, the model can be used for personal, scientific, and commercial purposes. + urls: + - https://huggingface.co/black-forest-labs/FLUX.1-schnell + overrides: + parameters: + model: black-forest-labs/FLUX.1-schnell - &whisper ## Whisper url: "github:mudler/LocalAI/gallery/whisper-base.yaml@master" From c4534cd90800463b83d3231be184e0f06c3bdcb6 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 11 Aug 2024 10:46:17 +0200 Subject: [PATCH 0217/1851] chore(deps): update edgevpn (#3214) * chore(deps): update edgevpn Signed-off-by: Ettore Di Giacinto * fix: initialize failure map Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- core/explorer/discovery.go | 9 +++++---- go.mod | 35 +++++++++++++++++++------------- go.sum | 41 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+), 18 deletions(-) diff --git a/core/explorer/discovery.go b/core/explorer/discovery.go index 5de4162f..6a29442f 100644 --- a/core/explorer/discovery.go +++ b/core/explorer/discovery.go @@ -15,10 +15,10 @@ import ( type DiscoveryServer struct { sync.Mutex - database *Database - networkState *NetworkState - connectionTime time.Duration - failures map[string]int + database *Database + networkState *NetworkState + connectionTime time.Duration + failures map[string]int errorThreshold int } @@ -48,6 +48,7 @@ func NewDiscoveryServer(db *Database, dur time.Duration, failureThreshold int) * Networks: map[string]Network{}, }, errorThreshold: failureThreshold, + failures: make(map[string]int), } } diff --git a/go.mod b/go.mod index fad40e01..b35db1b1 100644 --- a/go.mod +++ b/go.mod @@ -29,15 +29,15 @@ require ( github.com/jaypipes/ghw v0.12.0 github.com/joho/godotenv v1.5.1 github.com/klauspost/cpuid/v2 v2.2.8 - github.com/libp2p/go-libp2p v0.35.2 + github.com/libp2p/go-libp2p v0.35.4 github.com/mholt/archiver/v3 v3.5.1 github.com/microcosm-cc/bluemonday v1.0.26 - github.com/mudler/edgevpn v0.26.2 + github.com/mudler/edgevpn v0.27.0 github.com/mudler/go-processmanager v0.0.0-20230818213616-f204007f963c github.com/mudler/go-stable-diffusion v0.0.0-20240429204715-4a3cd6aeae6f github.com/nomic-ai/gpt4all/gpt4all-bindings/golang v0.0.0-20240606155928-41c9013fa46a - github.com/onsi/ginkgo/v2 v2.19.0 - github.com/onsi/gomega v1.33.1 + github.com/onsi/ginkgo/v2 v2.20.0 + github.com/onsi/gomega v1.34.1 
github.com/ory/dockertest/v3 v3.10.0 github.com/otiai10/openaigo v1.7.0 github.com/phayes/freeport v0.0.0-20220201140144-74d24b5ae9f5 @@ -64,8 +64,11 @@ require ( ) require ( + github.com/cpuguy83/go-md2man/v2 v2.0.4 // indirect github.com/go-task/slim-sprig/v3 v3.0.0 // indirect github.com/go-viper/mapstructure/v2 v2.0.0 // indirect + github.com/labstack/echo/v4 v4.12.0 // indirect + github.com/labstack/gommon v0.4.2 // indirect github.com/moby/docker-image-spec v1.3.1 // indirect github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect github.com/pion/datachannel v1.5.6 // indirect @@ -84,6 +87,10 @@ require ( github.com/pion/transport/v2 v2.2.5 // indirect github.com/pion/turn/v2 v2.1.6 // indirect github.com/pion/webrtc/v3 v3.2.40 // indirect + github.com/russross/blackfriday/v2 v2.1.0 // indirect + github.com/urfave/cli/v2 v2.27.3 // indirect + github.com/valyala/fasttemplate v1.2.2 // indirect + github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 // indirect go.uber.org/mock v0.4.0 // indirect ) @@ -146,7 +153,7 @@ require ( github.com/google/btree v1.1.2 // indirect github.com/google/go-cmp v0.6.0 // indirect github.com/google/gopacket v1.1.19 // indirect - github.com/google/pprof v0.0.0-20240424215950-a892ee059fd6 // indirect + github.com/google/pprof v0.0.0-20240727154555-813a5fbdbec8 // indirect github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 // indirect github.com/gorilla/css v1.0.1 // indirect github.com/gorilla/websocket v1.5.3 // indirect @@ -274,15 +281,15 @@ require ( go.uber.org/fx v1.22.1 // indirect go.uber.org/multierr v1.11.0 // indirect go.uber.org/zap v1.27.0 // indirect - golang.org/x/crypto v0.24.0 // indirect - golang.org/x/exp v0.0.0-20240506185415-9bf2ced13842 // indirect - golang.org/x/mod v0.18.0 // indirect - golang.org/x/net v0.26.0 // indirect - golang.org/x/sync v0.7.0 // indirect - golang.org/x/sys v0.22.0 // indirect - golang.org/x/term v0.21.0 // indirect - golang.org/x/text v0.16.0 // indirect - golang.org/x/tools v0.22.0 // indirect + golang.org/x/crypto v0.26.0 // indirect + golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 // indirect + golang.org/x/mod v0.20.0 // indirect + golang.org/x/net v0.28.0 // indirect + golang.org/x/sync v0.8.0 // indirect + golang.org/x/sys v0.24.0 // indirect + golang.org/x/term v0.23.0 // indirect + golang.org/x/text v0.17.0 // indirect + golang.org/x/tools v0.24.0 // indirect golang.zx2c4.com/wintun v0.0.0-20211104114900-415007cec224 // indirect golang.zx2c4.com/wireguard v0.0.0-20220703234212-c31a7b1ab478 // indirect golang.zx2c4.com/wireguard/windows v0.5.3 // indirect diff --git a/go.sum b/go.sum index 84dd09e6..5c035169 100644 --- a/go.sum +++ b/go.sum @@ -90,6 +90,8 @@ github.com/coreos/go-systemd/v22 v22.5.0/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSV github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= github.com/cpuguy83/go-md2man/v2 v2.0.2/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o= +github.com/cpuguy83/go-md2man/v2 v2.0.4 h1:wfIWP927BUkWJb2NmU/kNDYIBTh/ziUX91+lVfRxZq4= +github.com/cpuguy83/go-md2man/v2 v2.0.4/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o= github.com/creachadair/mds v0.7.0 h1:7QoYqiPl18C0h7CLq9z9/qUH5Vr62V9677yJZHGLoQM= github.com/creachadair/mds v0.7.0/go.mod h1:4vrFYUzTXMJpMBU+OA292I6IUxKWCCfZkgXg+/kBZMo= github.com/creachadair/otp v0.4.2 
h1:ngNMaD6Tzd7UUNRFyed7ykZFn/Wr5sSs5ffqZWm9pu8= @@ -254,6 +256,8 @@ github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXi github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57/go.mod h1:zfwlbNMJ+OItoe0UupaVj+oy1omPYYDuagoSzA8v9mc= github.com/google/pprof v0.0.0-20240424215950-a892ee059fd6 h1:k7nVchz72niMH6YLQNvHSdIE7iqsQxK1P41mySCvssg= github.com/google/pprof v0.0.0-20240424215950-a892ee059fd6/go.mod h1:kf6iHlnVGwgKolg33glAes7Yg/8iWP8ukqeldJSO7jw= +github.com/google/pprof v0.0.0-20240727154555-813a5fbdbec8 h1:FKHo8hFI3A+7w0aUQuYXQ+6EN5stWmeY/AZqtM8xk9k= +github.com/google/pprof v0.0.0-20240727154555-813a5fbdbec8/go.mod h1:K1liHPHnj73Fdn/EKuT8nrFqBihUSKXoLYU0BuatOYo= github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI= github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 h1:El6M4kTTCOh6aBiKaUGG7oYTSPP8MxqL4YI3kZKwcP4= github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510/go.mod h1:pupxD2MaaD3pAXIBCelhxNneeOaAeabZDe5s4K6zSpQ= @@ -357,6 +361,10 @@ github.com/kr/pty v1.1.3/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY= github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= +github.com/labstack/echo/v4 v4.12.0 h1:IKpw49IMryVB2p1a4dzwlhP1O2Tf2E0Ir/450lH+kI0= +github.com/labstack/echo/v4 v4.12.0/go.mod h1:UP9Cr2DJXbOK3Kr9ONYzNowSh7HP0aG0ShAyycHSJvM= +github.com/labstack/gommon v0.4.2 h1:F8qTUNXgG1+6WQmqoUWnz8WiEU60mXVVw0P4ht1WRA0= +github.com/labstack/gommon v0.4.2/go.mod h1:QlUFxVM+SNXhDL/Z7YhocGIBYOiwB0mXm1+1bAPHPyU= github.com/lib/pq v0.0.0-20180327071824-d34b9ff171c2 h1:hRGSmZu7j271trc9sneMrpOW7GN5ngLm8YUZIPzf394= github.com/lib/pq v0.0.0-20180327071824-d34b9ff171c2/go.mod h1:5WUZQaWbwv1U+lTReE5YruASi9Al49XbQIvNi/34Woo= github.com/libp2p/go-buffer-pool v0.1.0 h1:oK4mSFcQz7cTQIfqbe4MIj9gLW+mnanjyFtc6cdF0Y8= @@ -367,6 +375,8 @@ github.com/libp2p/go-flow-metrics v0.1.0 h1:0iPhMI8PskQwzh57jB9WxIuIOQ0r+15PChFG github.com/libp2p/go-flow-metrics v0.1.0/go.mod h1:4Xi8MX8wj5aWNDAZttg6UPmc0ZrnFNsMtpsYUClFtro= github.com/libp2p/go-libp2p v0.35.2 h1:287oHbuplkrLdAF+syB0n/qDgd50AUBtEODqS0e0HDs= github.com/libp2p/go-libp2p v0.35.2/go.mod h1:RKCDNt30IkFipGL0tl8wQW/3zVWEGFUZo8g2gAKxwjU= +github.com/libp2p/go-libp2p v0.35.4 h1:FDiBUYLkueFwsuNJUZaxKRdpKvBOWU64qQPL768bSeg= +github.com/libp2p/go-libp2p v0.35.4/go.mod h1:RKCDNt30IkFipGL0tl8wQW/3zVWEGFUZo8g2gAKxwjU= github.com/libp2p/go-libp2p-asn-util v0.4.1 h1:xqL7++IKD9TBFMgnLPZR6/6iYhawHKHl950SO9L6n94= github.com/libp2p/go-libp2p-asn-util v0.4.1/go.mod h1:d/NI6XZ9qxw67b4e+NgpQexCIiFYJjErASrYW4PFDN8= github.com/libp2p/go-libp2p-kad-dht v0.25.2 h1:FOIk9gHoe4YRWXTu8SY9Z1d0RILol0TrtApsMDPjAVQ= @@ -459,6 +469,8 @@ github.com/mr-tron/base58 v1.2.0 h1:T/HDJBh4ZCPbU39/+c3rRvE0uKBQlU27+QI8LJ4t64o= github.com/mr-tron/base58 v1.2.0/go.mod h1:BinMc/sQntlIE1frQmRFPUoPA1Zkr8VRgBdjWI2mNwc= github.com/mudler/edgevpn v0.26.2 h1:OK4jfk7sYjuU7vCh+geUJk38lsxRgMk+EdsS9s0hioE= github.com/mudler/edgevpn v0.26.2/go.mod h1:lplntB9N6LzGNqeSM3XHCq8kyDPsNhY3jqEbWGD2WaQ= +github.com/mudler/edgevpn v0.27.0 h1:FnBVzPs098DTgbUkiwm22n30hmEVBAq+PVpXanqx6qo= +github.com/mudler/edgevpn v0.27.0/go.mod h1:Hwvr+i+dePgn/Yh+EMMvqcw9ByUCLAWD9TgYtJYV95Y= github.com/mudler/go-piper v0.0.0-20240315144837-9d0100873a7d h1:8udOFrDf/I83JL0/u22j6U6Q9z9LoSdby2a/DWdd0/s= github.com/mudler/go-piper v0.0.0-20240315144837-9d0100873a7d/go.mod 
h1:O7SwdSWMilAWhBZMK9N9Y/oBDyMMzshE3ju8Xkexwig= github.com/mudler/go-processmanager v0.0.0-20230818213616-f204007f963c h1:CI5uGwqBpN8N7BrSKC+nmdfw+9nPQIDyjHHlaIiitZI= @@ -516,11 +528,14 @@ github.com/onsi/ginkgo v1.16.5 h1:8xi0RTUf59SOSfEtZMvwTvXYMzG4gV23XVHOZiXNtnE= github.com/onsi/ginkgo v1.16.5/go.mod h1:+E8gABHa3K6zRBolWtd+ROzc/U5bkGt0FwiG042wbpU= github.com/onsi/ginkgo/v2 v2.19.0 h1:9Cnnf7UHo57Hy3k6/m5k3dRfGTMXGvxhHFvkDTCTpvA= github.com/onsi/ginkgo/v2 v2.19.0/go.mod h1:rlwLi9PilAFJ8jCg9UE1QP6VBpd6/xj3SRC0d6TU0To= +github.com/onsi/ginkgo/v2 v2.20.0 h1:PE84V2mHqoT1sglvHc8ZdQtPcwmvvt29WLEEO3xmdZw= +github.com/onsi/ginkgo/v2 v2.20.0/go.mod h1:lG9ey2Z29hR41WMVthyJBGUBcBhGOtoPF2VFMvBXFCI= github.com/onsi/gomega v1.7.1/go.mod h1:XdKZgCCFLUoM/7CFJVPcG8C1xQ1AJ0vpAezJrB7JYyY= github.com/onsi/gomega v1.10.1/go.mod h1:iN09h71vgCQne3DLsj+A5owkum+a2tYe+TOCB1ybHNo= github.com/onsi/gomega v1.16.0/go.mod h1:HnhC7FXeEQY45zxNK3PPoIUhzk/80Xly9PcubAlGdZY= github.com/onsi/gomega v1.33.1 h1:dsYjIxxSR755MDmKVsaFQTE22ChNBcuuTWgkUDSubOk= github.com/onsi/gomega v1.33.1/go.mod h1:U4R44UsT+9eLIaYRB2a5qajjtQYn0hauxvRm16AVYg0= +github.com/onsi/gomega v1.34.1/go.mod h1:kU1QgUvBDLXBJq618Xvm2LUX6rSAfRaFRTcdOeDLwwY= github.com/opencontainers/go-digest v1.0.0 h1:apOUWs51W5PlhuyGyz9FCeeBIOUDA/6nW8Oi/yOhh5U= github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3IKzErnv2BNG4W4MAM= github.com/opencontainers/image-spec v1.1.0 h1:8SG7/vwALn54lVB/0yZ/MMwhFrPYtpEHQb2IpWsCzug= @@ -639,6 +654,7 @@ github.com/russross/blackfriday v1.5.2/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR github.com/russross/blackfriday v1.6.0 h1:KqfZb0pUVN2lYqZUYRddxF4OR8ZMURnJIG5Y3VRLtww= github.com/russross/blackfriday v1.6.0/go.mod h1:ti0ldHuxg49ri4ksnFxlkCfN+hvslNlmVHqNRXXJNAY= github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= +github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk= github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= github.com/sashabaranov/go-openai v1.26.2 h1:cVlQa3gn3eYqNXRW03pPlpy6zLG52EU4g0FrWXc0EFI= github.com/sashabaranov/go-openai v1.26.2/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg= @@ -736,11 +752,16 @@ github.com/ulikunitz/xz v0.5.9 h1:RsKRIA2MO8x56wkkcd3LbtcE/uMszhb6DpRf+3uwa3I= github.com/ulikunitz/xz v0.5.9/go.mod h1:nbz6k7qbPmH4IRqmfOplQw/tblSgqTqBwxkY0oWt/14= github.com/urfave/cli v1.22.2/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0= github.com/urfave/cli v1.22.10/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0= +github.com/urfave/cli v1.22.12 h1:igJgVw1JdKH+trcLWLeLwZjU9fEfPesQ+9/e4MQ44S8= github.com/urfave/cli v1.22.12/go.mod h1:sSBEIC79qR6OvcmsD4U3KABeOTxDqQtdDnaFuUN30b8= +github.com/urfave/cli/v2 v2.27.3 h1:/POWahRmdh7uztQ3CYnaDddk0Rm90PyOgIxgW2rr41M= +github.com/urfave/cli/v2 v2.27.3/go.mod h1:m4QzxcD2qpra4z7WhzEGn74WZLViBnMpb1ToCAKdGRQ= github.com/valyala/bytebufferpool v1.0.0 h1:GqA5TC/0021Y/b9FG4Oi9Mr3q7XYx6KllzawFIhcdPw= github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyCJ6HpOuEn7z0Csc= github.com/valyala/fasthttp v1.55.0 h1:Zkefzgt6a7+bVKHnu/YaYSOPfNYNisSVBo/unVCf8k8= github.com/valyala/fasthttp v1.55.0/go.mod h1:NkY9JtkrpPKmgwV3HTaS2HWaJss9RSIsRVfcxxoHiOM= +github.com/valyala/fasttemplate v1.2.2 h1:lxLXG0uE3Qnshl9QyaK6XJxMXlQZELvChBOCmQD0Loo= +github.com/valyala/fasttemplate v1.2.2/go.mod h1:KHLXt3tVN2HBp8eijSv/kGJopbvo7S+qRAEEKiv+SiQ= github.com/valyala/tcplisten v1.0.0 
h1:rBHj/Xf+E1tRGZyWIWwJDiRY0zc1Js+CV5DqwacVSA8= github.com/valyala/tcplisten v1.0.0/go.mod h1:T0xQ8SeCZGxckz9qRXTfG43PvQ/mcWh7FwZEA7Ioqkc= github.com/vbatts/tar-split v0.11.3 h1:hLFqsOLQ1SsppQNTMpkpPXClLDfC2A3Zgy9OUU+RVck= @@ -765,6 +786,8 @@ github.com/xeipuuv/gojsonschema v1.2.0 h1:LhYJRs+L4fBtjZUfuSZIKGeVu0QRy8e5Xi7D17 github.com/xeipuuv/gojsonschema v1.2.0/go.mod h1:anYRn/JVcOK2ZgGU+IjEV4nwlhoK5sQluxsYJ78Id3Y= github.com/xi2/xz v0.0.0-20171230120015-48954b6210f8 h1:nIPpBwaJSVYIxUFsDv3M8ofmx9yWTog9BfvIu0q41lo= github.com/xi2/xz v0.0.0-20171230120015-48954b6210f8/go.mod h1:HUYIGzjTL3rfEspMxjDjgmT5uz5wzYJKVo23qUhYTos= +github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 h1:gEOO8jv9F4OT7lGCjxCBTO/36wtF6j2nSip77qHd4x4= +github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1/go.mod h1:Ohn+xnUBiLI6FVj/9LpzZWtj1/D6lUovWYBkxHVV3aM= github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74= github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74= github.com/yuin/goldmark v1.3.5/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k= @@ -833,9 +856,13 @@ golang.org/x/crypto v0.19.0/go.mod h1:Iy9bg/ha4yyC70EfRS8jz+B6ybOBKMaSxLj6P6oBDf golang.org/x/crypto v0.21.0/go.mod h1:0BP7YvVV9gBbVKyeTG0Gyn+gZm94bibOW5BjDEYAOMs= golang.org/x/crypto v0.24.0 h1:mnl8DM0o513X8fdIkmyFE/5hTYxbwYOjDS/+rK6qpRI= golang.org/x/crypto v0.24.0/go.mod h1:Z1PMYSOR5nyMcyAVAIQSKCDwalqy85Aqn1x3Ws4L5DM= +golang.org/x/crypto v0.26.0 h1:RrRspgV4mU+YwB4FYnuBoKsUapNIL5cohGAmSH3azsw= +golang.org/x/crypto v0.26.0/go.mod h1:GY7jblb9wI+FOo5y8/S2oY4zWP07AkOJ4+jxCqdqn54= golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= golang.org/x/exp v0.0.0-20240506185415-9bf2ced13842 h1:vr/HnozRka3pE4EsMEg1lgkXJkTFJCVUX+S/ZT6wYzM= golang.org/x/exp v0.0.0-20240506185415-9bf2ced13842/go.mod h1:XtvwrStGgqGPLc4cjQfWqZHG1YFdYs6swckp8vpsjnc= +golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 h1:2dVuKD2vS7b0QIHQbpyTISPd0LeHDbnYEryqj5Q1ug8= +golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56/go.mod h1:M4RDyNAINzryxdtnbRXRL/OHtkFuWGRjvuhBJpk2IlY= golang.org/x/lint v0.0.0-20180702182130-06c8688daad7/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= golang.org/x/lint v0.0.0-20190227174305-5b3e6a55c961/go.mod h1:wehouNa3lNwaWXcvxsM5YxQ5yQlVC4a0KAMCusXpPoU= @@ -852,6 +879,8 @@ golang.org/x/mod v0.7.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= golang.org/x/mod v0.18.0 h1:5+9lSbEzPSdWkH32vYPBwEpX8KwDbM52Ud9xBUvNlb0= golang.org/x/mod v0.18.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c= +golang.org/x/mod v0.20.0 h1:utOm6MM3R3dnawAiJgn0y+xvuYRsm1RKM/4giyfDgV0= +golang.org/x/mod v0.20.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c= golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= @@ -885,6 +914,8 @@ golang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44= golang.org/x/net v0.22.0/go.mod h1:JKghWKKOSdJwpW2GEx0Ja7fmaKnMsbu+MWVZTokSYmg= golang.org/x/net v0.26.0 h1:soB7SVo0PWrY4vPW/+ay0jKDNScG2X9wFeYlXIvJsOQ= golang.org/x/net 
v0.26.0/go.mod h1:5YKkiSynbBIh3p6iOc/vibscux0x38BZDkn8sCUPxHE= +golang.org/x/net v0.28.0 h1:a9JDOJc5GMUJ0+UDqmLT86WiEy7iWyIhz8gz8E4e5hE= +golang.org/x/net v0.28.0/go.mod h1:yqtgsTWOOnlGLG9GFRrK3++bGOUEkNBoHZc8MEDWPNg= golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= golang.org/x/oauth2 v0.0.0-20181017192945-9dcd33a902f4/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= golang.org/x/oauth2 v0.0.0-20181203162652-d668ce993890/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= @@ -902,6 +933,8 @@ golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJ golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.7.0 h1:YsImfSBoP9QPYL0xyKJPq0gcaJdG3rInoqxTWbfQu9M= golang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= +golang.org/x/sync v0.8.0 h1:3NFvSEYkUoMifnESzZl15y791HH1qU2xm6eCJU5ZPXQ= +golang.org/x/sync v0.8.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= golang.org/x/sys v0.0.0-20180810173357-98c5dad5d1a0/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= @@ -952,6 +985,8 @@ golang.org/x/sys v0.18.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= golang.org/x/sys v0.20.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= golang.org/x/sys v0.22.0 h1:RI27ohtqKCnwULzJLqkv897zojh5/DwS/ENaMzUOaWI= golang.org/x/sys v0.22.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= +golang.org/x/sys v0.24.0 h1:Twjiwq9dn6R1fQcyiK+wQyHWfaz/BJB+YIpzU/Cv3Xg= +golang.org/x/sys v0.24.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8= golang.org/x/term v0.2.0/go.mod h1:TVmDHMZPmdnySmBfhjOoOdhjzdE1h4u1VwSiw2l1Nuc= @@ -967,6 +1002,8 @@ golang.org/x/term v0.18.0/go.mod h1:ILwASektA3OnRv7amZ1xhE/KTR+u50pbXfZ03+6Nx58= golang.org/x/term v0.20.0/go.mod h1:8UkIAJTvZgivsXaD6/pH6U9ecQzZ45awqEOzuCvwpFY= golang.org/x/term v0.21.0 h1:WVXCp+/EBEHOj53Rvu+7KiT/iElMrO8ACK16SMZ3jaA= golang.org/x/term v0.21.0/go.mod h1:ooXLefLobQVslOqselCNF4SxFAaoS6KujMbsGzSDmX0= +golang.org/x/term v0.23.0 h1:F6D4vR+EHoL9/sWAWgAR1H2DcHr4PareCbAaCo1RpuU= +golang.org/x/term v0.23.0/go.mod h1:DgV24QBUrK6jhZXl+20l6UWznPlwAHm1Q1mGHtydmSk= golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= @@ -981,6 +1018,8 @@ golang.org/x/text v0.12.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE= golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU= golang.org/x/text v0.16.0 h1:a94ExnEXNtEwYLGJSIUxnWoxoRz/ZcCsV63ROupILh4= golang.org/x/text v0.16.0/go.mod h1:GhwF1Be+LQoKShO3cGOHzqOgRrGaYc9AvblQOmPVHnI= +golang.org/x/text v0.17.0 h1:XtiM5bkSOt+ewxlOE/aE/AKEHibwj/6gvWMl9Rsh0Qc= +golang.org/x/text v0.17.0/go.mod h1:BuEKDfySbSR4drPmRPG/7iBdf8hvFMuRexcpahXilzY= golang.org/x/time v0.0.0-20180412165947-fbb02b2291d2/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= golang.org/x/time 
v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= golang.org/x/time v0.5.0 h1:o7cqy6amK/52YcAKIPlM3a+Fpj35zvRj2TP+e1xFSfk= @@ -1008,6 +1047,8 @@ golang.org/x/tools v0.4.0/go.mod h1:UE5sM2OK9E/d67R0ANs2xJizIymRP5gJU295PvKXxjQ= golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU= golang.org/x/tools v0.22.0 h1:gqSGLZqv+AI9lIQzniJ0nZDRG5GBPsSi+DRNHWNz6yA= golang.org/x/tools v0.22.0/go.mod h1:aCwcsjqvq7Yqt6TNyX7QMU2enbQ/Gt0bo6krSeEri+c= +golang.org/x/tools v0.24.0 h1:J1shsA93PJUEVaUSaay7UXAyE8aimq3GW0pjlolpa24= +golang.org/x/tools v0.24.0/go.mod h1:YhNqVBIfWHdzvTLs0d8LCuMhkKUgSUKldakyV7W/WDQ= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= From a92b3b13e9f68c41dc64cdaac858c49921cc3422 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Sun, 11 Aug 2024 11:22:00 +0200 Subject: [PATCH 0218/1851] chore: fix gosum missing entry --- go.sum | 1 + 1 file changed, 1 insertion(+) diff --git a/go.sum b/go.sum index 5c035169..47fd4c06 100644 --- a/go.sum +++ b/go.sum @@ -535,6 +535,7 @@ github.com/onsi/gomega v1.10.1/go.mod h1:iN09h71vgCQne3DLsj+A5owkum+a2tYe+TOCB1y github.com/onsi/gomega v1.16.0/go.mod h1:HnhC7FXeEQY45zxNK3PPoIUhzk/80Xly9PcubAlGdZY= github.com/onsi/gomega v1.33.1 h1:dsYjIxxSR755MDmKVsaFQTE22ChNBcuuTWgkUDSubOk= github.com/onsi/gomega v1.33.1/go.mod h1:U4R44UsT+9eLIaYRB2a5qajjtQYn0hauxvRm16AVYg0= +github.com/onsi/gomega v1.34.1 h1:EUMJIKUjM8sKjYbtxQI9A4z2o+rruxnzNvpknOXie6k= github.com/onsi/gomega v1.34.1/go.mod h1:kU1QgUvBDLXBJq618Xvm2LUX6rSAfRaFRTcdOeDLwwY= github.com/opencontainers/go-digest v1.0.0 h1:apOUWs51W5PlhuyGyz9FCeeBIOUDA/6nW8Oi/yOhh5U= github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3IKzErnv2BNG4W4MAM= From e30114a4a42aeb55a1114707b313819b85d60a11 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Sun, 11 Aug 2024 23:46:30 +0200 Subject: [PATCH 0219/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `4134999e01f31256b15342b41c4de9e2477c4a6c` (#3218) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index ef38a460..b5b2a435 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=6e02327e8b7837358e0406bf90a4632e18e27846 +CPPLLAMA_VERSION?=4134999e01f31256b15342b41c4de9e2477c4a6c # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 7137c32f8f2eeba0eb101473f2700cbb76b37b46 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 12 Aug 2024 09:56:31 +0200 Subject: [PATCH 0220/1851] models(gallery): add infinity-instruct-7m-gen-llama3_1-70b (#3220) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index cca968bf..bc23d1b6 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -488,6 +488,20 @@ - filename: 
Kumiho-v1-rp-UwU-8B-gguf-q4_k_m.gguf sha256: a1deb46675418277cf785a406cd1508fec556ff6e4d45d2231eb2a82986d52d0 uri: huggingface://juvi21/Kumiho-v1-rp-UwU-8B-GGUF/Kumiho-v1-rp-UwU-8B-gguf-q4_k_m.gguf +- !!merge <<: *llama31 + name: "infinity-instruct-7m-gen-llama3_1-70b" + icon: https://huggingface.co/BAAI/Infinity-Instruct-7M-Gen-Llama3_1-70B/resolve/main/fig/Bk3NbjnJko51MTx1ZCScT2sqnGg.png + urls: + - https://huggingface.co/mradermacher/Infinity-Instruct-7M-Gen-Llama3_1-70B-GGUF + description: | + Infinity-Instruct-7M-Gen-Llama3.1-70B is an opensource supervised instruction tuning model without reinforcement learning from human feedback (RLHF). This model is just finetuned on Infinity-Instruct-7M and Infinity-Instruct-Gen and showing favorable results on AlpacaEval 2.0 and arena-hard compared to GPT4. + overrides: + parameters: + model: Infinity-Instruct-7M-Gen-Llama3_1-70B.Q4_K_M.gguf + files: + - filename: Infinity-Instruct-7M-Gen-Llama3_1-70B.Q4_K_M.gguf + sha256: f4379ab4d7140da0510886073375ca820ea9ac4ad9d3c20e17ed05156bd29697 + uri: huggingface://mradermacher/Infinity-Instruct-7M-Gen-Llama3_1-70B-GGUF/Infinity-Instruct-7M-Gen-Llama3_1-70B.Q4_K_M.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 4dfa0853392bb1bb2eda86743265d5d3754536a7 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 12 Aug 2024 09:59:17 +0200 Subject: [PATCH 0221/1851] models(gallery): add cathallama-70b (#3221) Signed-off-by: Ettore Di Giacinto --- gallery/index.yaml | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/gallery/index.yaml b/gallery/index.yaml index bc23d1b6..eb7515ba 100644 --- a/gallery/index.yaml +++ b/gallery/index.yaml @@ -502,6 +502,33 @@ - filename: Infinity-Instruct-7M-Gen-Llama3_1-70B.Q4_K_M.gguf sha256: f4379ab4d7140da0510886073375ca820ea9ac4ad9d3c20e17ed05156bd29697 uri: huggingface://mradermacher/Infinity-Instruct-7M-Gen-Llama3_1-70B-GGUF/Infinity-Instruct-7M-Gen-Llama3_1-70B.Q4_K_M.gguf +- !!merge <<: *llama31 + name: "cathallama-70b" + icon: https://cdn-uploads.huggingface.co/production/uploads/649dc85249ae3a68334adcc6/KxaiZ7rDKkYlix99O9j5H.png + urls: + - https://huggingface.co/gbueno86/Cathallama-70B + - https://huggingface.co/mradermacher/Cathallama-70B-GGUF + description: | + Notable Performance + + 9% overall success rate increase on MMLU-PRO over LLaMA 3.1 70b + Strong performance in MMLU-PRO categories overall + Great performance during manual testing + + Creation workflow + + Models merged + + meta-llama/Meta-Llama-3.1-70B-Instruct + turboderp/Cat-Llama-3-70B-instruct + Nexusflow/Athene-70B + overrides: + parameters: + model: Cathallama-70B.Q4_K_M.gguf + files: + - filename: Cathallama-70B.Q4_K_M.gguf + sha256: 7bbac0849a8da82e7912a493a15fa07d605f1ffbe7337a322f17e09195511022 + uri: huggingface://mradermacher/Cathallama-70B-GGUF/Cathallama-70B.Q4_K_M.gguf - &deepseek ## Deepseek url: "github:mudler/LocalAI/gallery/deepseek.yaml@master" From 9729d2ae37a4913e1d57d9006cd9e8359983c932 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Mon, 12 Aug 2024 19:25:44 +0200 Subject: [PATCH 0222/1851] feat(explorer): make possible to run sync in a separate process (#3224) Signed-off-by: Ettore Di Giacinto --- core/cli/explorer.go | 19 ++++++-- core/explorer/database.go | 59 +++++++++++++++-------- core/explorer/discovery.go | 49 ++++++------------- core/http/endpoints/explorer/dashboard.go | 11 ++--- core/http/explorer.go | 4 +- core/http/routes/explorer.go | 4 +- core/p2p/p2p.go | 1 + go.mod | 1 + go.sum | 2 + 
9 files changed, 83 insertions(+), 67 deletions(-) diff --git a/core/cli/explorer.go b/core/cli/explorer.go index f3e3618d..67d25304 100644 --- a/core/cli/explorer.go +++ b/core/cli/explorer.go @@ -14,6 +14,9 @@ type ExplorerCMD struct { PoolDatabase string `env:"LOCALAI_POOL_DATABASE,POOL_DATABASE" default:"explorer.json" help:"Path to the pool database" group:"api"` ConnectionTimeout string `env:"LOCALAI_CONNECTION_TIMEOUT,CONNECTION_TIMEOUT" default:"2m" help:"Connection timeout for the explorer" group:"api"` ConnectionErrorThreshold int `env:"LOCALAI_CONNECTION_ERROR_THRESHOLD,CONNECTION_ERROR_THRESHOLD" default:"3" help:"Connection failure threshold for the explorer" group:"api"` + + WithSync bool `env:"LOCALAI_WITH_SYNC,WITH_SYNC" default:"false" help:"Enable sync with the network" group:"api"` + OnlySync bool `env:"LOCALAI_ONLY_SYNC,ONLY_SYNC" default:"false" help:"Only sync with the network" group:"api"` } func (e *ExplorerCMD) Run(ctx *cliContext.Context) error { @@ -27,10 +30,20 @@ func (e *ExplorerCMD) Run(ctx *cliContext.Context) error { if err != nil { return err } - ds := explorer.NewDiscoveryServer(db, dur, e.ConnectionErrorThreshold) - go ds.Start(context.Background()) - appHTTP := http.Explorer(db, ds) + if e.WithSync { + ds := explorer.NewDiscoveryServer(db, dur, e.ConnectionErrorThreshold) + go ds.Start(context.Background(), true) + } + + if e.OnlySync { + ds := explorer.NewDiscoveryServer(db, dur, e.ConnectionErrorThreshold) + ctx := context.Background() + + return ds.Start(ctx, false) + } + + appHTTP := http.Explorer(db) return appHTTP.Listen(e.Address) } diff --git a/core/explorer/database.go b/core/explorer/database.go index 8535140c..e24de0aa 100644 --- a/core/explorer/database.go +++ b/core/explorer/database.go @@ -7,58 +7,83 @@ import ( "os" "sort" "sync" + + "github.com/gofrs/flock" ) // Database is a simple JSON database for storing and retrieving p2p network tokens and a name and description. type Database struct { - sync.RWMutex - path string - data map[string]TokenData + path string + data map[string]TokenData + flock *flock.Flock + sync.Mutex } // TokenData is a p2p network token with a name and description. type TokenData struct { Name string `json:"name"` Description string `json:"description"` + Clusters []ClusterData + Failures int +} + +type ClusterData struct { + Workers []string + Type string + NetworkID string } // NewDatabase creates a new Database with the given path. func NewDatabase(path string) (*Database, error) { + fileLock := flock.New(path + ".lock") db := &Database{ - data: make(map[string]TokenData), - path: path, + data: make(map[string]TokenData), + path: path, + flock: fileLock, } return db, db.load() } // Get retrieves a Token from the Database by its token. func (db *Database) Get(token string) (TokenData, bool) { - db.RLock() - defer db.RUnlock() + db.flock.Lock() // we are making sure that the file is not being written to + defer db.flock.Unlock() + db.Lock() // we are making sure that is safe if called by another instance in the same process + defer db.Unlock() + db.load() t, ok := db.data[token] return t, ok } // Set stores a Token in the Database by its token. func (db *Database) Set(token string, t TokenData) error { + db.flock.Lock() + defer db.flock.Unlock() db.Lock() + defer db.Unlock() + db.load() db.data[token] = t - db.Unlock() - return db.Save() + return db.save() } // Delete removes a Token from the Database by its token. 
func (db *Database) Delete(token string) error { + db.flock.Lock() + defer db.flock.Unlock() db.Lock() + defer db.Unlock() + db.load() delete(db.data, token) - db.Unlock() - return db.Save() + return db.save() } func (db *Database) TokenList() []string { - db.RLock() - defer db.RUnlock() + db.flock.Lock() + defer db.flock.Unlock() + db.Lock() + defer db.Unlock() + db.load() tokens := []string{} for k := range db.data { tokens = append(tokens, k) @@ -74,9 +99,6 @@ func (db *Database) TokenList() []string { // load reads the Database from disk. func (db *Database) load() error { - db.Lock() - defer db.Unlock() - if _, err := os.Stat(db.path); os.IsNotExist(err) { return nil } @@ -91,10 +113,7 @@ func (db *Database) load() error { } // Save writes the Database to disk. -func (db *Database) Save() error { - db.RLock() - defer db.RUnlock() - +func (db *Database) save() error { // Marshal db.data into JSON // Write the JSON to the file f, err := os.Create(db.path) diff --git a/core/explorer/discovery.go b/core/explorer/discovery.go index 6a29442f..fe6470cb 100644 --- a/core/explorer/discovery.go +++ b/core/explorer/discovery.go @@ -16,22 +16,10 @@ import ( type DiscoveryServer struct { sync.Mutex database *Database - networkState *NetworkState connectionTime time.Duration - failures map[string]int errorThreshold int } -type NetworkState struct { - Networks map[string]Network -} - -func (s *DiscoveryServer) NetworkState() *NetworkState { - s.Lock() - defer s.Unlock() - return s.networkState -} - // NewDiscoveryServer creates a new DiscoveryServer with the given Database. // it keeps the db state in sync with the network state func NewDiscoveryServer(db *Database, dur time.Duration, failureThreshold int) *DiscoveryServer { @@ -44,11 +32,7 @@ func NewDiscoveryServer(db *Database, dur time.Duration, failureThreshold int) * return &DiscoveryServer{ database: db, connectionTime: dur, - networkState: &NetworkState{ - Networks: map[string]Network{}, - }, errorThreshold: failureThreshold, - failures: make(map[string]int), } } @@ -116,10 +100,10 @@ func (s *DiscoveryServer) runBackground() { if hasWorkers { s.Lock() - s.networkState.Networks[token] = Network{ - Clusters: ledgerK, - } - delete(s.failures, token) + data, _ := s.database.Get(token) + (&data).Clusters = ledgerK + (&data).Failures = 0 + s.database.Set(token, data) s.Unlock() } else { s.failedToken(token) @@ -132,27 +116,23 @@ func (s *DiscoveryServer) runBackground() { func (s *DiscoveryServer) failedToken(token string) { s.Lock() defer s.Unlock() - s.failures[token]++ + data, _ := s.database.Get(token) + (&data).Failures++ + s.database.Set(token, data) } func (s *DiscoveryServer) deleteFailedConnections() { s.Lock() defer s.Unlock() - for k, v := range s.failures { - if v > s.errorThreshold { - log.Info().Any("network", k).Msg("Network has been removed from the database") - s.database.Delete(k) - delete(s.failures, k) + for _, t := range s.database.TokenList() { + data, _ := s.database.Get(t) + if data.Failures > s.errorThreshold { + log.Info().Any("token", t).Msg("Token has been removed from the database") + s.database.Delete(t) } } } -type ClusterData struct { - Workers []string - Type string - NetworkID string -} - func (s *DiscoveryServer) retrieveNetworkData(c context.Context, ledger *blockchain.Ledger, networkData chan ClusterData) { clusters := map[string]ClusterData{} @@ -217,7 +197,7 @@ func (s *DiscoveryServer) retrieveNetworkData(c context.Context, ledger *blockch } // Start the discovery server. 
This is meant to be run in a goroutine.
-func (s *DiscoveryServer) Start(ctx context.Context) error {
+func (s *DiscoveryServer) Start(ctx context.Context, keepRunning bool) error {
 for {
 select {
 case <-ctx.Done():
@@ -225,6 +205,9 @@
 default:
 // Collect data
 s.runBackground()
+ if !keepRunning {
+ return nil
+ }
 }
 }
 }
diff --git a/core/http/endpoints/explorer/dashboard.go b/core/http/endpoints/explorer/dashboard.go
index 7cd9f3c9..9c731d9a 100644
--- a/core/http/endpoints/explorer/dashboard.go
+++ b/core/http/endpoints/explorer/dashboard.go
@@ -11,7 +11,6 @@ import (
 func Dashboard() func(*fiber.Ctx) error {
 return func(c *fiber.Ctx) error {
-
 summary := fiber.Map{
 "Title": "LocalAI API - " + internal.PrintableVersion(),
 "Version": internal.PrintableVersion(),
@@ -34,26 +33,24 @@ type AddNetworkRequest struct {
 }

 type Network struct {
- explorer.Network
 explorer.TokenData
 Token string `json:"token"`
 }

-func ShowNetworks(db *explorer.Database, ds *explorer.DiscoveryServer) func(*fiber.Ctx) error {
+func ShowNetworks(db *explorer.Database) func(*fiber.Ctx) error {
 return func(c *fiber.Ctx) error {
- networkState := ds.NetworkState()
 results := []Network{}
- for token, network := range networkState.Networks {
+ for _, token := range db.TokenList() {
 networkData, exists := db.Get(token) // get the token data
 hasWorkers := false
- for _, cluster := range network.Clusters {
+ for _, cluster := range networkData.Clusters {
 if len(cluster.Workers) > 0 {
 hasWorkers = true
 break
 }
 }
 if exists && hasWorkers {
- results = append(results, Network{Network: network, TokenData: networkData, Token: token})
+ results = append(results, Network{TokenData: networkData, Token: token})
 }
 }
diff --git a/core/http/explorer.go b/core/http/explorer.go
index 608ecdb5..bdcb93b1 100644
--- a/core/http/explorer.go
+++ b/core/http/explorer.go
@@ -10,7 +10,7 @@ import (
 "github.com/mudler/LocalAI/core/http/routes"
 )

-func Explorer(db *explorer.Database, discoveryServer *explorer.DiscoveryServer) *fiber.App {
+func Explorer(db *explorer.Database) *fiber.App {
 fiberCfg := fiber.Config{
 Views: renderEngine(),
@@ -22,7 +22,7 @@ func Explorer(db *explorer.Database, discoveryServer *explorer.DiscoveryServer)
 app := fiber.New(fiberCfg)

- routes.RegisterExplorerRoutes(app, db, discoveryServer)
+ routes.RegisterExplorerRoutes(app, db)

 httpFS := http.FS(embedDirStatic)
diff --git a/core/http/routes/explorer.go b/core/http/routes/explorer.go
index b3c0d40b..960b476b 100644
--- a/core/http/routes/explorer.go
+++ b/core/http/routes/explorer.go
@@ -6,8 +6,8 @@ import (
 "github.com/mudler/LocalAI/core/http/endpoints/explorer"
 )

-func RegisterExplorerRoutes(app *fiber.App, db *coreExplorer.Database, ds *coreExplorer.DiscoveryServer) {
+func RegisterExplorerRoutes(app *fiber.App, db *coreExplorer.Database) {
 app.Get("/", explorer.Dashboard())
 app.Post("/network/add", explorer.AddNetwork(db))
- app.Get("/networks", explorer.ShowNetworks(db, ds))
+ app.Get("/networks", explorer.ShowNetworks(db))
 }
diff --git a/core/p2p/p2p.go b/core/p2p/p2p.go
index 37b892d9..bfa12287 100644
--- a/core/p2p/p2p.go
+++ b/core/p2p/p2p.go
@@ -236,6 +236,7 @@ func ensureService(ctx context.Context, n *node.Node, nd *NodeData, sserv string
 if ndService, found := service[nd.Name]; !found {
 if !nd.IsOnline() {
 // if node is offline and not present, do nothing
+ zlog.Debug().Msgf("Node %s is offline", nd.ID)
 return
 }
 newCtxm, cancel := context.WithCancel(ctx)
diff --git a/go.mod b/go.mod index
b35db1b1..dcece45c 100644 --- a/go.mod +++ b/go.mod @@ -67,6 +67,7 @@ require ( github.com/cpuguy83/go-md2man/v2 v2.0.4 // indirect github.com/go-task/slim-sprig/v3 v3.0.0 // indirect github.com/go-viper/mapstructure/v2 v2.0.0 // indirect + github.com/gofrs/flock v0.12.1 // indirect github.com/labstack/echo/v4 v4.12.0 // indirect github.com/labstack/gommon v0.4.2 // indirect github.com/moby/docker-image-spec v1.3.1 // indirect diff --git a/go.sum b/go.sum index 47fd4c06..db47c36b 100644 --- a/go.sum +++ b/go.sum @@ -204,6 +204,8 @@ github.com/gofiber/template/html/v2 v2.1.2 h1:wkK/mYJ3nIhongTkG3t0QgV4ADdgOYJYVS github.com/gofiber/template/html/v2 v2.1.2/go.mod h1:E98Z/FzvpaSib06aWEgYk6GXNf3ctoyaJH8yW5ay5ak= github.com/gofiber/utils v1.1.0 h1:vdEBpn7AzIUJRhe+CiTOJdUcTg4Q9RK+pEa0KPbLdrM= github.com/gofiber/utils v1.1.0/go.mod h1:poZpsnhBykfnY1Mc0KeEa6mSHrS3dV0+oBWyeQmb2e0= +github.com/gofrs/flock v0.12.1 h1:MTLVXXHf8ekldpJk3AKicLij9MdwOWkZ+a/jHHZby9E= +github.com/gofrs/flock v0.12.1/go.mod h1:9zxTsyu5xtJ9DK+1tFZyibEV7y3uwDxPPfbxeeHCoD0= github.com/gogo/protobuf v1.1.1/go.mod h1:r8qH/GZQm5c6nD/R0oafs1akxWv10x8SbQlK7atdtwQ= github.com/gogo/protobuf v1.3.1/go.mod h1:SlYgWuQ5SjCEi6WLHjHCa1yvBfUnHcTbrrZtXPKa29o= github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q= From ae4b67fb560e0d048d1c7d8884e958ddb0eefa33 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 12 Aug 2024 21:00:30 +0000 Subject: [PATCH 0223/1851] chore(deps): Bump llama-index from 0.10.61 to 0.10.65 in /examples/langchain-chroma (#3225) chore(deps): Bump llama-index in /examples/langchain-chroma Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.10.61 to 0.10.65. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.10.61...v0.10.65) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 535c6537..98f7855c 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.12 openai==1.39.0 chromadb==0.5.5 -llama-index==0.10.61 \ No newline at end of file +llama-index==0.10.65 \ No newline at end of file From bd57ebf042f197272ada0216081639b8b6770d2e Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 12 Aug 2024 21:13:01 +0000 Subject: [PATCH 0224/1851] chore(deps): Bump langchain-community from 0.2.9 to 0.2.11 in /examples/langchain/langchainpy-localai-example (#3230) chore(deps): Bump langchain-community Bumps [langchain-community](https://github.com/langchain-ai/langchain) from 0.2.9 to 0.2.11. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain-community==0.2.9...langchain-community==0.2.11) --- updated-dependencies: - dependency-name: langchain-community dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 414a1b27..c46a794a 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -11,7 +11,7 @@ frozenlist==1.4.1 greenlet==3.0.3 idna==3.7 langchain==0.2.12 -langchain-community==0.2.9 +langchain-community==0.2.11 marshmallow==3.21.3 marshmallow-enum==1.5.1 multidict==6.0.5 From 710f566553d1374087fff04990bf4ce5c153dc8d Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 12 Aug 2024 22:36:11 +0000 Subject: [PATCH 0225/1851] chore(deps): Bump attrs from 23.2.0 to 24.2.0 in /examples/langchain/langchainpy-localai-example (#3232) chore(deps): Bump attrs Bumps [attrs](https://github.com/sponsors/hynek) from 23.2.0 to 24.2.0. - [Commits](https://github.com/sponsors/hynek/commits) --- updated-dependencies: - dependency-name: attrs dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index c46a794a..68031d75 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -1,7 +1,7 @@ aiohttp==3.10.2 aiosignal==1.3.1 async-timeout==4.0.3 -attrs==23.2.0 +attrs==24.2.0 certifi==2024.7.4 charset-normalizer==3.3.2 colorama==0.4.6 From 121ffe61c5dcccfe8237db338e04578ae7587386 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 12 Aug 2024 23:31:45 +0000 Subject: [PATCH 0226/1851] chore(deps): Bump pyyaml from 6.0.1 to 6.0.2 in /examples/langchain/langchainpy-localai-example (#3231) chore(deps): Bump pyyaml Bumps [pyyaml](https://github.com/yaml/pyyaml) from 6.0.1 to 6.0.2. - [Release notes](https://github.com/yaml/pyyaml/releases) - [Changelog](https://github.com/yaml/pyyaml/blob/main/CHANGES) - [Commits](https://github.com/yaml/pyyaml/compare/6.0.1...6.0.2) --- updated-dependencies: - dependency-name: pyyaml dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 68031d75..493f2687 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -22,7 +22,7 @@ openai==1.39.0 openapi-schema-pydantic==1.2.4 packaging>=23.2 pydantic==2.8.2 -PyYAML==6.0.1 +PyYAML==6.0.2 requests==2.32.3 SQLAlchemy==2.0.32 tenacity==8.5.0 From 83ffd626dc93be6edee3a01b95338657f6b49a24 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 13 Aug 2024 00:23:31 +0000 Subject: [PATCH 0227/1851] chore(deps): Bump llama-index from 0.10.59 to 0.10.65 in /examples/chainlit (#3238) chore(deps): Bump llama-index in /examples/chainlit Bumps [llama-index](https://github.com/run-llama/llama_index) from 0.10.59 to 0.10.65. - [Release notes](https://github.com/run-llama/llama_index/releases) - [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md) - [Commits](https://github.com/run-llama/llama_index/compare/v0.10.59...v0.10.65) --- updated-dependencies: - dependency-name: llama-index dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/chainlit/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/chainlit/requirements.txt b/examples/chainlit/requirements.txt index 52e2b8a2..9e8b3b31 100644 --- a/examples/chainlit/requirements.txt +++ b/examples/chainlit/requirements.txt @@ -1,4 +1,4 @@ -llama_index==0.10.59 +llama_index==0.10.65 requests==2.32.3 weaviate_client==4.6.7 transformers From cd385c2720c41534c4900d470af09fe63d38e398 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 13 Aug 2024 02:59:04 +0200 Subject: [PATCH 0228/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `fc4ca27b25464a11b3b86c9dbb5b6ed6065965c2` (#3240) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index b5b2a435..40ddcc6d 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=4134999e01f31256b15342b41c4de9e2477c4a6c +CPPLLAMA_VERSION?=fc4ca27b25464a11b3b86c9dbb5b6ed6065965c2 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 71f3fa653aa1a599af02505440f2cafcd1ca33c2 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 13 Aug 2024 01:12:11 +0000 Subject: [PATCH 0229/1851] chore(deps): Bump openai from 1.39.0 to 1.40.5 in /examples/langchain-chroma (#3241) chore(deps): Bump openai in /examples/langchain-chroma Bumps [openai](https://github.com/openai/openai-python) from 1.39.0 to 1.40.5. 
- [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.39.0...v1.40.5) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain-chroma/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain-chroma/requirements.txt b/examples/langchain-chroma/requirements.txt index 98f7855c..16701ca3 100644 --- a/examples/langchain-chroma/requirements.txt +++ b/examples/langchain-chroma/requirements.txt @@ -1,4 +1,4 @@ langchain==0.2.12 -openai==1.39.0 +openai==1.40.5 chromadb==0.5.5 llama-index==0.10.65 \ No newline at end of file From 89979da33f0738990b791b7ca36fdc12552153ce Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 13 Aug 2024 04:01:26 +0200 Subject: [PATCH 0230/1851] chore: :arrow_up: Update ggerganov/whisper.cpp to `22fcd5fd110ba1ff592b4e23013d870831756259` (#3239) :arrow_up: Update ggerganov/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 40ddcc6d..c57a8cf2 100644 --- a/Makefile +++ b/Makefile @@ -16,7 +16,7 @@ RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6 # whisper.cpp version WHISPER_REPO?=https://github.com/ggerganov/whisper.cpp -WHISPER_CPP_VERSION?=81c999fe0a25c4ebbfef10ed8a1a96df9cfc10fd +WHISPER_CPP_VERSION?=22fcd5fd110ba1ff592b4e23013d870831756259 # bert.cpp version BERT_REPO?=https://github.com/go-skynet/go-bert.cpp From 447d9f844bf546dd0dd54eb32998417ba54dd999 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 13 Aug 2024 02:18:44 +0000 Subject: [PATCH 0231/1851] chore(deps): Bump aiohttp from 3.10.2 to 3.10.3 in /examples/langchain/langchainpy-localai-example (#3234) chore(deps): Bump aiohttp Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.3. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.2...v3.10.3) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index 493f2687..bf46bef4 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -1,4 +1,4 @@ -aiohttp==3.10.2 +aiohttp==3.10.3 aiosignal==1.3.1 async-timeout==4.0.3 attrs==24.2.0 From 7d92936e1a181a92e5599a8b7dd21ecdf3a6f3b7 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 13 Aug 2024 03:59:16 +0000 Subject: [PATCH 0232/1851] chore(deps): Bump openai from 1.39.0 to 1.40.6 in /examples/langchain/langchainpy-localai-example (#3244) chore(deps): Bump openai Bumps [openai](https://github.com/openai/openai-python) from 1.39.0 to 1.40.6. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.39.0...v1.40.6) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/langchain/langchainpy-localai-example/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/langchain/langchainpy-localai-example/requirements.txt b/examples/langchain/langchainpy-localai-example/requirements.txt index bf46bef4..b9d161c5 100644 --- a/examples/langchain/langchainpy-localai-example/requirements.txt +++ b/examples/langchain/langchainpy-localai-example/requirements.txt @@ -18,7 +18,7 @@ multidict==6.0.5 mypy-extensions==1.0.0 numexpr==2.10.1 numpy==2.0.1 -openai==1.39.0 +openai==1.40.6 openapi-schema-pydantic==1.2.4 packaging>=23.2 pydantic==2.8.2 From 02de274e00154269c4e9ccc653846f2cfdb77fcc Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Tue, 13 Aug 2024 16:17:18 +0200 Subject: [PATCH 0233/1851] feat(federated): allow picking a specific worker, improve load balancing (#3243) * feat(explorer): allow specifying a worker target Signed-off-by: Ettore Di Giacinto * feat(explorer): correctly load balance requests Signed-off-by: Ettore Di Giacinto * feat(explorer): mark load balanced by default Signed-off-by: Ettore Di Giacinto * fix: make sure to delete tunnels that might not exist anymore If a worker goes offline and comes back online, its tunnel address might change, and we want to load balance only across the active tunnels.
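For reference, the new knobs would be driven from the CLI roughly as sketched below. This is a usage sketch, not taken from the patch: the `--target-worker` and `--random-worker` flag spellings are assumptions derived from the `TargetWorker`/`RandomWorker` struct fields in the diff that follows (kong's default flag naming), and `<worker-id>` stands for a worker ID as displayed in the Swarm UI; only `--p2ptoken` is explicit in the struct tags.

```bash
# Default after this patch: requests are balanced toward the least-used online workers
local-ai federated --p2ptoken "$TOKEN"

# Pin all requests to a single worker (assumed flag name)
local-ai federated --p2ptoken "$TOKEN" --target-worker <worker-id>

# Opt out of least-used balancing and pick a random online worker per request (assumed flag name)
local-ai federated --p2ptoken "$TOKEN" --random-worker
```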
Signed-off-by: Ettore Di Giacinto --------- Signed-off-by: Ettore Di Giacinto --- core/cli/federated.go | 5 +-- core/p2p/federated.go | 68 ++++++++++++++++++++++++++++++++++-- core/p2p/federated_server.go | 53 +++++++++++++--------------- core/p2p/p2p.go | 4 ++- 4 files changed, 96 insertions(+), 34 deletions(-) diff --git a/core/cli/federated.go b/core/cli/federated.go index 271babca..b917812c 100644 --- a/core/cli/federated.go +++ b/core/cli/federated.go @@ -10,13 +10,14 @@ import ( type FederatedCLI struct { Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"` Peer2PeerToken string `env:"LOCALAI_P2P_TOKEN,P2P_TOKEN,TOKEN" name:"p2ptoken" help:"Token for P2P mode (optional)" group:"p2p"` - LoadBalanced bool `env:"LOCALAI_LOAD_BALANCED,LOAD_BALANCED" default:"false" help:"Enable load balancing" group:"p2p"` + RandomWorker bool `env:"LOCALAI_RANDOM_WORKER,RANDOM_WORKER" default:"false" help:"Select a random worker from the pool" group:"p2p"` Peer2PeerNetworkID string `env:"LOCALAI_P2P_NETWORK_ID,P2P_NETWORK_ID" help:"Network ID for P2P mode, can be set arbitrarly by the user for grouping a set of instances." group:"p2p"` + TargetWorker string `env:"LOCALAI_TARGET_WORKER,TARGET_WORKER" help:"Target worker to run the federated server on" group:"p2p"` } func (f *FederatedCLI) Run(ctx *cliContext.Context) error { - fs := p2p.NewFederatedServer(f.Address, p2p.NetworkID(f.Peer2PeerNetworkID, p2p.FederatedID), f.Peer2PeerToken, f.LoadBalanced) + fs := p2p.NewFederatedServer(f.Address, p2p.NetworkID(f.Peer2PeerNetworkID, p2p.FederatedID), f.Peer2PeerToken, !f.RandomWorker, f.TargetWorker) return fs.Start(context.Background()) } diff --git a/core/p2p/federated.go b/core/p2p/federated.go index 3ac3ff91..8e468ef6 100644 --- a/core/p2p/federated.go +++ b/core/p2p/federated.go @@ -1,6 +1,12 @@ package p2p -import "fmt" +import ( + "fmt" + "math/rand/v2" + "sync" + + "github.com/rs/zerolog/log" +) const FederatedID = "federated" @@ -12,22 +18,70 @@ func NetworkID(networkID, serviceID string) string { } type FederatedServer struct { + sync.Mutex listenAddr, service, p2ptoken string requestTable map[string]int loadBalanced bool + workerTarget string } -func NewFederatedServer(listenAddr, service, p2pToken string, loadBalanced bool) *FederatedServer { +func NewFederatedServer(listenAddr, service, p2pToken string, loadBalanced bool, workerTarget string) *FederatedServer { return &FederatedServer{ listenAddr: listenAddr, service: service, p2ptoken: p2pToken, requestTable: map[string]int{}, loadBalanced: loadBalanced, + workerTarget: workerTarget, + } +} + +func (fs *FederatedServer) RandomServer() string { + var tunnelAddresses []string + for _, v := range GetAvailableNodes(fs.service) { + if v.IsOnline() { + tunnelAddresses = append(tunnelAddresses, v.TunnelAddress) + } else { + delete(fs.requestTable, v.TunnelAddress) // make sure it's not tracked + log.Info().Msgf("Node %s is offline", v.ID) + } + } + + if len(tunnelAddresses) == 0 { + return "" + } + + return tunnelAddresses[rand.IntN(len(tunnelAddresses))] +} + +func (fs *FederatedServer) syncTableStatus() { + fs.Lock() + defer fs.Unlock() + currentTunnels := make(map[string]struct{}) + + for _, v := range GetAvailableNodes(fs.service) { + if v.IsOnline() { + fs.ensureRecordExist(v.TunnelAddress) + currentTunnels[v.TunnelAddress] = struct{}{} + } + } + + // delete tunnels that don't exist anymore + for t := range fs.requestTable { + if _, ok := currentTunnels[t]; !ok { + 
delete(fs.requestTable, t) + } } } func (fs *FederatedServer) SelectLeastUsedServer() string { + fs.syncTableStatus() + + fs.Lock() + defer fs.Unlock() + + log.Debug().Any("request_table", fs.requestTable).Msgf("Current request table") + // cycle over requestTable and find the entry with the lower number // if there are multiple entries with the same number, select one randomly // if there are no entries, return an empty string @@ -39,18 +93,26 @@ func (fs *FederatedServer) SelectLeastUsedServer() string { minKey = k } } + log.Debug().Any("requests_served", min).Msgf("Selected tunnel %s", minKey) + return minKey } func (fs *FederatedServer) RecordRequest(nodeID string) { + fs.Lock() + defer fs.Unlock() // increment the counter for the nodeID in the requestTable fs.requestTable[nodeID]++ + + log.Debug().Any("request_table", fs.requestTable).Msgf("Current request table") } -func (fs *FederatedServer) EnsureRecordExist(nodeID string) { +func (fs *FederatedServer) ensureRecordExist(nodeID string) { // if the nodeID is not in the requestTable, add it with a counter of 0 _, ok := fs.requestTable[nodeID] if !ok { fs.requestTable[nodeID] = 0 } + + log.Debug().Any("request_table", fs.requestTable).Msgf("Current request table") } diff --git a/core/p2p/federated_server.go b/core/p2p/federated_server.go index 75da97ec..6d7ccd46 100644 --- a/core/p2p/federated_server.go +++ b/core/p2p/federated_server.go @@ -10,8 +10,6 @@ import ( "net" "time" - "math/rand/v2" - "github.com/mudler/edgevpn/pkg/node" "github.com/mudler/edgevpn/pkg/protocol" "github.com/mudler/edgevpn/pkg/types" @@ -76,7 +74,7 @@ func (fs *FederatedServer) proxy(ctx context.Context, node *node.Node) error { case <-ctx.Done(): return errors.New("context canceled") default: - log.Debug().Msg("New for connection") + log.Debug().Msgf("New connection from %s", l.Addr().String()) // Listen for an incoming connection. 
conn, err := l.Accept() if err != nil { @@ -86,37 +84,33 @@ func (fs *FederatedServer) proxy(ctx context.Context, node *node.Node) error { // Handle connections in a new goroutine, forwarding to the p2p service go func() { - var tunnelAddresses []string - for _, v := range GetAvailableNodes(fs.service) { - if v.IsOnline() { - tunnelAddresses = append(tunnelAddresses, v.TunnelAddress) - } else { - log.Info().Msgf("Node %s is offline", v.ID) + tunnelAddr := "" + + if fs.workerTarget != "" { + for _, v := range GetAvailableNodes(fs.service) { + if v.ID == fs.workerTarget { + tunnelAddr = v.TunnelAddress + break + } } + } else if fs.loadBalanced { + log.Debug().Msgf("Load balancing request") + + tunnelAddr = fs.SelectLeastUsedServer() + if tunnelAddr == "" { + tunnelAddr = fs.RandomServer() + } + + } else { + tunnelAddr = fs.RandomServer() } - if len(tunnelAddresses) == 0 { + if tunnelAddr == "" { log.Error().Msg("No available nodes yet") return } - tunnelAddr := "" - - if fs.loadBalanced { - for _, t := range tunnelAddresses { - fs.EnsureRecordExist(t) - } - - tunnelAddr = fs.SelectLeastUsedServer() - log.Debug().Msgf("Selected tunnel %s", tunnelAddr) - if tunnelAddr == "" { - tunnelAddr = tunnelAddresses[rand.IntN(len(tunnelAddresses))] - } - - fs.RecordRequest(tunnelAddr) - } else { - tunnelAddr = tunnelAddresses[rand.IntN(len(tunnelAddresses))] - } + log.Debug().Msgf("Selected tunnel %s", tunnelAddr) tunnelConn, err := net.Dial("tcp", tunnelAddr) if err != nil { @@ -132,7 +126,10 @@ func (fs *FederatedServer) proxy(ctx context.Context, node *node.Node) error { tunnelConn.Close() conn.Close() - // ll.Infof("(service %s) Done handling %s", serviceID, l.Addr().String()) + + if fs.loadBalanced { + fs.RecordRequest(tunnelAddr) + } }() } } diff --git a/core/p2p/p2p.go b/core/p2p/p2p.go index bfa12287..af2106be 100644 --- a/core/p2p/p2p.go +++ b/core/p2p/p2p.go @@ -181,7 +181,6 @@ func discoveryTunnels(ctx context.Context, n *node.Node, token, servicesID strin if err != nil { return nil, fmt.Errorf("creating a new node: %w", err) } - // get new services, allocate and return to the channel // TODO: @@ -201,6 +200,9 @@ func discoveryTunnels(ctx context.Context, n *node.Node, token, servicesID strin zlog.Debug().Msg("Searching for workers") data := ledger.LastBlock().Storage[servicesID] + + zlog.Debug().Any("data", ledger.LastBlock().Storage).Msg("Ledger data") + for k, v := range data { zlog.Info().Msgf("Found worker %s", k) nd := &NodeData{} From 10324d9ad209f321c6d263770139473e73fe1994 Mon Sep 17 00:00:00 2001 From: "LocalAI [bot]" <139863280+localai-bot@users.noreply.github.com> Date: Tue, 13 Aug 2024 23:45:01 +0200 Subject: [PATCH 0234/1851] chore: :arrow_up: Update ggerganov/llama.cpp to `06943a69f678fb32829ff06d9c18367b17d4b361` (#3245) :arrow_up: Update ggerganov/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index c57a8cf2..eb507acd 100644 --- a/Makefile +++ b/Makefile @@ -8,7 +8,7 @@ DETECT_LIBS?=true # llama.cpp versions GOLLAMA_REPO?=https://github.com/go-skynet/go-llama.cpp GOLLAMA_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be -CPPLLAMA_VERSION?=fc4ca27b25464a11b3b86c9dbb5b6ed6065965c2 +CPPLLAMA_VERSION?=06943a69f678fb32829ff06d9c18367b17d4b361 # go-rwkv version RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp From 5bb2321fe0c1f99b44196aeb74473adf421b8a56 Mon Sep 17 
00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 13 Aug 2024 23:47:52 +0000 Subject: [PATCH 0235/1851] chore(deps): Bump openai from 1.39.0 to 1.40.4 in /examples/functions (#3235) Bumps [openai](https://github.com/openai/openai-python) from 1.39.0 to 1.40.4. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.39.0...v1.40.4) --- updated-dependencies: - dependency-name: openai dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- examples/functions/requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/functions/requirements.txt b/examples/functions/requirements.txt index a8a8ca8c..d24cb5ec 100644 --- a/examples/functions/requirements.txt +++ b/examples/functions/requirements.txt @@ -1,2 +1,2 @@ langchain==0.2.12 -openai==1.39.0 +openai==1.40.4 From 57f79002107be719c22399048e5d9188e8f749d6 Mon Sep 17 00:00:00 2001 From: Dave Date: Wed, 14 Aug 2024 03:06:41 -0400 Subject: [PATCH 0236/1851] feat: Initial Version of vscode DevContainer (#3217) initial version of devcontainer --------- Signed-off-by: Dave Lee --- .devcontainer/devcontainer.json | 23 ++++++ .devcontainer/docker-compose-devcontainer.yml | 45 ++++++++++++ .devcontainer/grafana/datasource.yml | 10 +++ .devcontainer/prometheus/prometheus.yml | 21 ++++++ .dockerignore | 1 + .env | 3 + .vscode/launch.json | 21 +++--- Dockerfile | 70 +++++++++++++++---- docker-compose.yaml | 2 - 9 files changed, 169 insertions(+), 27 deletions(-) create mode 100644 .devcontainer/devcontainer.json create mode 100644 .devcontainer/docker-compose-devcontainer.yml create mode 100644 .devcontainer/grafana/datasource.yml create mode 100644 .devcontainer/prometheus/prometheus.yml diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json new file mode 100644 index 00000000..a111dbfd --- /dev/null +++ b/.devcontainer/devcontainer.json @@ -0,0 +1,23 @@ +{ + "$schema": "https://raw.githubusercontent.com/devcontainers/spec/main/schemas/devContainer.schema.json", + "name": "LocalAI", + "workspaceFolder": "/workspace", + "dockerComposeFile": [ "./docker-compose-devcontainer.yml" ], + "service": "api", + "shutdownAction": "stopCompose", + "customizations": { + "vscode": { + "extensions": [ + "golang.go", + "ms-vscode.makefile-tools", + "ms-azuretools.vscode-docker", + "ms-python.python", + "ms-python.debugpy", + "wayou.vscode-todo-highlight", + "waderyan.gitblame" + ] + } + }, + "forwardPorts": [8080, 3000], + "postStartCommand": "make prepare && cp /build/backend-assets /workdir/backend-assets" +} \ No newline at end of file diff --git a/.devcontainer/docker-compose-devcontainer.yml b/.devcontainer/docker-compose-devcontainer.yml new file mode 100644 index 00000000..e36492e9 --- /dev/null +++ b/.devcontainer/docker-compose-devcontainer.yml @@ -0,0 +1,45 @@ +services: + api: + build: + context: .. 
+ dockerfile: Dockerfile + target: devcontainer + args: + - FFMPEG=true + - IMAGE_TYPE=extras + - GO_TAGS=stablediffusion p2p tts + env_file: + - ../.env + ports: + - 8080:8080 + volumes: + - ..:/workspace:cached + command: /bin/sh -c "while sleep 1000; do :; done" + cap_add: + - SYS_PTRACE + security_opt: + - seccomp:unconfined + prometheus: + image: prom/prometheus + container_name: prometheus + command: + - '--config.file=/etc/prometheus/prometheus.yml' + ports: + - 9090:9090 + restart: unless-stopped + volumes: + - ./prometheus:/etc/prometheus + - prom_data:/prometheus + grafana: + image: grafana/grafana + container_name: grafana + ports: + - 3000:3000 + restart: unless-stopped + environment: + - GF_SECURITY_ADMIN_USER=admin + - GF_SECURITY_ADMIN_PASSWORD=grafana + volumes: + - ./grafana:/etc/grafana/provisioning/datasources +volumes: + prom_data: \ No newline at end of file diff --git a/.devcontainer/grafana/datasource.yml b/.devcontainer/grafana/datasource.yml new file mode 100644 index 00000000..1ed2fa3c --- /dev/null +++ b/.devcontainer/grafana/datasource.yml @@ -0,0 +1,10 @@ + +apiVersion: 1 + +datasources: +- name: Prometheus + type: prometheus + url: http://prometheus:9090 + isDefault: true + access: proxy + editable: true diff --git a/.devcontainer/prometheus/prometheus.yml b/.devcontainer/prometheus/prometheus.yml new file mode 100644 index 00000000..18c44da7 --- /dev/null +++ b/.devcontainer/prometheus/prometheus.yml @@ -0,0 +1,21 @@ +global: + scrape_interval: 15s + scrape_timeout: 10s + evaluation_interval: 15s +alerting: + alertmanagers: + - static_configs: + - targets: [] + scheme: http + timeout: 10s + api_version: v1 +scrape_configs: +- job_name: prometheus + honor_timestamps: true + scrape_interval: 15s + scrape_timeout: 10s + metrics_path: /metrics + scheme: http + static_configs: + - targets: + - localhost:9090 \ No newline at end of file diff --git a/.dockerignore b/.dockerignore index 3954769f..e91f0008 100644 --- a/.dockerignore +++ b/.dockerignore @@ -1,6 +1,7 @@ .idea .github .vscode +.devcontainer models examples/chatbot-ui/models examples/rwkv/models diff --git a/.env b/.env index 95a515bc..9e5dbd79 100644 --- a/.env +++ b/.env @@ -79,6 +79,9 @@ ### Enable to run parallel requests # LOCALAI_PARALLEL_REQUESTS=true +# Enable to allow p2p mode +# LOCALAI_P2P=true + ### Watchdog settings ### # Enables watchdog to kill backends that are inactive for too much time diff --git a/.vscode/launch.json b/.vscode/launch.json index 2727da92..50493421 100644 --- a/.vscode/launch.json +++ b/.vscode/launch.json @@ -3,12 +3,12 @@ "configurations": [ { "name": "Python: Current File", - "type": "python", + "type": "debugpy", "request": "launch", "program": "${file}", "console": "integratedTerminal", "justMyCode": false, - "cwd": "${workspaceFolder}/examples/langchain-chroma", + "cwd": "${fileDirname}", "env": { "OPENAI_API_BASE": "http://localhost:8080/v1", "OPENAI_API_KEY": "abc" @@ -19,15 +19,16 @@ "type": "go", "request": "launch", "mode": "debug", - "program": "${workspaceFolder}/main.go", - "args": [ - "api" - ], + "program": "${workspaceRoot}", + "args": [], "env": { - "C_INCLUDE_PATH": "${workspaceFolder}/go-llama:${workspaceFolder}/go-stable-diffusion/:${workspaceFolder}/gpt4all/gpt4all-bindings/golang/:${workspaceFolder}/go-gpt2:${workspaceFolder}/go-rwkv:${workspaceFolder}/whisper.cpp:${workspaceFolder}/go-bert:${workspaceFolder}/bloomz", - "LIBRARY_PATH": 
"${workspaceFolder}/go-llama:${workspaceFolder}/go-stable-diffusion/:${workspaceFolder}/gpt4all/gpt4all-bindings/golang/:${workspaceFolder}/go-gpt2:${workspaceFolder}/go-rwkv:${workspaceFolder}/whisper.cpp:${workspaceFolder}/go-bert:${workspaceFolder}/bloomz", - "DEBUG": "true" - } + "LOCALAI_LOG_LEVEL": "debug", + "LOCALAI_P2P": "true", + "LOCALAI_FEDERATED": "true" + }, + "buildFlags": ["-tags", "stablediffusion p2p tts", "-v"], + "envFile": "${workspaceFolder}/.env", + "cwd": "${workspaceRoot}" } ] } \ No newline at end of file diff --git a/Dockerfile b/Dockerfile index a0feadd9..0dfaaa19 100644 --- a/Dockerfile +++ b/Dockerfile @@ -8,7 +8,7 @@ FROM ${BASE_IMAGE} AS requirements-core USER root -ARG GO_VERSION=1.22.5 +ARG GO_VERSION=1.22.6 ARG TARGETARCH ARG TARGETVARIANT @@ -30,7 +30,7 @@ RUN apt-get update && \ # Install Go RUN curl -L -s https://go.dev/dl/go${GO_VERSION}.linux-${TARGETARCH}.tar.gz | tar -C /usr/local -xz -ENV PATH $PATH:/root/go/bin:/usr/local/go/bin +ENV PATH=$PATH:/root/go/bin:/usr/local/go/bin # Install grpc compilers RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2 && \ @@ -39,15 +39,18 @@ RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2 && \ COPY --chmod=644 custom-ca-certs/* /usr/local/share/ca-certificates/ RUN update-ca-certificates +RUN test -n "$TARGETARCH" \ + || (echo 'warn: missing $TARGETARCH, either set this `ARG` manually, or run using `docker buildkit`') + # Use the variables in subsequent instructions RUN echo "Target Architecture: $TARGETARCH" RUN echo "Target Variant: $TARGETVARIANT" # Cuda -ENV PATH /usr/local/cuda/bin:${PATH} +ENV PATH=/usr/local/cuda/bin:${PATH} # HipBLAS requirements -ENV PATH /opt/rocm/bin:${PATH} +ENV PATH=/opt/rocm/bin:${PATH} # OpenBLAS requirements and stable diffusion RUN apt-get update && \ @@ -62,9 +65,6 @@ RUN ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2 WORKDIR /build -RUN test -n "$TARGETARCH" \ - || (echo 'warn: missing $TARGETARCH, either set this `ARG` manually, or run using `docker buildkit`') - ################################### ################################### @@ -217,13 +217,14 @@ RUN git clone --recurse-submodules --jobs 4 -b ${GRPC_VERSION} --depth 1 --shall ################################### ################################### -# The builder target compiles LocalAI. This target is not the target that will be uploaded to the registry. -# Adjustments to the build process should likely be made here. -FROM requirements-drivers AS builder +# The builder-base target has the arguments, variables, and copies shared between full builder images and the uncompiled devcontainer + +FROM requirements-drivers AS builder-base ARG GO_TAGS="stablediffusion tts p2p" ARG GRPC_BACKENDS ARG MAKEFLAGS +ARG LD_FLAGS="-s -w" ENV GRPC_BACKENDS=${GRPC_BACKENDS} ENV GO_TAGS=${GO_TAGS} @@ -231,14 +232,12 @@ ENV MAKEFLAGS=${MAKEFLAGS} ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility ENV NVIDIA_REQUIRE_CUDA="cuda>=${CUDA_MAJOR_VERSION}.0" ENV NVIDIA_VISIBLE_DEVICES=all +ENV LD_FLAGS=${LD_FLAGS} + +RUN echo "GO_TAGS: $GO_TAGS" && echo "TARGETARCH: $TARGETARCH" WORKDIR /build -COPY . . -COPY .git . -RUN echo "GO_TAGS: $GO_TAGS" - -RUN make prepare # We need protoc installed, and the version in 22.04 is too old. We will create one as part installing the GRPC build below # but that will also being in a newer version of absl which stablediffusion cannot compile with. 
This version of protoc is only @@ -256,6 +255,20 @@ RUN < Date: Wed, 14 Aug 2024 10:08:32 +0200 Subject: [PATCH 0237/1851] Update binaries.md Signed-off-by: Ettore Di Giacinto --- docs/content/docs/reference/binaries.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/docs/content/docs/reference/binaries.md b/docs/content/docs/reference/binaries.md index edefca75..7780864c 100644 --- a/docs/content/docs/reference/binaries.md +++ b/docs/content/docs/reference/binaries.md @@ -19,4 +19,13 @@ Otherwise, here are the links to the binaries: | --- | --- | | Linux (amd64) | [Download](https://github.com/mudler/LocalAI/releases/download/{{< version >}}/local-ai-Linux-x86_64) | | Linux (arm64) | [Download](https://github.com/mudler/LocalAI/releases/download/{{< version >}}/local-ai-Linux-arm64) | -| MacOS (arm64) | [Download](https://github.com/mudler/LocalAI/releases/download/{{< version >}}/local-ai-Darwin-arm64) | \ No newline at end of file +| MacOS (arm64) | [Download](https://github.com/mudler/LocalAI/releases/download/{{< version >}}/local-ai-Darwin-arm64) | + + +{{% alert icon="⚡" context="warning" %}} +Binaries do have limited support compared to container images: + +- Python-based backends are not shipped with binaries (e.g. `bark`, `diffusers` or `transformers`) +- MacOS binaries and Linux-arm64 do not ship TTS nor `stablediffusion-cpp` backends +- Linux binaries do not ship `stablediffusion-cpp` backend +{{% /alert %}} From d6c4e751f23b4c6eb6d103490ee9fd4738e34667 Mon Sep 17 00:00:00 2001 From: Ettore Di Giacinto Date: Wed, 14 Aug 2024 12:53:29 +0200 Subject: [PATCH 0238/1851] feat(explorer): visual improvements (#3247) Signed-off-by: Ettore Di Giacinto --- core/http/views/explorer.html | 67 +++++++++++++++++++++++++++++------ 1 file changed, 56 insertions(+), 11 deletions(-) diff --git a/core/http/views/explorer.html b/core/http/views/explorer.html index 91cb9720..033fa546 100644 --- a/core/http/views/explorer.html +++ b/core/http/views/explorer.html @@ -152,6 +152,35 @@ right: 10px; color: #e2e8f0; } + .fa-circle-nodes { + /* font-size: 100px; /* Adjust the size as needed */ + animation: rotateCircleNodes 8s linear infinite; /* Slow and fluid rotation */ + display: inline-block; + } + + @keyframes rotateCircleNodes { + 0% { transform: rotate(0deg); } + 100% { transform: rotate(360deg); } + } + /* Animation for the warning box */ + .fa-flask { + /* font-size: 100px; /* Adjust the size as needed */ + animation: shakeFlask 3s ease-in-out infinite; /* Smooth easing and longer duration for fluidity */ + transform-origin: bottom center; + } + + @keyframes shakeFlask { + 0%, 10% { transform: rotate(0deg); } /* Start and end still */ + 20% { transform: rotate(-10deg); } /* Smooth transition to left */ + 30% { transform: rotate(10deg); } /* Smooth transition to right */ + 40% { transform: rotate(-8deg); } /* Smooth transition to left */ + 50% { transform: rotate(8deg); } /* Smooth transition to right */ + 60% { transform: rotate(-5deg); } /* Smooth transition to left */ + 70% { transform: rotate(5deg); } /* Smooth transition to right */ + 80% { transform: rotate(-2deg); } /* Smooth transition to left */ + 90% { transform: rotate(2deg); } /* Smooth transition to right */ + 100% { transform: rotate(0deg); } /* Return to center */ + } @@ -159,14 +188,23 @@ {{template "views/partials/navbar_explorer" .}}
[The remainder of this explorer.html hunk was garbled in extraction and its HTML markup is unrecoverable; the surviving text of the change is kept below.]
- Network Clusters Explorer
- View the clusters and workers available in each network.
+ Network Clusters Explorer (now accompanied by the animated fa-circle-nodes icon styled above)
+ View the clusters and workers available in each network. (now accompanied by the animated fa-flask icon marking the feature experimental)
+ The explorer is a global, community-driven tool to share network tokens and view available clusters in the globe. Anyone can use the tokens to offload computation and use the clusters available or share resources. This is provided without any warranty. Use it at your own risk. We are not responsible for any potential harm or misuse. Sharing tokens globally allows anyone from the internet to use your instances.
@@ -221,23 +259,30 @@
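Putting the explorer changes in this series together, a quick smoke test could look like the sketch below. The `--with-sync`/`--only-sync` flag spellings (assumed from the `WithSync`/`OnlySync` fields added in PATCH 0222 under kong's default naming), the JSON fields of `AddNetworkRequest` (elided in the diff above), and the `:8080` bind address are all assumptions, not confirmed by the patches:

```bash
# Serve the explorer UI and run the discovery sync in-process (assumed flag name)
local-ai explorer --with-sync

# Or run only the sync loop in a separate process sharing explorer.json,
# relying on the flock-based locking introduced in PATCH 0222 (assumed flag name)
local-ai explorer --only-sync

# Register a network token with the explorer (request fields are assumptions)
curl -X POST http://localhost:8080/network/add \
  -H "Content-Type: application/json" \
  -d '{"token":"<p2p-token>","name":"my-network","description":"test network"}'

# List the indexed networks that currently have online workers
# (route registered in core/http/routes/explorer.go)
curl http://localhost:8080/networks
```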