feat: more embedded models, coqui fixes, add model usage and description (#1556)

* feat: add model descriptions and usage
* remove default model gallery
* models: add embeddings and tts
* docs: update table
* docs: updates
* images: cleanup pip cache after install
* images: always run apt-get clean
* ux: improve gRPC connection errors
* ux: improve some messages
* fix: fix coqui when no AudioPath is passed by
* embedded: add more models
* Add usage
* Reorder table
2025-05-28 14:35:00 +00:00 · 2024-01-08 00:37:02 +01:00 · 2024-01-08 00:37:02 +01:00 · e19d7226f8
commit e19d7226f8
parent 0843fe6c65
21 changed files with 216 additions and 45 deletions
--- a/embedded/models/bert-cpp.yaml
+++ b/embedded/models/bert-cpp.yaml
@@ -0,0 +1,23 @@
+backend: bert-embeddings
+embeddings: true
+f16: true
+
+gpu_layers: 90
+mmap: true
+name: bert-cpp-minilm-v6
+
+parameters:
+  model: bert-MiniLM-L6-v2q4_0.bin
+
+download_files:
+- filename: "bert-MiniLM-L6-v2q4_0.bin"
+  sha256: "a5a174d8772c8a569faf9f3136c441f2c3855b5bf35ed32274294219533feaad"
+  uri: "https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin"
+
+usage: |
+    You can test this model with curl like this:
+
+    curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{
+      "input": "Your text string goes here",
+      "model": "bert-cpp-minilm-v6"
+    }'
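
The curl call in the `usage` field can also be issued from a script. The following is a minimal sketch, assuming the server listens on `http://localhost:8080` and the model is registered as `bert-cpp-minilm-v6`, as in the usage text above. The helper names `build_embeddings_request` and `fetch_embeddings` are hypothetical, and the layout of the response body is not shown in this commit, so the sketch returns the decoded JSON without interpreting it.

```python
import json
import urllib.request

# Assumed values copied from the usage text; adjust them for your deployment.
BASE_URL = "http://localhost:8080"
MODEL_NAME = "bert-cpp-minilm-v6"


def build_embeddings_request(text, model=MODEL_NAME, base_url=BASE_URL):
    """Build the same POST request that the curl example sends."""
    payload = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embeddings(text, timeout=30.0):
    """Send the request and return the decoded JSON body.

    Requires a running server; the response structure is not documented
    in this commit, so it is returned as-is.
    """
    request = build_embeddings_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a server running, `fetch_embeddings("Your text string goes here")` would send the same body as the curl command. Building the request separately from sending it makes the payload easy to inspect without network access.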