feat(aio): entrypoint, update workflows (#1872)

Ettore Di Giacinto 2024-03-21 22:09:04 +01:00 committed by GitHub
parent 743095b7d8
commit abc9360dc6
GPG key ID: B5690EEEBB952194
9 changed files with 191 additions and 16 deletions

aio/cpu/README.md Normal file

@@ -0,0 +1,5 @@
+## AIO CPU size
+Use this image with CPU-only.
+Please keep using only C++ backends so the base image is as small as possible (without CUDA, cuDNN, python, etc).


@@ -1,13 +1,18 @@
-name: all-minilm-l6-v2
-backend: sentencetransformers
+backend: bert-embeddings
 embeddings: true
+f16: true
+gpu_layers: 90
+mmap: true
+name: text-embedding-ada-002
 parameters:
-  model: all-MiniLM-L6-v2
+  model: huggingface://mudler/all-MiniLM-L6-v2/ggml-model-q4_0.bin
 usage: |
   You can test this model with curl like this:
   curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{
     "input": "Your text string goes here",
-    "model": "all-minilm-l6-v2"
+    "model": "text-embedding-ada-002"
   }'
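
For reference, the curl call in the `usage` text above could be issued from Python roughly as follows. This is a minimal sketch, not part of the commit: the helper names `build_embeddings_request` and `fetch_embeddings` are hypothetical, and the base URL, path, and model name are taken verbatim from the `usage` text. It assumes the server answers with a JSON body, and makes no assumption about that body's fields.

```python
import json
import urllib.request


def build_embeddings_request(text, model="text-embedding-ada-002",
                             base_url="http://localhost:8080"):
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embeddings(text, **kwargs):
    """Send the request and return the decoded JSON response.

    Assumes a server is listening at base_url; the response is returned
    as-is, without interpreting its fields.
    """
    request = build_embeddings_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example (needs a running server):
# result = fetch_embeddings("Your text string goes here")
```

Keeping request construction separate from the network call makes it possible to inspect the payload without a running instance.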