feat: Token Stream support for Transformer, fix: missing package for OpenVINO (#1908)

* Streaming working

* Small fix for regression on CUDA and XPU

* use pip version of optimum[openvino]

* Update backend/python/transformers/transformers_server.py

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Token streaming support

fix optimum[openvino] package in install.sh

* Token Streaming support

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
This commit is contained in:
fakezeta 2024-03-27 17:50:35 +01:00 committed by GitHub
parent e7cbe32601
commit 8210ffcb6c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 72 additions and 48 deletions

View file

@ -25,7 +25,7 @@ if [ -d "/opt/intel" ]; then
# Intel GPU: If the directory exists, we assume we are using the intel image
# (no conda env)
# https://github.com/intel/intel-extension-for-pytorch/issues/538
pip install intel-extension-for-transformers datasets sentencepiece tiktoken neural_speed
pip install intel-extension-for-transformers datasets sentencepiece tiktoken neural_speed optimum[openvino]
fi
if [ "$PIP_CACHE_PURGE" = true ] ; then