LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2025-05-22 03:24:59 +00:00

fakezeta e7cbe32601 feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA (#1892)	2024-03-26 23:31:43 +00:00

* fixes #1775 and #1774
  Add BitsAndBytes Quantization and fixes embedding on CUDA devices
* Manage 4bit and 8 bit quantization
  Manage different BitsAndBytes options with the quantization: parameter in yaml
* fix compilation errors on non CUDA environment
* OpenVINO draft
  First draft of OpenVINO integration in transformer backend
* first working implementation
* Streaming working
* Small fix for regression on CUDA and XPU
* use pip version of optimum[openvino]
* Update backend/python/transformers/transformers_server.py
  Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
install.sh	feat(intel): add diffusers/transformers support (#1746)	2024-03-07 14:37:45 +01:00
Makefile	feat(intel): add diffusers/transformers support (#1746)	2024-03-07 14:37:45 +01:00
transformers-nvidia.yml	fix: downgrade torch (#1902)	2024-03-26 22:56:02 +01:00
transformers-rocm.yml	Enhance autogptq backend to support VL models (#1860)	2024-03-26 18:48:14 +01:00
transformers.yml	feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA (#1892)	2024-03-26 23:31:43 +00:00