* feat(sycl): Enable SYCL for stable diffusion
This is a pain because we compile with CGO, while stable diffusion is
built with CMake, and I don't think we can easily get CMake to set the
linker flags the CGO link step needs. I also could not find pkg-config
calls that fully set the flags, so some of them are set manually.
See https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html
for reference. I also resorted to searching the shared objects in
$MKLROOT/lib for the required symbols.
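As a rough sketch of what the manual flags look like on the CGO side
(the library list and install prefix below are illustrative, not the
exact set used; cgo does not expand variables like $MKLROOT, so a
concrete path has to be written out):

    package sd

    // Illustrative manual link line for oneMKL + SYCL, in the style of
    // the link line advisor's output. The path assumes a default oneAPI
    // install and the library list is an example, not the exact set used.

    // #cgo LDFLAGS: -L/opt/intel/oneapi/mkl/latest/lib -lmkl_sycl_blas -lmkl_intel_ilp64 -lmkl_tbb_thread -lmkl_core -lsycl -lOpenCL
    import "C"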
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(ci): Don't pass nproc to cmake
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* chore(sycl): Update oneAPI to 2025.1
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(sycl): Pass -fsycl flag as workaround
-fsycl should be set by llama.cpp's CMake files, but something goes
wrong and the flag does not appear to get added, so pass it explicitly
as a workaround.
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(build): Speed up the llama.cpp build by using all CPUs
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* Add a machine tag option and an extraUsage option; the extraUsage data path (grpc-server -> proto -> endpoint) is broken for now
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
* remove redundant timing fields and fix the broken timings output
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
* use the Machine-Tag middleware only if a tag is specified
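A minimal sketch of the conditional registration, assuming a Fiber app
(the helper and parameter names are hypothetical; only the Machine-Tag
header comes from the change itself):

    package main

    import "github.com/gofiber/fiber/v2"

    // registerMachineTag installs the response-header middleware only
    // when a tag is configured, so untagged deployments skip it entirely.
    func registerMachineTag(app *fiber.App, tag string) {
        if tag == "" {
            return // no tag specified: nothing to install
        }
        app.Use(func(c *fiber.Ctx) error {
            c.Set("Machine-Tag", tag) // stamp every response with the tag
            return c.Next()
        })
    }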
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
---------
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
* chore(deps): bump llama-cpp to ae8de6d50a09d49545e0afab2e50cc4acfb280e2
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(metal): metal file has moved
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add Get Token Metrics to the gRPC server
Signed-off-by: Siddharth More <siddimore@gmail.com>
* Expose LocalAI endpoint
Signed-off-by: Siddharth More <siddimore@gmail.com>
---------
Signed-off-by: Siddharth More <siddimore@gmail.com>
* Add CorrelationID to chat request
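As a rough sketch, assuming a Fiber handler and an X-Correlation-ID
request header (both the header name and the helper are hypothetical):
reuse the client's ID when one is sent, otherwise mint a fresh one:

    package main

    import (
        "github.com/gofiber/fiber/v2"
        "github.com/google/uuid"
    )

    // correlationID returns the client-supplied correlation ID for a
    // chat request, or generates one when the header is absent.
    func correlationID(c *fiber.Ctx) string {
        if id := c.Get("X-Correlation-ID"); id != "" {
            return id
        }
        return uuid.New().String()
    }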
Signed-off-by: Siddharth More <siddimore@gmail.com>
* remove get_token_metrics
Signed-off-by: Siddharth More <siddimore@gmail.com>
* Add CorrelationID to proto
Signed-off-by: Siddharth More <siddimore@gmail.com>
* fix correlation method name
Signed-off-by: Siddharth More <siddimore@gmail.com>
* Update core/http/endpoints/openai/chat.go
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>
* Update core/http/endpoints/openai/chat.go
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>
---------
Signed-off-by: Siddharth More <siddimore@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* feat(llama.cpp): add embeddings
Also enable embeddings by default for llama.cpp models
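A minimal sketch of the defaulting, with hypothetical config and
backend names: an embeddings flag left unset is treated as enabled for
llama.cpp models, while an explicit user choice is left alone:

    package main

    // ModelConfig stands in for the real model configuration; the field
    // and backend names here are illustrative.
    type ModelConfig struct {
        Backend    string
        Embeddings *bool // nil means the user did not set the flag
    }

    // defaultEmbeddings enables embeddings for llama.cpp models unless
    // the user set the flag explicitly, so an explicit false is kept.
    func defaultEmbeddings(cfg *ModelConfig) {
        if cfg.Backend == "llama-cpp" && cfg.Embeddings == nil {
            enabled := true
            cfg.Embeddings = &enabled
        }
    }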
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(Makefile): prepare llama.cpp sources only once
Otherwise we keep cloning llama.cpp for each of the variants
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not set embeddings to false
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* docs: add embeddings to the YAML config reference
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>