* fix(llama-cpp): consistently select fallback
We didn't took in consideration the case where the host has the CPU
flagset, but the binaries were not actually present in the asset dir.
This made possible for instance for models that specified the llama-cpp
backend directly in the config to not eventually pick-up the fallback
binary in case the optimized binaries were not present.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: adjust and simplify selection
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: move failure recovery to BackendLoader()
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* comments
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* minor fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(refactor): track internally started models by ID
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Just extend options, no need to copy
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Improve debugging for rerankers failures
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Simplify model loading with rerankers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Be more consistent when generating model options
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Uncommitted code
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make deleteProcess more idiomatic
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt CLI for sound generation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixup threads definition
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Handle corner case where c.Seed is nil
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Consistently use ModelOptions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt new code to refactoring
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
* feat: allow to run parallel requests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>