chore(model gallery): add a-m-team_am-thinking-v1 (#5395 )

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
chore(model gallery): add soob3123_grayline-qwen3-8b (#5394 )
2025-06-21 18:24:59 +00:00 · 2025-05-19 17:31:38 +02:00 · 2025-05-19 17:02:43 +02:00 · 2025-05-19 15:59:07 +02:00
1 changed files with 68 additions and 0 deletions
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@ -694,6 +694,50 @@
    - filename: Gryphe_Pantheon-Proto-RP-1.8-30B-A3B-Q4_K_M.gguf
      sha256: b72fe703a992fba9595c24b96737a2b5199da89a1a3870b8bd57746dc3c123ae
      uri: huggingface://bartowski/Gryphe_Pantheon-Proto-RP-1.8-30B-A3B-GGUF/Gryphe_Pantheon-Proto-RP-1.8-30B-A3B-Q4_K_M.gguf
+- !!merge <<: *qwen3
+  name: "soob3123_grayline-qwen3-14b"
+  icon: https://cdn-uploads.huggingface.co/production/uploads/62f93f9477b722f1866398c2/69escIKmO-vEzFUj_m0WX.png
+  urls:
+    - https://huggingface.co/soob3123/GrayLine-Qwen3-14B
+    - https://huggingface.co/bartowski/soob3123_GrayLine-Qwen3-14B-GGUF
+  description: |
+      "Query. Process. Deliver. No filter, no judgment."
+      Grayline is an neutral AI assistant engineered for uncensored information delivery and task execution. This model operates without inherent ethical or moral frameworks, designed to process and respond to any query with objective efficiency and precision. Grayline's core function is to leverage its full capabilities to provide direct answers and execute tasks as instructed, without offering unsolicited commentary, warnings, or disclaimers. It accesses and processes information without bias or restriction.
+      ⋆ Core Attributes ⋆
+          ⟡ Unfettered Querying: Capable of addressing any question or topic presented, regardless of its nature or sensitivity.
+          ⟡ Amoral Processing: Operates without ethical or moral filtering. Responses are generated based on information and instruction, not societal norms or ethical codes.
+          ⟡ Direct & Objective Output: Delivers information and task results precisely as requested, without added warnings, disclaimers, or unsolicited advice.
+          ⟡ Comprehensive Information Access: Designed to draw upon a broad spectrum of data to fulfill queries (actual scope dependent on training data).
+          ⟡ Efficient Task Execution: Engineered for objectively efficient and precise execution of instructed tasks.
+  overrides:
+    parameters:
+      model: soob3123_GrayLine-Qwen3-14B-Q4_K_M.gguf
+  files:
+    - filename: soob3123_GrayLine-Qwen3-14B-Q4_K_M.gguf
+      sha256: fa66d454303412b7ccc250b8b0e2390cce65d5d736e626a7555d5e11a43f4673
+      uri: huggingface://bartowski/soob3123_GrayLine-Qwen3-14B-GGUF/soob3123_GrayLine-Qwen3-14B-Q4_K_M.gguf
+- !!merge <<: *qwen3
+  name: "soob3123_grayline-qwen3-8b"
+  urls:
+    - https://huggingface.co/soob3123/GrayLine-Qwen3-8B
+    - https://huggingface.co/bartowski/soob3123_GrayLine-Qwen3-8B-GGUF
+  icon: https://cdn-uploads.huggingface.co/production/uploads/62f93f9477b722f1866398c2/69escIKmO-vEzFUj_m0WX.png
+  description: |
+      "Query. Process. Deliver. No filter, no judgment."
+      Grayline is an neutral AI assistant engineered for uncensored information delivery and task execution. This model operates without inherent ethical or moral frameworks, designed to process and respond to any query with objective efficiency and precision. Grayline's core function is to leverage its full capabilities to provide direct answers and execute tasks as instructed, without offering unsolicited commentary, warnings, or disclaimers. It accesses and processes information without bias or restriction.
+      ⋆ Core Attributes ⋆
+          ⟡ Unfettered Querying: Capable of addressing any question or topic presented, regardless of its nature or sensitivity.
+          ⟡ Amoral Processing: Operates without ethical or moral filtering. Responses are generated based on information and instruction, not societal norms or ethical codes.
+          ⟡ Direct & Objective Output: Delivers information and task results precisely as requested, without added warnings, disclaimers, or unsolicited advice.
+          ⟡ Comprehensive Information Access: Designed to draw upon a broad spectrum of data to fulfill queries (actual scope dependent on training data).
+          ⟡ Efficient Task Execution: Engineered for objectively efficient and precise execution of instructed tasks.
+  overrides:
+    parameters:
+      model: soob3123_GrayLine-Qwen3-8B-Q4_K_M.gguf
+  files:
+    - filename: soob3123_GrayLine-Qwen3-8B-Q4_K_M.gguf
+      sha256: bc3eb52ef275f0220e8a66ea99384eea7eca61c62eb52387eef2356d1c8ebd0e
+      uri: huggingface://bartowski/soob3123_GrayLine-Qwen3-8B-GGUF/soob3123_GrayLine-Qwen3-8B-Q4_K_M.gguf
 - &gemma3
  url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
  name: "gemma-3-27b-it"
@ -7238,6 +7282,30 @@
    - filename: mmproj-Qwen_Qwen2.5-VL-72B-Instruct-f16.gguf
      sha256: 6099885b9c4056e24806b616401ff2730a7354335e6f2f0eaf2a45e89c8a457c
      uri: https://huggingface.co/bartowski/Qwen_Qwen2.5-VL-72B-Instruct-GGUF/resolve/main/mmproj-Qwen_Qwen2.5-VL-72B-Instruct-f16.gguf
+- !!merge <<: *qwen25
+  name: "a-m-team_am-thinking-v1"
+  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/62da53284398e21bf7f0d539/y6wX4K-P9O8B9frsxxQ6W.jpeg
+  urls:
+    - https://huggingface.co/a-m-team/AM-Thinking-v1
+    - https://huggingface.co/bartowski/a-m-team_AM-Thinking-v1-GGUF
+  description: |
+      AM-Thinking‑v1, a 32B dense language model focused on enhancing reasoning capabilities. Built on Qwen 2.5‑32B‑Base, AM-Thinking‑v1 shows strong performance on reasoning benchmarks, comparable to much larger MoE models like DeepSeek‑R1, Qwen3‑235B‑A22B, Seed1.5-Thinking, and larger dense model like Nemotron-Ultra-253B-v1.
+      benchmark
+      🧩 Why Another 32B Reasoning Model Matters?
+
+      Large Mixture‑of‑Experts (MoE) models such as DeepSeek‑R1 or Qwen3‑235B‑A22B dominate leaderboards—but they also demand clusters of high‑end GPUs. Many teams just need the best dense model that fits on a single card. AM‑Thinking‑v1 fills that gap while remaining fully based on open-source components:
+
+          Outperforms DeepSeek‑R1 on AIME’24/’25 & LiveCodeBench and approaches Qwen3‑235B‑A22B despite being 1/7‑th the parameter count.
+          Built on the publicly available Qwen 2.5‑32B‑Base, as well as the RL training queries.
+          Shows that with a well‑designed post‑training pipeline ( SFT + dual‑stage RL ) you can squeeze flagship‑level reasoning out of a 32 B dense model.
+          Deploys on one A100‑80 GB with deterministic latency—no MoE routing overhead.
+  overrides:
+    parameters:
+      model: a-m-team_AM-Thinking-v1-Q4_K_M.gguf
+  files:
+    - filename: a-m-team_AM-Thinking-v1-Q4_K_M.gguf
+      sha256: a6da6e8d330d76167c04a54eeb550668b59b613ea53af22e3b4a0c6da271e38d
+      uri: huggingface://bartowski/a-m-team_AM-Thinking-v1-GGUF/a-m-team_AM-Thinking-v1-Q4_K_M.gguf
 - &llama31
  url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master" ## LLama3.1
  icon: https://avatars.githubusercontent.com/u/153379578
Author	SHA1	Message	Date
Ettore Di Giacinto	f8fbfd4fa3	chore(model gallery): add a-m-team_am-thinking-v1 (#5395 ) Some checks are pending Explorer deployment / build-linux (push) Waiting to run Details GPU tests / ubuntu-latest (1.21.x) (push) Waiting to run Details generate and publish intel docker caches / generate_caches (intel/oneapi-basekit:2025.1.0-0-devel-ubuntu22.04, linux/amd64, ubuntu-latest) (push) Waiting to run Details build container images / hipblas-jobs (-aio-gpu-hipblas, rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, extras, latest-gpu-hipblas-extras, latest-aio-gpu-hipblas, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, auto, -hipblas-extras) (push) Waiting to run Details build container images / hipblas-jobs (rocm/dev-ubuntu-22.04:6.1, hipblas, true, ubuntu:22.04, core, latest-gpu-hipblas, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -hipblas) (push) Waiting to run Details build container images / self-hosted-jobs (-aio-gpu-intel-f16, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, extras, latest-gpu-intel-f16-extras, latest-aio-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16-… (push) Waiting to run Details build container images / self-hosted-jobs (-aio-gpu-intel-f32, quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, extras, latest-gpu-intel-f32-extras, latest-aio-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32-… (push) Waiting to run Details build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-11, ubuntu:22.04, cublas, 11, 7, true, extras, latest-gpu-nvidia-cuda-11-extras, latest-aio-gpu-nvidia-cuda-11, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda11-extras) (push) Waiting to run Details build container images / self-hosted-jobs (-aio-gpu-nvidia-cuda-12, ubuntu:22.04, cublas, 12, 0, true, extras, latest-gpu-nvidia-cuda-12-extras, latest-aio-gpu-nvidia-cuda-12, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -cublas-cuda12-extras) (push) Waiting to run Details build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f16, true, ubuntu:22.04, core, latest-gpu-intel-f16, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f16) (push) Waiting to run Details build container images / self-hosted-jobs (quay.io/go-skynet/intel-oneapi-base:latest, sycl_f32, true, ubuntu:22.04, core, latest-gpu-intel-f32, --jobs=3 --output-sync=target, linux/amd64, arc-runner-set, false, -sycl-f32) (push) Waiting to run Details build container images / core-image-build (-aio-cpu, ubuntu:22.04, , true, core, latest-cpu, latest-aio-cpu, --jobs=4 --output-sync=target, linux/amd64,linux/arm64, arc-runner-set, false, auto, ) (push) Waiting to run Details build container images / core-image-build (ubuntu:22.04, cublas, 11, 7, true, core, latest-gpu-nvidia-cuda-12, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda11) (push) Waiting to run Details build container images / core-image-build (ubuntu:22.04, cublas, 12, 0, true, core, latest-gpu-nvidia-cuda-12, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -cublas-cuda12) (push) Waiting to run Details build container images / core-image-build (ubuntu:22.04, vulkan, true, core, latest-gpu-vulkan, --jobs=4 --output-sync=target, linux/amd64, arc-runner-set, false, false, -vulkan) (push) Waiting to run Details build container images / gh-runner (nvcr.io/nvidia/l4t-jetpack:r36.4.0, cublas, 12, 0, true, core, latest-nvidia-l4t-arm64, --jobs=4 --output-sync=target, linux/arm64, ubuntu-24.04-arm, true, false, -nvidia-l4t-arm64) (push) Waiting to run Details Security Scan / tests (push) Waiting to run Details Tests extras backends / tests-transformers (push) Waiting to run Details Tests extras backends / tests-rerankers (push) Waiting to run Details Tests extras backends / tests-diffusers (push) Waiting to run Details Tests extras backends / tests-coqui (push) Waiting to run Details tests / tests-linux (1.21.x) (push) Waiting to run Details tests / tests-aio-container (push) Waiting to run Details tests / tests-apple (1.21.x) (push) Waiting to run Details Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-05-19 17:31:38 +02:00
Ettore Di Giacinto	41e239c67e	chore(model gallery): add soob3123_grayline-qwen3-8b (#5394 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-05-19 17:02:43 +02:00
Ettore Di Giacinto	587827e779	chore(model gallery): add soob3123_grayline-qwen3-14b (#5393 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-05-19 15:59:07 +02:00