chore(model gallery): add pku-ds-lab_fairyr1-32b (#5517)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto 2025-05-29 09:43:45 +02:00 committed by GitHub
parent f257bf8d14
commit 39292407a1
GPG key ID: B5690EEEBB952194


@@ -10593,6 +10593,26 @@
    - filename: PKU-DS-LAB_FairyR1-14B-Preview-Q4_K_M.gguf
      sha256: c082eb3312cb5343979c95aad3cdf8e96abd91e3f0cb15e0083b5d7d94d7a9f8
      uri: huggingface://bartowski/PKU-DS-LAB_FairyR1-14B-Preview-GGUF/PKU-DS-LAB_FairyR1-14B-Preview-Q4_K_M.gguf
- !!merge <<: *deepseek-r1
  name: "pku-ds-lab_fairyr1-32b"
  urls:
    - https://huggingface.co/PKU-DS-LAB/FairyR1-32B
    - https://huggingface.co/bartowski/PKU-DS-LAB_FairyR1-32B-GGUF
  description: |
    FairyR1-32B is a highly efficient large language model (LLM) that matches or exceeds larger models on select tasks despite using only ~5% of their parameters. Built atop the DeepSeek-R1-Distill-Qwen-32B base, FairyR1-32B leverages a novel “distill-and-merge” pipeline, combining task-focused fine-tuning with model-merging techniques to deliver competitive performance with drastically reduced size and inference cost. This project was funded by NSFC, Grant 624B2005.
    The FairyR1 model represents a further exploration of our earlier work TinyR1, retaining the core “Branch-Merge Distillation” approach while introducing refinements in data processing and model architecture.
    In this effort, we overhauled the distillation data pipeline: raw examples from datasets such as AIMO/NuminaMath-1.5 for mathematics and OpenThoughts-114k for code were first passed through multiple 'teacher' models to generate candidate answers. These candidates were then carefully selected, restructured, and refined, especially for the chain-of-thought (CoT). Subsequently, we applied multi-stage filtering, including automated correctness checks for math problems and length-based selection (2K–8K tokens for math samples, 4K–8K tokens for code samples). This yielded two focused training sets of roughly 6.6K math examples and 3.8K code examples.
    On the modeling side, rather than training three separate specialists as before, we limited our scope to just two domain experts (math and code), each trained independently under identical hyperparameters (e.g., learning rate and batch size) for about five epochs. We then fused these experts into a single 32B-parameter model using the AcreeFusion tool. By streamlining both the data distillation workflow and the specialist-model merging process, FairyR1 achieves task-competitive results with only a fraction of the parameters and computational cost of much larger models.
  overrides:
    parameters:
      model: PKU-DS-LAB_FairyR1-32B-Q4_K_M.gguf
  files:
    - filename: PKU-DS-LAB_FairyR1-32B-Q4_K_M.gguf
      sha256: bbfe6602b9d4f22da36090a4c77da0138c44daa4ffb01150d0370f6965503e65
      uri: huggingface://bartowski/PKU-DS-LAB_FairyR1-32B-GGUF/PKU-DS-LAB_FairyR1-32B-Q4_K_M.gguf
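The length-based selection described in the entry (keep math samples in the 2K–8K token window, code samples in 4K–8K) can be sketched in a few lines. This is an illustrative sketch only: the whitespace tokenizer, the `cot` field name, and the sample structure are assumptions, not details of the actual FairyR1 pipeline.

```python
# Sketch of a length-window filter like the one described in the entry.
# The real pipeline's tokenizer is unspecified; whitespace splitting is a
# stand-in, and the sample dicts with a "cot" field are hypothetical.

LENGTH_WINDOWS = {
    "math": (2_000, 8_000),  # 2K-8K tokens for math samples
    "code": (4_000, 8_000),  # 4K-8K tokens for code samples
}

def token_count(text: str) -> int:
    # Placeholder tokenizer: split on whitespace.
    return len(text.split())

def select_by_length(samples: list, domain: str) -> list:
    """Keep only samples whose chain-of-thought length falls in the domain's window."""
    lo, hi = LENGTH_WINDOWS[domain]
    return [s for s in samples if lo <= token_count(s["cot"]) <= hi]
```

A sample with a 3,000-token chain-of-thought would pass the math filter but fail the code filter, since the code window starts at 4K.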
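Each `files` entry pairs a GGUF artifact with a sha256 checksum. A downloaded file can be checked against that checksum with a short script; a minimal sketch (the file path in the usage comment is hypothetical):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so multi-GB GGUF files never load fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical usage, comparing against the checksum from the gallery entry:
# expected = "bbfe6602b9d4f22da36090a4c77da0138c44daa4ffb01150d0370f6965503e65"
# assert sha256_of("PKU-DS-LAB_FairyR1-32B-Q4_K_M.gguf") == expected
```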
- &qwen2
  url: "github:mudler/LocalAI/gallery/chatml.yaml@master" ## Start QWEN2
  name: "qwen2-7b-instruct"