Ensemble-Hub — Tandem: Collaborative LLM–SLM Reasoning

Date: April 14, 2026

Ensemble-Hub is the reference implementation of Tandem (ACL 2026 Findings), a collaborative LLM–SLM reasoning framework. The LLM acts as a strategic mentor, generating compact reasoning insights that follow the GPRA schema (Goal, Planning, Retrieval, Action), while a smaller, more efficient SLM carries out the full reasoning. A lightweight cost-aware classifier decides when sufficient guidance has accumulated, which enables early stopping, and three progressive effort levels (low / medium / high) reserve deeper support for harder problems. Classifiers trained on mathematical reasoning transfer to code generation without retraining. On MATH, Tandem achieves +2.56% accuracy over a standalone 32B LLM while using only 59% of its compute, with a roughly 40% overall cost reduction, and it works with both open-source models (DeepSeek) and API models (GPT-4o).
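
The control flow described above can be pictured with a small sketch. This is not the Ensemble-Hub API: the names `tandem_solve`, `TandemResult`, `stub_mentor`, and `stub_executor` are hypothetical, the three callables stand in for the LLM mentor, the cost-aware classifier, and the SLM, and the escalation order (try low, then medium, then high, stopping as soon as the classifier is satisfied) is an assumption about how the effort levels and early stopping might fit together.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Effort levels named in the text; escalating through them in order is an assumption.
EFFORT_LEVELS: Tuple[str, ...] = ("low", "medium", "high")

# Hypothetical interfaces for the three components.
Mentor = Callable[[str, str, List[str]], str]        # (problem, level, insights so far) -> insight
SufficiencyCheck = Callable[[str, List[str]], bool]  # (problem, insights) -> stop mentoring?
Executor = Callable[[str, List[str]], str]           # (problem, insights) -> final answer


@dataclass
class TandemResult:
    answer: str
    level_used: str
    insights: List[str]


def tandem_solve(
    problem: str,
    mentor: Mentor,
    is_sufficient: SufficiencyCheck,
    executor: Executor,
) -> TandemResult:
    """Collect mentor insights level by level, stop early, then let the executor answer."""
    insights: List[str] = []
    level_used = EFFORT_LEVELS[0]
    for level in EFFORT_LEVELS:
        level_used = level
        # The large model contributes one compact insight at this effort level.
        insights.append(mentor(problem, level, insights))
        # The classifier decides whether guidance is already sufficient.
        if is_sufficient(problem, insights):
            break
    # The small model performs the full reasoning with the accumulated guidance.
    return TandemResult(executor(problem, insights), level_used, insights)


def stub_mentor(problem: str, level: str, insights: List[str]) -> str:
    # Placeholder for an LLM call that would return a GPRA-structured insight.
    return f"[{level}] Goal/Planning/Retrieval/Action hint for: {problem}"


def stub_executor(problem: str, insights: List[str]) -> str:
    # Placeholder for an SLM call that would reason with the insights in its prompt.
    return f"answer to {problem!r} using {len(insights)} insight(s)"


if __name__ == "__main__":
    # Toy stopping rule: treat two insights as sufficient guidance.
    result = tandem_solve("2 + 2", stub_mentor, lambda p, ins: len(ins) >= 2, stub_executor)
    print(result.level_used, result.answer)
```

Passing the components in as plain callables keeps the sketch independent of any particular model backend, so the same loop could wrap a locally hosted model or an API client.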


Fu Zichuan
