r/LocalLLaMA

Little Qwen 3.5 27B and Qwen 35B-A3B models did very well in my logical reasoning benchmark


title: Little Qwen 3.5 27B and Qwen 35B-A3B models did very well in my logical reasoning benchmark
author: u/fairydreaming
contenttype: redditpost
publication: r/LocalLLaMA
published: 2026-02-27T15:24:15+00:00
sourceurl: https://www.reddit.com/r/LocalLLaMA/comments/1rg9lli/littleqwen3527bandqwen35ba3bmodels_did/

Tested in lineage-bench. Results are here. It's amazing that models this small can reliably reason from hundreds of premises.

Link: https://i.redd.it/s1gze7y5g1mg1.png

Score: 143 | Comments: 31 | Subreddit: r/LocalLLaMA


Top Comments

u/Healthy-Nebula-3603 (15 pts):
Qwen 27B is at the reasoning level of Sonnet 4.5... that's insane.

I wouldn't have believed such a small model could be so smart if I hadn't seen it and tested it myself.

u/Roubbes (10 pts):
I wouldn't have believed I could run a smarter model on my GPU than what was SOTA at the time GPT-4 came out.

u/klop2031 (19 pts):
Seems like the 27B is better than the 122B. Interesting.

u/fairydreaming (9 pts):
By the way, I noticed that Artificial Analysis seems to corroborate this, with an Intelligence score of 42 for Qwen3.5 27B (Reasoning) and 37 for Qwen3.5 35B A3B (Reasoning). The next model of similar size is Seed-OSS-36B-Instruct (AFAIK it's a dense model as well), and it has an Intelligence score of only 25, so Qwen seems to have made huge progress in the intelligence of small models, at least as measured by existing benchmarks.

u/cookieGaboo24 (9 pts):
Well, that settles it. If 35B-A3B is on a similar level to Gemini 3 Flash, that's all I need, considering other benchmarks point to the same conclusion. Qwen really did great this time. Great test, many thanks and best regards.

u/dubesor86 (5 pts):
I think the differentiation between the top performers and the models at the lower end, around rank 30 or so, is quite low. Maybe skip lineages <64?