Reader · no content
No body text on file.
Open the original to read the full piece.
SubQ is introduced as a breakthrough LLM built on a fully sub-quadratic sparse-attention architecture (SSA) that supports a 12 million-token context window. Alexander Whedon claims it achieves 52× speed vs. FlashAttention at 1M tokens, costs under 5% of Opus (caption also said 10× cheaper than Opus 4.7), and requires nearly 1,000× less compute.
SubQ is presented as the first model built on a fully sub-quadratic sparse-attention architecture (SSA) and the first frontier model with a 12,000,000-token context window (announcement by Alexander Whedon, posted 2026-05-05).
Open the original to read the full piece.