"Introducing a breakthrough new technique for sub-quadratic attention, making long-context LLMs 10x cheaper without sacrificing performance"
Me:
dr. jack morris (@jxmnop)
"1M context" models after 100k tokens
— https://nitter.net/jxmnop/status/2051708683521040476#m