The ds4 Metal Tensor kernel is fast on paper (ds4-bench here), but some logprob drift issues are still present.
In chat and coding sessions with pi mono, everything looks fine. The only way to check the sanity of this version is to run some evals!
Starting now! 🚀 I hope I don't see smoke coming out of my MacBook M5 Max 😂
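
For anyone wondering what "logprob drift" could look like as a number: a minimal sketch, assuming you can dump per-token logprobs for the same prompt and the same tokens from a reference path and from the new kernel. The `logprob_drift` helper and the sample values are hypothetical and only for illustration; this is not how ds4 or ds4-bench measure it.

```python
def logprob_drift(reference, candidate):
    """Return (mean_abs_diff, max_abs_diff) for two per-token logprob lists.

    Hypothetical helper: both lists are assumed to hold logprobs for the
    same token sequence, one from a reference backend and one from the
    kernel under test.
    """
    if len(reference) != len(candidate):
        raise ValueError("logprob lists must have the same length")
    if not reference:
        raise ValueError("logprob lists must not be empty")
    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    return sum(diffs) / len(diffs), max(diffs)


# Made-up values for illustration only, not real measurements.
ref = [-0.5, -1.0, -2.0]
new = [-0.5, -1.25, -1.5]
mean_diff, max_diff = logprob_drift(ref, new)
print(f"mean abs drift: {mean_diff}, max abs drift: {max_diff}")
```

A small mean with a large max would point at a few badly off tokens rather than uniform numerical noise, which is the kind of thing evals are more likely to surface than casual chat sessions.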