title: @alexocheema: Direct comparison of NVIDIA RTX 5090 to M3 Ultra 👀
Small models should always be faster on the 5090...
author: @alexocheema
contenttype: tweet
publication: Twitter/X
published: 2026-04-03T18:12:24+00:00
sourceurl: https://x.com/alexocheema/status/2040130311065886772
word_count: 81
Direct comparison of NVIDIA RTX 5090 to M3 Ultra 👀
Small models should always be faster on the 5090. The best perf for large models comes from using both together (more on that soon).
Using llama.cpp isn’t super fair, given that its performance is not great on Apple Silicon. MLX is better.
NVIDIA AI PC (@NVIDIAAIPC)
.@GoogleGemma 4 31B is up to 2.7X faster on RTX using llama.cpp.
Thanks to @ggerganov for working with us to make this model fast.
— https://nitter.net/NVIDIAAIPC/status/2039787452643131696#m