Shimmy v2.0

The first pure-Rust GGUF inference engine. No C. No Python.

4.5 (10,000 reviews)

Paid · from US$20.00/month

Launched in 2026

About

Two 5,200-token runs. Same model. SHA-identical byte output. That's a proof, not a benchmark.

Shimmy v2.0 ships Airframe: pure-Rust GPU inference with hand-written WGSL compute shaders. No llama.cpp. No C. No Python. No CUDA. The first production GGUF engine that is Rust all the way down, including the GPU shaders.

Run TinyLlama, Llama 3.2, Phi, and DeepSeek from GGUF. Drop-in for AnythingLLM, Open WebUI, Cursor, and Zed via the OpenAI or Ollama API. Runs on Windows, macOS, and Linux.

Install: cargo install shimmy
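
The reproducibility claim above (two runs, identical bytes) is something a reader can check independently. The following is a minimal sketch, not Shimmy's own test harness: it assumes each run's generated text has been saved to a file, and the names `first_divergence` and `compare_runs` are hypothetical. It compares raw bytes instead of SHA digests because the Rust standard library has no SHA-256 implementation; identical bytes necessarily produce identical hashes.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the offset of the first byte where the two outputs differ,
/// or `None` when they are byte-for-byte identical.
/// A length mismatch counts as a difference at the end of the shorter input.
pub fn first_divergence(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b.iter()).position(|(x, y)| x != y) {
        Some(i) => Some(i),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

/// Reads two saved generation outputs and compares them.
/// The paths are placeholders (an assumption): save each run's output
/// to a file in whatever way suits your setup.
pub fn compare_runs(run_a: &Path, run_b: &Path) -> io::Result<Option<usize>> {
    let a = fs::read(run_a)?;
    let b = fs::read(run_b)?;
    Ok(first_divergence(&a, &b))
}
```

A result of `None` from `compare_runs` means the two runs matched exactly. A `Some(offset)` result gives the first byte at which they diverged, which is more useful for debugging than a bare hash mismatch.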

Pros

+Reproducible output shown across two 5,200-token runs
+SHA-identical byte output from the same model
+GPU inference with hand-written WGSL compute shaders

Cons

−Reproducibility evidence is limited to two 5,200-token runs
−Requires Rust knowledge for setup
