Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon

(github.com)

164 points | by tatef 5 hours ago

71 comments