Achieving 3X speedups on Google TPUs with diffusion-style speculative decoding

(developers.googleblog.com)

4 points | by simonpure 3 hours ago

No comments yet.