Pipeline-parallel LLM inference across GPUs on separate machines

(github.com)

4 points | by ngaut 10 hours ago

No comments yet.