Towards Greater Leverage: Scaling Laws for Efficient MoE Language Models

(arxiv.org)

3 points | by Anon84 8 hours ago

No comments yet.