Out-of-Distribution Generalization in Transformers via Latent Space Reasoning

(arxiv.org)

9 points | by marojejian a day ago

1 comment