FlashAttention-T: Towards Tensorized Attention

(dl.acm.org)

72 points | by matt_d 5 hours ago

33 comments