An implementation of Performer, a linear attention-based transformer, in Pytorch
Add relative positional encoding (previously impossible) for Performer. Dramatically better performance