Transformer Positional Encoding & Tokenization Study

Investigation into the effects of positional encoding, tokenization, and model size on perplexity.
