Transformers were first introduced in the 2017 paper [“Attention Is All You Need”](https://arxiv.org/abs/1706.03762) by a team at Google.
The paper proposed a new architecture for language models called a transformer.
The attention mechanism in [[Transformers]] overcame the memory limitations of [[Recurrent Neural Networks - RNNs]] and enabled massive improvements: instead of compressing a sequence into a hidden state one token at a time, every position can attend to every other position directly, which makes much longer text tractable.
![[Pasted image 20230219144811.png]]
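To make the idea concrete, here is a minimal sketch of the scaled dot-product attention at the core of the paper, written with NumPy; the toy sequence length and embedding size are hypothetical, chosen just for illustration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention from "Attention Is All You Need".

    Q, K: arrays of shape (seq_len, d_k); V: (seq_len, d_v).
    Every position attends to every other position directly,
    so there is no recurrent bottleneck over long sequences.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # pairwise similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                        # weighted sum of value vectors

# Toy self-attention: 4 tokens with 8-dimensional embeddings (hypothetical sizes)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)   # self-attention: Q = K = V
print(out.shape)                              # (4, 8)
```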
Enter [[Generative Pretrained Transformers - GPT]]