Transformers were first introduced in a 2017 paper by a team at Google titled [“Attention Is All You Need.”](https://arxiv.org/abs/1706.03762) The paper proposed a new architecture for language models called the transformer. The [[Transformers]] architecture overcame the memory limitations of [[Recurrent Neural Networks - RNNs]] and enabled massive improvements: its attention mechanism let the model process much longer stretches of text (a minimal sketch of that computation appears below).

![[Pasted image 20230219144811.png]]

Enter [[Generative Pretrained Transformers - GPT]]
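
To give a rough sense of what that attention mechanism computes, here is a minimal NumPy sketch of the scaled dot-product attention described in the paper; the shapes, variable names, and toy data are illustrative assumptions, not taken from this note.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention, as in "Attention Is All You Need".

    Q, K: (seq_len, d_k) query and key matrices; V: (seq_len, d_v) values.
    Every position attends to every other position in one step, which is
    what lets the model relate distant parts of a long text.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) similarity scores
    # Softmax over the key dimension turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted mix of the values

# Tiny example: 4 tokens with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```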