In 2020, OpenAI scaled its GPT-2 architecture up by more than 100×, to 175 billion parameters, and discovered something unexpected: the model could perform tasks it was never explicitly trained on, just by reading a few examples in its prompt. 'Language Models are Few-Shot Learners' didn't just set new benchmarks; it changed what we thought language models could do.
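To make "reading a few examples in its prompt" concrete, here is a minimal sketch of a few-shot prompt in the style of the paper's English-to-French demonstration (the exact translation pairs here are illustrative, not quoted from the paper's data). The model sees a few worked examples and is asked to continue the pattern, with no gradient updates:

```python
# A minimal sketch of few-shot prompting as described in the GPT-3 paper:
# the task is specified entirely in the prompt via a few worked examples,
# and the model is expected to continue the pattern without any fine-tuning.

few_shot_prompt = """Translate English to French:

sea otter => loutre de mer
peppermint => menthe poivrée
plush giraffe => girafe en peluche
cheese =>"""

# A model like GPT-3 would be expected to complete this with "fromage",
# conditioned only on the examples above (in-context learning).
print(few_shot_prompt)
```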
A beginner-friendly explanation of the groundbreaking 'Attention Is All You Need' paper that introduced the Transformer architecture. Learn what attention mechanisms are, why they matter, and how they power modern AI systems like ChatGPT.