7 entries with this tag
In 2020, OpenAI scaled GPT-2 by over 100×—to 175 billion parameters—and discovered something unexpected: the model could perform tasks it was never trained on, just by reading a few examples in its prompt. 'Language Models are Few-Shot Learners' didn't just set new benchmarks. It changed what we thought language models could do.
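To make the few-shot idea concrete, here is a toy prompt in the style the paper popularised (the English-to-French pairs mirror the example shown in the paper itself). The "training" happens entirely inside the prompt; no weights are updated.

```python
# A few-shot, in-context learning prompt: the worked examples live in the
# prompt, and the model is simply asked to continue the pattern.
prompt = """Translate English to French.

sea otter => loutre de mer
peppermint => menthe poivrée
cheese =>"""

# Sampling a continuation from a sufficiently large language model typically
# yields "fromage", even though the model was never fine-tuned for translation.
```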
What if you could build a model with 671 billion parameters but only pay to run 37 billion of them on any given token? Mixture of Experts is the architectural trick behind GPT-4, Mixtral, and DeepSeek, models that are simultaneously massive and efficient. Three landmark papers explain how.
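To see the trade-off mechanically, here is a minimal, illustrative top-k routing layer in PyTorch. It is not taken from any of those models; the class name, dimensions, and 8-expert/top-2 configuration are invented for the sketch. The point is that the router activates only 2 of the 8 expert networks per token, so most of the layer's parameters sit idle on any given input.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Toy sparse mixture-of-experts feed-forward layer with top-k routing."""

    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # One independent feed-forward "expert" network per slot.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(n_experts)
        ])
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):                        # x: (n_tokens, d_model)
        logits = self.router(x)                  # (n_tokens, n_experts)
        weights, chosen = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalise over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e      # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 512)
print(TopKMoELayer()(tokens).shape)  # torch.Size([16, 512]); only 2 of 8 experts ran per token
```

DeepSeek's 671B-total / 37B-active split comes from the same mechanism, just applied at far larger scale with many more, finer-grained experts.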
Two landmark papers revealed that AI model performance follows predictable mathematical laws—and that the industry was training models wrong. The Chinchilla paper showed that a 70B model trained on more data could outperform models 4× its size, reshaping how every major AI lab builds models today.
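Here is a back-of-the-envelope version of the Chinchilla argument, using two standard rules of thumb rather than the papers' exact fitted constants: training compute is roughly 6 × parameters × tokens, and the compute-optimal data budget is roughly 20 tokens per parameter.

```python
def training_flops(params, tokens):
    # Common approximation: ~6 FLOPs per parameter per training token.
    return 6 * params * tokens

def chinchilla_optimal_tokens(params):
    # Rule-of-thumb reading of the Chinchilla result: ~20 tokens per parameter.
    return 20 * params

gopher_params, gopher_tokens = 280e9, 300e9   # Gopher: big model, ~300B training tokens
chinchilla_params = 70e9                      # Chinchilla: 4x smaller
chinchilla_tokens = chinchilla_optimal_tokens(chinchilla_params)  # ~1.4T tokens

print(f"Gopher-style run:     {training_flops(gopher_params, gopher_tokens):.2e} FLOPs")
print(f"Chinchilla-style run: {training_flops(chinchilla_params, chinchilla_tokens):.2e} FLOPs")
# ~5.0e23 vs ~5.9e23 FLOPs: a comparable compute budget, but the smaller model
# sees ~1.4T tokens instead of 300B, which is how it ends up ahead of models
# 4x its size.
```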
A beginner-friendly explanation of GPT-2 (2019), the paper that showed AI could write coherent, creative text by simply predicting the next word. Part 3 of our AI Papers Explained series.
A beginner-friendly explanation of BERT (Bidirectional Encoder Representations from Transformers), the 2018 paper that taught AI to understand language by reading in both directions. Follow-up to our 'Attention Is All You Need' explainer.
A beginner-friendly explanation of the groundbreaking 'Attention Is All You Need' paper that introduced Transformers. Learn what attention mechanisms are, why they matter, and how they power modern AI like ChatGPT.
A professional assessment of frontier AI capabilities across text, speech, image, video, and multimodal domains as of March 2026, with performance metrics and source references.