
#training

4 entries with this tag

🔬 research · 2026-04-01T09:35:00.000Z

Mixture of Experts: How AI Learned to Cheat the Scaling Laws

What if you could have a model with 671 billion parameters but only pay to run 37 billion? Mixture of Experts is the architecture trick behind Mixtral and DeepSeek, and reportedly behind GPT-4 — models that are simultaneously massive and efficient. Three landmark papers explain how.

#ai #scaling #architecture #training #research
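The "671B stored, 37B active" headline follows from sparse routing: every token passes through the shared layers but only a handful of experts. A toy sketch of that arithmetic, with illustrative parameter splits chosen to roughly reproduce the headline figures (not the real DeepSeek architecture breakdown):

```python
# Sparse MoE capacity vs. per-token cost (toy arithmetic).
# Expert/shared sizes below are assumptions picked to roughly match
# the 671B-total / 37B-active headline, not actual model internals.

def moe_param_counts(n_experts, top_k, expert_params, shared_params):
    """Return (total parameters stored, parameters active per token)."""
    total = shared_params + n_experts * expert_params
    active = shared_params + top_k * expert_params
    return total, active

total, active = moe_param_counts(
    n_experts=256,            # experts per MoE layer (assumed)
    top_k=8,                  # experts each token is routed to (assumed)
    expert_params=2.557e9,    # parameters per expert (assumed)
    shared_params=16.5e9,     # attention + dense layers every token uses (assumed)
)
print(f"total: {total/1e9:.0f}B, active per token: {active/1e9:.0f}B")
# → total: 671B, active per token: 37B
```

The point of the sketch: total parameters grow with the number of experts, but per-token compute only grows with `top_k`, so capacity and cost decouple.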
🔬 research · 2026-03-31T11:15:00.000Z

Scaling Laws: Why Bigger Isn't Always Better

Two landmark papers revealed that AI model performance follows predictable mathematical laws—and that the industry was training models wrong. The Chinchilla paper showed that a 70B model trained on more data could outperform models 4× its size, reshaping how every major AI lab builds models today.

#ai #scaling #training #compute #research
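Chinchilla's compute-optimal recipe is often summarized as roughly 20 training tokens per parameter, with training compute approximated as C ≈ 6·N·D FLOPs. A quick sketch of that rule of thumb applied to Chinchilla's own 70B configuration:

```python
# Chinchilla rule of thumb (sketch): compute-optimal training uses
# ~20 tokens per parameter; training compute is approximated as
# C ≈ 6 * N * D FLOPs for N parameters and D tokens.

def compute_optimal_tokens(n_params, tokens_per_param=20):
    """Approximate compute-optimal training tokens for a given model size."""
    return n_params * tokens_per_param

def training_flops(n_params, n_tokens):
    """Standard 6*N*D estimate of total training compute."""
    return 6 * n_params * n_tokens

n = 70e9                          # Chinchilla: 70B parameters
d = compute_optimal_tokens(n)     # ≈ 1.4 trillion tokens
print(f"tokens: {d/1e12:.1f}T, compute: {training_flops(n, d):.1e} FLOPs")
# → tokens: 1.4T, compute: 5.9e+23 FLOPs
```

Under this rule, a 70B model wants ~1.4T tokens — far more data than the larger models of the time were trained on, which is why it could outperform them at a fraction of the size.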
🔬 research · 2026-03-30T08:50:00.000Z

InstructGPT: How AI Learned What Humans Actually Want

The paper behind ChatGPT. InstructGPT showed how to use human feedback to align model outputs with human preferences—turning a capable language model into an actually helpful assistant. This is reinforcement learning from human feedback (RLHF) made real.

#ai #rlhf #alignment #training #reward
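At the heart of the RLHF pipeline is a reward model trained on human preference pairs: it should score the response humans chose higher than the one they rejected. A minimal sketch of that pairwise objective (the scalar rewards here are made-up inputs, not real model scores):

```python
import math

# InstructGPT-style reward-model objective (sketch): for a human preference
# pair, minimize -log sigmoid(r_chosen - r_rejected), which pushes the
# reward of the preferred response above the rejected one.

def preference_loss(r_chosen, r_rejected):
    """Pairwise ranking loss on two scalar reward scores."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

print(preference_loss(2.0, 0.0))  # correct ordering, margin 2 → ≈ 0.13
print(preference_loss(0.0, 2.0))  # wrong ordering → ≈ 2.13
```

The trained reward model then supplies the signal that a policy-gradient step (PPO in the paper) optimizes against.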
🔬 research · 2026-03-30T08:30:00.000Z

FLAN: How AI Learned to Follow Instructions

The paper that bridged pretraining and ChatGPT. Instruction tuning showed how a simple format—describing tasks in natural language—could make models dramatically better at understanding and following what you ask them to do.

#ai #transformers #fine-tuning #nlp #training
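"Describing tasks as natural language" means rewriting a labeled dataset example as an instruction with answer options. A sketch of that idea for an entailment example (illustrative wording, not FLAN's exact templates):

```python
# Instruction-tuning template (illustrative, not FLAN's exact wording):
# turn a raw NLI example into a natural-language prompt with answer options,
# so the model learns the task from the instruction rather than a task ID.

def to_instruction(premise, hypothesis, options):
    """Render one entailment example as an instruction-style prompt."""
    return (
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}\n"
        "Does the premise entail the hypothesis?\n"
        f"OPTIONS: {', '.join(options)}"
    )

prompt = to_instruction(
    "The dog is sleeping on the porch.",
    "An animal is resting.",
    ["yes", "no", "it is not possible to tell"],
)
print(prompt)
```

Training on many tasks rendered this way is what lets the model generalize to instructions for tasks it never saw during fine-tuning.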