
DeepSeek R1

DeepSeek's open-weights reasoning model, released in January 2025. Among the first open-weights models to report results comparable to OpenAI o1 on math, coding, and reasoning benchmarks. Challenged the assumption that frontier capabilities require proprietary checkpoints.

Background

R1-Zero, trained with reinforcement learning alone and no supervised fine-tuning, demonstrated that pure RL could elicit chain-of-thought reasoning; R1 itself adds supervised cold-start data before RL in a multi-stage pipeline, mainly to improve readability. DeepSeek released the full weights under the MIT licence; the release is widely seen as the point when open and closed model performance converged on reasoning tasks. Distillations into smaller Llama- and Qwen-based models (the DeepSeek-R1-Distill-Llama and DeepSeek-R1-Distill-Qwen series) spread the technique through 2025.
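
To make the pure-RL idea concrete: the R1 report describes rule-based rewards for R1-Zero (one for answer accuracy, one for output format) rather than a learned reward model. The sketch below is only an illustration of that idea, not DeepSeek's code. The function names, the exact regular expression, the exact-match answer check, and the equal weighting of the two rewards are all assumptions made for the example.

```python
import re

# Assumed layout: reasoning inside <think>...</think>, followed by the final answer.
THINK_PATTERN = re.compile(r"^<think>(.+?)</think>\s*(.+)$", re.DOTALL)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion wraps its reasoning in <think> tags, else 0.0."""
    return 1.0 if THINK_PATTERN.match(completion.strip()) else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the text after </think> exactly equals the reference answer."""
    match = THINK_PATTERN.match(completion.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(2).strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str) -> float:
    """Sum both rule-based signals (equal weighting is an assumption)."""
    return format_reward(completion) + accuracy_reward(completion, reference)
```

A completion such as `<think>2+2=4</think> 4` scored against the reference `4` earns both rewards, while a bare `4` with no reasoning block earns neither. In an RL loop, a scalar like this would be the only training signal per sampled completion, which is why no supervised reasoning traces are needed.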