Term

DeepSeek R1

DeepSeek's open-weights reasoning model, released in January 2025. Among the first open-weights models to report results comparable to OpenAI o1 on reasoning benchmarks. Challenged the assumption that frontier capabilities require proprietary checkpoints.

Background

DeepSeek R1 is an open-weights large language model, released by the Chinese lab DeepSeek in January 2025 and aimed at reasoning-heavy tasks such as maths, code, and multi-step logic. Its defining feature is that it produces an explicit chain of thought before its final answer: the model reasons in a visible scratchpad, sometimes at considerable length, then emits the response. Unlike models where step-by-step reasoning is only coaxed out by prompting, this behaviour is trained in.

DeepSeek reported using large-scale reinforcement learning, rewarding correct final answers and well-formed reasoning, rather than relying solely on supervised imitation of human-written solutions. A companion variant, DeepSeek-R1-Zero, explored pure-RL training with no supervised fine-tuning stage. Because the weights are openly published under a permissive licence, R1 can be downloaded, self-hosted, fine-tuned, and run offline. DeepSeek also released smaller distilled versions that transfer R1's reasoning traces into more deployable model sizes.

For people building software with AI, R1 matters on two fronts. First, it made strong reasoning available without a proprietary API, lowering cost and reducing vendor lock-in for agents, code assistants, and evaluation pipelines. Second, its visible reasoning tokens are useful for debugging and for building systems that inspect intermediate steps, though they also increase latency and token spend. Treat the long reasoning trace as a cost to budget for, and separate it from the user-facing answer in your application.
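
As an illustration of separating the reasoning trace from the user-facing answer, here is a minimal Python sketch. It assumes the raw completion wraps the reasoning in <think>...</think> tags, which is how self-hosted R1 outputs commonly appear; some serving stacks instead return the reasoning in a separate field or omit the opening tag, so check what your deployment actually emits. The function names `split_reasoning` and `rough_token_count` are hypothetical helpers, not part of any DeepSeek library.

```python
OPEN_TAG = "<think>"    # assumed delimiter; verify against your deployment
CLOSE_TAG = "</think>"


def split_reasoning(raw: str) -> tuple[str, str]:
    """Split a raw completion into (reasoning, answer).

    Returns an empty reasoning string if no opening tag is found, and an
    empty answer if the trace was cut off before the closing tag (for
    example when the token limit was reached mid-reasoning).
    """
    start = raw.find(OPEN_TAG)
    if start == -1:
        return "", raw.strip()

    body_start = start + len(OPEN_TAG)
    end = raw.find(CLOSE_TAG, body_start)
    if end == -1:
        # Truncated trace: everything after the opening tag is reasoning.
        return raw[body_start:].strip(), ""

    reasoning = raw[body_start:end].strip()
    answer = (raw[:start] + raw[end + len(CLOSE_TAG):]).strip()
    return reasoning, answer


def rough_token_count(text: str) -> int:
    """Crude whitespace-based size estimate; a real tokenizer will differ."""
    return len(text.split())


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
    reasoning, answer = split_reasoning(raw)
    print("show to user:", answer)
    print("log for debugging:", reasoning)
    print("approx. reasoning size:", rough_token_count(reasoning))
```

Keeping the two parts separate lets an application log or inspect the trace for debugging while showing only the answer to the user, and the size estimate gives a rough handle on how much of the token spend goes to reasoning.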

Tools that use it