DeepSeek / deepseek-v4-flash

Updated 2026-05-06 22:05

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

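DeepSeek-V4-Flash is a preview release in the DeepSeek-V4 series: a Mixture-of-Experts (MoE) model with 284B total parameters and 13B activated parameters, built for efficient inference within a 1M-token context window.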
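As a rough illustration of what these parameter counts imply, the sketch below computes the share of parameters activated per token. It uses only the two figures quoted above; the helper name `active_fraction` is hypothetical and not part of any DeepSeek tooling.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters activated per token (inputs in billions)."""
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# Figures quoted in the description above: 284B total, 13B activated.
fraction = active_fraction(284, 13)
print(f"{fraction:.1%} of parameters are active per token")  # prints 4.6% ...
```

Under these figures, roughly one parameter in twenty-two takes part in each forward pass, which is the usual sense in which a sparse MoE model is cheaper to run than a dense model of the same total size.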

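Three reasoning modes:

  • Non-think: fast, intuitive responses
  • Think: careful logical analysis
  • Think Max: maximum reasoning effort on the most challenging problems

Benchmark
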
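| Benchmark (Metric) | V4-Flash Non-Think | V4-Flash High | V4-Flash Max | V4-Pro Non-Think | V4-Pro High | V4-Pro Max |
|---|---|---|---|---|---|---|
| Knowledge & Reasoning | | | | | | |
| MMLU-Pro (EM) | 83.0 | 86.4 | 86.2 | 82.9 | 87.1 | 87.5 |
| SimpleQA-Verified (Pass@1) | 23.1 | 28.9 | 34.1 | 45.0 | 46.2 | 57.9 |
| Chinese-SimpleQA (Pass@1) | 71.5 | 73.2 | 78.9 | 75.8 | 77.7 | 84.4 |
| GPQA Diamond (Pass@1) | 71.2 | 87.4 | 88.1 | 72.9 | 89.1 | 90.1 |
| HLE (Pass@1) | 8.1 | 29.4 | 34.8 | 7.7 | 34.5 | 37.7 |
| LiveCodeBench (Pass@1) | 55.2 | 88.4 | 91.6 | 56.8 | 89.8 | 93.5 |
| Codeforces (Rating) | - | 2816 | 3052 | - | 2919 | 3206 |
| HMMT 2026 Feb (Pass@1) | 40.8 | 91.9 | 94.8 | 31.7 | 94.0 | 95.2 |
| IMOAnswerBench (Pass@1) | 41.9 | 85.1 | 88.4 | 35.3 | 88.0 | 89.8 |
| Apex (Pass@1) | 1.0 | 19.1 | 33.0 | 0.4 | 27.4 | 38.3 |
| Apex Shortlist (Pass@1) | 9.3 | 72.1 | 85.7 | 9.2 | 85.5 | 90.2 |
| Long Context | | | | | | |
| MRCR 1M (MMR) | 37.5 | 76.9 | 78.7 | 44.7 | 83.3 | 83.5 |
| CorpusQA 1M (ACC) | 15.5 | 59.3 | 60.5 | 35.6 | 56.5 | 62.0 |
| Agentic | | | | | | |
| Terminal Bench 2.0 (Acc) | 49.1 | 56.6 | 56.9 | 59.1 | 63.3 | 67.9 |
| SWE Verified (Resolved) | 73.7 | 78.6 | 79.0 | 73.6 | 79.4 | 80.6 |
| SWE Pro (Resolved) | 49.1 | 52.3 | 52.6 | 52.1 | 54.4 | 55.4 |
| SWE Multilingual (Resolved) | 69.7 | 70.2 | 73.3 | 69.8 | 74.1 | 76.2 |
| BrowseComp (Pass@1) | - | 53.5 | 73.2 | - | 80.4 | 83.4 |
| HLE w/ tools (Pass@1) | - | 40.3 | 45.1 | - | 44.7 | 48.2 |
| MCPAtlas (Pass@1) | 64.0 | 67.4 | 69.0 | 69.4 | 74.2 | 73.6 |
| GDPval-AA (Elo) | - | - | 1395 | - | - | 1554 |
| Toolathlon (Pass@1) | 40.7 | 43.5 | 47.8 | 46.3 | 49.0 | 51.8 |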
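
One way to read the table above is to compare the Non-Think and Max columns for the same model. The sketch below does this for a handful of V4-Flash rows copied from the table. The dictionary layout, the helper name `thinking_gain`, and the choice of rows are illustrative assumptions, not part of the original evaluation.

```python
# (non_think, max) scores for V4-Flash, copied from the benchmark table.
V4_FLASH = {
    "MMLU-Pro": (83.0, 86.2),
    "GPQA Diamond": (71.2, 88.1),
    "LiveCodeBench": (55.2, 91.6),
    "MRCR 1M": (37.5, 78.7),
    "SWE Verified": (73.7, 79.0),
}


def thinking_gain(scores):
    """Return {benchmark: max - non_think}, rounded to one decimal place."""
    return {name: round(hi - lo, 1) for name, (lo, hi) in scores.items()}


gains = thinking_gain(V4_FLASH)

# Print the selected benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} +{gain:.1f}")
```

Among these five rows, the long-context and coding benchmarks move the most between Non-Think and Max, while MMLU-Pro and SWE Verified shift by only a few points.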

References

DeepSeek-V4 Technical Report