DeepSeek V4‑Pro: 1.6T model, 1M‑token context
DeepSeek released V4‑Pro, a 1.6 trillion‑parameter open‑weight model with a 1 million‑token context window, priced at $1.74/$3.48 per million input/output tokens.
Hangzhou‑based DeepSeek announced V4‑Pro and a lower‑cost variant, V4‑Flash, on April 24, 2026. Both models are available as open weights under an MIT license on Hugging Face. DeepSeek also said it will retire older chat endpoints on July 24, 2026.
V4‑Pro contains 1.6 trillion total parameters and supports a 1,000,000‑token context window. The model uses a mixture‑of‑experts architecture that keeps all parameters loaded but activates only about 49 billion of them per token during inference. V4‑Flash contains 284 billion total parameters with roughly 13 billion active per token. DeepSeek priced V4‑Pro at $1.74 per million input tokens and $3.48 per million output tokens; V4‑Flash costs $0.14 per million input tokens and $0.28 per million output tokens.
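For a rough sense of what those rates mean in practice, here is a minimal cost calculation in Python. The per‑million‑token prices come from DeepSeek's announcement; the token counts in the example are hypothetical.

```python
# Back-of-the-envelope API cost estimate at DeepSeek's published V4 rates.
# Prices are USD per million tokens; the example token counts are hypothetical.

PRICES = {
    "v4-pro":   {"input": 1.74, "output": 3.48},
    "v4-flash": {"input": 0.14, "output": 0.28},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single request at the listed per-million-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a request that fills half of the 1M-token window and returns 4K tokens.
print(f"V4-Pro:   ${request_cost('v4-pro', 500_000, 4_000):.4f}")    # $0.8839
print(f"V4-Flash: ${request_cost('v4-flash', 500_000, 4_000):.4f}")  # $0.0711
```

At half the context window, the same request costs roughly twelve times more on V4‑Pro than on V4‑Flash, which is the trade‑off DeepSeek is drawing between the two variants.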
DeepSeek published technical details describing two new attention techniques, Compressed Sparse Attention and Heavily Compressed Attention. The two run in alternating layers, the company said, so the model retains detailed local context in some layers while maintaining a lower‑cost global view in the others. DeepSeek reported that at a one‑million‑token context, V4‑Pro uses about 27% of the compute and 10% of the KV‑cache memory of its V3.2 predecessor; V4‑Flash uses about 10% of the compute and 7% of the memory of V3.2 at the same context length.
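To put those reported fractions side by side, the short script below converts them into relative savings. The fractions are DeepSeek's own figures; the 100‑unit baseline is arbitrary, since absolute V3.2 compute and memory numbers are not public.

```python
# Relative resource use at a 1M-token context, per DeepSeek's reported figures.
# The fractions come from the announcement; the 100-unit baseline is
# illustrative only, since absolute V3.2 numbers are not published.

V32_BASELINE = 100.0  # arbitrary units, same scale for compute and KV cache

reported_fraction_of_v32 = {
    "V4-Pro":   {"compute": 0.27, "kv_cache": 0.10},
    "V4-Flash": {"compute": 0.10, "kv_cache": 0.07},
}

for model, frac in reported_fraction_of_v32.items():
    compute = frac["compute"] * V32_BASELINE
    kv = frac["kv_cache"] * V32_BASELINE
    print(f"{model}: compute {compute:.0f}/100 units ({1 - frac['compute']:.0%} saved), "
          f"KV cache {kv:.0f}/100 units ({1 - frac['kv_cache']:.0%} saved)")
```

Read that way, the headline claim is a 73% compute and 90% KV‑cache reduction for V4‑Pro at the full one‑million‑token context, with V4‑Flash cutting deeper still.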
The company said parts of V4 were trained on Huawei Ascend chips and that it has expanded domestic hardware capacity. DeepSeek said prices will fall further after 950 additional supernodes come online later in 2026.
DeepSeek and independent testers published benchmark results that vary by task. On the Codeforces competitive‑programming benchmark, V4‑Pro‑Max, the model's highest‑effort reasoning mode, scored a rating of 3,206. On a curated math and STEM set labeled Apex Shortlist, V4‑Pro‑Max achieved a 90.2% pass rate, and on a software‑engineering test built from real GitHub issues it recorded an 80.6% pass rate. DeepSeek also disclosed where V4‑Pro falls short: it trails some closed‑source models on multitask and advanced specialist benchmarks.
DeepSeek described agent features for V4. The model supports what the company calls “interleaved thinking,” which preserves chain‑of‑thought across multiple tool calls so agents do not lose context between web searches, code runs, and other tool interactions. In an internal survey of 85 developers who used V4‑Pro as a coding agent, 52% said they would make it their default and another 39% leaned toward adopting it.
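To illustrate the idea, here is a hedged sketch of an agent loop that carries the model's reasoning forward between tool calls. The call_model and run_tool interfaces and the message fields are hypothetical stand‑ins for this sketch, not DeepSeek's actual API.

```python
# Illustrative sketch of "interleaved thinking": the agent keeps the model's
# reasoning in the running message history across tool calls, so context is
# not lost between searches, code runs, and other tool interactions.
# call_model, run_tool, and the message schema are hypothetical, not
# DeepSeek's actual API.

def agent_loop(user_task: str, call_model, run_tool, max_steps: int = 10):
    messages = [{"role": "user", "content": user_task}]
    for _ in range(max_steps):
        reply = call_model(messages)  # assumed to return reasoning/tool_call/content
        # The reasoning trace stays in the history, so the next call still
        # sees why earlier tools were invoked (the "interleaved" part).
        messages.append({
            "role": "assistant",
            "reasoning": reply.get("reasoning"),
            "content": reply.get("content"),
        })
        tool_call = reply.get("tool_call")
        if tool_call is None:
            return reply.get("content")  # no tool requested: final answer
        result = run_tool(tool_call["name"], tool_call["arguments"])
        messages.append({"role": "tool", "name": tool_call["name"], "content": result})
    raise RuntimeError("agent did not finish within max_steps")
```

The contrast is with loops that strip reasoning after each turn, forcing the model to re‑derive its plan from scratch on every tool result.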
The model card on Hugging Face includes the statement: “DeepSeek‑V4‑Pro‑Max significantly advances the knowledge capabilities of open‑source models, firmly establishing itself as the best open‑source model available today.” DeepSeek said multimodal capabilities are under development and that the current releases are text‑only.