The Anticipation for DeepSeek V4 in China's AI Landscape

When the narratives of “China Group,” “China Chain,” and “China Ring” intertwine, and as the waves of programming, multimodality, agents, and OpenClaw pass without DeepSeek, expectations grow that DeepSeek V4 will deliver a third boost. People miss not only cheaper tokens but also a disruptor capable of using a trillion-parameter foundation, native multimodal capabilities, and powerful agent abilities to define the next steps for AI in China.

Recently, the article “People Miss DeepSeek” went viral. It noted that DeepSeek drove down costs for large models worldwide, letting users and industries enjoy cheaper tokens. The catch is that applications like “Little Lobster” (OpenClaw) are burning tokens at a furious rate, pushing user costs back up. In this context, the responsibility for driving down costs across the industry falls back on DeepSeek.

It has been over a year since the release of DeepSeek V3 and R1. Initially, there were expectations for DeepSeek V4 to make a splash during the Spring Festival this year, but those hopes were dashed. However, recent events like system outages and the launch of expert modes suggest that DeepSeek V4 may be closer than we think.

Thus, this may be the last call for DeepSeek’s update.

In this letter urging an update, I want to discuss with fellow readers who miss DeepSeek the narratives of AI in China, the waves of technological evolution, ecosystem competition, and token economics.

The Narrative of AI in China Has Changed

In early 2025, DeepSeek R1 debuted with low cost, high performance, and an open-source release, and it peaked on arrival. It not only dominated the domestic large model field but also gained global popularity, with internet platforms, IT giants, and various industries integrating it and embracing open source. All kinds of DeepSeek all-in-one machines vied for the spotlight.

During that time, whenever AI in China was mentioned, DeepSeek was always at the forefront. It is not an exaggeration to say that even grandparents on the street might have been discussing or using this domestic AI assistant.

However, over the past year, the AI industry and the narrative surrounding AI in China have evolved significantly. The intertwined narratives of “China Group,” “China Chain,” and “China Ring” have taken shape. The narrative of AI in China once carried by DeepSeek alone has faded.

From this perspective, what large models and AI lack is not just computing power or electricity; they also lack time windows.

Regarding “China Group,” I summarize it as “(3+1)+6+N.” Here “3+1” refers to the four major companies: ByteDance plus Alibaba, Tencent, and Baidu, the latter three being the giants of the internet era known as BAT. The “6” corresponds to the “Six Little Tigers” of the large model era: Kimi, Zhipu, MiniMax, StepFun (Jieyue Xingchen), Baichuan, and ModelBest (Mianbi Intelligence). They have completed their listings or are racing toward one while DeepSeek focused on its own research.

Originally, Kai-Fu Lee’s 01.AI (Zero One Everything) was counted among the Six Little Tigers, but it fell behind during the first “Hundred Models War,” so we replaced it with Mianbi Intelligence, though Baichuan’s voice has also gradually weakened over the past year.

“N” refers not to a single entity but to the other vertical models and specialized AI companies in the market.

In total, these ten companies and categories of enterprises form the leading tier of China’s large model industry. They are no longer scattered soldiers but a competitive industrial legion that DeepSeek must surpass on its path to reclaiming its glory.

Growing alongside “China Group” is “China Chain.” From chips and computing power, clusters and cloud, data corpora, and algorithms and models to agents and the AI application development ecosystem, a complete chain has been established, making China one of only two countries with a full industrial chain in intelligent technology. This offers a potential alternative for global intelligent infrastructure and aims to provide new public goods for inclusive global access to AI through the capability economy.

There is no doubt that DeepSeek R1 indeed established the brand of Chinese models overseas, but now companies like MiniMax are also making significant strides in international markets.

As for “China Ring,” it encompasses industry, application, and investment. It forms closed loops from AI to AI for Science (AI4S) and modern industrial clusters, from AI technology to market applications across thousands of industries and millions of households, and from early investment to public exit. The preliminary formation of these closed loops not only indicates that AI has been successfully implemented in China but also signifies the interconnected cycles of the intelligent economy at different levels.

From group to chain to ring, the narrative of AI in China has undoubtedly changed.

Since early 2026, the models of the Six Little Tigers have consistently led in token consumption on international platforms like OpenRouter, with their overall share surpassing half, primarily driven by overseas users.

In summary, China’s open-source strength in 2025 altered the global AI development landscape. In 2026, China’s AI development enters a phase of capability output.

From the perspective of global large models and the AI industry, the diversification of technological paths enhances the vitality of talent mobility and benefits supply chain resilience. For downstream application developers, the existence of multiple suppliers means stronger bargaining power and lower lock-in risks.

A positive sign in China’s AI narrative is that the market has not been monopolized by a few dominant players, which is good for competitive innovation and for building a talent ecosystem, and also helps form a cluster advantage in the China-US AI competition.

Four Waves Have Passed

Chinese classical mythology often states, “One day in heaven equals one year on earth.” During DeepSeek’s year of silence, AI has gone through four waves—programming, multimodal, agents, and OpenClaw (Little Lobster).

When AI programming tools like GitHub Copilot, Cursor, and Claude Code swept the developer community, it became hard to remember DeepSeek’s existence, even though it was also used in programming scenarios.

Programming, the fundamental driver of AI’s sweep through every industry and the most essential scenario for developers, is now firmly held by companies like Anthropic abroad and has become a battleground for Kimi and others at home.

In the multimodal wave, products like Gemini 3 Pro have shown impressive performance in visual understanding and image generation, Nano Banana has made a name for itself, and in video generation the standout is ByteDance’s Seedance 2.0.

DeepSeek has looked like a slow starter, only beginning to test million-token contexts in V3.2, and its multimodal capabilities have yet to arrive.

Some say that in the large model field, if one generation’s product and technology route goes wrong, a company can miss an entire era. Is DeepSeek caught in this situation? It’s hard to say.

The third wave runs from agents to multi-agent systems to swarm intelligence. Compared to the understanding and dialogue capabilities of AI assistants, agents have moved to the execution level, shifting from “answering questions” to “solving problems” and from “passive response” to “active execution.” The emergence of products like Manus signifies that AI agents are transitioning from concept to reality, with Kimi Agent Swarm pushing this wave to a climax.

In this wave, DeepSeek has mostly been called upon as a model rather than being a builder of the agent ecosystem, and the model’s support for agents, tools, and code is relatively limited.

As we move into 2026, a wave of action intelligence has begun to emerge, represented by OpenClaw and various other Claw products, Claude Code, and Claude Cowork. Their capabilities go beyond the agent level, making them operating systems for applications: an AI OS.

However, products like OpenClaw have been dubbed “token black holes,” with single-task token consumption dozens or even hundreds of times that of traditional conversational AI. This high-input, low-output model faces sustainability challenges in large-scale industry use, and the products themselves are rough, unstable, and prone to breaking changes between iterations, like the unfinished shell of a house.

Thus, it’s not surprising that people are saying, “People miss DeepSeek,” as it has been absent through several waves, and the public needs it to drive down costs and improve efficiency in China’s large models.

However, it must be said that the logic OpenClaw validated, that of an application-level AI OS and general action intelligence, is sound, and the timing is right. It tells everyone that AI is no longer just a tool but can be an all-encompassing operational agent.

So during the March “全民养虾” (National Lobster Farming) wave, it was evident how quickly everyone was copying each other’s homework. To promote local products, people began giving away “cyber eggs,” because OpenClaw made it clear to the major companies, Anthropic included, that an all-encompassing application OS and action intelligence are just around the corner. With the right approach, tasks can be executed, and becoming a general intelligent agent is not out of reach!

This is why Anthropic reacts and counters the fastest, and why it is most impacted by Claw. Claude Code flanks OpenClaw, and other major companies quickly follow suit, replicating Claude Code and OpenClaw’s strategies. This is what is currently happening.

This is a battleground because of its position as an entry point, its immense value, and the future ecosystem dominance at stake, all comparable to those of models and the first three waves.

If large models accumulate strength, multimodality broadens the scenarios, and agents sow the seeds, then the large-scale harvest of the ecosystem depends on the application-level AI OS and general action intelligence. This now seems to carry a sense of finality and the shadow of an ultimate form. When it comes to the stages of EI (endogenous intelligence) and II (autonomous intelligence), it may be a different story.

However, based on today’s input-output ratio for OpenClaw, it may not be the one to occupy the ecological niche of AI OS and general action intelligence.

Thus, in this final letter urging DeepSeek for updates, we also want to pose a question: Has DeepSeek, which did not jump into these four rivers at the first opportunity, chosen to gather strength, hoping to “make a big move” through V4 and subsequent foundational models?

However, the market never waits. Users’ attention, developers’ enthusiasm, and capital flows are being diverted in wave after wave. The competitive thresholds in these four areas have risen sharply, and the cost of building an ecosystem has increased significantly.

Will DeepSeek’s story remain stuck in the Spring Festival of 2025?

Full-Ecosystem Competition Has Arrived

I have argued before that leading companies have entered a stage of full-ecosystem competition. In this stage, full-stack AI capabilities will form the foundation for the coming battles among giants, with Google being a prime example.

The heightened attention Google drew during the Gemini 3 Pro wave stems from the gradual emergence of its accumulated advantages in four areas: the degree of evolution in model principles (Evolutionary Index), data depth (Data Index), full-chain ecosystem breadth (Ecological Index), and intelligent connectivity (Connectivity Index).

CEO Sundar Pichai has been in office for nearly ten years, and in a recent interview, he recalled the regret of having had the Transformer yet losing the race to ChatGPT. However, he does not believe that losing the first-mover advantage means defeat; he summarizes Google’s advantage as full-stack vertical integration.

Thus, with Gemini 3 Pro, Google executed a brilliant comeback based on this full-stack integration.

One can boldly predict that in 2026, among leading American AI companies, Anthropic may take the lead with Google close behind, while the early frontrunner OpenAI faces a pincer attack. The four strong competitors would ultimately become three, with the laggard, Grok, falling further behind.

At GTC 2026, Jensen Huang (Huang Jen-Hsun), in a rare move, wrote an article proposing the “Five-Layer Cake Theory”: Energy → Chips → AI Infrastructure → Models → Applications.

However, if we delve deeper, AI competition also plays out in chips, data corpora, foundation models, development tools and developers, agents and tool skills, and application services. A misstep in any of these areas can erode overall competitiveness, and the barriers to competition and investment have turned this into an asset-heavy game worth billions or even trillions.

Innovation is no longer just about “overtaking on the curve”; it now involves systemic competition and contests between entire frameworks. For large models especially, the capital, computing power, algorithms, and data they rely on have become decisive factors, and a quick tonic or a sea cucumber will not solve much.

In the landscape of full-ecosystem competition, DeepSeek has advantages in principle generation and foundational breakthroughs, but it also has obvious shortcomings: a lack of support from the industrial ecosystem of an IT giant, relatively thin product and application features, and a need to strengthen its multimodal and agent ecosystems.

The Rise of Token Economy

The new year has seen the rise of the token economy, which closes the value loop of the intelligent economy understood as a capability economy. This is a viewpoint I shared during an interview with China National Radio.

In the past, during the industrial era, the unit of energy was kilowatt-hours; in the digital age, the unit of data flow was GB; in the intelligent era, the unit of supply for capability products is tokens. Tokens allow the “capability” of AI to become a measurable, priceable, and tradable commodity.

You can understand it this way: Tokens have become the “settlement unit” connecting technology and business, thus forming a commercial closed loop for the capability economy.

Token consumption is growing geometrically: China’s daily token call volume surged from 100 billion in early 2024 to 140 trillion in March 2026, a roughly 1,400-fold increase in about two years. The more tokens consumed, the more vigorous the development of the capability economy.
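
The growth figure can be checked with quick arithmetic. The sketch below uses the two volumes quoted above; the 26-month span is an assumption (it reads “early 2024” as January 2024), so the implied monthly rate is only indicative.

```python
# Figures quoted in the text: China's daily token call volume.
start_tokens = 100 * 10**9    # 100 billion per day, early 2024
end_tokens = 140 * 10**12     # 140 trillion per day, March 2026

# Assumption (not from the text): "early 2024" means January 2024,
# which puts 26 months between the two data points.
months_elapsed = 26

growth_factor = end_tokens / start_tokens
monthly_rate = growth_factor ** (1 / months_elapsed) - 1

print(f"growth factor: {growth_factor:,.0f}x")
print(f"implied compound monthly growth: {monthly_rate:.1%}")
```

Under that assumption, the 1,400-fold increase corresponds to compound growth of a bit over 30% per month, which is what “geometric” growth means in practice.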

For enterprises, achieving gross margin improvement through price leverage means that their profit model has partially been validated.

However, tokens are a measurement unit, not a quality unit. The industry cannot only focus on the quantity of tokens but must pay attention to the “quality of capability” behind them. Therefore, I believe the future differentiation in the token economy will be very clear—high-quality tokens will generate profits, while low-quality tokens will incur losses, with the latter potentially being eliminated.

Thus, when Xiaomi’s Luo Fuli promoted the MiMo large model package, she said: “Currently, the global supply of computing power can no longer keep up with the token demand created by agents. The real solution is not cheaper tokens but co-evolution: a more token-efficient agent framework and a stronger, more efficient model in synergy.”

This year has seen a typical trend where users complain about expensive tokens while simultaneously paying for them. Essentially, part of the consumed tokens has been transformed into productivity. When paying for tokens becomes a trend, enterprises can generate revenue to invest in developing higher-level models, thus nurturing the intelligent economy.

The most direct paths for model and agent companies to commercialize are either through paid subscriptions or by generating revenue through API token fee packages. OpenAI’s practice of linking advertisements to AI assistant conversations carries too many uncertainties, and no other company in the industry has followed suit.

I believe that in the inference-driven token economy era, three types of scenarios will succeed first: high-value, high-density scenarios (like financial risk control and medical diagnosis, where customers are willing to pay a premium for “error-free” services); high-frequency, high-necessity scenarios (like intelligent customer service and code generation, where costs are diluted through scale); and scenarios where agents are widely applied.

In the future, tokens will become basic services like water and electricity: low-margin, inclusive, and ubiquitous. The unit cost of tokens will continue to decrease, but the token economy will stratify: tokens at ordinary capability levels will trend toward thin profits, while high-capability, high-value tokens may maintain a premium.

More concretely, companies that can build closed loops of scenarios + data + platforms + models and provide high-value intelligent agent services will gain a premium.

DeepSeek, with its background in quantitative investment, is not short on funds, but from a sustainable development perspective, it also needs to embrace the token economy.

Open-Source Ecosystem Awaits a Third Turning Point

Over the past year, the landscape of the open-source ecosystem has changed.

In early 2025, DeepSeek ignited the first explosion in the open-source ecosystem. Earlier this year, OpenClaw delivered the second boost. The first explosion pushed some closed-source players toward open source, with domestic giants like Baidu joining the open-source camp and overseas companies like OpenAI and Google stepping up their open-source efforts.

According to an OpenRouter analysis of 100 trillion tokens of call data, the market share of open-source models has risen to 33%. The rise of Chinese open-source models is particularly noteworthy: at one point, five of the top six models on OpenRouter were Chinese open-source models.

The rise of open-source models is driven by a combination of technological iteration, user demand, and economic factors. The core motivation for enterprises to choose open-source models has become very pragmatic: the costs of closed-source APIs are strongly correlated with call scale, and marginal costs are uncontrollable; self-hosted open-source models significantly reduce unit costs in high concurrency, long context, and agent scenarios.

In simple terms, as long as capability is up to par, open-source models become cheaper per unit the more they are used in private deployment scenarios. As a disruptor in the open-source model ecosystem, DeepSeek is likely to give another boost to the open-source landscape in 2026.
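
The cost argument amounts to a break-even calculation: API spending scales linearly with call volume, while self-hosting carries a fixed cost plus a small marginal cost. The following is a minimal sketch of that comparison; every price in it is hypothetical and chosen only for illustration, and the function names are not taken from any vendor’s tooling.

```python
def api_cost(tokens_millions: float, price_per_million: float) -> float:
    """Closed-source API: cost grows linearly with call volume."""
    return tokens_millions * price_per_million


def self_hosted_cost(tokens_millions: float, fixed_monthly: float,
                     marginal_per_million: float) -> float:
    """Self-hosted open-source model: fixed cost plus a small marginal cost."""
    return fixed_monthly + tokens_millions * marginal_per_million


def break_even_volume(price_per_million: float, fixed_monthly: float,
                      marginal_per_million: float) -> float:
    """Monthly volume (millions of tokens) above which self-hosting is cheaper."""
    if price_per_million <= marginal_per_million:
        raise ValueError("self-hosting never breaks even at these prices")
    return fixed_monthly / (price_per_million - marginal_per_million)


# Hypothetical numbers, not real vendor prices.
API_PRICE = 2.00      # dollars per million tokens
FIXED = 20_000.00     # dollars per month for hardware and operations
MARGINAL = 0.40       # dollars per million tokens (power and similar costs)

volume = break_even_volume(API_PRICE, FIXED, MARGINAL)
print(f"break-even at about {volume:,.0f} million tokens per month")
for v in (5_000, 50_000):
    print(v, api_cost(v, API_PRICE), self_hosted_cost(v, FIXED, MARGINAL))
```

Below the break-even volume the API is cheaper; above it, each additional token widens the gap in favor of self-hosting, which is why high-concurrency, long-context, and agent workloads tilt toward open-source deployment.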

This anticipated push could bring back several effects at once: the impact on the industry’s computing costs, the explosive effect on user markets, the activation of the open-source ecosystem, and a boost to market confidence.

This is the underlying logic for why people miss DeepSeek; price is merely a superficial issue.

Open source is great, but building it out remains a long and heavy task.

DeepSeek needs to quickly build a developer ecosystem, support an agent development ecosystem, and establish packaging and distribution channels for apps and skills, similar to Skills, so as to increase openness and flexibility while attracting more developers.

We look forward to DeepSeek once again becoming a key driving force in the open-source ecosystem.

Expectations for V4 Go Beyond Past Standards

Across the ocean, the suspense lies in three questions: how far the next generation of models from OpenAI and Anthropic can reach, whether a Super App can become an application OS and general action intelligence like the evolving Claude Code, and who can most quickly wield coding as the foundational blade for carving out an ecosystem. These three factors will shape the major trends this year.

From the current situation, Anthropic’s fire is rapidly approaching OpenAI’s stronghold. This can be seen in the financial data disclosed by the Wall Street Journal, indicating that Anthropic may turn a profit before OpenAI.

In this context, what do we expect from DeepSeek?

Summarizing the earlier points, the bare minimum starting point should include generational leaps in V4 and R2, a million-token context window (which has just entered grayscale testing), native multimodal capabilities, and a foundation model at the trillion-parameter level.

However, these are past standards and should not be the upper limit of V4 and R2’s capabilities. At this point in time, DeepSeek needs breakthroughs in multi-agent capabilities, tool usage, computer operations, and strong coding abilities behind the scenes.

There is no need for excessive anxiety; although AI agents are hot, they are still in the stage of integrating existing capabilities, and true autonomous intelligence is still a distance away.

In the future, AI agents may follow four paths: integration of cloud virtual machines, a hybrid model of local and cloud collaboration, achieving intelligent interconnection through protocols, or restructuring all high-frequency application entrances in the form of a “super OS.” Regardless of the path, it will ultimately become the hub for personal intelligent services and a strategic high ground for future competition.

The old standards no longer match DeepSeek V4, so in this letter urging for updates, my expectation is not just for a more powerful language model but for an intelligent base capable of autonomously executing complex tasks, integrating various tools, and efficiently interacting with the external environment.

As mentioned earlier, I hope it can “make a big move,” and the actual exploration of model principles and product technological progress by DeepSeek seems to confirm this “big” rhythm.

Since October last year, DeepSeek has accelerated its publication of papers and partial product updates in the large model field, forming a dense rhythm of innovation.

From the release of DeepSeek-V3.2 in December 2025 to the concentrated release of three core architecture papers—mHC, Engram, and DualPath—in January 2026, along with significant updates and expansions of previously published R1 technical reports, the overall R&D has shown a multidimensional advancement covering architectural innovation, reasoning efficiency, multimodal, and agent capabilities. This series of efforts is widely viewed as a technical prelude to the next-generation flagship model DeepSeek-V4.

DeepSeek has not officially confirmed how these innovations will be integrated into the final architecture of V4, but the authorship of the papers (including founder Liang Wenfeng), code leaks, and visible changes on the platform all point in this direction.

The DeepSeek-OCR series launched in October 2025 explored the possibility of compressing text information through visual representation, overturning the traditional assumption that “text tokens are more efficient than visual tokens.” The visual causal flow mechanism of OCR 2 further enables the model to “understand” documents based on layout logic like a human, rather than mechanically scanning them. This provides a new approach for multimodal models to understand and process extremely long documents (like entire books or financial reports), potentially expanding the context window of large models to tens of millions of tokens without incurring a quadratic increase in computational complexity.

The mHC technique addresses a fundamental challenge in training trillion-parameter models, namely signal explosion. It breaks through the “deep network stability” bottleneck for large-scale development and paves the way for training open-source models at the trillion-parameter level. It also helps scale model depth through architectural innovation rather than reliance on advanced-process chips.

Engram offers an engineering solution for long contexts and continual learning, theoretically supporting persistent memory across sessions and breaking the “stateless” inference limitation of current large models. It challenges the traditional Transformer design paradigm of “trading computation for memory.” The method stores static knowledge in external sparse tables, letting the model’s feedforward network focus on dynamic reasoning. This “neural-symbolic” hybrid architecture enables the model to maintain million-token-level contexts while significantly reducing inference costs.

The V3.2 version released in December 2025 already demonstrated initial “cross-tool memory retention,” addressing the problem of traditional AI agents losing their reasoning chains when calling multiple tools. Through a sparse attention mechanism, it also cut the inference cost of 128K long contexts several-fold, with memory usage down by 70%.
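
To see why sparse attention matters at 128K tokens, it helps to count the query-key pairs each approach has to score. The sketch below is a generic toy count, not DeepSeek’s actual mechanism: it assumes each query attends to at most a fixed number of tokens, and the budget of 2,048 is a hypothetical value picked for illustration.

```python
def dense_pairs(n: int) -> int:
    """Query-key pairs scored by dense causal attention over n tokens."""
    return n * (n + 1) // 2


def sparse_pairs(n: int, k: int) -> int:
    """Pairs scored when each query attends to at most k tokens (itself included)."""
    if n <= k:
        return dense_pairs(n)
    # The first k queries see fewer than k keys; the rest see exactly k.
    return dense_pairs(k) + (n - k) * k


CONTEXT = 128 * 1024   # 128K tokens, as in the text
TOP_K = 2048           # hypothetical selection budget per query

ratio = dense_pairs(CONTEXT) / sparse_pairs(CONTEXT, TOP_K)
print(f"dense / sparse pair count: {ratio:.1f}x")
```

The dense count grows quadratically with context length while the sparse count grows linearly, so the ratio here comes out at roughly 32x. End-to-end savings are smaller than the raw pair count suggests, because choosing which tokens to attend to has its own cost and the rest of the network is unchanged, which is consistent with the “several-fold” figure quoted above.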

Additionally, DeepSeek, in collaboration with Peking University and Tsinghua University, released a paper introducing DualPath, an agent inference framework. It uses a dual-path KV-cache loading mechanism to parallelize data reading and GPU computation, addressing the compute idling of the traditional architecture. In tests, offline inference throughput improved by 1.87 times and online agent operation efficiency by 1.96 times. That is close to doubling performance through pure software optimization, a significant advance in AI infrastructure and cost efficiency, and a style very characteristic of DeepSeek.
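
The intuition behind overlapping data loading with GPU computation can be shown with a toy timing model. This is not the DualPath design, whose details are not described here; it is a generic prefetching sketch that assumes loads run back to back on a separate I/O path with unlimited buffering, and the chunk timings are hypothetical.

```python
def sequential_time(load: list[float], compute: list[float]) -> float:
    """Each chunk is loaded, then computed, with the GPU idle during loads."""
    return sum(load) + sum(compute)


def overlapped_time(load: list[float], compute: list[float]) -> float:
    """Prefetching: the next chunk loads while the current one is computing."""
    load_done = 0.0      # time when the most recent load finished
    compute_done = 0.0   # time when the most recent compute finished
    for l, c in zip(load, compute):
        load_done += l   # loads run back to back on the I/O path
        # Compute starts once its data has arrived and the GPU is free.
        compute_done = max(load_done, compute_done) + c
    return compute_done


# Hypothetical workload: 8 chunks with equal load and compute times.
loads = [1.0] * 8
computes = [1.0] * 8

seq = sequential_time(loads, computes)
ovl = overlapped_time(loads, computes)
print(f"sequential: {seq}, overlapped: {ovl}, speedup: {seq / ovl:.2f}x")
```

When loading and computing take about the same time, overlapping them approaches a 2x speedup as the number of chunks grows, which is the same order as the 1.87x and 1.96x figures reported above.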

All signs indicate that the upcoming new generation flagship model DeepSeek-V4 will likely integrate text, image, and video generation capabilities, adopting native multimodal pre-training rather than post-hoc stitching, with model parameters exceeding a trillion and possessing strong memory, tool, coding, learning capabilities, and good support for agents.

The Dual Sword of Domestic Models and Domestic Computing Power

Beyond the model, another expectation for DeepSeek V4 is that, once adapted, it will work in synergy with domestic computing power.

There have been numerous reports discussing that before releasing V4, DeepSeek did not provide previews to American chip manufacturers like Nvidia and AMD but instead chose to open access to Chinese chip suppliers, including Huawei, weeks in advance to ensure deep adaptation and optimization of the model on domestic computing platforms.

This is also a key reason why DeepSeek V4 is perceived to be delayed.

Adapting to domestic computing power is a challenging path for domestic models, but in the long run, it is a necessity. A necessary task must have a starting point, and perhaps DeepSeek V4 is that starting point.

When the model extends an olive branch, the pressure falls on domestic computing power, requiring efficiency, production capacity, and effective supply to keep up and form ecological synergy with model and agent development.

If DeepSeek V4 and R2 can empirically demonstrate world-class performance from training to inference on domestic chips at lower costs, there is hope of significantly breaking free from dependence on overseas computing power, shattering the label of “Token King” that Jensen Huang (Huang Jen-Hsun) has placed on himself through SemiAnalysis.

If you recall, in the wake of DeepSeek R1’s launch, Nvidia’s stock plummeted nearly 17% in a single day, erasing a record $589 billion in market value.

While Nvidia’s drop is not good news for tech stock investors, if it is driven by DeepSeek, we would welcome seeing it happen again.

Layering of Sugar Water Intelligence and Original Force Intelligence

In closing this letter, if I were to mention another expectation, it would be for DeepSeek to make breakthroughs in another Scaling Law.

This breakthrough is not in the traditional sense of “the larger the model, the stronger the capability,” but rather that small models continuously scale to achieve the capabilities of large models.

Along two technical routes, “principle, algorithm, training, and the evolution of thinking and reasoning capability” and “intelligent compression, distillation, and internalization,” small models at each stage keep reaching the capability level of the previous stage’s large models, even approaching and attaining everyday high availability. Capability, application, scenario, and value then stratify layer by layer.

Small models, as conventional intelligence, serve simple everyday tasks. They win on volume, openness, edge deployment, and cost efficiency. This is “sugar water intelligence,” the soup of the token economy.

Large models, as super intelligence, serve heavy enterprise tasks in industry operations, productivity, and professional technology, and they command high premiums. This is “original force intelligence,” the meat of the token economy.

Regarding the capability evolution of small models, Google Gemma 4 serves as a good reference, encompassing four versions of 2B, 4B, 26B, and 31B, covering all scenarios from mobile phones to workstations. The 31B Dense model ranks third in the Arena AI open-source leaderboard, while the 26B A4B MoE model ranks sixth. All four models support image and video input, cover over 140 languages, and include switchable thinking modes. This is not merely parameter compression but the distillation and internalization of intelligence—achieving greater efficiency in knowledge transfer, more precise quantization pruning, and advanced distillation techniques, allowing small models to possess great wisdom.

I hope DeepSeek can surpass Gemma-4 with high-quality models in the 30B-70B-120B range, enabling enterprise-level deployment to exceed the levels previously reached by the “Six Little Tigers,” creating a new landscape.

Additionally, I look forward to DeepSeek achieving similar breakthroughs in lightweight models in the 1B-8B range. When edge models can run smoothly on consumer-grade graphics cards or even mobile phones, and when billions of edge models exist on personal phones and computers, granting every ordinary user strong AI capabilities, it will represent the equitable and inclusive form of the intelligent economy.

Final Thoughts

2026 is poised to be a year of leapfrog development for the next generation of frontier models and operational intelligent agents, with each AI company playing its trump card and triggering a new round of industry reshuffling.

“China Group” needs DeepSeek’s return, the open-source ecosystem requires DeepSeek’s push, the token economy demands DeepSeek’s deep original force intelligence, and domestic computing power needs DeepSeek’s validation.

Currently, models in China and the US show almost no gap in routine Q&A, but a disparity remains in deep intelligence for long and complex tasks. This gap fuels the anticipation for DeepSeek.

This is the last call for updates and the final summons. V4 and R2 carry expectations not just for model iteration but for the advancement of an era. From the battle of models to the battle of full ecology, from single-point breakthroughs to full-stack AI competition, from following and imitating to autonomous innovation—can DeepSeek’s next steps define the future of AI in China?

I hope the year-long “silence” of DeepSeek is a precursor to a more significant explosion.