In the spring of 2026, Yao Jinbo, the chairman and CEO of 58.com, launched a radical internal experiment to transform the company into an “AI-native enterprise.” He required everyone, even staff in the finance and human resources departments, to become proficient with AI. “In an AI-native enterprise, the vast majority of work should be completed by AI itself,” he said.
Currently, 58.com consumes nearly 200 billion tokens per day, and this number is expected to exceed 300 billion soon. “I often get angry and scold the team because AI is not being applied enough,” Yao Jinbo said. He has even canceled the plan to set up a strategy department. “Now, whenever I come across information about any enterprise or industry, I directly send it to AI and interact with it continuously. The efficiency and results are far better than those of a strategy department.”
Yao Jinbo, the chairman and CEO of 58.com. Photography: Deng Pan
On March 17th, Jensen Huang, the CEO of NVIDIA, crowned himself the “King of Tokens.” He painted a business picture that is still hard for people to imagine: in the future, a high-level model will provide services for you, and $150 per million tokens will be nothing.
The concept of “tokens” is sweeping through the tech world. In the second half of 2025, the industry, led by ByteDance and Alibaba, was still debating whether token volume should be the yardstick for deciding who is “number one” in the AI cloud market. At the Yunqi Conference, an Alibaba Cloud executive told China Entrepreneur that Alibaba cared more about the “effective growth of the AI cloud” than about token consumption.
Half a year later, on March 16th, 2026, Alibaba announced the establishment of the Alibaba Token Hub (ATH) business group, led by CEO Wu Yongming. This is the largest AI business integration in Alibaba in recent years and the first time a Chinese internet giant has included “tokens” in its organizational structure.
At Tencent, a silent but significant transformation is also taking place. Insiders revealed that Tencent has signaled to its employees that they should consume tokens worth 1,000 yuan per month. This is more like a hidden assessment than a welfare benefit. Those who do not know how to collaborate with AI will be re-evaluated.
This token fever started after the Spring Festival with an open-source agent framework called OpenClaw (nicknamed “lobster”). The nationwide “lobster-raising” craze gave AI a round of market education, verified the practical value of agents, and drove an exponential increase in demand for computing power. On March 23rd, the National Data Bureau disclosed that China’s average daily token call volume had exceeded 140 trillion, more than 1,000 times the level of two years earlier.
As tokens evolve from a computing power unit to a value currency, a new infrastructure around “production – pricing – circulation – distribution” is also being comprehensively rolled out. The chain reaction is evident at all levels:
Model start-ups such as MiniMax, Zhipu, and Moonshot AI have emerged from the shadows. Their market capitalizations and valuations have multiplied several times over in a short period, directly triggering token price increases. The extreme shortage of computing power has also led cloud providers to end their price war, and “tokens as a service” has become the consensus. In the first quarter of 2026, Zhipu’s API pricing rose by 83%, yet call volume grew by 400%. Zhang Peng, the CEO of Zhipu, said, “When the model is strong enough, the API itself is the best business model.”
Tech giants have all reorganized around tokens, setting off fierce competition for talent. Technical leaders who once kept a low profile have stepped into the spotlight as never before, and have even become a key variable in investor confidence in their companies’ market value.
In December 2025, Tencent appointed Yao Shunyu, a former senior researcher at OpenAI, as the group’s chief AI scientist. In March 2026, Lin Junyang, the technical leader of Alibaba’s Qwen large model, left the company, causing internal and external shocks. In April, Guo Daya, a former researcher at DeepSeek, was reported to have joined ByteDance with an annual salary of 100 million yuan.
As digital employees enter the production structure, organizational replacement is inevitable.
Kunlun Wanwei requires technical staff to raise their R&D efficiency by 50% using AI tools, and employees who use too few tokens will be weeded out. A senior industry insider told China Entrepreneur that, because of the efficiency gains brought by agents, about 30% of programmers at large companies will soon be let go. “They are not being eliminated due to low performance but because the demand for their positions is decreasing.”
Data centers are being upgraded to token factories. Jensen Huang predicted at the 2026 GTC Conference that in the future, every cloud service and AI company will use “token factory efficiency” as the core operating indicator. By 2027, the demand scale for AI infrastructure will be at least $1 trillion.
SaaS and software companies have received a death notice. Chen Hang, the founder of DingTalk, told China Entrepreneur bluntly that software has become a “daily throwaway item” and is developing in the direction of on-demand production and daily evolution. DingTalk has canceled the internal software development work of its middle platform because all software will be generated instantaneously in the future.
Chen Hang, the founder of DingTalk. Source: The interviewee
From “traffic is king” to “tokens are king,” from the free logic to the paid logic, and from labor-intensive to intelligence-intensive, all the old business common sense, organizational experience, and competition rules are being rewritten.
Where will tokens lead us? Zhang Yingjie, a partner at Fortune Capital, summarized the essence of the change as follows: “In the mobile internet era, the cost of each bit is decreasing, but your actual usage is increasing. In the context of AI, the cost of each token is decreasing, but the increase in your usage far exceeds the rate of the decrease in unit cost. In the future, tokens will be like mobile phone data packages. The price per GB will be lower and lower, but the usage will be higher and higher, which will open up an exponential market space.”
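Zhang Yingjie’s argument reduces to simple arithmetic: total spending is unit price multiplied by usage, so the market expands whenever usage grows faster than the price falls. The sketch below illustrates this in Python. The 50% price decline and fivefold usage growth are hypothetical figures chosen for illustration and do not come from the interview, and the function names are likewise illustrative.

```python
def spend_multiplier(price_decline: float, usage_growth: float) -> float:
    """Factor by which total token spend changes over one period.

    price_decline: fractional drop in unit price (0.5 means the price halves).
    usage_growth: factor by which token usage grows (5.0 means fivefold).
    """
    return (1.0 - price_decline) * usage_growth


def breakeven_usage_growth(price_decline: float) -> float:
    """Usage growth factor needed to keep total spend flat."""
    return 1.0 / (1.0 - price_decline)


# Hypothetical inputs, not figures from the article.
multiplier = spend_multiplier(price_decline=0.5, usage_growth=5.0)
breakeven = breakeven_usage_growth(price_decline=0.5)
print(f"spend multiplier: {multiplier}, breakeven usage growth: {breakeven}")
```

Under these assumed inputs, total spend grows 2.5 times even though the unit price halves, and any usage growth above twofold enlarges the market.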
This is also destined to be a cruel elimination race. “The industry will be divided into two extremes: one type of enterprise is determined in its strategy and fully committed to AI, while the other type is unclear about the direction and hesitant. The gap will widen rapidly within a year,” said Lin Fan, the founder of Maimai.
The End of the Old Order
For 30 years in the internet industry, the rule has been “traffic = users = revenue.” From social software to short video applications, from portals to e-commerce platforms, all business logic revolves around one core: to attract as much user attention as possible and then monetize it.
Tokens have shattered all of this.
In the context of AI, tokens play three roles: first, they are the smallest unit of semantic information. Second, they are the measure of computing power consumption. Every token burned corresponds to the real costs of chips, electricity, cooling, and scheduling. Third, they are the settlement currency for intelligent value, turning the invisible and intangible “intellectual activities” into standardized commodities that can be measured, traded, and compared.
Traffic and tokens represent two completely opposite, even conflicting, business philosophies.
In the traffic world, the more users there are, the lower the marginal cost. This makes internet companies naturally pursue infinite expansion, using freebies, subsidies, and viral referrals to attract users and then monetizing them through advertising, value-added services, and e-commerce commissions. The playbook is to occupy users’ minds through scale first, and then recover the costs through monetization.
In the token world, the more users there are, the stronger the intelligence, and the more complex the tasks, the higher the cost.
“In the past, when the interaction was mainly chat-based, the daily token consumption of a single user or program was usually in the range of hundreds of thousands to millions. With the rise of Coding Agents, the daily token consumption of a single program has increased significantly, often reaching billions, leading to a sharp increase in the overall token consumption,” said Lin Fan.
The consumption speed of tokens is still increasing rapidly. Bai Ya, the CEO of Youzan, told China Entrepreneur that the company’s token consumption in February was around 30 to 40 billion, approaching 100 billion in March, and is expected to reach at least 200 billion in April. “By the end of the year, the monthly consumption may reach about 5 trillion tokens.”
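To put Bai Ya’s projection in perspective, the sketch below computes the compound monthly growth rate implied by moving from about 200 billion tokens in April to about 5 trillion by year end. It assumes that “the end of the year” means December, eight months after April, and that growth is smooth over that span; neither assumption comes from the interview, and the function name is illustrative.

```python
def implied_monthly_growth(start: float, end: float, months: int) -> float:
    """Compound monthly growth factor that takes `start` to `end` in `months` steps."""
    return (end / start) ** (1.0 / months)


APRIL_TOKENS = 200e9         # expected April consumption quoted in the article
YEAR_END_TOKENS = 5e12       # projected monthly consumption by year end
MONTHS_APRIL_TO_DECEMBER = 8 # assumption: "end of the year" means December

factor = implied_monthly_growth(APRIL_TOKENS, YEAR_END_TOKENS, MONTHS_APRIL_TO_DECEMBER)
print(f"implied growth: about {factor:.2f}x per month")
```

On these assumptions the projection implies consumption growing by roughly half again every month, which is slower than the doubling reported between March and April.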
The growth in token consumption is unstoppable, but execution efficiency varies greatly across models and usage methods. This means that the token competition is, in essence, a comprehensive competition over the capabilities of intelligent agents (Agents).
Xia Lixue, the CEO of Wuwen Xinqiong, told China Entrepreneur that in the future, the “system CEO” may be an Agent. It can lead the design of the next-generation underlying architecture according to customer needs, and humans will only play the role of the “chairman” to ensure the top-level design. The key lies in exploring cutting-edge mechanisms to enable efficient collaboration and “shared brain” among the Agents within the system.
Xia Lixue, the CEO of Wuwen Xinqiong. Source: China Pictorial Library
As Agents become the new production unit and consumption subject in the intelligent era, the investment subject and value logic have also undergone fundamental changes.
On March 18th, Zhong Tianjie, the investment director of ZhenFund, pointed out the core pain point of the internet era in his viral article “Maybe We Shouldn’t Invest in Software Companies with GUI Thinking Anymore”: humans are Agents limited by their attention system and must rely on continuous visual anchors to maintain task state. For the past few decades, the huge software industry has centered on the graphical user interface (GUI), and all of its interaction design, user experience, and product paths have essentially been patches for human cognitive shortcomings.
When interviewed by China Entrepreneur, Zhong Tianjie said that the most valuable companies in the future are those that can be embedded in the Agent workflow and become the default protocol and necessary nodes in the infrastructure. Tokens will also become the “hard currency” and measurement scale in the new network.
For this reason, ZhenFund specially launched the “Token Grant” program, directly providing 50,000 yuan in cash to early-stage entrepreneurs to purchase tokens. “In the AI era, providing tokens and supporting the burning of tokens means supporting innovation itself.”
The Economics of Token Factories
Enterprises are enthusiastically embracing tokens because the token model naturally lends itself to charging for usage. This has also led cloud providers to change course quickly, shifting the value yardstick from the IaaS resource era of renting machines and selling computing power to the MaaS intelligence era, which is centered on intelligence and uses tokens as the settlement unit.
At the 2026 GTC Conference, Jensen Huang refined this global industrial transformation into the “economics of token factories.” He designed a five-tier pricing system for tokens, from the free basic layer to the top ultra-high-speed layer (about $150 per million tokens). The value gradient is no longer determined by model parameters but by inference performance, response speed, and context length.
However, when the entire industry flocks to Agents, it quickly hits a rigid constraint: the dual shortage of computing power and electricity has become the biggest bottleneck restricting the token manufacturing industry. The power consumption of a large-scale intelligent computing cluster is comparable to that of a small to medium-sized city.
Under the triple constraints of power supply, chip production capacity, and unit computing power efficiency, the explosive growth of users and scenarios, which used to be a growth dividend, has become a “sweet trouble” for model manufacturers and cloud providers. The larger the scale, the more intense the consumption, and the more acute the pressure on cost and supply.
This has also turned tokens from being “cheap as cabbage” to rising in price within a short period. In March 2026, leading cloud providers at home and abroad moved in concert: Google Cloud, Amazon Web Services, Tencent Cloud, Alibaba Cloud, Baidu Smart Cloud, and others successively issued price adjustment announcements, and prices for AI computing power and storage services generally rose by 30% to 50%. The prices of some core Tencent Cloud products rose by as much as 400%, marking the end of a nearly 20-year cycle of price cuts in cloud computing.
In April 2026, Alibaba Cloud’s Bailian announced that it would discontinue the Coding Plan Lite basic package, once billed as the “10 billion AI subsidy,” which had a first-month price of 7.9 yuan and a monthly subscription price of 40 yuan. Only the Coding Plan Pro advanced package, at 200 yuan per month, was retained. Even so, this product sold out within a short period.
ByteDance has also completed a strategic shift. ByteDance put an end to the token price war it started two years ago. Tan Dai, the president of Volcengine, said bluntly, “We should look at costs from the essence. We will return to the first principle and look at the value created by the product itself.”
This comprehensive manufacturing competition has also directly detonated the supply-demand gap for computing power hardware, making AI chips in short supply across the board and bringing a spring to domestic chips. “Domestic chips have shown a certain degree of substitutability in inference. Large and small companies are now trying their best to use domestic chips to upgrade their computing power,” said Lin Fan.
On April 24th, the highly anticipated DeepSeek V4 was released and open-sourced simultaneously. Through in-depth adaptation to domestic chip architectures such as Ascend and Cambricon, it achieved a significant performance leap and created a landmark example of token production with high efficiency, low cost, and large-scale production.
All these changes point to a clear conclusion: in the future token factories, the competition is no longer about chip parameters and algorithm effects but a comprehensive competition of computing-power and electricity coordination efficiency.
Xia Lixue explained the way to break the deadlock from the perspective of infrastructure to China Entrepreneur: First, we must promote the full-scale upgrade of the underlying system and build a “high-efficiency token factory” with high stability, high energy efficiency, and high compatibility through full-stack software-hardware collaboration to form a positive flywheel of cost-performance improvement and increased usage.
Second, we must completely change the traditional idea of resource mismatch in cloud computing. We need to “design infrastructure for Agents rather than for humans” and make token factories move towards “intelligent manufacturing.”
Xia Lixue believes that the calling formats of many current tools are designed for human understanding and contain a large number of redundant symbols that are meaningless to AI. The infrastructure must optimize the format from the underlying data link and refine large-scale structured documents into high-density effective information through an efficient information compression mechanism to significantly improve the execution efficiency of intelligent agents in a long-context environment.
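A small illustration of the “redundant symbols” point: the same tool call can be serialized in a human-friendly layout or in a compact one that carries identical information. The Python sketch below compares the two using the standard json module. The tool call itself is hypothetical, character count is used only as a rough proxy for token count, and this is not the compression mechanism Xia Lixue describes, merely the simplest instance of the idea.

```python
import json


def serialized_sizes(payload: dict) -> tuple[int, int]:
    """Return (human_readable_length, compact_length) in characters."""
    # Indented layout: easy for people to read, padded with whitespace and newlines.
    pretty = json.dumps(payload, indent=2)
    # Compact layout: same data, no spaces after separators, no line breaks.
    compact = json.dumps(payload, separators=(",", ":"))
    return len(pretty), len(compact)


# Hypothetical tool call, invented for illustration only.
tool_call = {
    "tool": "search_orders",
    "arguments": {"customer_id": 1024, "status": ["paid", "shipped"], "limit": 20},
}

pretty_len, compact_len = serialized_sizes(tool_call)
saving = 1 - compact_len / pretty_len
print(f"pretty: {pretty_len} chars, compact: {compact_len} chars, saving: {saving:.0%}")
```

The compact form parses back to exactly the same structure, so nothing an agent needs is lost; only the layout intended for human eyes is removed.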
Looking towards the farther future, the infrastructure must also evolve on its own. “The factory needs to form an organization of Agents and become a ‘token factory capable of intelligent evolution’.”
Where will the reasonable profit margin of tokens be anchored in the balance between cost and value?
Zhou Yahui, the founder of Kunlun Wanwei, provided a reference for China Entrepreneur: From 2015 to 2025, the CPM (cost per thousand impressions) of mobile internet advertising increased by 10 to 20 times. He judged that “in the next 10 years, the price of tokens will also increase by 10 to 50 times. However, due to the continuous reduction of inference costs, the absolute price may