In recent years, developing chips in-house has become a necessary path for many system vendors. Some practitioners point out, however, that producing a chip only proves that a company has built a reliable chip design team; putting the chip into real-world use is the ultimate goal of in-house development. This is also one of the reasons many AI chip startups have been criticized in the past few years.

But for Alibaba, which has a wealth of application scenarios, this does not seem to be a problem.

At the Cloud Festival held last year, this Chinese Internet giant launched its first Arm server chip, Yitian 710. With its leading specifications, the chip has drawn the attention of practitioners in China and abroad since its debut, and its progress has been closely watched. A year later, at the recent 2022 Cloud Festival, Zhang Jianfeng, President of Alibaba Cloud Intelligence, announced that the company's self-developed CPU, Yitian 710, has been deployed at scale, making it the first self-developed CPU in China to be deployed at scale in the cloud.

Since Pingtou Brother Semiconductor was established in 2018, this world-leading cloud and e-commerce giant has achieved good results in chips, and the deployment of Yitian 710 is another milestone in its short history in the field.

Self-developed chips, the inevitable choice for development

Across the Internet industry, and especially for a company like Alibaba that runs a high-traffic e-commerce business and supplies a wide range of public cloud services, self-developed chips are an inevitable choice. First of all, they help reduce costs and advance the "dual carbon" goals (carbon peaking and carbon neutrality).

Take Arm server chips such as Yitian 710 as an example. Ever since the x86 architecture came to dominate the server chip market, Intel has been almost the only supplier, which left buyers with little bargaining power. Although AMD has put some pressure on Intel in recent years, the rapid rise of Arm and TSMC's steadily improving services have given server makers new options. As a result, cloud giants including Alibaba, Amazon, and Microsoft have all turned their attention to developing Arm CPUs.

According to industry analyses, using Arm CPUs for some cloud applications can reduce both cost and power consumption. This is a major benefit for companies that need to deploy servers in large numbers.

Let us return to Yitian 710, Pingtou Brother's first CPU.

According to Alibaba, the chip was developed for cloud scenarios and balances performance with ease of use. After a year of validation on real workloads, Yitian 710 has been deployed at scale and now provides cloud services. Yitian 710 cloud instances are integrated with the Feitian operating system and CIPU, and their cost-effectiveness in core scenarios such as databases, big data, video encoding and decoding, and AI inference has increased by more than 30%. Alibaba Cloud also provides a rich set of ecosystem tools that support adaptation across the full application ecosystem, so mainstream workloads can be migrated without code changes.

Alibaba has also pointed out that Yitian 710 cloud instances are used in Alibaba Group's core business and serve scientific research, the smartphone industry, and many well-known Internet companies. According to data provided by Alibaba, during the Double 11 period in 2021, the core transaction system of Tmall Double 11 migrated smoothly to Yitian 710 cloud instances, and the cost-effectiveness of its computing power increased by 30%. The advertising inference business of Huiliang Technology also runs on Yitian 710 cloud instances, with improvements in both performance and network bandwidth and a cost-effectiveness increase of more than 40%.

Using self-developed chips to strengthen a company's systems is not something Alibaba pioneered. For this Chinese cloud computing giant, however, it is the natural result of more than a decade of foresight.

As early as 2007, Alibaba's business was growing rapidly. The number of Taobao users was rising fast, traffic spikes in particular brought great uncertainty, and the IOE architecture (IBM minicomputers, Oracle databases, EMC storage) underlying the company's business could no longer keep up. Moreover, because no domestic computing stack existed at the time, IOE hardware and software were the standard solution, and a shortage of computing power could only be solved by buying more equipment.

Faced with this situation, Alibaba made a decision that many found incomprehensible at the time: to use cloud computing, then an advanced concept, to build a brand-new technical architecture for its huge business. Alibaba started developing the Feitian operating system in 2009 and established Alibaba Cloud, opening the era of self-developed cloud computing among Chinese enterprises. Feitian reportedly replaced the traditional centralized architecture with a distributed one, with the goal of connecting servers all over the world. To achieve this, Alibaba Cloud developed its own large-scale deployment system and automatic fault-handling system, greatly improving cluster-wide control and reaching, for the first time in the world, a single cluster of 5,000 servers.

Cloud computing greatly improved the efficiency with which Alibaba could obtain computing power, but the virtualization overhead that came with it posed a new challenge. Alibaba again took what was then a fairly radical approach: tackling virtualization overhead by combining software and hardware. The result was Alibaba Cloud's Shenlong architecture. According to Alibaba, the architecture offers both the elasticity of virtual machines and the high performance of physical machines, achieves zero performance loss through its combined software and hardware design, and for the first time fully releases the potential of cloud computing power.

After transforming its systems and applications, the cloud service provider pushed deeper into the technology stack. Self-developed chips, custom hardware, and a brand-new hardware system of chips, servers, and switches that delivers more competitive computing services to cloud customers have become the industry trend. This led to the establishment of DAMO Academy in 2017 and the emergence of Pingtou Brother Semiconductor. Yitian 710 is a product of Pingtou Brother's research.

In addition to this chip, Alibaba has released Han Guang 800, a chip deeply customized for AI scenarios. Meanwhile, Alibaba's Shenlong computing platform has gone through multiple iterations and grown into a new control and acceleration center called CIPU. CIPU breaks with the traditional CPU-centric cloud computing architecture: it quickly turns the data center's computing, storage, and network resources into cloud resources, accelerates them, and connects them to the operating system.

At the 2022 Cloud Festival, Alibaba Cloud demonstrated for the first time how Yitian 710 works together with CIPU and the Feitian operating system, and Yitian 710 became the first self-developed CPU to be used at scale in China. According to Alibaba, cloud services based on Yitian 710 improve cost-effectiveness by up to 80% in core scenarios such as databases, big data, video encoding and decoding, and web servers, and reduce power consumption per unit of computing power by more than 60%. Alibaba also said that over the next two years, 20% of Alibaba Cloud's new computing power will use self-developed CPUs, which is clearly cost-effective for the company.
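
To make these percentages concrete, here is a rough sketch of the arithmetic. It assumes that "cost-effectiveness" means performance per unit of cost and that the 60% figure is the drop in power drawn per unit of computing power. The function names and the baseline of 100 cost units are illustrative assumptions, not Alibaba's figures.

```python
def cost_for_same_work(baseline_cost: float, cost_effectiveness_gain: float) -> float:
    """Cost of doing the same work after a cost-effectiveness gain (0.80 means +80%)."""
    return baseline_cost / (1.0 + cost_effectiveness_gain)


def compute_per_watt_multiplier(power_reduction: float) -> float:
    """Extra computing power the same power budget yields when the power
    needed per unit of computing power falls by `power_reduction`."""
    return 1.0 / (1.0 - power_reduction)


if __name__ == "__main__":
    # Hypothetical baseline: a workload that costs 100 units today.
    print(round(cost_for_same_work(100.0, 0.80), 2))    # 55.56
    print(round(compute_per_watt_multiplier(0.60), 2))  # 2.5
```

Under these assumptions, an 80% gain in cost-effectiveness cuts the cost of a fixed workload by roughly 44%, and a 60% drop in power per unit of computing power lets the same power budget deliver about 2.5 times the computing power.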

Over the past 13 years, Alibaba Cloud has worked steadily on operating systems, databases, storage, networking, and chips, and has achieved a series of important results. It is the only cloud service provider in China with a complete self-developed software and hardware technology stack. In Alibaba's view, however, this is not the final form of the industry.

Continuous innovation, seize the right to define

Looking back, the development of the cloud computing industry over the years is a testament to the dedication and hard work of its practitioners. Cloud-native applications keep emerging, and going all-in on the cloud is gradually becoming a core strategy for businesses. At the same time, cloud computing is moving from its first decade, in which software technology development was driven by scale, into a new phase.

"In the next ten years, a self-developed computing system that integrates hardware and software will be the foundation on which cloud service providers stand. Only by continuing to innovate in the research and development of core technologies and products can we seize the right to define," emphasized Zhang Jianfeng, President of Alibaba Cloud Intelligence. In Alibaba's view, the computing architecture the industry has followed since abandoning IOE is about to undergo a new round of change, and mainstream cloud vendors such as AWS and Alibaba Cloud have taken the lead in developing new types of hardware and chips.

At the level of underlying computing architecture, Alibaba has explored new computing paradigms. In December 2021, DAMO Academy successfully developed the world's first DRAM-based, 3D bonded and stacked chip that integrates computing and storage. By using such a chip to overcome the performance bottleneck of the traditional von Neumann architecture, it created a new type of computing architecture. This achievement is expected to provide more efficient computing power for future AI scenarios.

In its overall device-to-cloud chip strategy, Alibaba has narrowed its focus and made RISC-V processor IP its core direction. Drawing on the Pingtou Brother team's deep experience in processor IP, as early as July 2019 Pingtou Brother released Xuantie 910, described at the time as the highest-performance RISC-V processor in the industry. The product has become the industry benchmark for building high-performance chips on the RISC-V architecture, making RISC-V a new choice for emerging fields such as 5G, artificial intelligence, network communication, and autonomous driving.

On the device side, Alibaba also continues to build out the ecosystem. As a member of the RISC-V Foundation's board of directors, it leads 11 important technical directions and has become a leader in the development of global RISC-V technology and its ecosystem. Since 2021, Pingtou Brother has continued to promote the deep integration of RISC-V and Android, greatly broadening the prospects of the RISC-V ecosystem. It has successfully made the Android 10.0 system run on Xuantie C910, including the Chrome browser.

Looking toward longer-term computing needs, Alibaba also continues to invest in quantum computing, seeking to go beyond the limits of traditional computing. In May 2018, Alibaba released the quantum circuit simulator "Tai Zhang", described as the most powerful in the industry, which used the computing power of Alibaba Group's computing platform to simulate Google's "quantum supremacy" plan of the time, redefining the boundary of "quantum supremacy". In March 2022, its quantum laboratory designed and fabricated a two-qubit quantum chip based on fluxonium, a new type of superconducting qubit. The chip achieved a single-qubit gate fidelity of 99.97% and a two-qubit iSWAP gate fidelity of up to 99.72%, the best level in the world for this type of qubit.

Alongside its push in hardware, Alibaba invests just as persistently in software, driven by the disruptive changes in software development paradigms that the company sees coming.

Zhang Jianfeng sees three shifts in the software development paradigm. First, new development methods are emerging and software architecture is going fully serverless. Second, software development is no longer the exclusive domain of programmers: with low-code, 80% of future applications will be developed directly by business personnel. Third, all future software will be AI-based, and open-source large models will accelerate the real popularization of AI.

Among these, serverless will turn cloud computing from a resource into a capability. Zhang Jianfeng said that so far cloud computing has replaced physical servers with cloud servers, but customers still buy cloud resources as servers with "so many cores and so many gigabytes". In the future, cloud computing will be fully serverless and closer to a power-grid model, in which customers pay according to the number of calls they make.
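
The difference between the two purchasing models can be sketched in a few lines. The prices and function names below are hypothetical, chosen only to illustrate the idea; they are not Alibaba Cloud's pricing.

```python
def instance_cost(hours: float, price_per_hour: float) -> float:
    """Resource model: pay for the time a server is provisioned, busy or idle."""
    return hours * price_per_hour


def per_call_cost(calls: int, price_per_million_calls: float) -> float:
    """Serverless model: pay only for the calls actually served."""
    return calls / 1_000_000 * price_per_million_calls


def break_even_calls(hours: float, price_per_hour: float,
                     price_per_million_calls: float) -> float:
    """Number of calls at which both models cost the same."""
    return instance_cost(hours, price_per_hour) / price_per_million_calls * 1_000_000


if __name__ == "__main__":
    # Hypothetical prices: 0.25 per server-hour, 2.00 per million calls, 720-hour month.
    print(instance_cost(720, 0.25))           # 180.0
    print(per_call_cost(10_000_000, 2.00))    # 20.0
    print(break_even_calls(720, 0.25, 2.00))  # 90000000.0
```

With these made-up prices, a service that handles fewer than 90 million calls a month is cheaper under the per-call model, while one that is busy around the clock is not.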

Going serverless will profoundly change the way software is developed. Software architecture is shifting from the original mainframe architecture to a serverless one, in which customers only need to develop business logic and no longer have to worry about operations and maintenance. Serverless also lowers the barrier to software development by providing more prebuilt modules, greatly improving the efficiency of software production.

Second, low-code will further lower the barrier to application development. Zhang Jianfeng believes that in the future 80% of applications will be developed by business personnel, and that not understanding low-code will be like not being able to use Word 20 years ago. Data shows that on DingTalk, more than 5 million low-code applications have been added in two years, built by more than 3.8 million low-code developers.

Finally, more and more software will be AI-enabled, and open-source large models will drive the real popularization of AI. Zhang Jianfeng said that open source is the core driving force of software progress: it has advanced software architecture in the past and will advance the development and popularization of AI applications in the future. To this end, DAMO Academy has joined hands with the CCF Open Source Development Committee to launch the AI model community "ModelScope", which aims to lower the barrier to applying AI. DAMO Academy has taken the lead by contributing more than 300 verified, high-quality AI models to the ModelScope community, more than one-third of which are Chinese models. All are fully open source and are offered as directly usable services.

"In the past ten years, AI research and development has advanced rapidly, but applying AI has always been a major problem. The high barrier to use has limited the potential of AI," said Zhou Jingren, Senior Vice President of Alibaba Group and Deputy Dean of DAMO Academy. In his view, AI models are relatively complex and often need to be retrained when applied to industry scenarios, which keeps AI in the hands of a few algorithm specialists and makes it hard to popularize. The newly launched ModelScope community puts the Model as a Service concept into practice, providing a large number of pre-trained foundation models that can be put to use quickly after minor adjustment for specific scenarios.

In short, Alibaba is striving to be a technology leader in every era.

A final note: in the "Cloud Computing White Paper" released in Changsha a few days ago, the China Academy of Information and Communications Technology pointed out that cloud services, as general-purpose computing power, have become key to enabling the transformation of enterprise business units. However, as enterprise digitalization deepens and digital applications diversify, users are demanding more of computing power in terms of type and quantity, effective awareness, and efficient use, and cloud services are gradually evolving into computing power services. The white paper also emphasizes that cloud computing can mask the differences between hardware architectures (CPU, GPU, FPGA), deliver different types of services (conventional computing, intelligent computing, edge computing), provide large-scale heterogeneous computing resources through a unified output, and meet computing needs across different scales and hardware architectures, making computing power broadly accessible.

This quietly accumulated experience is exactly why Alibaba can contribute across the underlying technologies of cloud computing and hold firmly to its leading position among China's public cloud service providers.
