How smart is artificial intelligence today?

  Full-size humanoid bionic robot, with a height of 1.77 meters and a weight of 52 kilograms. Photo/Reporter Li Na

  At the ongoing 2023 Zhongguancun Forum, "artificial intelligence" is undoubtedly the hottest keyword. Whether autonomous driving or smart wearables, quantum computing or 5G communication, or even carbon neutrality, many cutting-edge technologies depend on artificial intelligence. It is fair to say that over the next decade AI will keep changing every industry and the lives of ordinary people. At the forum's international technology trade conference, its exhibition (the Science and Technology Expo), and the AI-related parallel forums, a Beijing Youth Daily reporter saw that major companies had brought their latest AI achievements, including the general-purpose visual segmentation model SegGPT, 5G audio and video interactive applications, and bilingual digital humans.

  New applications for 5G communication

  Visual self-service brings a new interactive experience

  According to the latest data, China has 561 million 5G subscribers and has built and opened 2.312 million 5G base stations, accounting for more than 60% of the global total. In the first quarter, the national average 5G download speed was 334.98 Mbps, and the peak download rate was 472.92 Mbps. With network speeds this fast, and with the support of artificial intelligence, what can 5G be used for beyond social networking, everyday office work, and study?

  The "5G New Communication Intelligent Interactive Platform" exhibited by China Unicom builds on 5G's large bandwidth, low latency, and ubiquitous connectivity. It combines 5G audio and video interaction and atomic AI capabilities with AR/VR, 3D modeling, and intelligent interaction to deliver audio and video interaction as a 5G-native service. Using multimedia, 3D modeling, real-time tracking, sensing, and intelligent interaction technologies, the platform offers end-to-end visual, intelligent communication services and provides enterprises and government bodies with functions such as 5G audio and video interaction and intelligent virtual avatars.

  For example, on the financial industry platform, users can access bank counter services remotely with the same experience and privacy protection as an in-person visit. The energy industry platform rebuilds the application system for intelligent well-site coordination, digitizing the management of well-site resources and the planning of well-site inspection patrols. The transportation industry platform provides barrier-free intelligent communication services based on 5G new communication, including visual interactive services for elderly passengers.

  Notably, in terms of localization and independent control, the platform has been adapted to domestic mobile phone chips. The platform side supports domestic operating systems, and the handset side has been adapted to Huawei's Kirin chips and MediaTek's Dimensity (Tianji) chips, supporting domestic phones from Huawei, Xiaomi, OPPO, vivo, and Meizu.

  "Digital Homo sapiens" is smarter

  Integrated with large models, they can "understand you"

  A Digital Homo sapiens is, simply put, a virtual person. With a humanlike appearance and an artificial intelligence core, Digital Homo sapiens have begun to be commercialized in many industries, assisting human staff and improving the operational efficiency of enterprises. For example, in finance, cultural tourism, media, public services, medical care, retail, and other industries, a Digital Homo sapiens can act as a customer service agent, financial consultant, news anchor, or tour guide. In culture and entertainment, they can become IP assets as virtual idols, virtual singers, and so on. In intelligent vehicles, intelligent transportation, smart homes, and similar scenarios, they can be combined with smart devices to provide users with intelligent services.

  Tencent Cloud Intelligent's Small-Sample Digital Homo sapiens Production Platform was recently released for the first time. It needs only 3 minutes of video of a real person speaking to camera and 100 sentences of voice material. From multimodal audio and text input, the platform can model and generate high-definition portraits in real time, and it can produce a lifelike "Digital Homo sapiens" within 24 hours. Compared with digital humans generated from photos, which can only present facial features, a small-sample Digital Homo sapiens can make gestures designed to match the text, and its lip movements, mouth shapes, and expressions reproduce those of the real person.

  At this year's Zhongguancun Forum, Beijing Youth Daily also tried replacing a human anchor with a Digital Homo sapiens virtual anchor for a 7×24-hour live broadcast, which attracted many viewers.

  In the past, however, Digital Homo sapiens were clearly weaker than real people at thinking. The "Zhipu AI Brain Digital Homo sapiens" launched by Zhipu AI at the Zhongguancun Forum is smarter. It is no longer confined to fixed interaction patterns and can understand the intent behind human instructions. Zhipu AI was spun out of research results from Tsinghua University's Department of Computer Science. In 2022, the company jointly developed GLM-130B, a bilingual pre-trained model at the 100-billion-parameter scale, and led the construction of a high-precision general-purpose knowledge graph. It combined the two into a cognitive engine driven by both data and knowledge, and built ChatGLM on this 100-billion-parameter base model. The company aims to use its cognitive large model to connect billions of users in the physical world, empower digital humans in the metaverse, and serve as the base for humanoid robots, giving machines the ability to "think" like humans. The Digital Homo sapiens is also bilingual and can speak both Chinese and English.

  "Driverless" vehicles take to the streets

  The latest pedestrian prediction model debuts

  Today, in Yizhuang and other places, you can already hail Baidu's self-driving vehicles. In the future, as the technology develops and policies permit, the on-board safety officers will be withdrawn and the vehicles will become truly driverless.

  According to Baidu, the core of its driverless technology is Apollo, Baidu's "car brain" platform, which comprises four modules: high-precision maps, positioning, perception, and intelligent decision-making and control. The latest version of Apollo introduces several deep-learning-based models, releases a low-speed pedestrian prediction model based on semantic maps, and adds imitation learning based on semantic maps.

  At this Zhongguancun Forum, Defiance Technology released a self-developed intelligent pallet four-way shuttle system. As discrete equipment within a flexible logistics system, the intelligent pallet four-way shuttle allows "one vehicle to run the whole warehouse". Why "flexible logistics"? According to the company, the first reason is that the system combines discrete equipment with distributed control, so user enterprises can combine and deploy units as needed, like building blocks. The second is the "dynamic scalability" of the whole system: user enterprises can add or remove four-way shuttles at any time to match off-peak and peak seasons or business growth, adjusting the carrying capacity of the system.

  Smart cities go lower carbon

  AI "housekeeper" manages water, electricity, and air conditioning

  In the construction of smart cities, AI plays an increasingly important role. For example, AI can be used for urban infrastructure management, such as automatically monitoring the structural health of roads, bridges and buildings, and detecting and repairing cracks and potholes on roads; AI can help cities manage energy, for example, by analyzing energy usage data to achieve more efficient energy use and optimize the city’s energy system; AI can also help cities protect the environment, for example, through air quality monitoring, garbage disposal and water resources management, etc., to improve the environmental quality of cities.

  So how can AI cut carbon in buildings to help meet the carbon peak and carbon neutrality goals? With an emphasis on making full use of clean energy, the carbon management platform shown by Henghua Digital is built on a "building brain" neural network system and favors cost-effective products. It covers the sensing nodes at the building's endpoints and those on major energy-using equipment. Through unified, coordinated management by the building brain's edge computing server, the building's energy-using equipment can run efficiently and unnecessary energy waste is eliminated as far as possible. Guided by the analysis of the edge computing model, the energy consumption curve of each energy-using subsystem is kept stable and overall energy consumption is minimized.

  Electricity accounts for the largest share of a building's energy consumption. Based on the characteristics of building low-voltage systems, the company developed a compact, accurate, and easy-to-install low-voltage monitoring and AI control system that requires no additional renovation work. It dynamically monitors the building's power system and cuts power promptly in unoccupied areas to avoid unnecessary waste. The air conditioning system, meanwhile, accounts for 40% of a building's total energy consumption. Through in-depth cooperation with colleges and universities, including jointly established industry-university-research bases, Henghua Digital developed a strategy algorithm for optimizing building cooling and heating source systems and formed a mature data algorithm model, which brings the energy saving rate of air conditioning systems to more than 10%. The project has so far been deployed in Guangdong, Tianjin, Jiangxi, Sichuan, Hubei, and Anhui. In the future, residential communities, office buildings, and shopping malls will all "evolve" in a green, low-carbon direction.
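
  The two air conditioning figures above imply a lower bound on whole-building savings. The following is a rough worked example only. It assumes the 10% saving applies uniformly to the 40% air conditioning share, which is an assumption made here and not a claim from Henghua Digital, and the function name is hypothetical.

```python
# Figures quoted in the article.
HVAC_SHARE_PERCENT = 40    # air conditioning share of total building energy use
HVAC_SAVING_PERCENT = 10   # lower bound on the air conditioning energy saving rate


def whole_building_saving_percent(share_percent, saving_percent):
    """Saving expressed as a percentage of the building's total energy use.

    Assumes the saving applies uniformly to the given share (an assumption
    of this sketch, not something stated in the article).
    """
    return share_percent * saving_percent / 100


print(whole_building_saving_percent(HVAC_SHARE_PERCENT, HVAC_SAVING_PERCENT))  # 4.0
```

  Under that assumption, a saving of more than 10% on air conditioning alone corresponds to more than 4% of the building's total energy consumption.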

  AR glasses "simultaneous interpretation"

  Smart wearable devices help accessibility

  As artificial intelligence works its way into every aspect of life, AI-equipped devices are becoming smaller. Smart watches, for example, can answer calls, reply to WeChat messages, and track exercise. Smart glasses look like ordinary glasses but let the wearer make phone calls and listen to music.

  The smart glasses on display at the Zhongguancun Forum, however, are more practical. Named "Bright Listener's Smart Glasses", they are binocular waveguide AR smart glasses.

  VR glasses immerse the wearer in a virtual world, whereas AR glasses do not block the line of sight. They blend the real world with the virtual one, enabling functions that the real world alone cannot offer. For example, hearing-impaired people often run into difficulties at work, in social situations, and in study because they cannot hear, or cannot hear clearly. These glasses convert speech into text and display it in front of the wearer's eyes. They also offer simultaneous interpretation: they recognize different languages and convert speech into Chinese or other written languages, helping users follow conversations in international settings. The glasses are light and portable, weighing only 79 g. Compared with the 200-300 g AR glasses currently on the market, they are well suited to long-term wear. They can be fitted with lenses for myopia, hyperopia, astigmatism, presbyopia, and other conditions. No light leaks from the outside of the glasses, so the content is visible only to the wearer and privacy is protected. The glasses also feature real-time subtitles with millisecond-level latency, a noise reduction algorithm, and accurate sound pickup within 5 meters, and translation accuracy can exceed 95%. The product is reported to be ready for mass production.

  Privacy-preserving computing technology goes open source

  Used in finance, medical care, insurance, and other fields

  Privacy computing, also known as privacy-preserving computing, refers to a family of information technologies that analyze and compute on data without the data provider disclosing the original data. It makes data "usable but invisible" as it circulates and is combined, so that the value of the data can be transformed and released. Privacy-preserving computing provides the protection for private data that future industries urgently need. At the Zhongguancun Forum Exhibition (Science and Technology Expo), Ant Group announced for the first time the open-sourcing of a complete version centered on key basic software. All nine core technologies are open source, including the privacy computing technology "Argot". In other words, the technology platform is open to users worldwide, who can use its product functions directly without calling APIs or writing code, helping them explore privacy computing application scenarios at low cost.

  According to reports, Argot has been applied in finance, medical care, insurance, and other scenarios. For example, Shanghai Pudong Development Bank, working with Ant Group's Argot platform, has identified more than 145,000 high-risk users and prevented billions in high-risk loans from being issued. In medical care, the Ant privacy computing platform and the Alibaba Cloud digital medical team jointly built a data fusion platform for hospital operations management. It provides managers with digital performance analysis, helps hospitals establish a fine-grained operations management system, and reduces hospitals' economic and clinical risks. In insurance, institutions settling claims used to query medical institutions in plaintext (that is, with unencrypted data) and so obtained original data they did not need. Ant's solution defines logical data queries and uses privacy computing techniques such as secure multi-party computation, so that insurers receive only the result of whether a claim should be paid and none of the original data, protecting the privacy of claimants.
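
  The article names secure multi-party computation without describing how it works. One of its basic building blocks is additive secret sharing, sketched below in a minimal form. This is an illustration of the general idea only, not Ant Group's implementation: the function names, the modulus, and the example amounts are all hypothetical. Each party splits its private number into random shares, so that no single share reveals anything, yet the shares together reconstruct the total.

```python
import secrets

# Arithmetic is done modulo a large prime; this particular value is an
# arbitrary choice for the sketch.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties):
    """Split a non-negative integer into n random shares that sum to it (mod MODULUS)."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def reconstruct(shares):
    """Recombine shares into the value they encode."""
    return sum(shares) % MODULUS


def secure_sum(private_values):
    """Compute the sum of all parties' values without pooling the raw inputs.

    Party i splits its value and hands share j to party j. Each party then
    adds up the shares it holds, and only those partial sums are combined.
    """
    n = len(private_values)
    all_shares = [split_into_shares(v, n) for v in private_values]
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return reconstruct(partial_sums)


# Three hypothetical institutions each hold a private amount.
print(secure_sum([1200, 800, 450]))  # 2450
```

  A yes-or-no answer of the kind described for insurance claims requires further protocols, such as secure comparison, on top of sharing like this. The sketch shows only why a result can be computed while individual inputs stay hidden.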

  Voices

  Large models will change the world, and technology to govern them should be studied at the same time

  ChatGPT is one of the most talked-about new developments in science and technology in 2023, and its release set off a large language model frenzy. Many companies, including Baidu, Alibaba, Zhihu, SenseTime, and JD.com, have launched their own large models one after another. Another major area of AI, the visual counterpart of GPT, also appeared at this Zhongguancun Forum: the vision team of the Zhiyuan Research Institute (Beijing Academy of Artificial Intelligence) officially launched the general segmentation model SegGPT, described as the first general-purpose vision model that uses visual prompts to complete arbitrary segmentation tasks.

  According to reports, SegGPT sets aside the language-model approach and interacts through images rather than words. For example, if a user gives SegGPT a picture and circles the rainbow in it, then supplies many more pictures containing rainbows, SegGPT can automatically find the rainbow in each and outline it. SegGPT can be described as "learn one, know all": given one or several example images and masks indicating the intent, the model grasps what the user wants and completes similar segmentation tasks by analogy. SegGPT also works "at a touch": given an interactive prompt such as a point or bounding box on the picture to be predicted, it identifies and segments the specified object. Many functions can be built on this. For example, when a robot arm reaches for a tomato or other object, the robot can quickly determine where the edge of the tomato is and pick it up accurately without crushing it.
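
  The example-driven interaction described above can be illustrated with a deliberately simple toy. The sketch below is not SegGPT, which is a large neural network. It is a hypothetical stand-in that mimics only the interface: an example image plus a mask marks what the user wants, and the same kind of region is then marked in a new image. Here "similar" just means close in color, and all names, colors, and the distance threshold are assumptions made for the illustration.

```python
def mean_color(image, mask):
    """Average RGB color of the pixels selected by the mask."""
    selected = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, keep in zip(image_row, mask_row)
        if keep
    ]
    if not selected:
        raise ValueError("mask selects no pixels")
    count = len(selected)
    return tuple(sum(pixel[c] for pixel in selected) / count for c in range(3))


def segment_like_example(example_image, example_mask, target_image, max_distance=60.0):
    """Mark target pixels whose color is close to the example's masked region."""
    reference = mean_color(example_image, example_mask)

    def is_close(pixel):
        distance = sum((pixel[c] - reference[c]) ** 2 for c in range(3)) ** 0.5
        return distance <= max_distance

    return [[1 if is_close(pixel) else 0 for pixel in row] for row in target_image]


# Tiny 2x2 "images" of RGB tuples; the colors are made up for the example.
SKY = (90, 160, 230)
RED = (220, 30, 30)
SIMILAR_RED = (210, 40, 35)

example_image = [[SKY, RED], [SKY, SKY]]
example_mask = [[0, 1], [0, 0]]          # the user "circles" the red pixel
target_image = [[SIMILAR_RED, SKY], [SIMILAR_RED, SKY]]

print(segment_like_example(example_image, example_mask, target_image))
# [[1, 0], [1, 0]]
```

  A real model learns what counts as "the same kind of thing" from data instead of comparing raw colors, which is what lets it handle objects such as rainbows or tomatoes across very different pictures.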

  At present, China's large models are in a state of "letting a hundred flowers blossom and a hundred schools of thought contend". Robin Li (Li Yanhong), founder, chairman, and CEO of Baidu, said at the Zhongguancun Forum that artificial intelligence has once again become the focus of human innovation and that more and more people recognize the fourth industrial revolution is coming. He stressed: "The big model has changed artificial intelligence, and the big model is about to change the world." Dai Qionghai, an academician of the Chinese Academy of Engineering and chairman of the Chinese Association for Artificial Intelligence, also said that artificial intelligence will bring changes in many areas: a new paradigm for scientific research (the origin of the universe, the laws of nature, the mysteries of life); people's life and health (AI drug discovery, remote virtual surgery); the main economic battlefield (virtual creation, industrial manufacturing, spiritual interaction); and major national defense needs (multi-source situation analysis, AI ground-air forward deployment).

  Notably, some have also sounded warnings in the face of these changes. Kai-Fu Lee, chairman and CEO of Sinovation Ventures, said: "AI will still make mistakes and will confidently talk nonsense. It can only be used to generate first drafts and develop ideas, not final versions. AI needs continuous human intervention to avoid fallacies or disasters. In addition, AI may still raise legal and ethical problems. AI is therefore not suitable for every field and can only be applied where fault tolerance is high." He emphasized: "AI may create false information and may be used by criminals to deceive users. So while developing it, we must also study the technologies for controlling AI and the laws and regulations for managing it." This edition/Reporter Wen Wei