On October 17, 2023, dressed in a white shirt and white shoes, Baidu’s founder, chairman, and CEO Robin Li took the stage to announce the arrival of a new era. At the conference, titled “Generate the Future,” Li formally released version 4.0 of the company’s large language model, the Wenxin large model, and showed the audience how to use prompts to get the upgraded model to help people use their Beijing provident fund to buy a house in Hebei, make ads and videos, and write web novels.
At the event, Robin Li confidently stated that the capability of Wenxin large model 4.0 is “no less than GPT-4”. His declaration from six months earlier that “all applications are worth reconstructing again with the big model” also came to fruition that day. From the core search business to the rest of the Baidu “family bucket” of applications, including Baidu Wenku (the document library), Baidu Netdisk, and Baidu Maps, Baidu’s products have been connected to the Wenxin large model and show better interaction and logic capabilities. On the enterprise side (the B-end), Robin Li also unveiled GBI, a generative business intelligence tool, as well as the office assistant “RuLiu”, which is powered by the large model. If generative AI has energized the entire technology industry, then Baidu may be the giant that benefits the most.
1. Wenxin 4.0, a direct counterpart to GPT-4
At the start of the conference, Robin Li announced the release of Wenxin large model 4.0. Baidu divides a large model’s abilities into four defining capabilities: understanding, generation, logic, and memory. Version 4.0 keeps the same basic architecture as versions 3.0 and 3.5, but Baidu claims greater improvements in logic and memory.
According to Baidu’s CTO Wang Haifeng, Wenxin large model 4.0 improves understanding and generation by similar margins, while the improvement in logic is three times that in understanding and the improvement in memory is twice that in understanding. The four capabilities raise efficiency in different application scenarios, and Robin Li gave a live demonstration of each.
Understanding is the basis on which conversational AI helps users, and it is very important in fields such as government services, marketing, and customer service.
To test understanding, the model was given a prompt with inverted word order and vague phrasing: “I want to go back to Chengde to buy a house, can I use the provident fund to get a loan? What are the procedures? I work in Beijing.” To understand this, the AI must realize that “working in Beijing” and “going back to Chengde to buy a house” together mean “paying into the provident fund in Beijing while holding a hukou (household registration) in Chengde”. The model has to grasp this very Chinese subtext in order to give the user an accurate answer. Sure enough, Wenxin Yiyan quickly grasped the crux of the question and answered correctly.
Generation mainly improves the efficiency of brand marketing, copywriting, and creative work. On stage, Robin Li showed that, starting from a single picture and natural-language prompts, the model can swap the background, blur the subject, and generate posters and copy based on information from the product’s official website. Beyond conventional image processing, Baidu also demonstrated video generation this time. In the live demo, from a natural-language prompt, Wenxin Yiyan generated a video advertisement presented by a digital human with almost no delay. The video incorporated product images and a number of background transitions, and a digital human in a suit appeared from time to time to introduce product features drawn from the official website.
Work that previously required several AIGC products working together was blended seamlessly into this demonstration. The entire process of generating a commercial, five pieces of ad copy, and a poster took less than 3 minutes.
Logical reasoning is usually tested with mathematical problems. For this presentation, Baidu focused on its potential in education. Robin Li posed a word problem that involved converting the volume of a cone into the volume of a rectangular cuboid, and Wenxin Yiyan not only gave the answer but also solved the problem step by step and analyzed the knowledge points involved in each step.
For the memory demonstration, Baidu’s choice was rather unusual: it had Wenxin Yiyan write the outline and setting for a martial arts novel. Once that was done, the model was asked to add character relationships and dramatic conflicts on top of the original outline, to show that after complex information was added it still remembered the original setting and the characters’ abilities, without drifting off and making things up.
Baidu also shared the technical underpinnings of the Wenxin large model’s progress. Baidu had previously announced that Wenxin is the first large model in China to be trained on a cluster of ten thousand accelerator cards, and many people have speculated that Wenxin large model 4.0 exceeds one trillion parameters. At the conference, however, Baidu did not emphasize the model’s parameter count. Besides the ten-thousand-card training, Baidu’s CTO mentioned that the weekly average stability of training has exceeded 98% and that knowledge enhancement is applied on both the input and the output side.
2. Reconstructing the Baidu family bucket
Although they were demonstrated separately, the four basic capabilities are more often than not used in combination. In May, Baidu announced that it would use the large model to reconstruct its applications, and at the conference it showed the latest results. The most striking of these is the reconstruction of search.
In February, Microsoft launched the New Bing, which uses GPT technology to rebuild its search. In his recent testimony, Microsoft’s CEO Satya Nadella said that Microsoft’s share of the search market has remained virtually unchanged since it added artificial intelligence capabilities to Bing. The New Bing is essentially a conversational bot: users chat with it and ask questions to get synthesized information with links. Google’s Bard is similar.
But Baidu’s search reconstruction reaches deeper into the entire search system. Baidu describes it with three phrases: “ultimate satisfaction, recommendation and inspiration, multi-round interaction”. Ultimate satisfaction shows up in the search box: when the user enters a question, search no longer returns a list of links but directly generates the best answer.
In the demonstration, Robin Li asked how countries have ranked by industrial value added over the past 20 years. Unlike the New Bing and Bard, which might return figures with a link, the new Baidu search directly produced an animated bar chart of industrial value added by country, in which the bars grow and the rankings change over time. The recommendation and inspiration feature is roughly equivalent to the related questions in today’s search engines: it prompts the user to go on and explore related issues, such as “What is the relationship between industrial value added and GDP?” and “What is the impact of industry on national economic development?”.
Multi-round interaction is also very interesting. In the current wave of large-language-model startups, one direction many entrepreneurs are pursuing is to combine a large language model with a recommendation engine and hold multiple rounds of conversation that lead the user to an optimal choice. Buysmart.AI, the first-prize winner of Baidu’s Wenxin Cup Startup Competition in September, is one of the leaders in this direction: users clarify their needs through natural language and point-and-click choices, and Buysmart.AI uses a recommendation engine to recommend the products they need most. The reconstructed Baidu search builds similar functionality directly into search. In the demo, the query was “Where is the best place to go hiking around Beijing?” The search engine returned several answers, such as Baihua Mountain and Haituo Mountain, and let the user point and click to add details about their own situation. For example, after the user selected “hiking novice” and “with children”, the search engine switched to recommending places such as Xishan and Baiwang Mountain, which are relatively easy to climb and friendlier to family outings.
Beyond search, Baidu also showed the reconstruction of Baidu Netdisk, Baidu Maps, Baidu Wenku, and other applications.
Baidu Netdisk’s personal cloud assistant, Cloud One, was launched earlier. Billed as the world’s first personal cloud assistant, it now has 20 million users. Users talk to the assistant in natural language, and with a single sentence can find a video in their personal cloud, have the assistant understand its content, locate a particular item in the video, summarize its most quotable lines, and so on.
Baidu Maps, according to Baidu’s publicity, is the world’s first AI-native map product. By talking to the map’s assistant, users can reach in one step any of thousands of services that used to sit behind multi-level menus. The assistant can also recommend conveniently located restaurants, compare their ambience to help the user choose, and finally make a reservation directly.
Baidu Wenku draws on a library of one billion documents. After the user searches for material on a specific topic and ticks the type of article needed, whether serious academic literature or general-interest material, it generates an article with one click. The reconstructed Baidu Wenku also adds PPT generation: it can tell whether points are in a parallel or a progressive relationship and can switch PPT styles, and Baidu claims that it far surpasses other PPT generation tools on the market.
3. Powering the B-end
In this presentation, Baidu also showed some new B-end applications, with the spotlight on a business intelligence product. Baidu GBI (Generative Business Intelligence) is a brand-new product that Baidu describes as the first generative business intelligence product in China. It supports natural-language interaction, cross-database analysis, and learning of professional knowledge, and can cut data analysis work that used to take business analysts a dozen or so days down to minutes.
In the promotional video, GBI is faced with questions such as: “How much is the cost estimate? What is the lowest price at which we will not lose money? The client asked us to deliver within 3 months, can we do it? How soon can we do it? What can we do if, for example, a competitor’s offer is lower than ours?” Baidu GBI can answer this series of questions on financial analysis, project interaction, and user analysis directly through natural-language dialog and present the answers as charts. No specialist is needed to operate it, and no extra steps are required to access data across databases and tables. On top of that, companies can train it on specialized knowledge so that it becomes an industry expert.
Another B-end product is RuLiu. Reconstructed with generative AI, RuLiu can generate meeting minutes with a single click and summarize the contents of thousands of work groups. Combined with an enterprise’s CRM system, it can provide managers with project backgrounds and talking points. It can also plan work according to a personal schedule, send meeting invitations, and so on. Beyond office work, Baidu also demonstrated how large models empower autonomous driving, intelligent cockpits, and government intelligent-monitoring projects.
In the half year or so since its release, Wenxin has iterated rapidly and reconstructed Baidu’s applications while gradually building out the Wenxin ecosystem. Baidu also introduced the recently launched Spirit Realm platform at the conference. Personal or enterprise data and applications alike can quickly become plug-ins on the Spirit Realm platform and use an API to access the capabilities of the Wenxin large model.
According to Baidu, in the month since the Spirit Realm platform went live, 27,000 developers have applied to join, covering native applications in more than 20 areas, including legal advice, resume generation, mind-map creation, and speaking practice. Baidu says that enterprises can connect their private data to the model’s state-of-the-art capabilities easily and quickly, without risk of leakage. “China has rich application scenarios, Chinese users are naturally willing to embrace new technologies, and with the advanced basic big model, we can build a prosperous AI ecosystem and create a new round of economic growth together,” said Robin Li.