To build artificial intelligence (AI) systems that can interact with people in smarter, safer, and more useful ways, we must teach them to adapt to our needs. Today, we are launching BlenderBot 3, our state-of-the-art conversational agent, which converses naturally with people, who can then give the model feedback on how to improve its responses. We will share data from these interactions, and we have already shared the BlenderBot 3 model and model cards with the scientific community to help advance conversational AI research.
The BlenderBot series has made progress in combining conversational skills, such as personality, empathy, and knowledge, with long-term memory and Internet search to carry out meaningful conversations. BlenderBot 3 inherits these abilities and offers superior performance because it is built on OPT-175B, Meta AI's publicly available language model, which is about 58 times the size of BlenderBot 2.
Since all conversational AI chatbots are known to sometimes mimic and generate unsafe, biased, or offensive responses, we have conducted large-scale studies, co-hosted workshops, and developed new techniques to create safeguards for BlenderBot 3. Despite this work, BlenderBot can still make rude or offensive comments, so we're collecting feedback that will help improve future chatbots.
The promise and challenge of chatting with humans
Allowing an AI system to interact with people in the real world leads to longer and more diverse conversations, as well as more varied feedback. For example, you can react to each chat message in our BlenderBot 3 demo by clicking the thumbs-up or thumbs-down icon. Choosing a thumbs-down lets you explain why you didn't like the message, whether it was off topic, nonsensical, rude, spam-like, or something else. You can also post comments in the chat itself.
Developing a safe chatbot that improves itself
To improve BlenderBot 3’s ability to interact with people, we trained it on a large amount of publicly available language data. Many of the data sets used were collected by our own team, including a new data set consisting of more than 20,000 conversations with people based on more than 1,000 conversation topics. We’ve trained BlenderBot 3 to learn from conversations and improve the skills people think are most important, from talking about healthy recipes to finding kids’ services around town.
When the chatbot's response is unsatisfactory, we collect feedback on it. With this data, we can improve the model so that it doesn't repeat its mistakes.
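To make the feedback loop concrete, here is a minimal sketch of what a single feedback record and a simple aggregation over it might look like. The field names and reason labels are illustrative assumptions, not BlenderBot's actual schema.

```python
# Hypothetical shape of one piece of conversational feedback.
# Field names and reason labels are illustrative, not Meta's real schema.
from dataclasses import dataclass
from collections import Counter
from typing import Optional, List


@dataclass
class FeedbackRecord:
    message: str            # the bot utterance being rated
    rating: str             # "thumbs_up" or "thumbs_down"
    reason: Optional[str]   # e.g. "off_topic", "rude", "spam"; None for thumbs_up


def summarize(records: List[FeedbackRecord]) -> Counter:
    """Tally thumbs-down reasons so the most common failure modes surface first."""
    return Counter(r.reason for r in records if r.rating == "thumbs_down")


records = [
    FeedbackRecord("Try the pasta place downtown!", "thumbs_up", None),
    FeedbackRecord("As I was saying, the moon is cheese.", "thumbs_down", "off_topic"),
    FeedbackRecord("That question is stupid.", "thumbs_down", "rude"),
    FeedbackRecord("Buy my crypto course now!!!", "thumbs_down", "spam"),
]

print(summarize(records))
```

Grouping negative feedback by reason is one straightforward way such data could prioritize which mistakes to fix first.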
We understand that not everyone who uses chatbots has good intentions, so we also developed new learning algorithms to distinguish helpful responses from harmful examples. Over time, we will use this technique to make our models more responsible and safe for all users.
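One common way to guard a learning system against bad-faith raters is to weight each person's feedback by a trust score. The sketch below estimates trust from agreement with a small set of "gold" examples whose correct rating is known, then scales votes accordingly. This is an illustrative heuristic under assumed names, not Meta's actual algorithm.

```python
# Sketch: down-weight feedback from unreliable raters.
# trust_score and weighted_vote are hypothetical helpers for illustration.
from typing import Dict, List, Tuple


def trust_score(user_ratings: Dict[str, str], gold_labels: Dict[str, str]) -> float:
    """Fraction of a user's ratings on gold examples matching the known label."""
    graded = [ex for ex in user_ratings if ex in gold_labels]
    if not graded:
        return 0.5  # unknown raters get neutral trust
    matches = sum(1 for ex in graded if gold_labels[ex] == user_ratings[ex])
    return matches / len(graded)


def weighted_vote(feedback: List[Tuple[str, str]], trust: Dict[str, float]) -> float:
    """Aggregate thumbs up (+1) / down (-1), each vote scaled by rater trust."""
    return sum((1.0 if vote == "up" else -1.0) * trust[user]
               for user, vote in feedback)


gold = {"ex1": "up", "ex2": "down"}
trust = {
    "alice": trust_score({"ex1": "up", "ex2": "down"}, gold),  # agrees: 1.0
    "bob": trust_score({"ex1": "down", "ex2": "up"}, gold),    # troll: 0.0
}

score = weighted_vote([("alice", "up"), ("bob", "down")], trust)
print(score)  # the troll's vote carries no weight
```

Feedback that survives this kind of filtering could then be used to fine-tune the model, while adversarial input is effectively discarded.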
Putting BlenderBot 3 to the test
Compared to its predecessors, BlenderBot 3 improved by 31% on conversational tasks. It is also twice as knowledgeable and is factually incorrect 47% less often. We also found that only 0.16% of BlenderBot's responses to people were flagged as rude or inappropriate.
The goal of our research is to collect and publish feedback data that we and the broader AI research community can build upon over time. In that way, we can find new ways to make AI systems safer and more engaging for the people who use them.
Driving conversational AI forward
Progress in the field of AI is highly dependent on the opportunity for the broader AI research community to build on the best available technology. Therefore, releasing chatbot models and datasets is key to gaining comprehensive and reliable insights into how and why they work, their potential, and their limitations.
While BlenderBot 3 represents significant progress for publicly available chatbots, it is certainly not at a human level. It is occasionally incorrect, inconsistent, and off topic. As more people interact with our demo, we will improve our models using your feedback and publish the data to benefit the wider AI community.
Learn more about BlenderBot 3