Microsoft details AI model designed to improve Bing search
Microsoft today detailed a large neural network model it’s been using in production to improve the relevance of Bing searches. The company says that the model, called a “sparse” neural network, is complementary to existing large Transformer-based networks like OpenAI’s GPT-3.
Transformer-based models have been getting a lot of attention in the machine learning world. These models excel at understanding semantic relationships, and they've been used to enhance Bing search, as Microsoft has previously revealed. But they can fail to capture more nuanced relationships between query and webpage terms that go beyond pure semantics.
That’s where Microsoft’s new Make Every feature Binary (MEB) model comes in. The large-scale, sparse model has 135 billion parameters — the parts of the machine learning model learned from historical training data — and space for over 200 billion binary features that reflect the subtle relationships between searches and documents. Microsoft claims that MEB can map single facts to features, allowing the model to gain a more nuanced understanding of individual facts.
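To make the "every feature is binary" idea concrete, here is a minimal sketch of how query-document term pairs might each be hashed to their own binary feature slot, so the model can learn a distinct weight per fact instead of generalizing across similar terms. The function name, hashing scheme, and feature-space size are illustrative assumptions, not Microsoft's actual implementation.

```python
import hashlib

# Hypothetical feature space; MEB reserves room for over 200 billion features.
NUM_FEATURES = 2**28

def binary_features(query_terms, doc_terms):
    """Map each (query term, document term) pair to its own binary feature.

    Every co-occurrence gets a dedicated slot, so the model can assign a
    distinct learned weight to each individual fact.
    """
    features = set()
    for q in query_terms:
        for d in doc_terms:
            # Hash the pair into a fixed feature space (illustrative only).
            h = hashlib.md5(f"{q}|{d}".encode()).hexdigest()
            features.add(int(h, 16) % NUM_FEATURES)
    return features

feats = binary_features(["hotmail"], ["microsoft", "outlook"])
```

At serving time, only the handful of features active for a given query-document pair need to be looked up, which is what makes a sparse model with billions of parameters practical to run in production.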
Microsoft says that MEB, which was trained on more than 500 billion query and document pairs from three years of Bing searches, is running in production for 100% of Bing searches in all regions and languages. It’s the largest universal language model that the company is serving to date, occupying 720 GB when loaded into memory and sustaining 35 million feature lookups during peak traffic time.
Supercharging Bing searches
Many models overgeneralize when filling in the blank in a sentence like “[blank] can fly.” For example, the models might only fill the blank with the word “birds.” MEB avoids this by assigning each fact to a feature, so it can assign weights that distinguish between the ability to fly in, say, a penguin versus a puffin. Instead of simply saying “birds can fly,” MEB paired with Transformer models can take this to another level of classification, saying “birds can fly, except ostriches, penguins, and these other birds.”
MEB can continue to learn as more data is added, according to Microsoft, meaning its capacity grows with new data. It's refreshed daily through continuous training on the latest click data, with an auto-expiration strategy that checks each feature's timestamp and filters out features that haven't shown up in the last 500 days.
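The auto-expiration step described above can be sketched as a simple timestamp filter over the feature table. The 500-day window comes from the article; the data structure and function below are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Features unseen for this long are dropped (per the article's description).
EXPIRY = timedelta(days=500)

def prune_features(feature_timestamps, now):
    """Keep only features whose last-seen timestamp is within the window.

    feature_timestamps: dict mapping a feature key to its last-seen datetime.
    """
    return {f: ts for f, ts in feature_timestamps.items()
            if now - ts <= EXPIRY}

now = datetime(2021, 7, 1)
table = {
    "hotmail|outlook": datetime(2021, 6, 1),  # seen recently: kept
    "stale|feature": datetime(2019, 1, 1),    # not seen in >500 days: dropped
}
pruned = prune_features(table, now)
```

Pruning stale features keeps the table from growing without bound while the daily training runs keep adding freshly observed ones.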
For example, MEB learned that "Hotmail" is strongly correlated with "Microsoft Outlook," even though the two are not close in semantic meaning. Similarly, it learned a strong connection between "Fox31" and "KDVR," where KDVR is the call sign of the TV station in Denver, Colorado that operates under the brand Fox31.
MEB can also identify negative relationships between words or phrases, revealing what users don’t want to see for a query. For example, users searching for “baseball” usually don’t click on pages talking about “hockey,” even though they are both popular sports. Understanding these negative relationships can help to omit irrelevant search results.
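Because each feature carries its own learned weight, negative relationships like the baseball/hockey example can simply be weights below zero that pull a document's score down. The scoring function and the specific weight values below are a hypothetical sketch of that idea, not MEB's actual ranking formula.

```python
def score(active_features, weights):
    """Sum the learned weights of the features active for a query-document
    pair; negative weights demote results users tend to skip."""
    return sum(weights.get(f, 0.0) for f in active_features)

# Hypothetical learned weights for (query term, document term) features.
weights = {
    ("baseball", "baseball"): 2.0,   # strong positive match
    ("baseball", "hockey"): -1.5,    # learned negative relationship
}

relevant = score({("baseball", "baseball")}, weights)
irrelevant = score({("baseball", "hockey")}, weights)
```

Under this scheme a hockey page scores below an unrelated page for a baseball query, which is how the negative signal helps omit irrelevant results rather than merely failing to boost them.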
After deploying MEB into production, Microsoft says it saw an almost 2% increase in clickthrough rate on the top search results, a reduction of more than 1% in manual search query reformulation, and a drop of over 1.5% in clicks on pagination. Users needing to click the "next page" button means they didn't find what they were looking for on the first page.
“We’ve found very large sparse neural networks like MEB can learn nuanced relationships complementary to the capabilities of Transformer-based neural networks,” Microsoft wrote in a blog post. “This improved understanding of search language results in significant benefits to the entire search ecosystem.”