More relevant searches for the less relevant search engine

Sebastian Moss

August 5, 2021

Microsoft has introduced a new large-scale sparse model it claims improves results for its search engine, Bing.

The 'Make Every feature Binary' (MEB) model is designed to complement Transformer-based deep learning models such as OpenAI's GPT-3, Google's Switch Transformer, or Microsoft's own Turing NLG.

MEB is now fully deployed across Bing, in every language and location. It is the company's largest universal language model in production, taking up 720GB when loaded into memory and sustaining 35 million feature lookups during peak traffic time.

Binary results

Deep neural network (DNN) language models can overgeneralize, Microsoft said, failing to capture the more nuanced relationships between language, searches, and web pages.

For example, when filling in "___ can fly," such a model may only suggest "birds," because that is the answer most of its training data points to.

MEB can be used in addition to the DNN, Microsoft said, assigning individual facts to individual features. So it could say “birds can fly, except ostriches, penguins, and these other birds.”

The artificial intelligence system can also learn hidden intents behind searches, Microsoft said. For example, it was able to learn that searches for "Hotmail" were actually for "Microsoft Outlook," the company's rebranded inbox.

The new model can also understand negative correlations, such as the fact that those searching for "baseball" don't want to be given "hockey" results.
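
To make that concrete, here is a minimal, hypothetical sketch of how a sparse model of this kind can layer memorized facts, exceptions, and negative correlations on top of a dense model's relevance score. The feature names and weights below are invented for illustration and are not drawn from MEB itself.

```python
# Hypothetical illustration, not Microsoft's code: a sparse model is, in effect,
# a huge weight table keyed by binary features, so it can memorize exceptions
# and negative correlations that a dense DNN tends to smooth over.
from collections import defaultdict

# One binary feature per (query term, document term) pair; missing features
# default to a weight of zero. Weights are hard-coded here purely for
# illustration; in a real system they would be learned from click data.
sparse_weights = defaultdict(float)
sparse_weights[("fly", "birds")] = 2.1          # general fact
sparse_weights[("fly", "ostrich")] = -1.7       # memorized exception
sparse_weights[("fly", "penguin")] = -1.5       # memorized exception
sparse_weights[("baseball", "hockey")] = -2.0   # negative correlation

def sparse_score(query_terms, doc_terms):
    """Sum the weights of every binary feature that fires for this query/document pair."""
    return sum(sparse_weights[(q, d)] for q in query_terms for d in doc_terms)

def combined_score(dnn_score, query_terms, doc_terms):
    """Add the sparse 'memory' on top of the dense model's relevance score."""
    return dnn_score + sparse_score(query_terms, doc_terms)

# A hockey page scores lower for a baseball query, even if the DNN likes it.
print(combined_score(1.0, ["baseball"], ["hockey", "scores"]))  # -> -1.0
```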

The company said it trained the system on three years of Bing searches, containing more than 500 billion query/document pairs (for comparison, SEOTribunal estimates Google gets about 2 trillion searches a year, but data is limited).

The company expects MEB to keep learning as more data is added, and has an auto-expiration feature that filters out irrelevant data that hasn't proved useful in the last 500 days.
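
The auto-expiration step might look something like the sketch below; the table layout and field names are assumptions made for illustration, not details Microsoft has published.

```python
# Illustrative sketch of feature auto-expiration: drop any feature that has not
# proved useful in the last 500 days. The data layout is an assumption, not
# Microsoft's implementation.
from datetime import datetime, timedelta, timezone

EXPIRY = timedelta(days=500)

def prune_stale_features(feature_table, now=None):
    """feature_table maps feature -> {"weight": float, "last_useful": datetime}."""
    now = now or datetime.now(timezone.utc)
    return {
        feature: entry
        for feature, entry in feature_table.items()
        if now - entry["last_useful"] <= EXPIRY
    }
```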

"For each Bing search impression, we use heuristics to determine if the user was satisfied with the document(s) they clicked," Microsoft explained in a blog post.

"We label these “satisfactory” documents as positive samples. Other documents in the same impression are labeled as negative samples. For each query and document pair, binary features are extracted from the query text, the document URL, title, and body text. These features are fed into a sparse neural network model to minimize the cross-entropy loss between the model’s predicted click probability and the actual click label."

With the system rolled out, Microsoft said it has seen an "almost two percent" increase in clickthrough rate (CTR) on the top search results. Manual query reformulation (that is, people rewriting searches) dropped by more than one percent. Clicks to the next page of search results fell 1.5 percent.

A bit about Bing

Microsoft launched Bing back in 2009, following on from half-hearted efforts like MSN Search and Windows Live Search.

At the time, the company poured around $100 million into advertising Bing across the US, and aggressively pushed the search engine as a superior competitor to Google.

Over the following few years, Microsoft threw more and more money behind it – including $550m to make it the default search provider on Verizon's BlackBerry phones, and an even bigger deal to have Bing power Yahoo! searches.

Such efforts brought Bing's US market share up from 8.4 percent at launch to 15.4 percent by 2012, according to comScore. But they also proved incredibly expensive.

That year, the company took a $6.2 billion writedown on its online services division (which had already lost $10.4bn over the previous five years).

Most of that was due to its $6.3bn purchase of online display advertising company aQuantive in 2007, which was meant to integrate tightly with Bing.

Losses continued to mount until 2015, when the division finally turned a profit on quarterly revenue of more than $1 billion. This was in part helped by Bing being aggressively integrated into Windows 10.

In its latest earnings call, Microsoft said that search revenue was still growing, but did not provide details about users.

Reliable market share stats are lacking, with NetMarketShare estimating Bing to have a 6.81 percent market share globally as of 2020, behind Google's 82 percent and Baidu's 8.28 percent.

Microsoft claims that its search engine is used far more than people think – at least on desktop – suggesting back in 2017 that a third of PC searches in the US were on Bing. This has not been corroborated by other sources.

With Google integrating its search services into Android, and paying Apple billions to do the same on iOS, the company's stranglehold over the growing mobile market has proved impossible to break.

In the strange world of trillion-dollar tech businesses, Bing is an unusual operation. It is widely regarded as a failure, due to its comparatively tiny market share, but it is still a business that contributes billions to Microsoft's bottom line, and is used by tens to hundreds of millions of consumers.

But in its original mission, it has failed spectacularly. The business was created to take on Google, a desperate effort to snuff out a company that would later take over the browser market, eat into Office sales with Google Docs, and launch its own operating system as a broadside against Windows.

Instead, Bing seems to have done the opposite. Repeatedly, when antitrust officials around the world trained their crosshairs on Google, the company pointed to Bing as competition. Look, Google executives said, we can't be a monopoly because there's this well-funded, aggressive competitor biting at our heels.

In many ways, that may prove Bing's most powerful legacy.
