Does your search engine find “all” or “any” query words?

The search matching rule really matters

To run a search engine, you have to understand the relationship between the input (search terms) and the output (search results). There may be a lot of query processing going on, but the most basic question is how the search engine handles multi-word queries. The main choices are to find documents with all the words in the query, or documents with any of the words in the query.

Match all words in the query

Imagine searching for product information mypartnumber. This will match only documents that contain all three terms: product and information and mypartnumber.

Advantages and disadvantages

  • A small number of matches, likely to answer the question.
  • Easy to understand why the documents got matched.
  • Can miss useful documents which have slightly different vocabulary, like info-sheet or product page.
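As a rough illustration of the match-all rule, here is a minimal Python sketch. The three tiny documents and the matches_all helper are invented for this example, and real search engines do far more processing (stemming, phrase handling, and so on), so treat it as a toy model rather than how any particular engine works.

```python
def matches_all(query: str, text: str) -> bool:
    """Return True only if every query word appears in the text."""
    doc_words = set(text.lower().split())
    return all(word in doc_words for word in query.lower().split())


# Hypothetical documents, made up for this illustration.
docs = {
    "spec": "product information for mypartnumber",
    "sheet": "info-sheet for mypartnumber",
    "catalog": "product information and pricing",
}

query = "product information mypartnumber"
hits = [name for name, text in docs.items() if matches_all(query, text)]
print(hits)  # ['spec']
```

Only one document matches. The info-sheet page is about the same part number, but it is missed because it uses slightly different vocabulary.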

Match any words in the query

Again, take the example of searching for product information mypartnumber. This will match every document that contains product or information or mypartnumber.

Advantages and disadvantages

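  • A complete result set: no document containing any of the query words is left out.
  • Relevance ranking can show the documents with all the words at the top of the results.
  • Likely to find other useful pages for mypartnumber.
  • The number of results can be very high, so the relevance ranking has to be very good.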
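To make the match-any rule concrete, here is a minimal Python sketch that also ranks documents by how many query words they contain. The documents, the count_matches helper, and the count-based score are all assumptions made for this example; real relevance ranking is considerably more sophisticated.

```python
def count_matches(query: str, text: str) -> int:
    """Count how many distinct query words appear in the text."""
    doc_words = set(text.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in doc_words)


# Hypothetical documents, made up for this illustration.
docs = {
    "spec": "product information for mypartnumber",
    "sheet": "info-sheet for mypartnumber",
    "catalog": "product information and pricing",
    "news": "company picnic announced",
}

query = "product information mypartnumber"
scores = {name: count_matches(query, text) for name, text in docs.items()}

# Keep documents with at least one matching word, best matches first.
ranked = sorted(
    (name for name, score in scores.items() if score > 0),
    key=lambda name: scores[name],
    reverse=True,
)
print(ranked)  # ['spec', 'catalog', 'sheet']
```

The document with all three words comes first, and the info-sheet page still appears, although the catalog page, which does not mention the part number at all, ranks above it.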

A little history: the early web search engines, like Lycos and AltaVista, matched any word on any page they found. This quickly became unwieldy, so HotBot and Google chose to match only pages which had all the words in the query. As of August 2011, Bing (and therefore Yahoo) behaves differently for long queries and will find pages containing most of the words in the query. This can be annoying.

How to find out whether your search engine matches on all terms or any term:

  1. Do a search on your search engine for a word that you know is on many documents in the site, like the company name.
  2. Do a search for a word that you know is not in the search index, maybe a made-up one like ztyclrqqp, so you get the no-matches result.  (If your search engine tries to be clever and automatically changes it to something else, you may need to put a + before the word.)
  3. Now do a search with both words: name +ztyclrqqp 

If the search engine finds no results, you know that it is matching on all words in the query, because ztyclrqqp doesn’t exist on your site or intranet (though it now does on mine).

If it finds results, you know it’s probably matching on any word in the query.  That means the number of results will be high (which may distress some users), so the relevance ranking has to be very good, putting the best matches first and being transparent about what matches mean.
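The three-step test can be sketched in Python as follows. Everything here is hypothetical: count_results stands in for whatever way you have of getting a result count from your own search engine, and toy_engine is a stand-in built only so the sketch can run on its own.

```python
from typing import Callable


def detect_matching_rule(
    count_results: Callable[[str], int],
    common_word: str,
    fake_word: str = "ztyclrqqp",
) -> str:
    """Apply the three-step test and report the likely matching rule."""
    if count_results(common_word) == 0:
        return "inconclusive: the common word found nothing"
    if count_results(fake_word) != 0:
        return "inconclusive: the made-up word found something"
    both = count_results(f"{common_word} +{fake_word}")
    return "all" if both == 0 else "any (probably)"


# Toy stand-in for a real search engine, made up for this illustration.
PAGES = ["acme product information", "acme contact page"]


def toy_engine(mode: str) -> Callable[[str], int]:
    combine = all if mode == "all" else any

    def count_results(query: str) -> int:
        # Strip a leading + so "+ztyclrqqp" is treated as the plain word.
        words = [w.lstrip("+").lower() for w in query.split()]
        return sum(
            1 for page in PAGES if combine(w in page.split() for w in words)
        )

    return count_results


print(detect_matching_rule(toy_engine("all"), "acme"))  # all
print(detect_matching_rule(toy_engine("any"), "acme"))  # any (probably)
```

The two sanity checks at the top mirror steps 1 and 2: if the common word finds nothing, or the made-up word finds something, the combined query tells you nothing.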


If you have questions about this, please leave a comment here.

I have lots more information at searchtools.com, and provide search analysis, configuration, and training — contact me for rates.
