Friday, October 17, 2008

Keyword Power

Word matching is the key to effective searching. For several years now, we've been proponents of thinking about keywords in four categories:
  • words that are effective "as is"
  • words that are important but for which there is probably a better word
  • words that have little effect
  • stop words
Words in the top two groups are Power Words. The remaining words weaken or do nothing to help the search.

When you think about it, most of the words you might use in a query fall into the second group. Unless the word is a proper noun, a number, or the name of an object that has no other name and no other meanings, there are almost always one or more alternative words that could be used. Figuring out what those other words are is one of the two big challenges of speculative searching (the other is figuring out where to look if Google doesn't have the information).

On the topic of word choices, other factors also make keywords more powerful. The first of these is 'Multiplicity of Meanings.' The fewer meanings a given spelling has, the more powerful the word, and a word with just one meaning is strongest. This is why BISON is more powerful than BUFFALO. The latter has numerous meanings, including use as a verb.

'Specificity' is another word quality that radiates power. A very specific word probably has only one meaning. Combining weak words usually results in greater specificity. To strengthen BUFFALO, add another word that is specific to the animal, for example, BUFFALO HIDE. Separately, each word is weak, but together they mean only one thing.

'Frequency of Usage' is another important factor. Unique combinations of words are very powerful because they tend to be used together only in specific contexts. This is the case with DEAF HORSE SOCCER, described in the previous post. If you include too many words in a unique string, however, the odds multiply against you. Keep in mind that you are looking for a page that contains all of these words.
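To see why every extra word raises the odds against you, here is a rough sketch of "all words must appear" matching. The three pages and their text are invented purely for illustration, and `matching_pages` is a hypothetical helper, not how any real search engine works internally.

```python
# Hypothetical mini "index": page name -> page text (invented for illustration).
PAGES = {
    "page1": "a deaf horse learns to play soccer",
    "page2": "horse soccer league results",
    "page3": "soccer rules for beginners",
}


def matching_pages(query: str, pages: dict[str, str]) -> list[str]:
    """Return the names of pages that contain every word in the query."""
    terms = set(query.lower().split())
    return sorted(
        name
        for name, text in pages.items()
        if terms <= set(text.lower().split())  # all query words present
    )


# Each added word can only keep or shrink the set of matching pages.
for q in ("soccer", "horse soccer", "deaf horse soccer", "deaf horse soccer stadium"):
    print(q.upper(), "->", matching_pages(q, PAGES))
```

In this toy data, the three-word string narrows the results to a single page, while one word too many leaves nothing at all.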

Finally, and there may be debate about this, verbs tend to make weak search terms. They don't describe objects well, and objects are usually what is being sought. If you can turn a verb into a noun, you almost always improve the query. This is the case with the Earthquake Challenge: "What toy models a construction principle that reduces damage from earthquakes?" There are two verbs here ("models" and "reduces") and neither one helps. If you change "reduces" to "reduction," the query is more powerful. However, the combination TOY CONSTRUCTION EARTHQUAKES is unique enough by itself to retrieve relevant information. Very few pages include all three of those keywords, making it an infrequently used string and quite powerful.
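As a rough sketch, trimming the Earthquake Challenge question down to candidate keywords might look like the following. The stop-word and verb lists are hand-picked assumptions for this one question, and `candidate_keywords` is a hypothetical helper, not a general-purpose tool.

```python
# Assumed word lists, hand-picked for this single question (illustration only).
STOP_WORDS = {"what", "a", "that", "from"}
VERBS = {"models", "reduces"}  # the two verbs in the question

QUESTION = (
    "What toy models a construction principle that reduces damage from earthquakes?"
)


def candidate_keywords(question: str) -> list[str]:
    """Lowercase the words, strip punctuation, and drop stop words and verbs."""
    words = [w.strip("?.,!\"'").lower() for w in question.split()]
    return [w for w in words if w and w not in STOP_WORDS and w not in VERBS]


print(candidate_keywords(QUESTION))
# ['toy', 'construction', 'principle', 'damage', 'earthquakes']
```

The searcher still has to judge which of the remaining words carry power. Here, TOY CONSTRUCTION EARTHQUAKES is the subset that turns out to be enough.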
