Noise Words
For optimal search performance, the Microsoft Search service treats certain sets of words as "noise" words. These words are ignored and cannot be searched for with CONTAINS(), CONTAINSTABLE(), FREETEXT(), or FREETEXTTABLE(). The noise words are maintained in several files, one per language, in the directory MSSQL/FTDATA/SQLServer/Config. The noise file for English is noise.eng. The database administrator can add words to this list or delete words from it by editing the file while the Microsoft Search service is not running.
Developers need to be aware of how noise words behave in the search string argument of the four predicates and functions mentioned above. Depending on the type of query, the search service may or may not perform the match. Within a search string, you can use AND or OR to join multiple words. For example, suppose you have this search string:
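An application that needs to know which words the service ignores could read the same list. The following is a minimal sketch, assuming the noise file is plain text with words separated by spaces or newlines. The function name load_noise_words, the encoding default, and the path passed in are assumptions for illustration, not part of SQL Server.

```python
from pathlib import Path


def load_noise_words(path, encoding="utf-8"):
    """Return the words in a noise file as a set of lowercase strings.

    Assumption: the file is plain text with words separated by
    whitespace. Pass a different encoding if the file requires it.
    """
    text = Path(path).read_text(encoding=encoding)
    # str.split() with no arguments splits on any run of whitespace,
    # so spaces and line breaks are handled the same way.
    return {word.lower() for word in text.split()}
```

Lowercasing the words makes later membership checks case-insensitive, provided the caller also lowercases each search term before testing it.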
'java OR object OR both'
The word both is considered a noise word and so is ignored when the search is run. However, suppose you change the search string to this:
'java AND object AND both'
The search will fail and you'll get the following error:
Server: Msg 7619, Level 16, State 1, Line 1
A clause of the query contained only ignored words
If you're writing an interface that lets users search content stored in a database, it's usually best to include a word parser that strips noise words from the query before it is run. Otherwise, users who don't know which noise words to avoid will get an error or, if the application traps the exception, an empty result set.
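A word parser of this kind might look like the sketch below. It is an illustration under stated assumptions rather than a definitive implementation: NOISE_WORDS is a small hypothetical sample (a real application would use the full list for its language), build_search_string is a made-up name, and only plain alphanumeric words are recognized. Each surviving word is wrapped in double quotes so that it is passed as a literal term.

```python
import re

# Hypothetical sample only; a real application would load the full
# noise-word list for its language.
NOISE_WORDS = {"a", "an", "and", "both", "or", "the"}


def build_search_string(user_input, operator="AND", noise_words=NOISE_WORDS):
    """Drop noise words from user_input and join the rest with operator.

    Returns None when no searchable word is left, so the caller can
    skip the query instead of sending a clause made only of noise words.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    # Keep only runs of letters and digits; punctuation is discarded.
    words = re.findall(r"[A-Za-z0-9]+", user_input)
    kept = [w for w in words if w.lower() not in noise_words]
    if not kept:
        return None
    # Quote each word so it is treated as a literal search term.
    return f" {operator} ".join(f'"{w}"' for w in kept)
```

For the input java object both, the function returns "java" AND "object". The resulting string should still be passed to the query as a parameter rather than concatenated into the SQL text.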
CAUTION
Exact-phrase queries require the noise words to remain in the search string. The noise words are part of the full-text catalog, so if they are stripped out of an exact phrase, the phrase is no longer valid and the search results will not be as expected.
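One way to respect this caution in a word parser is to leave quoted phrases untouched and strip noise words only from the loose terms around them. The sketch below assumes users mark exact phrases with double quotes. NOISE_WORDS is again a hypothetical sample and strip_noise_outside_phrases is a made-up name.

```python
import re

# Hypothetical sample only; use the full list for the target language.
NOISE_WORDS = {"a", "and", "both", "of", "the", "to"}

# Match either a double-quoted phrase (group 1) or a bare word (group 2).
TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def strip_noise_outside_phrases(user_input, noise_words=NOISE_WORDS):
    """Return search terms, keeping quoted phrases exactly as typed."""
    terms = []
    for phrase, word in TOKEN.findall(user_input):
        if phrase:
            # Exact phrase: keep every word, noise words included.
            terms.append(f'"{phrase}"')
            continue
        # Bare word: drop any stray quote left by an unbalanced phrase.
        word = word.strip('"')
        if word and word.lower() not in noise_words:
            terms.append(word)
    return terms
```

Given the input both "the art of java" object, the function keeps the quoted phrase whole and returns two terms: "the art of java" and object. The loose word both is dropped, while the noise words inside the phrase are preserved.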