Number 248 - January 2004

Search Engines Public and Private
by Burt Leavenworth, Member Boca Raton Computer Society
   Information is power. This aphorism is truer than ever in the information age, and nowhere more so than on the Internet. If you are using your computer mostly to send and receive e-mails, you are wasting a powerful resource. Most Boca Raton Computer Society (BRCS) members with computers should be aware of, if not regularly using, search engines on the web such as Google (www.google.com), Hotbot (www.hotbot.com) and Fast (www.alltheweb.com). Your web browser is very useful when you can direct it to a specific web site, and addresses are easy to obtain these days from newspaper articles and magazines. But the latent power of the browser can be more fully exploited when you are looking for information on a topic and do not have a specific address to use. In this case, search engines can be used to "surf" the web. They are easy to use once you have mastered some basic principles. This article discusses these principles as they apply to search engines, both public and private (terms the author uses to distinguish two types of search engine, defined below).

   What is the difference between public and private search engines? A public search engine uses keywords to index pages on the web which contain articles or documents on a particular topic useful to the public at large. The sum total of these pages is in effect a huge public database of all the information that can be accessed by the search engine. These indexes are created and maintained by personnel supporting the search engines. On the other hand, a private search engine uses keywords to index documents or notes contained in an individual's private database. These indexes are correspondingly created and maintained by the individual for his or her private use.

   Suppose we want to find out about Boca Raton. We fire up one of our favorite search engines, Google, type in the two keywords: Boca Raton, and then click on Google Search. By default, Google displays the first ten documents containing all the keywords of the query. In this case, some of the results obtained are the city of Boca Raton, the Chamber of Commerce, the Boca Raton Resort and Club, the Boca Pointe Community Hospital, and the Boca Raton Museum of Art. The criterion 'all the keywords' is an example of a search mode called 'All of the Words'. A second search mode, called 'Any of the Words', would allow you to search for documents that contain as few as one of the provided keywords. But this mode will generally return many more results than the previous mode. If we use this mode in Google (actually we had to click 'Advanced Search' to get this option, whereas Hotbot provides both these modes plus several more on its home web page), one of the results returned is Raton, New Mexico (you can find it in your atlas).
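   The difference between the two modes can be sketched in a few lines of Python. This is only an illustration: the page titles below are made up for the example and are not real search results, and the names PAGES, words and search are the sketch's own, not part of any search engine.

```python
# Hypothetical page titles standing in for a web index (assumed for illustration).
PAGES = [
    "City of Boca Raton",
    "Boca Raton Museum of Art",
    "Raton, New Mexico",
    "Delray Beach Chamber of Commerce",
]

def words(text):
    """Lowercase a string and split it into a set of words, dropping commas."""
    return set(text.lower().replace(",", " ").split())

def search(query, mode="all"):
    """Return the pages that contain all (or any) of the query's keywords."""
    keywords = words(query)
    match = all if mode == "all" else any
    return [page for page in PAGES if match(k in words(page) for k in keywords)]

# 'All of the Words' finds only the two Boca Raton pages.
for page in search("Boca Raton", mode="all"):
    print("all:", page)

# 'Any of the Words' also picks up Raton, New Mexico.
for page in search("Boca Raton", mode="any"):
    print("any:", page)
```

   The 'any' mode can never return fewer pages than the 'all' mode, which is why it tends to produce longer result lists.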

   Let us now turn our discussion to private search engines which will elaborate the above ideas and introduce some new ones. Consider a small private database containing the names of restaurants in the local area:

      Carafiello's Restaurant
      "Deerfield Beach" Italian
      The Lobster House Restaurant
      "Boca Raton" seafood
      La Trattoria Restaurant
      "Boca Raton" Italian
      Busch's Seafood Restaurant
      "Delray Beach"
      La Luna Restaurant
      "Boca Raton" Italian

   This is not a real example but is designed to motivate and illustrate the search principles discussed in the article. The types of search queries will be the same as those used by most public search engines.

   Note that the type of restaurant (Italian, seafood) and the location (Boca Raton, Deerfield Beach) have been entered as attributes. The type is not needed in the case of Busch's because Seafood is already part of the name of the restaurant. Instead of entering, for example, Boca and Raton as separate words in the database, we combine them into the single term "Boca Raton" by enclosing them in quotation marks. This is an example of an exact phrase. Of course, we could have used the abbreviation: Boca.
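   One way a program might treat a quoted exact phrase as a single term is shown below. This is a minimal sketch in Python, not a description of how any particular product works; the function name tokenize is an assumption made for the example.

```python
import re

def tokenize(text):
    """Split text into lowercase terms, keeping a double-quoted phrase as one term."""
    # Each match is either a quoted phrase (first group) or a plain word (second group).
    parts = re.findall(r'"([^"]+)"|(\S+)', text)
    return [(phrase or word).lower() for phrase, word in parts]

# "Boca Raton" stays together; the other words are split on spaces.
for term in tokenize('The Lobster House Restaurant "Boca Raton" seafood'):
    print(term)
```

   Lowercasing every term means that a query for seafood also matches the word Seafood in Busch's name.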

   Suppose we want to find all the seafood restaurants. Using 'All of the Words' as the search mode and the keywords: seafood restaurant, the search engine would return:

      The Lobster House
      Busch's

   To be more specific, suppose we want only the seafood restaurants in Boca. Still using 'All of the Words', and the three keywords: seafood restaurant "Boca Raton", the result returned is only:

      The Lobster House
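
   The two queries above can be reproduced with a short Python sketch over the sample database. It assumes each entry is stored as a name plus a line of attributes; the names RECORDS, terms and all_of_the_words are invented for the illustration.

```python
import re

# The article's sample database: (name, attributes) pairs.
RECORDS = [
    ("Carafiello's Restaurant", '"Deerfield Beach" Italian'),
    ("The Lobster House Restaurant", '"Boca Raton" seafood'),
    ("La Trattoria Restaurant", '"Boca Raton" Italian'),
    ("Busch's Seafood Restaurant", '"Delray Beach"'),
    ("La Luna Restaurant", '"Boca Raton" Italian'),
]

def terms(text):
    """Return the set of lowercase terms in text; a quoted phrase is one term."""
    parts = re.findall(r'"([^"]+)"|(\S+)', text)
    return {(phrase or word).lower() for phrase, word in parts}

def all_of_the_words(query):
    """Return the names of entries that contain every term of the query."""
    wanted = terms(query)
    return [name for name, attrs in RECORDS
            if wanted <= terms(name + " " + attrs)]

# Finds The Lobster House and Busch's.
print(all_of_the_words("seafood restaurant"))

# Finds only The Lobster House.
print(all_of_the_words('seafood restaurant "Boca Raton"'))
```

   Because the name and the attributes are searched together, Busch's is found even though seafood appears only in its name.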

   There is just one more idea that needs to be introduced, that of a Boolean phrase, named for the English logician George Boole (you can look him up by using one of the public search engines mentioned above). The Boolean phrase equivalent of the first example would be: seafood AND restaurant.

   Now suppose we want to find a restaurant in Boca that is not Italian. Using Boolean phrase as the search mode, we enter the search query: restaurant AND "Boca Raton" NOT Italian, and the result is: The Lobster House. The advantage of using Boolean phrases is that they enable us to put together more complicated queries.
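
   A Boolean query of this kind can be evaluated with a small Python sketch. It is deliberately simplified and is not how InfoSearch or any public search engine is implemented: it understands only AND and NOT, reads the query from left to right, and the names RECORDS, tokenize and boolean_search are assumptions made for the example.

```python
import re

# The article's sample database: (name, attributes) pairs.
RECORDS = [
    ("Carafiello's Restaurant", '"Deerfield Beach" Italian'),
    ("The Lobster House Restaurant", '"Boca Raton" seafood'),
    ("La Trattoria Restaurant", '"Boca Raton" Italian'),
    ("Busch's Seafood Restaurant", '"Delray Beach"'),
    ("La Luna Restaurant", '"Boca Raton" Italian'),
]

def tokenize(text):
    """Split text into terms, keeping a double-quoted phrase as one term."""
    parts = re.findall(r'"([^"]+)"|(\S+)', text)
    return [phrase or word for phrase, word in parts]

def boolean_search(query):
    """Evaluate a query built from terms joined by AND and NOT (no OR, no parentheses)."""
    required, excluded = set(), set()
    negate = False
    for token in tokenize(query):
        if token == "AND":
            continue              # AND is implied between required terms
        if token == "NOT":
            negate = True         # the next term must be absent
            continue
        (excluded if negate else required).add(token.lower())
        negate = False
    results = []
    for name, attrs in RECORDS:
        have = {t.lower() for t in tokenize(name + " " + attrs)}
        if required <= have and not (excluded & have):
            results.append(name)
    return results

# Finds only The Lobster House.
print(boolean_search('restaurant AND "Boca Raton" NOT Italian'))
```

   The query seafood AND restaurant gives the same two restaurants as the 'All of the Words' search for seafood restaurant.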

   Each of the constructions we have mentioned (all of the words, any of the words, exact phrase, and Boolean phrase) is provided in one form or another by most public search engines and, of course, by our private search engine. But what are private search engines good for? They enable the individual user to store away, in unstructured form, any kind of information that is useful to him or her. This might consist of names, addresses, phone numbers, driving directions, records of phone conversations, medical records, ideas, etc. My wife has names of handymen under "H" in the Rolodex, but it is much easier to use associations in a private database because you can attach multiple associations to an object; you can forget the name of the object but are still likely to find it using one of its associations or attributes. Much of this information is otherwise entered on scraps of paper which are easily lost.

   There is a commercial product called Info Select (www.miclog.com) which is a private search engine with lots of bells and whistles, a steep learning curve and a steep price, but there is also a homegrown product by the author called InfoSearch which is free to members of BRCS. If you want to start keeping track of things and would like to play around with the software, you can drop a note to the author at: edlsoft@adelphia.net for more information.

   So start using the public search engines to look up all kinds of information and a private search engine in order to store and retrieve your personal information.