Next: Slides 10-13 Up: Notes on Information Retrieval Previous: Slides 1-6

Slides 7-8

Controlled vocabulary systems - The Library of Congress Subject Headings are considered by many to be the definitive controlled indexing vocabulary. A good description of the subject hierarchy is available at http://www.unc.edu/courses/jomc050/loc/lcsh.html.

NASA maintains its own controlled vocabulary system [Sil94]. Boeing has trained its technical writers and translators to use only the officially specified Boeing controlled vocabulary when writing manuals for its products, and is also developing checking tools [nas93] to ensure compliance. This policy was adopted after a 747 crashed, killing all on board, because a Brazilian technician misunderstood one ambiguous noun phrase and carried out a maintenance step incorrectly. Bear in mind that the manuals for a 747 weigh more than the plane itself.
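To make the idea of a compliance checking tool concrete, here is a minimal sketch in Python. It is not a description of the tools cited above: the word list `APPROVED_TERMS` and the function `unapproved_words` are hypothetical names invented for illustration, and a real controlled vocabulary would be far larger and would also constrain grammar and word senses.

```python
import re

# Hypothetical approved vocabulary, invented for this example only.
APPROVED_TERMS = {"remove", "the", "panel", "and", "inspect", "valve", "replace"}


def unapproved_words(sentence, approved=APPROVED_TERMS):
    """Return words in `sentence` that are not in the approved vocabulary.

    Words are lower-cased, and each offending word is reported once,
    in order of first appearance.
    """
    flagged = []
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word not in approved and word not in flagged:
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    # "check" is not in the approved list, so it is flagged.
    print(unapproved_words("Remove the panel and check the valve."))
```

A checker of this kind only tests membership in a word list; it cannot by itself detect an ambiguous noun phrase built entirely from approved words.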

Try following this path on Yahoo: entertainment - movies and film - actors and actresses - P - Pacino, Al - filmography. You will see that Yahoo maintains a carefully planned hierarchy in its subject tree, with hand-selected pages at the bottom.
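A subject tree of this kind can be pictured as nested categories with page lists at the leaves. The sketch below is an assumption made for illustration, not Yahoo's actual data structure: the names `SUBJECT_TREE` and `lookup` are hypothetical, and the leaf entry is a placeholder rather than a real page.

```python
# Hypothetical fragment of a subject tree; category labels follow the path above.
SUBJECT_TREE = {
    "entertainment": {
        "movies and film": {
            "actors and actresses": {
                "P": {
                    "Pacino, Al": {
                        # Leaf: a list of hand-selected pages (placeholder entry).
                        "filmography": ["(hand-selected page)"],
                    },
                },
            },
        },
    },
}


def lookup(tree, path):
    """Follow `path` (a list of category labels) down the tree.

    Returns the node reached, or None if any label is missing.
    """
    node = tree
    for label in path:
        if not isinstance(node, dict) or label not in node:
            return None
        node = node[label]
    return node


if __name__ == "__main__":
    path = ["entertainment", "movies and film", "actors and actresses",
            "P", "Pacino, Al", "filmography"]
    print(lookup(SUBJECT_TREE, path))
```

The point of the structure is that every page sits under exactly one editor-chosen path, which is what distinguishes it from free-text indexing.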

Free-text systems are the most important systems in IR today, and nearly all IR research uses free-text indexing. In particular, WWW search engines all employ free-text indexing to maintain their catalogues. The advent of the WWW has given IR a massive boost as a science, and search engine sites in particular are a highly lucrative commercial area.
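In contrast to a controlled vocabulary, free-text indexing treats every word that occurs in a document as a potential index term. One common way to do this is an inverted index, sketched below in Python. This is a minimal illustration under simplifying assumptions (no stemming, stop words, or ranking), not a description of how any particular search engine works; `build_index`, `search`, and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


if __name__ == "__main__":
    # Invented sample documents, for illustration only.
    docs = {
        1: "Al Pacino filmography",
        2: "Boeing 747 maintenance manual",
        3: "Pacino interview",
    }
    index = build_index(docs)
    print(search(index, "pacino"))
    print(search(index, "Pacino filmography"))
```

No human assigns subject headings here: the index terms are simply whatever words the authors happened to use.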


Nerbonne J.
1999-09-20