Synonym Resolution on the Web

The Web is a vast resource of information on practically anything one can think of. Unfortunately, the information is mostly in unstructured text, making it difficult for machines to process. This talk presents new methods for identifying synonymous objects and relations on the Web, on top of an information extraction system. New techniques developed for this problem include a novel probabilistic model for synonym extraction, and a highly scalable clustering algorithm. The results have been integrated into an application that allows searching over a large set of relations extracted from the Web, and they hold promise for improved search technology.
