TECHNICAL PUBLICATIONS:

Modeling information scent: a comparison of LSA, PMI-IR and GLSA similarity measures on common tests and corpora

 

This paper compares three systems that estimate semantic similarity between words: Latent Semantic Analysis (LSA) [6], Pointwise Mutual Information (PMI) [17], and Generalized Latent Semantic Analysis (GLSA) [8]. We compare all three techniques on a single common corpus (TASA) and, for PMI and GLSA, also report performance on a different, web-based corpus. The evaluation uses two kinds of tests: (1) synonymy tests and (2) comparisons with human word-similarity judgments.

 
citation

Budiu, R.; Royer, C.; Pirolli, P. L. Modeling information scent: a comparison of LSA, PMI-IR and GLSA similarity measures on common tests and corpora. PARC TR-2006-1. 2006 September.