Modeling information scent: a comparison of LSA, PMI-IR and GLSA similarity measures on common tests and corpora
In this paper we compare three systems that estimate semantic similarity between words: Latent Semantic Analysis (LSA), Pointwise Mutual Information (PMI), and Generalized Latent Semantic Analysis (GLSA). We compare all three techniques on a single corpus (TASA) and, for PMI and GLSA, we also report performance on a different, web-based corpus. The evaluation uses two kinds of tests: (1) synonymy tests, and (2) comparison with human word similarity judgments.
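As a rough illustration of one of the measures and one of the tests named above, the following minimal sketch estimates pointwise mutual information from document-level co-occurrence counts and uses it to answer a multiple-choice synonymy item. The toy corpus, the function names (`count_cooccurrences`, `pmi`, `answer_synonym_item`), and the choice of the document as the co-occurrence window are all assumptions made for this example; they are not taken from the report and do not reproduce its PMI-IR setup, the TASA corpus, or its test items.

```python
import math
from collections import Counter
from itertools import combinations


def count_cooccurrences(documents):
    """Count, per word and per word pair, the number of documents containing them."""
    word_df = Counter()
    pair_df = Counter()
    for doc in documents:
        vocab = set(doc.lower().split())
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))
    return word_df, pair_df, len(documents)


def pmi(w1, w2, word_df, pair_df, n_docs):
    """PMI = log2( p(w1, w2) / (p(w1) * p(w2)) ), with document-level probabilities."""
    joint = pair_df[frozenset((w1, w2))]
    if joint == 0:
        return float("-inf")  # the words never co-occur in this corpus
    return math.log2(joint * n_docs / (word_df[w1] * word_df[w2]))


def answer_synonym_item(target, choices, word_df, pair_df, n_docs):
    """Pick the choice with the highest PMI to the target word."""
    return max(choices, key=lambda c: pmi(target, c, word_df, pair_df, n_docs))


# Hypothetical toy corpus, for illustration only.
docs = [
    "the big dog barked",
    "a large dog barked",
    "big and large houses",
    "the big large tree",
    "a small cat slept",
    "the small dog slept",
]
word_df, pair_df, n_docs = count_cooccurrences(docs)
best = answer_synonym_item("big", ["large", "small", "cat"], word_df, pair_df, n_docs)
print(best)
```

A synonymy test is then scored as the fraction of items for which the selected choice matches the keyed answer. The second kind of evaluation, comparison with human similarity judgments, would instead correlate such scores with human ratings over a list of word pairs; that step is not shown here.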
Budiu, R.; Royer, C.; Pirolli, P. L. Modeling information scent: a comparison of LSA, PMI-IR and GLSA similarity measures on common tests and corpora. PARC TR-2006-1. 2006 September.