In today's digital society, electronic information is increasingly shared among different entities, and decisions are made based on common attributes. To address the associated privacy concerns, the research community has begun to develop cryptographic techniques for controlled (privacy-preserving) information sharing. One interesting open problem involves two mutually distrustful parties that need to assess the similarity of their information sets but cannot disclose the sets' actual content. This paper presents the first efficient and provably secure construction for privacy-preserving evaluation of sample set similarity, measured as the Jaccard similarity index. We present two protocols: the first securely computes the Jaccard index of two sets, and the second approximates it at lower cost using MinHash techniques. We show that our novel protocols are attractive in many compelling applications, including document similarity, biometric authentication, genetic tests, and multimedia file similarity. Finally, we demonstrate that our constructions are appreciably more efficient than prior work.
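For readers unfamiliar with the two quantities named above, the following is a minimal plaintext sketch of the Jaccard index, J(A, B) = |A ∩ B| / |A ∪ B|, and of a MinHash estimate of it. It is not the privacy-preserving protocol of the paper: both sets are visible to the code, and no cryptography is involved. The function names, the use of seeded SHA-256 as a stand-in for a family of hash functions, and the signature length `k` are all assumptions made for illustration.

```python
import hashlib


def jaccard(a, b):
    """Exact Jaccard index |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


def _seeded_hash(item, seed):
    """Hypothetical helper: a 64-bit hash of `item` under hash function `seed`."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_signature(items, k):
    """Signature of k minima, one per seeded hash function."""
    items = set(items)
    if not items:
        raise ValueError("MinHash signature is undefined for an empty set")
    return [min(_seeded_hash(x, seed) for x in items) for seed in range(k)]


def minhash_estimate(sig_a, sig_b):
    """Fraction of signature positions that agree; this estimates the Jaccard index."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    A = set(range(0, 100))
    B = set(range(50, 150))
    print("exact:", jaccard(A, B))
    sig_a = minhash_signature(A, k=256)
    sig_b = minhash_signature(B, k=256)
    print("estimate:", minhash_estimate(sig_a, sig_b))
```

The lower cost of the approximation comes from comparing two fixed-length signatures of `k` values rather than the full sets; a larger `k` reduces the estimation error at the price of more hashing.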
De Cristofaro, E.; Blundo, C.; Gasti, P. EsPRESSo: efficient privacy-preserving evaluation of sample set similarity. 7th International Workshop on Data Privacy Management (DPM 2012); 2012 September 13-14; Pisa, Italy.