Using the web to discover inference, bias, and “private” data
Text content can allow unintended inferences. Consider, for example, the many people who have published anonymous blogs to vent about their employers, only to be identified through seemingly non-identifying posts. Similarly, the US government's "Operation Iraqi Freedom Portal" was assembled as evidence of the presence of nuclear weapons in Iraq, but was removed because it could be used to infer much of the weapon-making process. We propose a simple, semi-automated approach to detecting text-based inferences before content is released. Our approach uses association rule mining of the Web to identify keywords that may allow a sensitive topic to be inferred. While the main motivation of this work is data loss prevention, we will also discuss how these techniques can be adapted to detect bias in product reviews and to assess the privacy of proprietary data.
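As a rough illustration of the keyword-to-topic idea, the sketch below scores an association rule "keyword implies topic" by its confidence: the number of pages containing both the keyword and the topic, divided by the number of pages containing the keyword. This is only a minimal sketch under stated assumptions, not the method presented in the talk. The hit counts, the `lookup` helper, the topic "Acme Corp", and the 0.5 threshold are all hypothetical stand-ins for real Web search results.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# Hypothetical page counts standing in for Web search hit counts.
# A real system would obtain these from a search engine instead.
HYPOTHETICAL_HITS = {
    frozenset({"rocket skates"}): 500,
    frozenset({"rocket skates", "Acme Corp"}): 400,
    frozenset({"coffee"}): 10_000,
    frozenset({"coffee", "Acme Corp"}): 10,
}

HitCount = Callable[[FrozenSet[str]], int]


def lookup(terms: FrozenSet[str]) -> int:
    """Return the (hypothetical) number of pages containing all terms."""
    return HYPOTHETICAL_HITS.get(terms, 0)


def rule_confidence(keywords: Iterable[str], topic: str, hit_count: HitCount) -> float:
    """Confidence of the rule keywords -> topic, estimated from hit counts."""
    antecedent = frozenset(keywords)
    antecedent_hits = hit_count(antecedent)
    if antecedent_hits == 0:
        return 0.0  # no evidence either way
    joint_hits = hit_count(antecedent | {topic})
    return joint_hits / antecedent_hits


def flag_keywords(
    candidates: Iterable[str],
    topic: str,
    hit_count: HitCount,
    threshold: float = 0.5,  # assumed cutoff, chosen only for illustration
) -> List[Tuple[str, float]]:
    """Return candidate keywords whose rule confidence meets the threshold."""
    flagged = []
    for keyword in candidates:
        confidence = rule_confidence([keyword], topic, hit_count)
        if confidence >= threshold:
            flagged.append((keyword, confidence))
    # Highest-confidence keywords first, for human review.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


flagged = flag_keywords(["rocket skates", "coffee"], "Acme Corp", lookup)
```

With these made-up counts, "rocket skates" would be flagged for review before release because most pages mentioning it also mention the sensitive topic, while "coffee" would not. The review step is what keeps such an approach semi-automated.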
Note: Most of this talk is joint work with Richard Chow and Philippe Golle.