On the reading of tables of contents

Details

Event

Eighth IAPR International Workshop on Document Analysis Systems

Authors

Prateek Sarkar
Eric Saund
Technical Publications
September 16th 2008
This paper presents a framework for understanding tables of contents of books, journals, and magazines. We propose a universal logical structure representation in terms of a hierarchy of entries, each of which may contain a descriptor and a locator. We enumerate graphical and perceptual cues that guide the parsing of tables of contents in terms of this formalism. We make initial suggestions about the form of evaluation metrics for comparing groundtruthed tables of contents with the output of recognition algorithms. Typical and atypical tables of contents are used throughout to illustrate significant phenomena that must be dealt with in principled ways in any general TOC interpretation scheme. Finally, we discuss implications of our observations on the design of recognition algorithms.
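
The logical structure described in the abstract (a hierarchy of entries, each with an optional descriptor and an optional locator) can be pictured with a small data structure. The following is a minimal sketch only, not the authors' representation: the names `TocEntry`, `children`, and `walk` are assumptions introduced here for illustration, and the sample entries are invented.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class TocEntry:
    """One entry in a table-of-contents hierarchy (hypothetical sketch)."""
    descriptor: Optional[str] = None   # e.g. a title; may be absent
    locator: Optional[str] = None      # e.g. a page number; may be absent
    children: List["TocEntry"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "TocEntry"]]:
        """Yield (depth, entry) pairs in depth-first, top-to-bottom order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Invented sample data: a part heading with no locator, chapters with both.
toc = TocEntry(descriptor="Contents", children=[
    TocEntry(descriptor="Part I", children=[
        TocEntry(descriptor="Chapter 1", locator="3"),
        TocEntry(descriptor="Chapter 2", locator="27"),
    ]),
    TocEntry(descriptor="Index", locator="301"),
])

for depth, entry in toc.walk():
    print("  " * depth + f"{entry.descriptor} -> {entry.locator}")
```

Keeping both fields optional reflects the abstract's wording that an entry "may contain" a descriptor and a locator, so a grouping heading without a page number fits the same type as a leaf entry.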

Citation

Sarkar, P.; Saund, E. On the reading of tables of contents. Eighth IAPR International Workshop on Document Analysis Systems; 2008 September 16-19; Nara, Japan.
