Decomposing document images by heuristic search


Document decomposition is a basic but crucial step for many document-related applications. This paper proposes a novel approach to decomposing document images into zones. The approach first generates overlapping zone hypotheses based on generic visual features. Each candidate zone is then evaluated quantitatively by a learned generative zone model. A heuristic search algorithm infers the optimal set of non-overlapping zones that covers a given document image. Experimental results demonstrate that the proposed method is robust to variation in document structure and to noise.
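The selection step described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes candidate zones are axis-aligned rectangles, that each already carries a numeric score (standing in for the learned zone model), and that the goal is simply the highest-scoring set of pairwise non-overlapping candidates. The search is a generic best-first search guided by an optimistic upper bound. The names `Rect`, `overlaps`, and `best_zone_set` are hypothetical and introduced only for illustration.

```python
import heapq
from typing import List, Tuple

# Hypothetical zone representation: (x0, y0, x1, y1), half-open on the right/bottom.
Rect = Tuple[int, int, int, int]


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def best_zone_set(cands: List[Rect], scores: List[float]) -> Tuple[float, List[int]]:
    """Best-first search for the highest-scoring set of non-overlapping candidates.

    Assumption: scores stand in for the output of a zone model; how the paper
    scores zones and enforces coverage is not reproduced here.
    """
    n = len(cands)
    # suffix[i] is an optimistic bound on what candidates i..n-1 can still add.
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + max(scores[i], 0.0)

    tie = 0  # tie-breaker so the heap never compares the tuples of chosen indices
    # Heap entries: (-upper_bound, tie, next_index, score_so_far, chosen_indices)
    heap = [(-suffix[0], tie, 0, 0.0, ())]
    while heap:
        _, _, i, score, chosen = heapq.heappop(heap)
        if i == n:
            # The bound of a complete state equals its true score, and no open
            # state has a larger bound, so this selection is optimal.
            return score, list(chosen)
        # Branch 1: leave candidate i out.
        tie += 1
        heapq.heappush(heap, (-(score + suffix[i + 1]), tie, i + 1, score, chosen))
        # Branch 2: take candidate i if it overlaps nothing chosen so far.
        if all(not overlaps(cands[i], cands[j]) for j in chosen):
            tie += 1
            taken = score + scores[i]
            heapq.heappush(
                heap, (-(taken + suffix[i + 1]), tie, i + 1, taken, chosen + (i,))
            )
    return 0.0, []


# Toy page: two stacked half-page zones compete with one full-page zone.
candidates = [(0, 0, 10, 5), (0, 5, 10, 10), (0, 0, 10, 10), (0, 0, 5, 5)]
zone_scores = [3, 3, 5, 2]
total, picked = best_zone_set(candidates, zone_scores)
```

In this toy input the two stacked zones together outscore the single full-page zone, so the search prefers the finer decomposition. A fuller treatment would also have to require that the selected zones cover the page content, which this sketch leaves out.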


Wang, Y.; Gao, D. Decomposing document images by heuristic search. The 6th International Conference on Energy Minimization Methods in Computer Vision and Pattern Recognition; 2007 August 27-29; Ezhou, Hubei, China. Berlin: Springer; 2007; LNCS 4679: 97-111.