Pruning the search space of a hand-crafted parsing system with a probabilistic parser
The demand for deep linguistic analysis of huge volumes of data makes it increasingly important to minimize the time taken to parse such data. In the XLE parsing model, which is a hand-crafted, unification-based parsing system, most of the time is spent on unification: searching for valid f-structures (dependency attribute-value matrices) within the space of the many valid c-structures (phrase structure trees). We carried out an experiment to determine whether pruning the search space at an earlier stage of the parsing process improves the overall parsing time while maintaining the quality of the f-structures produced. We retrained a state-of-the-art probabilistic parser and used it to pre-bracket input to the XLE, constraining the valid c-structure space for each sentence. Evaluating against the PARC 700 Dependency Bank, we show that it is possible to decrease the time taken to parse by ~18% while maintaining accuracy.
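The pruning idea described above can be illustrated with a minimal sketch. It is not the XLE's actual mechanism or input format. It assumes that constituents are represented as half-open token spans and that a candidate c-structure constituent is discarded when it crosses a bracket proposed by the probabilistic parser. The names `Span`, `crosses`, and `prune`, and the example sentence and spans, are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: a constituent is a half-open token span [start, end).
Span = Tuple[int, int]


def crosses(a: Span, b: Span) -> bool:
    """Return True if the two spans overlap without one containing the other."""
    (a_start, a_end), (b_start, b_end) = a, b
    return (a_start < b_start < a_end < b_end) or (b_start < a_start < b_end < a_end)


def prune(candidates: List[Span], brackets: List[Span]) -> List[Span]:
    """Keep only the candidate constituents that cross none of the given brackets."""
    return [c for c in candidates if not any(crosses(c, b) for b in brackets)]


# Illustrative data only (not taken from the paper).
tokens = ["the", "old", "man", "saw", "boats"]
brackets = [(0, 3)]  # assumed pre-bracketing: [the old man] saw boats
candidates = [(0, 2), (0, 3), (1, 4), (2, 5), (3, 5), (0, 5)]

kept = prune(candidates, brackets)
print(kept)
```

In this toy example, the spans `(1, 4)` and `(2, 5)` cross the bracket `(0, 3)` and are removed, so fewer candidate constituents would remain for the more expensive unification step to consider.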
Cahill, A.; King, T. H.; Maxwell, J. T. Pruning the search space of a hand-crafted parsing system with a probabilistic parser. ACL 2007 Workshop on Deep Linguistic Processing; 2007 June 28; Prague, Czech Republic.