Proceedings Paper

Layout-based substitution tree indexing and retrieval for mathematical expressions
Author(s): Thomas Schellenberg; Bo Yuan; Richard Zanibbi

Paper Abstract

We introduce a new system for layout-based (LaTeX) indexing and retrieval of mathematical expressions using substitution trees. Substitution trees can efficiently store and find expressions based on the similarity of their symbols, symbol layout, sub-expressions and size. We describe our novel implementation and some of our modifications to the substitution tree indexing and retrieval algorithms. We provide an experiment testing our system against the TF-IDF keyword-based system of Zanibbi and Yuan and demonstrate that, in many cases, the quality of search results returned by both systems is comparable (overall means, substitution tree vs. keyword-based: 100% vs. 89% for top 1; 48% vs. 51% for top 5; 22% vs. 28% for top 20). Overall, we present a promising first attempt at layout-based substitution tree indexing and retrieval for mathematical expressions and believe that this method will prove beneficial to the field of mathematical information retrieval.

Paper Details

Date Published: 23 January 2012
PDF: 8 pages
Proc. SPIE 8297, Document Recognition and Retrieval XIX, 82970I (23 January 2012); doi: 10.1117/12.912502
Author Affiliations
Thomas Schellenberg, Rochester Institute of Technology (United States)
Bo Yuan, Rochester Institute of Technology (United States)
Richard Zanibbi, Rochester Institute of Technology (United States)


Published in SPIE Proceedings Vol. 8297:
Document Recognition and Retrieval XIX
Christian Viard-Gaudin; Richard Zanibbi, Editor(s)

© SPIE