Frankenplace: interactive thematic mapping for ad hoc exploratory search

Publication Type:

Conference Paper

Source:

The 24th International Conference on World Wide Web, ACM, Florence, Italy (2015)

URL:

http://www.www2015.it/documents/proceedings/proceedings/p12.pdf

Abstract:

Ad hoc keyword search engines built using modern information retrieval methods do a good job of handling fine-grained queries. However, they perform poorly at facilitating spatial and spatially-embedded thematic exploration of the results, despite the fact that many queries, e.g. civil war, refer to different documents and topics in different places. This is not for lack of data: geographic information, such as place names, events, and coordinates are common in unstructured document collections on the web. The associations between geographic and thematic contents in these documents can provide a rich groundwork to organize information for exploratory research. In this paper we describe the architecture of an interactive thematic map search engine, Frankenplace, designed to facilitate document exploration at the intersection of theme and place. The map interface enables a user to zoom the geographic context of their query in and out, and quickly explore through thousands of search results in a meaningful way. And by combining topic models with geographically contextualized search results, users can discover related topics based on geographic context. Frankenplace utilizes a novel indexing method called geoboost for boosting terms associated with cells on a discrete global grid. The resulting index factors in the geographic scale of the place or feature mentioned in related text, the relative textual scope of the place reference, and the overall importance of the containing document in the document network. The system is currently indexed with over 5 million documents from the web, including the English Wikipedia and online travel blog entries. We demonstrate that Frankenplace can support four distinct types of exploratory search tasks while being adaptive to scale and location of interest.