Deep Product Comparison on the Semantic Web

Abstract

Search plays a major role in today's information systems. It helps us find information on our desktop computers and mobile devices, in enterprise intranets, and on the Web. Yet, as the volume of data grows, it becomes increasingly difficult to get the required information. Problems arise in particular with regard to search efficiency (“Can the information be procured at low cost?”) and search effectiveness (“Are the returned results satisfying?”). An important use case in this context is the discovery of products on the Web. Product search is challenging for several reasons: (1) the number of product-related documents has increased over time; (2) the data contained in those documents is mostly unstructured and heterogeneous; (3) products are multi-dimensional objects; and (4) users often have complex information needs. On that account, the quality and granularity of data are critical requirements for product search algorithms on the Web. This thesis contributes a search framework for product offers on the Semantic Web, also known as the Web of Data. Structured data on the Web has grown rapidly over the last five years. The key drivers have been Linked Data sources and Web pages with embedded Microdata or RDFa markup. Structured data can mitigate many of the limitations of traditional Web searches for products. For instance, global resource identifiers on the Web ease product data integration. This makes it possible to augment product data with fine-grained, high-quality product descriptions. In our work, this authoritative data is supplied via manufacturer datasheets and product classification systems. These granular product descriptions enable deep product comparison over several product dimensions. A crucial component of our solution is the implementation of a faceted search interface. Faceted search is well suited to the iterative and incremental nature of search. It engages users in the search process, letting them continually learn about the option space in an exploratory fashion. As an important innovation, our approach is data-driven (or instance-driven), i.e., the availability of data determines the options presented to the user. This is in stark contrast to traditional search interfaces, which typically rely on a system-wide, domain-specific, rigid conceptual structure. Our design choice makes it easier to search within the often sparse graph of product information on the Web of Linked Data. Furthermore, it makes our approach applicable to other application areas outside the narrow scope of e-commerce.
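The instance-driven idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the triples, the property names, and the `derive_facets` helper are hypothetical, and each item is assumed to carry at most one value per property. The point is only that the facets offered to the user are computed from the data that matches the current selection, not taken from a fixed schema.

```python
from collections import defaultdict

# Hypothetical product offers as (item, property, value) triples.
TRIPLES = [
    ("offer1", "brand", "Acme"),
    ("offer1", "color", "red"),
    ("offer2", "brand", "Acme"),
    ("offer2", "weight_kg", "1.2"),
    ("offer3", "brand", "Bolt"),
    ("offer3", "color", "blue"),
]


def derive_facets(triples, selected=None):
    """Return {property: {value: count}} for items matching `selected`.

    `selected` maps property names to required values. Assumes one
    value per (item, property) pair; later triples overwrite earlier ones.
    """
    selected = selected or {}

    # Group the flat triples into one property dictionary per item.
    by_item = defaultdict(dict)
    for item, prop, value in triples:
        by_item[item][prop] = value

    # Keep only items that satisfy every selected facet value.
    matching = [
        props for props in by_item.values()
        if all(props.get(p) == v for p, v in selected.items())
    ]

    # Count the values actually present on the remaining items.
    facets = defaultdict(lambda: defaultdict(int))
    for props in matching:
        for prop, value in props.items():
            facets[prop][value] += 1
    return {prop: dict(values) for prop, values in facets.items()}


if __name__ == "__main__":
    print(derive_facets(TRIPLES))
    print(derive_facets(TRIPLES, {"brand": "Bolt"}))
```

In this sketch, selecting the brand "Bolt" removes the `weight_kg` facet entirely, because no remaining item has that property. Sparse data therefore never produces empty options.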

Publication
PhD thesis, Universität der Bundeswehr Munich, Germany
