Generating Recommendation Dialogs by Extracting Information from User Reviews

Overview

Recommendation dialog systems help users navigate e-commerce listings by asking questions about users' preferences toward relevant domain attributes. We present a framework for generating and ranking fine-grained, highly relevant questions from user-generated reviews. We demonstrate our approach on a new dataset just released by Yelp.

Data

We use data from the Yelp Dataset Challenge, a collection of 11,537 businesses, 8,282 checkin sets, 43,873 users, and 229,907 reviews.

  • Yelp-specific sentiment words: 1,435 positive and 570 negative.
  • Business aspects extracted from reviews: aspects.txt. The file format is tab-separated, business ID followed by the aspects for that business.
  • More business aspects extracted using lexical patterns like "good for": goodFor.txt
  • Business subcategories: subcategories/. Each file contains a list of business IDs and their corresponding subcategory. For example, Chinese restaurants are subcategorized as buffet, dim sum, noodles, pan Asian, Panda Express, sitdown, or vegetarian.
  • Topic models: topic models. Each file lists top words for each raw, unmerged subcategory topic.
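
As a rough illustration of reading the tab-separated aspects file described above, here is a minimal Python sketch. It assumes each line holds a business ID followed by one or more tab-separated aspects; the exact column layout after the ID is not specified here, so treat that as an assumption. The function name parse_aspects and the sample IDs and aspects are hypothetical and are not taken from the released data.

```python
import io
from collections import defaultdict


def parse_aspects(lines):
    """Map each business ID to the list of aspects found on its line(s).

    Assumes tab-separated lines: business ID first, aspects after it.
    """
    aspects = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        business_id, *fields = line.split("\t")
        # Drop empty fields left behind by trailing or doubled tabs.
        aspects[business_id].extend(f for f in fields if f)
    return dict(aspects)


# Hypothetical sample standing in for the contents of aspects.txt.
sample = io.StringIO("biz_001\tpatio\thappy hour\nbiz_002\tdim sum\n")
parsed = parse_aspects(sample)
```

To read the real file, an open file handle can be passed in place of the sample, for example parse_aspects(open("aspects.txt", encoding="utf-8")). The same approach may work for goodFor.txt if it follows the same layout, which is not confirmed here.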

People

Papers

  • Kevin Reschke, Adam Vogel, and Dan Jurafsky, "Generating Recommendation Dialogs by Extracting Information from User Reviews". ACL 2013. [pdf] [poster]

Contact Information

For any comments or questions, please e-mail av@cs.stanford.edu.