Who To Trust?

One of the beauties, and challenges, of the online world is access to an immense variety of people and their opinions. If you live in a major metro area and are interested in that new bistro that just opened around the corner, chances are that you’ll be able to find opinions and reviews at places like Chowhound, Zagat and Fodor’s. But, what to do with these reviews? There could easily be hundreds of them. How do you know which ones to trust?

My wife was a restaurant nut for quite some time, spending hours every day on Chowhound. But, even though she spent sooooo much time reading post-after-post-after-post, there was still one thing she struggled with: evaluating restaurant prospects based on reviews. One restaurant might have a stellar record—not a single negative review—but no reviews from people whose taste she knew well. Another restaurant might have a few complaints, but the people she knew best seemed to like it. And, the reviews themselves rarely helped matters. They tended to be vague: “The tuna was excellent!” What does that mean? Was it seasoned properly? Was it done correctly—rare center with a flavorful crust? If you gave two people the exact same tuna dish, would they even be able to agree on whether it was seasoned properly and done correctly? Probably not. Everyone has their own slightly different definitions of what constitutes a great dining experience.

A VC thinks that we need an algorithm to find our taste neighbors. I couldn’t agree more. Even my wife, who was willing to put in a lot of work in order to find her taste neighbors, couldn’t keep track of so much information! It’s easy to remember someone who shares your taste exactly. But what about the people who don’t? How do you keep track of which restaurants you disagreed on, in what aspects, and to what degree? Even asking a human to determine “taste similarities” for a handful of people is a daunting task. But, it’s an easy task for a computer. The hard part is telling the computer what to do with all this information in order to produce a “taste similarity” measure that matches our intuition. This problem largely motivated my PhD thesis work. See the introduction for a short summary.

There are two distinct tasks in determining “taste neighbors,” or StyleTwins (as we call them here at StyleFeeder). The first is sentiment extraction, or converting a free-form, written review into a structured opinion. The second is taking the set of structured opinions across a variety of users and restaurants (or products, or movies, etc.) and producing “interesting” information. “Interesting” information might include predicted opinions, recommendations, taste neighbors, similar restaurants, etc. Sentiment extraction is certainly an interesting problem, but users tend to be quite willing to provide structured feedback if you ask them for it. Much more challenging than acquiring structured opinion information is analyzing it and producing interesting information such as a user’s StyleTwins.
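To make the idea of a “structured opinion” concrete, here is a minimal sketch of what such a record might look like once the opinion has been extracted from a review or collected directly from the user. The class and field names are hypothetical, not actual StyleFeeder code:

```python
from dataclasses import dataclass
from typing import Optional

# A hypothetical structured opinion: the free-form review
# "The tuna was excellent!" becomes an explicit star rating,
# optionally broken down by aspect. Field names are illustrative.

@dataclass
class Opinion:
    user_id: str
    item_id: str
    rating: int                       # e.g. 1-5 stars
    aspects: Optional[dict] = None    # e.g. {"seasoning": 4, "doneness": 5}

reviews = [
    Opinion("alice", "bistro_42", 4, {"seasoning": 4, "doneness": 5}),
    Opinion("bob",   "bistro_42", 2),
]
```

A set of such records across many users and items is exactly the input the analysis step below consumes.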

There are many different approaches to modeling rating data. We interpret a user’s ratings as a partial ordering of the items (e.g. restaurants) that are being rated. We model each user and each item as a vector of features and parameters. The preference score of an item for a particular user is simply the dot product between the user’s and item’s vectors. A set of ordered thresholds per user is used to map these real-valued “scores” to the discrete rating values. We learn parameters for users and items simultaneously by minimizing a loss function that penalizes mistakes in the model (e.g. the model predicts a rating of “2” on an article of clothing even though the user actually gave it “4” stars).

An advantage of this approach is that it provides us with a representation of users and items which can be used for comparisons. We represent each user or item as a vector in 189-dimensional space. 89 of those dimensions correspond to features we have identified for the users and/or items: information such as gender and age (for users) and tags and categories (for items). The remaining 100 dimensions are reserved for the learning process to identify and represent information about users and items which isn’t so obvious. They might correspond to (for the restaurant example) decor, service, type of cuisine or level of seasoning. Or, they might incorporate other subtle aspects of the restaurant experience which may be difficult to get from a review, but can be determined from a user’s pattern of ratings.

These 189-dimensional vectors can be compared to each other easily. Vectors which point in the same direction are highly compatible; vectors that point in opposite directions are incompatible. A real-valued compatibility score can be computed by taking the dot product of two vectors. So, finding the best StyleTwins for a user is as simple as computing a number of dot products and sorting the resulting scores.
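The scoring, threshold, and StyleTwin steps above can be sketched in a few lines. This is a toy illustration under the post’s stated dimensions (89 explicit features plus 100 learned dimensions, 189 total); the random vectors stand in for parameters that would really be learned by minimizing the loss, and all names and numbers are mine, not StyleFeeder’s actual implementation:

```python
import numpy as np

DIM = 189  # 89 explicit feature dims + 100 learned dims, per the post
rng = np.random.default_rng(0)

# Stand-ins for learned parameter vectors (really fit by loss minimization).
user_vec = rng.normal(size=DIM)
item_vec = rng.normal(size=DIM)

def preference_score(user, item):
    """Real-valued compatibility: the dot product of the two vectors."""
    return float(np.dot(user, item))

def score_to_rating(score, thresholds):
    """Map a real-valued score to a discrete rating using a user's
    ordered thresholds (len(thresholds) + 1 rating levels)."""
    rating = 1
    for t in thresholds:
        if score > t:
            rating += 1
    return rating

# Illustrative per-user thresholds separating 1..5 stars.
thresholds = [-2.0, -0.5, 0.5, 2.0]
rating = score_to_rating(preference_score(user_vec, item_vec), thresholds)

# Finding a user's best StyleTwins: dot the user's vector against every
# other user's vector and sort by the resulting compatibility scores.
other_users = rng.normal(size=(1000, DIM))
scores = other_users @ user_vec          # one dot product per candidate
best = np.argsort(scores)[::-1][:10]     # indices of the 10 closest twins
```

The same machinery works for item-to-item comparisons (“similar restaurants”), since items live in the same 189-dimensional space as users.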

Here at StyleFeeder, our focus is on helping you discover products serendipitously, without asking you to describe exactly what you want. The core problem that we’re working on, taking your opinions (in the form of ratings) and providing product recommendations based on them, is common to many domains: products, restaurants, movies, etc. But, people are part of the equation too. You’ll be much more inclined to trust a review or suggestion from someone if they share your taste/style. And, it can be as exciting to find a person who shares your taste as it is to find a new restaurant that fits your taste to a “T”.

One Comment

  1. D Joots says:

    Thanks for the interesting post on taste matching and how to trust sites on the web. The semantic web is still a ‘WIP’ but I’m sure we’ll get there. We’re only a decade or so into the web experiment, and some of the exciting work on ‘learning systems’ is truly mind-boggling.

    David J