At OpenTable, recommendations play a key role in connecting diners with restaurants. The act of recommending a restaurant to a diner relies heavily on aligning everything we know about the restaurant with everything we can infer about the diner. Our methods go beyond using the diner-restaurant interaction history as the sole input — we use click and search data, the metadata of restaurants, as well as insights gleaned from reviews, together with any contextual information to make meaningful recommendations. In this talk, I will highlight the main aspects of our recommendation stack built with Scala using Apache Spark.
3. OpenTable: the world’s leading provider of online restaurant reservations
• Over 32,000 restaurants worldwide
• More than 885 million diners seated since 1998, representing more than $30 billion spent at partner restaurants
• Over 17 million diners seated every month
• Over 254 million diners seated via a mobile device; almost 50% of our reservations are made via a mobile device
• A presence in the US, Canada, Mexico, the UK, Germany, and Japan
• Nearly 600 partners, including Bing, Facebook, Google, TripAdvisor, Urbanspoon, Yahoo, and Zagat
5. Ingredients of a magical experience
Understanding the diner: building up a profile of you as a diner from explicit and implicit signals, such as information you have provided, reviews you have written, and places you have dined at.
Understanding the restaurant: What type of restaurant is it? What dishes is it known for? Is it good for a date night, family friendly, or does it have amazing views? What’s trending?
Connecting the dots
6. We have a wealth of data
• 32 million reviews
• Diner requests and notes
• Menus
• External ratings
• Searches and transactions
• Images
9. There are various approaches to making meaningful recommendations
• Nearest-neighbor approaches in user-user or item-item space
• Collaborative filtering based on explicit/implicit interactions
• Content-based approach leveraging restaurant metadata
• Factorization machines that include interactions, metadata, and context
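To make the first approach concrete, here is a minimal item-item nearest-neighbor sketch in plain Python. It is an illustration only, not the production stack described in the talk (which is built with Scala and Spark). The diner names, restaurant names, and visit data are all made up, and cosine similarity over binary visit vectors is just one common choice of similarity.

```python
from math import sqrt

# Hypothetical diner -> restaurants-visited data (invented for illustration).
visits = {
    "diner_a": {"sushi_bar", "ramen_house", "steakhouse"},
    "diner_b": {"sushi_bar", "ramen_house"},
    "diner_c": {"steakhouse", "bistro"},
    "diner_d": {"sushi_bar", "ramen_house", "bistro"},
    "diner_e": {"sushi_bar", "bistro"},
}

def item_similarity(visits):
    """Cosine similarity between restaurants over binary visit vectors."""
    diners_of = {}
    for diner, restaurants in visits.items():
        for r in restaurants:
            diners_of.setdefault(r, set()).add(diner)
    sims = {}
    for a in diners_of:
        for b in diners_of:
            if a != b:
                shared = len(diners_of[a] & diners_of[b])
                sims[(a, b)] = shared / sqrt(len(diners_of[a]) * len(diners_of[b]))
    return sims

def recommend(diner, visits, sims, k=2):
    """Rank unseen restaurants by summed similarity to the diner's visited ones."""
    seen = visits[diner]
    candidates = {b for (_, b) in sims} - seen
    scores = {c: sum(sims.get((s, c), 0.0) for s in seen) for c in candidates}
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]

sims = item_similarity(visits)
print(recommend("diner_b", visits, sims))
```

In this toy data, diner_b has only visited the sushi bar and the ramen house, so the ranking is driven by which other restaurants share diners with those two.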
17. Content-Based Approach
• Very useful for cold start, where users have very few interactions.
• Given a few interactions, we can find similar restaurants.
• Bayesian information retrieval approach.
18. Our reviews are rich and verified, and come in all shapes and sizes
Superb!
This really is a hidden gem and I'm not sure I want to
share but I will. :) The owner, Claude, has been here
for 47 years and is all about quality, taste, and not
overcharging for what he loves. My husband and I
don't often get into the city at night, but when we do
this is THE place. The Grand Marnier Souffle' is the
best I've had in my life - and I have a few years on the
life meter. The custard is not over the top and the
texture of the entire dessert is superb. This is the only
family style French restaurant I'm aware of in SF. It
also doesn't charge you an arm and a leg for their
excellent quality and that also goes for the wine list.
Soup, salad, choice of main (try the lamb shank) and
choice of dessert - for around $42 w/o drinks.
Many restaurants have thousands of reviews.
19. Word2Vec: Word Embeddings
[1] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector
Space. In Proceedings of Workshop at ICLR, 2013.
[2] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words
and Phrases and their Compositionality. In Proceedings of NIPS, 2013.
[3] Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic Regularities in Continuous Space Word
Representations. In Proceedings of NAACL HLT, 2013.
“We've [been here for afternoon tea multiple times, and each time] we find it very pleasant”
Vec[tea] = [ 0.00513298, 0.10313627, 0.0773475 , ..., -0.07634512, 0.00877244, 0.04441034]
model.most_similar(‘tea’): ‘teas’, ‘empress’, ‘scones’, ‘iced’, ‘fortnum’, ‘salon’, ‘teapot’, ‘teapots’, ‘savories’, ‘afternoon’, ‘earlgrey’ …
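A `most_similar` lookup of this kind boils down to ranking words by cosine similarity to the query vector. The sketch below shows that idea in plain Python. The three-dimensional vectors are invented for illustration; real Word2Vec vectors are learned from the review text and have far more dimensions, and this is not the library implementation.

```python
from math import sqrt

# Toy 3-d embeddings, invented for illustration only.
vectors = {
    "tea":    [0.9, 0.1, 0.0],
    "scones": [0.8, 0.2, 0.1],
    "teapot": [0.7, 0.0, 0.2],
    "steak":  [0.0, 0.9, 0.1],
    "patio":  [0.1, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def most_similar(word, vectors, topn=2):
    """Return the topn (word, similarity) pairs closest to `word`."""
    query = vectors[word]
    scored = [(w, cosine(query, v)) for w, v in vectors.items() if w != word]
    return sorted(scored, key=lambda pair: -pair[1])[:topn]

print([w for w, _ in most_similar("tea", vectors)])
```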
24. A restaurant like your favorite one but in a different city.
Find the “synonyms” of the restaurant in question, then filter by location!
Starting point: Akiko’s, SF (San Francisco)
• New York: Sushi of Gari, Gari Columbus, NYC
• Chicago: Masaki Sushi
• Maui: Sansei Seafood Restaurant & Sushi Bar
Downtown upscale sushi experience with sushi bar
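The "find synonyms, then filter by location" step can be sketched as a nearest-neighbor search restricted to one city. This is a hedged illustration: the two-dimensional vectors are invented, "Hypothetical Grill" is a made-up entry added for contrast, and the real restaurant embeddings would come from the trained model.

```python
from math import sqrt

# Invented 2-d vectors; only the names and cities of the first, second, and
# fourth entries come from the slide. "Hypothetical Grill" is made up.
restaurants = {
    "Akiko's": {"city": "San Francisco", "vec": [0.9, 0.1]},
    "Masaki Sushi": {"city": "Chicago", "vec": [0.8, 0.2]},
    "Hypothetical Grill": {"city": "Chicago", "vec": [0.1, 0.9]},
    "Sansei Seafood Restaurant & Sushi Bar": {"city": "Maui", "vec": [0.85, 0.3]},
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def similar_in_city(name, city, restaurants, topn=1):
    """Rank restaurants in `city` by similarity to the restaurant `name`."""
    query = restaurants[name]["vec"]
    scored = [
        (other, cosine(query, info["vec"]))
        for other, info in restaurants.items()
        if other != name and info["city"] == city  # the location filter
    ]
    return [n for n, _ in sorted(scored, key=lambda pair: -pair[1])[:topn]]

print(similar_in_city("Akiko's", "Chicago", restaurants))
```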
25. Translating restaurants via concepts
Harris’: steakhouse in downtown area
• v(Harris’) + v(jazz) → Broadway Jazz Club: steakhouse with live jazz
• v(Harris’) + v(patio) → Patio at Las Sendas: steakhouse with amazing patio
• v(Harris’) + v(scenic) → Celestial Steakhouse: steakhouse with a view
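The vector arithmetic on this slide can be sketched as: add a concept vector to a restaurant vector, then look for the nearest restaurant to the sum. The example below assumes restaurants and words live in the same vector space, and uses invented four-dimensional vectors whose axes loosely stand for (steak, jazz, patio, view). None of the numbers come from the actual model.

```python
from math import sqrt

# Invented vectors for illustration; axes loosely mean (steak, jazz, patio, view).
word_vecs = {
    "jazz":   [0.0, 1.0, 0.0, 0.0],
    "patio":  [0.0, 0.0, 1.0, 0.0],
    "scenic": [0.0, 0.0, 0.0, 1.0],
}
restaurant_vecs = {
    "Harris'":              [1.0, 0.0, 0.0, 0.0],
    "Broadway Jazz Club":   [0.8, 0.7, 0.0, 0.0],
    "Patio at Las Sendas":  [0.8, 0.0, 0.7, 0.0],
    "Celestial Steakhouse": [0.8, 0.0, 0.0, 0.7],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def translate(restaurant, concept, topn=1):
    """Nearest restaurants to v(restaurant) + v(concept), excluding the start."""
    query = [a + b for a, b in zip(restaurant_vecs[restaurant], word_vecs[concept])]
    scored = [
        (name, cosine(query, vec))
        for name, vec in restaurant_vecs.items()
        if name != restaurant
    ]
    return [name for name, _ in sorted(scored, key=lambda pair: -pair[1])[:topn]]

for concept in ("jazz", "patio", "scenic"):
    print(concept, "->", translate("Harris'", concept))
```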
27. We expect diner reviews to be composed of a handful of broad themes
• Food & Drinks
• Ambiance
• Service
• Value for Money
• Special occasions
This motivated diving into the reviews with topic modeling.
31. Our topics reveal the unique aspects of each restaurant without having to read the reviews …
Each review for a given restaurant has a certain topic distribution. Combining them, we identify the top topics for that restaurant.
[Figure: bar charts of topic weights (0 to 1) over Topic 01 to Topic 05 for review 1, review 2, ..., review N, combined into a single chart for the restaurant]
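One simple way to combine per-review topic distributions into a restaurant-level profile is to average them and keep the highest-weighted topics. The slide does not say how the combination is done, so the averaging here is an assumption, and the numbers are made up.

```python
# Made-up topic distributions for three reviews of one restaurant.
# Each row sums to 1 over Topic 01 .. Topic 05.
review_topics = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.5, 0.3, 0.0, 0.1, 0.1],
    [0.1, 0.6, 0.1, 0.1, 0.1],
]
topic_names = ["Topic 01", "Topic 02", "Topic 03", "Topic 04", "Topic 05"]

def restaurant_topics(review_topics):
    """Average the per-review distributions (assumed combination rule)."""
    n = len(review_topics)
    return [sum(column) / n for column in zip(*review_topics)]

def top_topics(distribution, names, k=2):
    """Names of the k topics with the largest weight."""
    ranked = sorted(zip(names, distribution), key=lambda pair: -pair[1])
    return [name for name, _ in ranked[:k]]

combined = restaurant_topics(review_topics)
print(top_topics(combined, topic_names))
```

Other combination rules, such as weighting reviews by length or recency, would fit the same structure.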
32. Looking at the topics and the top reviews associated with them, we know Espetus Churrascaria is not just about meat and steak, but has good salad as well! The service is top notch, it's kid friendly, and people go for special occasions, …
33. Content-Based Approach + Topic Weights
• Very useful for cold start, where users have very few interactions.
• Given a few interactions, we can find similar restaurants.
• Bayesian information retrieval approach.
35. We leveraged food and drink related topics to expand our corpus of dishes and drinks
Most dishes are 1-grams (“tiramisu”), 2-grams (“pork cutlets”), or 3-grams (“lemon ricotta pancake”).
For each restaurant, we perform an N-gram analysis of the reviews within the scope of food topics and surface candidate dish tags.
We were able to generate several thousand dish tags using this methodology!
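A rough sketch of the N-gram step: count 1-, 2-, and 3-grams in food-related review text and keep the ones that recur. The review snippets, the tiny stopword list, and the `min_count` threshold are all assumptions for illustration; the talk does not describe the actual filtering rules.

```python
from collections import Counter
import re

# Made-up snippets, assumed to be already restricted to food-topic text.
reviews = [
    "The lemon ricotta pancake was amazing and the tiramisu was great",
    "Loved the lemon ricotta pancake",
    "Tiramisu and pork cutlets were both excellent",
    "The pork cutlets were tender",
]
STOPWORDS = {"the", "was", "were", "and", "both", "loved"}  # hypothetical list

def ngrams(tokens, n):
    """All contiguous n-token phrases in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def candidate_dish_tags(reviews, max_n=3, min_count=2):
    """Count 1..max_n-grams and keep those seen at least min_count times."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z]+", review.lower())
        for n in range(1, max_n + 1):
            for gram in ngrams(tokens, n):
                words = gram.split()
                # Skip phrases that start or end with a stopword.
                if words[0] in STOPWORDS or words[-1] in STOPWORDS:
                    continue
                counts[gram] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}

tags = candidate_dish_tags(reviews)
print(sorted(tags, key=lambda g: (-len(g.split()), g)))
```

Note that fragments such as "lemon ricotta" also pass this filter, so the output is a candidate list that still needs pruning before it becomes a set of dish tags.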
37. Sentiments - we use ratings as labels for positive and negative sentiments
Ingredients of a stellar experience
38. Sentiments - we use ratings as labels for positive and negative sentiments
Ingredients of a terrible experience
39. The model knows that “to die for”, “crispy”, “moist” are actually indicative of positive sentiment when it comes to food!
• The lobster and avocado eggs Benedict are to die for.
• We finished out meal with the their blackberry bread pudding which was so moist and tasty.
• The pork and chive dumplings were perfectly crispy and full of flavor.
• I had the Leg of Lamb Tagine and it was "melt in-your-mouth" wonderful.
• … we did our best with the scrumptious apple tart and creme brulee.
• My husband's lamb porterhouse was a novelty and extremely tender.
• We resisted ordering the bacon beignets but gave in and tried them and were glad we did---Yumm! …
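To illustrate how ratings can act as sentiment labels, the sketch below labels reviews by star rating and scores each word by its smoothed log-odds of appearing in positive versus negative reviews. This is a deliberately simple stand-in, not the model from the talk (which is not specified). The reviews and the rating thresholds (4 and above is positive, 2 and below is negative) are assumptions.

```python
from collections import Counter
from math import log
import re

# Made-up (rating, text) pairs for illustration.
reviews = [
    (5, "the dumplings were perfectly crispy"),
    (5, "bread pudding was so moist and tasty"),
    (4, "crispy and full of flavor"),
    (1, "the chicken was dry and bland"),
    (2, "bland soup and slow service"),
]

def label(rating, pos_min=4, neg_max=2):
    """Map a star rating to a sentiment label (thresholds are assumed)."""
    if rating >= pos_min:
        return "pos"
    if rating <= neg_max:
        return "neg"
    return None  # middling ratings are left unlabeled

def word_log_odds(reviews):
    """Add-one-smoothed log-odds of each word under pos vs. neg reviews."""
    counts = {"pos": Counter(), "neg": Counter()}
    for rating, text in reviews:
        sentiment = label(rating)
        if sentiment:
            counts[sentiment].update(re.findall(r"[a-z]+", text.lower()))
    vocab = set(counts["pos"]) | set(counts["neg"])
    total = {s: sum(c.values()) + len(vocab) for s, c in counts.items()}
    return {
        w: log((counts["pos"][w] + 1) / total["pos"])
           - log((counts["neg"][w] + 1) / total["neg"])
        for w in vocab
    }

scores = word_log_odds(reviews)
for word in ("crispy", "moist", "bland"):
    print(word, round(scores[word], 2))
```

Words with positive scores lean toward highly rated reviews, and words with negative scores lean toward poorly rated ones.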
41. We also learn restaurant-specific attributes from review text
We learn features using one-vs-all logistic regression with L1 regularization, trained on a labeled set curated via Mechanical Turk.
For outdoor seating, features include obvious ones such as ‘outdoor’ and ‘patio’, as well as ‘raining’, ‘sunny’, ‘smoke’, etc. …
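As a minimal sketch of the idea, the code below trains a single binary logistic regression with an L1 penalty for one attribute (outdoor seating) on bag-of-words features, using proximal gradient descent with soft-thresholding. In a one-vs-all setup, the same procedure would be repeated once per attribute. The labeled snippets, hyperparameters, and the choice of optimizer are all assumptions; the talk does not describe its training procedure.

```python
from math import exp

# Made-up labeled snippets: 1 = mentions outdoor seating, 0 = does not.
data = [
    ("sat on the sunny patio", 1),
    ("lovely patio outside", 1),
    ("patio table on a sunny day", 1),
    ("the steak was great", 0),
    ("quick service inside", 0),
    ("dessert menu was short", 0),
]
vocab = sorted({w for text, _ in data for w in text.split()})

def featurize(text):
    """Binary bag-of-words vector over the vocabulary."""
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in vocab]

def soft_threshold(value, t):
    """Proximal operator of the L1 penalty: shrink toward zero by t."""
    if value > t:
        return value - t
    if value < -t:
        return value + t
    return 0.0

def train_l1_logreg(X, y, lam=0.05, lr=0.5, epochs=200):
    """Proximal gradient descent; the bias term is not penalized."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = bias + sum(w * x for w, x in zip(weights, xi))
            err = 1.0 / (1.0 + exp(-z)) - yi
            grad_b += err / n
            for j, x in enumerate(xi):
                grad_w[j] += err * x / n
        weights = [soft_threshold(w - lr * g, lr * lam)
                   for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

X = [featurize(text) for text, _ in data]
y = [target for _, target in data]
weights, bias = train_l1_logreg(X, y)
ranked = sorted(zip(vocab, weights), key=lambda pair: -pair[1])
print(ranked[:3])
```

The L1 penalty pushes the weights of weakly informative words toward zero, which is what makes the surviving features readable as a short list of indicative terms.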