The problem
- A new user of a Recommender System (RS) has an empty profile
- To generate recommendations, the system needs to ask the new user for some information
- Goal of an RS: start generating good recommendations quickly, without annoying the user
The problem: bootstrapping of RS for new users
Outline
- Problem
- generating recommendations for new user
- Solution for bootstrapping:
- explicit trust statements and trust metric
- Analysis on Epinions.com
- Collaborative Filtering vs. trust-aware approach for bootstrapping new users
Collaborative Filtering Recommender Systems
Users explicitly assign ratings to items (ex: "I like Titanic as 4/5")
Collaborative Filtering (CF) RSs
Task: generating recommendations for the active user ME
- Step 1: Find users similar to ME (called neighbours) [Similarity assessment]
- Step 2: Predict rating of user ME to item I as weighted sum of ratings given by neighbours to item I [Rating prediction]
Collaborative Filtering RSs: example
| Itm1 | Itm2 | Itm3 | Itm4 |
ME | 2 | 5 | ? | 5 |
User1 | 5 | | 2 | 3 |
User2 | | 4 | 1 | 3 |
User3 | 2 | 5 | 5 | 4 |
Step 1: Find users similar to ME (neighbours)
- Sim(ME,User1) = -0.2
- Sim(ME,User2) = +0.1 [neigh]
- Sim(ME,User3) = +0.9 [neigh]
Step 2: Predict rating of user ME to Item3 as weighted sum of ratings given by neighbours to Item3
$$\frac{(0.9 \cdot 5) + (0.1 \cdot 1)}{0.9 + 0.1} = 4.6$$
Note: overlapping of rated items required!
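The Step 2 computation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `predict_rating` is hypothetical, and treating only users with positive similarity as neighbours is an assumption based on the [neigh] marks in the example.

```python
def predict_rating(similarities, item_ratings):
    """Weighted average of neighbours' ratings for one item.

    similarities: user -> similarity with the active user
    item_ratings: user -> rating that user gave to the target item
    Assumption: only users with positive similarity count as neighbours.
    """
    numerator = 0.0
    denominator = 0.0
    for user, sim in similarities.items():
        if sim > 0 and user in item_ratings:
            numerator += sim * item_ratings[user]
            denominator += sim
    if denominator == 0:
        return None  # no usable neighbour: the rating is not predictable
    return numerator / denominator


# Values taken from the example table and similarities above.
similarities = {"User1": -0.2, "User2": 0.1, "User3": 0.9}
item3_ratings = {"User1": 2, "User2": 1, "User3": 5}

prediction = predict_rating(similarities, item3_ratings)
print(round(prediction, 2))  # 4.6
```

Returning `None` when no neighbour rated the item keeps "not predictable" distinct from a low predicted rating.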
Collaborative Filtering Weakness
CF Weakness: Cold start
New users have 0 ratings -->
Step 1 fails (User Similarity not computable) -->
No neighbours -->
Ratings prediction not possible (coverage=0)
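To make the failure chain concrete, here is a hedged sketch of a similarity function that depends on overlapping ratings. The slides do not say which similarity measure is used; Pearson correlation over co-rated items is assumed here as a common choice, and `pearson_similarity` is a hypothetical name.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over co-rated items, or None if not computable."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None  # too little overlap: similarity is undefined
    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # constant ratings on the overlap: correlation undefined
    return cov / sqrt(var_x * var_y)


# Ratings from the example table; new_user is a cold start user.
me = {"Itm1": 2, "Itm2": 5, "Itm4": 5}
user3 = {"Itm1": 2, "Itm2": 5, "Itm3": 5, "Itm4": 4}
new_user = {}

sim_me_user3 = pearson_similarity(me, user3)
sim_new_user3 = pearson_similarity(new_user, user3)
print(sim_new_user3)  # None
```

With an empty profile, every similarity comes back as `None`, so no neighbours are found and no rating can be predicted.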
Question: how to bootstrap a Recommender System for a new user?
Usual solution for bootstrapping
- "Please rate 10 movies"
- "Until then, no recommendations are available for you"
Disadvantages: annoying (the user is asked to "work" without an immediate benefit), slow
Our proposed solution for bootstrapping
- "Please indicate a few users you trust"
- "One friend is enough for getting recommendations"
Advantages: possibly fast (one friend is enough; she can be the user who invited you into the community/system)
Our proposed solution for bootstrapping
We propose to change Step 1 of CF RS
from "Find users similar to user ME"
to "Find users trustable by user ME"
Our proposed solution for bootstrapping
- Trust-awareness: considering explicit trust between users.
- Trust statement = explicit judgement of a user on another user
- Ex: "I (Mary) trust Kate as 0.8 (in [0,1])"
- A trust statement concerns the perceived quality of the other user's characteristics; in RSs, a user should trust someone if she appreciates that person's tastes and ratings
Trust-aware RSs
- Step 1: Find users trustable by ME (neighbours) [changed step]
- Step 2: Predict rating of user ME to item I as weighted sum of ratings given by neighbours to item I [unchanged step]
But what does "trustable" mean?
Explicitly trusted users are neighbours (Kate is a neighbour of Mary). But what about unknown users?
Trust Networks
Trust network: aggregate of all the trust statements
- Properties of trust:
- weighted (0=distrust, 1=max trust)
- subjective
- asymmetric
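One simple way to hold such a network in memory is a nested dictionary of directed, weighted edges. This is only an illustrative sketch: the Mary to Kate value of 0.8 comes from the earlier example, while the other users, the other values, and the helper `trust_value` are hypothetical.

```python
# Directed, weighted trust network: source user -> {target user: trust in [0, 1]}
trust_network = {
    "Mary": {"Kate": 0.8},              # from the example: Mary trusts Kate as 0.8
    "Kate": {"Mary": 0.3, "Bob": 1.0},  # hypothetical values
    "Bob": {},
}


def trust_value(network, source, target):
    """Return the explicit trust statement, or None if none was expressed."""
    return network.get(source, {}).get(target)


mary_to_kate = trust_value(trust_network, "Mary", "Kate")
kate_to_mary = trust_value(trust_network, "Kate", "Mary")
mary_to_bob = trust_value(trust_network, "Mary", "Bob")

print(mary_to_kate, kate_to_mary)  # 0.8 0.3
print(mary_to_bob)  # None
```

Storing each statement under its source user reflects that trust is subjective and asymmetric, and a missing edge is kept as `None` rather than 0 so that "unknown" is not confused with distrust.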
Trust Metrics
How much should ME trust "unknown users"?
Trust Metrics use existing edges for predicting values of trust for non-existing edges,
- exploiting trust propagation
6 degrees of separation (Stanley Milgram, 1967) -> no more "unknown" users
MoleTrust: local trust metric
MoleTrust (MT)
- Time-efficient
- Local: propagates trust from active user
- Step 1: remove cycles
- Step 2: propagate trust, up to trust propagation horizon (MT1, MT2, ...)
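The two steps can be sketched roughly as follows. This is a simplified illustration in the spirit of the slide, not the reference MoleTrust implementation: the function name, the `threshold` parameter, and the trust-weighted averaging rule are assumptions, and the example network is hypothetical apart from a first edge of 0.8.

```python
from collections import deque


def moletrust_sketch(trust, source, horizon=2, threshold=0.6):
    """Predict trust from `source` to users within `horizon` steps.

    trust: user -> {trusted user: trust value in [0, 1]}
    Assumption: only predecessors with predicted trust >= threshold
    pass their opinions on, weighted by their own predicted trust.
    """
    # Step 1: order users by distance from the source (breadth-first).
    # Using only edges from distance d-1 to distance d removes cycles.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        if distance[user] == horizon:
            continue  # do not walk past the trust propagation horizon
        for target in trust.get(user, {}):
            if target not in distance:
                distance[target] = distance[user] + 1
                queue.append(target)

    # Step 2: propagate trust level by level.
    predicted = {source: 1.0}
    for d in range(1, horizon + 1):
        for target in [u for u, du in distance.items() if du == d]:
            numerator = 0.0
            denominator = 0.0
            for pred, dp in distance.items():
                if dp != d - 1 or target not in trust.get(pred, {}):
                    continue
                if predicted.get(pred, 0.0) < threshold:
                    continue
                numerator += predicted[pred] * trust[pred][target]
                denominator += predicted[pred]
            if denominator > 0:
                predicted[target] = numerator / denominator
    del predicted[source]
    return predicted


# Hypothetical network: ME -> Kate -> Ann -> Zoe, plus a back edge Kate -> ME.
network = {
    "ME": {"Kate": 0.8},
    "Kate": {"Ann": 1.0, "ME": 0.5},
    "Ann": {"Zoe": 1.0},
}

mt1 = moletrust_sketch(network, "ME", horizon=1)
mt2 = moletrust_sketch(network, "ME", horizon=2)
print(sorted(mt1))  # ['Kate']
print(sorted(mt2))  # ['Ann', 'Kate']
```

With a single trust statement, horizon 1 reaches only Kate, while horizon 2 also reaches Ann through Kate. Zoe lies three steps away and stays outside the MT2 horizon.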
Experimental analysis
Comparison of:
- Standard Collaborative Filtering algorithm (using only ratings information)
- Trust-aware algorithm (using only trust statements information): MoleTrust propagating trust up to distance 2
The comparison concerns their performance on users who expressed few information bits (ratings or trust statements). These are the users in need of bootstrapping.
On which data?
Epinions.com description
Epinions.com users can:
- assign ratings to items
- state which users they trust (1) and distrust (0)
- Epinions FAQ suggests trusting "users whose reviews and ratings you have consistently found to be valuable"
Most meaningful example of a real community expressing trust statements and ratings
Epinions.com dataset
- ~50,000 users,
- ~140,000 items,
- ~660,000 ratings,
- ~500,000 trust statements (only positive).
Large, real-world dataset!
How many cold start users?
#users who provided at most x information bits
x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
#ratings | 18.52% | 34.22% | 42.21% | 48.13% | 52.83% | 56.72% | 60.05% | 62.98% |
#trusts | 31.10% | 50.24% | 59.7% | 65.8% | 70.18% | 73.61% | 76.25% | 78.23% |
A large portion of the users provided few ratings (cold start users)
Bootstrapping for cold start users is a real and important issue.
Analysis: Comparable users
[Figure: two plots, CF (using ratings) and MT2 (using trust statements)]
# Neighbours found in Step 1 by the 2 techniques, compared when using the same quantity of information
-- Note the difference in Y axis!
Analysis: Comparable users
[Figure: two plots, CF (using ratings) and MT2 (using trust statements)]
Why this huge difference? Trust can be propagated (six degrees).
Evaluation of performances of the algorithms
- Algorithms are evaluated on their ability to produce predictions
- Leave-one-out technique
- Measures:
- Accuracy (MAE: Mean Absolute Error)
- Coverage (percentage of predictable ratings)
- Compared when using the same amount of information (either ratings or trust statements)
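The evaluation protocol above can be sketched as follows: hide one rating at a time, try to predict it from the remaining data, and record the absolute error when a prediction is possible. The function names and the toy item-mean predictor are hypothetical stand-ins; in the actual comparison the predictor would be CF or the trust-aware algorithm.

```python
def leave_one_out(ratings, predict):
    """Return (MAE, coverage) for `predict` under leave-one-out.

    ratings: user -> {item: rating}
    predict: function(visible_ratings, user, item) -> rating or None
    """
    abs_errors = []
    hidden_count = 0
    for user, items in ratings.items():
        for item, true_rating in items.items():
            hidden_count += 1
            # Copy of the data with this single rating hidden.
            visible = {
                u: {i: r for i, r in its.items() if (u, i) != (user, item)}
                for u, its in ratings.items()
            }
            predicted = predict(visible, user, item)
            if predicted is not None:
                abs_errors.append(abs(predicted - true_rating))
    coverage = len(abs_errors) / hidden_count if hidden_count else 0.0
    mae = sum(abs_errors) / len(abs_errors) if abs_errors else None
    return mae, coverage


def item_mean_predictor(visible, user, item):
    """Toy predictor (assumption): mean rating other users gave to the item."""
    others = [its[item] for u, its in visible.items() if u != user and item in its]
    return sum(others) / len(others) if others else None


# Hypothetical toy data, not taken from the Epinions.com dataset.
toy_ratings = {"A": {"x": 4, "y": 2}, "B": {"x": 5}, "C": {"z": 3}}
mae, coverage = leave_one_out(toy_ratings, item_mean_predictor)
print(mae, coverage)  # 1.0 0.5
```

Reporting both numbers matters because MAE is computed only over the ratings that were predictable, so a low MAE can hide a low coverage.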
Evaluation: Coverage
Coverage refers to the percentage of hidden ratings that are predictable
For bootstrapping, trust-awareness is much more effective than standard CF (similarity)
Evaluation: Accuracy
Accuracy refers to the error made when creating a prediction (MAE)
The larger coverage of trust-awareness does not come at the cost of lower accuracy.
Conclusions
- Bootstrapping for new users is a real problem for RSs, as the real-world data shows
- Proposed trust-awareness as a bootstrapping tool
- Comparison of CF and trust-aware on cold start users:
- trust-aware achieves much higher coverage
- Trust can be propagated (more effective in finding neighbours [step 1])
- Few analyses of the coverage of CF
- the accuracy of trust-aware does not decrease
- Recommendation for RS creators: to bootstrap a new user, it is better to ask for a few trust statements about other users than for a few ratings on items
THE END
Thanks for your attention!