x*****0 (posts: 452) | 1
I came across this question online: how would you design an algorithm to
predict how many upvotes a review will get within a certain time period after
it is published? Let's say the reviews come from Yelp. The problem then
becomes the following:
> Given millions of reviews (text) and their associated upvotes in a
> certain time period, how do you design an ML predictive model?
Certainly, a lot of prior information may affect how many upvotes a given
review gets. For example:
(1) Who wrote the review: an elite user or a regular user?
(2) The business being reviewed. A review of a popular restaurant may get
more upvotes than a review of a car mechanic. Generally speaking, a Yelp
review record contains a business ID.
(3) ...
Let's set this prior information aside and focus on the text features. Can
anyone give me some suggestions? I'm thinking of using LDA (a topic model) to
generate a dictionary of topics, but I still haven't come up with a complete
solution.
Does anyone have good ideas on how to generate features from the text?
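
To make the question more concrete, here is a rough sketch of the kind of
text-only features I have in mind, using only the Python standard library. It
is just one possible starting point, not a complete solution: the function
names (tokenize, surface_features, tfidf_vectors), the choice of surface
features, and the smoothed IDF formula are all my own assumptions, and the
sample reviews are made up. Topic proportions from LDA could presumably be
appended to the same feature vectors.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: lowercase runs of letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Split a review into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def surface_features(text):
    """Cheap per-review features that ignore vocabulary entirely."""
    tokens = tokenize(text)
    n = len(tokens)
    return {
        "n_tokens": n,
        "n_chars": len(text),
        "n_exclaim": text.count("!"),
        "n_question": text.count("?"),
        "avg_token_len": sum(len(t) for t in tokens) / n if n else 0.0,
        "type_token_ratio": len(set(tokens)) / n if n else 0.0,
    }


def tfidf_vectors(texts):
    """Return one sparse {word: weight} dict per review.

    Weight = term frequency * smoothed inverse document frequency,
    where idf(w) = log((1 + N) / (1 + df(w))) + 1 (an assumed variant).
    """
    docs = [Counter(tokenize(t)) for t in texts]
    n_docs = len(docs)

    # Document frequency: in how many reviews each word appears.
    df = Counter()
    for counts in docs:
        df.update(counts.keys())

    idf = {w: math.log((1 + n_docs) / (1 + c)) + 1.0 for w, c in df.items()}

    vectors = []
    for counts in docs:
        total = sum(counts.values())
        if total == 0:
            vectors.append({})
            continue
        vectors.append({w: (c / total) * idf[w] for w, c in counts.items()})
    return vectors


if __name__ == "__main__":
    # Made-up sample reviews, for illustration only.
    reviews = [
        "Great food! Great service!",
        "The mechanic fixed my car quickly.",
    ]
    for review, vec in zip(reviews, tfidf_vectors(reviews)):
        print(surface_features(review))
        print(sorted(vec.items(), key=lambda kv: -kv[1])[:3])
```

Each review then becomes the surface features plus a sparse TF-IDF vector,
which could be fed to any regression model on the upvote counts.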