algorithm - How to continually filter interesting data for the user?


Take the example of a 'browse' slide show on a question / answer site, which shows one question / answer page at a time. The user clicks the 'Next' button and a new question / answer is presented to him.

I need to decide which page should be returned when the user clicks 'Next'. Here are some approaches I do not want, and the reasons why:

  • Showing the 'latest' questions in descending order:

    Given 100 questions, no user is going to click through to the 100th item, so it will never get any response. It also means that if no new questions have been posted recently, the user is shown the same repetitive, stale data every time he visits the site.

  • Showing the questions with the 'most active' answers / comments:

    This will never return the questions that have little activity, which are the ones that actually need more visibility.

  • Showing 'low activity' questions, i.e. those without many answers / comments:

    Once activity begins on a question, it will stop being shown. This cuts off activity on a question just when I really want to encourage discussion.

  • I think a mixture of these will work well, but I am unsure how to decide which pages should be returned. I would like to emphasize that I do not want the user to choose which category of items to see (such as unanswered / active / latest filters).

    Thanks

    Edit:


    Many thanks for Tim's comments. So far I am leaning toward this: rank pages by the ratio of activity count to view count. The activity count for a page is incremented every time a user performs an action on it, such as a vote, comment, answer, etc. The view count for a page is incremented every time someone views it.

    Then I will rank all the pages by their activity / view ratio and show the pages with the highest ratio first and most often. Thus pages with low activity and many views will be shown least, whereas pages with high activity and few views will be shown most often. I imagine that pages with low activity / low views and high activity / high views will fall somewhere in between, but I will have to keep a close eye on this in the beta release. I am also planning to store the pages each user viewed in the last 24 hours so that no slide is repeated on a given day.
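The ranking described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Page` class, the `next_page` function, and the `+1` smoothing in the ratio (so that never-viewed pages do not divide by zero) are hypothetical and not part of the plan above.

```python
import time
from dataclasses import dataclass

DAY_SECONDS = 24 * 60 * 60


@dataclass
class Page:
    page_id: int
    activity: int = 0  # votes, comments, answers, ...
    views: int = 0

    def ratio(self) -> float:
        # Assumption: add 1 to both counts so a never-viewed page
        # does not cause a division by zero.
        return (self.activity + 1) / (self.views + 1)


def next_page(pages, seen, now=None):
    """Return the unseen page with the highest activity/view ratio.

    `seen` maps page_id -> timestamp at which this user was last shown
    the page; pages shown within the last 24 hours are skipped.
    Returns None when every page was already shown today.
    """
    now = time.time() if now is None else now
    candidates = [
        p for p in pages
        if now - seen.get(p.page_id, float("-inf")) >= DAY_SECONDS
    ]
    if not candidates:
        return None
    best = max(candidates, key=Page.ratio)
    best.views += 1            # showing the page counts as a view
    seen[best.page_id] = now   # remember it for the 24-hour filter
    return best
```

Because every view lowers a page's ratio, a page that keeps being shown without attracting activity gradually sinks in the ranking, which is the behaviour the plan relies on.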

    One idea to prevent 'stale' data (if the above does not seem to prevent it): run a cron job that periodically checks for pages that have not been viewed recently and boosts their ratio.

As I see it, you are touching on two interesting questions:

  1. How to define how interesting a post is to the user: here you can use a weighted combination of the different factors that may contribute to a post's interest, such as how much activity it has, how fresh the entry is, and (if you have any way of knowing) whether the item matches the user's interests. You can choose the weights by intuition and see how well the results match what you expected. If you have the time and inclination, you can collect data on how your users respond to entries and use machine learning techniques to learn the optimal weight for each factor.

  2. How to give new posts a chance, otherwise known as the exploration-exploitation trade-off: basically, if you only ever show entries already known to be interesting, you maximize immediate happiness, but you never learn about new interesting things, so your users' total happiness ends up lower.

This is a very well-studied problem, and depending on how deep you want to go, you can read the literature on topics such as k-armed (multi-armed) bandit problems.

But a simple solution is not to always choose the entry with the highest score, and instead to choose the entry according to a probability distribution in which high-score entries are more likely to be displayed. This way, most of the time you show interesting things, but every post gets a chance to be shown occasionally.
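A minimal sketch of both ideas together, a weighted score and score-based random selection, might look like this. The factor names, the weights, and the use of a softmax with a `temperature` parameter are assumptions made for illustration; any distribution that favours high scores would fit the description above.

```python
import math
import random

# Hypothetical weights picked by intuition; tune them against real usage.
WEIGHTS = {"activity": 1.0, "freshness": 0.5, "interest_match": 2.0}


def score(features):
    """Weighted combination of the factors; missing factors count as 0."""
    return sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())


def pick_entry(entries, temperature=1.0, rng=random):
    """Sample one entry id, favouring high scores (softmax sampling).

    `entries` maps entry id -> feature dict. A lower temperature makes
    the choice greedier; a higher one spreads it more evenly.
    """
    if not entries:
        raise ValueError("no entries to choose from")
    ids = list(entries)
    scaled = [score(entries[i]) / temperature for i in ids]
    top = max(scaled)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(ids, weights=weights, k=1)[0]
```

With `entries = {"a": {"activity": 3.0}, "b": {"activity": 0.0}}`, entry "a" is picked most of the time, but "b" still appears now and then, which is what gives low-scoring posts their occasional chance.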

