Metadata-Version: 1.0
Name: cold-start-recommender
Version: 0.3.5
Summary: In-memory recommender for recommendations produced on-the-fly
Home-page: https://github.com/malemi/cold-start-recommender
Author: Mario Alemi
Author-email: mario.alemi@gmail.com
License: LICENSE.txt
Description: *********************************************************************
        Cold Start Recommender: easy, fast, greed recommender to avoid the cold start
        *********************************************************************
        
        Cold Start Recommender was developed because we needed a recommender
        with the following characteristics:
        
        * **Greedy.** No previous data on Items or Users available, therefore
            *any* information should be used --not just which Item a User
            likes, but also --in case of a book-- the correspondend category,
            author etc.
        
        * **Fast.** Any information on Users and Item should be stored and
            used immediately. A rating by any User should improve
            recommendations for such User, but also for other Users. This
            means no batch computations, and in-memory database.
        
        * **Ready to use.** Have a look at bin/recommender_api.py for starting
            a webapp to POST information and GET recommendations.
        
        
        CSRec should not (yet) be used for production system, but only for
        pilots, where statistics is so low that any filter (e.g. loglikelihood
        filter on the co-occurence matrix) do not make yet sense. It aims at
        *gather data* providing a recommendation experience.
        
        TODO Future releases will include state of the art algorithms (about
        October 2014, a few months before our product is supposed to go
        public).
        
        CSRec is written in Python, and under the hood it uses the `Pandas`_
        library. 
        
        **Table of Contents**
        
        .. contents::
            :local:
            :depth: 1
            :backlinks: none
        
        
        A simple script
        ---------------
        
            from csrec.Recommender import Recommender
            
            engine = Recommender()
        
            # Insert Item with it properties (e.g. author, category...)
        
            engine.insert_item({'_id': 'an_item', 'author': 'The Author'})
        
            # Insert rating, indicating wich property of the Item should be used for producing recs
        
            engine.insert_rating(user_id='a_user;, item_id='an_item', rating=4, item_info=['author'])
        
            # Insert rating, indicating that only the property should be used for recs (e.g. initial users' profiling)
        
            engine.insert_rating(user_id='another_user', item_id='an_item', rating=3, item_info=['author'], only_info=True)
        
        
        Features
        ========
        
        Persistence
        -----------
        
        You can use CSRec purely in-memory for testing or with MongoDB, which
        you can install on a tmpfs filesystem created in your RAM (on Linux,
        see
        http://edgystuff.tumblr.com/post/49304254688/how-to-use-mongodb-as-a-pure-in-memory-db-redis-style). If using a RAM partition, please make a replica set!
        
        (Why using a replica set? Because you can have the primary DB in
        memory, and two other secondaries on disk. If the primary goes down,
        you still can use CSRec at lower performances, but without any data
        loss.)
        
        Examples:
        
        	engine = Recommender()  # Start in-memory recommender for testing
        	
        	engine = Recommender(mongo_host='localhost', mongo_db_name='my_cold_rec')  # ...with MongoDB, collections are created automatically
        	
        	engine = Recommender(mongo_host='localhost', mongo_db_name='my_cold_rec', mongo_replica_set='recommender_replica')  # as above, with replica
        	
        
        The Cold Start Problem
        ----------------------
        
        The Cold Start Problem originates from the fact that collaborative
        filtering recommenders needs data to build recommendations. Typically,
        if Users who liked item 'A' also liked item 'B', the recommender would
        recommend 'B' to a user who just liked 'A'. But if you have no
        previous rating by any User, you cannot make any recommendation.
        
        CSRec tackles the issue in various ways:
        
        1. It allows profiling with well-known Items without biasing the
        results. For instance, if a call to insert_rating is done in this way:
        
           engine.insert_rating(user_id='another_user', item_id='an_item', rating=3, item_info=['author'], only_info=True)
        
        CSRec will only register that 'another_user' likes a certain author,
        but not that s/he might like 'an_item'. This is of fundamental
        importance when profiling Users with a "profiling page" provided soon
        after log-in. If you ask Users if they like "Harry Potter" instead of
        "The Better Angel of Our Nature", and many chose Harry Potter, you
        don't want to make the Item "Harry Potter" even more popular than what
        alread is. You might only want to record that the user likes books for
        children sold as literature for adults.
        
        CSRec does that because, unless you are Amazon or a similar brand, the
        co-occurence matrix is often too sparse to build any decent
        recommendation. In this way you start building multiple, denser,
        co-occurence matrices and use them from the beginning.
        
        2. **Any information is used.** You decide which information you should
        record about a User rating an Item. This is similar to the previous
        point, but you also register the item_id.
        
        3. **Any information is used *immediately*.** The co-occurence matrix is
        updated as soon as a rating is inserted.
        
        4. **It tracks anonimous users,** e.g. random visitors of a website
        before the sign in/ sign up process. After sign up/ sign in the
        information can be reconciled --information relative to the session ID
        is moved into the correspondent user ID entry.
        
        Mix Recommended with Popular Items
        ----------------------------------
        
        What about users who would only receive a couple of recommended items?
        No problem! We'll fill the list with most popular items who were not
        recommended (nor rated by such user).
        
        Algorithms
        ----------
        
        At the moment CSRec only provides purely item-based recommendations
        (co-occurence matrix dot the User's ratings array). In this way we can
        provide recommendations in less than 200msec for a matrix of about
        10,000 items.
        
        
        Versions
        --------
        
        ***v 3.5***
        
        * Added logging
        * Added creation of collections for super-cold start (not even one rating, and still user asking for recommendations...)
        * Additional info used for recommendations (eg Authors etc) are now stored in the DB
        * _sync_user_item_ratings now syncs addition info's collections too
        * popular_items now are always returned, even in case of no rating done, and get_recommendations eventually adjusts the order if some profiling has been done 
        
        
        .. _Pandas: http://pandas.pydata.org
Platform: UNKNOWN
