Metadata-Version: 1.1
Name: infertweet
Version: 0.2
Summary: Infer information from Tweets. Useful for human-centered computing tasks, such as sentiment analysis, location prediction, authorship profiling and more!
Home-page: http://www.github.com/bwbaugh/infertweet
Author: Wesley Baugh
Author-email: wesley@bwbaugh.com
License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
Description: InferTweet
        ==========
        
        Infer information from Tweets. Useful for human-centered computing
        tasks, such as sentiment analysis, location prediction, authorship
        profiling and more!
        
        [![Build Status][Build Status]][Travis CI]
        
        Sentiment Analysis
        ------------------
        
        We provide three-class (positive, negative, objective-OR-neutral)
        sentiment analysis on tweets.
        
        Experiments are ongoing, but the system currently uses a hierarchical
        classifier. A subjectivity classifier first determines whether a tweet
        is objective or subjective; if the tweet is subjective, a polarity
        classifier then determines whether it is positive or negative.
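        
        The two-stage decision can be sketched as follows. This is a minimal
        illustration, not the InferTweet implementation: the function name
        `classify_hierarchical` and the keyword-based `toy_subjectivity` and
        `toy_polarity` stand-ins are hypothetical and only show how the two
        classifiers are chained.
        
        ```python
        def classify_hierarchical(text, subjectivity, polarity):
            """Return 'neutral', 'positive', or 'negative' for text.
        
            subjectivity and polarity are hypothetical callables standing in
            for the trained subjectivity and polarity classifiers.
            """
            # Stage 1: objective-OR-neutral tweets stop here.
            if subjectivity(text) == 'objective':
                return 'neutral'
            # Stage 2: only subjective tweets reach the polarity classifier.
            return polarity(text)
        
        
        # Toy keyword-based stand-ins, for illustration only.
        def toy_subjectivity(text):
            words = set(text.lower().split())
            return 'subjective' if words & {'love', 'hate'} else 'objective'
        
        
        def toy_polarity(text):
            return 'positive' if 'love' in text.lower().split() else 'negative'
        ```
        
        With these stand-ins, a tweet without any sentiment keyword never
        reaches the polarity stage and is labeled `neutral`.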
        
        We use approximately 8,750 labeled training instances provided by the
        [Sentiment Analysis in Twitter](http://www.cs.york.ac.uk/semeval-2013/task2/)
        task for SemEval-2013. We then "freeze" the subjectivity classifier,
        because we have not yet been able to incorporate additional
        high-quality labeled or unlabeled objective-OR-neutral tweets or text.
        However, we continue to train the polarity classifier through
        self-training on approximately 1 million unlabeled tweets that are
        likely to contain sentiment. These additional tweets were captured from
        Twitter only if their text contained a matching emoticon.
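        
        A rough sketch of emoticon filtering followed by self-training is
        shown below. Everything in it is an assumption made for illustration:
        the `EMOTICONS` tuple (the actual emoticon list is not given here),
        the confidence `threshold`, and the `ToyClassifier` word-vote model,
        which merely stands in for the real polarity classifier.
        
        ```python
        from collections import Counter, defaultdict
        
        # Hypothetical emoticon list; the list actually used is not documented.
        EMOTICONS = (':)', ':-)', ':(', ':-(')
        
        
        class ToyClassifier(object):
            """Illustrative word-vote classifier; not the InferTweet model."""
        
            def __init__(self):
                self.counts = defaultdict(Counter)  # word -> label -> count
        
            def train(self, text, label):
                for word in text.lower().split():
                    self.counts[word][label] += 1
        
            def classify(self, text):
                """Return (label, confidence), or (None, 0.0) for unseen words."""
                votes = Counter()
                for word in text.lower().split():
                    votes.update(self.counts[word])
                total = sum(votes.values())
                if total == 0:
                    return None, 0.0
                label, count = votes.most_common(1)[0]
                return label, count / float(total)
        
        
        def has_emoticon(text):
            return any(emoticon in text for emoticon in EMOTICONS)
        
        
        def self_train(classifier, unlabeled, threshold=0.9):
            """Train on the classifier's own confident predictions.
        
            Returns the number of unlabeled texts that were added.
            """
            added = 0
            for text in unlabeled:
                if not has_emoticon(text):
                    continue  # keep only tweets likely to contain sentiment
                label, confidence = classifier.classify(text)
                if confidence >= threshold:
                    classifier.train(text, label)
                    added += 1
            return added
        ```
        
        Only predictions at or above the threshold are fed back as training
        data, which limits how quickly the classifier can reinforce its own
        mistakes.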
        
        ### SemEval-2013
        
        An early version of our system was entered in the SemEval-2013
        competition. Our simple system (Naive Bayes with unigrams + bigrams)
        ranked 25th out of 48 submissions, which is not state of the art but is
        still not too bad for such a simple system.
        
        The evaluation metric was the average F-measure of the positive and
        negative classes. Our system achieved an F-measure of `0.5437`, while
        the top system achieved `0.6902`.
        
        #### Results of system for SemEval-2013
        
            Confusion table:
            gs \ pred| positive| negative|  neutral
            ---------------------------------------
             positive|      841|      233|      498
             negative|       74|      324|      203
              neutral|      276|      196|     1168
        
        
            Scores:
            class                    prec                 recall     fscore
            positive      (841/1191) 0.7061    (841/1572) 0.5350     0.6088
            negative       (324/753) 0.4303     (324/601) 0.5391     0.4786
            neutral      (1168/1869) 0.6249   (1168/1640) 0.7122     0.6657
            --------------------------------------------------------------------
            average(pos and neg)                                          0.5437
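        
        The scores can be reproduced from the confusion table above, where
        rows are gold-standard labels and columns are predictions. The sketch
        below is only a worked check of the arithmetic; the names `CONFUSION`
        and `scores` are illustrative and not part of the package.
        
        ```python
        LABELS = ('positive', 'negative', 'neutral')
        
        # CONFUSION[gold][predicted], copied from the confusion table.
        CONFUSION = {
            'positive': {'positive': 841, 'negative': 233, 'neutral': 498},
            'negative': {'positive': 74, 'negative': 324, 'neutral': 203},
            'neutral': {'positive': 276, 'negative': 196, 'neutral': 1168},
        }
        
        
        def scores(label):
            """Return (precision, recall, fscore) for one class."""
            correct = CONFUSION[label][label]
            # Column sum: everything the system predicted as this label.
            predicted = sum(CONFUSION[gold][label] for gold in LABELS)
            # Row sum: everything whose gold-standard label is this label.
            actual = sum(CONFUSION[label].values())
            precision = correct / float(predicted)
            recall = correct / float(actual)
            fscore = 2 * precision * recall / (precision + recall)
            return precision, recall, fscore
        
        
        # Evaluation metric: mean F-measure of the positive and negative classes.
        average = (scores('positive')[2] + scores('negative')[2]) / 2
        print(round(average, 4))  # 0.5437
        ```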
        
        In the meantime, we have a lot more experimental ideas that may improve
        the performance of our classifier, so it's time to get experimenting!
        
        ### RPC server
        
        The sentiment analysis classifier can be loaded from file and served
        using an RPC server. This allows many applications to share one
        classifier, and the classifier stays loaded even if an application
        that depends on it needs to restart or update.
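        
        The general pattern can be sketched with the XML-RPC modules from the
        Python 3 standard library. This is an assumption made for
        illustration: the RPC library that InferTweet actually uses is not
        stated here, and the `classify` function below is a hypothetical
        stand-in that returns a fixed answer instead of a loaded model.
        
        ```python
        import threading
        from xmlrpc.client import ServerProxy
        from xmlrpc.server import SimpleXMLRPCServer
        
        
        def classify(text):
            """Hypothetical stand-in for a classifier loaded from file."""
            return {'text': text, 'label': 'neutral', 'confidence': 0.5}
        
        
        def start_server(host='127.0.0.1', port=0):
            """Serve classify() over XML-RPC in a background thread.
        
            Port 0 asks the operating system for any free port.
            """
            server = SimpleXMLRPCServer((host, port), logRequests=False)
            server.register_function(classify, 'classify')
            thread = threading.Thread(target=server.serve_forever)
            thread.daemon = True
            thread.start()
            return server
        
        
        # The classifier lives in the server; clients only hold a proxy.
        server = start_server()
        client = ServerProxy('http://127.0.0.1:%d' % server.server_address[1])
        result = client.classify('Today is March 30, 2013.')
        server.shutdown()
        server.server_close()
        ```
        
        Because clients talk to the server through a proxy, a client can
        restart without forcing the classifier to be reloaded.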
        
        ### Web user interface
        
        We have added a very simple web interface that allows users to query the
        system. Lots of upcoming features are planned for the web interface.
        
        **Known Bug:** When the package is installed through `pip` or
        `setup.py`, the web interface files under `web/static` and
        `web/templates` are not copied along with the installation. Therefore,
        either copy these files manually or run from the source directory.
        
        ### RESTful JSON API
        
        #### GET sentiment/classify
        
        ##### Resource URL
        
        http://.../api/sentiment/classify.json
        
        ##### Parameters
        
        - text: String representing the document to be classified.
        
        ##### Response object fields
        
        - text: String of the original input text.
        - label: String of the sentiment classification label.
        - confidence: Float of the confidence in the label.
        
        ##### Example request
        
        GET `http://.../api/sentiment/classify.json?text=Today+is+March+30%2C+2013.`
        
            {
                "text": "Today is March 30, 2013.",
                "confidence": 0.9876479882432573,
                "label": "neutral"
            }
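        
        A client might build the request URL and read the response roughly as
        follows. This sketch targets Python 3 (on Python 2.7, `urlencode`
        lives in `urllib` instead of `urllib.parse`). The helper names and the
        `http://localhost:8080` base URL are hypothetical; substitute the
        address where the web interface is actually running.
        
        ```python
        import json
        from urllib.parse import urlencode
        
        
        def build_classify_url(base_url, text):
            """Return the GET URL for sentiment/classify with text encoded."""
            query = urlencode({'text': text})
            return '%s/api/sentiment/classify.json?%s' % (
                base_url.rstrip('/'), query)
        
        
        def parse_response(body):
            """Extract (label, confidence) from a JSON response body."""
            data = json.loads(body)
            return data['label'], data['confidence']
        
        
        # Hypothetical host; the response body is the documented example.
        url = build_classify_url('http://localhost:8080',
                                 'Today is March 30, 2013.')
        body = ('{"text": "Today is March 30, 2013.", '
                '"confidence": 0.9876479882432573, "label": "neutral"}')
        label, confidence = parse_response(body)
        ```
        
        The `urlencode` call produces the same escaping as the example
        request: spaces become `+` and the comma becomes `%2C`.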
        
          [Build Status]: https://travis-ci.org/bwbaugh/infertweet.png?branch=master
          [Travis CI]: https://travis-ci.org/bwbaugh/infertweet
        
Platform: UNKNOWN
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Natural Language :: English
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Linguistic
