Metadata-Version: 1.0
Name: udon
Version: 0.1
Summary: Normalizes lengthened English expressions with repeated letters (e.g., "cooooooooooooooollllllllllllll" to "cool").
Home-page: https://github.com/ikegami-yukino/udon
Author: Yukino Ikegami
Author-email: yukino_0131@me.com
License: MIT License
Description: udon
        ===========
        
        Udon is a text normalizer for lengthened English expressions that contain repeated letters.
        
        (e.g., Udon converts "cooooooooooooooollllllllllllll" to "cool")
        
        This module is based on the following paper:
        Samuel Brody and Nicholas Diakopoulos.
        Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! using word lengthening to detect sentiment in microblogs.
        In EMNLP2011, pp. 562-570, 2011.
        http://aclweb.org/anthology//D/D11/D11-1052.pdf
        
        
        Installation
        ============
        
        ::
        
         $ pip install udon
        
        
        Usage
        =====
        
        Import Udon
        --------------------------------------------
        
        ::
        
         >>> import udon
        
        
        Normalize sentence
        --------------------------------------------
        
        ::
        
         >>> udon.normalize_sentence('you are coooolll!!!')
         you are cool!
        
        
        - normalize_sentence(str)
        
        
        Normalize word
        --------------------------------------------
        
        ::
        
         >>> udon.normalize_word('okayyyyy')
         okay
        
        
        - normalize_word(str)
        
        
        Shorten repeated substrings to a threshold without a dictionary
        -------------------------------------------------------------------
        
        ::
        
         >>> udon.cut_repeat('mamisaaaaaan', 1)
         mamisan
         >>> udon.cut_repeat('okayyyyy', 2)
         okayy
        
        
        - cut_repeat(str, threshould)
        
          * Note that this function does not use the lengthened-expression normalization table (e.g., cooll -> cool).
            To normalize such expressions, use `normalize_word()` or `normalize_sentence()`.
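        
        As a rough illustration of the threshold behaviour, the following sketch reproduces the two
        `cut_repeat` examples above with a regular expression. It is not Udon's implementation:
        `cut_repeat_sketch` is a hypothetical name, and it assumes that only runs of a single
        repeated character need shortening, whereas `cut_repeat` is described as handling repeated substrings.
        
        ::
        
         import re
         
         
         def cut_repeat_sketch(text, threshold):
             """Shorten every run of one repeated character to at most `threshold` copies.
         
             Hypothetical helper for illustration only; no dictionary or table is consulted.
             """
             if threshold < 1:
                 raise ValueError("threshold must be at least 1")
             # (.) captures one character; \1{n,} then requires n or more further copies,
             # so the pattern only matches runs longer than the threshold.
             pattern = re.compile(r"(.)\1{%d,}" % threshold)
             # Replace each over-long run with exactly `threshold` copies of its character.
             return pattern.sub(lambda m: m.group(1) * threshold, text)
         
         
         print(cut_repeat_sketch('mamisaaaaaan', 1))  # mamisan
         print(cut_repeat_sketch('okayyyyy', 2))      # okayy
        
        Like `cut_repeat`, this sketch leaves "cooll" unchanged at a threshold of 2, which is why the
        table-based functions are needed for full normalization.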
        
          
        TODO
        ======================
        * Support Japanese lengthened expressions
        
        Contributions are welcome!
        
        
        License
        =========
        
        - This module is licensed under the MIT License.
        
        
        
        
        CHANGES
        =======
        
        0.1 (2014-03-14)
        ----------------
        
        First release.
        
        
Keywords: normalize,lengthened expression,English
Platform: POSIX
Platform: Windows
Platform: Unix
Platform: MacOS
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Topic :: Utilities
Classifier: Topic :: Software Development
Classifier: Topic :: Text Processing
Classifier: Natural Language :: English
