Metadata-Version: 1.0
Name: tldextract
Version: 0.1
Summary: Accurately separate the gTLD/ccTLD component from the registered domain and subdomains of a URL.
Home-page: https://github.com/john-kurkowski/tldextract
Author: John Kurkowski
Author-email: john.kurkowski@gmail.com
License: BSD License
Description: 
        The `tldextract` module accurately separates the gTLD or ccTLD from the
        registered domain and subdomains of a URL. For example, you may want the
        'www.google' part of http://www.google.com. For that URL it is enough to
        split on '.' and keep all but the last element. That approach fails for
        URLs with arbitrary numbers of subdomains and country codes, unless you
        know what all country codes look like. Think of http://forums.bbc.co.uk,
        for example.
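        
        The failure of the naive approach can be seen in a minimal sketch. The
        helper name `naive_split` is hypothetical and is not part of `tldextract`;
        it assumes the TLD is always the last dot-separated label of the host.
        
        ```python
        def naive_split(url):
            """Hypothetical helper, not part of tldextract.

            Assumes the TLD is always the last dot-separated label.
            """
            # Drop the scheme ("http://") and anything after the host.
            host = url.split("//", 1)[-1].split("/", 1)[0]
            # Split once, at the last dot in the host.
            head, _, tld = host.rpartition(".")
            return head, tld

        print(naive_split("http://www.google.com"))    # ('www.google', 'com')
        print(naive_split("http://forums.bbc.co.uk"))  # ('forums.bbc.co', 'uk')
        ```
        
        The second result treats 'uk' as the whole suffix and leaves 'co' attached
        to the rest, which is why a lookup against known TLDs is needed.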
        
        `tldextract` can give you the subdomains, domain, and gTLD/ccTLD component of
        a URL, because it looks up the currently active TLDs according to iana.org
        and caches them locally.
        
            >>> import tldextract
            >>> ext = tldextract.extract('http://forums.news.cnn.com/')
            >>> ext['subdomain'], ext['domain'], ext['tld']
            ('forums.news', 'cnn', 'com')
            >>> ext = tldextract.extract('http://forums.bbc.co.uk/')
            >>> ext['subdomain'], ext['domain'], ext['tld']
            ('forums', 'bbc', 'co.uk')
        
Keywords: tld domain subdomain url parse extract urlparse
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 2.5
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
