{ "info": { "author": "Gustavo Sverzut Barbieri", "author_email": "gsbarbieri@users.sourceforge.net", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Intended Audience :: Developers", "Intended Audience :: End Users/Desktop", "License :: OSI Approved :: GNU General Public License (GPL)", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python" ], "description": "TV Grab Brazil (source - http://tudonoar.uol.com.br/)\n\n It requires the python tv grabber library: pytvgrab-lib\n (see http://pytvgrab.sourceforge.net)\n It is used to extract information from the source webpage and\n output it in the xmltv format (version 0.5.15, see\n http://xmltv.sourceforge.net).\n\n\n\n The guide provider, http://tudonoar.uol.com.br, provides the guide\n in 3 levels:\n\n Channel Listing: here we get the channel name and url for its\n programs list\n +-> Channel Programs: here we get the program name and time\n for a channel\n +-> Program Information: here we get the program info\n\n\n My approach to grab the guide\n =============================\n (For developers of new grabbers)\n\n To parse the web guide I use a customized HTMLParser\n (customizedparser), a parser that ignores some tags,\n parsing just the wanted ones, like , ,
, ; and\n just some attributes, like href, ... This class maps the HTML to a\n python equivalent structure based on the Tag class, which has a\n name, attributes, content data and children.\n\n Then I get the structure and call one of the 3 functions that\n 'know' how to get the data we want from each document. They are:\n\n - get_channels(): this function knows how to get the channel\n name and url and returns a list of tuples ( name, url );\n\n - get_programs(): this one knows how to get the programs from\n a given channel and returns a list of tuples;\n\n - get_program_info(): this knows how to get the program\n information and returns a dict with the parsed xmltv data.\n\n So, getting the whole guide is easy: First, I grab the first\n page and use get_channels() to parse it and get the channel names\n and urls. Then, for each channel, I go to that URL and use\n get_programs() to get the program names, start times and urls. If\n a program has a url, I grab it and use get_program_info() to get\n its xmltv information.\n\n Each get_{channels,programs,program_info}() has its get_url_\n and process_ functions. This low coupling helps to print out\n information and debug. For instance, the main functions get the\n url, download data, process it and return the results. All these\n parts are separate.\n When an error happens, the grabber dumps the file and exits\n with -1.\n\n\n NOTICE: You may define your own functions!!!\n\n NOTICE: The re_clean and clear_html() are fully optional! 
It's\n just something I use to get rid of bullshit and maybe fix\n some html errors.", "description_content_type": null, "docs_url": null, "download_url": "http://sourceforge.net/project/showfiles.php?group_id=87433", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://pytvgrab.sourceforge.net/", "keywords": "python xmltv grabber br Brazil", "license": "GNU General Public License", "maintainer": null, "maintainer_email": null, "name": "pytvgrab-br_uol", "package_url": "https://pypi.org/project/pytvgrab-br_uol/", "platform": "posix", "project_url": "https://pypi.org/project/pytvgrab-br_uol/", "project_urls": { "Download": "http://sourceforge.net/project/showfiles.php?group_id=87433", "Homepage": "http://pytvgrab.sourceforge.net/" }, "release_url": "https://pypi.org/project/pytvgrab-br_uol/0.6.0/", "requires_dist": null, "requires_python": null, "summary": "python xmltv grabber for Brazil (source - http://tudonoar.uol.com.br)", "version": "0.6.0" }, "last_serial": 3885, "releases": { "0.5.0": [], "0.6.0": [] }, "urls": [] }