{ "info": { "author": "Dylan Jay", "author_email": "software@pretaweb.com", "bugtrack_url": null, "classifiers": [ "Programming Language :: Python", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": ".. contents :: :local:\n\n\nIntroduction\n============\n\nTransmogrifier blueprints that look at how html items are linked to gather metadata\nabout items. They can help you restructure your content.\n\n\n\n\n\ntransmogrify.siteanalyser.sitemapper\n====================================\nTake navigation html such as a whole sitemap, breadcrumbs or navigation menus using nested links\nand buildup a site structure and titles\nfor pages. This can then be used to cleanup urls and titles for items as well as hide content that shouldn't be\ndisplayed in menus. This is useful for cleaning up sites with a flat url scheme such 'display?id=xxx' type sites.\n\n\nOptions:\n\n:field:\n Name of a field from item which contains a sitemap or nested links\n\n:field_expr:\n Expression to determine the field which contains a sitemap\n\n:breadcrumb_field:\n Key of the field which contains breadcrumb style html. e.g. Folder > Item.\n\n:folder-type:\n If set will ensure all parents in sitemap are of this Type, moving item to defaultpage if needed\n\n:exclude-from-navigation-key:\n Will set this key to 'True' if this item is not found in the sitemap. 
Defaults to 'exclude-from-navigation'\n\n:title-key:\n Update this field with the title taken from the sitemap if no title already exists\n\n:path_sub:\n Newline separated regular expressions and substitutions to adjust paths and so change where content is moved to\n\n:condition:\n TAL expression; if it evaluates to false, this item is not moved\n\n\n\ntransmogrify.siteanalyser.urltidy\n=================================\nWill normalize ids in urls to be suitable for adding to Plone.\n\nThe following will tidy up the URLs based on a TALES expression ::\n\n $> bin/funnelweb --urltidy:link_expr=\"python:item['_path'].endswith('.html') and item['_path'][:-5] or item['_path']\"\n\nIf you'd like to move content around before it's uploaded you can use the urltidy step as well e.g. ::\n\n $> bin/funnelweb --urltidy:link_expr=\"python:item['_path'].startswith('/news') and '/otn/news'+item['_path'][5:] or item['_path']\"\n\n\nOptions:\n\n:condition:\n TAL Expression to apply transform\n\n:locale:\n TAL Expression to return the locale used for id normalisation. e.g. 'string:en'\n\n:link_expr:\n TAL Expression to alter the item's '_path'\n\n:use_title:\n Condition TAL Expression to change the end path element to a normalised version of item['_title']\n\n\n\n\n\n\n\ntransmogrify.siteanalyser.attach\n================================\n\nFind items and move them if they are tightly linked to a single page. For example if an image\nis located in an images folder, but is only referenced from a single img element on a page in\n/page then the image will be 'merged' with the page.\nHow the merge occurs depends on the 'fields' setting. 
Merging can either be moving the content\nof the subitem into a field of the parent item, or it can be via containment.\n\n\n\nFor example, the following will only move attachments that are images and use ``index-html`` as the new\nname for the default page of the newly created folder ::\n\n [funnelweb]\n recipe = funnelweb\n attachmentguess-condition = python: subitem.get('_type') in ['Image']\n attachmentguess-defaultpage = index-html\n\nOptions\n\n:fields:\n TAL Expression to return a dictionary of changes to ``item``. It will use ``item``, ``subitem`` and ``i`` variables.\n e.g. python:{'attachment':subitem['text']}. This will be called for all subitems. The subitems will be deleted.\n\n:condition:\n TAL Expression to apply transform\n (default='python:True')\n\n:defaultpage:\n (default='index-html')\n\n\n\ntransmogrify.siteanalyser.title\n===============================\n\nThis blueprint will take the _backlinks from the item generated by webcrawler\nand if no Title field has been given to the item it will attempt to guess\nit from the link names that linked to this document.\nYou can set the 'ignore' option to specify titles never to use.\n\nIf it can't guess it from the backlinks it will default to using the file name after\ncleaning it up somewhat.\n\nOptions:\n\n:condition:\n TAL Expression to apply transform\n\n:ignore:\n Newline separated list of strings which won't be used as titles. Defaults to 'next','previous'\n\n\ntransmogrify.siteanalyser.hidefromnav\n=====================================\n\nThis blueprint will guess which folders should be hidden from the navigation tree.\nIt does this by one of three rules\n\n1. Gather all links in the _template html left over after content extraction\nand assume anything linked from outside the content should have their folders shown and\nanything else should be hidden. #TODO\n2. Any folders with content found only via img links will also be hidden. #TODO\n3. 
The condition to set to tree for the item to hide\n\nOptions\n\n:key:\n Default is '_exclude-from-navigation'.\n\n:condition:\n Default is 'python:False'\n\n:template_key:\n #TODO\n Default is '_template'\n\n:hide_img_folders:\n #TODO\n Default is 'True'\n\n\ntransmogrify.siteanalyser.defaultpage\n=====================================\nTo determine if an item is a default page for a container (it has many links\nto items in that container, even if not contained in that folder), and then move\nit to that folder.\n\nOptions:\n\n:mode:\n 'links' or 'path' (default=links).\n 'links' mode uses links\n to determine if a item is a defaultpage of a subtree by looking at it's links.\n 'path' mode uses parent_path expression to\n determine if an item is a defaultpage of that parent.\n\n:min_links:\n If a page has as at least this number of links that point to content in a folder\n then move it there and make it the defaultpage. (default=2)\n\n:max_uplinks:\n If a page has more than max_uplinks it won't be moved. (default=2)\n\n:parent_path:\n Rule is defined by entered\n parent_path option which is expression with access to item,\n transmogrifier, name, options and modules variables.\n Returned value is used to find possible parent item by path. If found,\n item is moved to that parent item, parent item _defaultpage key is set\n appropriately, and we turn to processing another item in a pipeline. So\n the first item in pipeline will take precedence in case parent_path rule\n returns more than one item for the same parent.\n\n:condition:\n default=python:True\n\n\ntransmogrify.siteanalyser.relinker\n==================================\nHelp restructure your content.\nIf you'd like to move content from one path to another then in a\nprevious blueprints adjust the '_path' to the new path. Create a new field\ncalled '_origin' and put the old path into that. 
Once you pass it through\nthe relinker all href, img tags etc will be changed in any html content where they\npointed to content that has since moved. All '_origin' fields will be removed\nafter relinking.\n\nOptions:\n\n:ignore_duplicates:\n If 'True' there won't be an error raised when two items were redirected from the same place. This can occur with\n some CMS's where content can be in different urls in the site\n\n:broken_link_normalise:\n TAL expressions, each on a new line, which take 'url' from inside the html and returns a link that will match one of\n the existing links in the site. Must return the full url, not the path. This is useful when many different links\n could go to the same content.\n\n\ntransmogrify.pathsorter\n==================================\n\nIf items are at the same level in a folder then they will be sorted based on a\n'_sortorder' key as given by transmogrify.webcrawler.\n\nIn addition\n\n\n- if a container has a 'text' key then a default page will be created.\n\n- if item's name is in 'default_pages' and it's parent doesn't already have a defaultpage\n then the item will be set as the parents default page.\n\nOptions:\n\n:default_pages:\n Set item as to be set as the default page of it's parent if it matches one of these names.\n Default is 'index.html'\n\n:default_containers:\n if an item doesn't exist for a given items parent it will be created. The _type key will\n be set to the first item in 'default_containers'. 
Default is 'Folder'.\n\nChangelog\n=========\n\n1.3 (2012-12-28)\n----------------\n\n- added sitemapper blueprint\n- converted text to unicode baseNormalize because decomposition needs unicode valute to works properly [gborelli]\n- add ignore_re attribute in Backlinks title for ignore regular expression condition [ivanteoh]\n- add 'ignore_duplicates' and 'broken_link_normalise' to relinker [djay]\n- added invalid_ids option [ivanteoh]\n- relinker relinks any field which contains '<' and additional specififed fields [djay]\n- cleaned up logging [djay]\n- fixed relinking of defaultpages [djay]\n- titles from backlinks must be unique to be used [djay]\n\n1.2 (2012-04-28)\n----------------\n\n- moved transmogrify.pathsorter into transmogrify.siteanalyser.pathsorter [djay]\n\n1.1 (2012-04-18)\n----------------\n\n- added transmogrify.siteanalyser.sitemapper [djay]\n- split transmogrify.siteanalyser.urltidy out of relinker [djay]\n- ensure urltidy always create unique urls [djay]\n- Added ability to take id from title to urltidy [djay]\n- improved logging [djay]\n- fixed bug in attach where two items can end up with same path [djay]\n\n\n1.0 (2011-06-29)\n----------------\n\n- 1.0 release\n\n1.0b8 (2011-02-12)\n------------------\n- more robust parsing of html\n\n1.0b7 (2011-02-06)\n------------------\n\n- show error if text is None\n- fix bug with bad chars in rewritten links\n- fix bug in losing items\n- add hidefromnav blueprint. does manual hiding\n\n\n1.0b6 (2010-12-15)\n------------------\n\n- remove nulls from links which cause lxml errors\n- summarise info in log to single entry\n\n1.0b5 (2010-12-13)\n------------------\n\n- condition was in the wrong place. 
resulted in dropping items\n- improve logging\n- handle default pages that don't exist\n\n1.0b4 (2010-11-11)\n------------------\n\n- fix bug where _defaultpage wasn't being relinked\n\n1.0b3 (2010-11-09)\n------------------\n\n- fix bug in quoting links in relinker\n\n\n1.0b2 (2010-11-08)\n------------------\n\n- Add conditions to site analyser blueprints", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/collective/transmogrify.siteanalyser", "keywords": "transmogrifier blueprint funnelweb source plone import conversion microsoft office", "license": "GPL", "maintainer": null, "maintainer_email": null, "name": "transmogrify.siteanalyser", "package_url": "https://pypi.org/project/transmogrify.siteanalyser/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/transmogrify.siteanalyser/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://github.com/collective/transmogrify.siteanalyser" }, "release_url": "https://pypi.org/project/transmogrify.siteanalyser/1.3/", "requires_dist": null, "requires_python": null, "summary": "transmogrifier source blueprints for crawling html", "version": "1.3" }, "last_serial": 800896, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "d554d63f00424a776bf375384d2ea655", "sha256": "d203855f55fe654fb892f7b78bc45c0f1a2af9cffee4826db6aeb761bca39128" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0.zip", "has_sig": false, "md5_digest": "d554d63f00424a776bf375384d2ea655", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41355, "upload_time": "2011-06-29T16:22:31", "url": "https://files.pythonhosted.org/packages/e3/9d/c1195d2c9d8196778c6e2184baca7d362893460f3fb5d690310636f4d77f/transmogrify.siteanalyser-1.0.zip" } ], "1.0a1": [], "1.0a1dev": [ { "comment_text": "", "digests": { "md5": "a194fb437bb05542ccf06e5938cbc745", "sha256": 
"b4bb8941b312c3fdba647a8768d81ba66931a6c1a95bee2f158ff2f0290b473b" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0a1dev.tar.gz", "has_sig": false, "md5_digest": "a194fb437bb05542ccf06e5938cbc745", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27386, "upload_time": "2010-03-26T11:51:34", "url": "https://files.pythonhosted.org/packages/04/a5/3b7e0f7c3d173a355bcc9b60d11dd6ee70a8eee31251a4e655427eccc5c2/transmogrify.siteanalyser-1.0a1dev.tar.gz" } ], "1.0b1": [ { "comment_text": "", "digests": { "md5": "fabdc0ee80a1c8b322847dcfd25c55d1", "sha256": "3ef7c379831caf3eadfef74d903ff37304c5c7189a3da7b78381ba3664f98dc8" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0b1.zip", "has_sig": false, "md5_digest": "fabdc0ee80a1c8b322847dcfd25c55d1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 38747, "upload_time": "2010-11-07T16:06:23", "url": "https://files.pythonhosted.org/packages/6e/4f/cb7aebaa78047f20c5e13d3206824d3c9d57f992036650b26f7282e843c3/transmogrify.siteanalyser-1.0b1.zip" } ], "1.0b2": [ { "comment_text": "", "digests": { "md5": "99dc38283cdb9d37ee1c1aaca1203599", "sha256": "0b2e447ad6924e421df8855ee37cd2c1917494ea7977ceba12f9cdace150bea6" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0b2.zip", "has_sig": false, "md5_digest": "99dc38283cdb9d37ee1c1aaca1203599", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 38917, "upload_time": "2010-11-08T15:53:01", "url": "https://files.pythonhosted.org/packages/86/e6/ca8fde3bd38a489dfa651b23222b6b96b5c192664763d016dd610d6838d0/transmogrify.siteanalyser-1.0b2.zip" } ], "1.0b3": [ { "comment_text": "", "digests": { "md5": "48212c0c46e07e6333043dc35980bbe4", "sha256": "61bd5e1cd870cc3430fc9b7e05f255be3e93d752844b36d43b1e618fa8777d53" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0b3.zip", "has_sig": false, "md5_digest": "48212c0c46e07e6333043dc35980bbe4", 
"packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 39008, "upload_time": "2010-11-08T18:43:37", "url": "https://files.pythonhosted.org/packages/a5/92/1f3b540f35169e7a87cc84d2f575d492ff0688b265069ab0498d9dcbd725/transmogrify.siteanalyser-1.0b3.zip" } ], "1.0b5": [ { "comment_text": "", "digests": { "md5": "897cf19342aeefe2f0e0e6ac24a9b95b", "sha256": "46919d60266e5ec0c99236a4b407415c2acc6b80e5b280ebff2c89107fc59952" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0b5.zip", "has_sig": false, "md5_digest": "897cf19342aeefe2f0e0e6ac24a9b95b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 39612, "upload_time": "2010-12-13T16:43:36", "url": "https://files.pythonhosted.org/packages/e9/58/2b8a12b11929ea2504f99dfe0343c6d716f3b7aa099cb133e7f172c632ed/transmogrify.siteanalyser-1.0b5.zip" } ], "1.0b6": [ { "comment_text": "", "digests": { "md5": "385ac19575ccd1dedfa47f3a7dd7d980", "sha256": "6a93b65a50874a2100ceec923fca8d6ed8d6bf992f7b44586207420ef7337134" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0b6.zip", "has_sig": false, "md5_digest": "385ac19575ccd1dedfa47f3a7dd7d980", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 39958, "upload_time": "2010-12-14T16:42:00", "url": "https://files.pythonhosted.org/packages/d4/fb/7c6bd125913dc918265ce1601a6d580260c53a44a60a8a61be4004d79b0b/transmogrify.siteanalyser-1.0b6.zip" } ], "1.0b7": [ { "comment_text": "", "digests": { "md5": "00aec7423529613fe401d8bfc5d42db4", "sha256": "cef89607b3f66d591f1ed6d4c3e9fa39f671c0daf3ea164d715e3c6252fdf335" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0b7.zip", "has_sig": false, "md5_digest": "00aec7423529613fe401d8bfc5d42db4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41334, "upload_time": "2011-02-06T17:09:17", "url": 
"https://files.pythonhosted.org/packages/a3/18/7f7518cd15ec60a4c3a99739ffc766a19ad1e0750c416eba530b984b2a38/transmogrify.siteanalyser-1.0b7.zip" } ], "1.0b8": [ { "comment_text": "", "digests": { "md5": "6d00724da647fd11e9d1c5279f54b97f", "sha256": "a820b44fc5a0ae49707fdf47031cf2616760c773050c418bc3bbf18534ac1b96" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.0b8.zip", "has_sig": false, "md5_digest": "6d00724da647fd11e9d1c5279f54b97f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41428, "upload_time": "2011-02-12T02:15:36", "url": "https://files.pythonhosted.org/packages/2e/88/3ef07f1336f9f3b06d39159008fc0b220bc75890b809bf34fc97d080de9b/transmogrify.siteanalyser-1.0b8.zip" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "76034fbc4d1dcdc236370cd3b69269a4", "sha256": "588acac879fabb38b210e89a0dbcc499d2f52932d9aa6254d7691faccffbc1ad" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.1.zip", "has_sig": false, "md5_digest": "76034fbc4d1dcdc236370cd3b69269a4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 53847, "upload_time": "2012-04-18T13:58:50", "url": "https://files.pythonhosted.org/packages/9d/e9/e5080093c988865b5aab89871e975fa0cb07815db1783ca9ceeb2a76dcea/transmogrify.siteanalyser-1.1.zip" } ], "1.2": [ { "comment_text": "", "digests": { "md5": "ea1b8d5af2f1b1bbbd1c6e811e509b88", "sha256": "65774f27ac59df808659eb920f5c2366c50a1d19702a03a1e43ae6c9a63bf193" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.2.zip", "has_sig": false, "md5_digest": "ea1b8d5af2f1b1bbbd1c6e811e509b88", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 57242, "upload_time": "2012-04-25T09:51:45", "url": "https://files.pythonhosted.org/packages/e9/48/822e4e9444a03c1a62371fdad962d19848b305feb22f3903205e51da7b65/transmogrify.siteanalyser-1.2.zip" } ], "1.3": [ { "comment_text": "", "digests": { "md5": 
"e77acc891be00dda0b03702b96820390", "sha256": "2bbd1a7adaf5a69f6ae3e5a7d7b326243a4081b68fda7afac7af959f1ee735d5" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.3.tar.gz", "has_sig": false, "md5_digest": "e77acc891be00dda0b03702b96820390", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 45080, "upload_time": "2012-12-28T04:02:07", "url": "https://files.pythonhosted.org/packages/b5/98/5f48dd64a5fc3e5f04c1c019399700de9b57e912de36aef0b330bc4cfc22/transmogrify.siteanalyser-1.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e77acc891be00dda0b03702b96820390", "sha256": "2bbd1a7adaf5a69f6ae3e5a7d7b326243a4081b68fda7afac7af959f1ee735d5" }, "downloads": -1, "filename": "transmogrify.siteanalyser-1.3.tar.gz", "has_sig": false, "md5_digest": "e77acc891be00dda0b03702b96820390", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 45080, "upload_time": "2012-12-28T04:02:07", "url": "https://files.pythonhosted.org/packages/b5/98/5f48dd64a5fc3e5f04c1c019399700de9b57e912de36aef0b330bc4cfc22/transmogrify.siteanalyser-1.3.tar.gz" } ] }