{ "info": { "author": "Michal Dul, Piotr Przetacznik, Krzysztof Strojny", "author_email": "piotr.przetacznik@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2.7", "Topic :: Utilities" ], "description": "patent-parsing-tools\n====================\n\n## System requirements:\n\n```Bash\nsudo yum install python-devel libxslt-devel libxml2-devel\n```\n\n## Python requirements:\n\n```Bash\npip install -r requirements.txt\n```\n\n## Running:\n\nCollecting and serializing data:\n```Bash\npython -m patent_parsing_tools.supervisor [working_directory] [train_destination] [test_destination] [year_from] [year_to]\n```\n\nEg.\n```Bash\npython -m patent_parsing_tools.supervisor patents/working_directory patents/train_destination patents/test_destination 2014 2015\n```\n\nGenerating dictionary with train set:\n```Bash\npython -m patent_parsing_tools.bow.dictionary_maker [train_directory] [max_parsed_patents] [dict_max_size] [dictionary_name]\n```\n\nEg.\n```Bash\npython -m patent_parsing_tools.bow.dictionary_maker patents/train_destination 1000000000 4096 dictionary.txt\n```\n\nGenerate bag of words with train set and test set:\n```Bash\npython -m patent_parsing_tools.bow.bag_of_words [directory_with_serialized_patents] [destination_directory] [dictionary.txt] [package_size > 1024]\n```\n\nEg.\n```Bash\npython -m patent_parsing_tools.bow.bag_of_words patents/train_destination patents/final_dataset_train dictionary.txt 1048576\npython -m patent_parsing_tools.bow.bag_of_words patents/test_destination patents/final_dataset_test dictionary.txt 1048576\n```\n\n## Running tests\n\n```Bash\npython -m unittest discover .\n```", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bitbucket.org/ml-patents/patent-parsing-tools", "keywords": "deeplearning dbn rbm rsm backpropagation precission recall", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "patent-parsing-tools", "package_url": "https://pypi.org/project/patent-parsing-tools/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/patent-parsing-tools/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://bitbucket.org/ml-patents/patent-parsing-tools" }, "release_url": "https://pypi.org/project/patent-parsing-tools/0.9.1/", "requires_dist": null, "requires_python": null, "summary": "patent-parsing-tools is a library providing tools for generating training and test set from Google's USPTO data helpful with for testing machine learning algorithms", "version": "0.9.1" }, "last_serial": 1811936, "releases": { "0.9": [ { "comment_text": "", "digests": { "md5": "d039be522f2247a30cbf34c26df70a79", "sha256": "ad72ac60f7d67a212f0533bfb9d347f22500a3bb038e91a04ae9bb97737e1e0e" }, "downloads": -1, "filename": "patent-parsing-tools-0.9.tar.gz", "has_sig": false, "md5_digest": "d039be522f2247a30cbf34c26df70a79", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 82389, "upload_time": "2015-10-22T16:07:39", "url": "https://files.pythonhosted.org/packages/95/28/4bf865a9c1ea9ad9f887b2c3a7775b919097b07777a38eeca850c1909ebd/patent-parsing-tools-0.9.tar.gz" } ], "0.9.1": [] }, "urls": [] }