{ "info": { "author": "qiulimao", "author_email": "qiu_limao@163.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Operating System :: OS Independent", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Topic :: Software Development :: Libraries :: Application Frameworks", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "# webfocus: HTML main-content extraction\n\n## Installation\n```bash\n\n$ pip install webfocus\n\n```\n\n## Usage\n### Command line\n```bash\nUsage: webfocus [OPTIONS] COMMAND [ARGS]...\n\n webfocus system. ---- Powered by qiulimao@2017.03\n\nOptions:\n --help Show this message and exit.\n\nCommands:\n extract Extract the main content from a given URL\n```\nCurrently only the `extract` command is available.\n\n```bash\nUsage: webfocus extract [OPTIONS]\n\n Extract the main content from a given URL\n\nOptions:\n -u, --url TEXT the target url\n -n, --shownoise Output only the noise; defaults to False\n -t, --textonly Output the main content without tags; defaults to False\n --help Show this message and exit.\n```\n### Usage example\n```bash\n$ webfocus extract\nINPUT TARGET URL: enter your URL\n\n>>>> the result is printed with tags\n```\n\n```bash\n$ webfocus extract -t\nINPUT TARGET URL: enter your URL\n\n>>>> the result is printed without tags\n```\n\n### Use in a program\n```\n>>> from webfocus.extractor 
import extract_from_url, extract_from_htmlstring\n>>> info, noise = extract_from_url(YOUR_URL, text_only=False) # given a URL, returns the main content with tags\n\n>>> info, noise = extract_from_htmlstring(YOUR_HTML_STRING, text_only=True) # given an HTML string, returns the main content as plain text\n```\n### Development log\n* 2017.03.02 Works reliably on news pages and similar sites, but extraction from blog-type sites still needs improvement.\n\n### FAQ\n\n#### `Unicode strings with encoding declaration are not supported.`\nCheck whether the URL you are requesting blocks crawlers; the server may have returned an XML file instead.\n\n#### The extracted text seems to be all noise, not the main content\nCheck whether the main content of the page is generated by JavaScript. If it is, it cannot be extracted; OSChina blogs, for example, load their content this way.\n\n### bug report\nemail:qiu_limao@163.com", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/qiulimao", "keywords": "html content extract", "license": "Apache License, Version 2.0", "maintainer": null, "maintainer_email": null, "name": "webfocus", "package_url": "https://pypi.org/project/webfocus/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/webfocus/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/qiulimao" }, "release_url": "https://pypi.org/project/webfocus/0.13/", "requires_dist": null, "requires_python": null, "summary": "extract 
the content from html docs", "version": "0.13" }, "last_serial": 2863563, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "b5a2d5f94768b3029070d0af133a58ad", "sha256": "62bc6b7fb588c8c6f1d52ef548fa5e5c678e5067a529ea4121953d274fa0f95d" }, "downloads": -1, "filename": "webfocus-0.1-py2-none-any.whl", "has_sig": false, "md5_digest": "b5a2d5f94768b3029070d0af133a58ad", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 4500, "upload_time": "2017-03-02T02:49:03", "url": "https://files.pythonhosted.org/packages/05/11/f0bf4c69718371390e1b9de7ef2a03e138e28d16135c25053326fb67ac78/webfocus-0.1-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0d619ce033d2943d4a7e1cc58e0f03d2", "sha256": "1844abed20f001ab145f70b85b45a81fc00e91c9fa2b618168f23dd4a2eeb523" }, "downloads": -1, "filename": "webfocus-0.1.tar.gz", "has_sig": false, "md5_digest": "0d619ce033d2943d4a7e1cc58e0f03d2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2517, "upload_time": "2017-03-02T02:48:59", "url": "https://files.pythonhosted.org/packages/23/40/2f351b7ab4ce62d83ba9d2e046fc0c60c89e36e346e3b9a77ccdc6a13a87/webfocus-0.1.tar.gz" } ], "0.11": [ { "comment_text": "", "digests": { "md5": "a92511b68f85b597839ecf5ea2d0f273", "sha256": "0f22614f7a316e6e5225600bf9862c151e4211792fcf188752bf215c951f1c18" }, "downloads": -1, "filename": "webfocus-0.11-py2-none-any.whl", "has_sig": false, "md5_digest": "a92511b68f85b597839ecf5ea2d0f273", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 8683, "upload_time": "2017-03-02T03:05:05", "url": "https://files.pythonhosted.org/packages/a3/43/c5f05264bc457fd1f0af69c7a9ac7a79e7fb76e61fb4efb9736bf34b3995/webfocus-0.11-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9d846f132c9ba65f9735c7123cb45abb", "sha256": "626552299c3377b45ba24740fb9dffc2564b17232bc0b577c2818a5206ad6907" }, "downloads": -1, "filename": 
"webfocus-0.11.tar.gz", "has_sig": false, "md5_digest": "9d846f132c9ba65f9735c7123cb45abb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5926, "upload_time": "2017-03-02T03:04:59", "url": "https://files.pythonhosted.org/packages/ca/ae/f0beff93b7f63ceeb4b0b30681c6b649d955b670e3cc628745a83a9a1e06/webfocus-0.11.tar.gz" } ], "0.12": [ { "comment_text": "", "digests": { "md5": "1021791b8023ec78b06c758f3275b7f3", "sha256": "da5e047becc68beeccd434fee09eda9e960337af9699c5961bc66de8645cd724" }, "downloads": -1, "filename": "webfocus-0.12-py2-none-any.whl", "has_sig": false, "md5_digest": "1021791b8023ec78b06c758f3275b7f3", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 8732, "upload_time": "2017-03-06T01:11:00", "url": "https://files.pythonhosted.org/packages/1e/2a/e9916b11a68319fb5efc817796fa9f29dee56f8164d09d2986ac9d34070b/webfocus-0.12-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cbf89ce0b206caee26aced8529d9331e", "sha256": "df0d1aa4b4e2531f2934c53fb0f08c7a05e3eb6324652de66f399d910b4c8a8c" }, "downloads": -1, "filename": "webfocus-0.12.tar.gz", "has_sig": false, "md5_digest": "cbf89ce0b206caee26aced8529d9331e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10271, "upload_time": "2017-03-06T01:10:56", "url": "https://files.pythonhosted.org/packages/fc/10/329b9ef998bb16e288efbab75fb16a786e0ef4161dd2a224234012f5032b/webfocus-0.12.tar.gz" } ], "0.13": [ { "comment_text": "", "digests": { "md5": "9dc4ebb0525b493286db15ef6973f661", "sha256": "38a64151281a9c7685fe2d9fbbfbb9e5a3d9f031d3ddb79560a376f00fb54092" }, "downloads": -1, "filename": "webfocus-0.13-py2-none-any.whl", "has_sig": false, "md5_digest": "9dc4ebb0525b493286db15ef6973f661", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 22254, "upload_time": "2017-05-10T06:39:32", "url": 
"https://files.pythonhosted.org/packages/bc/0e/f64b5a97221bc9b1a73d5c459b973eeb3fcb42e26bee4f8727876059f448/webfocus-0.13-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "61653baf640b390043ca49b045e3dd66", "sha256": "85ad9983028a94e00d7968bf50a211a4166d85c73fafb4a8ecf5dac87a2fe43d" }, "downloads": -1, "filename": "webfocus-0.13.tar.gz", "has_sig": false, "md5_digest": "61653baf640b390043ca49b045e3dd66", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19955, "upload_time": "2017-05-10T06:39:29", "url": "https://files.pythonhosted.org/packages/fb/48/4a57a8f19c0d48385e2ccdaafde99d99cd0c14d50517449323b6e4e73fa2/webfocus-0.13.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9dc4ebb0525b493286db15ef6973f661", "sha256": "38a64151281a9c7685fe2d9fbbfbb9e5a3d9f031d3ddb79560a376f00fb54092" }, "downloads": -1, "filename": "webfocus-0.13-py2-none-any.whl", "has_sig": false, "md5_digest": "9dc4ebb0525b493286db15ef6973f661", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 22254, "upload_time": "2017-05-10T06:39:32", "url": "https://files.pythonhosted.org/packages/bc/0e/f64b5a97221bc9b1a73d5c459b973eeb3fcb42e26bee4f8727876059f448/webfocus-0.13-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "61653baf640b390043ca49b045e3dd66", "sha256": "85ad9983028a94e00d7968bf50a211a4166d85c73fafb4a8ecf5dac87a2fe43d" }, "downloads": -1, "filename": "webfocus-0.13.tar.gz", "has_sig": false, "md5_digest": "61653baf640b390043ca49b045e3dd66", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19955, "upload_time": "2017-05-10T06:39:29", "url": "https://files.pythonhosted.org/packages/fb/48/4a57a8f19c0d48385e2ccdaafde99d99cd0c14d50517449323b6e4e73fa2/webfocus-0.13.tar.gz" } ] }