{
    "info": {
        "author": "breezedeus",
        "author_email": "breezedeus@163.com",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 4 - Beta",
            "Intended Audience :: Developers",
            "License :: OSI Approved :: Apache Software License",
            "Operating System :: OS Independent",
            "Programming Language :: Python",
            "Programming Language :: Python :: 3",
            "Programming Language :: Python :: 3.4",
            "Programming Language :: Python :: 3.5",
            "Programming Language :: Python :: 3.6",
            "Programming Language :: Python :: Implementation",
            "Topic :: Software Development :: Libraries"
        ],
        "description": "\u4e2d\u6587\u7248\u8bf4\u660e\u8bf7\u89c1[\u4e2d\u6587README](./README_cn.md)\u3002\n\n\n\n# Update 2019.07.25: release cnocr V1.0.0\n\n`cnocr` `v1.0.0` has been released, and prediction is now more efficient. **Models from the new version are not compatible with previous versions.** If you upgrade, please download the latest model files again. See below for details (the procedure is the same as before).\n\n\n\nMain changes:\n\n-  **The new CRNN model supports prediction on variable-width image files, which makes prediction more efficient.**\n-  Support fine-tuning an existing model on specific data.\n-  Fix bugs, such as `train accuracy` always being `0`.\n-  The `mxnet` dependency is upgraded from `1.3.1` to `1.4.1`.\n\n\n\n# cnocr\n\nA Python package for Chinese OCR with pre-trained models available,\nso it can be used directly after installation.\n\nThe accuracy of the current CRNN model is about `98.8%`.\n\nThe project originates from our own ([\u7231\u56e0\u4e92\u52a8 Ein+](https://einplus.cn)) internal needs.\nThanks for the internal support.\n\n## Changes\n\nMost of the code is adapted from [crnn-mxnet-chinese-text-recognition](https://github.com/diaomin/crnn-mxnet-chinese-text-recognition).\nMany thanks to the author.\n\nSome changes are:\n\n* Use the native MXNet CTC Loss instead of WarpCTC Loss. No more complicated installation.\n* Publish a pre-trained model for everyone. No more training for days.\n* Add an online `predict` function and script. Easy to use.\n\n## Installation\n\n```bash\npip install cnocr\n```\n\n> Please use Python 3 (3.4, 3.5, and 3.6 should work). Python 2 is not tested.\n\n## Usage\n\nThe first time cnocr is used, the model files are downloaded automatically from \n[Dropbox](https://www.dropbox.com/s/7w8l3mk4pvkt34w/cnocr-models-v1.0.0.zip?dl=0) to `~/.cnocr`. 
\n\nThe zip file is extracted automatically, and the resulting model files are placed in `~/.cnocr/models` by default.\nIf the automatic download fails, you can download the zip file manually \nfrom [Baidu NetDisk](https://pan.baidu.com/s/1DWV3H2UWmzOU6d48UbTYVw) with extraction code `ss81`, and put the zip file in `~/.cnocr`. The code will handle the rest.\n\n\n\n### Predict\n\nThree functions are provided for prediction.\n\n\n\n#### 1. `CnOcr.ocr(img_fp)`\n\nThe function `CnOcr.ocr(img_fp)` recognizes text in an image containing multiple lines of text (or a single line).\n\n\n\n**Function Description**\n\n- input parameter `img_fp`: an image file path, or a color image as `mx.nd.NDArray` or `np.ndarray` with shape `(height, width, 3)` and channels in RGB order.\n- return: `List(List(Char))`, such as: `[['\u7b2c', '\u4e00', '\u884c'], ['\u7b2c', '\u4e8c', '\u884c'], ['\u7b2c', '\u4e09', '\u884c']]`.\n\n\n\n\n**Usage Case**\n\n\n```python\nfrom cnocr import CnOcr\nocr = CnOcr()\n# Pass the image file path directly\nres = ocr.ocr('examples/multi-line_cn1.png')\nprint(\"Predicted Chars:\", res)\n```\n\nor:\n\n```python\nimport mxnet as mx\nfrom cnocr import CnOcr\nocr = CnOcr()\nimg_fp = 'examples/multi-line_cn1.png'\n# Load the image as a color (RGB) NDArray, then pass the array\nimg = mx.image.imread(img_fp, 1)\nres = ocr.ocr(img)\nprint(\"Predicted Chars:\", res)\n```\n\nThe code above recognizes the text in the image file [examples/multi-line_cn1.png](./examples/multi-line_cn1.png):\n\n![examples/multi-line_cn1.png](./examples/multi-line_cn1.png)\n\nThe OCR results should be:\n\n```bash\nPredicted Chars: [['\u7f51', '\u7edc', '\u652f', '\u4ed8', '\u5e76', '\u65e0', '\u672c', '\u8d28', '\u7684', '\u533a', '\u522b', '\uff0c', '\u56e0', '\u4e3a'],\n                  ['\u6bcf', '\u4e00', '\u4e2a', '\u624b', '\u673a', '\u53f7', '\u7801', '\u548c', '\u90ae', '\u4ef6', '\u5730', '\u5740', '\u80cc', '\u540e'],\n                  ['\u90fd', '\u4f1a', '\u5bf9', '\u5e94', '\u7740', '\u4e00', '\u4e2a', '\u8d26', '\u6237', '\u4e00', '\u2015', '\u8fd9', 
'\u4e2a', '\u8d26'],\n                  ['\u6237', '\u53ef', '\u4ee5', '\u662f', '\u4fe1', '\u7528', '\u5361', '\u8d26', '\u6237', '\u3001', '\u501f', '\u8bb0', '\u5361', '\u8d26'],\n                  ['\u6237', '\uff0c', '\u4e5f', '\u5305', '\u62ec', '\u90ae', '\u5c40', '\u6c47', '\u6b3e', '\u3001', '\u624b', '\u673a', '\u4ee3'],\n                  ['\u6536', '\u3001', '\u7535', '\u8bdd', '\u4ee3', '\u6536', '\u3001', '\u9884', '\u4ed8', '\u8d39', '\u5361', '\u548c', '\u70b9', '\u5361'],\n                  ['\u7b49', '\u591a', '\u79cd', '\u5f62', '\u5f0f', '\u3002']]\n```\n\n#### 2. `CnOcr.ocr_for_single_line(img_fp)`\n\nIf you know that the image contains only one line of text, use `CnOcr.ocr_for_single_line(img_fp)` instead. Compared with `CnOcr.ocr()`, the result of `CnOcr.ocr_for_single_line()` is more reliable because no line splitting is required.\n\n\n\n**Function Description**\n\n- input parameter `img_fp`: an image file path, or an image as `mx.nd.NDArray` or `np.ndarray` with shape `[height, width]` or `[height, width, channel]`.  
The optional channel should be `1` (gray image) or `3` (color image).\n- return: `List(Char)`, such as: `['\u4f60', '\u597d']`.\n\n\n\n**Usage Case**\n\n```python\nfrom cnocr import CnOcr\nocr = CnOcr()\nres = ocr.ocr_for_single_line('examples/rand_cn1.png')\nprint(\"Predicted Chars:\", res)\n```\n\nor:\n\n```python\nimport mxnet as mx\nfrom cnocr import CnOcr\nocr = CnOcr()\nimg_fp = 'examples/rand_cn1.png'\nimg = mx.image.imread(img_fp, 1)\nres = ocr.ocr_for_single_line(img)\nprint(\"Predicted Chars:\", res)\n```\n\n\nThe code above recognizes the text in the image file [examples/rand_cn1.png](./examples/rand_cn1.png):\n\n![examples/rand_cn1.png](./examples/rand_cn1.png)\n\nThe OCR results should be:\n\n```bash\nPredicted Chars: ['\u7b20', '\u6de1', '\u563f', '\u9a85', '\u8c27', '\u9f0e', '\u81ed', '\u59da', '\u6b7c', '\u8822', '\u9a7c', '\u8033', '\u88d4', '\u631d', '\u6daf', '\u72d7', '\u84bd', '\u5b50', '\u72b7'] \n```\n\n#### 3. `CnOcr.ocr_for_single_lines(img_list)`\n\nThe function `CnOcr.ocr_for_single_lines(img_list)` predicts a batch of single-line text image arrays at once. In fact, `CnOcr.ocr(img_fp)` and `CnOcr.ocr_for_single_line(img_fp)` both invoke `CnOcr.ocr_for_single_lines(img_list)` internally.\n\n\n\n**Function Description**\n\n- input parameter `img_list`: a list of images, in which each element should be a line image array of type `mx.nd.NDArray` or `np.ndarray`.  Each element should be a tensor with values ranging from `0` to `255`, and with shape `[height, width]` or `[height, width, channel]`.  
The optional channel should be `1` (gray image) or `3` (color image).\n- return: `List(List(Char))`, such as: `[['\u7b2c', '\u4e00', '\u884c'], ['\u7b2c', '\u4e8c', '\u884c'], ['\u7b2c', '\u4e09', '\u884c']]`.\n\n\n\n**Usage Case**\n\n```python\nimport mxnet as mx\nfrom cnocr import CnOcr\n# Import path assumed from the package layout; see tests/test_cnocr.py\nfrom cnocr.line_split import line_split\nocr = CnOcr()\nimg_fp = 'examples/multi-line_cn1.png'\nimg = mx.image.imread(img_fp, 1).asnumpy()\n# Split the multi-line image into single-line images\nline_imgs = line_split(img, blank=True)\nline_img_list = [line_img for line_img, _ in line_imgs]\nres = ocr.ocr_for_single_lines(line_img_list)\nprint(\"Predicted Chars:\", res)\n```\n\nMore usage cases can be found at [tests/test_cnocr.py](./tests/test_cnocr.py).\n\n\n### Using the Script\n\n```bash\npython scripts/cnocr_predict.py --file examples/multi-line_cn1.png\n```\n\n\n\n### (Not Necessary) Train\n\nYou can use the package without any training. But if you really want to train your own models, run:\n\n```bash\npython scripts/cnocr_train.py --cpu 2 --num_proc 4 --loss ctc --dataset cn_ocr\n```\n\n\n\nFine-tuning an existing model on specific data is also supported. Please refer to the following command:\n\n```bash\npython scripts/cnocr_train.py --cpu 2 --num_proc 4 --loss ctc --dataset cn_ocr --load_epoch 20\n```\n\n\n\nMore examples can be found in [scripts/run_cnocr_train.sh](./scripts/run_cnocr_train.sh).\n\n\n\n## Future Work\n\n* [x] Support multi-line text recognition (`Done`)\n* [x] CRNN model supports prediction for variable-width image files (`Done`)\n* [x] Add Unit Tests (`Doing`)\n* [x] Bugfixes (`Doing`)\n* [ ] Support space recognition (tried, but not successful for now)\n* [ ] Try other models such as DenseNet, ResNet\n\n\n",
        "description_content_type": "text/markdown",
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/breezedeus/cnocr",
        "keywords": "",
        "license": "Apache 2.0",
        "maintainer": "",
        "maintainer_email": "",
        "name": "cnocr",
        "package_url": "https://pypi.org/project/cnocr/",
        "platform": "Mac",
        "project_url": "https://pypi.org/project/cnocr/",
        "project_urls": {
            "Homepage": "https://github.com/breezedeus/cnocr"
        },
        "release_url": "https://pypi.org/project/cnocr/1.0.0/",
        "requires_dist": [
            "numpy (<1.15.0,>=1.14.0)",
            "pillow (>=5.3.0)",
            "mxnet (<1.5.0,>=1.4.1)",
            "gluoncv (<0.4.0,>=0.3.0)"
        ],
        "requires_python": "",
        "summary": "Package for Chinese OCR with pre-trained models, usable right after installation without training your own OCR model",
        "version": "1.0.0"
    },
    "last_serial": 5580940,
    "releases": {
        "0.1.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "17778ce84a31339b349a8bf41195efad",
                    "sha256": "7d5754f9bdbd93e283e6893b9153f5b224fb07787f28ec2b887b51809fc57e41"
                },
                "downloads": -1,
                "filename": "cnocr-0.1.1-py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "17778ce84a31339b349a8bf41195efad",
                "packagetype": "bdist_wheel",
                "python_version": "py3",
                "requires_python": null,
                "size": 32078,
                "upload_time": "2019-03-27T15:22:15",
                "url": "https://files.pythonhosted.org/packages/f5/f8/4da355ec579d61b756ab1bd355b78cbc7697e1c4f5fc1b9dec8057737325/cnocr-0.1.1-py3-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "48fce165f81dda0461a015c17d69c2ac",
                    "sha256": "d58adb8d340c55a9bce7e54ed07985f3f3449b7880ad3bae4baf0a6d21ced58d"
                },
                "downloads": -1,
                "filename": "cnocr-0.1.1.tar.gz",
                "has_sig": false,
                "md5_digest": "48fce165f81dda0461a015c17d69c2ac",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 16715,
                "upload_time": "2019-03-27T15:22:18",
                "url": "https://files.pythonhosted.org/packages/38/e9/84fc884b33b87ea8d376395db804c33149d9edc0887d78d623b50f6b796e/cnocr-0.1.1.tar.gz"
            }
        ],
        "0.2.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "5b90bc2afaf1f061e29fc300c466ec3e",
                    "sha256": "dbc50f9c3bf5c594a666bd09ebd99f01d3d2d2dbc48011bc611c53fa13d205e4"
                },
                "downloads": -1,
                "filename": "cnocr-0.2.0-py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "5b90bc2afaf1f061e29fc300c466ec3e",
                "packagetype": "bdist_wheel",
                "python_version": "py3",
                "requires_python": null,
                "size": 35634,
                "upload_time": "2019-04-07T07:02:56",
                "url": "https://files.pythonhosted.org/packages/49/b7/a9d383d87e892721683042bcb18128eae7435bc5e379f7f13c3d1591c64e/cnocr-0.2.0-py3-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "293806c10cb27dce17054b96d80da435",
                    "sha256": "7f736b33a29cf7ccfd72e1c27e49fc68bfd9e2129480c08218f4f815a0b20f14"
                },
                "downloads": -1,
                "filename": "cnocr-0.2.0.tar.gz",
                "has_sig": false,
                "md5_digest": "293806c10cb27dce17054b96d80da435",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 19386,
                "upload_time": "2019-04-07T07:03:47",
                "url": "https://files.pythonhosted.org/packages/1a/d7/2156d29de187f00ec27551c8d6bd798f2b1c4e001f82a21e0c8fc2b1c489/cnocr-0.2.0.tar.gz"
            }
        ],
        "1.0.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "457c351be21949edf115cf60726fd87c",
                    "sha256": "a6ba6fb8a94e851f847463e93107ef8ccffb431a26e1c734e71616fd7db5893f"
                },
                "downloads": -1,
                "filename": "cnocr-1.0.0-py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "457c351be21949edf115cf60726fd87c",
                "packagetype": "bdist_wheel",
                "python_version": "py3",
                "requires_python": null,
                "size": 39255,
                "upload_time": "2019-07-25T03:09:31",
                "url": "https://files.pythonhosted.org/packages/8b/2a/86464f97dee48b691abc0c3e3f2c85602462645d4f5f062b0789087b4ea4/cnocr-1.0.0-py3-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "192c2706f1a3808148ff14c3adfad6a0",
                    "sha256": "95eaef5e83b4f49beea7a072c155ce34eddad3bd163372dc29e958bdb6e4b66a"
                },
                "downloads": -1,
                "filename": "cnocr-1.0.0.tar.gz",
                "has_sig": false,
                "md5_digest": "192c2706f1a3808148ff14c3adfad6a0",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 23678,
                "upload_time": "2019-07-25T03:09:37",
                "url": "https://files.pythonhosted.org/packages/90/22/5b396ba294d947e3652ff7140def3660a9e60782c368541d063b3e9a944a/cnocr-1.0.0.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "457c351be21949edf115cf60726fd87c",
                "sha256": "a6ba6fb8a94e851f847463e93107ef8ccffb431a26e1c734e71616fd7db5893f"
            },
            "downloads": -1,
            "filename": "cnocr-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "457c351be21949edf115cf60726fd87c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 39255,
            "upload_time": "2019-07-25T03:09:31",
            "url": "https://files.pythonhosted.org/packages/8b/2a/86464f97dee48b691abc0c3e3f2c85602462645d4f5f062b0789087b4ea4/cnocr-1.0.0-py3-none-any.whl"
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "192c2706f1a3808148ff14c3adfad6a0",
                "sha256": "95eaef5e83b4f49beea7a072c155ce34eddad3bd163372dc29e958bdb6e4b66a"
            },
            "downloads": -1,
            "filename": "cnocr-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "192c2706f1a3808148ff14c3adfad6a0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 23678,
            "upload_time": "2019-07-25T03:09:37",
            "url": "https://files.pythonhosted.org/packages/90/22/5b396ba294d947e3652ff7140def3660a9e60782c368541d063b3e9a944a/cnocr-1.0.0.tar.gz"
        }
    ]
}