{ "info": { "author": "KzXuan", "author_email": "kaizhouxuan@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: Free for non-commercial use", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering :: Artificial Intelligence" ], "description": "## PyTorch - Deep Neural Network - Natural Language Processing\n\nVersion 1.1 by KzXuan\n\n**Contains CNN, RNN and Transformer layers and models implemented by pytorch for classification and sequence labeling tasks in NLP.**\n\n* Newly designed modules.\n* Reduce usage complexity.\n* Use `mask` as the sequence length identifier.\n* Multi-GPU parallel for grid search.\n\n
\n\n### Dependencies\n\nPython 3.5+ and PyTorch 1.2.0+\n\n
\n\n### [Install](https://pypi.org/project/dnnnlp/)\n\n```bash\n> pip install dnnnlp\n```\n\n
\n\n### [API Document](./docs.md) (In Chinese)\n\n * [Layer](./docs.md#Layer) - [layer.py](./dnnnlp/layer.py)\n * [Model](./docs.md#Model) - [model.py](./dnnnlp/model.py)\n * [Execution](./docs.md#Execution) - [exec.py](./dnnnlp/exec.py)\n * [Utility](./docs.md#Utility) - [utils.py](./dnnnlp/utils.py)\n\n
\n\n### Hyperparameters\n\n| Name | Type | Default | Description |\n| ------------- | ----- | ----------- | -------------------------------------------------------------- |\n| n_gpu | int | 1 | The number of GPUs (0 means no GPU acceleration). |\n| space_turbo | bool | True | Speed up training by using more GPU memory. |\n| rand_seed | int | 100 | Random seed. |\n| data_shuffle | bool | True | Shuffle the data for training. |\n| emb_type | str | None | Embedding mode: None, 'const', or 'variable'. |\n| emb_dim | int | 300 | Embedding dimension (or feature dimension). |\n| n_class | int | 2 | Number of target classes. |\n| n_hidden | int | 50 | Number of hidden nodes, or output channels of CNN. |\n| learning_rate | float | 0.01 | Learning rate. |\n| l2_reg | float | 1e-6 | L2 regularization coefficient. |\n| batch_size | int | 32 | Number of samples per batch. |\n| iter_times | int | 30 | Number of training iterations. |\n| display_step | int | 2 | Number of iterations between result reports. |\n| drop_prob | float | 0.1 | Dropout ratio. |\n| eval_metric | str | 'accuracy' | Evaluation metric: 'accuracy', 'macro', 'class1', etc. |\n\n
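The `eval_metric` values name standard classification metrics. As a hedged, numpy-only sketch (these helper functions are illustrative, not the dnnnlp API), 'accuracy' and macro-averaged F1 can be computed as:

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float((y_true == y_pred).mean())

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores ('macro' averaging)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1s = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return float(np.mean(f1s))
```

Since scikit-learn is a declared dependency, the package itself presumably computes these via `sklearn.metrics`; the sketch only makes the averaging explicit.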
\n\n### Usage\n\n```python\nimport numpy as np\n\n# import our modules\nfrom dnnnlp.model import RNNModel\nfrom dnnnlp.exec import default_args, Classify\n\n# load the embedding matrix (placeholder with shape (vocab_size, emb_dim))\nemb_mat = np.empty((10000, 300))\n# load the train data (placeholders with the expected shapes)\ntrain_x = np.empty((800, 50))\ntrain_y = np.empty((800,))\ntrain_mask = np.empty((800, 50))\n# load the test data\ntest_x = np.empty((200, 50))\ntest_y = np.empty((200,))\ntest_mask = np.empty((200, 50))\n\n# get the default arguments\nargs = default_args()\n# modify part of the arguments\nargs.space_turbo = False\nargs.n_hidden = 100\nargs.batch_size = 32\n```\n\n* Classification\n\n```python\n# initialize a model\nmodel = RNNModel(args, emb_mat, bi_direction=False, rnn_type='GRU', use_attention=True)\n# initialize a classifier\nnn = Classify(model, args, train_x, train_y, train_mask, test_x, test_y, test_mask)\n# run training and testing\nevals = nn.train_test(device_id=0)\n```\n\n* Run several times and get the average score.\n\n```python\nfrom dnnnlp.model import CNNModel\nfrom dnnnlp.exec import average_several_run\n\n# initialize a model\nmodel = CNNModel(args, emb_mat, kernel_widths=[2, 3, 4])\n# initialize a classifier\nnn = Classify(model, args, train_x, train_y, train_mask)\n# run the model several times and average the scores\navg_evals = average_several_run(nn.train_test, args, n_times=8, n_paral=4, fold=5)\n```\n\n* Grid search over parameters.\n\n```python\nfrom dnnnlp.model import TransformerModel\nfrom dnnnlp.exec import grid_search\n\n# initialize a model\nmodel = TransformerModel(args, n_layer=12, n_head=8)\n# initialize a classifier\nnn = Classify(model, args, train_x, train_y, train_mask, test_x, test_y, test_mask)\n# set the parameters to search\nparams_search = {'learning_rate': [0.1, 0.01], 'n_hidden': [50, 100]}\n# run grid search\nmax_evals = grid_search(nn, nn.train_test, args, params_search)\n```\n\n* Sequence labeling\n\n```python\nfrom dnnnlp.model import RNNCRFModel\nfrom dnnnlp.exec import default_args, SequenceLabeling\n\n# load the train data (placeholders with the expected shapes)\ntrain_x = np.empty((1000, 50))\ntrain_y = np.empty((1000, 50))\ntrain_mask = np.empty((1000, 50))\n\n# get the default arguments\nargs = default_args()\n# initialize a model\nmodel = RNNCRFModel(args)\n# initialize a labeler\nnn = SequenceLabeling(model, args, train_x, train_y, train_mask)\n# run 5-fold cross-validation\nnn.cross_validation(fold=5)\n```\n\n
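The package uses `mask` as the sequence-length identifier: a 0/1 matrix marking real tokens versus padded positions. A minimal numpy sketch of building such a mask from raw sequence lengths (`build_mask` is a hypothetical helper, not part of dnnnlp):

```python
import numpy as np

def build_mask(lengths, max_len):
    # hypothetical helper: 1 where a real token exists, 0 over padded positions
    idx = np.arange(max_len)
    return (idx[None, :] < np.asarray(lengths)[:, None]).astype(np.int64)

# three sequences of lengths 3, 5, and 1, padded to 50 columns
mask = build_mask([3, 5, 1], max_len=50)
```

A mask built this way has the same shape as the `train_mask`/`test_mask` arrays in the usage examples above.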
\n\n### History\n\n**version 1.1**\n * Add `CRFLayer`: a packaged CRF for both training and testing.\n * Add `RNNCRFModel`: an integrated RNN-CRF sequence labeling model.\n * Add `SequenceLabeling`: a sequence labeling execution module that inherits from `Classify`.\n * Fix errors in judging whether a tensor is None.\n\n**version 1.0**\n * Rename project `dnn` to `dnnnlp`.\n * Remove file `base`, add file `utils`.\n * Optimize and rename `SoftmaxLayer` and `SoftAttentionLayer`.\n * Rewrite and rename `EmbeddingLayer`, `CNNLayer` and `RNNLayer`.\n * Rewrite `MultiheadAttentionLayer`: a packaged attention layer based on `nn.MultiheadAttention`.\n * Rewrite `TransformerLayer`: support the new `MultiheadAttentionLayer`.\n * Optimize and rename `CNNModel`, `RNNModel` and `TransformerModel`.\n * Optimize and rename `Classify`: a highly applicable classification execution module.\n * Rewrite `average_several_run` and `grid_search`: support multi-GPU parallelism.\n * Support PyTorch 1.2.0.\n\n**version 0.12**\n * Update `RNN_layer`: full support for tanh, LSTM, and GRU.\n * Fix errors in some mask operations.\n * Support PyTorch 1.1.0.\n\nOld version [0.12.3](https://github.com/NUSTM/pytorch-dnnnlp/tree/8d2d6c4e432076e13020ae54954aa419f3bb9bce).\n\n**version 0.11**\n * Provide an acceleration method that uses more GPU memory.\n * Fix the memory consumption problem caused by abnormal data reading.\n * Add `multi_head_attention_layer`: packaged multi-head attention for the Transformer.\n * Add `Transformer_layer` and `Transformer_model`: packaged Transformer layer and model written by ourselves.\n * Support data shuffling for training.\n\n**version 0.10**\n * Split the code into four files: `base`, `layer`, `model`, `exec`.\n * Add `CNN_layer` and `CNN_model`: packaged CNN layer and model.\n * Support multi-GPU parallelism for each model.\n\n**version 0.9**\n * Fix output formatting problems.\n * Fix statistical errors in the cross-validation part of `LSTM_classify`.\n * Rename: 
`LSTM_model` to `RNN_layer`, `self_attention` to `self_attention_layer`.\n * Add `softmax_layer`: a packaged fully-connected layer.\n\n**version 0.8**\n * Adjust the applicability of functions in `LSTM_classify` to avoid rewriting in `LSTM_sequence`.\n * Optimize parameter passing.\n * Provide a more complete evaluation mechanism.\n\n**version 0.7**\n * Add `LSTM_sequence`: a sequence labeling module for `LSTM_model`.\n * Fix the NaN-value problem in hierarchical classification.\n * Support PyTorch 1.0.0.\n\n**version 0.6**\n * Update `LSTM_classify`: support hierarchical classification.\n * Merge `GRU_model` into `LSTM_model`.\n * Support CPU-only operation.\n\n**version 0.5**\n * Split the running part of `LSTM_classify` to reduce rewriting for custom models.\n * Add control over verbose output.\n * Create function `average_several_run`: support averaging scores over several training/testing runs.\n * Create function `grid_search`: support grid search over parameters.\n\n**version 0.4**\n * Add `GRU_model`: a packaged GRU model based on `nn.GRU`.\n * Support L2 regularization.\n\n**version 0.3**\n * Add `self_attention`: provide attention-mechanism support.\n * Update `LSTM_classify`: adapt to complex custom models.\n\n**version 0.2**\n * Support mode selection of embedding.\n * Default usage of `nn.Dropout`.\n * Create function `default_args` to provide default hyperparameters.\n\n**version 0.1**\n * Initialization of project `dnn`: based on PyTorch 0.4.1.\n * Add `LSTM_model`: a packaged LSTM model based on `nn.LSTM`.\n * Add `LSTM_classify`: a classification module for the LSTM model, which supports train-test and cross-validation.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/NUSTM/pytorch-dnnnlp", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "dnnnlp", "package_url": 
"https://pypi.org/project/dnnnlp/", "platform": "", "project_url": "https://pypi.org/project/dnnnlp/", "project_urls": { "Homepage": "https://github.com/NUSTM/pytorch-dnnnlp" }, "release_url": "https://pypi.org/project/dnnnlp/1.1.4/", "requires_dist": [ "numpy", "scikit-learn (>=0.21.0)", "torch (>=1.2.0)", "torchcrf (>=0.7.2)" ], "requires_python": "", "summary": "Deep Neural Networks for Natural Language Processing classification or sequential task written by PyTorch.", "version": "1.1.4", "yanked": false, "yanked_reason": null }, "last_serial": 6045757, "releases": { "1.0.3": [ { "comment_text": "", "digests": { "md5": "5cab194cc4aa07c63d40c45021b5b16c", "sha256": "6bb6f712fe83b196bc70964f131d0ae7b1db692779e209cfdd66c1fd72b820b2" }, "downloads": -1, "filename": "dnnnlp-1.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "5cab194cc4aa07c63d40c45021b5b16c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 17395, "upload_time": "2019-08-23T02:39:22", "upload_time_iso_8601": "2019-08-23T02:39:22.447196Z", "url": "https://files.pythonhosted.org/packages/5a/03/e99f234932a81a4dc1d9bf0ac34a6fbff71fc0b2eb60151b65257eeef0fb/dnnnlp-1.0.3-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "d65b29161ccb1183b61b418ee9b3f6e8", "sha256": "636c48d1520c45738207d5211e9e6d9b24faa6bdde63871e1160aa0f40714577" }, "downloads": -1, "filename": "dnnnlp-1.0.3.tar.gz", "has_sig": false, "md5_digest": "d65b29161ccb1183b61b418ee9b3f6e8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14178, "upload_time": "2019-08-23T02:39:25", "upload_time_iso_8601": "2019-08-23T02:39:25.769694Z", "url": "https://files.pythonhosted.org/packages/6c/36/4750b0163e72deed3680d30adbe50077a6414ea3161a2477851422f500dc/dnnnlp-1.0.3.tar.gz", "yanked": false, "yanked_reason": null } ], "1.1.4": [ { "comment_text": "", "digests": { "md5": "0552b2fb489ea97a98d280f0a80ddd93", "sha256": 
"eb03c41ff8ccdb592fc2de29a3b885a5e56e8e214e69a6256348dab29401d408" }, "downloads": -1, "filename": "dnnnlp-1.1.4-py3-none-any.whl", "has_sig": false, "md5_digest": "0552b2fb489ea97a98d280f0a80ddd93", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18344, "upload_time": "2019-10-29T07:50:06", "upload_time_iso_8601": "2019-10-29T07:50:06.755277Z", "url": "https://files.pythonhosted.org/packages/51/2e/545ea7d0cbc648d12e6777ed13c8e012b4c54d028085f1b659f6c98d6b26/dnnnlp-1.1.4-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "5ad7873a3512660103af73e64a30e288", "sha256": "aa37a56ccf089b3450c68d546ca4e9e2bd887c5020fef8076460445077ca7fe3" }, "downloads": -1, "filename": "dnnnlp-1.1.4.tar.gz", "has_sig": false, "md5_digest": "5ad7873a3512660103af73e64a30e288", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15056, "upload_time": "2019-10-29T07:50:08", "upload_time_iso_8601": "2019-10-29T07:50:08.924396Z", "url": "https://files.pythonhosted.org/packages/c1/9c/a5e635017e2ef33f0795c584044842c5a3ca22a403994f469b7b0b3823b7/dnnnlp-1.1.4.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "0552b2fb489ea97a98d280f0a80ddd93", "sha256": "eb03c41ff8ccdb592fc2de29a3b885a5e56e8e214e69a6256348dab29401d408" }, "downloads": -1, "filename": "dnnnlp-1.1.4-py3-none-any.whl", "has_sig": false, "md5_digest": "0552b2fb489ea97a98d280f0a80ddd93", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18344, "upload_time": "2019-10-29T07:50:06", "upload_time_iso_8601": "2019-10-29T07:50:06.755277Z", "url": "https://files.pythonhosted.org/packages/51/2e/545ea7d0cbc648d12e6777ed13c8e012b4c54d028085f1b659f6c98d6b26/dnnnlp-1.1.4-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "5ad7873a3512660103af73e64a30e288", "sha256": 
"aa37a56ccf089b3450c68d546ca4e9e2bd887c5020fef8076460445077ca7fe3" }, "downloads": -1, "filename": "dnnnlp-1.1.4.tar.gz", "has_sig": false, "md5_digest": "5ad7873a3512660103af73e64a30e288", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15056, "upload_time": "2019-10-29T07:50:08", "upload_time_iso_8601": "2019-10-29T07:50:08.924396Z", "url": "https://files.pythonhosted.org/packages/c1/9c/a5e635017e2ef33f0795c584044842c5a3ca22a403994f469b7b0b3823b7/dnnnlp-1.1.4.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }