{
"info": {
"author": "Artem Ryzhikov",
"author_email": "",
"bugtrack_url": null,
"classifiers": [
"Intended Audience :: Science/Research",
"Programming Language :: Python :: 3"
],
"description": "# Variational Dropout Sparsifies NN (PyTorch)\n\nMake your neural network 300 times faster!\n\nPyTorch implementation of Variational Dropout Sparsifies Deep Neural Networks ([arxiv:1701.05369](https://arxiv.org/abs/1701.05369)).\n\n## Description\nThe approach makes it possible to train sparse convolutional and dense deep models without significant loss of quality. The Additive Noise Reparameterization and the Local Reparameterization Trick used in the paper remove the restrictions on the weights' prior and yield an Automatic Relevance Determination (ARD) effect on (typically most of) the network's parameters. According to the original paper, the authors reduced the number of parameters by up to 280 times on LeNet architectures and by up to 68 times on VGG-like networks with a negligible decrease in accuracy. Experiments with the Boston dataset in this repository support this: 99% of the parameters of a simple dense model were dropped using the paper's ARD prior without any significant loss in MSE. The technique also significantly reduces overfitting and removes the need to worry about model complexity, since all redundant parameters are dropped automatically. Moreover, you can achieve any degree of regularization by varying the regularization factor (see the ***reg_factor*** variable in the [boston_ard.py](examples/boston/boston_ard.py) and [cifar_ard.py](examples/cifar/cifar_ard.py) scripts).\n\n## Usage\n\n```python\nimport torch_ard as nn_ard\nfrom torch import nn\nimport torch.nn.functional as F\n\ninput_size, hidden_size, output_size = 60, 150, 1\n\n# LinearARD layers are used in place of nn.Linear\nmodel = nn.Sequential(\n    nn_ard.LinearARD(input_size, hidden_size),\n    nn.ReLU(),\n    nn_ard.LinearARD(hidden_size, output_size)\n)\n\nreg_factor = 1.0\n\n# The model above has no final sigmoid, so the logits version of BCE is used here\ndef criterion(output, target):\n    return F.binary_cross_entropy_with_logits(output, target) + reg_factor * nn_ard.get_ard_reg(model)\n\nprint('Sparsification ratio: %.3f%%' % (100. * nn_ard.get_dropped_params_ratio(model)))\n```\n\n## Installation\n\n```\npip install pytorch-ard\n```\n\n## Experiments\n\nAll experiments are located in the [examples](examples/) folder and include a comparison of the baseline and the implemented models.\n\n### Boston dataset\n\nTwo scripts were used in the experiment: [boston_baseline.py](examples/boston/boston_baseline.py) and [boston_ard.py](examples/boston/boston_ard.py). The training procedure for each experiment was **100000 epochs, Adam(lr=1e-3)**. 
The baseline model was a dense neural network with a single hidden layer of size 150.\n\n| | Baseline (nn.Linear) | LinearARD, no reg | LinearARD, reg=0.0001 | LinearARD, reg=0.001 | LinearARD, reg=0.1 | LinearARD, reg=1 |\n|----------------|----------|-------------|-----------------|----------------|--------------|------------|\n| MSE (train) | 1.751 | 1.626 | 1.587 | 1.962 | 17.167 | 33.682 |\n| MSE (test) | 22.580 | 16.229 | 15.957 | 8.416 | 25.695 | 30.231 |\n| Compression, % | 0 | 0.38 | 52.95 | 64.19 | 97.29 | 99.29 |\n\nThe table above shows that any degree of compression can be achieved by varying the regularization factor (for example, ~99.29% of connections are dropped with reg_factor=1). Moreover, training with LinearARD layers and a suitable regularization factor (such as reg=0.001 in the table above) not only significantly reduces the number of model parameters (>64% of parameters are dropped after training), but also significantly improves test quality (test MSE falls from 22.580 to 8.416), reducing overfitting.\n\n## Tips\n\n1. Although the implemented layers perform well in \"end-to-end\" mode, the authors recommend using them to fine-tune models pretrained without the ARD prior. This gives the best performance. It is also faster: although these layers converge at a comparable speed, each training epoch takes about twice as long, because \\*ARD implementations have roughly twice as many parameters. The recommendation is easy to explain: applying the ARD prior at early stages of training can drop useful connections whose dependencies are not yet obvious.\n2. Sparsifying a model gives almost no speed-up until you convert it to a sparse one! (*TODO*)\n\n\n## Requirements\n* **PyTorch** >= 0.4.0\n* **scikit-learn** >= 0.19.1\n* **Pandas** >= 0.23.3\n* **NumPy** >= 1.14.5\n\n## TODO\n- [X] LinearARD layer implementation\n- [X] Conv2dARD layer implementation\n- [ ] Learnable bias for Conv2dARD\n- [ ] Implement *to_sparse(model)* utility\n\n## Authors\n\n```\n@article{molchanov2017variational,\n title={Variational Dropout Sparsifies Deep Neural Networks},\n author={Molchanov, Dmitry and Ashukha, Arsenii and Vetrov, Dmitry},\n journal={arXiv preprint arXiv:1701.05369},\n year={2017}\n}\n```\n[Original implementation](https://github.com/ars-ashuha/variational-dropout-sparsifies-dnn) (Theano/Lasagne)\n\n## Citation\n\n```\n@misc{pytorch_ard,\n author = {Artem Ryzhikov},\n title = {HolyBayes/pytorch_ard},\n url = {https://github.com/HolyBayes/pytorch_ard},\n year = {2018}\n}\n```\n\n## Contacts\n\nArtem Ryzhikov, LAMBDA laboratory, Higher School of Economics, Yandex School of Data Analysis\n\n**E-mail:** artemryzhikoff@yandex.ru\n\n**Linkedin:** https://www.linkedin.com/in/artem-ryzhikov-2b6308103/\n\n**Link:** https://www.hse.ru/org/persons/190912317",
"description_content_type": "text/markdown",
"docs_url": null,
"download_url": "",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/HolyBayes/pytorch_ard",
"keywords": "pytorch,bayesian neural networks,ard,deep learning,neural networks,machine learning",
"license": "",
"maintainer": "",
"maintainer_email": "",
"name": "pytorch-ard",
"package_url": "https://pypi.org/project/pytorch-ard/",
"platform": "",
"project_url": "https://pypi.org/project/pytorch-ard/",
"project_urls": {
"Homepage": "https://github.com/HolyBayes/pytorch_ard"
},
"release_url": "https://pypi.org/project/pytorch-ard/0.2.0/",
"requires_dist": null,
"requires_python": "",
"summary": "Make your PyTorch faster",
"version": "0.2.0"
},
"last_serial": 5155735,
"releases": {
"0.1.0": [
{
"comment_text": "",
"digests": {
"md5": "588f045a3b9af8f105a2b16baa710cc4",
"sha256": "b2020bf62c3c901dfecfa0150a09a59472ffbfd70a8a6b33baf94a4b7b08bd31"
},
"downloads": -1,
"filename": "pytorch_ard-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "588f045a3b9af8f105a2b16baa710cc4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5840,
"upload_time": "2019-02-19T08:39:01",
"url": "https://files.pythonhosted.org/packages/3d/f7/1408ee4bd5ada3e53a042797bc0314912d9d1b063ec9486db8a33b0f687d/pytorch_ard-0.1.0.tar.gz"
}
],
"0.1.1": [
{
"comment_text": "",
"digests": {
"md5": "64538bf633e21d37dd55ffec26367bea",
"sha256": "8ddf9ac488fe7d7ceb65aad3e3e0abb32f5a0eedcbbe31ac6ba103b40a5a70e0"
},
"downloads": -1,
"filename": "pytorch_ard-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "64538bf633e21d37dd55ffec26367bea",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5903,
"upload_time": "2019-02-19T08:47:56",
"url": "https://files.pythonhosted.org/packages/67/ed/0378d584e024cf570c0a58de6c9cda055a66dd4c6c0910651cff20180a42/pytorch_ard-0.1.1.tar.gz"
}
],
"0.2.0": [
{
"comment_text": "",
"digests": {
"md5": "43fe14fc3116d486c909a2385b258f2c",
"sha256": "534484c71a89c7df6658363ebf3569b3fe6c93f0cfe2002542031a8b8bc0afbc"
},
"downloads": -1,
"filename": "pytorch_ard-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "43fe14fc3116d486c909a2385b258f2c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5942,
"upload_time": "2019-04-17T15:24:13",
"url": "https://files.pythonhosted.org/packages/e8/17/0d8ff81a28beae2a573d0ea4dc92fb73125adc70fd864c86ef73567f44e3/pytorch_ard-0.2.0.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "43fe14fc3116d486c909a2385b258f2c",
"sha256": "534484c71a89c7df6658363ebf3569b3fe6c93f0cfe2002542031a8b8bc0afbc"
},
"downloads": -1,
"filename": "pytorch_ard-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "43fe14fc3116d486c909a2385b258f2c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5942,
"upload_time": "2019-04-17T15:24:13",
"url": "https://files.pythonhosted.org/packages/e8/17/0d8ff81a28beae2a573d0ea4dc92fb73125adc70fd864c86ef73567f44e3/pytorch_ard-0.2.0.tar.gz"
}
]
}