{ "info": { "author": "michaelshiyu", "author_email": "michaelshiyu3@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy" ], "description": "\n# kerNET\n\nkerNET is a simple, high-level, PyTorch-based API that helps you build kernel machine-powered connectionist models easily. It is based on [PyTorch](http://pytorch.org/) in order to make GPU acceleration possible (just like neural networks, these models require operations on large matrices).\nkerNET is essentially PyTorch plus some extra layers that are building blocks for such models.\nFor convenience, a few higher-level model abstractions are also available, including the MultiLayer Kernel Network (MLKN) proposed in [this paper](https://arxiv.org/abs/1802.03774).\n\nBuilding a network with kernel machines here is as straightforward as building a neural network in PyTorch: you basically just need to read [this great tutorial](http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py).\n\nDependencies:\n- Python 3\n- PyTorch 0.4.0\n\nCurrently, the best way to work with this API is by forking it or via ```git clone```. We hope you enjoy it; any suggestions or contributions would be greatly appreciated!\n\n---------\n\n## MultiLayer Kernel Network\n\nAn MLKN is the equivalent of a fully-connected, feedforward neural network in the universe of kernel machines. 
It works just like a [MultiLayer Perceptron (MLP)](https://en.wikipedia.org/wiki/Multilayer_perceptron) except that under the hood, everything is now being powered by [kernel machines](https://en.wikipedia.org/wiki/Radial_basis_function_network).\n\nIn this repository, you will find a pre-built yet still highly customizable MLKN model. You can easily configure the size of the network and some other features, just as you would with any other high-level neural network API. For training, besides all native methods of PyTorch, we have implemented [the proposed layerwise method](https://arxiv.org/abs/1802.03774) for you so that you only need to specify some hyperparameters in a few lines of code to get things working. Some datasets used in the paper are also readily available for you to test the model out.\n\n### training an MLKN classifier layer-by-layer for [the Iris dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set)\n\nSome imports and data preprocessing to get things ready.\n```python\nimport numpy as np\nfrom sklearn.datasets import load_iris\nfrom sklearn.preprocessing import StandardScaler\nimport torch\nfrom torch.autograd import Variable\n\nx, y = load_iris(return_X_y=True)\n\n# normalize features to 0-mean and unit-variance\nnormalizer = StandardScaler()\nx = normalizer.fit_transform(x)\nn_class = int(np.amax(y) + 1)\n\n# convert numpy data into torch objects\ndtype = torch.FloatTensor\nif torch.cuda.is_available():\n    dtype = torch.cuda.FloatTensor\nX = Variable(torch.from_numpy(x).type(dtype), requires_grad=False)\nY = Variable(torch.from_numpy(y).type(dtype), requires_grad=False)\n\n# randomly permute data\nnew_index = torch.randperm(X.shape[0])\nX, Y = X[new_index], Y[new_index]\n\n# split data evenly into training and test\nindex = len(X)//2\nx_train, y_train = X[:index], Y[:index]\nx_test, y_test = X[index:], Y[index:]\n```\n\nMLKNClassifier is a pre-built MLKN model with layerwise training already configured. 
It is implemented for classification with an arbitrary number of classes.\n```python\nfrom models.mlkn import MLKNClassifier\nmlkn = MLKNClassifier()\n```\n\nLet's implement a two-layer MLKN with 15 kernel machines on the first layer and ```n_class``` kernel machines on the second (because we will use cross-entropy as our loss function later and train the second layer as an RBFN). ```kerLinear``` is a ```torch.nn.Module``` object that represents a layer of kernel machines which use identical Gaussian kernels ```k(x, y) = exp(-||x-y||_2^2 / (2 * sigma^2))```. ```sigma``` controls the kernel width. For ```X``` in the input layer, pass to it the random sample (usually the training set) you would like to center the kernel machines on, i.e., the set ```{x_i}``` in ```f(u) = \u2211_i a_i k(x_i, u) + b```. This set is then an attribute of this layer object and can be accessed as ```layer.X```. For non-input layers, pass to ```X``` the raw data you want to center the kernel machines on. At runtime, for layer ```n```, its ```X``` will be updated correctly to ```F_n-1(...(F_0(layer.X))...)```, where ```F_i``` is the mapping of the ```i```th layer.\n```python\nfrom layers.kerlinear import kerLinear\nmlkn.add_layer(kerLinear(X=x_train, out_dim=15, sigma=5, bias=True))\nmlkn.add_layer(kerLinear(X=x_train, out_dim=n_class, sigma=.1, bias=True))\n```\n\nFor large datasets, it can be impossible to operate on the entire training set due to insufficient memory. In this case, one can trade parallelism for memory sufficiency by breaking the training set into a few smaller subsets and centering a separate ```kerLinear``` object on each subset. This is the same as breaking the Gram matrix into a bunch of submatrices and it will not make any difference to the numerical result. We have implemented a ```kerLinearEnsemble``` class and a helper function ```to_ensemble``` to make this process simpler. 
The script below will result in the same network and same calculations as the earlier method of adding layers. ```layer0``` and ```layer1``` are broken into many smaller networks, each having 30 centers, with weights and biases unchanged.\n```python\nfrom layers.kerlinear import kerLinear\nfrom layers.ensemble import kerLinearEnsemble\nimport backend as K\n\nensemble = True\nbatch_size = 30\n\nlayer0 = kerLinear(X=x_train, out_dim=15, sigma=5, bias=True)\nlayer1 = kerLinear(X=x_train, out_dim=n_class, sigma=.1, bias=True)\n\nif not ensemble:\n    mlkn.add_layer(layer0)\n    mlkn.add_layer(layer1)\n\nelse:\n    # create equivalent ensemble layers so that large datasets can be fitted into memory\n    mlkn.add_layer(K.to_ensemble(layer0, batch_size))\n    mlkn.add_layer(K.to_ensemble(layer1, batch_size))\n```\n\nThen we add an optimizer for each layer. This works with any ```torch.optim.Optimizer```. Each optimizer is in charge of one layer, with the order of addition being the same as the order of layers, i.e., the first-added optimizer would be assigned to the first layer (layer closest to the input). For each optimizer, one can set ```params``` to anything; it will be overridden with the weights of the correct layer automatically when ```fit``` is called, before the network is trained. Let's use [Adam](https://arxiv.org/pdf/1412.6980.pdf) as the optimizer for this example. Note that for PyTorch optimizers, ```weight_decay``` is the l2-norm regularization coefficient.\n```python\nmlkn.add_optimizer(torch.optim.Adam(params=mlkn.parameters(), lr=1e-3, weight_decay=0.1))\nmlkn.add_optimizer(torch.optim.Adam(params=mlkn.parameters(), lr=1e-3, weight_decay=.1))\n```\n\nSpecify a loss function for the output layer. This works with any PyTorch loss function, but let's use ```torch.nn.CrossEntropyLoss``` for this classification task.\n```python\nmlkn.add_loss(torch.nn.CrossEntropyLoss())\n```\n\nFit the model. 
For ```n_epoch```, one should pass a tuple of ```int``` with the first number specifying the number of epochs to train the first layer, etc. ```shuffle``` controls whether the entire dataset is randomly shuffled at each epoch. If ```accumulate_grad``` is ```True```, the weights are updated only once per epoch, using the gradient accumulated over all minibatches in that epoch. If it is set to ```False```, there will be an update per minibatch. Note that parameter ```X``` in ```fit``` is the training set you would like to train your model on, which can potentially be different from the set your kernel machines are centered on (parameter ```X``` when initializing a ```kerLinear``` object).\n```python\nmlkn.fit(\n    n_epoch=(30, 30),\n    batch_size=30,\n    shuffle=True,\n    X=x_train,\n    Y=y_train,\n    n_class=n_class,\n    accumulate_grad=False\n    )\n```\n\nMake a prediction on the test set and print the error rate.\n```python\ny_pred = mlkn.predict(X_test=x_test, batch_size=15)\nerr = mlkn.get_error(y_pred, y_test)\nprint('error rate: {:.2f}%'.format(err.data[0] * 100))\n```\n\nThis example is available at [examples/mlkn_classifier.py](https://github.com/michaelshiyu/kerNET/tree/master/examples). Some more classification datasets are there for you to try the model out.\n\n---------\n\n### training MLKN with backpropagation\n\nIn kerNET, we have also implemented a generic MLKN with maximal freedom for customization. Namely, it does not have greedy training pre-configured, so it is easier to train it with standard backpropagation together with some gradient-based optimization. Further, it is not defined to be a classifier; instead, it is a general-purpose learning machine: whatever you can do with an MLP, you can do with an MLKN.\n\nThe generic MLKN works almost the same as MLKNClassifier. First we instantiate a model.\n```python\nimport torch\nfrom models.mlkn import MLKN\nmlkn = MLKN()\n```\n\nAdding layers is the same as we did for MLKNClassifier. 
Ensemble layers are also supported.\n```python\nfrom layers.kerlinear import kerLinear\nmlkn.add_layer(kerLinear(X=x_train, out_dim=15, sigma=5, bias=True))\nmlkn.add_layer(kerLinear(X=x_train, out_dim=n_class, sigma=.1, bias=True))\n```\n\nFor regression, the dimension of the output layer should be adjusted.\n```python\nmlkn.add_layer(kerLinear(ker_dim=x_train.shape[0], out_dim=y_train.shape[1], sigma=.1, bias=True))\n```\n\nAdd an optimizer. This works with any ```torch.optim.Optimizer```. Unlike in the layerwise training case, here one optimizer is in charge of training the entire network since we are using backpropagation. But of course, this does not mean that all layers have to be trained under exactly the same setting: you could still specify [per-parameter options](http://pytorch.org/docs/master/optim.html) for each layer.\n```python\nmlkn.add_optimizer(torch.optim.Adam(params=mlkn.parameters(), lr=1e-3, weight_decay=0.1))\n```\n\nSpecify a loss function. For classification, ```torch.nn.CrossEntropyLoss``` may be an ideal option, whereas for regression, ```torch.nn.MSELoss``` is a common choice.\n```python\nmlkn.add_loss(torch.nn.CrossEntropyLoss())\n```\nOr, for regression,\n```python\nmlkn.add_loss(torch.nn.MSELoss())\n```\n\nTrain the model and evaluate the output given some test set.\n```python\nmlkn.fit(\n    n_epoch=30,\n    batch_size=30,\n    shuffle=True,\n    X=x_train,\n    Y=y_train,\n    accumulate_grad=True\n    )\n\ny_raw = mlkn.evaluate(X_test=x_test, batch_size=15)\n```\n\nFor classification, one may be interested in the error rate on this test set, whereas for regression, the MSE.\n\nFor classification,\n```python\n_, y_pred = torch.max(y_raw, dim=1)\ny_pred = y_pred.type_as(y_test)\nerr = (y_pred!=y_test).sum().type(torch.FloatTensor).div_(y_test.shape[0])\nprint('error rate: {:.2f}%'.format(err.data[0] * 100))\n```\n\nFor regression,\n```python\nmse = torch.nn.MSELoss()\nprint('mse: {:.4f}'.format(mse(y_raw, y_test).data[0]))\n```\n\nThis example is 
available at [examples/mlkn_generic.py](https://github.com/michaelshiyu/kerNET/tree/master/examples). Some classification and regression datasets are there for you to try the model out.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/michaelshiyu/kerNET.git", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "kerNET", "package_url": "https://pypi.org/project/kerNET/", "platform": "", "project_url": "https://pypi.org/project/kerNET/", "project_urls": { "Homepage": "https://github.com/michaelshiyu/kerNET.git" }, "release_url": "https://pypi.org/project/kerNET/0.1.0/", "requires_dist": [ "pytorch (==0.4.0)" ], "requires_python": ">=3.6.0", "summary": "Connectionist models powered by kernel machines.", "version": "0.1.0" }, "last_serial": 4028598, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "89072bfbc7faec528d5ed3c389e30f37", "sha256": "06648a8ff528c0802fa537f8d982609e169a60078c8d9a05f64084c163e77a47" }, "downloads": -1, "filename": "kerNET-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "89072bfbc7faec528d5ed3c389e30f37", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6.0", "size": 20713, "upload_time": "2018-07-04T03:06:55", "url": "https://files.pythonhosted.org/packages/9c/6b/008ee3af9ad36912b785fcd13f2aaba76e3317be0935b76d8c4400698002/kerNET-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ed2fad343e791ce0b6d4008f7a0cf559", "sha256": "1697c80c67ac8adbd5d60167310ff258ae2d9b0279640ee9cf242f4aa2962c84" }, "downloads": -1, "filename": "kerNET-0.1.0.tar.gz", "has_sig": false, "md5_digest": "ed2fad343e791ce0b6d4008f7a0cf559", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6.0", "size": 14873, "upload_time": "2018-07-04T03:06:57", "url": 
"https://files.pythonhosted.org/packages/40/d7/6bf55058bf390c22cef2fa5196595d3b0c7e185c56396d5e2914d7055ae6/kerNET-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "89072bfbc7faec528d5ed3c389e30f37", "sha256": "06648a8ff528c0802fa537f8d982609e169a60078c8d9a05f64084c163e77a47" }, "downloads": -1, "filename": "kerNET-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "89072bfbc7faec528d5ed3c389e30f37", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6.0", "size": 20713, "upload_time": "2018-07-04T03:06:55", "url": "https://files.pythonhosted.org/packages/9c/6b/008ee3af9ad36912b785fcd13f2aaba76e3317be0935b76d8c4400698002/kerNET-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ed2fad343e791ce0b6d4008f7a0cf559", "sha256": "1697c80c67ac8adbd5d60167310ff258ae2d9b0279640ee9cf242f4aa2962c84" }, "downloads": -1, "filename": "kerNET-0.1.0.tar.gz", "has_sig": false, "md5_digest": "ed2fad343e791ce0b6d4008f7a0cf559", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6.0", "size": 14873, "upload_time": "2018-07-04T03:06:57", "url": "https://files.pythonhosted.org/packages/40/d7/6bf55058bf390c22cef2fa5196595d3b0c7e185c56396d5e2914d7055ae6/kerNET-0.1.0.tar.gz" } ] }