{ "info": { "author": "Wu Zhen Zhou", "author_email": "hyciswu@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "Mozi\n=====\n\nMozi is built on Theano with a clean and sharp design. The **design philosophy** of Mozi:\n\n1. **Fast and Simple**: The main engine of the package is only 200 lines of code. There is only one fully compiled graph for training, which ensures that all data manipulation happens in one pass through the pipeline and makes the package very fast.\n2. **Highly Modular**: Building a model in Mozi is like building a house with Lego: you can design any layer you can imagine and stack layers together easily.\n3. **Model Abstracted from Training**: To facilitate deployment of a trained model for real use, the model is abstracted away from the training module and kept as minimal as possible. The objective is to allow real-time deployment and easy model exchange.\n4. **Logging System**: Mozi provides a full logging feature that allows the user to log the training results and the hyperparameters to a database for a panoramic overview. It also allows automatic saving of the best model and logging of all training outputs for easy later analysis.\n\n---\n### Install\n\nFirst you need to install [theano](https://github.com/Theano/Theano)\n\nThen install mozi via pip for the bleeding-edge version\n```bash\nsudo pip install git+https://github.com/hycis/Mozi.git@master\n```\nor simply clone and add to `PYTHONPATH`.\n```bash\ngit clone https://github.com/hycis/Mozi.git\nexport PYTHONPATH=/path/to/Mozi:$PYTHONPATH\n```\nFor the install to persist, add `export PYTHONPATH=/path/to/Mozi:$PYTHONPATH` to your `.bashrc` on Linux or\n`.bash_profile` on macOS. 
While this method works, you will have to ensure that\nall the dependencies in [setup.py](setup.py) are installed.\n\n---\n### Set Environment\nIn Mozi, we need to set three environment variables\n* *MOZI_DATA_PATH*\n* *MOZI_SAVE_PATH*\n* *MOZI_DATABASE_PATH*\n\n`MOZI_DATA_PATH` is the directory for saving and loading the datasets. \n`MOZI_SAVE_PATH` is the directory for saving all the models, the output log and epoch error. \n`MOZI_DATABASE_PATH` is the directory for saving the database that contains tables recording the hyperparameters and test errors for each training job.\n\n---\n### Let's Have Fun!\nBuilding a model in Mozi is as simple as\n\n```python\nimport theano.tensor as T\nfrom mozi.model import Sequential\nmodel = Sequential(input_var=T.matrix(), output_var=T.matrix())\n```\nThe `input_var` is the input tensor variable, whose number of dimensions matches that of the input data. The `output_var` is the tensor variable that corresponds to the target of the dataset. Use `T.matrix` for 2D data, `T.tensor3` for 3D data and `T.tensor4` for 4D data. 
Next, add the layers\n\n```python\nfrom mozi.layers.linear import Linear\nfrom mozi.layers.activation import RELU, Softmax\nfrom mozi.layers.noise import Dropout\n\nmodel.add(Linear(prev_dim=28*28, this_dim=200))\nmodel.add(RELU())\nmodel.add(Linear(prev_dim=200, this_dim=100))\nmodel.add(RELU())\nmodel.add(Dropout(0.5))\nmodel.add(Linear(prev_dim=100, this_dim=10))\nmodel.add(Softmax())\n```\nTo train the model, first build a dataset and a learning method. Here we use the MNIST dataset and SGD\n```python\nfrom mozi.datasets.mnist import Mnist\nfrom mozi.learning_method import SGD\n\ndata = Mnist(batch_size=64, train_valid_test_ratio=[5,1,1])\nlearning_method = SGD(learning_rate=0.1, momentum=0.9, lr_decay_factor=0.9, decay_batch=10000)\n```\nFinally, build a training object and pass everything in to train the model\n```python\nfrom mozi.train_object import TrainObject\nfrom mozi.cost import mse, error\n\ntrain_object = TrainObject(model = model,\n                           log = None,\n                           dataset = data,\n                           train_cost = mse,\n                           valid_cost = error,\n                           learning_method = learning_method,\n                           stop_criteria = {'max_epoch' : 10,\n                                            'epoch_look_back' : 5,\n                                            'percent_decrease' : 0.01}\n                           )\ntrain_object.setup()\ntrain_object.run()\n```\n#### Stopping Criteria\n```python\nstop_criteria = {'max_epoch' : 10,\n                 'epoch_look_back' : 5,\n                 'percent_decrease' : 0.01}\n```\nWith these stopping criteria, training stops when it reaches the `'max_epoch'` of 10, or when the validation error has not decreased by at least 1% over the past 5 epochs.\n#### Test Model\nAnd that's it! 
Once training is done, testing the model is as simple as calling the forward propagation `fprop(X)` on the model\n```python\nimport numpy as np\n\nypred = model.fprop(data.get_test().X)\nypred = np.argmax(ypred, axis=1)\ny = np.argmax(data.get_test().y, axis=1)\naccuracy = np.equal(ypred, y).astype('f4').sum() / len(y)\nprint('test accuracy: {}'.format(accuracy))\n```\n---\n### More Examples\nMozi can be used to build effectively any kind of architecture. Below are a few more examples\n* [**Convolutional Neural Network**](doc/cnn.md)\n* [**Denoising Autoencoder**](doc/dae.md)\n* [**Variational Autoencoder**](doc/vae.md)\n* [**Alexnet**](example/voc_alexnet.py)\n\n---\n### Layer Template\nTo build a layer for Mozi, the layer has to implement the template\n```python\nclass Template(object):\n    \"\"\"\n    DESCRIPTION:\n        The interface to be implemented by any layer.\n    \"\"\"\n    def __init__(self):\n        self.params = [] # all params that need to be updated by training go into this list\n\n    def _test_fprop(self, state_below):\n        # the testing track, whereby no params update is performed after data flows through this track\n        raise NotImplementedError()\n\n    def _train_fprop(self, state_below):\n        # the training track, whereby params are updated every time data flows through this track\n        raise NotImplementedError()\n\n    def _layer_stats(self, state_below, layer_output):\n        # calculate everything you want to know about the layer, the input, the output,\n        # the weight and put in the return list in the format [('W max', T.max(W)), ('W min', T.min(W))].\n        # This method provides a peek into the layer and is useful for debugging.\n        return []\n```\nEach layer provides two tracks: a training track and a testing track. During training, the model calls `_train_fprop` in every layer, and the output from the model is used to update the params in `self.params` of each layer. 
During testing, `_test_fprop` is called in every layer and the output is used to evaluate the model and to judge whether the model should stop training based on the stopping criteria set in the `TrainObject`. We can also peek into each layer by putting whatever we want to know about the layer into `_layer_stats`. For example, if we want to know the maximum weight in a layer, we can compute `T.max(W)` and return `[('W max', T.max(W))]` from `_layer_stats`, so that after every epoch, `'W max'` is calculated for that layer and printed to the screen.\n\n---\n### Data Interface\nMozi provides two data interfaces: one for datasets small enough to fit entirely in memory [(SingleBlock)](mozi/datasets/dataset.py#L99), and another for large datasets that cannot fit in memory in one go and have to be broken up into blocks and loaded into training one block at a time [(DataBlocks)](mozi/datasets/dataset.py#L185). \nCheck out the [Mnist](mozi/datasets/mnist.py) or [Cifar10](mozi/datasets/cifar10.py) examples to see how to build a dataset.\n\n#### Using SingleBlock Directly\nBesides subclassing `SingleBlock` or `DataBlocks` to create a dataset as in the Mnist and Cifar10 examples, we can use `SingleBlock` or `DataBlocks` directly to build a dataset as simply as\n```python\nfrom mozi.datasets.dataset import SingleBlock\nfrom mozi.train_object import TrainObject\nimport numpy as np\nX = np.random.rand(1000, 5)\ny = np.random.rand(1000, 3)\ndata = SingleBlock(X=X, y=y, batch_size=100, train_valid_test_ratio=[3,1,1])\n# pass the dataset in together with the other TrainObject arguments shown earlier\ntrain_object = TrainObject(dataset = data)\n```\n\n#### For Large Dataset Trained in Blocks\nFor a large dataset that cannot fit into memory, you can use `DataBlocks` to train block by block. Below is an example for demonstration. 
You can also run the [Example](example/datablocks_exp.py)\n```python\nfrom mozi.datasets.dataset import DataBlocks\nfrom mozi.train_object import TrainObject\nimport numpy as np\n\n# we have two blocks of 1000 images each saved\n# as ('X1.npy', 'y1.npy') and ('X2.npy', 'y2.npy')\nX1 = np.random.rand(1000, 3, 32, 32)\ny1 = np.random.rand(1000, 10)\nwith open('X1.npy', 'wb') as xin, open('y1.npy', 'wb') as yin:\n    np.save(xin, X1)\n    np.save(yin, y1)\n\nX2 = np.random.rand(1000, 3, 32, 32)\ny2 = np.random.rand(1000, 10)\nwith open('X2.npy', 'wb') as xin, open('y2.npy', 'wb') as yin:\n    np.save(xin, X2)\n    np.save(yin, y2)\n\n# now we can create the data by putting the paths\n# ('X1.npy', 'y1.npy') and ('X2.npy', 'y2.npy') into DataBlocks\n# during training, the TrainObject will load and unload one block at a time\ndata = DataBlocks(data_paths=[('X1.npy', 'y1.npy'), ('X2.npy', 'y2.npy')],\n                  batch_size=100, train_valid_test_ratio=[3,1,1])\ntrain_object = TrainObject(dataset = data)\n```\n\n---\n### Logging\nMozi provides a logging module for automatically saving the best model and logging the errors for each epoch.\n\n\n```python\nfrom mozi.log import Log\n\nlog = Log(experiment_name = 'MLP',\n          description = 'This is a tutorial',\n          save_outputs = True, # log all the outputs from the screen\n          save_model = True, # save the best model\n          save_epoch_error = True, # log error at every epoch\n          save_to_database = {'name': 'Example.db',\n                              'records': {'Batch_Size': data.batch_size,\n                                          'Learning_Rate': learning_method.learning_rate,\n                                          'Momentum': learning_method.momentum}}\n          ) # end log\n```\nThe log module allows logging of outputs from the screen and saving of the best model and the epoch errors. It also allows recording of hyperparameters to the database using the `save_to_database` argument, which takes a dictionary with two fields, `'name'` and `'records'`. `'name'` is the name of the database in which the recording table is saved. 
The recording table is named after the experiment name given in the `experiment_name` argument. The `'records'` field takes a dictionary of any number of hyperparameters that we want to record. `'records'` only accepts primitive data types (str, int, float).\nOnce the log object is built, it can be passed into `TrainObject` as\n```python\nTrainObject(log = log)\n```\n\n---\n### Load Saved Model\nWhen we set `save_model` in `Log` to true, the best model is automatically saved to `MOZI_SAVE_PATH`. The model is serialized in pickle format, so we can load the saved model with `cPickle`.\n```python\nimport cPickle\nwith open('model.pkl', 'rb') as fin:\n    model = cPickle.load(fin)\n    y = model.fprop(X) # X is your input array\n```\n---\n\n### Why Mozi?\n[Mozi](https://en.wikiquote.org/wiki/Mozi) (\u58a8\u5b50) (470 B.C - 391 B.C) was a Chinese philosopher of the Warring States period (\u6625\u79cb\u6230\u570b). His philosophy advocates peace, simplicity, universal love and pragmatism.\n\n---\n### Licence\nMIT Licence", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/hycis/Mozi", "keywords": null, "license": "The MIT License (MIT), see LICENCE", "maintainer": null, "maintainer_email": null, "name": "mozi", "package_url": "https://pypi.org/project/mozi/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/mozi/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/hycis/Mozi" }, "release_url": "https://pypi.org/project/mozi/2.0.3/", "requires_dist": null, "requires_python": null, "summary": "Deep learning package based on theano for building all kinds of models", "version": "2.0.3" }, "last_serial": 2370329, "releases": { "2.0.3": [ { "comment_text": "", "digests": { "md5": "21e4bb91b87e1d5147a94c5ba49fa767", "sha256": "cf79a8aa0bf2ed3010816e1a05744b18726b71e9dc6c9530d2969528b2b5ef63" }, "downloads": -1, 
"filename": "mozi-2.0.3.tar.gz", "has_sig": false, "md5_digest": "21e4bb91b87e1d5147a94c5ba49fa767", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 43589, "upload_time": "2016-09-29T06:13:47", "url": "https://files.pythonhosted.org/packages/e4/90/776f9e43effa7eb185575454ea1e9b82b0664ba52b0ee7bfd9b264869d81/mozi-2.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "21e4bb91b87e1d5147a94c5ba49fa767", "sha256": "cf79a8aa0bf2ed3010816e1a05744b18726b71e9dc6c9530d2969528b2b5ef63" }, "downloads": -1, "filename": "mozi-2.0.3.tar.gz", "has_sig": false, "md5_digest": "21e4bb91b87e1d5147a94c5ba49fa767", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 43589, "upload_time": "2016-09-29T06:13:47", "url": "https://files.pythonhosted.org/packages/e4/90/776f9e43effa7eb185575454ea1e9b82b0664ba52b0ee7bfd9b264869d81/mozi-2.0.3.tar.gz" } ] }