{ "info": { "author": "Jonathan Raiman", "author_email": "jraiman at mit dot edu", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Science/Research", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.3", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Scientific/Engineering :: Mathematics" ], "description": "Small Theano LSTM recurrent network module\n------------------------------------------\n\n@author: Jonathan Raiman\n@date: December 10th 2014\n\nImplements most of the great things that came out\nin 2014 concerning recurrent neural networks, and\nsome good optimizers for these types of networks.\n\n### Key Features\n\nThis module contains several Layer types that are useful\nfor prediction and modeling from sequences:\n\n* A non-recurrent **Layer**, with a connection matrix W, and bias b\n* A recurrent **RNN Layer** that takes as input its previous hidden activation and has an initial hidden activation\n* A recurrent **LSTM Layer** that takes as input its previous hidden activation and memory cell values, and has initial values for both of those\n* An **Embedding** layer that contains an embedding matrix and takes integers as input and returns slices from its embedding matrix (e.g. word vectors)\n* A non-recurrent **GatedInput**, with a connection matrix W, and bias b, that multiplies a single scalar to each input (gating jointly multiple inputs)\n* Deals with exploding and vanishing gradients with a subgradient optimizer (Adadelta) and element-wise gradient clipping (\u00e0 la Alex Graves)\n\nThis module also contains the **SGD**, **AdaGrad**, and **AdaDelta** gradient descent methods that are constructed using an objective function and a set of theano variables, and returns an `updates` dictionary to pass to a theano function.\n\n\n### Quick Tutorial\n\nSee [a short tutorial for sequence forecasting here](http://nbviewer.ipython.org/github/JonathanRaiman/theano_lstm/blob/master/Tutorial.ipynb).\nOr read on for some usage examples.\n\n### Usage\n\nHere is an example of usage with stacked LSTM units, using\nAdadelta to optimize, and using a scan operation from Theano (a symbolic loop for backpropagation through time).\n\n\tdropout = 0.0\n\n\tmodel = StackedCells(4, layers=[20, 20], activation=T.tanh, celltype=LSTM)\n\tmodel.layers[0].in_gate2.activation = lambda x: x\n\tmodel.layers.append(Layer(20, 2, lambda x: T.nnet.softmax(x)[0]))\n\n\t# in this example dynamics is a random function that takes our\n\t# output along with the current state and produces an observation\n\t# for t + 1\n\n\tdef step(x, *prev_hiddens):\n\t new_states = stacked_rnn.forward(x, prev_hiddens, dropout)\n\t return [dynamics(x, new_states[-1])] + new_states[:-1]\n\n\tinitial_obs = T.vector()\n\ttimesteps = T.iscalar()\n\n\tresult, updates = theano.scan(step,\n n_steps=timesteps,\n outputs_info=[dict(initial=initial_obs, taps=[-1])] + [dict(initial=layer.initial_hidden_state, taps=[-1]) for layer in model.layers if hasattr(layer, 'initial_hidden_state')])\n\n\ttarget = T.vector()\n\n\tcost = (result[0][:,[0,2]] - target[[0,2]]).norm(L=2) / timesteps\n\n\tupdates, gsums, xsums, lr, max_norm = \\\n\t\tcreate_optimization_updates(cost, model.params, method='adadelta')\n\n\tupdate_fun = theano.function([initial_obs, target, timesteps], cost, updates = updates, allow_input_downcast=True)\n\tpredict_fun = theano.function([initial_obs, timesteps], result[0], allow_input_downcast=True)\n\n\tfor example, label in training_set:\n\t\tc = update_fun(example, label, 10)\n\n### Minibatch usage\n\nSuppose you now have many sequences (of equal length -- we'll generalize this later). Then training can be done in batches:\n\n\tmodel = StackedCells(4, layers=[20, 20], activation=T.tanh, celltype=LSTM)\n\tmodel.layers[0].in_gate2.activation = lambda x: x\n\tmodel.layers.append(Layer(20, 2, lambda x: T.nnet.softmax(x)[0]))\n\n\t# in this example dynamics is a function that simulates the behavior of a double\n # pendulum and takes our current state and produces an observation\n\t# for t + 1\n def dynamics(x, u):\n dydx = T.alloc(0.0, 4)\n dydx = T.set_subtensor(dydx[0], x[1])\n del_ = x[2]-x[0]\n den1 = (M1+M2)*L1 - M2*L1*T.cos(del_)*T.cos(del_)\n dydx = T.set_subtensor(dydx[1],\\n\",\n ( M2*L1 * x[1] * x[1] * T.sin(del_) * T.cos(del_)\n + M2*G * T.sin(x[2]) * T.cos(del_) +\n M2*L2 * x[3] * x[3] * T.sin(del_)\n - (M1+M2)*G * T.sin(x[0]))/den1 )\n dydx = T.set_subtensor(dydx[2], x[3])\n\n den2 = (L2/L1)*den1\n dydx = T.set_subtensor(dydx[3], (-M2*L2 * x[3]*x[3]*T.sin(del_) * T.cos(del_)\n + (M1+M2)*G * T.sin(x[0])*T.cos(del_)\n - (M1+M2)*L1 * x[1]*x[1]*T.sin(del_)\n - (M1+M2)*G * T.sin(x[2]))/den2 + u )\n return x + dydx * dt\n\n\tdef step(x, *prev_hiddens):\n\t new_states = stacked_rnn.forward(x, prev_hiddens, dropout)\n\t return [dynamics(x, new_states[-1])] + new_states[:-1]\n\n\t# switch to a matrix of observations:\n\tinitial_obs = T.imatrix()\n\ttimesteps = T.iscalar()\n\n\tresult, updates = theano.scan(step,\n n_steps=timesteps,\n outputs_info=[dict(initial=initial_obs, taps=[-1])] + [dict(initial=layer.initial_hidden_state, taps=[-1]) for layer in model.layers if hasattr(layer, 'initial_hidden_state')])\n\n\ttarget = T.ivector()\n\n\tcost = (result[0][:,:,[0,2]] - target[:,[0,2]]).norm(L=2) / timesteps\n\n\tupdates, gsums, xsums, lr, max_norm = \\\n\t\tcreate_optimization_updates(cost, model.params, method='adadelta')\n\n\tupdate_fun = theano.function([initial_obs, target, timesteps], cost, updates = updates, allow_input_downcast=True)\n\tpredict_fun = theano.function([initial_obs, timesteps], result[0], allow_input_downcast=True)\n\n\tfor minibatch, labels in minibatches:\n\t\tc = update_fun(minibatch, label, 10)\n\n### Minibatch usage with different sizes\n\nGeneralization can be made to different sequence length if we accept the minor cost of forward-propagating parts of our graph we don't care about. To do this we make all sequences the same length by padding the end of the shorter ones with some symbol. Then use a binary matrix of the same size than all your minibatch sequences. The matrix has a 1 in areas when the error should be calculated, and zero otherwise. Elementwise mutliply this mask with your output, and then apply your objective function to this masked output. The error will be obtained everywhere, but will be zero in areas that were masked, yielding the correct error function.\nWhile there is some waste computation, the parallelization can offset this cost and make the overall computation faster.\n\n#### MaskedLoss usage\n\nTo use different length sequences, consider the following approach:\n\n* you have sequences *y_1, y_2, ..., y_n*, and labels *l_1, l_2, ..., l_n*.\n* pad all the sequences to the longest sequence *y_k*, and form a matrix **Y** of all padded sequences\n* similarly form the labels at each timestep for each padded sequence (with zeros, or some other symbol for labels in padded areas)\n* then record the length of the true labels (codelengths) needed before padding *c_1, c_2, ..., c_n*, and the length of the sequences before padding *l_1, l_2, ..., l_n*\n* pass the lengths, targets, and predictions to the masked loss as follows:\n\n\t\tpredictions, updates = theano.scan(prediction_step, etc...)\n\n\t\terror = masked_loss(\n\t predictions,\n\t padded_labels,\n\t codelengths,\n\t label_starts).mean()\n\nVisually this goes something like this, for the case with three inputs, three outputs, but a single label for\nthe final output:\n\ninputs [ x_1 x_2 x_3 ]\n\noutputs [ p_1 p_2 p_3 ]\n\nlabels [ ... ... l_1 ]\n\nthen we would have a matrix *x* with *x_1, x_2, x_3*, and `predictions` in the code above would contain *p_1, p_2, p_3*.\nWe would then pass to `masked_loss` the codelength [ 1 ], since there is only \"l_1\" to predict, and the `label_starts` [\u00a02 ],\nindicating that errors should be computed at the third prediction (with zero index).\n\n#### Dropout Usage in Theano Scan\n\nTo get dropout to work and be dynamically modifyiable without recompiling let's consider the following usage example.\n\nFirst we define a variable with the likelihood that a neuron will be dropped (randomly set to 0):\n\n\tdropout = theano.shared(np.float64(0.3).astype(theano.config.floatX))\n\tdeterministic = False # for now\n\nCreate some model:\n\n\tmodel = theano_lstm.StackedCells(50, layers=[100], celltype=theano_lstm.LSTM, activation=T.tanh)\n\nNow we want to introduce dropout noise between the input and the LSTM. To use Dropout outside of a Theano `scan` loop you could simply multiply elementwise by a binomial random variable ([see examples here](https://gist.github.com/SnippyHolloW/8a0f820261926e2f41cc)), but if you plan on using recurrent networks with a Theano `scan` you need to call your random numbers outside of the loop.\n\nIn order to keep track of these dropout activations we'll generate *masks*. *Masks* are a list with all the realizations of binomials. We generate this list with `MultiDropout`, a special function in the `theano_lstm` module that takes different hidden layer sizes and returns a list of matrices with binomial random variable realizations inside:\n\n\tif dropout.get_value() > 0:\n if deterministic:\n # just multiply by the likelihood of being kept:\n masks = [np.float32(1.) - self.dropout for i in range(2)]\n else:\n shapes = [50, 100]\n masks = theano_lstm.MultiDropout( [(x.shape[0], shape) for shape in shapes] if x.ndim > 1 else shapes,\n self.dropout)\n else:\n masks = []\n\nNow our loop forward function is as follows:\n\n\tdef step(obs, hidden_state, *masks):\n new_state = model.forward(obs, [hidden_state], list(masks))\n return new_state[1]\n\nWe pass it to Theano's scan:\n\n result, _ = theano.scan(step,\n \tsequences = seq,\n \tnon_sequences = masks,\n \toutputs_info = [dict(initial=model.layers[0].initial_hidden_state, taps=[-1])]\n \t)\n\nAnd We're done.\n\n**Note:** To not use *Masks* pass an empty list `[]` instead.", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/JonathanRaiman/theano_lstm", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/JonathanRaiman/theano_lstm", "keywords": "Gradient Descent,Theano,LSTM,neural networks", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "theano-lstm", "package_url": "https://pypi.org/project/theano-lstm/", "platform": "any", "project_url": "https://pypi.org/project/theano-lstm/", "project_urls": { "Download": "https://github.com/JonathanRaiman/theano_lstm", "Homepage": "https://github.com/JonathanRaiman/theano_lstm" }, "release_url": "https://pypi.org/project/theano-lstm/0.0.15/", "requires_dist": null, "requires_python": null, "summary": "Nano size theano lstm module", "version": "0.0.15" }, "last_serial": 1596405, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "10a4a7d1cbafffea49e7be3701bdb5eb", "sha256": "69504649d5a3f9c64ea96ac0f9d10a5cc45358535de61dcaa964cbf9a832e004" }, "downloads": -1, "filename": "theano-lstm-0.0.1.tar.gz", "has_sig": false, "md5_digest": "10a4a7d1cbafffea49e7be3701bdb5eb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5622, "upload_time": "2014-12-10T23:55:28", "url": "https://files.pythonhosted.org/packages/b2/92/68c105fa5843268f8a8ee4dc419de1323b9337cb8179b9c852ab2de5777f/theano-lstm-0.0.1.tar.gz" } ], "0.0.10": [ { "comment_text": "", "digests": { "md5": "6e5d182934f3c201625e60b3f269d990", "sha256": "ac3f4cd6ca9efcd7209b56d2cf6956cf3a73eb721c588e9a1818ae922701cd69" }, "downloads": -1, "filename": "theano-lstm-0.0.10.tar.gz", "has_sig": false, "md5_digest": "6e5d182934f3c201625e60b3f269d990", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13255, "upload_time": "2015-01-05T14:33:37", "url": "https://files.pythonhosted.org/packages/14/89/c7743740eb176867a33c071f14d4583d148cc19c742fc2ee1e8a1ae3c95a/theano-lstm-0.0.10.tar.gz" } ], "0.0.11": [ { "comment_text": "", "digests": { "md5": "df48d262c4b2a798c1777c0d850e74b7", "sha256": "277a10838f19c359f88a102c96d624a4d6dd19ee6b77b841902f308be9c331e9" }, "downloads": -1, "filename": "theano-lstm-0.0.11.tar.gz", "has_sig": false, "md5_digest": "df48d262c4b2a798c1777c0d850e74b7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14070, "upload_time": "2015-01-08T15:10:49", "url": "https://files.pythonhosted.org/packages/25/8d/a018e9c12b8c7006b9cbaf9292e7360850772d71fc1e49237d84a874ec32/theano-lstm-0.0.11.tar.gz" } ], "0.0.12": [ { "comment_text": "", "digests": { "md5": "62dbe9ea7aabb3cec3eaf925715245a4", "sha256": "f811ce474744b3a562d72961b043ad773b352557535cf999660770af4281640f" }, "downloads": -1, "filename": "theano-lstm-0.0.12.tar.gz", "has_sig": false, "md5_digest": "62dbe9ea7aabb3cec3eaf925715245a4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15436, "upload_time": "2015-01-10T15:16:45", "url": "https://files.pythonhosted.org/packages/03/ce/cd46c014ef45d8b954093fd4f16e98bae27dc27ae95cec851d025e3b3411/theano-lstm-0.0.12.tar.gz" } ], "0.0.13": [ { "comment_text": "", "digests": { "md5": "b1ad4bc75b51a1ff408efe6ab4771916", "sha256": "0935d4a8bc947e0bcb952fdcf3f2bc8a5b667217ea3133001c93db471a52e003" }, "downloads": -1, "filename": "theano-lstm-0.0.13.tar.gz", "has_sig": false, "md5_digest": "b1ad4bc75b51a1ff408efe6ab4771916", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17188, "upload_time": "2015-01-27T00:10:17", "url": "https://files.pythonhosted.org/packages/b2/3b/fd42d9a44dfbfbb43ac838d04efa70fe5e9d8e497d7242f4700051a69260/theano-lstm-0.0.13.tar.gz" } ], "0.0.14": [ { "comment_text": "", "digests": { "md5": "23308424fdf94c69ac589e08ac728b30", "sha256": "96e4b4dc327fb21d22c770cc581122898d61359073a285eb4e87ac5f8014ca38" }, "downloads": -1, "filename": "theano-lstm-0.0.14.tar.gz", "has_sig": false, "md5_digest": "23308424fdf94c69ac589e08ac728b30", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18037, "upload_time": "2015-04-06T16:56:38", "url": "https://files.pythonhosted.org/packages/29/f4/9fd06463ef2b7c06c36795e0bf90d8ae1fc1db04e12a2bc6a8bb9815b3a5/theano-lstm-0.0.14.tar.gz" } ], "0.0.15": [ { "comment_text": "", "digests": { "md5": "90289926e244217c81e3f9bac3c24d7e", "sha256": "85880ddda3ea142505fde13ae91133f92105537bb35fcdf11380d2a018febe78" }, "downloads": -1, "filename": "theano-lstm-0.0.15.tar.gz", "has_sig": false, "md5_digest": "90289926e244217c81e3f9bac3c24d7e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18437, "upload_time": "2015-06-17T20:04:09", "url": "https://files.pythonhosted.org/packages/d1/c8/61afca427cbbaa82eb1230dbe14369ee109eaef54633a57cfe685ead939c/theano-lstm-0.0.15.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "2258a9572db079b44ddecd5066eb833d", "sha256": "6c54f1420e67348a91034484b4ec35c9c17d859448480c3b961f7158b44a9097" }, "downloads": -1, "filename": "theano-lstm-0.0.2.tar.gz", "has_sig": false, "md5_digest": "2258a9572db079b44ddecd5066eb833d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5631, "upload_time": "2014-12-11T03:39:27", "url": "https://files.pythonhosted.org/packages/56/01/6f7b33e481155dc52f93c8c20c2718d4654342a9a32e6f5c397017c8b0d1/theano-lstm-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "4918fc1b8a5471bbca0e07aae07d8a6c", "sha256": "bc2f72b3d65032f75c0ed2c8611a8e0fd2fceed2a4594d543ece8872536b9f15" }, "downloads": -1, "filename": "theano-lstm-0.0.3.tar.gz", "has_sig": false, "md5_digest": "4918fc1b8a5471bbca0e07aae07d8a6c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6027, "upload_time": "2014-12-12T16:00:01", "url": "https://files.pythonhosted.org/packages/d1/46/16ea14a58fb8577b9bf9b0f6f5ef0b59e697cdda747a6db6bfff9bfa73be/theano-lstm-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "5e1b0cec56f4a04a0aa85e7000cbab88", "sha256": "5318eb70a1a1dcef2051dcca2aac88dcab70321eb00a77a9fea313c5b0dac0b0" }, "downloads": -1, "filename": "theano-lstm-0.0.4.tar.gz", "has_sig": false, "md5_digest": "5e1b0cec56f4a04a0aa85e7000cbab88", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7217, "upload_time": "2014-12-12T17:15:26", "url": "https://files.pythonhosted.org/packages/9e/a4/6a5d3fa43d97995b66f651085cbfd4efbc265f9db5d5008e4ae968a524de/theano-lstm-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "721e284773e661af563453038620863b", "sha256": "08785dad9d606b6dec37275297506ed63bf26fa5ebdd5cd8a4f86e7fd9bc91c4" }, "downloads": -1, "filename": "theano-lstm-0.0.5.tar.gz", "has_sig": false, "md5_digest": "721e284773e661af563453038620863b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7273, "upload_time": "2014-12-13T18:58:17", "url": "https://files.pythonhosted.org/packages/6e/17/2b4039870e3c4e8eab25f991820f26e3926fd616768471c62020deeafc9e/theano-lstm-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "5702b7473e0f373478ada8ad3d9568da", "sha256": "204223688e13d1eb651574e1c141f59076ef8f0420cb2c6bcb2f132c3515d6da" }, "downloads": -1, "filename": "theano-lstm-0.0.6.tar.gz", "has_sig": false, "md5_digest": "5702b7473e0f373478ada8ad3d9568da", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7661, "upload_time": "2014-12-14T18:40:22", "url": "https://files.pythonhosted.org/packages/32/f4/08787102f7c661ed13c5924f89c5c85fee5dcf6670004875892e2d34bdf0/theano-lstm-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "5e8f473a6182d957d4f1b2071e450f92", "sha256": "e6d56a5815ecc57ab6d6ff9f1b9ae087ccd968fd039a2e6c2c3f63f24e919443" }, "downloads": -1, "filename": "theano-lstm-0.0.7.tar.gz", "has_sig": false, "md5_digest": "5e8f473a6182d957d4f1b2071e450f92", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7672, "upload_time": "2014-12-30T00:17:10", "url": "https://files.pythonhosted.org/packages/f2/07/f6035d00a63e58c672a850c88c07b4259dd6c1ad5585615803ee8007a3d0/theano-lstm-0.0.7.tar.gz" } ], "0.0.8": [ { "comment_text": "", "digests": { "md5": "f9efe4b8200a9602736f078661ead8fc", "sha256": "8a6c4d6a7af7324731bdcf3082a75c3caeb84972d53886905904432cb9d16156" }, "downloads": -1, "filename": "theano-lstm-0.0.8.tar.gz", "has_sig": false, "md5_digest": "f9efe4b8200a9602736f078661ead8fc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12850, "upload_time": "2015-01-05T09:05:26", "url": "https://files.pythonhosted.org/packages/51/8e/c5446c91770d076629acc77d1e9ef9592d29ff22c18a492565795bedf254/theano-lstm-0.0.8.tar.gz" } ], "0.0.9": [ { "comment_text": "", "digests": { "md5": "f37860b37774f6c962ce430a37c88685", "sha256": "09686d4d8e80b4dd76fe766761b0fd41a90deb27246e5ea5d73260cb67f5f431" }, "downloads": -1, "filename": "theano-lstm-0.0.9.tar.gz", "has_sig": false, "md5_digest": "f37860b37774f6c962ce430a37c88685", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12887, "upload_time": "2015-01-05T10:39:04", "url": "https://files.pythonhosted.org/packages/e6/2a/0264d4539481af58cb853d9144eca03a25901e4d0bfebd7236fd7756e16e/theano-lstm-0.0.9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "90289926e244217c81e3f9bac3c24d7e", "sha256": "85880ddda3ea142505fde13ae91133f92105537bb35fcdf11380d2a018febe78" }, "downloads": -1, "filename": "theano-lstm-0.0.15.tar.gz", "has_sig": false, "md5_digest": "90289926e244217c81e3f9bac3c24d7e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18437, "upload_time": "2015-06-17T20:04:09", "url": "https://files.pythonhosted.org/packages/d1/c8/61afca427cbbaa82eb1230dbe14369ee109eaef54633a57cfe685ead939c/theano-lstm-0.0.15.tar.gz" } ] }