{ "info": { "author": "Yuta Koreeda", "author_email": "secret-email@example.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Cython", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Topic :: Documentation :: Sphinx", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Scientific/Engineering :: Information Analysis" ], "description": ".. -*- coding: utf-8; -*-\n\nLoaders and savers for different implentations of `word embedding `_. The motivation of this project is that it is cumbersome to write loaders for different pretrained word embedding files. This project provides a simple interface for loading pretrained word embedding files in different formats.\n\n.. code:: python\n\n from word_embedding_loader import WordEmbedding\n\n # it will automatically determine format from content\n wv = WordEmbedding.load('path/to/embedding.bin')\n\n # This project provides minimum interface for word embedding\n print wv.vectors[wv.vocab['is']]\n\n # Modify and save word embedding file with arbitrary format\n wv.save('path/to/save.txt', 'word2vec', binary=False)\n\n\nThis project currently supports following formats:\n\n* `GloVe `_, Global Vectors for Word Representation, by Jeffrey Pennington, Richard Socher, Christopher D. Manning from Stanford NLP group.\n* `word2vec `_, by Mikolov.\n - text (create with ``-binary 0`` option (the default))\n - binary (create with ``-binary 1`` option)\n* `gensim `_ 's ``models.word2vec`` module (coming)\n* original HDFS format: a performance centric option for loading and saving word embedding (coming)\n\n\nSometimes, you want combine an external program with word embedding file of your own choice. This project also provides a simple executable to convert a word embedding format to another.\n\n.. code:: bash\n\n # it will automatically determine the format from the content\n word-embedding-loader convert -t glove test/word_embedding_loader/word2vec.bin test.bin\n\n # Get help for command/subcommand\n word-embedding-loader --help\n word-embedding-loader convert --help\n\nIssues with encoding\n--------------------\n\nThis project does decode vocab. It is up to users to determine and decode bytes.\n\n.. code:: python\n\n decoded_vocab = {k.decode('latin-1'): v for k, v in wv.vocab.iteritems()}\n\n\n.. notes::\n\n Encoding of pretrained word2vec is latin-1. Encoding of pretrained\n glove is utf-8\n\n\nDevelopment\n============\n\nThis project us Cython to build some modules, so you need Cython for development.\n\n```bash\npip install -r requirements.txt\n```\n\nIf environment variable ``DEVELOP_WE`` is set, it will try to rebuild ``.pyx`` modules.\n\n```bash\nDEVELOP_WE=1 python setup.py test\n```", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/koreyou/word_embedding_loader", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "WordEmbeddingLoader", "package_url": "https://pypi.org/project/WordEmbeddingLoader/", "platform": "", "project_url": "https://pypi.org/project/WordEmbeddingLoader/", "project_urls": { "Homepage": "https://github.com/koreyou/word_embedding_loader" }, "release_url": "https://pypi.org/project/WordEmbeddingLoader/0.2.1/", "requires_dist": null, "requires_python": "", "summary": "Loaders and savers for different implentations of word embedding.", "version": "0.2.1" }, "last_serial": 3306415, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "71992d4d087a03ace07f112c265922c9", "sha256": "5248744a8e862117d2973857dbf74d934b3700ad61f7eeb452b898c0dc6b0dc5" }, "downloads": -1, "filename": "WordEmbeddingLoader-0.1.0.tar.gz", "has_sig": false, "md5_digest": "71992d4d087a03ace07f112c265922c9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 117013, "upload_time": "2017-06-25T04:11:12", "url": "https://files.pythonhosted.org/packages/94/5b/a480e365839bc20181c32e278159b3dffb948e46dcda7a56722a45121acc/WordEmbeddingLoader-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "2cb7fa28bc857eda59d7b37d059b7485", "sha256": "a06b162facb30d491232451c271411a684993c543ed06a4ccdc4cebb13b643ce" }, "downloads": -1, "filename": "WordEmbeddingLoader-0.2.0.tar.gz", "has_sig": false, "md5_digest": "2cb7fa28bc857eda59d7b37d059b7485", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 117846, "upload_time": "2017-08-14T14:33:21", "url": "https://files.pythonhosted.org/packages/68/7b/0e55c98539843f06cbc88b3e4a8d3ed52eb4050ddc32eceaf657931219eb/WordEmbeddingLoader-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "37f16cd7e3d17d1d6a424589b7c920c9", "sha256": "635caf9b82769a80f05bc7b47bedadc158a32cf2c301b5b50f12ec513f55dfa2" }, "downloads": -1, "filename": "WordEmbeddingLoader-0.2.1.tar.gz", "has_sig": false, "md5_digest": "37f16cd7e3d17d1d6a424589b7c920c9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 118081, "upload_time": "2017-11-05T01:55:48", "url": "https://files.pythonhosted.org/packages/24/bd/63161c3b10077f284fd876d841ccb0ed81a352b331994a3614b80ed3b0f6/WordEmbeddingLoader-0.2.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "37f16cd7e3d17d1d6a424589b7c920c9", "sha256": "635caf9b82769a80f05bc7b47bedadc158a32cf2c301b5b50f12ec513f55dfa2" }, "downloads": -1, "filename": "WordEmbeddingLoader-0.2.1.tar.gz", "has_sig": false, "md5_digest": "37f16cd7e3d17d1d6a424589b7c920c9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 118081, "upload_time": "2017-11-05T01:55:48", "url": "https://files.pythonhosted.org/packages/24/bd/63161c3b10077f284fd876d841ccb0ed81a352b331994a3614b80ed3b0f6/WordEmbeddingLoader-0.2.1.tar.gz" } ] }