{ "info": { "author": "Christian Puhrsch", "author_email": "cpuhrsch@fb.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: MacOS", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering", "Topic :: Software Development" ], "description": "fastText\n========\n\n`fastText `__ is a library for efficient learning\nof word representations and sentence classification.\n\nRequirements\n------------\n\n`fastText `__ builds on modern Mac OS and Linux\ndistributions. Since it uses C++11 features, it requires a compiler with\ngood C++11 support. These include :\n\n- (gcc-4.8 or newer) or (clang-3.3 or newer)\n\nYou will need\n\n- `Python `__ version 2.7 or >=3.4\n- `NumPy `__ &\n `SciPy `__\n- `pybind11 `__\n\nBuilding fastText\n-----------------\n\nThe easiest way to get the latest version of `fastText is to use\npip `__.\n\n::\n\n $ pip install fasttext\n\nIf you want to use the latest unstable release you will need to build\nfrom source using setup.py.\n\nNow you can import this library with\n\n::\n\n import fastText\n\nExamples\n--------\n\nIn general it is assumed that the reader already has good knowledge of\nfastText. 
For this consider the main\n`README `__\nand in particular `the tutorials on our\nwebsite `__.\n\nWe recommend you look at the `examples within the doc\nfolder `__.\n\nAs with any package you can get help on any Python function using the\nhelp function.\n\nFor example\n\n::\n\n >>> import fastText\n >>> help(fastText.FastText)\n\n Help on module fastText.FastText in fastText:\n\n NAME\n fastText.FastText\n\n DESCRIPTION\n # Copyright (c) 2017-present, Facebook, Inc.\n # All rights reserved.\n #\n # This source code is licensed under the BSD-style license found in the\n # LICENSE file in the root directory of this source tree. An additional grant\n # of patent rights can be found in the PATENTS file in the same directory.\n\n FUNCTIONS\n load_model(path)\n Load a model given a filepath and return a model object.\n\n tokenize(text)\n Given a string of text, tokenize it and return a list of tokens\n [...]\n\nIMPORTANT: Preprocessing data / encoding conventions\n----------------------------------------------------\n\nIn general it is important to properly preprocess your data. In\nparticular our example scripts in the `root\nfolder `__ do this.\n\nfastText assumes UTF-8 encoded text. All text must be `unicode for\nPython2 `__\nand `str for\nPython3 `__.\nThe passed text will be `encoded as UTF-8 by\npybind11 `__\nbefore being passed to the fastText C++ library. This means it is important to\nuse UTF-8 encoded text when building a model. On Unix-like systems you\ncan convert text using `iconv `__.\n\nfastText will tokenize (split text into pieces) based on the following\nASCII characters (bytes). In particular, it is not aware of UTF-8\nwhitespace. We advise the user to convert UTF-8 whitespace / word\nboundaries into one of the following symbols as appropriate.\n\n- space\n- tab\n- vertical tab\n- carriage return\n- formfeed\n- the null character\n\nThe newline character is used to delimit lines of text. 
In particular,\nthe EOS token is appended to a line of text if a newline character is\nencountered. The only exception is if the number of tokens exceeds the\nMAX\\_LINE\\_SIZE constant as defined in the `Dictionary\nheader `__.\nThis means if you have text that is not separated by newlines, such as\nthe `fil9 dataset `__, it will be\nbroken into chunks of MAX\\_LINE\\_SIZE tokens and the EOS token is\nnot appended.\n\nThe length of a token is the number of UTF-8 characters by considering\nthe `leading two bits of a\nbyte `__ to identify\n`subsequent bytes of a multi-byte\nsequence `__.\nKnowing this is especially important when choosing the minimum and\nmaximum length of subwords. Further, the EOS token (as specified in the\n`Dictionary\nheader `__)\nis considered a character and will not be broken into subwords.", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/facebookresearch/fastText", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "fasttextmirror", "package_url": "https://pypi.org/project/fasttextmirror/", "platform": "", "project_url": "https://pypi.org/project/fasttextmirror/", "project_urls": { "Homepage": "https://github.com/facebookresearch/fastText" }, "release_url": "https://pypi.org/project/fasttextmirror/0.8.22/", "requires_dist": null, "requires_python": "", "summary": "fastText Python bindings", "version": "0.8.22" }, "last_serial": 3934430, "releases": { "0.8.22": [ { "comment_text": "", "digests": { "md5": "9e230a00114fffe07060ce98e0621866", "sha256": "ca66390d33f4f336154ace4fc1aa8ead97f0f975caf523bfe0993c20b268edd4" }, "downloads": -1, "filename": "fasttextmirror-0.8.22.tar.gz", "has_sig": false, "md5_digest": "9e230a00114fffe07060ce98e0621866", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 45904, "upload_time": "2018-06-06T01:28:32", "url": 
"https://files.pythonhosted.org/packages/fb/78/cf79876cfbb92bf7baae65472b19c680f6e20eaf55ca41721a53ea2014bb/fasttextmirror-0.8.22.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9e230a00114fffe07060ce98e0621866", "sha256": "ca66390d33f4f336154ace4fc1aa8ead97f0f975caf523bfe0993c20b268edd4" }, "downloads": -1, "filename": "fasttextmirror-0.8.22.tar.gz", "has_sig": false, "md5_digest": "9e230a00114fffe07060ce98e0621866", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 45904, "upload_time": "2018-06-06T01:28:32", "url": "https://files.pythonhosted.org/packages/fb/78/cf79876cfbb92bf7baae65472b19c680f6e20eaf55ca41721a53ea2014bb/fasttextmirror-0.8.22.tar.gz" } ] }